Dataset: bigdata
The 4 Universities Data Set. From the description: "This data set contains WWW-pages collected from computer science departments of various universities in January 1997 by the World Wide Knowledge Base (Web->Kb) project of the CMU text learning group. The 8,282 pages were manually classified into the following categories: student (1641), faculty (1124), staff (137), department (182), course (930), project (504), other (3764)." This MLcomp dataset puts the pages from Cornell, Texas, Washington, and the miscellaneous ("misc") group in the training set and uses the Wisconsin pages as the test set. The MIME headers and HTML tags were removed. See: http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/
DocumentClassification
wer
20M
processed
open
Training examples: 7019
Test examples: 1263
Categories: 7
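The train/test split described above can be sketched in a few lines of Python. This is only an illustration, not MLcomp's actual processing code: the `<category>/<university>/<file>` path layout, the lowercase directory names, and the names `split_by_university`, `TRAIN_UNIVERSITIES`, and `TEST_UNIVERSITIES` are all assumptions made for the example. The category counts are the ones quoted in the description.

```python
from pathlib import PurePosixPath

# Split by university as stated in the description.
# The lowercase directory names are an assumption for this sketch.
TRAIN_UNIVERSITIES = {"cornell", "texas", "washington", "misc"}
TEST_UNIVERSITIES = {"wisconsin"}

# Category counts quoted in the dataset description.
CATEGORY_COUNTS = {
    "student": 1641,
    "faculty": 1124,
    "staff": 137,
    "department": 182,
    "course": 930,
    "project": 504,
    "other": 3764,
}


def split_by_university(paths):
    """Partition page paths into (train, test) lists of (category, path).

    Assumes each path has the hypothetical form
    <category>/<university>/<file>.
    """
    train, test = [], []
    for path in paths:
        parts = PurePosixPath(path).parts
        if len(parts) < 3:
            raise ValueError(
                f"expected <category>/<university>/<file>, got {path!r}"
            )
        category, university = parts[0], parts[1]
        if university in TEST_UNIVERSITIES:
            test.append((category, path))
        elif university in TRAIN_UNIVERSITIES:
            train.append((category, path))
        else:
            raise ValueError(f"unknown university in {path!r}")
    return train, test


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the function.
    pages = [
        "course/cornell/a.html",
        "student/wisconsin/b.html",
        "other/misc/c.html",
    ]
    train, test = split_by_university(pages)
    print(len(train), "training pages,", len(test), "test pages")
    print("pages in description:", sum(CATEGORY_COUNTS.values()))
```

Splitting on the university rather than sampling pages at random keeps every page from the held-out university out of the training set, so the test measures how well a classifier transfers to an unseen site.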




Existing runs on bigdata (1-2 of 2)
ID          Program           Dataset  Tuned hyper.  User  Updated     Status  Total time  Memory  Error
Run #46714  icsiboost-bigram  bigdata  no            wer   323d1h ago  failed  4m49s       37M
Run #46715  icsiboost         bigdata  no            wer   323d1h ago  failed  4m41s       38M



