Note: this page autoupdates while a run is in progress
(see end of log file)
===== MAIN: learn based on training data =====
=== START program1: ./run learn ../dataset2/train
rm: cannot remove `task.*': No such file or directory
WARNING: feature count cutoff is experimental, at icsiboost.c:1887
task.data: 0ERROR: wrong number of columns (2), "doc2045,sean coffei se&aacut ; n coffei se&aacut ; n ( aka john t. ) coffei is an associ professor at the univers of michigan <comma> ann arbor <comma> depart of electr engin and comput scienc <comma> and a member of the commun and signal process laboratory. e-mail : scoffei @ eecs.umich.edu mail : rm. 4238 eec build <comma> univers of michigan <comma> 1301 beal avenu <comma> ann arbor mi 48109-2122 <comma> usa tel. : ( 313 ) 764-5215 fax : ( 313 ) 763-1503 research activ i pursu a wide rang of topic in inform theori and channel code theory. i am current most interest in the develop of a gener theori cover effici context-depend retriev of inform from large-scal databas <comma> and in particular in the applic of idea from inform and code theori to thi problem. i am also interest in all aspect of the structur <comma> properti <comma> and decod of channel codes. select public j.t. coffei <comma> r.m. goodman <comma> and p.g. farrel <comma> " new approach to reduced-complex decod <comma> " discret appli mathemat <comma> vol. 33 <comma> nos. 1-3 <comma> pp. 43-60 <comma> octob 1991. [ abstract ] j.t. coffei <comma> t. herbsman <comma> and s. sechrest <comma> " inform theori approach to inform retriev <comma> " in commun theori and applic ii <comma> hw commun ltd. <comma> lancast <comma> u.k. <comma> 1994. [ abstract ] a.b. kieli and j.t. coffei <comma> " on the capac of a cascad of channel <comma> " ieee transact on inform theori <comma> vol. 39 <comma> no. 5 <comma> pp.1031-1037 <comma> septemb 1993. [ abstract ] a.b. kieli <comma> j.t. coffei <comma> and m.r.bell <comma> " optim inform bit decod of linear block code <comma> " ieee transact on inform theori <comma> vol. 41 <comma> no. 1 <comma> pp. 130-140 <comma> januari 1995. [ abstract ] j.t. coffei and a.b. kieli <comma> " the capac of code system <comma> " to appear in ieee transact on inform theory. 
[ abstract ] research group class eec 401 : probabilist method in engin ( last taught fall '95 ) eec 453 : analog commun signal and system ( last taught winter '94 ) eec 455 : digit commun signal and system ( last taught winter '95 ) eec 501 : probabl and random process ( last taught winter '92 ) eec 550 : inform theori ( last taught fall '94 ) eec 650 : channel code theori ( last taught winter '96 ) www link ieee inform theori societi nasa/jpl telecommun and data acquisit progress report nsf network & commun research : 1994 report on research direct u.s. feder commun commiss galileo mission to jupit collect of comput scienc bibliographi all electr engin program worldwid updat tuesdai <comma> august 6 <comma> 1996 by sean coffei ", line 432 in task.data, at icsiboost.c:1078
=== END program1: ./run learn ../dataset2/train --- FAILED [291s]
supervised-learning: Main entry for supervised learning for training and testing a program on a dataset.
(learner:Program) icsiboost-bigram: AdaBoost on single-level decision trees / tokenizer+stemmer+2gram bag-of-words
(dataset:Dataset) bigdata: The 4 Universities Data Set.
From the description: "This data set contains WWW-pages collected from computer science departments of various universities in January 1997 by the World Wide Knowledge Base(Web->Kb) project of the CMU text learning group. The 8,282 pages were manually classified into the following categories: student (1641), faculty (1124), staff (137), department (182), course (930), project (504), other (3764)."
This MLcomp dataset includes the pages from cornell, texas, washington and misc in the training set and uses the wisconsin pages as the test set. The MIME headers and HTML tags were removed.
See: [ http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/ ]
(stripper:Program[Strip]) document-classification-utils: Inspects DocumentClassification datasets and evaluates DocumentClassification performance.
(evaluator:Program[Evaluate]) document-classification-utils: Inspects DocumentClassification datasets and evaluates DocumentClassification performance.
Go to the page for the run and look at the log file for signs of the error that caused the failure.
You can also download the run and execute it locally on your machine (the download should
include a README file which provides more information).
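When a log is long, it can help to filter it down to the lines that look like failures. The following is a minimal sketch, not part of MLcomp: the function name `find_error_lines`, the choice of keywords (`ERROR`, `FAILED`), and the truncation width are all assumptions made for illustration, and the sample text is an abbreviated stand-in for a real log.

```python
import re
from typing import List, Tuple

# Assumed keywords; adjust to whatever the program under test actually prints.
ERROR_PATTERN = re.compile(r"ERROR|FAILED")


def find_error_lines(log_text: str, max_width: int = 120) -> List[Tuple[int, str]]:
    """Return (line number, truncated line) for every line matching ERROR_PATTERN."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            # Truncate, because a single log line may embed a whole document.
            hits.append((number, line[:max_width]))
    return hits


# Abbreviated, hypothetical sample in the style of a run log.
sample_log = "\n".join([
    "=== START program1: ./run learn ../dataset2/train",
    "WARNING: feature count cutoff is experimental",
    "task.data: 0ERROR: wrong number of columns (2), line 432 in task.data",
    "=== END program1: ./run learn ../dataset2/train --- FAILED [291s]",
])

for number, text in find_error_lines(sample_log):
    print(f"{number}: {text}")
```

Truncating each hit keeps the summary readable when an error message quotes an entire input record.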
We said that a run was simply a program/dataset pair, but that's not the full story.
A run actually includes other helper programs such as the evaluation program and
various programs for reductions (e.g., one-versus-all, hyperparameter tuning).
More formally, a run is given by a run specification,
which can be found on the page for any run.
A run specification is a tree where each internal node represents a program
and its children represent the arguments to be passed into its constructor.
For example, the one-versus-all program takes your binary classification program
as a constructor argument and behaves like a multiclass classification program.
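
The tree structure and the one-versus-all reduction described above can be sketched as follows. This is an illustration under stated assumptions, not MLcomp's actual interface: `SpecNode`, `CentroidScorer`, `OneVersusAll`, and the `learn`/`score`/`predict` method names are all hypothetical, and the binary program here is a toy that works on single numbers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SpecNode:
    """Hypothetical run specification node: a program plus its constructor arguments."""
    program: str
    args: List["SpecNode"] = field(default_factory=list)

    def render(self, depth: int = 0) -> str:
        """Render the tree as indented text, one program per line."""
        lines = ["  " * depth + self.program]
        lines.extend(child.render(depth + 1) for child in self.args)
        return "\n".join(lines)


class CentroidScorer:
    """Toy binary program: scores a number by closeness to the mean of the positives."""

    def learn(self, examples: List[Tuple[float, bool]]) -> None:
        positives = [x for x, is_positive in examples if is_positive]
        self.center = sum(positives) / len(positives)

    def score(self, x: float) -> float:
        return -abs(x - self.center)


class OneVersusAll:
    """Takes a binary program as a constructor argument and acts as a multiclass one."""

    def __init__(self, make_binary: Callable[[], CentroidScorer]):
        self.make_binary = make_binary
        self.models: Dict[str, CentroidScorer] = {}

    def learn(self, examples: List[Tuple[float, str]]) -> None:
        # Train one binary model per label: that label versus all the others.
        for label in sorted({y for _, y in examples}):
            model = self.make_binary()
            model.learn([(x, y == label) for x, y in examples])
            self.models[label] = model

    def predict(self, x: float) -> str:
        # Pick the label whose binary model gives the highest score.
        return max(self.models, key=lambda label: self.models[label].score(x))


spec = SpecNode("one-versus-all", [SpecNode("centroid-scorer")])
print(spec.render())

multiclass = OneVersusAll(CentroidScorer)
multiclass.learn([(0.0, "a"), (0.2, "a"), (5.0, "b"), (5.2, "b"), (9.8, "c"), (10.0, "c")])
print(multiclass.predict(4.9))
```

Passing the binary program as a constructor argument mirrors the parent/child relationship in the tree: the `one-versus-all` node is the parent, and the binary program is its child.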