

Participation in the NIPS 2003 Challenge
Theodor Mader, ETH Zurich, tmader@student.ethz.ch



In the course on feature extraction in the winter semester 2005-2006, the students had to participate in the NIPS 2003 feature selection challenge. They experimented with different feature selection and classification methods in order to achieve high rankings.

BACKGROUND
Feature selection is a vital part of any classification procedure, so it is important to understand common procedures and algorithms and to see how they perform on real-world data. The students had the chance to apply the knowledge acquired in class to real-world datasets, learning about the strengths and weaknesses of the individual approaches.

DATASETS
• Five datasets were provided for the experiments:
• ARCENE: cancer diagnosis
• DEXTER: text categorization
• DOROTHEA: drug discovery
• GISETTE: digit recognition
• MADELON: artificial data
• All problems were two-class problems.

METHODS
The students were provided with a Matlab machine learning package containing a variety of classification and feature extraction algorithms, such as support vector machines and the random probe method, among others. The results were ranked according to the balanced error rate (BER), i.e. the average of the per-class error rates, which gives both classes equal weight regardless of the number of examples in each class (a small sketch of the BER computation is given after the dataset sections below). For this purpose a test set was used that was not available to the students.

SUMMARY

• ARCENE: cancer diagnosis
• 10000 features, 100 training examples, 100 validation examples
• Because of the very small number of training examples, merging the training and validation sets improved performance significantly.
• We empirically found that the probe method performed better than the signal-to-noise ratio.
• The optimal number of features was found by cross-validation (see the cross-validation sketch below).
Submission: my_classif=svc({'coef0=1','degree=3','gamma=0','shrinkage=0.1'}) my_model=chain({probe(relief,{'p_num=2000','f_max=2400'}), normalize, my_classif})

• GISETTE: digit recognition
• 5000 features, 6000 training examples, 1000 validation examples
• Garbage features were easily removed by feature selection according to the signal-to-noise ratio (see the signal-to-noise sketch below).
• Applying a Gaussian blur to the input images brought significant improvement.
• Shrinkage could further improve the results.
Submission: my_classif=svc({'coef0=1', 'degree=4', 'gamma=0', 'shrinkage=0.5'}); my_model=chain({convolve('dim1=5','dim2=5'), normalize, my_classif})

• DEXTER: text categorization
• 20000 features, 300 training examples, 300 validation examples
• Also a very small number of training examples, so we merged the training and validation sets.
• Initial experiments with principal component analysis showed no improvement.
• We stuck to the baseline model and optimized the shrinkage and the number of features.
• Our final submission: feature selection by signal-to-noise ratio (1210 features kept), normalization of the features, and a support vector classifier with a linear kernel.

• DOROTHEA: drug discovery
• 100000 features, 800 training examples, 350 validation examples
• Our results indicate that feature selection by the TP statistic was already sufficient; the relief criterion did not give much improvement.
• We optimized the number of features in order to beat the baseline model.
Submission: my_model=chain({TP('f_max=1400'), naive, bias})

• MADELON: artificial data
• 500 features, 2000 training examples, 600 validation examples
• The relief method performed extremely well: with only 20 features the classification error could be reduced to below 7% (a simplified sketch of the relief idea is given below).
• Shrinkage proved to be nearly as important as the kernel width for the Gaussian kernel.
Submission: my_classif=svc({'coef0=1', 'degree=0', 'gamma=0.5', 'shrinkage=0.3'}) my_model=chain({probe(relief,{'f_max=20', 'pval_max=0'}), standardize, my_classif})
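The relief criterion used in the submissions above is part of the provided package; purely for illustration, the following simplified single-nearest-neighbor variant shows the idea behind it. The variable names (X, y, w) and the unnormalized weight update are assumptions of this sketch, not the package implementation:

  % Simplified Relief-style feature weighting (sketch, not the package code).
  % X: examples-by-features matrix, y: vector of +1/-1 labels (illustrative names).
  % A feature gains weight if it differs on the nearest example of the opposite
  % class (miss) and agrees on the nearest example of the same class (hit).
  [n, d] = size(X);
  w = zeros(1, d);
  for i = 1:n
      same  = find(y == y(i));  same(same == i) = [];
      other = find(y ~= y(i));
      d_same  = sum(abs(X(same, :)  - repmat(X(i, :), numel(same), 1)), 2);
      d_other = sum(abs(X(other, :) - repmat(X(i, :), numel(other), 1)), 2);
      [ignore, hi] = min(d_same);   hit  = X(same(hi), :);
      [ignore, mi] = min(d_other);  miss = X(other(mi), :);
      w = w + abs(X(i, :) - miss) - abs(X(i, :) - hit);
  end
  [sorted_w, order] = sort(w, 'descend');   % e.g. keep the top 20 features, as for MADELON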
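The signal-to-noise criterion mentioned for GISETTE and DEXTER ranks each feature by the separation of the two class means relative to the class standard deviations. A minimal sketch, assuming X is an examples-by-features matrix and y a vector of +1/-1 labels (both names are illustrative):

  % Signal-to-noise ratio per feature: |mu_pos - mu_neg| ./ (sd_pos + sd_neg).
  mu_pos = mean(X(y == 1, :));
  mu_neg = mean(X(y == -1, :));
  sd_pos = std(X(y == 1, :));
  sd_neg = std(X(y == -1, :));
  s2n = abs(mu_pos - mu_neg) ./ (sd_pos + sd_neg + eps);   % eps guards against division by zero
  [ranked, order] = sort(s2n, 'descend');
  selected = order(1:1210);   % e.g. keep the 1210 top-ranked features, as in the DEXTER submission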
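Finding the optimal number of features by cross-validation, as done for ARCENE, can be sketched as follows. The feature ranking order, the candidate feature counts, and the helper train_and_ber (which would train the classifier on one fold and return its BER) are hypothetical placeholders, not part of the provided package:

  % k-fold cross-validation over the number of top-ranked features (sketch).
  candidates = [100 200 500 1000 2000 4000];   % illustrative feature counts
  k = 5;
  n = size(X, 1);
  fold = mod(0:n-1, k) + 1;                    % simple fold assignment
  cv_ber = zeros(size(candidates));
  for c = 1:numel(candidates)
      feats = order(1:candidates(c));          % top-ranked features for this candidate
      errs = zeros(1, k);
      for f = 1:k
          tr = (fold ~= f);  te = (fold == f);
          errs(f) = train_and_ber(X(tr, feats), y(tr), X(te, feats), y(te));   % hypothetical helper
      end
      cv_ber(c) = mean(errs);
  end
  [best_ber, best] = min(cv_ber);
  n_features = candidates(best);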
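The balanced error rate used for the ranking (see METHODS) takes only a few lines of Matlab. A minimal sketch, assuming y_true and y_pred are vectors of +1/-1 labels (illustrative names):

  % Balanced error rate (BER): the mean of the two per-class error rates.
  pos = (y_true == 1);
  neg = (y_true == -1);
  err_pos = mean(y_pred(pos) ~= 1);    % error rate on the positive class
  err_neg = mean(y_pred(neg) ~= -1);   % error rate on the negative class
  ber = 0.5 * (err_pos + err_neg);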
RESULTS (BER in %)

CONCLUSIONS
Participating in the NIPS 2003 challenge was a very interesting opportunity. It could be clearly seen that in most cases there was no need for very complicated classifiers; simple techniques, applied correctly and combined with powerful feature selection methods, were faster and more successful. The importance of feature selection was hence demonstrated very clearly: using only a fraction of the original features improved classification results and speed significantly.
