
Machine Learning with Weka

Cornelia Caragea. Thanks to Eibe Frank for some of the slides. Outline: Weka: A Machine Learning Toolkit; Preparing Data; Building Classifiers.


Presentation Transcript


  1. Machine Learning with Weka. Cornelia Caragea. Thanks to Eibe Frank for some of the slides.

  2. Outline
     - Weka: A Machine Learning Toolkit
     - Preparing Data
     - Building Classifiers

  3. WEKA: the software
     - Machine learning/data mining software written in Java (distributed under the GNU General Public License)
     - Used for research, education, and applications
     - Main features:
       - Comprehensive set of data pre-processing tools, learning algorithms, and evaluation methods
       - Graphical user interfaces (incl. data visualization)
       - Environment for comparing learning algorithms

  4. WEKA: versions
     There are several versions of WEKA:
     - WEKA “book version”: compatible with the description in the data mining book
     - WEKA “GUI version”: adds graphical user interfaces (the book version is command-line only)
     - WEKA “development version”: with lots of improvements

  5. WEKA: resources
     - API documentation, tutorials, source code
     - WEKA mailing list
     - Book: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations
     - Weka-related projects:
       - Weka-Parallel: parallel processing for Weka
       - RWeka: linking R and Weka
       - YALE: Yet Another Learning Environment
       - Many others…

  6. Weka: web site http://www.cs.waikato.ac.nz/ml/weka/

  7. WEKA: launching
     java -jar weka.jar

  8. Outline
     - Weka: A Machine Learning Toolkit
     - Preparing Data
     - Building Classifiers

  9. WEKA only deals with “flat” files. Flat file in ARFF format:

         @relation heart-disease-simplified
         @attribute age numeric
         @attribute sex { female, male}
         @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
         @attribute cholesterol numeric
         @attribute exercise_induced_angina { no, yes}
         @attribute class { present, not_present}
         @data
         63,male,typ_angina,233,no,not_present
         67,male,asympt,286,yes,present
         67,male,asympt,229,yes,present
         38,female,non_anginal,?,no,not_present
         ...

  10. WEKA only deals with “flat” files. This slide repeats the heart-disease-simplified ARFF file from slide 9 and labels its two attribute types:
      - Numeric attribute: declared with the keyword numeric (age, cholesterol)
      - Nominal attribute: declared with a list of allowed values in braces (sex, chest_pain_type, exercise_induced_angina, class)
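The numeric/nominal distinction in the ARFF header can be illustrated with a small, self-contained Java sketch. This is not Weka's own ARFF reader: the class `ArffHeaderSketch` and its methods are hypothetical names introduced here, and the sketch assumes only the two attribute types shown on the slides, with unquoted names and values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Weka: classifies ARFF @attribute lines
// as numeric or nominal, covering only the two types shown on the slides.
class ArffHeaderSketch {

    /** One parsed attribute: its name and, if nominal, its allowed values. */
    static final class Attribute {
        final String name;
        final List<String> nominalValues; // null means the attribute is numeric

        Attribute(String name, List<String> nominalValues) {
            this.name = name;
            this.nominalValues = nominalValues;
        }

        boolean isNumeric() {
            return nominalValues == null;
        }
    }

    /** Parses a line such as "@attribute sex { female, male}". */
    static Attribute parseAttribute(String line) {
        String trimmed = line.trim();
        if (!trimmed.toLowerCase().startsWith("@attribute ")) {
            throw new IllegalArgumentException("Not an @attribute line: " + line);
        }
        String rest = trimmed.substring("@attribute".length()).trim();
        int space = rest.indexOf(' ');
        if (space < 0) {
            throw new IllegalArgumentException("Missing attribute type: " + line);
        }
        String name = rest.substring(0, space);
        String type = rest.substring(space + 1).trim();

        if (type.equalsIgnoreCase("numeric")) {
            return new Attribute(name, null);
        }
        if (type.startsWith("{") && type.endsWith("}")) {
            List<String> values = new ArrayList<>();
            for (String v : type.substring(1, type.length() - 1).split(",")) {
                values.add(v.trim());
            }
            return new Attribute(name, values);
        }
        throw new IllegalArgumentException("Type not handled by this sketch: " + type);
    }

    /** In the data section, "?" marks a missing value. */
    static boolean isMissing(String field) {
        return field.trim().equals("?");
    }

    public static void main(String[] args) {
        Attribute age = parseAttribute("@attribute age numeric");
        Attribute sex = parseAttribute("@attribute sex { female, male}");
        System.out.println(age.name + " numeric? " + age.isNumeric());
        System.out.println(sex.name + " values: " + sex.nominalValues);

        // Fourth data row from the slide: the cholesterol field is missing.
        String[] row = "38,female,non_anginal,?,no,not_present".split(",");
        System.out.println("cholesterol missing? " + isMissing(row[3]));
    }
}
```

Representing "numeric" as a null value list keeps the sketch short; a fuller reader would use an explicit type enum and handle quoting.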

  11. Explorer: pre-processing the data
      - Data can be imported from a file in various formats: ARFF, CSV, binary
      - Data can also be read from a URL or from an SQL database (using JDBC)
      - Pre-processing tools in WEKA are called “filters”
      - WEKA contains filters for discretization, normalization, resampling, attribute selection, …
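To make two of the filter types named above concrete, here is a minimal plain-Java sketch of min-max normalization and equal-width discretization applied to one numeric column. It does not use Weka's filter classes, and the names `FilterSketch`, `normalize`, and `discretize` are assumptions made for illustration; Weka's own filters offer more options and also handle missing values, which this sketch does not.

```java
// Hypothetical illustration, NOT Weka's filter API: two simple
// pre-processing steps on a single numeric attribute.
class FilterSketch {

    /** Rescales values linearly so the minimum maps to 0 and the maximum to 1. */
    static double[] normalize(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant column has zero range; map it to 0 to avoid dividing by zero.
            out[i] = (range == 0.0) ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    /** Assigns each value to one of `bins` equal-width intervals, numbered from 0. */
    static int[] discretize(double[] values, int bins) {
        double[] scaled = normalize(values);
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int bin = (int) (scaled[i] * bins);
            // The maximum value scales to exactly 1.0, so clamp it into the last bin.
            out[i] = Math.min(bin, bins - 1);
        }
        return out;
    }

    public static void main(String[] args) {
        // Cholesterol values from the first three data rows of the ARFF example.
        double[] cholesterol = {233, 286, 229};
        System.out.println(java.util.Arrays.toString(normalize(cholesterol)));
        System.out.println(java.util.Arrays.toString(discretize(cholesterol, 3)));
    }
}
```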

  12. Outline
      - Weka: A Machine Learning Toolkit
      - Preparing Data
      - Building Classifiers

  13. Explorer: building “classifiers”
      - Classifiers in WEKA are models for predicting nominal or numeric quantities
      - Implemented learning schemes include decision trees, support vector machines, perceptrons, neural networks, logistic regression, Bayes nets, …
      - “Meta”-classifiers include bagging, boosting, stacking, …
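The difference between predicting a nominal and a numeric quantity can be shown with the simplest possible baseline: always predict the most frequent class label, or the mean of the numeric target. The plain-Java sketch below is an illustration only and is not one of Weka's learning schemes as implemented; `BaselineSketch` and its method names are hypothetical. It is comparable in idea to the kind of trivial baseline against which real classifiers are usually measured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, NOT Weka code: a trivial baseline predictor
// for a nominal target (majority class) and a numeric target (mean).
class BaselineSketch {

    /** Returns the most frequent label; on a tie, the label seen first wins. */
    static String majorityClass(String[] labels) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        String best = null; // stays null if `labels` is empty
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    /** Returns the arithmetic mean of the targets (NaN for an empty array). */
    static double meanTarget(double[] targets) {
        double sum = 0.0;
        for (double t : targets) {
            sum += t;
        }
        return sum / targets.length;
    }

    public static void main(String[] args) {
        // Class labels and cholesterol values from the first three ARFF data rows.
        String[] labels = {"not_present", "present", "present"};
        double[] cholesterol = {233, 286, 229};
        System.out.println("Nominal prediction: " + majorityClass(labels));
        System.out.println("Numeric prediction: " + meanTarget(cholesterol));
    }
}
```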

  14. Outline
      - Machine Learning Software
      - Preparing Data
      - Building Classifiers

  15. To Do
      - Try the Naïve Bayes and Logistic Regression classifiers on a different Weka dataset
      - Use various parameters
      - Try linear regression
