
Introduction to machine learning Juan López González University of Oviedo






Presentation Transcript


  1. Introduction to machine learning Juan López González University of Oviedo Inverted CERN School of Computing, 24-25 February 2014

  2. General overview • Lecture 1 • Machine learning • Introduction • Definition • Problems • Techniques • Lecture 2 • ANN • SOMs • Definition • Algorithm • Simulation • SOM based models

  3. LECTURE 1 Introduction to machine learning and data mining

  4. 1. Introduction: 1.1. Some definitions 1.2. Machine learning vs Data mining 1.3. Examples 1.4. Essence of machine learning 1.5. A learning puzzle

  5. 1.1 Some definitions • To learn • To use a set of observations to uncover an underlying process • To memorize • To commit to memory • It doesn't mean to understand

  6. 1.2 Machine learning vs Data mining • Machine learning (Arthur Samuel) • Study, design and development of algorithms that give computers the capability to learn without being explicitly programmed • Data mining • Extract knowledge or unknown patterns from data

  7. 1.3 Examples • Credit approval • Gender, age, salary, years in job, current debt… • Spam filtering • Subject, From… • Topic spotting • Categorize articles • Weather prediction • Wind, humidity, temperature…

  8. 1.4 Essence of machine learning • A pattern exists • We cannot pin it down mathematically • We have data on it

  9. 1.5 A learning puzzle

  10. 2. Definition: 2.1. Components 2.2. Generalization and representation 2.3. Types of learning

  11. 2.1 Components • Input (customer application) • Output (approve/reject credit) • Ideal function (f: X ↦ Y) • Data: (a1,b1,..,n1), (a2,b2,..,n2) … (aN,bN,..,nN) (historical records) • Result: (y1), (y2) … (yN) (loan paid or not paid) • Hypothesis (g: X ↦ Y)
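The components on this slide can be mapped to code. The sketch below is a hypothetical illustration (the data, attributes and the rule in `g` are invented for the credit-approval example, not taken from the slides): the historical records are (x, y) pairs, the ideal function f is unknown, and `g` is one candidate hypothesis approximating it.

```python
# Data: historical records (x_i, y_i) -- (salary, current_debt) -> loan repaid?
data = [
    ((50000, 2000), True),
    ((30000, 15000), False),
    ((80000, 5000), True),
    ((25000, 20000), False),
]

def g(x):
    """A hand-picked hypothesis g: X -> Y (approve if debt < 20% of salary)."""
    salary, debt = x
    return debt < 0.2 * salary

# How well does this hypothesis reproduce the historical outcomes?
accuracy = sum(g(x) == y for x, y in data) / len(data)
print(accuracy)
```

Learning, in this framing, is the process of choosing g automatically from the data instead of writing it by hand.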

  12. 2.1 Components

  13. 2.2 Generalization and representation • Generalization • The algorithm has to build a general model • Objective • Generalize from experience • Ability to perform accurately for unseen examples • Representation • Results depend on input • Input depends on representation • Pre-processing?

  14. 2.3 Types of learning • Supervised • Input and output • Unsupervised • Only input • Reinforcement • Input, output and grade of output

  15. 3. Problems: 3.1. Regression 3.2. Classification 3.3. Clustering 3.4. Association rules

  16. 3.1 Regression • Statistical process for estimating the relationships among variables • Could be used for prediction
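A minimal regression sketch (an assumed example, not from the slides): fit a line y = a·x + b to a small synthetic dataset by ordinary least squares, then use the fitted model for prediction.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]   # roughly y = 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates for slope and intercept.
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

# The fitted model can now be used for prediction on unseen x.
def predict(x):
    return a * x + b

print(a, predict(6.0))
```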

  17. 3.2 Classification • Identify to which of a set of categories a new observation belongs • Supervised learning

  18. 3.3 Clustering • Grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups • Unsupervised learning
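The slide only defines clustering in general; a common concrete algorithm is k-means, sketched here on an assumed tiny 1-D dataset with two obvious groups (the data and the naive initialization are illustrative choices):

```python
def kmeans_1d(points, k, iters=10):
    centroids = points[:k]          # naive init: the first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:            # assign each point to its nearest centroid
            i = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[i].append(p)
        # move each centroid to the mean of its cluster (keep it if empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

print(kmeans_1d([1.0, 1.2, 0.8, 10.0, 10.4, 9.6], k=2))
```

No labels are used anywhere, which is what makes this unsupervised.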

  19. 3.4 Association rules • Discovering relations between variables in large databases • Based on 'strong rules' • If order matters -> sequential pattern mining (figure: frequent itemset lattice)
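A 'strong rule' is usually one whose support and confidence clear chosen thresholds. The sketch below computes both for a rule A -> B on a hypothetical mini market-basket dataset (the transactions are invented for illustration):

```python
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """P(consequent | antecedent), estimated from the transactions."""
    return support(antecedent | consequent) / support(antecedent)

# Rule {bread} -> {milk}
print(support({"bread", "milk"}), confidence({"bread"}, {"milk"}))
```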

  20. 4. Techniques: 4.1. Decision trees 4.2. SVM 4.3. Monte Carlo 4.4. K-NN 4.5. ANN

  21. 4.1 Decision trees • Uses a tree-like graph of decisions and possible consequences • Internal node: attribute • Leaf: result

  22. 4.1 Decision trees • Results human readable • Easily combined with other techniques • Possible scenarios can be added • Expensive
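A hand-built decision tree makes the "internal node: attribute, leaf: result" structure and the human readability concrete. The attributes and thresholds below are hypothetical, echoing the credit-approval example from earlier slides (a learned tree would be induced from data, e.g. by ID3 or CART, rather than written by hand):

```python
def decide(applicant):
    # Internal node: test the 'salary' attribute.
    if applicant["salary"] >= 40000:
        # Internal node: test the 'current_debt' attribute.
        if applicant["current_debt"] <= 10000:
            return "approve"        # leaf: result
        return "reject"             # leaf: result
    # Internal node: test the 'years_in_job' attribute.
    if applicant["years_in_job"] >= 5:
        return "approve"            # leaf: result
    return "reject"                 # leaf: result

print(decide({"salary": 50000, "current_debt": 3000, "years_in_job": 2}))
```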

  23. 4.2 Support Vector Machine (SVM) • Separates the graphical representation of the input points • Constructs a hyperplane which can be used for classification • Input space transformation helps • Non-human readable results

  24. 4.3 Monte Carlo • Obtain the distribution of an unknown probabilistic entity • Random sampling to obtain numerical results • Applications • Physics • Microelectronics • Geostatistics • Computational biology • Computer graphics • Games • …
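The classic textbook illustration of "random sampling to obtain numerical results" (an assumed example, not from the slides) is estimating pi: sample points uniformly in the unit square and count how many fall inside the quarter circle of radius 1.

```python
import random

def estimate_pi(n, seed=0):
    rng = random.Random(seed)       # fixed seed for reproducibility
    hits = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
               for _ in range(n))
    # Area of quarter circle / area of unit square = pi/4.
    return 4 * hits / n

print(estimate_pi(100_000))         # close to 3.14159 for large n
```

The estimate's error shrinks roughly as 1/sqrt(n), which is why Monte Carlo methods trade accuracy for simplicity in high-dimensional problems.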

  25. 4.4 K-Nearest neighbors (K-NN) • Classifies by getting the class of the K closest training examples in the feature space (figure: example with K=1 and K=5)

  26. 4.4 K-Nearest neighbors (K-NN) • Easy to implement • naive version • High dimensional data needs dimension reduction • Large datasets make it computationally expensive • Many k-NN algorithms try to reduce the number of distance evaluations performed
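The naive version mentioned above really is a few lines: compute the distance from the query to every training example (this full scan is exactly the cost that smarter k-NN algorithms try to avoid) and take a majority vote among the k closest. The toy dataset is an assumed illustration.

```python
from collections import Counter
import math

def knn_classify(train, query, k):
    """train: list of (point, label) pairs; returns the majority label
    among the k training points closest to the query."""
    by_distance = sorted(train, key=lambda pl: math.dist(pl[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

train = [((0, 0), "red"), ((1, 0), "red"), ((0, 1), "red"),
         ((5, 5), "blue"), ((6, 5), "blue"), ((5, 6), "blue")]
print(knn_classify(train, (1, 1), k=3))   # -> red
```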

  27. 4.5 Artificial neural networks (ANN) • Systems of interconnected neurons that compute outputs from inputs

  28. 4.5 Artificial neural networks (ANN)

  29. 4.5 Artificial neural networks (ANN) Example:

  30. 4.5 Artificial neural networks (ANN) • Perceptron • Single-layer artificial network with one neuron • Calculates the linear combination of its inputs and passes it through a threshold activation function • Equivalent to a linear discriminant

  31. 4.5 Artificial neural networks (ANN) • Perceptron • Equivalent to a linear discriminant

  32. 4.5 Artificial neural networks (ANN) • Learning • Learn the weights (and threshold) • Samples are presented • If the output is incorrect, adjust the weights and threshold towards the desired output • If the output is correct, do nothing
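The learning rule on this slide can be sketched directly (the task, learning rate and epoch count are assumed choices): a single neuron learns logical AND, which is linearly separable, so the perceptron convergence theorem guarantees the loop settles on correct weights.

```python
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]
bias = 0.0            # plays the role of the (negated) threshold
lr = 0.1              # learning rate

def output(x):
    # Linear combination of the inputs passed through a threshold activation.
    return 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0

for _ in range(20):                     # a few passes over the samples
    for x, desired in samples:
        error = desired - output(x)     # 0 when the output is correct: do nothing
        w[0] += lr * error * x[0]       # otherwise nudge weights and threshold
        w[1] += lr * error * x[1]       # towards the desired output
        bias += lr * error

print([output(x) for x, _ in samples])  # -> [0, 0, 0, 1]
```

The same rule fails on problems that are not linearly separable (e.g. XOR), which is one motivation for the multi-layer networks shown on the earlier ANN slides.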

  33. Q & A
