
A Logic Based Classification Technique


Presentation Transcript


  1. A Logic Based Classification Technique General-to-Specific Ordering

  2. Logic Based Like a decision tree. Tree questions: Sky? Sunny, ok. Wind? Strong, ok. Yes, enjoy sport. Logic Based Classification

  3. Candidate Elimination With candidate elimination the object is to predict the class through the use of expressions. ?'s are like wild cards. Expressions represent conjunctions. The expression <Sunny,?,?,Strong,?,?> means we will enjoy sport only when the sky is sunny and the wind is strong; we don't care about the other attributes. Logic Based Classification

  4. First Approach Finding a maximally specific hypothesis. Start with the most restrictive (specific) hypothesis one can get and relax it to satisfy each positive training sample. Ø's mean nothing will match. Most general (all dimensions can be any value): <?,?,?,?,?,?>. Most restrictive (no dimension can be anything): <Ø, Ø, Ø, Ø, Ø, Ø>. Logic Based Classification

  5. That pesky Ø What if a relation has a single Ø? Remember, the expression is a conjunction, so a single Ø means the hypothesis matches no instance at all and classifies everything as negative. Logic Based Classification
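Here is a minimal Python sketch of this matching rule; the representation (tuples whose entries are a value, '?' for "any value", or None standing in for Ø) and the function name are illustrative choices, not something the slides specify.

def matches(hypothesis, instance):
    """Return True only if every attribute constraint is satisfied (a conjunction)."""
    for constraint, value in zip(hypothesis, instance):
        if constraint is None:                      # Ø: satisfied by no value
            return False
        if constraint != '?' and constraint != value:
            return False
    return True

# Because the expression is a conjunction, a single Ø means no instance matches:
print(matches(('Sunny', None, '?', 'Strong', '?', '?'),
              ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same')))   # False
print(matches(('Sunny', '?', '?', 'Strong', '?', '?'),
              ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same')))   # True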

  6. Find-S Algorithm Initialize h to the most specific hypothesis in H (<Ø, Ø, Ø, Ø, Ø, Ø>). For each positive training instance x: for each attribute constraint ai in h, if the constraint ai is satisfied by x then do nothing, else replace ai in h by the next more general constraint that is satisfied by x. Return h. Order of generality: ? is more general than a specific attribute value, which in turn is more general than Ø. Logic Based Classification
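A small sketch of Find-S under the same tuple representation as above (None stands in for Ø, '?' is the wild card); the data are the EnjoySport positives used in the next three slides, and the function name is illustrative.

def find_s(positive_examples, n_attributes):
    h = [None] * n_attributes                    # most specific hypothesis <Ø,...,Ø>
    for x in positive_examples:
        for i, (constraint, value) in enumerate(zip(h, x)):
            if constraint is None:               # Ø: generalize to the observed value
                h[i] = value
            elif constraint != '?' and constraint != value:
                h[i] = '?'                       # next more general constraint
            # otherwise the constraint is already satisfied; do nothing
    return tuple(h)

positives = [
    ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),
    ('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),
    ('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'),
]
print(find_s(positives, 6))   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')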

  7. Example Set h to <Ø, Ø, Ø, Ø, Ø, Ø>. First positive (x): <Sunny,Warm,Normal,Strong,Warm,Same>. Which constraints of h are satisfied by x? None, so replace each ai with the relaxed form from x: <Sunny,Warm,Normal,Strong,Warm,Same>. Logic Based Classification

  8. Example h is now <Sunny,Warm,Normal,Strong,Warm,Same>. Next positive: <Sunny,Warm,High,Strong,Warm,Same>. Which constraints of h are satisfied by x? All but humidity. Replace h with <Sunny,Warm,?,Strong,Warm,Same>. Logic Based Classification

  9. Example h is now <Sunny,Warm,?,Strong,Warm,Same>. Next positive: <Sunny,Warm,High,Strong,Cool,Change>. Which constraints of h are satisfied by x? All but water and forecast. Replace h with <Sunny,Warm,?,Strong,?,?>. Return <Sunny,Warm,?,Strong,?,?>. Can one use this to "test" a new instance? Logic Based Classification

  10. Next: Version Space What if we want all hypotheses that are consistent with a training set (called a version space)? A hypothesis is consistent with a set of training examples if and only if h(x)=c(x) for each training example. <?, Warm, ?, Strong, ?, ?> <Sunny, ?, ?, Strong, ?, ?> <?,Warm,?,?,?,?> <Sunny,Warm,?,Strong,?,?> <Sunny,?,?,?,?,?> <Sunny, Warm, ?, ?, ?, ?> <?,?,?,?,?,Same> Logic Based Classification

  11. List-Then-Eliminate Algorithm VersionSpace ← a list containing every hypothesis in H. For each training example <x, c(x)>, remove from VersionSpace any hypothesis h for which h(x) ≠ c(x). Output the list of hypotheses in VersionSpace. Exhaustive. • Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4) • But a single Ø represents an empty set • So semantically distinct hypotheses: 973 Logic Based Classification
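A brute-force sketch of List-Then-Eliminate: enumerate every conjunctive hypothesis and discard the ones inconsistent with the training data. The attribute domains and the four EnjoySport records are assumptions taken from the running example; the variable names are illustrative.

from itertools import product

domains = [['Sunny', 'Cloudy', 'Rainy'], ['Warm', 'Cold'], ['Normal', 'High'],
           ['Strong', 'Light'], ['Warm', 'Cool'], ['Same', 'Change']]

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

# Every hypothesis without Ø; all Ø-containing hypotheses collapse to one empty hypothesis.
hypotheses = list(product(*[d + ['?'] for d in domains]))
print(len(hypotheses) + 1)        # 973 semantically distinct hypotheses

data = [(('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),   True),
        (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),   True),
        (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
        (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), True)]

version_space = [h for h in hypotheses
                 if all(matches(h, x) == label for x, label in data)]
print(len(version_space))         # 6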

  12. Next: Candidate Elimination A more compact representation: keep just those hypotheses at the extreme ends, the most general and the most specific. Everything between them is necessarily in the version space: the process of elimination. Logic Based Classification

  13. Definitions And now for something totally formal: The general boundary G, with respect to hypothesis space H and training data D, is the set of maximally general members of H consistent with D. G is identical to the set of all g that are members of H such that g is consistent with D and there does not exist a g' in H such that g' is more general than g and g' is consistent with the training data. Logic Based Classification

  14. Definitions The specific boundary S, with respect to hypothesis space H and training data D, is the set of minimally general (maximally specific) members of H consistent with D. S is identical to the set of all s that are members of H such that s is consistent with D and there does not exist an s' in H such that s' is more specific than s and s' is consistent with the training data. Logic Based Classification

  15. Example All yes's are sunny, warm, and strong, but "strong" alone isn't enough to identify a yes. S: {<Sunny, Warm, ?, Strong, ?, ?>} (3 ?'s). Interior: <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, <?, Warm, ?, Strong, ?, ?> (4 ?'s). G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>} (5 ?'s). Logic Based Classification

  16. Approach Start with two extremes: most general (all dimensions can be any value) <?,?,?,?,?,?> and most restrictive (no dimension can be anything) <Ø, Ø, Ø, Ø, Ø, Ø>. Slowly work inward from both the specific and the general ends. Logic Based Classification

  17. Algorithm Initialize G to the set of maximally general hypotheses in H. Initialize S to the set of maximally specific hypotheses in H. For each training example d, do:
• If d is a positive example: remove from G any hypothesis inconsistent with d. For each hypothesis s in S that is not consistent with d: remove s from S; add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h; remove from S any hypothesis that is more general than another hypothesis in S.
• If d is a negative example: remove from S any hypothesis inconsistent with d. For each hypothesis g in G that is not consistent with d: remove g from G; add to G all minimal specializations h of g such that h is consistent with d and some member of S is more specific than h; remove from G any hypothesis that is less general than another hypothesis in G. Logic Based Classification
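A compact sketch of this loop for conjunctive hypotheses over discrete attributes, in the same tuple representation as the earlier sketches ('?' as the wild card, None standing in for Ø). The helper names (matches, min_generalize, min_specialize, more_general_or_equal) are illustrative, not from the slides.

def matches(h, x):
    return all(c is not None and (c == '?' or c == v) for c, v in zip(h, x))

def more_general_or_equal(h1, h2):
    """True if h1 is at least as general as h2."""
    def covers(c1, c2):
        if c1 == '?':
            return True
        if c2 is None:                  # anything covers Ø
            return True
        return c1 == c2
    return all(covers(c1, c2) for c1, c2 in zip(h1, h2))

def min_generalize(s, x):
    """Minimal generalization of s that matches the positive example x."""
    return tuple(v if c is None else (c if c == v else '?') for c, v in zip(s, x))

def min_specialize(g, x, domains):
    """Minimal specializations of g that exclude the negative example x."""
    out = []
    for i, c in enumerate(g):
        if c == '?':
            for value in domains[i]:
                if value != x[i]:
                    out.append(g[:i] + (value,) + g[i + 1:])
    return out

def candidate_elimination(data, domains):
    n = len(domains)
    S = {(None,) * n}                          # maximally specific boundary (all Ø)
    G = {('?',) * n}                           # maximally general boundary
    for x, positive in data:
        if positive:
            G = {g for g in G if matches(g, x)}
            S = {s if matches(s, x) else min_generalize(s, x) for s in S}
            S = {s for s in S if any(more_general_or_equal(g, s) for g in G)}
            S = {s for s in S
                 if not any(s2 != s and more_general_or_equal(s, s2) for s2 in S)}
        else:
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                if matches(g, x):              # inconsistent with the negative example
                    for h in min_specialize(g, x, domains):
                        if any(more_general_or_equal(h, s) for s in S):
                            new_G.add(h)
                else:
                    new_G.add(g)
            G = {g for g in new_G
                 if not any(g2 != g and more_general_or_equal(g2, g) for g2 in new_G)}
    return S, G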

  18. Example Initialize • S0: <Ø, Ø, Ø, Ø, Ø, Ø> • G0: {<?,?,?,?,?,?>} Logic Based Classification

  19. Example First record • S1: {<Sunny,Warm,Normal,Strong,Warm,Same>} • G0, G1: {<?,?,?,?,?,?>} (unchanged) Logic Based Classification

  20. Example Second record: modify the previous S minimally to keep it consistent with d • S2: {<Sunny,Warm, ? ,Strong,Warm,Same>} • G0, G1, G2: {<?,?,?,?,?,?>} Logic Based Classification

  21. Example Third record (a negative): replace {<?,?,?,?,?,?>} with its minimal specializations, the one-constraint expressions still consistent with the data • S2, S3: {<Sunny,Warm, ? ,Strong,Warm,Same>} • G3: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>, <?,?,?,?,?,Same>} Logic Based Classification

  22. Example Fourth record: back to a positive; replace Warm and Same in S with "?" and remove "Same" from the general boundary • S4: {<Sunny,Warm, ? ,Strong, ? , ? >} • G4: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>} (the <?,?,?,?,?,Same> member of G3 is dropped) Then we can calculate the interior expressions: <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, <?, Warm, ?, Strong, ?, ?>. Logic Based Classification
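As a check, running the candidate_elimination sketch from slide 17 on these four records reproduces the boundaries above (the attribute domains are the EnjoySport ones assumed earlier).

domains = [['Sunny', 'Cloudy', 'Rainy'], ['Warm', 'Cold'], ['Normal', 'High'],
           ['Strong', 'Light'], ['Warm', 'Cool'], ['Same', 'Change']]
data = [(('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),   True),
        (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),   True),
        (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
        (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), True)]
S, G = candidate_elimination(data, domains)
# S: {('Sunny', 'Warm', '?', 'Strong', '?', '?')}
# G: {('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')}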

  23. What if we have two identical records but different classes? If the positive shows up first, the first step in evaluating the negative states "Remove from S any hypothesis that is not consistent with d" (S is now empty). For each hypothesis g in G that is not consistent with d, remove g from G (the all-?'s hypothesis says yes while the label is no, so G is empty). Add to G all minimal specializations h of g such that h is consistent with d and some member of S is more specific than h: no matter what we add to G it will violate either d or S, so G remains empty. Both are empty; broken. Known as converging to an empty version space. Established by the first positive: • S1: {<Sunny,Warm,Normal,Strong,Warm,Same>} • G0, G1: {<?,?,?,?,?,?>} Logic Based Classification

  24. What if we have two identical records but different classes? If the negative shows up first, the first step in evaluating the positive states "Remove from G any hypothesis that is not consistent with d." This is all of them, leaving an empty set. For each hypothesis s in S that is not consistent with d, remove s from S and add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h: G is already empty, so no generalization qualifies and S also empties out. Established by the first negative: • S0: {<Ø, Ø, Ø, Ø, Ø, Ø>} • G1: {<Rainy,?,?,?,?,?>, <Cloudy,?,?,?,?,?>, <?,Cold,?,?,?,?>, <?,?,High,?,?,?>, <?,?,?,Light,?,?>, <?,?,?,?,Cool,?>, <?,?,?,?,?,Change>} Logic Based Classification

  25. Brittle Bad with noisy data; a similar effect occurs with false positives or false negatives. Logic Based Classification

  26. Will it converge? Yes, provided there are no errors in the training examples and there is some hypothesis in H that correctly describes the target concept. It can fail, for example, if the target concept is a disjunction (∨) of feature attributes and the hypothesis space supports only conjunctions. Logic Based Classification

  27. Classifying Never-before-seen data: take a vote across the version space. All training samples had strong wind, yet <?,?,?,Strong,?,?> is not in the version space (it matches the negative example), so the answer is no. • S4: {<Sunny,Warm, ? ,Strong, ? , ? >} • G4: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>} Interior: <Sunny, ?, ?, Strong, ?, ?>, <Sunny, Warm, ?, ?, ?, ?>, <?, Warm, ?, Strong, ?, ?>. For the new instance pictured on the slide, the six hypotheses vote No, No, Yes, No, Yes, Yes. The proportion can be a confidence metric. Logic Based Classification
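A sketch of classification by voting across the version space; the six hypotheses are the ones from slide 10's figure, and the returned fraction is the confidence-style proportion mentioned above. The test instances are illustrative.

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

version_space = [('Sunny', 'Warm', '?', 'Strong', '?', '?'),
                 ('Sunny', '?',    '?', 'Strong', '?', '?'),
                 ('Sunny', 'Warm', '?', '?',      '?', '?'),
                 ('?',     'Warm', '?', 'Strong', '?', '?'),
                 ('Sunny', '?',    '?', '?',      '?', '?'),
                 ('?',     'Warm', '?', '?',      '?', '?')]

def classify_by_vote(instance):
    yes = sum(matches(h, instance) for h in version_space)
    return yes / len(version_space)          # fraction voting "enjoy sport"

print(classify_by_vote(('Sunny', 'Warm', 'Normal', 'Strong', 'Cool', 'Change')))  # 1.0 unanimous yes
print(classify_by_vote(('Rainy', 'Cold', 'Normal', 'Light', 'Warm', 'Same')))     # 0.0 unanimous no
print(classify_by_vote(('Sunny', 'Warm', 'Normal', 'Light', 'Warm', 'Same')))     # 0.5 split vote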

  28. A Unanimous Vote Same confidence as if we had already converged to the single correct target concept: regardless of which hypothesis in the version space is eventually found to be correct, it classifies the test case the same way, so a unanimously positive test case is 100% as good as the most specific match. Logic Based Classification

  29. Best for… Discrete data Binary classes Logic Based Classification

  30. Now for… Have seen 4 classifiers Naïve Bayesian KNN Decision Tree Candidate Elimination Now for some theory Logic Based Classification

  31. Have already… Curse of dimensionality Overfitting Lazy/Eager Radial basis Normalization Gradient descent Entropy/Information gain Occam’s razor Logic Based Classification

  32. Biased Hypothesis Space Another way of characterizing whether a hypothesis space can capture the learning concept. Candidate Elimination's bias: hypotheses are conjunctions of constraints on the attributes. Logic Based Classification

  33. Biased Hypothesis Space In regression: biased toward linear solutions. Naïve Bayes: biased to a given distribution or bin selection. KNN: biased toward solutions that assume cohabitation of similarly classed instances. Decision Tree: short trees. Logic Based Classification

  34. Unbiased learner? Must be able to accommodate every distinct subset as a class definition. 96 distinct instances (3*2*2*2*2*2): Sky has three possible values, the rest have two. Number of distinct subsets: 2^96. Think binary: a 1 indicates membership. Logic Based Classification

  35. Search Space Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4). But a single Ø represents an empty set, so there are 973 semantically distinct hypotheses: 1+(4*3*3*3*3*3). Each hypothesis represents a subset (due to wild cards). • Candidate elimination can represent 973 different subsets • But 2^96 is the number of distinct subsets • Very biased Logic Based Classification
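A quick arithmetic check of the counts quoted on the last two slides.

syntactically_distinct = 5 * 4 * 4 * 4 * 4 * 4        # one extra "value" each for ? and Ø
semantically_distinct  = 1 + 4 * 3 * 3 * 3 * 3 * 3    # all Ø-hypotheses collapse into one
distinct_instances     = 3 * 2 * 2 * 2 * 2 * 2
distinct_subsets       = 2 ** distinct_instances       # every possible labeling of the instances
print(syntactically_distinct, semantically_distinct)   # 5120 973
print(distinct_instances, distinct_subsets)             # 96 and 2**96 (about 7.9e28)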

  36. Bias I think of bias as inflexibility in expressing hypotheses or, alternatively, as the implicit assumptions of the approach. Bias ≈ inflexibility ≈ implicit assumptions. Logic Based Classification

  37. Next term: inductive inference The process by which a conclusion is inferred from multiple observations. This is what we've been doing: training data → classifier → make a prediction on new data. Logic Based Classification

  38. The Hypothesis Inductive learning hypothesis Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples Logic Based Classification

  39. Next Term Concept learning Automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of the concept Roughly equate “Concept” to “Class” Logic Based Classification

  40. Hypotheses H is the set of all possible hypotheses that the learner may consider regarding the choice of hypothesis representation. In general, each hypothesis h in H represents a boolean-valued function defined over X; that is, h : X → {0, 1}. Note that this is for a two-class system. The goal of the learner is to find a hypothesis h such that h(x) = c(x) for all x in X, where c is the target concept. Logic Based Classification

  41. Target Concept In regression: the various "y" values of the training instances (function approximation). Naïve Bayes, KNN, and Decision Tree: the class. Logic Based Classification

  42. Hypotheses In regression: the line; the coefficients (or other equation members such as exponents). Naïve Bayes: the class of an instance is predicted by determining the most probable class given the training data, that is, by finding the probability for each class for each dimension, multiplying these probabilities across the dimensions for each class, and taking the class with the maximum probability as the predicted class. KNN: the class of an instance is predicted by examining the instance's neighborhood. Decision Tree: the tree itself. Candidate Elimination: a conjunction of constraints on the attributes. Logic Based Classification

  43. Something Else We’ve Been Doing Supervised Learning Supervision from an oracle that knows the classes of the training data Is there unsupervised learning? Yes, covered in pattern rec Seeks to determine how the data are organized Clustering PCA Edge detection Logic Based Classification

  44. Definition of Machine Learning Machine learning addresses the question of how to build computer programs that improve their performance at some task through experience. Finally Logic Based Classification

  45. Learning Checkers All about representation, our representation. The end game is to develop a function that returns the best next move. Logic Based Classification

  46. chooseNextMove Look at every legal move, determine the goodness (score) of the resultant board state, and return the move with the highest score (argmax). Logic Based Classification
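A minimal sketch of chooseNextMove. legal_moves and apply_move are hypothetical game-engine helpers, and score is the evaluation function discussed on the next slides; none of these names come from the slides.

def choose_next_move(board, legal_moves, apply_move, score):
    """Return the legal move whose resulting board state scores highest (argmax)."""
    return max(legal_moves(board), key=lambda move: score(apply_move(board, move)))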

  47. How to Assess a Board State Score function; we will keep it simple and work with a polynomial with just a few variables. X1: the number of black pieces on the board X2: the number of red pieces on the board X3: the number of black kings on the board X4: the number of red kings on the board X5: the number of black pieces threatened by red X6: the number of red pieces threatened by black Logic Based Classification

  48. Score(b) Gotta learn them weights But how? X1: the number of black pieces on the board X2: the number of red pieces on the board X3: the number of black kings on the board X4: the number of red kings on the board X5: the number of black pieces threatened by red X6: the number of red pieces threatened by black Logic Based Classification
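A sketch of the evaluation function, assuming the usual linear form Score(b) = w0 + w1*x1 + ... + w6*x6 over the six features listed above (the slide does not spell the polynomial out, so this form is an assumption). extract_features is a hypothetical helper returning (x1, ..., x6); the weights are what training must learn.

def score(board, weights, extract_features):
    """Score(b) = w0 + w1*x1 + ... + w6*x6."""
    features = extract_features(board)            # (x1, ..., x6)
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))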

  49. Training A bunch of board states (a series of games); use them to jiggle the weights. Must know the current real "score" vs. the "predicted score" from the polynomial in order to train the scoring function. Logic Based Classification

  50. Precognition A trick: if my predictor is good then it will be self-consistent; that is, the score of my best move should lead to a good-scoring board state. If it doesn't, maybe we should adjust our predictor. Logic Based Classification
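The slides don't spell out the update rule; a common choice for this setup (the LMS rule from Mitchell's checkers example) is sketched below, using the score of the successor board state as the training value for the current state. extract_features is the hypothetical feature extractor from the previous sketch and eta is a small learning rate.

def lms_update(board, successor, weights, extract_features, eta=0.1):
    def v_hat(b):                                   # current linear estimate Score(b)
        xs = extract_features(b)
        return weights[0] + sum(w * x for w, x in zip(weights[1:], xs))
    v_train = v_hat(successor)                      # the "trick": trust the later, better-informed estimate
    error = v_train - v_hat(board)
    x = (1.0,) + tuple(extract_features(board))     # x0 = 1 pairs with the constant term w0
    return [w + eta * error * xi for w, xi in zip(weights, x)]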
