ASSOCIATION RULES & THE APRIORI ALGORITHM BY: JOE CASABONA
INTRODUCTION • Recap • Data Mining • Three types • Association Rules • Apriori Algorithm
ASSOCIATION RULES • Most apparent form of Data Mining • Objective: Find all co-occurrence relationships among data items • Strength: Support & Confidence
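SUPPORT • Those who buy X buy Y, where X and Y are item sets • Written X => Y • support = (X ∪ Y).count / n • (X ∪ Y).count = number of transactions containing every item of X and Y • n = number of total transactions • Number produced is the % of all transactions (T) in which the rule holds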
CONFIDENCE • % of transactions containing X that also contain Y • confidence = (X ∪ Y).count / X.count • Determines predictability of the rule • Minimum support and minimum confidence thresholds are chosen by the user.
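EXAMPLE • AR 1: Xbox ---> Controller • Support: 5/8 (Xbox appears in 5 of the 8 sample transactions; Xbox and Controller appear together in 3) • Confidence: 3/5 • AR 2: COD4 ---> Xbox • Support: 5/8 (COD4 appears in 5 of the 8; COD4 and Xbox appear together in 2) • Confidence: 2/5 • With the 5/8 figures, AR 1 passes and AR 2 fails on min confidence (2/5 = 40% < 50%) • Measured on the joint item sets (3/8 and 2/8), neither rule reaches the 60% min support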
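The figures in the example can be checked against the sample transactions with a few lines of Python. This is a minimal sketch: the helper names `count`, `support`, and `confidence` are illustrative choices, not part of the slides, and support is computed on the joint item set X ∪ Y.

```python
from fractions import Fraction

# Transactions copied from the SAMPLE DATA slide.
TRANSACTIONS = [
    {"Xbox", "Controller", "COD4"},
    {"Xbox", "COD4"},
    {"Xbox", "Controller"},
    {"Controller", "COD4"},
    {"Xbox", "Rock Band", "Controller"},
    {"Xbox", "PS3"},
    {"COD4", "COD5", "Rock Band"},
    {"COD4", "Rock Band"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return Fraction(count(x | y, transactions), len(transactions))


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return Fraction(count(x | y, transactions), count(x, transactions))


for x, y in [({"Xbox"}, {"Controller"}), ({"COD4"}, {"Xbox"})]:
    print(sorted(x), "->", sorted(y),
          "| antecedent count:", count(x, TRANSACTIONS),
          "| support:", support(x, y, TRANSACTIONS),
          "| confidence:", confidence(x, y, TRANSACTIONS))
```

Using `Fraction` keeps the results exact, so they can be compared directly with the fractions on the slide.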
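APRIORI ALGORITHM • Step 1: Generate all frequent item sets (all item sets with min support) • Step 2: Generate all confident ARs from the frequent item sets • Downward Closure Property: if an item set has min support, every non-empty subset of it also has min support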
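The downward closure property follows from counting: any transaction that contains an item set also contains each of its subsets, so a subset can never occur less often. The sketch below checks this on the sample transactions; `closure_holds` is a hypothetical helper written for this illustration.

```python
from itertools import combinations

# Transactions copied from the SAMPLE DATA slide.
TRANSACTIONS = [
    {"Xbox", "Controller", "COD4"},
    {"Xbox", "COD4"},
    {"Xbox", "Controller"},
    {"Controller", "COD4"},
    {"Xbox", "Rock Band", "Controller"},
    {"Xbox", "PS3"},
    {"COD4", "COD5", "Rock Band"},
    {"COD4", "Rock Band"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


def closure_holds(itemset, transactions):
    """True if every non-empty proper subset occurs at least as often as `itemset`."""
    whole = count(itemset, transactions)
    return all(
        count(subset, transactions) >= whole
        for size in range(1, len(itemset))
        for subset in combinations(sorted(itemset), size)
    )


print(closure_holds({"Xbox", "Controller", "COD4"}, TRANSACTIONS))
```

Apriori relies on the contrapositive: once an item set misses min support, no superset of it needs to be counted.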
GENERATE FREQUENT ITEM SETS • Count the support of each individual item • Create a set F[1] of all individual items with min support • Create the "Candidate Set" C[k] from F[k-1] • Check each element c in C[k] against min support; those that pass form F[k] • Repeat for increasing k until no new frequent item sets are found • Return the set of all frequent item sets.
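The level-by-level loop described above might look like the following sketch. The function name `frequent_itemsets` is an assumption made for illustration, and candidates are formed here by a plain union of pairs from F[k-1] rather than any more selective scheme, which is enough to show the counting loop.

```python
from itertools import combinations

# Transactions copied from the SAMPLE DATA slide.
TRANSACTIONS = [
    {"Xbox", "Controller", "COD4"},
    {"Xbox", "COD4"},
    {"Xbox", "Controller"},
    {"Controller", "COD4"},
    {"Xbox", "Rock Band", "Controller"},
    {"Xbox", "PS3"},
    {"COD4", "COD5", "Rock Band"},
    {"COD4", "Rock Band"},
]


def frequent_itemsets(transactions, min_support):
    """Return {frozenset: count} for every item set whose support >= min_support."""
    n = len(transactions)

    # Pass 1: count each individual item and keep those with min support (F[1]).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c / n >= min_support}
    frequent = dict(level)

    k = 2
    while level:
        # C[k]: unions of two sets from F[k-1] that contain exactly k items.
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # F[k]: the candidates that meet min support.
        level = {}
        for c in candidates:
            c_count = sum(1 for t in transactions if c <= t)
            if c_count / n >= min_support:
                level[c] = c_count
        frequent.update(level)
        k += 1
    return frequent


for itemset, c in frequent_itemsets(TRANSACTIONS, 0.6).items():
    print(sorted(itemset), c)
```

At the slide's 60% threshold only the single items Xbox and COD4 survive; lowering `min_support` (for example to 0.25) lets pairs such as Xbox and Controller through.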
GENERATE CANDIDATE SETS • Take two item sets from the seed set F[k-1] that differ only in their last element • Join those item sets into a candidate c of size k • Compare each (k-1)-subset s of c to F[k-1]: if any s is not in F[k-1], delete c • Return the final candidate set
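A possible rendering of the join and prune steps is sketched below. It assumes items can be sorted so that "last element" is well defined; the name `candidate_gen` and the tuple representation are illustrative choices rather than anything fixed by the slides.

```python
from itertools import combinations


def candidate_gen(prev_frequent):
    """Build size-k candidates from a collection of frequent (k-1)-item sets."""
    # Sorted tuples give every item set a fixed item order.
    prev = sorted({tuple(sorted(s)) for s in prev_frequent})
    prev_set = set(prev)
    candidates = []
    for a, b in combinations(prev, 2):
        # Join step: the two sets agree on everything except the last item.
        if a[:-1] == b[:-1] and a[-1] < b[-1]:
            c = a + (b[-1],)
            # Prune step: every (k-1)-subset of c must itself be frequent.
            if all(sub in prev_set for sub in combinations(c, len(c) - 1)):
                candidates.append(frozenset(c))
    return candidates


# Example seed: four pairs of items from the sample data.
F2 = [
    {"Xbox", "Controller"},
    {"Xbox", "COD4"},
    {"Controller", "COD4"},
    {"COD4", "Rock Band"},
]
for c in candidate_gen(F2):
    print(sorted(c))
```

In this seed, joins that would include both Rock Band and Controller, or both Rock Band and Xbox, are pruned because those pairs are missing from `F2`.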
RULE GENERATION • Take a frequent item set F • If {F[1], F[2], ..., F[k-1]} => {F[k]} meets the min confidence, make it a rule • Remove the last element from the antecedent, insert it into the consequent, and check again
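The procedure on this slide can be sketched as follows. This follows the slide's simplified description for a single ordering of the items, not a full enumeration of every possible antecedent; `rules_from_itemset` is a hypothetical name, and the item sets passed in below are chosen for illustration without checking that they are frequent.

```python
# Transactions copied from the SAMPLE DATA slide.
TRANSACTIONS = [
    {"Xbox", "Controller", "COD4"},
    {"Xbox", "COD4"},
    {"Xbox", "Controller"},
    {"Controller", "COD4"},
    {"Xbox", "Rock Band", "Controller"},
    {"Xbox", "PS3"},
    {"COD4", "COD5", "Rock Band"},
    {"COD4", "Rock Band"},
]


def rules_from_itemset(items, transactions, min_confidence):
    """Return (antecedent, consequent, confidence) rules for one ordered item set."""
    whole = sum(1 for t in transactions if set(items) <= t)
    if whole == 0:
        return []  # the item set never occurs, so no rule can be formed
    rules = []
    # Start with all but the last item on the left, then move items to the right.
    for split in range(len(items) - 1, 0, -1):
        antecedent, consequent = items[:split], items[split:]
        ante_count = sum(1 for t in transactions if set(antecedent) <= t)
        conf = whole / ante_count
        if conf < min_confidence:
            # A shorter antecedent occurs at least as often, so its
            # confidence cannot be higher; stop here.
            break
        rules.append((tuple(antecedent), tuple(consequent), conf))
    return rules


print(rules_from_itemset(["Xbox", "Controller"], TRANSACTIONS, 0.5))
print(rules_from_itemset(["COD4", "Xbox"], TRANSACTIONS, 0.5))
```

Stopping at the first failure is safe because shrinking the antecedent can only keep or increase its count, which keeps or lowers the confidence.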
OTHER ALGORITHMS • Eclat algorithm • FP-Growth algorithm • One-attribute-rule • Zero-attribute-rule
SAMPLE DATA • Xbox, Controller, COD4 • Xbox, COD4 • Xbox, Controller • Controller, COD4 • Xbox, Rock Band, Controller • Xbox, PS3 • COD4, COD5, Rock Band • COD4, Rock Band • Min Support: 60% • Min Confidence: 50%
REFERENCES • The book I am using: Liu, Bing. Web Data Mining, Chapter 2: Association Rules and Sequential Patterns. Springer, December 2006. • Wikipedia: "Apriori Algorithm." http://en.wikipedia.org/wiki/Apriori_algorithm (March 23, 2009) • Wikipedia: "Association rule learning." http://en.wikipedia.org/wiki/Association_rules (March 25, 2009)