This tutorial explores association rule mining with a focus on the Apriori algorithm. The goal is to identify rules that meet user-defined minimum support (minsup) and minimum confidence (minconf). Initially applied in market basket analysis, this method finds relationships among purchased items using categorical data. The Apriori algorithm consists of two main steps: identifying frequent itemsets and generating association rules. We illustrate the process with supermarket transaction examples and suggest implementation in Weka for practical application.
Association rule mining
• Goal: find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf).
• All data are assumed to be categorical.
• There is no good algorithm for numeric data.
• Initially used for market basket analysis, to find how the items purchased by customers are related.
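Association rule
• IF A → B

$$\text{Support}(A \rightarrow B) = \frac{\#\text{ of tuples containing both } A \text{ and } B}{\text{total } \#\text{ of tuples}}$$

$$\text{Confidence}(A \rightarrow B) = \frac{\#\text{ of tuples containing both } A \text{ and } B}{\#\text{ of tuples containing } A}$$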
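The two definitions can be sketched in Python as follows. The slide's transaction table is not reproduced in the text, so the five transactions here are an assumed data set, chosen only to be consistent with the percentages quoted in the results. The function names are illustrative.

```python
# Assumed transactions (not the slide's original table).
transactions = [
    {"Egg", "Butter", "Milk"},
    {"Egg", "Butter", "Baby Powder"},
    {"Egg", "Butter"},
    {"Egg", "Milk"},
    {"Butter", "Baby Powder"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Among transactions containing A, the fraction that also contain B."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

print(support({"Egg", "Butter"}, transactions))
print(confidence({"Egg"}, {"Butter"}, transactions))
```

Confidence is computed from raw counts rather than from two support fractions, which avoids floating-point rounding in the ratio.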
The Apriori algorithm
• The best-known algorithm.
• Two steps:
  1. Find all itemsets that have minimum support (frequent itemsets, also called large itemsets).
  2. Use the frequent itemsets to generate rules.
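A simplified sketch of the first step, the level-wise search for frequent itemsets, might look like the code below. It joins frequent (k-1)-itemsets into k-item candidates and counts each candidate directly, and it leaves out the subset-pruning check that a full Apriori implementation applies before counting. The transactions are an assumed data set (the slide's table is not available in the text), and `frequent_itemsets` and `min_count` are illustrative names.

```python
# Assumed transactions (not the slide's original table).
transactions = [
    {"Egg", "Butter", "Milk"},
    {"Egg", "Butter", "Baby Powder"},
    {"Egg", "Butter"},
    {"Egg", "Milk"},
    {"Butter", "Baby Powder"},
]

def frequent_itemsets(transactions, min_count):
    """Return {itemset: count} for every itemset seen in >= min_count transactions."""
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}  # all 1-itemsets
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join step: unions of two frequent k-itemsets that have k+1 items.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
    return frequent

# Minimum support of 2 out of 5 transactions (40%).
for itemset, n in frequent_itemsets(transactions, 2).items():
    print(sorted(itemset), n)
```

Passing the threshold as an absolute count (2 of 5) instead of a fraction keeps the comparison exact.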
Example • Five transactions from a supermarket
Minimum support • Minimum support=2/5= 40%
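Example: candidate generation
• Joining {Egg, Milk} and {Egg, Butter} from L2 (the frequent 2-itemsets) gives the candidate {Egg, Milk, Butter}.
• Then check that every pair contained in the candidate is in L2: {Egg, Milk} is ok; {Egg, Butter} is ok; {Milk, Butter} is not in L2, so the candidate is removed.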
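The join-and-check step described here can be sketched as below, using only the two frequent pairs named in the example. The function name `generate_candidates` and its return shape are assumptions made for illustration.

```python
from itertools import combinations

def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then split them into
    candidates whose (k-1)-subsets are all frequent and those that are pruned."""
    prev = set(frequent_prev)
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    kept, pruned = [], []
    for cand in joined:
        subsets = [frozenset(s) for s in combinations(cand, k - 1)]
        if all(s in prev for s in subsets):
            kept.append(cand)
        else:
            pruned.append(cand)
    return kept, pruned

# The two frequent pairs used in the example.
frequent_pairs = [frozenset({"Egg", "Milk"}), frozenset({"Egg", "Butter"})]
kept, pruned = generate_candidates(frequent_pairs, 3)
print("kept:", kept)
print("pruned:", pruned)
```

Here {Egg, Milk, Butter} ends up in `pruned`, because its subset {Milk, Butter} is not among the frequent pairs.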
Example (cont.) • Minimum support = 2/5 = 40%, minimum confidence = 70%
Results
• Egg → Butter: support 60%, confidence 75%
• Butter → Egg: support 60%, confidence 75%
• Milk → Egg: support 40%, confidence 100%
• Baby Powder → Butter: support 40%, confidence 100%
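The second step, rule generation, can be sketched as follows. The support counts are not given explicitly in the slides; they are inferred here from the percentages above (for example, 60% support and 75% confidence for Egg → Butter imply that Egg appears in 4 of the 5 transactions). The name `generate_rules` and the tuple layout are illustrative.

```python
from itertools import combinations

n_transactions = 5
# Support counts inferred from the reported percentages (an assumption).
support_count = {
    frozenset({"Egg"}): 4,
    frozenset({"Butter"}): 4,
    frozenset({"Milk"}): 2,
    frozenset({"Baby Powder"}): 2,
    frozenset({"Egg", "Butter"}): 3,
    frozenset({"Egg", "Milk"}): 2,
    frozenset({"Baby Powder", "Butter"}): 2,
}

def generate_rules(support_count, n, minconf):
    """Return (antecedent, consequent, support, confidence) for each rule
    derived from a frequent itemset whose confidence is at least minconf."""
    rules = []
    for itemset, cnt in support_count.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                ante = frozenset(ante)
                conf = cnt / support_count[ante]
                if conf >= minconf:
                    rules.append((ante, itemset - ante, cnt / n, conf))
    return rules

rules = generate_rules(support_count, n_transactions, 0.7)
for ante, cons, sup, conf in rules:
    print(f"{', '.join(sorted(ante))} -> {', '.join(sorted(cons))}: "
          f"support {sup:.0%}, confidence {conf:.0%}")
```

With a minimum confidence of 70%, this keeps the four rules listed above and drops Egg → Milk and Butter → Baby Powder, which both reach only 50%.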
Try the same example in Weka • Load marketing-list.csv into Weka and run the same example.
Reference: • “Association Rules Apriori Algorithm”, https://dspace.ist.utl.pt/bitstream/2295/55704/1/licao_9.pdf