Learning Fuzzy Association Rules and Associative Classification Rules • Jianchao Han • Computer Science Department • California State University Dominguez Hills
Agenda • Introduction • Traditional Association Rules • Positive and Negative Fuzzy Association Rules • An Illustrative Example • Positive and Negative Fuzzy Associative Classification Rules • Implementation Algorithms • Conclusion WCCI 2006
Introduction • Association • A relationship between data items • Sales data association • If a set of items A occurs in a sales transaction, then another set of items B will likely also occur in the same transaction • Limitations • Data are described in binary attribute values • Only positive associations are pursued • Solutions • Fuzzy attribute values • Negative associations
Traditional Association Rules • Basket data • I={I1, I2, …, Im}, a set of possible items • D={t1, t2, …, tn}, a database of transactions • Each t∈D is represented as a binary vector, with • t[Ik]=1 if t contains Ik • t[Ik]=0 if t does not contain Ik • Support of an itemset • For any X⊂I, t satisfies X if t[Ik]=1 for all Ik∈X • The support of X in D is defined as • Supp(X) = |{t∈D | t satisfies X}| • That is, the number of transactions that satisfy X
Traditional Association Rules • Itemset (binary) association rules • For any X, Y⊂I with X⋂Y=∅, X→Y is an association rule, where • The support of the rule, Supp(X→Y), is the probability of occurrence of X⋃Y in D • The confidence of the rule, Conf(X→Y), is the conditional probability of Y given X • Mining association rules • Look for all possible associations X→Y such that Supp(X→Y) ≥ α (a given threshold) and Conf(X→Y) ≥ β (another given threshold)
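The support and confidence definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy basket data are hypothetical, and each transaction is modeled as a set of items rather than a binary vector.

```python
def supp_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Relative support: fraction of transactions that satisfy `itemset`."""
    return supp_count(itemset, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Conf(X -> Y): estimate of P(Y | X) as Supp(X u Y) / Supp(X)."""
    n_x = supp_count(x, transactions)
    return supp_count(x | y, transactions) / n_x if n_x else 0.0


# Hypothetical toy basket data (not from the slides).
D = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]
X, Y = frozenset({"bread"}), frozenset({"milk"})

print(support(X | Y, D))    # support of the rule bread -> milk
print(confidence(X, Y, D))  # confidence of the rule bread -> milk
```

A rule X→Y would then be reported when both values reach the thresholds α and β.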
Association Rules Mining Algorithm • Two steps • Discovering all frequent itemsets, i.e. those with support ≥ α • Generating association rules • Partition each frequent itemset into two parts, X and Y • Test Conf(X→Y) • Level-wise algorithm • Observation: if X is a frequent itemset, all of its subsets are frequent too • Test all 1-item itemsets • Test all 2-item itemsets that are supersets of frequent 1-item itemsets • Repeat until no new frequent itemsets are found
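The two steps above can be sketched as follows. This is a minimal Apriori-style illustration under stated assumptions, not the implementation behind the slides: supports are absolute counts, `min_count` and `min_conf` stand in for α and β, and the toy data is hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Level-wise search; returns {itemset: support count} for frequent itemsets."""
    def count(c):
        return sum(1 for t in transactions if c <= t)

    # Level 1: test all 1-item itemsets.
    singles = {frozenset([i]) for t in transactions for i in t}
    level = {c: n for c in singles if (n := count(c)) >= min_count}
    frequent = dict(level)
    k = 2
    while level:
        prev = list(level)
        # Join frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = {c: n for c in candidates if (n := count(c)) >= min_count}
        frequent.update(level)
        k += 1
    return frequent


def generate_rules(frequent, min_conf):
    """Partition each frequent itemset into X and Y and keep confident rules."""
    found = []
    for itemset, n in frequent.items():
        for r in range(1, len(itemset)):
            for x in map(frozenset, combinations(itemset, r)):
                conf = n / frequent[x]  # x is frequent by the subset observation
                if conf >= min_conf:
                    found.append((x, itemset - x, conf))
    return found


# Hypothetical toy basket data (not from the slides).
D = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]
freq = frequent_itemsets(D, min_count=2)
for x, y, conf in generate_rules(freq, min_conf=0.7):
    print(sorted(x), "->", sorted(y), conf)
```

The prune step is where the subset observation pays off: a candidate is counted against the database only if all of its smaller subsets already passed the threshold.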
Fuzzy Association Rules • Binary values are extended to the interval [0,1] • Example: the item Tomato belongs to Vegetable to some degree, say 0.7 • Itemset A={A1, A2, …, Al}⊂I, where Ai is a fuzzy subset of I • Support of an itemset A is defined as [formula not preserved] • Support of a rule A→B is [formula not preserved] • Confidence of a rule A→B is [formula not preserved]
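The slide's own formulas for fuzzy support and confidence are not available here, so the sketch below is only an assumption: it uses a common textbook-style definition in which a transaction contributes the t-norm (minimum by default, product as an alternative) of its membership degrees, and confidence is the ratio of two such supports. The names `fuzzy_supp` and `fuzzy_conf` and the sample data are hypothetical.

```python
from math import prod  # Python 3.8+


def fuzzy_supp(itemset, transactions, tnorm=min):
    """ASSUMED definition: sum over transactions of the t-norm of memberships.

    Each transaction maps an item to its membership degree in [0, 1];
    a missing item counts as degree 0. `itemset` must be non-empty.
    """
    return sum(tnorm(t.get(i, 0.0) for i in itemset) for t in transactions)


def fuzzy_conf(a, b, transactions, tnorm=min):
    """ASSUMED definition: Conf(A -> B) = Supp(A u B) / Supp(A)."""
    s_a = fuzzy_supp(a, transactions, tnorm)
    return fuzzy_supp(a | b, transactions, tnorm) / s_a if s_a else 0.0


# Hypothetical memberships, echoing "Tomato belongs to Vegetable with degree 0.7".
D = [
    {"vegetable": 0.7, "fruit": 0.3},
    {"vegetable": 1.0, "fruit": 0.0},
    {"vegetable": 0.2, "fruit": 0.9},
]
A, B = frozenset({"vegetable"}), frozenset({"fruit"})

print(fuzzy_supp(A, D))                  # min t-norm
print(fuzzy_supp(A | B, D, tnorm=prod))  # product t-norm
print(fuzzy_conf(A, B, D))
```

With membership degrees restricted to 0 and 1, both t-norms reduce to the binary support count of the traditional setting.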
Positive vs. Negative Association Rules • Positive association rules • Like A→B • Negative association rules • Like ¬A→B, ¬A→¬B, A→¬B • Different rule-interest measures exist for negative association rules, e.g. • Negative example of A→B is positive example of B→A • A→¬B, if • A⋃B is infrequent • A⋃¬B is frequent • Supp(A⋃¬B) – Supp(A)Supp(¬B) ≥ α • Supp(A⋃¬B)/Supp(A) ≥ β
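The four conditions listed for A→¬B can be checked directly on crisp (binary) data. The sketch below is illustrative only and makes two assumptions that hold for binary transactions: Supp(A⋃¬B) = Supp(A) − Supp(A⋃B) and Supp(¬B) = 1 − Supp(B), with supports taken as fractions of |D|. The function names and sample data are hypothetical.

```python
def supp(itemset, transactions):
    """Relative support of `itemset` in a list of set-valued transactions."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def is_negative_rule(a, b, transactions, alpha, beta):
    """Test the listed conditions for the negative rule A -> not B (crisp data)."""
    s_a = supp(a, transactions)
    s_b = supp(b, transactions)
    s_ab = supp(a | b, transactions)
    s_a_notb = s_a - s_ab   # transactions that satisfy A but not B
    s_notb = 1.0 - s_b      # transactions that do not satisfy B
    return (
        s_a > 0
        and s_ab < alpha                          # A u B is infrequent
        and s_a_notb >= alpha                     # A u not-B is frequent
        and s_a_notb - s_a * s_notb >= alpha      # interest condition
        and s_a_notb / s_a >= beta                # confidence condition
    )


# Hypothetical data: tea and coffee never bought together.
D1 = [frozenset(t) for t in ({"tea"}, {"tea"}, {"coffee"}, {"coffee"})]
# Hypothetical data: tea and coffee often bought together.
D2 = [frozenset(t) for t in ({"tea", "coffee"}, {"tea", "coffee"}, {"tea"}, {"coffee"})]
A, B = frozenset({"tea"}), frozenset({"coffee"})

print(is_negative_rule(A, B, D1, alpha=0.2, beta=0.7))
print(is_negative_rule(A, B, D2, alpha=0.2, beta=0.7))
```

In the first data set tea→¬coffee passes all four conditions; in the second it fails immediately because the pair {tea, coffee} is itself frequent.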
Fuzzy Positive Association Rules • Simple fuzzy extension to traditional association rules • A→B is a fuzzy positive association rule, if • A⋂B = ∅
Fuzzy Negative Association Rules • A→¬B is a negative association rule if • A⋂B = ∅ • Supp(A) ≥ α • Supp(B) ≥ α • Supp(A→B) < α
Fuzzy Negative Association Rules • ¬A→B is a negative association rule if • A⋂B = ∅ • Supp(A) ≥ α • Supp(B) ≥ α • Supp(A→B) < α
Fuzzy Negative Association Rules • ¬A→¬B is a negative association rule if • A⋂B = ∅ • Supp(A) ≥ α • Supp(B) ≥ α • Supp(A→B) < α
Algorithm for Mining Both Positive and Negative Fuzzy Rules • Two steps • Generating all frequent and infrequent itemsets • Extracting fuzzy association rules • Positive rules are extracted from the frequent itemsets • Negative rules are extracted from the infrequent itemsets
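The first step, separating frequent from infrequent itemsets, can be sketched as below. This is a simplified illustration, not the algorithm from the slides: it uses crisp set-valued transactions instead of fuzzy memberships, enumerates candidates by brute force up to a hypothetical `max_len`, and runs on hypothetical data.

```python
from itertools import combinations


def split_by_support(transactions, min_supp, max_len=2):
    """Return (frequent, infrequent) dicts mapping itemset -> relative support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent, infrequent = {}, {}
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            s = sum(1 for t in transactions if x <= t) / n
            # Frequent itemsets feed positive rules, infrequent ones negative rules.
            (frequent if s >= min_supp else infrequent)[x] = s
    return frequent, infrequent


# Hypothetical toy basket data (not the example database from the slides).
D = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]
frequent, infrequent = split_by_support(D, min_supp=0.4)
print(sorted(map(sorted, frequent)))
print(sorted(map(sorted, infrequent)))
```

Keeping the infrequent itemsets is the main departure from the traditional level-wise search, which discards them as soon as they fail the threshold.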
An Example • Transaction Database • Frequent vs. Infrequent Itemsets, with support threshold 40% [tables not preserved]
An Example: Positive Fuzzy Association Rules • Support threshold: 50%, confidence threshold: 70% • Support threshold: 40%, confidence threshold: 75%
An Example: Negative Fuzzy Association Rules • Support threshold: 25%, confidence threshold: 70%
Associative Classification Rules • Associative classification rules are a special subset of association rules whose right-hand side is restricted to the class labels. • In classification, data attributes are partitioned into two categories: condition attributes and decision attributes. • For simplicity, decision attributes are converted into decision attribute-value pairs, which serve as class labels. • Thus, class labels are also items in the database, but are kept separate from condition items.
Two Constraints • The left-hand side of a classification rule must be a frequent itemset of condition attributes, or the negation of an infrequent conditional itemset • The class labels that appear on the right-hand side of classification rules must also be frequent 1-itemsets
Positive Fuzzy Associative Classification Rules • Let A⊂I be an itemset and c∈C be a class label. The relationship A→c is a positive fuzzy associative classification rule if the following conditions hold: 1) A⋃{c} is a frequent itemset in D, Supp(A⋃{c})/|D| ≥ minsupp; 2) A→c is confident, Conf(A→c) = Supp(A⋃{c})/Supp(A) ≥ minconf
Negative Fuzzy Associative Classification Rules • We only consider the format ¬A→c • where A is a frequent itemset, • {c} is a frequent class label, • A⋃{c} is infrequent • ¬A→c is a negative fuzzy associative classification rule if 1) Supp(A) ≥ minsupp; 2) Supp({c}) ≥ minsupp; 3) Supp(A⋃{c})/|D| < minsupp; 4) Supp(¬A⋃{c})/|D| ≥ minsupp; 5) Conf(¬A→c) = Supp(¬A⋃{c})/Supp(¬A) ≥ minconf.
Learning Algorithm • Step 1: Finding the set of frequent conditional itemsets for associative classification rules • Step 2: Inducing both positive and negative fuzzy associative classification rules • Add each frequent class label c to each frequent itemset X • If X⋃{c} is still frequent, then test if X→c is a positive fuzzy association rule; • If X⋃{c} is infrequent, then test if ¬X→c is a negative fuzzy association rule. • A frequent itemset Y is partitioned into two subsets A and B, and the associations ABc and ABc are tested against the support threshold and confidence threshold.
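Steps 1 and 2 can be sketched for crisp data as follows. This is an illustration under stated assumptions rather than the authors' implementation: records are (condition items, class label) pairs, ¬X means that a record does not satisfy X, Step 1 is done by brute force up to a hypothetical `max_len`, `minsupp` is assumed to be greater than 0, and the final partitioning of Y into A and B is not covered. All names and data are hypothetical.

```python
from itertools import combinations


def learn_rules(records, minsupp, minconf, max_len=2):
    """Return (positive, negative) lists of (X, c); negative means not-X -> c."""
    n = len(records)

    def count(x, c=None):
        # Records that satisfy itemset x and, if c is given, carry label c.
        return sum(1 for items, label in records
                   if x <= items and (c is None or label == c))

    items = sorted({i for its, _ in records for i in its})
    labels = sorted({label for _, label in records})

    # Step 1: frequent conditional itemsets and frequent class labels.
    freq = [frozenset(x) for k in range(1, max_len + 1)
            for x in combinations(items, k)
            if count(frozenset(x)) / n >= minsupp]
    freq_labels = [c for c in labels if count(frozenset(), c) / n >= minsupp]

    # Step 2: add each frequent class label to each frequent itemset.
    positive, negative = [], []
    for x in freq:
        n_x = count(x)
        for c in freq_labels:
            n_xc = count(x, c)
            if n_xc / n >= minsupp:
                # X u {c} is frequent: test the positive rule X -> c.
                if n_xc / n_x >= minconf:
                    positive.append((x, c))
            else:
                # X u {c} is infrequent: test the negative rule not-X -> c.
                n_notx_c = count(frozenset(), c) - n_xc
                n_notx = n - n_x
                if (n_notx_c / n >= minsupp and n_notx > 0
                        and n_notx_c / n_notx >= minconf):
                    negative.append((x, c))
    return positive, negative


# Hypothetical labeled records (not from the slides).
records = [
    (frozenset({"hot", "dry"}), "no"),
    (frozenset({"hot", "dry"}), "no"),
    (frozenset({"hot"}), "no"),
    (frozenset({"cool"}), "yes"),
    (frozenset({"cool", "dry"}), "yes"),
    (frozenset({"cool"}), "yes"),
]
positive, negative = learn_rules(records, minsupp=0.3, minconf=0.8)
for x, c in positive:
    print(sorted(x), "->", c)
for x, c in negative:
    print("not", sorted(x), "->", c)
```

Working with counts rather than pre-divided supports keeps the threshold comparisons close to the conditions as stated on the two preceding slides.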
Conclusion • Traditional association rules • Fuzzy extensions and negative rules • Fuzzy associative classification rules • An example • Algorithms