732A02 Data Mining - Clustering and Association Analysis
• Constrained frequent itemset mining
Jose M. Peña, jose.m.pena@liu.se
Constraints
• A constraint C(.) is
  • Monotone
    • If C(A) then C(B) for all A, B such that A ⊆ B.
    • E.g. A' ⊆ A (the itemset A must contain a given itemset A').
  • Antimonotone
    • If C(A) then C(B) for all A, B such that B ⊆ A.
    • Or, if not C(B) then not C(A) for all A, B such that B ⊆ A.
    • E.g. support ≥ min_support.
• The Apriori property applies to any antimonotone constraint.
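The two definitions can be checked by brute force on a small item universe. The following is a minimal sketch, not part of the original slides: the helper names, the toy database `DB`, and the required itemset are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

def all_itemsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def is_monotone(constraint, items):
    """True if C(A) implies C(B) for all A, B with A a subset of B."""
    sets = list(all_itemsets(items))
    return all(constraint(b)
               for a in sets if constraint(a)
               for b in sets if a <= b)

def is_antimonotone(constraint, items):
    """True if C(A) implies C(B) for all A, B with B a subset of A."""
    sets = list(all_itemsets(items))
    return all(constraint(b)
               for a in sets if constraint(a)
               for b in sets if b <= a)

# Hypothetical toy data, assumed for this example only.
ITEMS = {"a", "b", "c", "d"}
DB = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"c", "d"}]
REQUIRED = frozenset({"a", "b"})

def contains_required(itemset):
    """The slide's monotone example: the itemset must contain REQUIRED."""
    return REQUIRED <= itemset

def is_frequent(itemset, min_support=2):
    """The slide's antimonotone example: support >= min_support."""
    return sum(1 for t in DB if itemset <= t) >= min_support

print(is_monotone(contains_required, ITEMS), is_antimonotone(contains_required, ITEMS))
print(is_monotone(is_frequent, ITEMS), is_antimonotone(is_frequent, ITEMS))
```

The check is exponential in the number of items, so it is only useful for building intuition on tiny examples.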
Constraints
• sum(S.Price) ≥ v is monotone (positive prices).
• min(S.Price) ≤ v is monotone.
• range(S.Price) ≥ 15 is monotone.
  • If itemset ab satisfies C, so does every superset of ab.
Constraints
• sum(S.Price) ≤ v is antimonotone (positive prices).
• sum(S.Price) ≥ v is not antimonotone.
• range(S.Price) ≤ 15 is antimonotone.
  • If itemset ab violates C, so does every superset of ab.
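The classifications of the sum, min, and range constraints follow from how each aggregate changes when items are added. A short derivation, assuming all prices are positive and A is non-empty:

```latex
A \subseteq B \;\Longrightarrow\;
\begin{cases}
\operatorname{sum}(A) \le \operatorname{sum}(B) & \text{(positive prices)} \\
\min(A) \ge \min(B), \quad \max(A) \le \max(B) \\
\operatorname{range}(A) = \max(A) - \min(A) \le \max(B) - \min(B) = \operatorname{range}(B)
\end{cases}
```

Hence a lower bound on sum or range, or an upper bound on min, is preserved when moving from A to any superset B (monotone), while an upper bound on sum or range is preserved when moving from B to any subset A (antimonotone).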
Apriori algorithm + any constraint
[Figure: Apriori run on database D. Scan D to get L1 from C1, then L2 from C2, then L3 from C3.]
• Constraint: Sum{S.price} < 5, where item price equals item id
Apriori algorithm + antimonotone constraint
• Prune search space
[Figure: Apriori run on database D. Scan D to get L1 from C1, then L2 from C2, then L3 from C3.]
• Constraint: Sum{S.price} < 5, where item price equals item id
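A minimal sketch of how an antimonotone constraint can be pushed into Apriori: candidates that violate the constraint are discarded before support counting, exactly like candidates with an infrequent subset. The database `DB` and the minimum support of 2 are hypothetical (the slide's figure is not reproduced here); only the constraint and the "price equals item id" convention come from the slide.

```python
from itertools import combinations

def apriori(db, min_support, antimonotone=None):
    """Level-wise mining. Returns ({itemset: support}, candidates counted).

    `antimonotone` is an optional predicate; candidates violating it are
    dropped before their support is counted.
    """
    items = sorted({i for t in db for i in t})
    candidates = [frozenset([i]) for i in items]
    result, counted, k = {}, 0, 1
    while candidates:
        if antimonotone is not None:
            candidates = [c for c in candidates if antimonotone(c)]
        counted += len(candidates)
        # One scan of the database per level.
        level = {}
        for c in candidates:
            support = sum(1 for t in db if c <= t)
            if support >= min_support:
                level[c] = support
        result.update(level)
        # Join step, then Apriori pruning on the (k)-subsets of each candidate.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return result, counted

# Hypothetical database; item price equals item id, as on the slide.
DB = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

def cheap(itemset):
    """Sum{S.price} < 5 with price == item id."""
    return sum(itemset) < 5

unconstrained, n_plain = apriori(DB, 2)
constrained, n_pushed = apriori(DB, 2, cheap)
print(sorted(map(sorted, constrained)), n_plain, n_pushed)
```

On this toy database the constrained run returns the same itemsets as filtering the unconstrained output afterwards, while counting the support of fewer candidates.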
Apriori algorithm + monotone constraint
• Does not prune search space but avoids constraint checking
• Frequent itemsets that do not satisfy the constraint are not in the output.
[Figure: Apriori run on database D. Scan D to get L1 from C1, then L2 from C2, then L3 from C3.]
• Constraint: Sum{S.price} ≥ 5, where item price equals item id
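A minimal sketch of the monotone case: no candidate is pruned, but an itemset inherits "satisfied" from any subset already known to satisfy the constraint, so the constraint is evaluated less often. The database `DB`, the minimum support of 2, and the function names are hypothetical; the constraint is the one from the slide.

```python
from itertools import combinations

def apriori_monotone(db, min_support, monotone):
    """Returns (output, constraint evaluations, number of frequent itemsets).

    `output` holds the frequent itemsets that satisfy `monotone`.
    """
    items = sorted({i for t in db for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent, satisfied, checks, k = {}, set(), 0, 1
    while candidates:
        level = {}
        for c in candidates:
            support = sum(1 for t in db if c <= t)
            if support < min_support:
                continue
            level[c] = support
            # If some (k-1)-subset satisfies the constraint, so does c.
            if any(frozenset(s) in satisfied for s in combinations(c, k - 1)):
                satisfied.add(c)
            else:
                checks += 1
                if monotone(c):
                    satisfied.add(c)
        # All frequent itemsets are kept for candidate generation,
        # including those that do not satisfy the constraint.
        frequent.update(level)
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    output = {c: n for c, n in frequent.items() if c in satisfied}
    return output, checks, len(frequent)

# Hypothetical database; item price equals item id, as on the slide.
DB = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

def expensive(itemset):
    """Sum{S.price} >= 5 with price == item id."""
    return sum(itemset) >= 5

output, checks, n_frequent = apriori_monotone(DB, 2, expensive)
print(sorted(map(sorted, output)), checks, n_frequent)
```

The search space is the same as for plain Apriori; the saving is only in the number of constraint evaluations.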
FP-growth algorithm + antimonotone constraint
• Remove items that do not satisfy the constraint.
• If the conditioning itemset α does not satisfy the constraint, then do not generate α nor its conditional database.
• Let β denote the frequent items in the conditional database of α. If α ∪ β satisfies the constraint, then do not check the constraint in the conditional database of α.
• The first two rules are similar to Apriori (prune search space); the third is specific to FP-growth (avoids constraint checks).
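The three rules can be sketched with a simplified pattern-growth miner. As an assumption made for brevity, conditional databases are stored as plain lists of transactions instead of FP-trees; the recursion and the constraint handling are the same in spirit, but this is not a full FP-growth implementation. The database `DB`, the minimum support, and all names are hypothetical.

```python
def pattern_growth(db, min_support, antimonotone):
    """Returns ({itemset: support}, number of constraint evaluations)."""
    result, stats = {}, {"checks": 0}

    def check(itemset):
        stats["checks"] += 1
        return antimonotone(itemset)

    def grow(alpha, cond_db, skip_checks):
        counts = {}
        for t in cond_db:
            for i in t:
                counts[i] = counts.get(i, 0) + 1
        beta = sorted(i for i, n in counts.items() if n >= min_support)
        # Rule 3: if alpha U beta satisfies C, nothing below needs checking.
        if not skip_checks and beta and check(alpha | frozenset(beta)):
            skip_checks = True
        for pos, item in enumerate(beta):
            new_alpha = alpha | {item}
            # Rule 2: a violating conditioning itemset is not generated.
            # Singletons were already validated by rule 1.
            if not skip_checks and len(new_alpha) > 1 and not check(new_alpha):
                continue
            result[new_alpha] = counts[item]
            # Conditional database of new_alpha: transactions containing
            # `item`, restricted to frequent items that precede it.
            allowed = set(beta[:pos])
            grow(new_alpha, [t & allowed for t in cond_db if item in t], skip_checks)

    # Rule 1: remove items that do not satisfy the constraint.
    items = {i for t in db for i in t}
    valid = {i for i in sorted(items) if check(frozenset([i]))}
    grow(frozenset(), [frozenset(t) & valid for t in db], False)
    return result, stats["checks"]

# Hypothetical database; item price equals item id.
DB = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

tight, _ = pattern_growth(DB, 2, lambda s: sum(s) < 5)
loose, loose_checks = pattern_growth(DB, 2, lambda s: sum(s) < 100)
print(sorted(map(sorted, tight)), len(loose), loose_checks)
```

With the loose constraint, rule 3 fires at the top level, so after the per-item checks of rule 1 only a single further evaluation is needed.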
FP-growth algorithm + monotone constraint
• If the conditioning itemset α satisfies the constraint, then do not check the constraint in its conditional database.
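A minimal sketch of this rule, again using plain transaction lists as conditional databases instead of FP-trees (a simplifying assumption). Once the conditioning itemset α satisfies the constraint, the flag is passed down and no further evaluation happens below α. The database `DB`, the minimum support, and the names are hypothetical.

```python
def pattern_growth_monotone(db, min_support, monotone):
    """Returns ({itemset: support}, number of constraint evaluations)."""
    result, stats = {}, {"checks": 0}

    def grow(alpha, cond_db, alpha_ok):
        counts = {}
        for t in cond_db:
            for i in t:
                counts[i] = counts.get(i, 0) + 1
        beta = sorted(i for i, n in counts.items() if n >= min_support)
        for pos, item in enumerate(beta):
            new_alpha = alpha | {item}
            ok = alpha_ok  # inherited from the conditioning itemset
            if not ok:
                stats["checks"] += 1
                ok = monotone(new_alpha)
            if ok:
                result[new_alpha] = counts[item]
            # No pruning: itemsets failing the constraint are still extended,
            # since a superset may satisfy it.
            allowed = set(beta[:pos])
            grow(new_alpha, [t & allowed for t in cond_db if item in t], ok)

    grow(frozenset(), [frozenset(t) for t in db], False)
    return result, stats["checks"]

# Hypothetical database; item price equals item id.
DB = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

found, n_checks = pattern_growth_monotone(DB, 2, lambda s: sum(s) >= 5)
print(sorted(map(sorted, found)), n_checks)
```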
Constraints
• avg(S.Price) ≤ v and avg(S.Price) ≥ v are neither monotone nor antimonotone.
• Convertible monotone
  • If there exists an item order R such that
    • If C(A) then C(B) for all A and B respecting R such that A is a suffix of B.
  • E.g. avg(S.Price) ≥ v wrt decreasing price order.
• Convertible antimonotone
  • If there exists an item order R such that
    • If C(A) then C(B) for all A and B respecting R such that B is a suffix of A.
    • Or, if not C(B) then not C(A) for all A and B respecting R such that B is a suffix of A.
  • E.g. avg(S.Price) ≥ v wrt increasing price order.
Constraints
• avg(X) ≥ 25 is convertible monotone wrt descending item price order R: <a, f, g, d, b, h, c, e>
  • If an itemset d satisfies a constraint C, so do itemsets fd and afd, which have d as a suffix.
• avg(X) ≥ 25 is convertible antimonotone wrt ascending item price order R⁻¹: <e, c, h, b, d, g, f, a>
  • If an itemset dfa satisfies a constraint C, so do itemsets fa and a, which are suffixes of dfa.
• Thus, avg(X) ≥ 25 is strongly convertible.
• Check that avg(X) ≤ 25 is also strongly convertible.
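The suffix-based definitions can be verified exhaustively for a small price table. In this sketch the prices in `PRICE` are hypothetical values, assumed only so that they agree with the descending order R given above; the function names are also illustrative.

```python
from itertools import combinations

# Hypothetical prices, chosen to be consistent with R = <a, f, g, d, b, h, c, e>.
PRICE = {"a": 40, "f": 30, "g": 20, "d": 10, "b": 0, "h": -10, "c": -20, "e": -30}
R = ["a", "f", "g", "d", "b", "h", "c", "e"]   # descending price
R_INV = list(reversed(R))                      # ascending price

def avg(itemset):
    return sum(PRICE[i] for i in itemset) / len(itemset)

def avg_ge_25(itemset):
    return avg(itemset) >= 25

def avg_le_25(itemset):
    return avg(itemset) <= 25

def ordered_itemsets(order):
    """Every non-empty itemset, written as a tuple that respects `order`."""
    for k in range(1, len(order) + 1):
        yield from combinations(order, k)

def convertible_monotone(constraint, order):
    """C(A) implies C(B) whenever A is a proper suffix of B."""
    return all(constraint(b)
               for b in ordered_itemsets(order)
               for start in range(1, len(b))
               if constraint(b[start:]))

def convertible_antimonotone(constraint, order):
    """C(A) implies C(B) whenever B is a proper suffix of A."""
    return all(constraint(a[start:])
               for a in ordered_itemsets(order) if constraint(a)
               for start in range(1, len(a)))

print(convertible_monotone(avg_ge_25, R), convertible_antimonotone(avg_ge_25, R_INV))
print(convertible_monotone(avg_le_25, R_INV), convertible_antimonotone(avg_le_25, R))
# The order matters: the same constraint fails the test under the wrong order.
print(convertible_monotone(avg_ge_25, R_INV))
```

The last line illustrates why the constraint is only convertible rather than monotone: under the ascending order, a satisfies avg ≥ 25 but ea, which has a as a suffix, does not (for these assumed prices).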
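Constraints
[Figure: classification of constraints into monotone, antimonotone, strongly convertible, convertible antimonotone, convertible monotone, and inconvertible. The constraint avg(S) - median(S) = 0 is shown as an example of an inconvertible constraint.]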