Anomalous Association Rules

Presentation Transcript


  1. Anomalous Association Rules. Máster Oficial en Soft Computing y Sistemas Inteligentes, Universidad de Granada.

  2. Introduction. Association rule: X ⇒ Y. Frequent: Supp(X ⇒ Y) ≡ Supp(X ∪ Y) ≥ ε (5%). Confident: Conf(X ⇒ Y) = Supp(X ∪ Y) / Supp(X) ≥ θ (80%). Goal: find all the frequent and confident associations. Applications: market basket analysis, CRM, etc.
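
The two thresholds on this slide can be illustrated with a minimal Python sketch. It assumes transactions are represented as Python sets of item names; the function names, the toy `baskets` data, and the use of the slide's 5% and 80% values as defaults are illustrative assumptions, not part of the presentation.

```python
from typing import Iterable, List, Set


def support(itemset: Iterable[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = set(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[Set[str]]) -> float:
    """Conf(X => Y) = Supp(X u Y) / Supp(X); 0.0 when X never occurs."""
    supp_x = support(x, transactions)
    if supp_x == 0.0:
        return 0.0
    return support(set(x) | set(y), transactions) / supp_x


def is_frequent_and_confident(x: Iterable[str], y: Iterable[str],
                              transactions: List[Set[str]],
                              eps: float = 0.05, theta: float = 0.80) -> bool:
    """True when X => Y meets both the support and confidence thresholds."""
    return (support(set(x) | set(y), transactions) >= eps
            and confidence(x, y, transactions) >= theta)


# Hypothetical toy data: 10 baskets, 6 contain bread, 5 of those also butter.
baskets = [{"bread", "butter"}] * 5 + [{"bread"}] + [{"milk"}] * 4

print(support({"bread", "butter"}, baskets))
print(confidence({"bread"}, {"butter"}, baskets))
print(is_frequent_and_confident({"bread"}, {"butter"}, baskets))
```

Itemsets are passed as sets (for example `{"bread"}`) rather than bare strings, because a bare string would be split into characters by `set(...)`.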

  3. Introduction. Problem: thousands of rules are found, which is unmanageable for any user, and too many of them are spurious associations. Possible solutions: • Subjective measures • Objective measures. The main problem is the type of knowledge an association rule represents.

  4. Introduction The crucial problem is to determine which kind of events we are interested in, so that we can appropriately characterize them. It is often more interesting to find surprising non-frequent events than frequent ones. The type of interesting events is application dependent

  5. Introduction. • Infrequent itemsets in intrusion detection systems • Exceptions to associations for the detection of conflicting medical therapies • Unusual short sequences of nucleotides in genome sequencing • Etc.

  6. Introduction. Our objective: to introduce the concept of anomalous association rule as a confident rule representing homogeneous deviations from common behavior.

  7. Related Work. Suzuki, Hussain & Suzuki: “Exception Rules”. X ⇒ Y is an association rule. X ∧ I ⇒ ¬Y is the exception rule, where I is the “interacting” itemset. X ⇒ I is the reference rule. Drawback: too many exceptions.

  8. Our Definition. X usually implies Y (dominant rule): X ⇒ Y is frequent and confident. When X does not imply Y, then it usually implies A (the anomaly): X ∧ ¬Y ⇒ A is confident (the anomalous association rule), and X ∧ Y ⇒ ¬A is confident.
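
The three conditions of this definition can be checked directly on a small dataset. The following Python sketch makes several assumptions that the slides do not state: transactions are sets of item names, Y and A are single items, and the same 5% and 80% thresholds from the introduction apply. The function name and the toy records are hypothetical.

```python
from typing import Iterable, List, Set


def is_anomalous_rule(x: Iterable[str], y: str, a: str,
                      transactions: List[Set[str]],
                      min_supp: float = 0.05,
                      min_conf: float = 0.80) -> bool:
    """Check whether 'X => A when not Y' holds as an anomalous rule.

    Conditions tested (Y and A are assumed to be single items):
      1. X => Y is frequent and confident (dominant rule).
      2. X and not Y => A is confident (the anomaly).
      3. X and Y => not A is confident.
    """
    xs = set(x)
    with_x = [t for t in transactions if xs <= t]
    with_xy = [t for t in with_x if y in t]
    with_x_not_y = [t for t in with_x if y not in t]
    if not with_xy or not with_x_not_y:
        return False  # no dominant behaviour, or no deviations to explain

    dominant_supp = len(with_xy) / len(transactions)
    dominant_conf = len(with_xy) / len(with_x)
    anomaly_conf = sum(a in t for t in with_x_not_y) / len(with_x_not_y)
    exclusion_conf = sum(a not in t for t in with_xy) / len(with_xy)

    return (dominant_supp >= min_supp
            and dominant_conf >= min_conf
            and anomaly_conf >= min_conf
            and exclusion_conf >= min_conf)


# Hypothetical toy data: X usually comes with Y; the rest all come with A.
records = [{"x", "y"}] * 17 + [{"x", "a"}] * 3
print(is_anomalous_rule({"x"}, "y", "a", records))

# If the deviations are not homogeneous, the anomaly is not confident.
mixed = [{"x", "y"}] * 17 + [{"x", "a"}] * 2 + [{"x", "b"}]
print(is_anomalous_rule({"x"}, "y", "a", mixed))
```

The second dataset shows why the deviations must be homogeneous: only 2 of the 3 records without Y contain A, so the confidence of the anomaly falls below the assumed 80% threshold.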

  9. Our Definition

  10. Our Definition X Y is the dominant rule

  11. Our Definition X A when ¬ Y is the anomalous rule

  12. Our Definition. Some overlapping cases may appear.

  13. Our Definition. If symptoms-X then disease-Y. If symptoms-X then disease-A when not disease-Y. Disease-A does not occur at the same time as symptoms-X and disease-Y.

  14. Algorithm. Based on TBAR (“Tree based association rules”), Data & Knowledge Engineering (2001), Berzal, Cubero, Marín, Serrano.

  15. Algorithm (assoc. rules). Possible items: A, B, C, D, E, F. L1: 7 instances with A. L2: 6 instances with AB, 5 instances with AD, 6 instances with BC. L3: 5 instances with ABD. [Figure: itemset tree; node labels as extracted: A #7, B #9, C #7, D #8, D #5, B #6, C #6, D #7, D #5, D #5, D #5]

  16. Algorithm (anomalous rules). Possible items: A, B, C, D, E, F. [Figure: itemset tree during the first scan and the second scan; labels as extracted: A #7, AB #6, AC #4, AD #5, AE #3, AF #3, B #9, C #7, D #8, A #7, B #6, D #5, A #7, A*, “Non frequent”]

  17. Algorithm (anomalous rules). Possible items: A, B, C, D, E, F. [Figure: itemset tree during the first scan, the second scan, and candidate generation; labels as extracted: B #9, D #5, C #7, D #8, A #7, B #6, A #7, A*, B #9, B*, C #7, C*, D #8, D*, C #6, D #7, D #5]

  18. Algorithm (anomalous rules). Rule generation: immediate from the frequent items.
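
To make the output of the algorithm concrete, here is a naive brute-force enumeration in Python. It is not the TBAR-based algorithm from the slides, which counts candidates in an itemset tree over a small number of scans; it only lists which (X, Y, A) triples satisfy the definition on a small dataset. The function name, the `max_antecedent` limit, the single-item Y and A, and the thresholds are all assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Set, Tuple

Rule = Tuple[Tuple[str, ...], str, str]  # (antecedent X, usual consequent Y, anomaly A)


def mine_anomalous_rules(transactions: List[Set[str]],
                         min_supp: float = 0.05,
                         min_conf: float = 0.80,
                         max_antecedent: int = 2) -> List[Rule]:
    """Naively enumerate rules 'X => A when not Y' (exponential; demo only)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules: List[Rule] = []
    for size in range(1, max_antecedent + 1):
        for x in combinations(items, size):
            xs = set(x)
            with_x = [t for t in transactions if xs <= t]
            if not with_x:
                continue
            for y in items:
                if y in xs:
                    continue
                with_xy = [t for t in with_x if y in t]
                # Dominant rule X => Y must be frequent and confident.
                if len(with_xy) / n < min_supp:
                    continue
                if len(with_xy) / len(with_x) < min_conf:
                    continue
                without_y = [t for t in with_x if y not in t]
                if not without_y:
                    continue  # no deviations from the dominant rule
                for a in items:
                    if a in xs or a == y:
                        continue
                    anomaly_conf = sum(a in t for t in without_y) / len(without_y)
                    exclusion_conf = sum(a not in t for t in with_xy) / len(with_xy)
                    if anomaly_conf >= min_conf and exclusion_conf >= min_conf:
                        rules.append((x, y, a))
    return rules


# Hypothetical toy data: X usually comes with Y; the exceptions come with A.
toy = [{"x", "y"}] * 17 + [{"x", "a"}] * 3
for antecedent, usual, anomaly in mine_anomalous_rules(toy):
    print(f"if {set(antecedent)} then {anomaly} when not {usual}")
```

A brute-force loop like this re-reads the data for every candidate, which is exactly the cost a tree-based, scan-limited approach is meant to avoid; it is useful only as a reference for checking results on small inputs.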

  19. Experimentation. The “core” of X ⇒ Y|A is Y|A.

  20. Experimentation. X ⇒ Y, where Y is the usual consequent. If X then A when not Y, where A is the “anomaly”: X ∧ ¬Y ⇒ A.

  21. Experimentation. Nursery: if NURSERY:very_crit and HEALTH:priority then CLASS:priority (the “anomaly”; 9 out of 9) when not CLASS:spec_prior (the usual consequent).

  22. Experimentation. Census: if WORKCLASS: Local-gov then CAPGAIN: [99999.0, 99999.0] (the “anomaly”; 7 out of 7) when not CAPGAIN: [0.0, 20051.0] (the usual consequent).

  23. Conclusions. We have introduced an alternative type of interesting knowledge: anomalous association rules. We have given an efficient algorithm to detect all the anomalies.

  24. Conclusions. Future work: to complete the experimentation; to filter the anomalies, eliminating redundant rules; to introduce measures of interest for the anomalies, allowing their ordering.
