Mining General Temporal Association Rules for Items with Different Exhibition

Cheng-Yue Chang, Ming-Syan Chen, Chang-Hung Lee. Proc. of the 2002 IEEE International Conference on Data Mining (ICDM’02). Adviser: Jia-Ling Koh. Speaker: Yu-ting Kung.


Presentation Transcript


  1. Mining General Temporal Association Rules for Items with Different Exhibition • Cheng-Yue Chang, Ming-Syan Chen, Chang-Hung Lee • Proc. of the 2002 IEEE International Conference on Data Mining (ICDM’02) • Adviser: Jia-Ling Koh • Speaker: Yu-ting Kung

  2. Introduction • This paper explores a new model for mining general temporal association rules from a large database in which the exhibition periods of the items are allowed to differ from one another. (See the next slide.)

  3. Introduction (Cont.) • What goes wrong when a conventional mining algorithm is applied to this database? • For example: • min_supp = 30%, min_conf = 75% • With conventional mining, only {A}, {B}, {C} and {F} are frequent itemsets • No association rule is discovered, yet some rules do exist in this database!

  4. Introduction (Cont.) • What is the problem with the conventional mining algorithm? • It does not take the individual exhibition periods of the items into consideration.

  5. Introduction (Cont.) • To allow items to have different exhibition periods, three basic definitions are introduced: • Maximal common exhibition period (MCP) • MCP(X) = [p, q], where X is an itemset, p is the latest exhibition start time and q is the earliest exhibition end time among the items in X • For example (in Figure 1): MCP(BC) = [2, 3]
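
The MCP definition above can be sketched in a few lines of Python. This is only an illustration: the exhibition periods in `EXHIBITION` are assumptions chosen so that MCP(BC) = [2, 3] as stated on the slide, they are not copied from Figure 1, and the function name `mcp` is hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Period = Tuple[int, int]  # (start partition, end partition), both inclusive

# Hypothetical exhibition periods (NOT taken from Figure 1); they are chosen
# only so that the result agrees with MCP(BC) = [2, 3] from the slide.
EXHIBITION: Dict[str, Period] = {
    "B": (2, 4),
    "C": (1, 3),
    "D": (1, 2),
}


def mcp(itemset: Iterable[str], periods: Dict[str, Period]) -> Optional[Period]:
    """Return [p, q]: latest start and earliest end over the items, or None."""
    items = list(itemset)
    p = max(periods[i][0] for i in items)  # latest exhibition start time
    q = min(periods[i][1] for i in items)  # earliest exhibition end time
    return (p, q) if p <= q else None      # None: no common exhibition period


print(mcp("BC", EXHIBITION))   # (2, 3)
print(mcp("BCD", EXHIBITION))  # (2, 2)
```

Returning `None` when p > q is a design choice of this sketch: such an itemset is never exhibited together, so it has no period over which support could be measured.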

  6. Introduction (Cont.) • Relative support • For example (in Figure 1): • Confidence • For example (in Figure 1):
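
The formulas for these two measures are not part of the transcript, so the sketch below rests on an assumed reading: relative support is the fraction of transactions inside a given period that contain the itemset, and confidence is the ratio of the two relative supports measured over the same period. The toy database `DB` and all function names are hypothetical and do not reproduce Figure 1.

```python
from fractions import Fraction
from typing import Dict, FrozenSet, Iterable, List, Tuple

Period = Tuple[int, int]                      # inclusive partition range
Database = Dict[int, List[FrozenSet[str]]]    # partition id -> transactions

# Hypothetical toy data (NOT Figure 1): two partitions, four transactions each.
DB: Database = {
    2: [frozenset("BCD"), frozenset("BD"), frozenset("AB"), frozenset("C")],
    3: [frozenset("B"), frozenset("CF"), frozenset("EF"), frozenset("AB")],
}


def transactions_in(db: Database, period: Period) -> List[FrozenSet[str]]:
    """Collect the transactions of every partition inside the period."""
    p, q = period
    return [t for part in range(p, q + 1) for t in db.get(part, [])]


def relative_support(itemset: Iterable[str], period: Period, db: Database) -> Fraction:
    """Assumed reading: share of the period's transactions containing the itemset."""
    txns = transactions_in(db, period)
    if not txns:
        return Fraction(0)
    x = frozenset(itemset)
    return Fraction(sum(1 for t in txns if x <= t), len(txns))


def confidence(lhs: Iterable[str], rhs: Iterable[str], period: Period, db: Database) -> Fraction:
    """Assumed reading: supp(lhs and rhs) / supp(lhs), both over the same period."""
    lhs_set = frozenset(lhs)
    ante = relative_support(lhs_set, period, db)
    if ante == 0:
        return Fraction(0)
    return relative_support(lhs_set | frozenset(rhs), period, db) / ante


print(relative_support("BC", (2, 3), DB))  # 1/8
print(confidence("B", "D", (2, 2), DB))    # 2/3
```

Exact fractions are used instead of floats so that comparisons against thresholds such as 30% stay free of rounding effects.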

  7. Introduction (Cont.) • Based on the definitions above, the frequent general temporal association rules in this database are:

  8. Introduction (Cont.) • In this model, the “downward closure” property no longer holds. • For example (in Figure 1): itemset BCD is frequent in [2,2], but BC, BD and CD are not all frequent in their corresponding MCPs. For instance, BC’s relative support is only 25% (< 30%).
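
A small sketch of why downward closure can fail once each itemset is measured over its own MCP. The periods and transactions are hypothetical (not Figure 1) and are merely arranged to give BC a relative support of 25%, matching the number on the slide; support is computed under the assumed reading "fraction of transactions within the MCP that contain the itemset".

```python
from typing import Dict, FrozenSet, List, Tuple

MIN_SUPP = 0.30

# Hypothetical exhibition periods and transactions (NOT taken from Figure 1).
PERIODS: Dict[str, Tuple[int, int]] = {"B": (2, 4), "C": (1, 3), "D": (1, 2)}
DB: Dict[int, List[FrozenSet[str]]] = {
    2: [frozenset("BCD"), frozenset("BCD"), frozenset("AB"), frozenset("A")],
    3: [frozenset("B"), frozenset("CF"), frozenset("EF"), frozenset("AB")],
}


def mcp(itemset: str) -> Tuple[int, int]:
    """Latest start and earliest end over the items of the itemset."""
    return (max(PERIODS[i][0] for i in itemset),
            min(PERIODS[i][1] for i in itemset))


def support_in_mcp(itemset: str) -> float:
    """Relative support of the itemset, measured over its own MCP."""
    p, q = mcp(itemset)
    txns = [t for part in range(p, q + 1) for t in DB.get(part, [])]
    return sum(1 for t in txns if frozenset(itemset) <= t) / len(txns)


for x in ("BCD", "BC"):
    s = support_in_mcp(x)
    print(x, mcp(x), s, "frequent" if s >= MIN_SUPP else "not frequent")
# BCD is measured over [2, 2] (4 transactions)  -> 0.5,  frequent
# BC  is measured over [2, 3] (8 transactions)  -> 0.25, not frequent
```

The subset BC has a longer MCP than BCD, so its support is diluted by the extra partition; that is the mechanism behind the lost closure property in this toy setting.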

  9. Problem Description • Maximal temporal itemset (TI): an itemset X taken over its own MCP, written X^{MCP(X)} • For example: • BCD^{2,2} (yes) • BD^{2,2} (yes) • BC^{2,2} (no, since MCP(BC) = [2, 3]) • Temporal sub-itemset (SI) of a maximal temporal itemset • For example: • BCD^{2,2} is a maximal temporal itemset; BD^{2,2}, BC^{2,2} and CD^{2,2} are temporal sub-itemsets of BCD^{2,2}

  10. Problem Description (Cont.) • Frequent maximal temporal itemset • For example (X^{MCP(X)} is a maximal TI): if supp(X^{MCP(X)}) >= min_supp, then X^{MCP(X)} is frequent • Property: all temporal sub-itemsets of a frequent maximal temporal itemset are frequent • General temporal association rule • It will be frequent iff

  11. Mining General Temporal Association Rules ─ SPF Algorithm • SPF consists of two major procedures: • Segmentation (ProcSG) • Progressive filtering (ProcPF) • First, SPF divides the database into partitions according to the time granularity imposed • Second, SPF employs ProcSG • Third, SPF utilizes ProcPF • Then, it generates all candidate k-itemsets from the (k-1)-itemsets, transforms them into TIs, and generates the SIs • Finally, it scans the database to determine all frequent TIs and SIs

  12. SPF Algorithm ─ ProcSG • Segments the database into sub-databases such that the items in each sub-database have either a common starting time or a common ending time • For example: db^{1,6} → db^{1,3}, db^{4,4} and db^{5,6}

  13. SPF Algorithm ─ ProcPF • After the entire database has been segmented by ProcSG, ProcPF progressively filters candidate 2-itemsets from one partition to the next within each sub-database

  14. An Illustrative Example (SPF) • Illustrative example: Figure 1 • min_supp = 30%, min_conf = 75% • Use ProcSG: database → sub-databases • db^{1,4} → db^{1,2} and db^{3,4} (two sub-segments)

  15. An Illustrative Example (SPF) • Use ProcPF: progressively filter the candidate 2-itemsets

  16. An Illustrative Example (SPF) • After the 1st database scan: C2 = {AB, BC, BD, CD, CF, EF} • Generate C3: C3 = {BCD} • Transform to TIs and generate SIs • After the 2nd database scan, the frequent TIs are {AB^{2,4}, BD^{2,2}, CF^{1,3}, EF^{3,3}, BCD^{2,2}}
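
The step from C2 to C3 on this slide can be reproduced with a standard Apriori-style join-and-prune. That SPF generates its candidates exactly this way is an assumption of this sketch, and the function name `apriori_gen` is hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Set


def apriori_gen(prev: Set[FrozenSet[str]], k: int) -> Set[FrozenSet[str]]:
    """Build candidate k-itemsets from candidate (k-1)-itemsets.

    Join: merge two (k-1)-itemsets whose union has exactly k items.
    Prune: keep the union only if every (k-1)-subset is itself in `prev`.
    """
    candidates: Set[FrozenSet[str]] = set()
    for a, b in combinations(prev, 2):
        union = a | b
        if len(union) != k:
            continue
        if all(frozenset(sub) in prev for sub in combinations(union, k - 1)):
            candidates.add(union)
    return candidates


# C2 as listed on the slide after the first database scan.
C2 = {frozenset(s) for s in ("AB", "BC", "BD", "CD", "CF", "EF")}
C3 = apriori_gen(C2, 3)
print(sorted("".join(sorted(c)) for c in C3))  # ['BCD']
```

BCD survives because BC, BD and CD are all in C2, whereas a join such as CD with CF is pruned because DF is not in C2.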

  17. Experiment • Data • |D| = the number of transactions • |T| = the average size of a transaction • |N| = the number of distinct items • |L| = the number of potentially frequent itemsets • Algorithms compared • SPF • AprioriIP

  18. Experiment (Cont.)

  19. Experiment (Cont.)
