
A Parameterised Algorithm for Mining Association Rules

Department of Information & Computer Education, NTNU. A Parameterised Algorithm for Mining Association Rules. Nuansri Denwattana and Janusz R. Getta, Proceedings of the 12th Australasian Database Conference (ADC 2001), 29 Jan.-2 Feb. 2001, pp. 45-51. Advisor: Jia-Ling Koh


Presentation Transcript


  1. Department of Information & Computer Education, NTNU
  A Parameterised Algorithm for Mining Association Rules
  Nuansri Denwattana and Janusz R. Getta, Proceedings of the 12th Australasian Database Conference (ADC 2001), 29 Jan.-2 Feb. 2001, pp. 45-51.
  Advisor: Jia-Ling Koh
  Speaker: Chen-Yi Lin

  2. Outline
  • Introduction
  • Problem Definition
  • Finding Frequent Itemsets
  • Experimental Results
  • Conclusion

  3. Introduction (1/2)
  • The majority of algorithms for finding frequent itemsets count one category of itemsets at a time, e.g. the Apriori algorithm.
  • The quality of association rule mining algorithms is determined by:
    • the number of passes through an input dataset
    • the number of candidate itemsets

  4. Introduction (2/2)
  • One of the objectives is to construct an algorithm that makes a good guess about which itemsets are frequent.
  • The parameterised (n, p) algorithm finds all frequent itemsets from a range of n levels of the itemset lattice in p passes (n >= p) through an input data set.

  5. Problem Definition
  • Positive candidate itemset: assumed (guessed) to be frequent.
  • Negative candidate itemset: assumed (guessed) to be not frequent.
  • Remaining candidate itemset: a candidate that must be verified in another scan.
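The three categories can be illustrated with a short sketch. The class and function names here are assumptions for illustration, not the authors' code; the subset rule follows the later pruning note that all subsets of a positive superset are guessed frequent as well:

```python
from enum import Enum

class Guess(Enum):
    POSITIVE = "assumed frequent"
    NEGATIVE = "assumed not frequent"
    REMAINING = "verified in another scan"

def classify(candidate, positive_guesses):
    """Classify a candidate itemset against the guessed positive supersets.

    Any subset of a guessed-frequent (positive) superset is itself guessed
    frequent; every other candidate is guessed not frequent. A candidate
    whose guess fails verification is later reclassified as REMAINING and
    checked in an extra database scan.
    """
    if any(candidate <= p for p in positive_guesses):
        return Guess.POSITIVE
    return Guess.NEGATIVE

positives = {frozenset({"A", "B", "C"})}
print(classify(frozenset({"A", "B"}), positives).name)  # POSITIVE
print(classify(frozenset({"A", "D"}), positives).name)  # NEGATIVE
```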

  6. Finding Frequent Itemsets (Guessing Candidate Itemsets)
  [Figure: an initial scan of the database builds the statistics table T]

  7. Guessing candidate itemsets from the statistics table T:
  • Item frequency threshold = 80%
  • m-element transaction threshold = 5
  • Number of levels to traverse (n) = 3
  • Number of passes through an input data set (p) = 2
  • 3-element transactions: 5 * 80% = 4 → {B}
  • 4-element transactions: 2 * 80% = 2 → {ABC}
  • 5-element transactions: 3 * 80% = 3 → {BCEF}
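The guessing step above can be sketched as follows. This is a minimal illustration under assumptions: the function name, data layout, and toy transactions are mine, not the paper's; only the cutoff rule (items appearing in at least 80% of the m-element transactions) comes from the slide:

```python
from collections import Counter, defaultdict

def guess_positive_itemsets(transactions, item_freq_threshold=0.8):
    """Guess one positive candidate itemset per transaction length.

    For each transaction length m, every item that appears in at least
    item_freq_threshold of the m-element transactions joins the positive
    guess for that level of the itemset lattice.
    """
    by_length = defaultdict(list)
    for t in transactions:
        by_length[len(t)].append(frozenset(t))

    guesses = {}
    for m, group in by_length.items():
        counts = Counter()
        for t in group:
            counts.update(t)
        cutoff = item_freq_threshold * len(group)  # e.g. 5 * 80% = 4
        frequent_items = {item for item, c in counts.items() if c >= cutoff}
        if frequent_items:
            guesses[m] = frozenset(frequent_items)
    return guesses

# Toy data mirroring the slide: five 3-element transactions in which
# only B clears the 80% cutoff (it appears in 4 of the 5).
txns = [{"A", "B", "C"}, {"B", "D", "E"}, {"B", "E", "F"},
        {"A", "B", "F"}, {"A", "C", "D"}]
print(guess_positive_itemsets(txns))  # {3: frozenset({'B'})}
```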

  8. apriori_gen
  [Figure: candidate generation with apriori_gen; all subsets of a positive superset are pruned]
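apriori_gen is the standard Apriori candidate-generation step: join pairs of frequent (k-1)-itemsets that differ in one item, then prune any k-candidate with an infrequent (k-1)-subset. A minimal sketch (the example itemsets are illustrative, not from the paper):

```python
from itertools import combinations

def apriori_gen(frequent_kminus1):
    """Standard Apriori candidate generation (join + prune)."""
    prev = set(frequent_kminus1)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == len(a) + 1:  # join step: differ in one item
                # prune step: every (k-1)-subset must itself be frequent
                if all(frozenset(s) in prev
                       for s in combinations(union, len(a))):
                    candidates.add(frozenset(union))
    return candidates

freq2 = {frozenset(x) for x in [("A", "B"), ("A", "C"), ("B", "C"), ("B", "E")]}
print(apriori_gen(freq2))  # one surviving candidate: {A, B, C}
```

Candidates such as {A, B, E} are generated by the join but pruned because the subset {A, E} is not frequent.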

  9. Finding Frequent Itemsets (Verification of Candidate Itemsets)
  [Figure: the first DB scan, with minimum support = 20%, verifies the guessed candidates and generates the remaining candidate itemsets]
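The verification pass counts candidate supports in a single scan and splits the candidates into verified-frequent itemsets and failed guesses (which feed the remaining-candidate set). A sketch under assumptions; the function name and example data are mine:

```python
def verify_candidates(transactions, candidates, min_support=0.2):
    """Count candidate supports in one database scan and split the
    candidates into verified-frequent and failed sets."""
    n = len(transactions)
    counts = {c: 0 for c in candidates}
    for t in transactions:  # a single pass over the data
        for c in candidates:
            if c <= t:  # candidate contained in the transaction
                counts[c] += 1
    frequent = {c for c, k in counts.items() if k / n >= min_support}
    return frequent, set(candidates) - frequent

txns = [frozenset(t) for t in
        [("A", "B", "C"), ("B", "C"), ("A", "B"), ("B", "E"), ("C", "E")]]
cands = {frozenset(["B", "C"]), frozenset(["A", "E"])}
freq, failed = verify_candidates(txns, cands)
print(freq)    # {B, C} has support 2/5 = 40% >= 20%
print(failed)  # {A, E} never occurs, so its guess failed
```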

  10. [Figure: apriori_gen and a second DB scan verify the remaining candidate itemsets]

  11. Finding Frequent Itemsets

  12. Experimental Results (1/6)
  • Parameters:
    • ntrans - number of transactions in a database
    • tl - average transaction length
    • np - number of patterns
    • sup - minimum support

  13. Experimental Results (2/6): a comparison of the number of database scans made by the Apriori and (n, p) algorithms

  14. Experimental Results (3/6): performance of Apriori and (n, p) with tl=10, np=10, sup=20%

  15. Experimental Results (4/6): performance of Apriori and the (n, p) algorithm with tl=14, np=10, sup=20%, and with tl=20, np=100, sup=10%

  16. Experimental Results (5/6): performance of (n, 3) with an increasing ratio n/p

  17. Experimental Results (6/6): performance of (8, p) with an increasing parameter p

  18. Conclusion
  • The important contribution is the reduction of the number of scans through a data set.
