
A Tree-based Approach for Frequent Pattern Mining from Uncertain Data

Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk. PAKDD 2008.


Presentation Transcript


  1. A Tree-based Approach for Frequent Pattern Mining from Uncertain Data Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk PAKDD 2008

  2. Outline • Motivation • UF-Growth algorithm • Construction of the UF-Tree • Mining of Frequent Patterns from the UF-Tree • Improvements to UF-Growth algorithm • Experimental Results • Conclusion

  3. Motivation • Over the past decade, there have been numerous studies on mining frequent patterns from precise data. • However, there are situations in which users are uncertain about the presence or absence of some items.

  4. UF-Growth Algorithm • The algorithm consists of two operations: • The construction of the UF-tree • The mining of frequent patterns from the UF-tree

  5. Construction of the UF-Tree • minsup = 1 • [Figure: UF-tree construction, showing two "Scan DB" steps]

  6. Mining of Frequent Patterns from the UF-Tree • expSup({a,e}) = (1*0.72*0.9) + (2*0.71875*0.9) = 1.94175 • expSup({d,e}) = (1*0.72*0.71875) + (2*0.71875*0.72) = 1.5525 • {a,e} and {d,e} are frequent, since both expected supports are at least minsup = 1 • [Figure: {e}-projected DB]
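
The arithmetic on this slide can be reproduced with a short sketch. The encoding of the {e}-projected DB below (a list of occurrence counts with per-item probabilities) is an assumption made for illustration, with the values read off the two sums above; `exp_sup` and `E_PROJECTED_DB` are hypothetical names, not taken from the paper.

```python
from math import prod

# Hypothetical encoding of the {e}-projected DB: each entry is
# (occurrence count, {item: existential probability}).
# The values are inferred from the arithmetic on the slide.
E_PROJECTED_DB = [
    (1, {"e": 0.72, "d": 0.71875, "a": 0.9}),
    (2, {"e": 0.71875, "d": 0.72, "a": 0.9}),
]

MINSUP = 1.0


def exp_sup(pattern, paths):
    """Sum, over all paths containing the pattern, of
    count * product of the probabilities of the pattern's items."""
    total = 0.0
    for count, probs in paths:
        if all(item in probs for item in pattern):
            total += count * prod(probs[item] for item in pattern)
    return total


for pattern in (("a", "e"), ("d", "e")):
    value = exp_sup(pattern, E_PROJECTED_DB)
    print(pattern, round(value, 5), "frequent" if value >= MINSUP else "infrequent")
```

Each term of the sum corresponds to one path: the occurrence count multiplied by the probabilities of the items in the pattern.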

  7. (Cont.) • expSup({d,e}) in the {d,e}-projected DB is 0.71875*0.72 = 0.5175 • expSup({a,d,e}) = 3*0.5175*0.9 = 1.39725 • The frequent patterns found are {a}, {a,d}, {a,d,e}, {a,e}, {b}, {b,c}, {c}, {d}, {d,e}, and {e} • [Figures: {d,e}-projected DB; {e}-projected DB]
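
A minimal sketch of the projection step behind these numbers follows. It assumes a flat-list stand-in for the projected DB rather than an actual tree, and `project` is a hypothetical helper: it multiplies each path's running expected support by the probability of the newly added item and merges paths that become identical, which is how the count of 3 arises here.

```python
from collections import Counter

# Hypothetical flat encoding of the {e}-projected DB:
# (count, expected support of {e} on this path, {remaining item: probability}).
E_PROJECTED_DB = [
    (1, 0.72, {"d": 0.71875, "a": 0.9}),
    (2, 0.71875, {"d": 0.72, "a": 0.9}),
]


def project(paths, item):
    """Extend the current pattern with `item`.

    The running expected support of each path is multiplied by the
    probability of `item`; paths that end up identical are merged by
    adding their counts.
    """
    merged = Counter()
    for count, running, rest in paths:
        if item not in rest:
            continue
        new_running = running * rest[item]
        new_rest = tuple(sorted((i, p) for i, p in rest.items() if i != item))
        merged[(new_running, new_rest)] += count
    return [(count, running, dict(rest)) for (running, rest), count in merged.items()]


de_db = project(E_PROJECTED_DB, "d")   # {d,e}-projected DB
ade_db = project(de_db, "a")           # {a,d,e}-projected DB
exp_sup_ade = sum(count * running for count, running, _ in ade_db)
print(de_db)
print(round(exp_sup_ade, 5))
```

Both paths yield the same product for {d,e}, so they collapse into a single entry with count 3, matching the factor 3 in the slide's formula.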

  8. Improvements to UF-Growth Algorithm • The UF-tree above may appear to require a large amount of memory • Improvement • To increase the chance of path sharing, we discretize and round the expected support of each tree node up to k decimal places
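
The effect of rounding on path sharing can be sketched with a plain prefix tree. This is an illustration under assumptions, not the authors' data structure: nodes are keyed by (item, probability), so two transactions share a node only when both values match; per-node occurrence counts are omitted for brevity; and the two transactions are made-up examples.

```python
def count_nodes(transactions, k=None):
    """Insert transactions into a prefix tree keyed by (item, probability)
    and return the number of nodes created.

    If k is given, each probability is rounded to k decimal places
    before insertion.
    """
    root = {}
    nodes = 0
    for transaction in transactions:
        children = root
        for item, prob in transaction:
            key = (item, prob if k is None else round(prob, k))
            if key not in children:
                children[key] = {}
                nodes += 1
            children = children[key]
    return nodes


# Illustrative transactions (hypothetical data).
transactions = [
    [("a", 0.9), ("d", 0.72), ("e", 0.71875)],
    [("a", 0.9), ("d", 0.71875), ("e", 0.72)],
]

print(count_nodes(transactions))       # without rounding
print(count_nodes(transactions, k=2))  # rounded to 2 decimal places
```

Without rounding, the two transactions share only the node for a; with k = 2, 0.71875 becomes 0.72 and the whole path is shared. The stored values are then approximations, so expected supports computed from the rounded tree are approximate as well.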

  9. (Cont.) • The improved UF-growth does not need to build subsequent UF-trees for any non-singleton patterns. • Instead, for a tree path in the {e}-projected DB, it enumerates all subsets of the path: {a,e}, {a,d,e}, and {d,e}, with expected supports of 0.648, 0.46575, and 0.5175 so far. • After the remaining path is processed in the same way, the accumulated expected supports of {a,e}, {a,d,e}, and {d,e} are 1.94175, 1.39725, and 1.5525. • [Figure: {e}-projected DB]
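
The subset enumeration described on this slide can be sketched as follows. The path encoding and the `accumulate` function are assumptions made for illustration; the probabilities are inferred from the figures on the earlier slides.

```python
from collections import defaultdict
from itertools import combinations
from math import prod

# Hypothetical encoding of the {e}-projected DB:
# (count, probability of e on this path, [(prefix item, probability), ...]).
E_PROJECTED_DB = [
    (1, 0.72, [("a", 0.9), ("d", 0.71875)]),
    (2, 0.71875, [("a", 0.9), ("d", 0.72)]),
]


def accumulate(paths, suffix):
    """For each path, enumerate every non-empty subset of its prefix items,
    append the suffix item, and add the path's contribution to the
    running expected support of that pattern."""
    totals = defaultdict(float)
    for count, suffix_prob, prefix in paths:
        for size in range(1, len(prefix) + 1):
            for subset in combinations(prefix, size):
                pattern = tuple(item for item, _ in subset) + (suffix,)
                totals[pattern] += count * suffix_prob * prod(p for _, p in subset)
    return dict(totals)


first_path_only = accumulate(E_PROJECTED_DB[:1], "e")
all_paths = accumulate(E_PROJECTED_DB, "e")
for pattern, value in sorted(all_paths.items()):
    print(pattern, round(value, 5))
```

Here `first_path_only` holds the "so far" values after one path, and `all_paths` holds the accumulated values, all obtained without building a further projected tree.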

  10. Experimental Results

  11. (Cont.)

  12. Conclusion • Improvement method 1 may cause false positives.
