
Mining Favorable Facets

Presenter: Wei-Hao Huang. Authors: Raymond Chi-Wing Wong, Jian Pei, Ada Wai-Chee Fu, Ke Wang. SIGKDD, 2008. Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments.


Presentation Transcript


  1. Mining Favorable Facets • Presenter: Wei-Hao Huang • Authors: Raymond Chi-Wing Wong, Jian Pei, Ada Wai-Chee Fu, Ke Wang • SIGKDD, 2008

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • Dominance and skyline analysis are important in multi-criteria decision-making applications. • Conventional skyline analysis assumes a fixed order on each attribute, but different customers may have different preferences on nominal attributes. • Goal: finding favorable facets.

  4. Objectives • Propose the minimal disqualifying condition (MDC), which summarizes favorable facets and is meaningful to the user. • Develop two algorithms: • Computing MDC On-the-fly (MDC-O) • A Materialization Method (MDC-M) • Use real and synthetic data sets to verify effectiveness and efficiency.

  5. Methodology • Skyline analysis • Naïve Method • Minimal Disqualifying Conditions (MDC) • MDC On-the-fly (MDC-O) • A Materialization Method (MDC-M)

  6. Skyline analysis
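The slide's figure is not preserved in the transcript. As a reminder of what skyline analysis computes, here is a minimal sketch, assuming every attribute is numeric and smaller values are better. The function names and the sample points are illustrative and not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, ...]

def dominates(q: Point, p: Point) -> bool:
    """q dominates p if q is no worse everywhere and strictly better somewhere."""
    no_worse = all(a <= b for a, b in zip(q, p))
    strictly_better = any(a < b for a, b in zip(q, p))
    return no_worse and strictly_better

def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (quadratic scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical data: (price, distance), lower is better on both.
sample = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
sky = skyline(sample)
```

With this sample, (3, 3) and (4, 4) are dominated by (2, 2), so they drop out of the skyline.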

  7. Naïve Method: Lattice Search
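The slide gives only the title, so the following is a simplified sketch of a brute-force search rather than the paper's lattice search. It assumes a single nominal attribute and enumerates only total orders on its values (the lattice in the talk covers partial orders as well). The data, the `rank` encoding, and the function names are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple

Hotel = Tuple[float, str]  # (price, hotel group); lower price is better

def dominates(q: Hotel, p: Hotel, rank: Dict[str, int]) -> bool:
    """Dominance under one total order; a lower rank means more preferred."""
    no_worse = q[0] <= p[0] and rank[q[1]] <= rank[p[1]]
    better = q[0] < p[0] or rank[q[1]] < rank[p[1]]
    return no_worse and better

def favorable_orders(p: Hotel, data: List[Hotel],
                     values: Sequence[str]) -> List[Tuple[str, ...]]:
    """Try every total order on the nominal values; keep those where p survives."""
    result = []
    for order in permutations(values):
        rank = {v: i for i, v in enumerate(order)}
        if not any(dominates(q, p, rank) for q in data):
            result.append(order)
    return result

# Hypothetical data set with three hotel groups T, H, M.
a, b, f = (100.0, "T"), (120.0, "H"), (150.0, "M")
orders = favorable_orders(f, [a, b, f], ["T", "H", "M"])
```

Even in this reduced form the number of candidate orders grows factorially with the number of nominal values, which is the cost that motivates a more compact summary such as MDC.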

  8. Minimal Disqualifying Conditions • Used to summarize favorable facets effectively. • R' = {(T, M)} • R'' = {(H, M)} • MDC(f) = {(T, M), (H, M)}
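One way to read the example on this slide is that either preference, T over M or H over M, is enough to push f out of the skyline. The sketch below computes such conditions under stated assumptions: numeric attributes are "smaller is better", a pair `(i, u, v)` means "on nominal attribute i, u is preferred to v", the template R is ignored, and a user preference is given as an already transitively closed set of pairs. The data and all names are hypothetical, not the paper's algorithm.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

Point = Tuple[Tuple[float, ...], Tuple[str, ...]]  # (numeric values, nominal values)
Pair = Tuple[int, str, str]                        # (attribute index, better, worse)
Condition = FrozenSet[Pair]

def disqualifying_condition(q: Point, p: Point) -> Optional[Condition]:
    """Pairs a preference must contain for q to dominate p, or None if impossible."""
    if not all(a <= b for a, b in zip(q[0], p[0])):
        return None  # q does not quasi-dominate p on the numeric attributes
    cond = frozenset((i, qv, pv)
                     for i, (qv, pv) in enumerate(zip(q[1], p[1])) if qv != pv)
    if not cond and q[0] == p[0]:
        return None  # identical points do not dominate each other
    return cond

def mdc(p: Point, data: List[Point]) -> Set[Condition]:
    """Keep only the conditions that have no proper subset among the candidates."""
    conds = {c for c in (disqualifying_condition(q, p) for q in data) if c is not None}
    return {c for c in conds if not any(other < c for other in conds)}

def in_skyline(p_mdc: Set[Condition], preference: FrozenSet[Pair]) -> bool:
    """p stays in the skyline unless the preference contains some condition."""
    return not any(c <= preference for c in p_mdc)

# Hypothetical hotels: ((price, distance), (group,)).
a = ((100.0, 2.0), ("T",))
b = ((120.0, 1.0), ("H",))
f = ((150.0, 3.0), ("M",))
c = ((200.0, 5.0), ("M",))
mdc_f = mdc(f, [a, b, f, c])
```

Here `mdc_f` contains two single-pair conditions, one from `a` and one from `b`, so checking whether f is in the skyline for a given customer reduces to two subset tests instead of a fresh skyline computation.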

  9. MDC-O: Computing MDC On-the-fly • Input: point P, data set D, template R -> Process -> Output: MDC(P)

  10. MDC-M: A Materialization Method • Input: data set D, template R -> Process -> Output: SKY(R), MDC

  11. Indexing for Speed-up • Use an R-tree index structure. • An R-tree can be built on the totally ordered attributes T. • To find the points that quasi-dominate p, a range search is conducted on the R-tree.
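The range search can be pictured as a box query: every point that is no worse than p on each totally ordered attribute lies in the box between the origin side of the space and p. The sketch below is not an R-tree. It swaps in a much simpler stand-in, a list sorted on the first attribute and cut with `bisect`, only to show how an index narrows the candidates before the full check. It assumes smaller values are better, and the class name and data are hypothetical.

```python
import bisect
from typing import List, Tuple

Point = Tuple[float, ...]

class SortedIndex:
    """Stand-in for a spatial index: points sorted on their first attribute."""

    def __init__(self, points: List[Point]) -> None:
        self.points = sorted(points)
        self.keys = [pt[0] for pt in self.points]

    def range_search(self, p: Point) -> List[Point]:
        """Return points that are no worse than p on every attribute."""
        # Prune on the first attribute, then verify the remaining ones.
        hi = bisect.bisect_right(self.keys, p[0])
        return [q for q in self.points[:hi]
                if all(a <= b for a, b in zip(q, p))]

# Hypothetical (price, distance) values.
index = SortedIndex([(100, 2), (120, 1), (150, 3), (200, 5), (90, 9)])
candidates = index.range_search((150, 3))
```

The result includes p itself when p is in the index, since the test uses "no worse than"; a caller that needs only other points would filter it out.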

  12. Experiments • Synthetic data set: dimension (numeric attributes, nominal attributes), tuples, template size, cardinality of nominal attributes, Zipfian parameter • Real data sets: Nursery, Automobile

  13. Synthetic Data Set: Dimension (numeric attributes)

  14. Synthetic Data Set: Dimension (nominal attributes)

  15. Synthetic Data Set: Tuples (500k -> 1000k)

  16. Synthetic Data Set: Template Size

  17. Synthetic Data Set: Cardinality of Nominal Attributes

  18. Real Data Set • Nursery Data Set • There are 12,960 instances and 8 attributes. • Performance results are similar to those on the synthetic data sets. • Automobile Data Set • Computation times were negligibly small. • Honda, Mitsubishi and Toyota.

  19. Conclusions • MDC is effective in summarizing the favorable facets. • The experimental results show that the proposed methods are effective. • Extending the approach to dynamic data and ordering is an interesting topic for future work.

  20. Comments • Advantages • Finding favorable facets is a problem that has not been studied before. • The mining is both effective and efficient. • Applications • Information retrieval
