Mining Probabilistically Frequent Sequential Patterns in Uncertain Databases

Zhou Zhao, Da Yan and Wilfred Ng, The Hong Kong University of Science and Technology.

Presentation Transcript


  1. Mining Probabilistically Frequent Sequential Patterns in Uncertain Databases Zhou Zhao, Da Yan and Wilfred Ng The Hong Kong University of Science and Technology

  2. Outline • Background • Problem Definition • Sequential-Level U-PrefixSpan • Element-Level U-PrefixSpan • Experiments • Conclusion

  4. Background • Uncertain data are inherent in many real-world applications • Sensor networks • RFID tracking • [Figure: sensor network example with readings Sensor 1: BC and Sensor 2: AB, annotated with Prob. = 0.9 and Prob. = 0.1]

  5. Background • Uncertain data are inherent in many real-world applications • Sensor networks • RFID tracking • [Figure: RFID tracking example with Readers A, B and C; readings t1: (A, 0.95) and t2: (B, 0.95), (C, 0.05)]

  8. Problem Definition

  9. Pruning rules for p-FSP

  10. Early Validating • If pattern α is p-frequent on D′ ⊆ D, then α is also p-frequent on D • Example: if α is a p-FSP in D11, then α is a p-FSP in D
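
The early-validating rule can be checked numerically with a small sketch. The transcript does not spell out the definition of p-frequent, so the code below assumes the common one: α is p-frequent when Pr[sup(α) ≥ min_sup] ≥ min_prob, with sequences containing α independently of one another. The function name `frequentness_probability` and all numbers are hypothetical and only serve the illustration.

```python
from typing import List


def frequentness_probability(probs: List[float], min_sup: int) -> float:
    """Pr[support >= min_sup], assuming sequence i contains the pattern
    independently with probability probs[i] (assumed model)."""
    # dp[k] = probability that exactly k sequences seen so far contain the
    # pattern; the last slot accumulates every count >= min_sup.
    dp = [1.0] + [0.0] * min_sup
    for p in probs:
        new = [0.0] * (min_sup + 1)
        for k, mass in enumerate(dp):
            new[k] += mass * (1.0 - p)
            new[min(k + 1, min_sup)] += mass * p
        dp = new
    return dp[min_sup]


# Made-up containment probabilities for a database D of four sequences.
full_db = [0.9, 0.8, 0.7, 0.6]
subset = full_db[:3]  # D' is a subset of D

pr_subset = frequentness_probability(subset, min_sup=2)
pr_full = frequentness_probability(full_db, min_sup=2)

# Adding sequences can only raise the support, so the probability measured
# on D' is a lower bound for the probability on D.
print(pr_subset, pr_full)
```

Under these assumptions, once the probability computed on D′ reaches the threshold, the remaining sequences of D need not be examined for α.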

  13. Sequence-level probabilistic model • [Figure: example DB and its possible world space]

  14. Prefix-projection of PrefixSpan • [Figure: database D projected on A to give D|A, then on B to give D|AB]

  15. SeqU-PrefixSpan Algorithm • SeqU-PrefixSpan recursively performs pattern growth from the previous pattern α to the current pattern β = αe by appending a p-frequent element e ∈ D|α • We can stop growing a pattern α as soon as we find that α is p-infrequent
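
The pattern-growth loop described on this slide might be sketched as follows. This is not the authors' implementation: it assumes a sequence-level model in which each uncertain sequence is a list of mutually exclusive (instance, probability) pairs, sequences are independent, items are single characters, and α is p-frequent when Pr[sup(α) ≥ min_sup] ≥ min_prob. The names `frequentness`, `sequ_prefixspan_sketch` and the toy database are hypothetical.

```python
from typing import Dict, List, Tuple

USeq = List[Tuple[str, float]]  # mutually exclusive instances of one sequence


def frequentness(probs: List[float], min_sup: int) -> float:
    """Pr[support >= min_sup] for independent containment probabilities."""
    dp = [1.0] + [0.0] * min_sup  # last slot collects counts >= min_sup
    for p in probs:
        new = [0.0] * (min_sup + 1)
        for k, mass in enumerate(dp):
            new[k] += mass * (1.0 - p)
            new[min(k + 1, min_sup)] += mass * p
        dp = new
    return dp[min_sup]


def sequ_prefixspan_sketch(db: List[USeq], min_sup: int,
                           min_prob: float) -> Dict[str, float]:
    """Return {pattern: frequentness probability} for p-frequent patterns."""
    results: Dict[str, float] = {}

    def grow(pattern: str, projected: List[USeq]) -> None:
        # Candidate elements are the items occurring in the projected suffixes.
        items = sorted({ch for seq in projected for suf, _ in seq for ch in suf})
        for e in items:
            # Project every instance on e: keep the suffix after its first e.
            child = [[(suf[suf.find(e) + 1:], p) for suf, p in seq if e in suf]
                     for seq in projected]
            # A sequence contains pattern+e with the total mass of its
            # surviving instances.
            probs = [sum(p for _, p in seq) for seq in child]
            fr = frequentness(probs, min_sup)
            if fr >= min_prob:  # otherwise stop growing this branch
                results[pattern + e] = fr
                grow(pattern + e, child)

    grow("", [list(seq) for seq in db])
    return results


# Made-up toy database with three uncertain sequences.
toy_db = [
    [("AB", 0.9), ("BC", 0.1)],
    [("AB", 0.6), ("AC", 0.4)],
    [("BC", 1.0)],
]
found = sequ_prefixspan_sketch(toy_db, min_sup=2, min_prob=0.5)
print(found)
```

Stopping at a p-infrequent pattern is safe under these assumptions because the support of any extension αe never exceeds the support of α in any possible world.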

  16. Sequence Projection si A B si|A si|B

  19. Element-level probabilistic model • [Figure: example DB and its possible world space]

  20. Possible world explosion • The number of possible instances is exponential in the sequence length
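
A small sketch makes the blow-up concrete. It assumes an element-level encoding in which every element is a list of alternative (item, probability) pairs whose probabilities sum to 1 and elements are independent; the helper names `instances` and `count_instances` and the sample data are hypothetical.

```python
from itertools import product
from math import prod
from typing import Iterator, List, Tuple

UElement = List[Tuple[str, float]]  # alternative items of one uncertain element


def instances(useq: List[UElement]) -> Iterator[Tuple[str, float]]:
    """Enumerate every instance of an element-level uncertain sequence."""
    for combo in product(*useq):
        items = "".join(item for item, _ in combo)
        yield items, prod(p for _, p in combo)


def count_instances(useq: List[UElement]) -> int:
    """Number of instances: the product of the alternatives per element."""
    return prod(len(element) for element in useq)


# Made-up sequence with three uncertain elements, two alternatives each.
useq = [
    [("A", 0.95), ("B", 0.05)],
    [("B", 0.95), ("C", 0.05)],
    [("A", 0.5), ("C", 0.5)],
]
worlds = list(instances(useq))
total_mass = sum(p for _, p in worlds)

# With two alternatives per element the count doubles with every element.
growth = [count_instances([useq[0]] * n) for n in (5, 10, 20)]
print(len(worlds), total_mass, growth)
```

Enumerating instances explicitly is therefore only feasible for very short sequences, which is the motivation the slide gives for avoiding full expansion.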

  21. ElemU-PrefixSpan Algorithm

  22. Sequence Projection B

  23. Sequence Projection

  24. Sequence Projection A

  25. Sequence Projection A

  28. Efficiency of SeqU-PrefixSpan • Efficiency as a function of: • size of database • number of seq-instances • length of sequence

  29. Efficiency of ElemU-PrefixSpan • Efficiency as a function of: • size of database • number of element-instances • length of sequence

  30. ElemU-PrefixSpan vs. Full Expansion • Efficiency as a function of: • size of database • number of element-instances • length of sequence

  33. Conclusion • We formulate the problem of mining p-FSPs in uncertain databases. • We propose two new U-PrefixSpan algorithms to mine p-FSPs from data that conform to our probabilistic models. • Experiments show that our algorithms effectively avoid the problem of “possible world explosion”.

  34. Thank you!
