
Summarizing Sequential Data with Closed Partial Orders

Summarizing Sequential Data with Closed Partial Orders. Gemma Casas-Garriga. Proceedings of the SIAM International Conference on Data Mining (SDM'05). Advisor: Jia-Ling Koh. Speaker: Chun-Wei Hsieh. 03/10/2006.


Presentation Transcript


  1. Summarizing Sequential Data with Closed Partial Orders • Gemma Casas-Garriga • Proceedings of the SIAM International Conference on Data Mining (SDM'05) • Advisor: Jia-Ling Koh • Speaker: Chun-Wei Hsieh • 03/10/2006

  2. Introduction • Closed patterns form a compact and significant set • The number of closed patterns may still be quite large • Closed patterns can be summarized in a post-processing step

  3. Motivation • <(A)(C)(C)(C)(A)>, <(C)(A)(C)(C)(A)> • Which is better than the other?

  4. Main steps • Grouping Closed Sequential Patterns • Obtaining Closed Partial Orders

  5. Grouping Closed Sequential Patterns • A valid pair (S, T) • S ⊆ CS is a nonredundant set of closed sequences whose tid lists each contain at least T • T ⊆ D is the maximal set of transactions in which every s ∈ S is contained.
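
The two sets in a valid pair can be illustrated with a small sketch. It assumes, for simplicity, that every element of a sequence is a single item (so a sequence can be written as a Python string) and that containment means order-preserving subsequence containment. The toy database `db` and all helper names are hypothetical and do not come from the slides.

```python
from typing import List, Set


def is_subsequence(s: str, t: str) -> bool:
    """True if s occurs in t as an order-preserving, possibly non-contiguous subsequence."""
    it = iter(t)
    # `item in it` advances the iterator, so the items of s must appear in order.
    return all(item in it for item in s)


def tid_list(s: str, db: List[str]) -> Set[int]:
    """Ids (list indices) of the transactions that contain s."""
    return {tid for tid, t in enumerate(db) if is_subsequence(s, t)}


def common_transactions(S: List[str], db: List[str]) -> Set[int]:
    """Maximal set of transactions in which every s in S is contained."""
    tids = set(range(len(db)))
    for s in S:
        tids &= tid_list(s, db)
    return tids


# Hypothetical toy database: three transactions over the items A and C.
db = ["ACCCA", "CACCA", "CAC"]
T = common_transactions(["ACCA", "CAC"], db)
print(tid_list("ACCA", db), tid_list("CAC", db), T)
```

In this toy run both tid lists contain T, which is the "at least T" condition. The sketch does not check that S is nonredundant or that its members are closed.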

  6. Grouping Closed Sequential Patterns

  7. Grouping Closed Sequential Patterns • A naive approach is to group closed sequences with the same tid list • The naive way may miss some elements • Ex: <(C)(A)(C)>
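
The naive grouping can be sketched as a dictionary keyed by tid list. This is only an illustration under the assumption that sequences consist of single items written as strings. The database `db`, the input list `closed` and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List


def is_subsequence(s: str, t: str) -> bool:
    """True if s occurs in t as an order-preserving subsequence."""
    it = iter(t)
    return all(item in it for item in s)


def group_by_tid_list(sequences: List[str], db: List[str]) -> Dict[FrozenSet[int], List[str]]:
    """Naive grouping: sequences with exactly the same tid list share a group."""
    groups = defaultdict(list)
    for s in sequences:
        tids = frozenset(tid for tid, t in enumerate(db) if is_subsequence(s, t))
        groups[tids].append(s)
    return dict(groups)


# Hypothetical toy database and three sequences that are closed in it.
db = ["ACCCA", "CACCA", "CAC"]
closed = ["CACCA", "ACCA", "CAC"]
groups = group_by_tid_list(closed, db)
for tids, members in groups.items():
    print(sorted(tids), members)
```

Each sequence lands in exactly one group. A sequence such as CAC, whose tid list is larger than {1}, is never offered to the group for {1} even though transaction 1 contains it, which appears to be the kind of omission the slide points at.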

  8. Grouping Closed Sequential Patterns • Let (S, T) be a valid pair; then S = ⋂_{t ∈ T} t, the intersection of the transactions in T • For all s ∈ S, tid(s) is at least T • The grouping has to use the transactions of the database

  9. Grouping Closed Sequential Patterns • Given two valid pairs (S′, T′) and (S, T), if T ⊆ T′ then for all s′ ∈ S′ there exists s ∈ S such that s′ ⊆ s.
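
The containment property between two valid pairs can be checked mechanically. The sketch below assumes single-item elements written as strings. The two pairs were worked out by hand for a hypothetical two-transaction database ["ACCCA", "CACCA"] and are not taken from the slides.

```python
from typing import List


def is_subsequence(s: str, t: str) -> bool:
    """True if s occurs in t as an order-preserving subsequence."""
    it = iter(t)
    return all(item in it for item in s)


def covered_by(S_prime: List[str], S: List[str]) -> bool:
    """True if every s' in S_prime is contained in at least one s in S."""
    return all(any(is_subsequence(sp, s) for s in S) for sp in S_prime)


# Hand-derived pairs for the hypothetical database ["ACCCA", "CACCA"].
S_prime, T_prime = ["ACCA", "CCCA"], {0, 1}  # common to both transactions
S, T = ["ACCCA"], {0}                        # specific to transaction 0

print(T <= T_prime, covered_by(S_prime, S), covered_by(S, S_prime))
```

The check only runs in one direction: a smaller transaction set corresponds to longer, more specific sequences, so the sequences of the larger set are covered but not the other way round.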

  10. Grouping Closed Sequential Patterns

  11. Grouping Closed Sequential Patterns

  12. Obtaining Closed Partial Orders • Obtain a compact representation from each valid pair (S, T) • A partial order can be modelled as a triple p = (V, E, l)
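
One way to hold such a triple in code is sketched below. It assumes that V is a set of vertices, E a set of directed edges and l a labelling of vertices with items; the slide does not spell these out, so the field names and the helper `path_order` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class PartialOrder:
    vertices: Set[int]            # V
    edges: Set[Tuple[int, int]]   # E, an edge (u, v) means u comes before v
    label: Dict[int, str]         # l, maps each vertex to its item


def path_order(seq: str) -> PartialOrder:
    """Build the simplest case: a single sequence seen as a chain of vertices."""
    n = len(seq)
    return PartialOrder(
        vertices=set(range(n)),
        edges={(i, i + 1) for i in range(n - 1)},
        label={i: item for i, item in enumerate(seq)},
    )


po = path_order("ACCCA")
print(po.vertices, sorted(po.edges), po.label)
```

A single sequence is a total order, which is a special case of a partial order; merging several sequences would need vertices shared between chains.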

  13. Obtaining Closed Partial Orders • Given a set of sequences S, let s, s′ ∈ S be two sequences s = <I_1 … I_n>, s′ = <I′_1 … I′_m> • If − I_i = I′_j; and, − head(s, i) ⋄ tail(s′, j + 1) ⊆ s″, for some s″ ∈ S; and, − head(s′, j) ⋄ tail(s, i + 1) ⊆ s‴, for some s‴ ∈ S, then position i of s matches position j of s′; denote it by s[i] ∼ s′[j].
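
The matching test can be sketched as follows. The sketch rests on several assumptions that the slide does not state: positions are 1-based, head(s, i) is the first i elements of s, tail(s, j) is everything from position j onward, ⋄ is concatenation, and every element is a single item so that sequences can be written as strings. All function names are hypothetical.

```python
from typing import List, Tuple


def is_subsequence(s: str, t: str) -> bool:
    """True if s occurs in t as an order-preserving subsequence."""
    it = iter(t)
    return all(item in it for item in s)


def head(s: str, i: int) -> str:
    """First i elements of s (positions 1..i)."""
    return s[:i]


def tail(s: str, j: int) -> str:
    """Elements of s from position j (1-based) to the end."""
    return s[j - 1:]


def matches(S: List[str], s: str, i: int, sp: str, j: int) -> bool:
    """Assumed reading of the test: does position i of s match position j of sp?"""
    if s[i - 1] != sp[j - 1]:
        return False
    left = head(s, i) + tail(sp, j + 1)    # head(s, i) ⋄ tail(s', j + 1)
    right = head(sp, j) + tail(s, i + 1)   # head(s', j) ⋄ tail(s, i + 1)
    return (any(is_subsequence(left, u) for u in S)
            and any(is_subsequence(right, u) for u in S))


S = ["ACCCA", "CACCA"]
s, sp = S
matched: List[Tuple[int, int]] = [
    (i, j)
    for i in range(1, len(s) + 1)
    for j in range(1, len(sp) + 1)
    if matches(S, s, i, sp, j)
]
print(matched)
```

Under these assumptions a pair of positions is rejected whenever swapping the tails produces a sequence that no member of S contains, for example one that is longer than every sequence in S.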

  14. Obtaining Closed Partial Orders • S = {<(A)(C)(C)(C)(A)>, <(C)(A)(C)(C)(A)>} • [Figure: head and tail fragments of the two sequences used in the matching test]

  15. Obtaining Closed Partial Orders

  16. Obtaining Closed Partial Orders

  17. Obtaining Closed Partial Orders • Using the transitivity property to improve the algorithm • Transitivity: given a valid pair (S, T), let s, s′, s″ ∈ S; if s[i] ∼ s′[j] and s′[j] ∼ s″[k], then s[i] ∼ s″[k].
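
One possible way to exploit transitivity, not necessarily the one used in the paper, is to keep matched positions in a union-find structure so that a match implied by two earlier matches never has to be tested again. The class name `MatchClasses` and the (sequence name, position) keys are hypothetical.

```python
from typing import Dict, Hashable


class MatchClasses:
    """Union-find over positions; positions in one class are mutually matched."""

    def __init__(self) -> None:
        self.parent: Dict[Hashable, Hashable] = {}

    def find(self, x: Hashable) -> Hashable:
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps the trees shallow.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x: Hashable, y: Hashable) -> None:
        """Record that position x matches position y."""
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry

    def same(self, x: Hashable, y: Hashable) -> bool:
        """True if x and y are already known to match, directly or by transitivity."""
        return self.find(x) == self.find(y)


classes = MatchClasses()
classes.union(("s", 3), ("s'", 3))    # s[3] ~ s'[3] was tested and holds
classes.union(("s'", 3), ("s''", 2))  # s'[3] ~ s''[2] was tested and holds
print(classes.same(("s", 3), ("s''", 2)))
```

With this bookkeeping the third match follows from the first two without another head and tail containment test.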

  18. Simultaneity condition of input sequences

  19. Experiment • Three different sequential databases • Synthetic data (1000 transactions) • The command history of a Unix computer user (607 transactions) • The first chapter of the book “1984” by George Orwell (340 transactions)

  20. Experiment

  21. Experiment

  22. Experiment
