Mining Trajectory Profiles for Discovering User Communities

Mining Trajectory Profiles for Discovering User Communities. Chih-Chieh Hung, Chih-Wen Chang, Wen-Chih Peng. Speaker : Chih-Wen Chang National Chiao Tung University, Taiwan 2009.11.03. Outline. Motivation Goal Framework Preprocess Construct User’s Profiles

Presentation Transcript


  1. Mining Trajectory Profiles for Discovering User Communities Chih-Chieh Hung, Chih-Wen Chang, Wen-Chih Peng Speaker : Chih-Wen Chang National Chiao Tung University, Taiwan 2009.11.03

  2. Outline • Motivation • Goal • Framework • Preprocess • Construct User’s Profiles • Formulate Distance function • Identify Community • Experiments • Conclusion

  3. Motivation (1/2) • With the rapid development of positioning techniques, users can easily collect their trajectories • GPS loggers, smart phones and navigation devices

  4. Motivation (2/2) • Many GPS community sites have been established • Users can share their own trajectories • Users can search (query) trajectories • Examples: Every Trail, My tracks

  5. Goal • Mine user communities from raw trajectories • User Communities • Sets of users who have similar moving behaviors • Applications • Find new friends • Recommendation • Rank of trajectories

  6. Goal (overview) • 1. Construct user’s profile • 2. Formulate distance function: measure the distance between users’ profiles • 3. Identify user communities (Community 1, Community 2, ...)

  7. Outline • Motivation • Goal • Framework • Preprocess • Construct User’s Profiles • Formulate Distance function • Identify Community • Experiments • Conclusion

  8. Framework Preprocess Construct User’s Profile Measure Distance Between Users Identify Community

  9. Preprocessing • Step 1: • Find frequent regions • Input: all trajectories of users • Output: frequent regions • Density-based approach • Step 2: • Transform trajectories into sequences of frequent region ids • Example: T1 : <A, B, D>
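
The two preprocessing steps can be sketched roughly as follows. The slides only say that frequent regions come from a density-based approach; this sketch swaps in a much simpler stand-in (counting points per grid cell), so `cell_size`, `min_points`, the `R0`, `R1`, ... region ids and all function names are assumptions made for illustration, not the authors' method.

```python
from collections import Counter

def cell_of(point, cell_size):
    """Map an (x, y) point to the grid cell that contains it."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))

def find_frequent_regions(trajectories, cell_size, min_points):
    """Step 1 (simplified): a cell is a frequent region if it holds
    at least min_points points over all trajectories."""
    counts = Counter(cell_of(p, cell_size) for t in trajectories for p in t)
    frequent = sorted(c for c, n in counts.items() if n >= min_points)
    # Hypothetical region ids R0, R1, ... assigned in sorted cell order.
    return {cell: f"R{i}" for i, cell in enumerate(frequent)}

def to_region_sequence(trajectory, regions, cell_size):
    """Step 2: replace points by frequent region ids, dropping points
    outside any frequent region and collapsing consecutive repeats."""
    sequence = []
    for point in trajectory:
        region_id = regions.get(cell_of(point, cell_size))
        if region_id is not None and (not sequence or sequence[-1] != region_id):
            sequence.append(region_id)
    return sequence

# Toy data (made up for this sketch).
trajectories = [
    [(0.1, 0.2), (1.5, 0.5), (3.2, 3.1)],
    [(0.3, 0.4), (1.2, 0.9), (5.5, 5.5)],
]
regions = find_frequent_regions(trajectories, cell_size=1.0, min_points=2)
sequences = [to_region_sequence(t, regions, 1.0) for t in trajectories]
```

A real density-based method would let regions take arbitrary shapes; the grid is used here only to keep the example short.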

  10. Framework Preprocess Construct User’s Profile Measure Distance Between Users Identify Community

  11. Construct User’s Profiles (1/2) • User’s Profile • Probabilistic Suffix Tree (abbreviated as PST) • Find and organize trajectory patterns • Record the probability of next movements • Each node: a frequently moving sequence • Each conditional table: the next possible movements

  12. Construct User’s Profiles (2/2) • Construct PST • Level by level • Two operations: • Create a child node • The counts of Before symbol > MinSup • Add a symbol into the related conditional table • The counts of After symbol > MinSup • Example (MinSup = 0.2) • Trajectories: ABE, ABA, AC, B, ADF, H, JHI, EDH • Nodes under root: A:0.5, B:0.375 • Before symbol: A : 2 → 2/3 × 0.375 = 0.25, so child node AB:0.25 is created • After symbol: A : 1 → 1/2 = 0.5, E : 1 → 1/2 = 0.5
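
A minimal sketch of the level-by-level construction, using the eight example trajectories from the slide. It assumes that a node's probability is the fraction of trajectories containing its sequence (this reproduces A:0.5, B:0.375 and AB:0.25), and that conditional-table entries are normalized over the occurrences that have a following symbol. Both are assumptions, as are the function names and the flat dictionary used in place of an explicit tree.

```python
from collections import Counter

def support(pattern, trajectories):
    """Fraction of trajectories containing pattern as a contiguous run."""
    return sum(pattern in t for t in trajectories) / len(trajectories)

def next_symbol_table(pattern, trajectories, min_sup):
    """Conditional table: probability of each symbol right after pattern."""
    counts = Counter()
    for t in trajectories:
        start = t.find(pattern)
        while start != -1:
            end = start + len(pattern)
            if end < len(t):
                counts[t[end]] += 1
            start = t.find(pattern, start + 1)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items() if c / total > min_sup}

def build_pst(trajectories, min_sup):
    """Grow nodes level by level by prepending a 'before' symbol."""
    alphabet = sorted(set("".join(trajectories)))
    pst = {}
    level = [s for s in alphabet if support(s, trajectories) > min_sup]
    while level:
        next_level = []
        for node in level:
            pst[node] = (support(node, trajectories),
                         next_symbol_table(node, trajectories, min_sup))
            for before in alphabet:
                child = before + node
                if support(child, trajectories) > min_sup:
                    next_level.append(child)
        level = next_level
    return pst

trajectories = ["ABE", "ABA", "AC", "B", "ADF", "H", "JHI", "EDH"]
pst = build_pst(trajectories, min_sup=0.2)
```

Here each region id is a single character so that Python's substring operations can stand in for sequence matching; longer ids would need tuples instead of strings.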

  13. Framework Preprocess Construct User’s Profile Measure Distance Between Users Identify Community

  14. Formulate Distance function (1/3) • Determine distance of users • Transform the PST into a Moving Sequence List • Each element in the moving sequence list is a branch of the PST with its probabilities • Example: L1 [1..2] = <[(A,0.5)],[(B,0.375)(AB,0.33)]>

  15. Formulate Distance function (2/3) • Define the distance between PSTs • Find the minimal dist(Li[1..m], Lj[1..n]) • Use three editing operations • Insertion • Before: L1={m1:0.3,m2:0.2,m3:0.3} L2={m1:0.3,m2:0.2} • Insert m3:0.3 into L2, Cost = 0.3 • After: L1={m1:0.3,m2:0.2,m3:0.3} L2={m1:0.3,m2:0.2,m3:0.3}

  16. Formulate Distance function (3/3) • Deletion • Before: L1={m1:0.2,m2:0.3} L2={m1:0.2,m2:0.3,m3:0.3} • Delete m3:0.3 from L2, Cost = 0.3 • After: L1={m1:0.2,m2:0.3} L2={m1:0.2,m2:0.3,____} • Replacement • Before: L1={m1:0.2,m2:0.2,m3:0.2} L2={m1:0.2,m2:0.2,m4:0.3} • Replace m4:0.3 in L2 with m3:0.2, Cost = 0.3+0.2 = 0.5 • After: L1={m1:0.2,m2:0.2,m3:0.2} L2={m1:0.2,m2:0.2,m3:0.2}
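
The three editing operations suggest a weighted edit distance, which could be computed with the usual dynamic program. In this sketch an inserted or deleted element costs its own probability and a replacement costs the sum of both probabilities, matching the slide examples. Charging |p - q| when two elements carry the same sequence with different probabilities is an assumption, as is flattening each moving sequence list into a plain list of (sequence, probability) pairs.

```python
def sequence_list_distance(l1, l2):
    """Minimal total cost of editing one moving sequence list into the other.

    l1 and l2 are lists of (sequence, probability) pairs.
    """
    m, n = len(l1), len(l2)
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + l1[i - 1][1]
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + l2[j - 1][1]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            (a, p), (b, q) = l1[i - 1], l2[j - 1]
            # Same sequence: pay only the probability gap (assumption).
            # Different sequences: replacement costs both probabilities.
            align = abs(p - q) if a == b else p + q
            dist[i][j] = min(dist[i - 1][j] + p,      # element only in l1
                             dist[i][j - 1] + q,      # element only in l2
                             dist[i - 1][j - 1] + align)
    return dist[m][n]

# The insertion and replacement examples from the slides.
insert_cost = sequence_list_distance(
    [("m1", 0.3), ("m2", 0.2), ("m3", 0.3)],
    [("m1", 0.3), ("m2", 0.2)])
replace_cost = sequence_list_distance(
    [("m1", 0.2), ("m2", 0.2), ("m3", 0.2)],
    [("m1", 0.2), ("m2", 0.2), ("m4", 0.3)])
```

Because a replacement is priced as a deletion plus an insertion, the measure is symmetric in its two arguments under these assumptions.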

  17. Framework Preprocess Construct User’s Profile Measure Distance Between Users Identify Community

  18. Identify Community (1/4) • User community • The same community: δMLS(Ti,Tj) < thresholdδ • The number of communities is minimal • Transform the relation between PSTs into a graph • A vertex represents a user • An edge exists between two vertices when δMLS(Ti,Tj) < thresholdδ • Example graph: vertices O1, O2, O3, O4, O5

  19. Identify Community (2/4) • Model as a minimum clique problem: partition the vertices into as few cliques as possible • A clique is a set of pair-wise adjacent vertices • Example: vertices O1, O2, O3, O4, O5
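
Partitioning a graph into the fewest cliques is NP-hard in general, and the slides do not say which algorithm is used. The sketch below is therefore only a greedy heuristic, not the authors' procedure: each user joins the first existing community whose members are all within the threshold, otherwise starts a new one. The edges of the example graph are not recoverable from the transcript, so the pairwise distances here are made up.

```python
def greedy_communities(users, distance, threshold):
    """Greedy clique partition: every pair inside a community
    has distance below threshold."""
    adjacent = {u: {v for v in users
                    if v != u and distance(u, v) < threshold}
                for u in users}
    communities = []
    # Visit well-connected users first so large cliques form early.
    for u in sorted(users, key=lambda x: len(adjacent[x]), reverse=True):
        for clique in communities:
            if all(member in adjacent[u] for member in clique):
                clique.append(u)
                break
        else:
            communities.append([u])
    return communities

# Hypothetical close pairs; every other pair is far apart.
close_pairs = {frozenset(p) for p in
               [("O1", "O2"), ("O1", "O3"), ("O2", "O3"), ("O4", "O5")]}

def toy_distance(u, v):
    return 0.1 if frozenset((u, v)) in close_pairs else 1.0

users = ["O1", "O2", "O3", "O4", "O5"]
communities = greedy_communities(users, toy_distance, threshold=0.5)
```

The greedy result is always a valid partition into cliques, but it is not guaranteed to use the minimum number of communities.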

  20. Identify Community (3/4) • Select a representative PST for each community • Represent all PSTs in the same community • Advantages • Reduce storage overhead • Speed up query processing • Identify the community of a new user: which representative PST should the new user be added to?

  21. Identify Community (4/4) • Two factors • 1. Size of representative PST • The number of tree nodes, denoted as N(Ti) • 2. Distance between the selected PST and others in the same community • The error sum, denoted as ES: sum of the distances between the selected PST and the others • Representative PST • Minimize

  22. Outline • Motivation • Goal • Framework • Preprocess • Construct User’s Profiles • Formulate Distance function • Identify Community • Experiments • Conclusion

  23. Experiments (1/4) • Simulator Model • Use real trajectories from CarWeb to simulate the group mobility of users • Total : 2400 trajectories

  24. Experiments (2/4) • Compare to the Generalized Sequential Pattern (GSP) mining algorithm • Set of sequential patterns • Ex. sp1, sp2, ..., spn • Trajectory profile of a user represented as a vector • Distance function between profiles • Cosine similarity measurement, similarity(Vi, Vj) = (Vi · Vj) / (|Vi| |Vj|) • Example • Similarity : (<1,1,0,0> · <0,1,1,1>) / (|<1,1,0,0>| |<0,1,1,1>|)
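
For reference, the cosine similarity of the two example vectors can be worked out in a few lines. This is the standard definition; returning 0.0 for an all-zero vector is a convention chosen for this sketch, and the function name is illustrative.

```python
import math

def cosine_similarity(vi, vj):
    """Dot product of the two vectors divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(vi, vj))
    norm = math.sqrt(sum(a * a for a in vi)) * math.sqrt(sum(b * b for b in vj))
    return dot / norm if norm else 0.0

# Example vectors from the slide: dot product 1, norms sqrt(2) and sqrt(3).
similarity = cosine_similarity([1, 1, 0, 0], [0, 1, 1, 1])
```

The result equals 1/√6, roughly 0.41, since the two profiles share only one of their sequential patterns.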

  25. Experiments (3/4) • Impact of Trajectory Profiles • GSP profiles are always larger than PST profiles, especially when MinSup is smaller than 0.15 • Figure panels: Storage, Prediction

  26. Experiments (4/4) • Impact of the thresholdδ and MinSup • A smaller thresholdδ yields a larger number of communities • Figure panels: Storage, Prediction

  27. Outline • Motivation • Goal • Framework • Preprocess • Construct User’s Profiles • Formulate Distance function • Identify Community • Experiments • Conclusion

  28. Conclusion • Explore the problem of mining communities from trajectories • Preprocess • Find frequent regions • Replace trajectories by region ids • Construct User’s Profile • Build probabilistic suffix tree (abbreviated as PST) • Measure Distance Between Users • Formulate distance function • Identify Community • Cluster users by distance function • Select Representative PSTs

  29. Thank you!
