
GDC: Group Discovery using Co-location Traces

Steve Mardenfeld, Daniel Boston, Susan Juan Pan, Quentin Jones†, Adriana Iamnitchi‡, Cristian Borcea. Department of Computer Science, New Jersey Institute of Technology; †Department of Information Systems, NJIT; ‡Department of Computer Science, USF


Presentation Transcript


  1. GDC: Group Discovery using Co-location Traces
     Steve Mardenfeld, Daniel Boston, Susan Juan Pan, Quentin Jones†, Adriana Iamnitchi‡, Cristian Borcea
     Department of Computer Science, New Jersey Institute of Technology
     †Department of Information Systems, NJIT
     ‡Department of Computer Science, USF

  2. Physical Groups
     • Informally: groups of people that meet face to face
     • Formal definition: Homans’ sociology book “The Human Group”
     • Groups can be used in social or socially aware applications
       • Recommender systems: recommend concerts to people who go to concerts together
       • Data forwarding in delay-tolerant ad hoc networks: give priority to members of the same group as the destination when selecting the next hop
     • How to detect groups automatically?

  3. Group Detection Using Location Traces
     • Users carry mobile phones and upload their location to a central server
     • The server analyzes location traces to detect groups
     • In previous work, we developed an algorithm for group/place detection
       • Achieved 96% accuracy with low false positives
     • Problems:
       • Location privacy
       • Battery power

  4. GDC: Use Bluetooth Co-location Traces
     (Diagram labels: A, B, C, Internet)
     • Advantages
       • Improved location privacy
       • Low power consumption
       • Practicality due to Bluetooth ubiquity in mobile phones
       • Accuracy due to Bluetooth transmission range

  5. Challenges
     • Attendance at a group is variable
       • People may be merely passing near a group, not remaining part of it
       • Group members spend different lengths of time with the group
     • Sampling frequency and user mobility can affect data completeness
     • Each user may have a different perspective on the same meeting

  6. Outline
     • GDC Algorithm
     • User Study Results
     • Distributed GDC
     • Conclusions

  7. GDC in a Nutshell
     • Transform raw Bluetooth records into meeting records between pairs of users
     • Discover and record all combinations of users appearing at the same meeting (user clusters)
     • Resolve differences in user perspectives on shared clusters
     • Select all significant clusters and output them as user groups

  8. Creating Pair-wise Meeting Records
     • Decreasing the Meeting Granularity (MG) from 5 min to 2½ min produces noticeable changes
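The slide does not define MG precisely. The sketch below is one hedged reading, not the authors' implementation: it assumes MG is the largest gap (in seconds) allowed between two consecutive Bluetooth sightings of the same pair before they count as separate meetings. The function name `build_meetings` and the `(timestamp, observer, observed)` record layout are hypothetical.

```python
from collections import defaultdict

def build_meetings(sightings, mg_seconds=300):
    """Merge raw sightings into pair-wise meeting records.

    sightings: iterable of (timestamp_seconds, observer, observed).
    Returns {(a, b): [(start, end), ...]} with a < b.
    Assumption: sightings of a pair at most mg_seconds apart
    belong to the same meeting.
    """
    by_pair = defaultdict(list)
    for ts, observer, observed in sightings:
        if observer == observed:
            continue  # ignore self-sightings
        pair = tuple(sorted((observer, observed)))
        by_pair[pair].append(ts)

    meetings = {}
    for pair, stamps in by_pair.items():
        stamps.sort()
        spans = []
        start = end = stamps[0]
        for ts in stamps[1:]:
            if ts - end <= mg_seconds:
                end = ts            # extend the current meeting
            else:
                spans.append((start, end))
                start = end = ts    # gap too large: start a new meeting
        spans.append((start, end))
        meetings[pair] = spans
    return meetings

# The same three sightings under two granularities.
scans = [(0, "A", "B"), (200, "B", "A"), (400, "A", "B")]
coarse = build_meetings(scans, mg_seconds=300)  # 5 min
fine = build_meetings(scans, mg_seconds=150)    # 2.5 min
```

Under this reading, the 5-minute setting merges the three sightings into a single meeting, while the 2½-minute setting splits them into three, which is the kind of sensitivity the slide points to.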

  9. Creating User Clusters
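Per the nutshell slide, a user cluster is any combination of users that appear at the same meeting. A minimal sketch of that enumeration from one user's perspective follows. The input layout `(duration_seconds, others_present)`, the function name `user_clusters`, and the `max_size` cap are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def user_clusters(owner, meetings, max_size=5):
    """Enumerate clusters seen by `owner`.

    meetings: list of (duration_seconds, set_of_other_users).
    Returns {frozenset(cluster): (total_seconds, meeting_count)}.
    Every cluster contains the owner plus a non-empty subset of the
    users present at a meeting; max_size caps the cluster size.
    """
    stats = defaultdict(lambda: [0, 0])
    for duration, others in meetings:
        others = sorted(set(others) - {owner})
        largest = min(len(others), max_size - 1)
        for k in range(1, largest + 1):
            for combo in combinations(others, k):
                cluster = frozenset((owner,) + combo)
                stats[cluster][0] += duration  # accumulated time
                stats[cluster][1] += 1         # number of meetings
    return {c: tuple(v) for c, v in stats.items()}

clusters_of_a = user_clusters("A", [(1800, {"B", "C"}), (600, {"B"})])
```

Here the cluster {A, B} accumulates both meetings, while {A, C} and {A, B, C} are credited only with the first one. The number of subsets grows exponentially with the number of users present, which is why a size cap is included.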

  10. Creating Global Clusters
     • Resolve perspective differences
     • Use Minimum Group Time (MGT)
     • Use Minimum Group Meeting Frequency (MGMF)
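The slide does not say how perspective differences are resolved. The sketch below assumes, purely for illustration, that each cluster's time and meeting count are averaged over the members who reported it, and that MGT and MGMF then act as lower bounds. The name `global_clusters` and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def global_clusters(perspectives, mgt=2000, mgmf=2):
    """Merge per-user clusters and keep the significant ones.

    perspectives: {user: {frozenset(cluster): (seconds, meeting_count)}}.
    Assumption: differing reports are reconciled by averaging.
    A cluster is kept if its averaged time >= mgt (seconds) and its
    averaged meeting count >= mgmf.
    """
    reports = defaultdict(list)
    for user, clusters in perspectives.items():
        for cluster, (seconds, count) in clusters.items():
            if user in cluster:  # only members may report a cluster
                reports[cluster].append((seconds, count))

    kept = {}
    for cluster, observed in reports.items():
        seconds = mean(s for s, _ in observed)
        count = mean(c for _, c in observed)
        if seconds >= mgt and count >= mgmf:
            kept[cluster] = (seconds, count)
    return kept

AB, AC = frozenset("AB"), frozenset("AC")
views = {
    "A": {AB: (2400, 2), AC: (1800, 1)},
    "B": {AB: (2000, 2)},
}
significant = global_clusters(views, mgt=2000, mgmf=2)
```

In this example {A, B} survives with an averaged time of 2200 s over 2 meetings, and {A, C} is dropped because it falls below both thresholds. Other reconciliation rules (minimum, maximum, union of intervals) would fit the slide equally well.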

  11. Selecting the User Groups
     • Identify and remove subgroups of significant groups
     • Keep a subgroup if it meets double the time of the group that includes it
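The subgroup rule can be sketched as follows. This is an illustrative reading, assuming the "double the time" test is applied against every larger group that contains the subgroup; the function name `select_groups` and the `factor` parameter are hypothetical.

```python
def select_groups(groups, factor=2.0):
    """Drop subgroups unless they meet much longer than their supergroups.

    groups: {frozenset(members): total_meeting_seconds}.
    A group with no strict superset in `groups` is always kept.
    A subgroup is kept only if its time is at least `factor` times
    the time of every group that strictly contains it.
    """
    selected = {}
    for group, seconds in groups.items():
        superset_times = [t for other, t in groups.items() if group < other]
        if all(seconds >= factor * t for t in superset_times):
            selected[group] = seconds
    return selected

candidates = {
    frozenset("ABC"): 3000,
    frozenset("AB"): 7000,   # 7000 >= 2 * 3000, kept as its own group
    frozenset("AC"): 4000,   # 4000 <  2 * 3000, absorbed by {A, B, C}
}
final_groups = select_groups(candidates)
```

With these numbers, {A, B, C} and {A, B} are reported as groups, while {A, C} is treated as a mere subgroup of {A, B, C}.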

  12. Complexity Analysis
     • R: total number of Bluetooth records
     • N: total number of users in the dataset
     • L: maximum number of users in a group
       • Small value because relatively few users are within the transmission range (10 m)
       • Our experiments: max = 15, avg = 6.8

  13. Evaluation
     • Goals
       • Analyze the effect of group meeting frequency and time
       • Compare GDC and K-Clique
         • K-Clique uses a time threshold to select graph edges and analyzes the graph for k-cliques
     • Experiments
       • Collect data from mobile phones carried by 100+ volunteer students on campus for one month
       • Run GDC and K-Clique on the collected data
       • Also tested on Reality Mining data from MIT
       • Ask users to rank groups using a Likert scale
         • 1 to 5, 5 is best
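The K-Clique baseline, as described on this slide, can be sketched as below. This is a simplification that only enumerates cliques of exactly k users by brute force; it is not the evaluated implementation. The function name `k_cliques` and the `pair_time` input (total co-location seconds per pair) are assumptions.

```python
from itertools import combinations

def k_cliques(pair_time, k=3, threshold=2000):
    """Find all k-cliques in a thresholded co-location graph.

    pair_time: {(user_a, user_b): total_seconds_together}.
    An edge exists when a pair's total time >= threshold.
    Brute-force enumeration; suitable only for small graphs.
    """
    adjacency = {}
    for (a, b), seconds in pair_time.items():
        if seconds >= threshold:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)

    cliques = []
    for nodes in combinations(sorted(adjacency), k):
        # A clique requires an edge between every pair of its nodes.
        if all(v in adjacency[u] for u, v in combinations(nodes, 2)):
            cliques.append(frozenset(nodes))
    return cliques

times = {
    ("A", "B"): 3000,
    ("B", "C"): 2500,
    ("A", "C"): 2100,
    ("C", "D"): 5000,
    ("A", "D"): 100,
}
baseline_groups = k_cliques(times, k=3, threshold=2000)
```

Note that {A, B, C} is reported here from pair-wise totals alone. Nothing in the input says the three users were ever together at the same time, which is one way a clique-based method can output a group that never actually met.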

  14. Data Collection Details
     • 78 users each contributed less than 24 hours of recorded data
     • Sparse data: random volunteers, many students are commuters
     • Demographics: 72% male, 28% female; 25% graduate, 75% undergraduate

  15. Effect of Meeting Time and Frequency
     • Detection accuracy increases significantly with meeting frequency and total meeting time

  16. GDC vs. K-Clique
     • Overall, GDC groups were rated 30% better than those of the popular K-Clique algorithm
       • GDC groups are guaranteed to meet
       • Not all K-Clique groups meet
     • Some GDC groups are rated poorly because members don’t know each other’s names
     • Parameters: GDC: MGT = 2000 s, MGMF = 2; K-Clique: threshold = 2000 s

  17. GDC Groups: NJIT Dataset vs. Reality Mining Dataset
     • Group distributions as a function of size are relatively similar, even though Reality Mining is a denser dataset
     • NJIT: MGT = 2000 s, MGMF = 1
     • Reality Mining: MGT = 18000 s, MGMF = 9 (normalized for 9 months)

  18. Outline
     • GDC Algorithm
     • User Study Results
     • Distributed GDC
     • Conclusions

  19. Distributed GDC (D-GDC)
     • GDC executed on the phones
     • Benefits
       • Better privacy
         • Avoid the “Big Brother” scenario
         • Ability to control message exchange on a per-case basis
       • Resiliency: no bottleneck and no single point of failure
       • Flexibility: each user controls how often to run D-GDC

  20. D-GDC Implementation
     • Collect Bluetooth records locally through message exchange
       • No global aggregation like in GDC
     • Control exchange with heuristic policies
       • These policies can be specified by users
       • Allows greater individual privacy control
     • Run the remainder of GDC locally on the device
     • Evaluated using replay simulation over our real traces

  21. Preliminary Results
     • Overall similarity: compute the similarity of each user’s GDC groups against the closest matches in D-GDC and average the results
     • Compared D-GDC with a version running only on data collected locally by phones
     • D-GDC performs significantly better than the local-only version
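The overall-similarity measure can be sketched as a best-match average. The slide does not name the underlying set similarity, so Jaccard similarity is used here as an assumption; the function names and the per-user dictionary layout are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two member sets (assumed metric)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def overall_similarity(gdc_groups, dgdc_groups):
    """Average best-match similarity of GDC groups against D-GDC groups.

    gdc_groups, dgdc_groups: {user: [set_of_members, ...]}.
    Each of a user's GDC groups is scored against its closest
    D-GDC group for the same user; a group with no candidate
    scores 0. The scores are then averaged over all groups.
    """
    scores = []
    for user, groups in gdc_groups.items():
        candidates = dgdc_groups.get(user, [])
        for group in groups:
            best = max((jaccard(group, c) for c in candidates), default=0.0)
            scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0

central = {"A": [{"A", "B", "C"}], "B": [{"A", "B"}]}
distributed = {"A": [{"A", "B"}, {"A", "D"}], "B": [{"A", "B"}]}
score = overall_similarity(central, distributed)
```

In this example user A's best match scores 2/3 and user B's scores 1, so the overall similarity is 5/6. Averaging per user first, rather than over all groups, would be an equally plausible reading of the slide.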

  22. Conclusion
     • Physical groups enable new socially aware features in applications
     • GDC: practical, high-accuracy, no location collection
       • Validated by users and outperforms K-Clique by 30%
       • Higher accuracy can be achieved by increasing the frequency and time parameters
     • A decentralized version improves privacy and produces promising results

  23. Thank You!
     • Mobius project: http://www.cs.njit.edu/~borcea/mobius/
     • Acknowledgement: NSF grants CNS-0831753 and CNS-0834585
