
Mining Bulletin Board Systems Using Community Generation

Mining Bulletin Board Systems Using Community Generation. Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou. PAKDD’08. Reporter: Che-Wei, Liang. Date: 2008.07.10. Outline: Introduction; General Model; Interest-Sharing Group Identification; Predicting User Behavior Using Generated Community.


Presentation Transcript


  1. Mining Bulletin Board Systems Using Community Generation Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou PAKDD’08 Reporter: Che-Wei, Liang Date: 2008.07.10

  2. Outline • Introduction • General Model • Interest-Sharing Group Identification • Predicting User Behavior Using Generated Community • Experiment

  3. Introduction • Bulletin Board System (BBS) • A platform for exchanging and sharing information • Consists of a number of boards • Users can read and post messages on different topics • Users with similar interests may take similar actions • Effectively discovering relationships between the users of a BBS is essential

  4. General Model • Consider the posted messages • The title is used to determine the topics of a message • Keywords are extracted from the titles • Keywords are mapped to a collected set of topics • A BBS user tends to join discussions on topics that he or she is interested in • The messages a user posts may reflect that user's interests • Users’ interests are time-dependent • The frequency of posted messages should also be assessed
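
The keyword-to-topic step on this slide can be illustrated with a minimal sketch. The lexicon `KEYWORD_TO_TOPIC` and the function `topics_of_title` are hypothetical names invented here; the slides do not say how keywords were extracted or which topics were collected, so plain whitespace tokenization is assumed.

```python
from collections import Counter

# Hypothetical keyword-to-topic lexicon; the slides do not list the real one.
KEYWORD_TO_TOPIC = {
    "python": "programming",
    "compiler": "programming",
    "tank": "modern weapons",
    "missile": "modern weapons",
    "starcraft": "computer games",
}

def topics_of_title(title: str) -> Counter:
    """Count the topics hit by the known keywords found in a message title."""
    words = title.lower().split()  # assumption: simple whitespace tokenization
    return Counter(KEYWORD_TO_TOPIC[w] for w in words if w in KEYWORD_TO_TOPIC)

hits = topics_of_title("Python compiler tricks for StarCraft bots")
```

Here `hits` counts two keyword matches for "programming" and one for "computer games"; a title with no known keywords yields an empty counter.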

  5. General Model • Access pattern of BBS users • View of Topics • A set of topics and user access frequencies of the messages posted to different boards by different users along the timeline • View of Boards • A set of boards and frequencies of messages posted to the boards along the timeline

  6. General Model • BBS model • A collection of users, each represented by two timelines of actions: one on the Boards view and one on the Topics view
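
One possible way to hold this model in memory is sketched below. The class `BBSUser`, its fields, and the integer period index are assumptions made for illustration; the slides do not specify how the timeline is discretized.

```python
from collections import defaultdict
from dataclasses import dataclass, field

def _timeline():
    # Maps period index -> {action name -> number of messages posted}.
    return defaultdict(lambda: defaultdict(int))

@dataclass
class BBSUser:
    """A user represented by two timelines: Boards view and Topics view."""
    user_id: str
    boards: dict = field(default_factory=_timeline)
    topics: dict = field(default_factory=_timeline)

    def record_post(self, period: int, board: str, topic: str) -> None:
        # One posted message updates both views for the same period.
        self.boards[period][board] += 1
        self.topics[period][topic] += 1

u = BBSUser("id_x")
u.record_post(0, "Programming", "python")
u.record_post(0, "Programming", "compilers")
u.record_post(1, "Games", "starcraft")
```

After these calls, `u.boards[0]["Programming"]` is 2 while each topic in period 0 has a count of 1, which shows how the two views can diverge for the same messages.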

  7. Interest-Sharing Group Identification

  8. Interest-Sharing Group Identification • Given two timelines of actions X and Y of two users idx and idy • A straightforward way • Similarity between Xi and Yj =

  9. Interest-Sharing Group Identification • Average frequency differences of actions • Local similarity between Xi and Yj

  10. Interest-Sharing Group Identification • Hybrid similarity between Xi and Y • Global similarity between X and Y
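
The similarity formulas themselves are not preserved in this transcript, so the sketch below is only a stand-in that mirrors the three levels named on the slides (local, hybrid, global). Every definition here is an assumption: local similarity is taken as 1 / (1 + average frequency difference), hybrid similarity as the best local match of one period against a whole timeline, and global similarity as the mean of the hybrid values. The paper's actual measures may differ.

```python
def local_similarity(xi: dict, yj: dict) -> float:
    """Stand-in local measure: 1 / (1 + mean absolute frequency difference)."""
    actions = set(xi) | set(yj)
    if not actions:
        return 1.0
    avg_diff = sum(abs(xi.get(a, 0) - yj.get(a, 0)) for a in actions) / len(actions)
    return 1.0 / (1.0 + avg_diff)

def hybrid_similarity(xi: dict, y: list) -> float:
    """Stand-in hybrid measure: best local match of period xi within timeline y."""
    return max((local_similarity(xi, yj) for yj in y), default=0.0)

def global_similarity(x: list, y: list) -> float:
    """Stand-in global measure: mean hybrid similarity over the periods of x."""
    if not x:
        return 0.0
    return sum(hybrid_similarity(xi, y) for xi in x) / len(x)

# Each timeline is a list of periods; each period maps action -> frequency.
x = [{"python": 3, "games": 1}, {"python": 2}]
y = [{"python": 3, "games": 1}, {"weapons": 4}]
score = global_similarity(x, y)
```

Note that this stand-in is not symmetric in X and Y; a symmetric variant could average `global_similarity(x, y)` and `global_similarity(y, x)`.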

  11. Predict User Behavior Using Generated Community • Given a user idi, • Predict what action idi may take in the near future • Actions that have been taken by idi may be closely related to idi’s future actions • Possible solution • Compute posterior probability

  12. Predict User Behavior Using Generated Community • Resolved with interest-sharing groups • Similar users may take similar actions at some time instants
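
The idea that similar users may take similar actions can be sketched as a simple vote among a user's neighbors in the generated community. This is not the BPUC algorithm and it does not compute a posterior probability; the function `predict_next_action` and both input dictionaries are hypothetical, chosen only to illustrate the intuition.

```python
from collections import Counter

def predict_next_action(user, neighbors, recent_actions):
    """Pick the action taken most often recently by the user's similar users.

    neighbors:      {user_id: [ids of similar users]}   (assumed input)
    recent_actions: {user_id: [recent action names]}    (assumed input)
    Returns None when no neighbor has any recorded action.
    """
    votes = Counter()
    for other in neighbors.get(user, []):
        votes.update(recent_actions.get(other, []))
    if not votes:
        return None
    return votes.most_common(1)[0][0]

neighbors = {"a": ["b", "c"]}
recent_actions = {"b": ["python", "games"], "c": ["python"]}
guess = predict_next_action("a", neighbors, recent_actions)
```

In this toy input, both neighbors of user "a" recently acted on "python", so that action wins the vote.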

  13. BPUC algorithm

  14. Experiment • Data set • BBS of Nanjing University • Messages collected from January 1st, 2003 to December 1st, 2005 on the 17 most popular boards • 4512 topics across 17 boards, 1109 users • Evaluation set • 42 volunteers: 18 interested in modern weapons, 12 interested in programming skills, and the remaining 12 interested in computer games

  15. Experiments on Community Generation • Neighborhood accuracy • Describes how accurately the neighbors of a user in a generated community share that user's interests • Component accuracy • Measures how well the generated groups represent interests that are common to the individuals in each group

  16. Experiments on Community Generation • Example • A generated community, 7 links between similar users, 10 links between dissimilar users • Neighborhood accuracy = (7+10)/21 = 0.810 • Component accuracy = (7+0)/21 = 0.333
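
The arithmetic in this example can be checked directly. The variable names below are made up for illustration, and the denominator 21 is copied from the slide, which does not define it.

```python
similar_links = 7       # links between similar users (from the slide)
dissimilar_links = 10   # links between dissimilar users (from the slide)
total = 21              # denominator used on the slide; not defined there

neighborhood_accuracy = (similar_links + dissimilar_links) / total
component_accuracy = (similar_links + 0) / total

print(f"{neighborhood_accuracy:.3f}")  # prints 0.810
print(f"{component_accuracy:.3f}")     # prints 0.333
```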

  17. Experiments on Community Generation • Compare with CORAL

  18. Experiments on Community Generation

  19. Experiments on Community Generation • Running time comparison

  20. Experiments on User Behavior Prediction • 1056 days for training the probability model • Last 10 days for testing
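
The split sizes are consistent with the collection period stated for the data set, assuming both the first and the last day are counted. The sketch below derives the day counts from those two dates; the variable names are illustrative only.

```python
from datetime import date, timedelta

start, end = date(2003, 1, 1), date(2005, 12, 1)  # collection period from the slides
n_days = (end - start).days + 1                   # assumption: count both endpoints

test_days = 10
train_days = n_days - test_days
first_test_day = end - timedelta(days=test_days - 1)

print(n_days, train_days, first_test_day)
```

Under that assumption the period covers 1066 days, leaving 1056 days for training, with the 10 test days starting on 2005-11-22.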
