This report explores methods for community generation and predicting user behavior in Bulletin Board Systems (BBS). It introduces a general model focused on analyzing user activity over time, allowing for the identification of interest-sharing groups. By extracting and mapping keywords from message titles, the model assesses user interests and predicts future actions based on past behavior. The experiments, conducted on data from Nanjing University's BBS, show the efficacy of collaborative filtering techniques in predicting user behavior and community dynamics.
Mining Bulletin Board Systems Using Community Generation Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou PAKDD’08 Reporter: Che-Wei, Liang Date: 2008.07.10
Outline • Introduction • General Model • Interest-Sharing Group Identification • Predicting User Behavior Using Generated Community • Experiment
Introduction • Bulletin Board System (BBS) • A platform for exchanging and sharing information • Consists of a number of boards • Users can read/post messages on different topics • Users with similar interests may take similar actions • Effective discovery of relationships between users of a BBS is essential
General Model • Consider the posted messages • Use the title to fully determine the topics of a message • Keywords are extracted from titles • Keywords are mapped to the collected topics • A BBS user tends to join discussions on topics that he or she is interested in • Messages that users post may reflect their interests • Users' interests are time-dependent • The frequency of posted messages should also be assessed
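The keyword-to-topic step on this slide can be sketched as below. This is a minimal illustration only: the `KEYWORD_TO_TOPIC` table, the tokenizer, and the function name `topics_of_title` are assumptions made for the example, not the extraction method used in the paper.

```python
import re
from collections import Counter

# Hypothetical keyword-to-topic table (an assumption for illustration);
# a real system would use the topics collected from the BBS itself.
KEYWORD_TO_TOPIC = {
    "python": "programming",
    "compiler": "programming",
    "tank": "modern weapons",
    "missile": "modern weapons",
    "rpg": "computer games",
}


def topics_of_title(title: str) -> Counter:
    """Count how many title keywords map to each known topic."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return Counter(KEYWORD_TO_TOPIC[w] for w in words if w in KEYWORD_TO_TOPIC)


print(topics_of_title("Python compiler question"))
```

Words that are not in the table are simply ignored, so a title with no known keyword yields an empty counter.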
General Model • Access pattern of BBS users • View of Topics • A set of topics and user access frequencies of the messages posted to different boards by different users along the timeline • View of Boards • A set of boards and frequencies of messages posted to the boards along the timeline
General Model • BBS model • A collection of users, each being represented by two timelines of actions on Boards view and Topics view
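One possible way to hold the two timelines per user is sketched below. The class name `UserModel`, the use of integer time steps, and the `record_post` helper are assumptions for illustration; the slides do not specify a concrete data structure.

```python
from collections import defaultdict
from dataclasses import dataclass, field


def _timeline():
    # Maps a time step to {action name: frequency} for that step.
    return defaultdict(lambda: defaultdict(int))


@dataclass
class UserModel:
    """A user represented by a Boards-view and a Topics-view timeline."""
    user_id: str
    boards: dict = field(default_factory=_timeline)
    topics: dict = field(default_factory=_timeline)

    def record_post(self, step: int, board: str, topic: str) -> None:
        """Register one posted message in both views."""
        self.boards[step][board] += 1
        self.topics[step][topic] += 1


u = UserModel("id_x")
u.record_post(0, "Programming", "python")
u.record_post(0, "Programming", "compilers")
u.record_post(1, "Games", "rpg")
print(dict(u.boards[0]), dict(u.topics[0]))
```

Keeping the two views separate mirrors the slide: the same message contributes one count to a board and one count to a topic at the same time step.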
Interest-Sharing Group Identification • Given two timelines of actions X and Y of two users idx and idy • A straightforward way • Similarity between Xi and Yj =
Interest-Sharing Group Identification • Average frequency differences of actions • Local similarity between Xi and Yj
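The formulas for these quantities are not preserved in this transcript, so the sketch below uses a stand-in rather than the paper's definition: it takes the mean absolute frequency difference over the union of actions and maps it to a similarity with `1 / (1 + d)`. Both function names and the mapping are assumptions chosen for illustration.

```python
def avg_freq_difference(x_i: dict, y_j: dict) -> float:
    """Mean absolute frequency difference over the union of actions.

    Stand-in definition (an assumption), not the formula from the paper.
    """
    actions = set(x_i) | set(y_j)
    if not actions:
        return 0.0
    total = sum(abs(x_i.get(a, 0) - y_j.get(a, 0)) for a in actions)
    return total / len(actions)


def local_similarity(x_i: dict, y_j: dict) -> float:
    """Map the average difference into (0, 1]; 1 means identical frequencies."""
    return 1.0 / (1.0 + avg_freq_difference(x_i, y_j))


# Action frequencies of two users at single time steps (made-up numbers).
x_i = {"programming": 3, "games": 1}
y_j = {"programming": 1, "weapons": 2}
print(avg_freq_difference(x_i, y_j), local_similarity(x_i, y_j))
```

Here the differences are 2, 1, and 2 over three actions, so the average difference is 5/3 and the stand-in similarity is 0.375.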
Interest-Sharing Group Identification • Hybrid similarity between Xi and Y • Global similarity between X and Y
Predict User Behavior Using Generated Community • Given a user idi, • Predict what action idi may take in the near future • Actions that have been taken by idi may be closely related to idi’s future actions • Possible solution • Compute posterior probability
Predict User Behavior Using Generated Community • Resolved with interest-sharing groups • Similar users may take similar actions at some time instants
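The idea that similar users take similar actions can be illustrated with a simple neighbor-weighted score, in the spirit of collaborative filtering. This is a sketch under assumptions: the function `predict_actions`, the precomputed `similarity` table, and the scoring rule are hypothetical, and the paper's posterior-probability model is not reproduced here.

```python
from collections import defaultdict


def predict_actions(target, history, similarity, top_k=1):
    """Rank candidate actions by similarity-weighted frequencies of other users."""
    scores = defaultdict(float)
    for other, freqs in history.items():
        if other == target:
            continue
        weight = similarity.get((target, other), 0.0)
        for action, count in freqs.items():
            scores[action] += weight * count
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [action for action, _ in ranked[:top_k]]


# Made-up action frequencies and pairwise similarities.
history = {
    "u1": {"programming": 4},
    "u2": {"programming": 3, "games": 1},
    "u3": {"weapons": 5},
}
similarity = {("u1", "u2"): 0.9, ("u1", "u3"): 0.1}
print(predict_actions("u1", history, similarity, top_k=2))
```

With these numbers the scores are 2.7 for programming, 0.9 for games, and 0.5 for weapons, so the highly similar neighbor `u2` dominates the prediction.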
Experiment • Data Set • BBS of Nanjing University • Messages collected from January 1st, 2003 to December 1st, 2005 on the 17 most popular boards • 4512 topics on 17 boards, 1109 users • Evaluation set • 42 volunteers: 18 interested in modern weapons, 12 fond of programming skills, and the remaining 12 interested in computer games
Experiments on Community Generation • Neighborhood accuracy • Describes how accurately the neighbors of a user in a generated community share that user's interests • Component accuracy • Measures how well the generated groups represent interests that are common to the individuals in each group
Experiments on Community Generation • Example • A generated community, 7 links between similar users, 10 links between dissimilar users • Neighborhood accuracy = (7+10)/21 = 0.810 • Component accuracy = (7+0)/21 = 0.333
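The arithmetic in this example can be checked with a few lines. The helper `ratio` is a hypothetical name, and the snippet only reproduces the numbers on the slide; it does not define what the counts 10, 0, and 21 stand for beyond what the slide states.

```python
def ratio(a: int, b: int, total: int) -> float:
    """Return (a + b) / total rounded to three decimals."""
    return round((a + b) / total, 3)


neighborhood_accuracy = ratio(7, 10, 21)  # 17/21, shown on the slide as 0.810
component_accuracy = ratio(7, 0, 21)      # 7/21, shown on the slide as 0.333
print(neighborhood_accuracy, component_accuracy)
```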
Experiments on Community Generation • Compare with CORAL
Experiments on Community Generation • Running time comparison
Experiments on User Behavior Prediction • 1056 days for training the probability model • Last 10 days for testing