Learning Profiles from User Interactions

The University of Texas at Dallas Learning Profiles from User Interactions Pelin Atahan and Sumit Sarkar School of Management, The University of Texas at Dallas pxa041000@utdallas.edu, sumit@utdallas.edu

Introduction
• Personalization systems tailor content and services to individuals
• Consider a vendor selling products through its website
  • Personalize recommendations
  • Learn profiles based on the links a user visits
• Example: a user visits a link (l) for which 70% of visitors are male
  • predict that the user is male with probability 0.7, and
  • revise this probability as the user navigates the website, i.e., clicks on other links

Research Framework
• Learn profiles for targeting purposes
  • personal profiles: demographic, psychographic, geographic attributes
  • predetermined set of attributes, e.g., gender, income, risk taker
• Profile representation: attribute values with associated probabilities
  • for attribute “gender” (G), the profile may be represented as P(G=m│l)=0.7 and P(G=f│l)=0.3
  • for attribute “risk taker” (R) with values risk taker (r) and conservative (c), P(R=r│l)=0.6 and P(R=c│l)=0.4
• Probabilistic representation

Data Requirements
• Data requirements: link-level statistics only (for all links)
  • examples:
    • P(G=m│“finance” link)=0.7, P(G=f│“finance” link)=0.3
    • P(R=r│“finance” link)=0.6 and P(R=c│“finance” link)=0.4
• Data can be acquired from one of the following sources
  • registered users, if available
  • sampling: explicitly asking a subset of users
  • professional market research agencies such as comScore, Claritas, and Nielsen/Net Ratings

Research Problems
• Learn the personal profile of a user based on links traversed during a session
• Two types of learning considered
  • Learning profiles passively by observing links traversed
  • Learning profiles quickly by dynamically determining links available on a page

Literature Review
• Prior work primarily studies profiling in an information retrieval context
  • user interests
  • identifying interesting pages based on pages visited
  • profiles represented as feature (term) vectors
• Montgomery (2001) addresses learning demographic profiles from websites visited by a user
  • approach is faulty (conditioning is incorrect)
• Baglioni et al. (2003) address identifying the gender of a user based on links visited
  • consider a subset of pages
  • apply several classification models

Passive Learning
• Suppose Yahoo wants to learn the gender of a user who is traversing its website
• The user clicks on the following links
  • the “finance” link (l1)
  • the “investing ideas” link (l2)
  • the “insurance” link (l3)
  • the “sports” link (l4)
• Problem: determine the probability that the visitor is male (or female) given this clickstream {l1, l2, l3, l4}
  • P(G=m│l1, l2, l3, l4)
• In general, for attribute A and clickstream {l1, l2, …, ln}
  • P(A=ai│l1, l2, …, ln)

Passive Learning Cont’d
• Use Bayes’ formula
• Assume conditional independence, i.e., given the user profile, the probability of clicking a link is independent of the probability of clicking any other link
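The formulas on this slide did not survive extraction. One reconstruction consistent with its wording (Bayes' formula, followed by conditional independence of clicks given the attribute value) is sketched below; the notation is an assumption and may differ from the authors' original.

```latex
P(A = a_i \mid l_1, \dots, l_n)
  = \frac{P(l_1, \dots, l_n \mid a_i)\, P(a_i)}
         {\sum_{k} P(l_1, \dots, l_n \mid a_k)\, P(a_k)},
\qquad \text{where} \qquad
P(l_1, \dots, l_n \mid a_i) = \prod_{j=1}^{n} P(l_j \mid a_i).
```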

Passive Learning Cont’d
• After algebraic manipulations, we get:
• We can learn the customer profile from simple link statistics
• The process is not computationally intensive
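The resulting expression is missing from the extracted slide. Under the conditional independence assumption, substituting P(lj│ai) = P(ai│lj)P(lj)/P(ai) into Bayes' formula suggests a posterior proportional to the product of the P(ai│lj) divided by P(ai) raised to the power n-1, since the P(lj) terms cancel on normalization. The Python sketch below implements that reading; the function name and all numbers are hypothetical, not taken from the slides.

```python
from math import prod


def posterior_from_links(prior, link_stats):
    """Posterior over attribute values after a clickstream.

    prior:      {value: P(a)} site-level prior (assumed known).
    link_stats: list of {value: P(a | l_j)}, one dict per clicked link.
    Assumes clicks are conditionally independent given the attribute value.
    """
    n = len(link_stats)
    # Unnormalized score: prod_j P(a | l_j) / P(a)^(n - 1)
    scores = {
        a: prod(stats[a] for stats in link_stats) / prior[a] ** (n - 1)
        for a in prior
    }
    total = sum(scores.values())
    return {a: s / total for a, s in scores.items()}


# Hypothetical site prior and link-level statistics.
prior = {"m": 0.5, "f": 0.5}
clicks = [{"m": 0.7, "f": 0.3}, {"m": 0.6, "f": 0.4}]
posterior = posterior_from_links(prior, clicks)
```

Only one pass over the clicked links is needed, which is in line with the slide's remark that the process is not computationally intensive.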

Illustrative Example
• Consider the following site priors and link probabilities
• P(m│l1, l2, l3, l4)=0.91 and P(f│l1, l2, l3, l4)=0.09

Learning Profiles in Real Time
• What happens when the user clicks on a new link?
  • NBA scoreboard link (l5)
• Incremental belief revision
  • LH denotes the link history (links clicked prior to the last click)
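The revision formula itself is not preserved in the extracted text. Under the conditional independence assumption used for passive learning, one plausible form (a reconstruction, not necessarily the authors' notation) treats the belief given the link history as the current prior and folds in the newly clicked link:

```latex
P(a_i \mid LH, l_{n+1})
  = \frac{P(a_i \mid LH)\, P(a_i \mid l_{n+1}) \,/\, P(a_i)}
         {\sum_{k} P(a_k \mid LH)\, P(a_k \mid l_{n+1}) \,/\, P(a_k)}
```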

Incremental Revision Example
• P(m│LH, l5)=?
• P(m│LH)=P(m│l1, l2, l3, l4)=0.91 and P(f│LH)=0.09
• P(m│l5)=0.65 and P(f│l5)=0.35
• P(m│LH, l5)=0.96 and P(f│LH, l5)=0.04
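A minimal Python sketch of one incremental update, assuming the revised belief is proportional to P(a│LH)·P(a│l5)/P(a). The site prior table is missing from the extracted slides, so P(m)=0.45 below is an assumption chosen purely for illustration; with that value the update lands close to the 0.96 reported on the slide.

```python
def revise_belief(belief, link_stats, prior):
    """One incremental belief-revision step.

    belief:     {value: P(a | LH)} current belief given the link history.
    link_stats: {value: P(a | l_new)} statistics of the newly clicked link.
    prior:      {value: P(a)} site-level prior (ASSUMED; not in the source).
    """
    scores = {a: belief[a] * link_stats[a] / prior[a] for a in belief}
    total = sum(scores.values())
    return {a: s / total for a, s in scores.items()}


belief_lh = {"m": 0.91, "f": 0.09}    # from the slide
nba_link = {"m": 0.65, "f": 0.35}     # from the slide
assumed_prior = {"m": 0.45, "f": 0.55}  # hypothetical site prior

updated = revise_belief(belief_lh, nba_link, assumed_prior)
print(round(updated["m"], 2), round(updated["f"], 2))
```

Because each step needs only the current belief and one link's statistics, the full clickstream does not have to be reprocessed after every click.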

Active Learning of User Profiles
• By learning profiles quickly, websites start getting the benefits sooner
• Learning is the reduction in uncertainty of profile attributes
• Our objective: learn profiles quickly by carefully selecting the links to offer on each page (the offer set)
  • Information value of an offer set is measured as the expected information gain
  • The number of links to offer (n) is predetermined
  • Assume the user will click one of the links available
  • Stop learning when the expected additional information is not statistically significant

Click Probabilities Conditional on an Offer Set
• Offer set O={o1, o2, …, on}
• We estimate P’(lj│ai) for each attribute value and each link in the offer set
• From Bayes’ rule:
• We need some measure of the likelihood of a link being clicked, P(lj)
  • does not need to be absolute; a relative measure is sufficient
  • e.g., the number of clicks a link gets per month
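The Bayes' rule expression is missing from the extracted slide. One reading that fits the surrounding bullets is P’(lj│ai) = P(ai│lj)·P(lj) / Σ P(ai│lk)·P(lk), with the sum taken over the links in the offer set: the P(ai) term cancels, and any relative popularity measure can stand in for P(lj). The Python sketch below follows that reading; the link names, click counts, and function name are hypothetical.

```python
def click_probs_given_value(offer_set, value, p_value_given_link, clicks):
    """Estimate P'(l | a) for every link l in the offer set.

    p_value_given_link: {link: {value: P(a | link)}} link-level statistics.
    clicks:             {link: relative popularity}, e.g. clicks per month.
    Assumes the user clicks exactly one link from the offer set.
    """
    weights = {l: p_value_given_link[l][value] * clicks[l] for l in offer_set}
    total = sum(weights.values())
    return {l: w / total for l, w in weights.items()}


# Hypothetical statistics for two links.
stats = {
    "sports": {"m": 0.8, "f": 0.2},
    "fashion": {"m": 0.3, "f": 0.7},
}
monthly_clicks = {"sports": 1000, "fashion": 500}

probs_m = click_probs_given_value(["sports", "fashion"], "m", stats, monthly_clicks)
```

Scaling every click count by the same constant leaves the result unchanged, which is why a relative measure is sufficient.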

Belief Revision Conditional on an Offer Set
• Belief revision
• Manipulating the above expression we get:
• P’(ai│LH) corresponds to the prior on the attribute value at each iteration
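Both expressions on this slide were lost in extraction. A standard Bayesian form that matches the description (the belief given the link history acting as the prior at each iteration) is given here as an assumption, not as the authors' exact result:

```latex
P'(a_i \mid LH, l_j)
  = \frac{P'(l_j \mid a_i)\, P'(a_i \mid LH)}
         {\sum_{k} P'(l_j \mid a_k)\, P'(a_k \mid LH)}
```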

Information Gain Given a Link is Clicked
• Information gain: defined as the reduction in the entropy of the attribute’s distribution once a link is clicked
• Entropy prior to a click
• Entropy given a link is clicked
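The entropy expressions are not preserved in the extracted slide. A minimal Python sketch using Shannon entropy follows; the base-2 logarithm and the example beliefs are assumptions, since the source does not state which base or numbers it uses.

```python
from math import log2


def entropy(dist):
    """Shannon entropy (in bits) of a distribution {value: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def information_gain(belief_before, belief_after):
    """Reduction in entropy of the attribute's distribution after a click."""
    return entropy(belief_before) - entropy(belief_after)


# Hypothetical beliefs about gender before and after a click.
before = {"m": 0.5, "f": 0.5}
after = {"m": 0.9, "f": 0.1}
gain = information_gain(before, after)
```

The gain for a single click can be negative if that click makes the belief less certain; it is the expectation over possible clicks that is guaranteed to be non-negative.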

Expected Information Gain Given an Offer Set
• When n links are offered
• P’(lj│LH) is the probability of a link being clicked given the offer set
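The expected-gain expression is missing from the extracted slide. The sketch below assumes it is the prior entropy minus the click-probability-weighted average of the posterior entropies, with P’(lj│LH) obtained by summing P’(lj│ai)·P’(ai│LH) over attribute values. Function names, the base-2 logarithm, and all numbers are hypothetical.

```python
from math import log2


def entropy(dist):
    """Shannon entropy (in bits) of a distribution {value: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def expected_information_gain(belief, click_given_value):
    """Expected entropy reduction from offering a set of links.

    belief:            {value: P'(a | LH)} current belief.
    click_given_value: {link: {value: P'(link | a)}}, normalized so that for
                       each value the probabilities over links sum to 1.
    """
    expected_posterior_entropy = 0.0
    for likelihood in click_given_value.values():
        joint = {a: likelihood[a] * belief[a] for a in belief}
        p_click = sum(joint.values())  # P'(l_j | LH)
        if p_click == 0:
            continue
        posterior = {a: j / p_click for a, j in joint.items()}
        expected_posterior_entropy += p_click * entropy(posterior)
    return entropy(belief) - expected_posterior_entropy


belief = {"m": 0.5, "f": 0.5}
informative = {"o1": {"m": 0.9, "f": 0.2}, "o2": {"m": 0.1, "f": 0.8}}
uninformative = {"o1": {"m": 0.5, "f": 0.5}, "o2": {"m": 0.5, "f": 0.5}}

gain_informative = expected_information_gain(belief, informative)
gain_uninformative = expected_information_gain(belief, uninformative)
```

An offer set whose links are clicked equally often by every attribute value yields zero expected gain, which is the case the selection step tries to avoid.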

Optimal Offer Set: One-Step Look-Ahead
• Prior entropy is constant given the link history
• We can determine the optimal offer set as the one that minimizes the expected entropy

Illustrative Example
• The user has visited the “finance” link
• There are three possible links to consider
• Offer set size n=2
• Three possible offer sets: O1={o1, o2}, O2={o1, o3}, O3={o2, o3}
  • EI(G│lj, O1)=0.06
  • EI(G│lj, O2)=0.18
  • EI(G│lj, O3)=0.04
• Offering O2 is optimal
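The exhaustive one-step look-ahead search can be sketched in Python as follows. The link statistics behind the slide's figures are not available, so the numbers here are hypothetical and do not reproduce the EI values above; the click model (normalizing P(a│l)·clicks within the offer set) and the base-2 entropy are likewise assumptions.

```python
from itertools import combinations
from math import log2


def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def expected_gain(offer_set, belief, p_value_given_link, clicks):
    """Expected entropy reduction if exactly one link in offer_set is clicked."""
    # Normalizer of P'(l | a) over the offer set, per attribute value.
    norm = {
        a: sum(p_value_given_link[l][a] * clicks[l] for l in offer_set)
        for a in belief
    }
    expected_posterior_entropy = 0.0
    for link in offer_set:
        joint = {
            a: belief[a] * p_value_given_link[link][a] * clicks[link] / norm[a]
            for a in belief
        }
        p_click = sum(joint.values())
        if p_click == 0:
            continue
        posterior = {a: j / p_click for a, j in joint.items()}
        expected_posterior_entropy += p_click * entropy(posterior)
    return entropy(belief) - expected_posterior_entropy


def best_offer_set(links, n, belief, p_value_given_link, clicks):
    """Evaluate every size-n subset and return the one with the highest gain."""
    return max(
        combinations(links, n),
        key=lambda o: expected_gain(o, belief, p_value_given_link, clicks),
    )


# Hypothetical statistics for three candidate links.
stats = {
    "o1": {"m": 0.5, "f": 0.5},
    "o2": {"m": 0.9, "f": 0.1},
    "o3": {"m": 0.1, "f": 0.9},
}
clicks = {"o1": 100, "o2": 100, "o3": 100}
belief = {"m": 0.5, "f": 0.5}

best = best_offer_set(["o1", "o2", "o3"], 2, belief, stats, clicks)
```

With these made-up numbers the search pairs the two links with the most opposed audiences, because a click on either one moves the belief furthest.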

Determining the Optimal Offer Set
• The number of potential offer sets to evaluate could be very large
• For a site with M links and offer set size n, the number of possible combinations is C(M, n) = M! / (n! (M − n)!)
• E.g., for M=100 and n=10, there are more than 17 trillion combinations
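The count for the slide's example can be checked directly with Python's standard library, treating offer sets as unordered selections of n links out of M:

```python
from math import comb

M, n = 100, 10
num_offer_sets = comb(M, n)  # number of unordered size-n subsets of M links
print(f"{num_offer_sets:,}")  # 17,310,309,456,440
```

Evaluating the expected information gain for each of these sets at every page view is impractical, which motivates a heuristic.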

Heuristic Approach to Determine the Optimal Offer Set
• Consider the expected entropy expression for learning the gender (n = 2)
• P’(lj, ai) is proportional to P(lj, ai), the joint distribution of the aggregate link probabilities

Heuristic Approach to Determine the Optimal Offer Set
• To select n links to offer
  • For each attribute value, select the link that maximizes P(ai, lj)
  • If more links are needed, evaluate the links with the next highest joint probability
  • Continue until all n links have been determined
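One possible reading of these steps, sketched in Python: compute the joint probability P(ai, lj) as P(ai│lj)·P(lj), then cycle through the attribute values, each time taking the not-yet-selected link with the highest joint probability for that value, until n links are chosen. The round-robin order, the use of click counts as P(lj), and all names and numbers are assumptions; the slides do not specify these details.

```python
def heuristic_offer_set(n, p_value_given_link, clicks, values):
    """Greedy selection of n links by joint probability P(a, l).

    p_value_given_link: {link: {value: P(a | link)}}.
    clicks:             {link: relative popularity}, used in place of P(link).
    values:             attribute values, visited in round-robin order.
    """
    total = sum(clicks.values())
    joint = {
        a: {l: p_value_given_link[l][a] * clicks[l] / total for l in clicks}
        for a in values
    }
    selected = []
    while len(selected) < min(n, len(clicks)):
        for a in values:
            remaining = [l for l in clicks if l not in selected]
            if not remaining or len(selected) == n:
                break
            # Best remaining link for this attribute value.
            selected.append(max(remaining, key=lambda l: joint[a][l]))
    return selected


# Hypothetical link statistics and monthly click counts.
stats = {
    "sports": {"m": 0.8, "f": 0.2},
    "fashion": {"m": 0.3, "f": 0.7},
    "news": {"m": 0.5, "f": 0.5},
    "finance": {"m": 0.7, "f": 0.3},
}
clicks = {"sports": 1000, "fashion": 800, "news": 900, "finance": 500}

offer_two = heuristic_offer_set(2, stats, clicks, ["m", "f"])
offer_three = heuristic_offer_set(3, stats, clicks, ["m", "f"])
```

This needs only a ranking of links per attribute value rather than an evaluation of every subset, at the cost of no optimality guarantee.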

Discussions
• Assumption: the probability of clicking a link is conditionally independent of the probability of clicking other links
• If this assumption does not hold for some links, we can group the correlated links into disjoint sets, and either
  • use joint probabilities associated with these groups of links for belief revision, or
  • use aggregate group-level probability parameters to revise beliefs

Discussions
• Assumption: the user will follow one of the links being offered
• Other possibilities
  • the user may leave the site
  • the user may click the back button and select a different link
  • if there is a search engine available on the site, the user may submit a query and navigate to the results page

Conclusion
• Presented a framework for modeling user profiles for targeting purposes
• Showed how the profile can be learned implicitly from the links traversed
• Showed how the learning process can be expedited by dynamically determining the offer set at each iteration
• Data requirements are reasonable
• The approach is not computationally intensive

On-going Work
• Solution approaches to the optimal offer set selection problem: refine the heuristic
• Validate the models
• Extend the model to learn multiple attributes simultaneously

Thank you!
