Projects for Online Advertising
PROJECT 1 • AD BEHAVIOR IN PANDORA • Arindam Paul ArindamPaul2012@u.northwestern.edu
Motivation • Pandora is an automated music recommendation system. It offers two kinds of service: premium (users pay) and free (supported by advertisements). • To listen to music, users have to register with Pandora using their email address. As most users are free users, advertisements are a major source of Pandora's revenue.
Project Title • Personalized ads tend to draw more revenue. We want to find out to what degree, if at all, a user's music choices affect the ads delivered to them.
Methodology (Cont…) • Use a scripting language (preferably Python) to build a tool that scrapes the Pandora website. • Pandora publishes ads for companies. • Information about both (i) the user's details (age, gender, etc.) and (ii) the type of music (artist, genre, etc.) is passed to the ad company via cookies.
Methodology • The tool has to scrape these details, extract the useful portion, and check whether the user's music choices affect the ads. • After the data is collected, information retrieval algorithms have to be used to analyze the raw data and transform it into useful information.
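A minimal sketch of the final check, assuming the scraper has already logged (genre, ad category) pairs: it tabulates the share of each ad category per music genre so the distributions can be compared. The function name `ad_share_by_genre` and the sample values are hypothetical and are not taken from Pandora.

```python
from collections import Counter, defaultdict


def ad_share_by_genre(observations):
    """Return {genre: {ad_category: share}} from (genre, ad_category) pairs."""
    counts = defaultdict(Counter)
    for genre, ad_category in observations:
        counts[genre][ad_category] += 1
    shares = {}
    for genre, counter in counts.items():
        total = sum(counter.values())
        shares[genre] = {cat: n / total for cat, n in counter.items()}
    return shares


# Hypothetical observations logged by the scraper (made-up values).
sample = [
    ("rock", "cars"),
    ("rock", "cars"),
    ("rock", "insurance"),
    ("classical", "insurance"),
    ("classical", "travel"),
]

for genre, dist in ad_share_by_genre(sample).items():
    print(genre, dist)
```

If the per-genre distributions look alike, music choice probably has little effect on the ads; a statistical test on the raw counts would be the natural next step.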
Deliverables • A tool to scrape Pandora • Term paper based on the analysis of the ad model, which should include: • Description of the Pandora recommendation system and ad model • Analysis of the ad model with graphs and charts • Pros and cons of the current model • Suggestions
Milestones • Mid-Term (10th May): Working tool which scrapes the website, plus basic analysis. • End-Term (7th Jun): Fully functional tool and term paper submission.
PROJECT 2 • AD BEHAVIOR IN YOUTUBE • Arindam Paul ArindamPaul2012@u.northwestern.edu
Motivation and Project Goal • YouTube is a video sharing website where most of the content is uploaded by individuals and some by large corporations. • Like Pandora, YouTube makes money from ads. We all get irritated by the ads that play before our video starts. • We need to understand: • The attributes (features) which help YouTube generate personalized ads, e.g. location information • The relative effect of these features
Methodology • Use a scripting language (preferably Python) to build a tool that crawls the YouTube website. You may use the YouTube Developer API to build your tool. • As with Pandora, the tool has to scrape the details, extract the useful portion, and check what affects the ads. We know for certain that location affects the personalization of the ads. What else affects the ads?
Deliverables • A tool to scrape YouTube • Term paper based on the analysis of the ad model, which should include: • Description of the YouTube ad model • Analysis with graphs and charts • Pros and cons of the current model • Suggestions
Milestones • Mid-Term (10th May): Working tool which scrapes the website, plus basic analysis. • End-Term (7th Jun): Fully functional tool and term paper submission.
PROJECT 3 • RECOMMENDATION SYSTEM IN YOUTUBE • Arindam Paul ArindamPaul2012@u.northwestern.edu
Motivation • YouTube is the world's largest video sharing website. Although users can watch videos without logging in, most of the features require logging in. • These include, among others: • channels/subscriptions • previously watched history • connections with social accounts: Google+/Facebook
Project Goals • Find which (some or all) of these features affect YouTube's video recommendation model • The degree to which each of these features affects the recommendation model
Methodology • Use a scripting language (preferably Python) to build a tool that crawls the YouTube website. Again, you may use the YouTube Developer API to build your tool.
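One possible way to quantify how much a feature affects recommendations, once the crawler has collected recommendation lists, is to compare the list from a baseline profile with the list from a profile where a single feature is changed. The sketch below uses Jaccard similarity for that comparison; this measure is an assumption of the example rather than something the project prescribes, and the video IDs are made up.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of video IDs (1.0 = identical sets)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical recommendation lists collected by the crawler.
baseline = ["v1", "v2", "v3"]          # logged-out profile
with_history = ["v2", "v3", "v4"]      # same query, profile with watch history

print(jaccard(baseline, with_history))
```

A lower similarity for one feature than for another would suggest that the first feature shifts the recommendations more, although repeated crawls are needed to separate the effect from ordinary variation.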
Deliverables • A tool to scrape YouTube • Term paper based on the analysis of the recommendation system, which should include: • Description of the recommendation mechanism • Analysis with graphs and charts • Limitations of the current model • Suggestions
Milestones • Mid-Term (10th May): Working tool which scrapes the website, plus basic analysis. • End-Term (7th Jun): Fully functional tool and term paper submission.
Project 4 Pinterest Analysis Ning Xia ningxia2015@u.northwestern.edu
Project 4: Pinterest Analysis • Project Goal • Quantify Pinterest user activities; • pin, repin, like, etc.; • Quantify social migration on Pinterest; • How many users are from Facebook/Twitter? • Quantify information (pin) propagation;
Project 4: Pinterest Analysis • Methodology • Crawling with tools (Ruby, Mechanize, etc.) • We already have the crawlers. • Data Collection. • Keep data collections. • Social Analysis; • Graphics related analysis.
Project 4: Pinterest Analysis • Milestones • Coding • Understand the tools; • Update the Pinterest crawler; • Data Collection • Crawling. • Analysis; • Models, statistics, etc.;
Project 4: Pinterest Analysis • Submissions: • Crawlers; • Datasets; • Analysis Report.
Project 5 Ad Tracker Ning Xia ningxia2015@u.northwestern.edu
Project 5: Ad Tracker • Project Goal • We know that: • Ad companies are tracking users; • Ad companies know user information; • We want to find out: • How do they track users? UDID? • If so, there will be a privacy problem. • What do they know about the users? • Gender, age, and other information? • Is their knowledge about users accurate?
Project 5: Ad Tracker • Methodology: • Key-value pair analysis from HTTP traffic. • Key-value pairs in GET/POST/cookies • Dataset: • From a mobile CSP; • Facebook profile info as the ground truth; • Tools: • Facebook crawler (co-work with Project 1); • Analysis: • User ID or UDID? • Gender, age, zipcode, lat/long, …; • Other fields?
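The key-value pair analysis can be sketched with the Python standard library alone. The example below pulls pairs out of a URL query string, a form-encoded POST body, and a Cookie header, then flags keys that look like identifiers or demographics. The URL, the key names in `SENSITIVE_KEYS`, and the sample values are hypothetical; real traffic will use other names and encodings.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlsplit

# Hypothetical key names to look for; real ad traffic may use different ones.
SENSITIVE_KEYS = {"uid", "udid", "gender", "age", "zip", "lat", "lon"}


def extract_pairs(url, cookie_header="", post_body=""):
    """Return (source, key, value) tuples from the query string, POST body, and cookies."""
    pairs = [("GET", k, v) for k, v in parse_qsl(urlsplit(url).query)]
    pairs += [("POST", k, v) for k, v in parse_qsl(post_body)]
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    pairs += [("cookie", k, morsel.value) for k, morsel in cookie.items()]
    return pairs


def flag_sensitive(pairs, keys=SENSITIVE_KEYS):
    """Keep only the pairs whose key matches one of the watched names."""
    return [p for p in pairs if p[1].lower() in keys]


# Made-up request for illustration.
pairs = extract_pairs(
    "http://ads.example.com/track?udid=abc123&age=25&fmt=json",
    cookie_header="gender=f; session=xyz",
    post_body="zip=60208",
)
for source, key, value in flag_sensitive(pairs):
    print(source, key, value)
```

Matching the flagged values against the Facebook ground truth would then indicate whether the advertiser's knowledge of a user is accurate.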
Project 5: Ad Tracker • Related work: • Privacy Diffusion on the Web: A Longitudinal Perspective, Balachander Krishnamurthy, WWW’09 • Privacy leakage vs. Protection measures: the growing disconnect, Balachander Krishnamurthy, W2SP’11 • Example:
Project 5: Ad Tracker • Milestones • Facebook crawler, 2 weeks; • Data collection, 2 weeks; • Analysis; • Deliverables: • Crawlers; • Datasets; • Analysis Report;
Project 6 User Mobility and Interests Ning Xia ningxia2015@u.northwestern.edu
Project 6: User Mobility and Interests • Project Goals • Develop tools to track user locations; • Collect anonymous data from volunteers; • Model users' mobility/interests on different PoIs (points of interest) • Related Work • Learning to Rank for Spatiotemporal Search, WSDM’13
Project 6: User Mobility and Interests • Methodology: • Get user lat/long • Get user’s “check-in” for PoIs • Build models to parse user’s movement and interests. • Tools • Foursquare – to collect PoI information; • Android – to record user movement;
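A building block for relating recorded lat/long points to PoIs is the great-circle distance between two coordinates. The sketch below uses the haversine formula with a mean Earth radius of 6371 km and picks the nearest PoI for a position; the PoI names and coordinates are illustrative placeholders, and `nearest_poi` is a hypothetical helper rather than part of any Foursquare or Android API.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_poi(lat, lon, pois):
    """Return the (name, lat, lon) tuple closest to the given position."""
    return min(pois, key=lambda p: haversine_km(lat, lon, p[1], p[2]))


# Hypothetical PoIs with made-up coordinates.
pois = [("cafe", 42.0500, -87.6800), ("library", 42.0530, -87.6740)]
print(nearest_poi(42.0531, -87.6745, pois)[0])
```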
Project 6: User Mobility and Interests • Milestones: • Android Development; • Data Collection from volunteers; • User Interest Modeling. • Deliverables: • Android app; • Datasets; • Analysis Report;
Project 7 Ad Classification Marcel Flores marcelflores2007@u.northwestern.edu
Project 7: Ad Classification • Given an ad, can we tell what it’s about? • What categories are most common? • How closely are they correlated with page content? • Could enable deeper understanding of targeted ads and tracking • Allow auditing of advertisers
Methodology • Develop a scraper in Python which collects ads and ad URLs. • Collect text from linked pages (i.e. clean out the HTML). • Place ad/page in known category using NLTK machine learning tools.
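The "clean out the HTML" step can be sketched with the standard library's `html.parser`, as below. It keeps visible text and skips the contents of `<script>` and `<style>` elements. The class and function names are hypothetical, the sample page is made up, and the NLTK classification step is not shown.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, ignoring anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = ("<html><head><style>p{}</style></head><body><h1>Cheap  flights</h1>"
        "<script>var x=1;</script><p>Book now</p></body></html>")
print(html_to_text(page))
```

The resulting plain text is what would be tokenized and passed to a classifier in the next step.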
Milestones • Week 3 - Scraper to collect ads and URLs • Week 7 - Initial analysis of ads and URLs with NLTK • Verification set • Week 10 - Finalized approach and tools • Selected algorithm, accuracy checks
Project 8 Android App Survey Marcel Flores marcelflores2007@u.northwestern.edu
Project 8: Android App Survey • What information is available to advertisers for ads running in Android applications? • Can we do anything to warn users about info advertisers are using?
Methodology • Analyze source/traffic from popular apps • Dex2jar and Java decompilers • Scripts for downloading apps from the Play Store • Network proxy to monitor phone traffic • Create tools to notify the user of what ads are doing
Milestones • Week 3: Have downloading scripts for the Play Store, find apps that send user info • Week 7: Initial detection of information leakage • Week 10: On-device analysis of advertising information leakage
Project 9 Abusive Ads Marcel Flores marcelflores2007@u.northwestern.edu
Project 9: Abusive Ads • Do ads attempt to take advantage of computer-illiterate users? • Are certain queries likely to result in abusive ads? • Could enable tools to protect against fraud. • Increase accountability of advertisers.
Methodology • Collect ads resulting from queries/URLs related to naive computer user topics • Collect a control set for random queries, most popular sites, advanced user topics • Measure the difference in ad features (word occurrence, URL occurrence, etc.)
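As a rough illustration of the word-occurrence comparison, the sketch below computes relative word frequencies for a target set of ads and a control set, then ranks words by how much more frequent they are in the target set. The tokenizer, the function names, and the sample ad texts are assumptions made for this example; a real study would need larger samples and a significance test.

```python
import re
from collections import Counter


def word_rates(texts):
    """Relative frequency of each lowercase word across a list of ad texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def rate_differences(target_texts, control_texts, top=5):
    """Words ranked by (rate in target set) minus (rate in control set), largest first."""
    target, control = word_rates(target_texts), word_rates(control_texts)
    diffs = {w: target.get(w, 0.0) - control.get(w, 0.0)
             for w in set(target) | set(control)}
    return sorted(diffs.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Made-up ad texts for illustration only.
naive_ads = ["Free download now", "Free scan"]
control_ads = ["Buy laptop now"]

for word, diff in rate_differences(naive_ads, control_ads, top=3):
    print(word, round(diff, 3))
```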
Milestones • Week 3 - Scraper for ads on various topics • Week 7 - Feature selection and extraction • Week 10 - Final comparison between topic sets
Arindam Paul ArindamPaul2012@u.northwestern.edu Ning Xia ningxia2015@u.northwestern.edu Marcel Flores marcelflores2007@u.northwestern.edu