Lecture 21: Privacy and Online Advertising

Presentation Transcript

  1. Lecture 21: Privacy and Online Advertising

  2. References • Challenges in Measuring Online Advertising Systems, by Saikat Guha, Bin Cheng, and Paul Francis • Serving Ads from localhost for Performance, Privacy, and Profit, by Saikat Guha, Alexey Reznichenko, Kevin Tang, Hamed Haddadi, and Paul Francis

  3. Problem • Online advertising funds many web services • E.g., all the free stuff we get from Google • Ad networks gather much user information • How do they use the user information?

  4. Goals • Determining how well ad networks target users

  5. Methodology • Creating two clients representing two different user types • Measuring the different ads each client sees

  6. Challenges • How to compare ads • How to collect a representative snapshot of ads • Quantifying the differences • Avoiding measurement artifacts

  7. Comparing Ads is Challenging • Ads don’t have unique IDs • A & B are semantically the same, but have different text • A & C are different, but have the same display URL

  8. How to define when two ads are the same? • Easy but illegal approach: comparing destination URLs • FP (false positive): flagged as equal but actually different • FN (false negative): actually equal but not flagged • Display URL has the lowest FNs → Use display URL to define ad equality
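
A minimal sketch of display-URL equality, as chosen on this slide. The normalization rules (lowercasing, dropping the scheme, a leading "www.", and a trailing slash) and the `display_url` field name are assumptions made for illustration; the slides do not say how display URLs were compared. As slide 7 notes, two different ads can share a display URL, so this test can still produce false positives.

```python
from urllib.parse import urlsplit


def normalize_display_url(display_url: str) -> str:
    """Reduce a display URL to a comparable key (rules are illustrative assumptions)."""
    text = display_url.strip().lower()
    # urlsplit only fills in the host when a scheme or '//' is present.
    if "//" not in text:
        text = "//" + text
    parts = urlsplit(text)
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def same_ad(ad_a: dict, ad_b: dict) -> bool:
    """Treat two ads as equal when their normalized display URLs match."""
    return normalize_display_url(ad_a["display_url"]) == normalize_display_url(
        ad_b["display_url"]
    )


# Hypothetical ads: a and b differ in text but share a display URL.
a = {"title": "Cheap shoes", "display_url": "www.Example.com/shoes/"}
b = {"title": "Shoes on sale", "display_url": "http://example.com/shoes"}
c = {"title": "Cheap shoes", "display_url": "other.example.org"}
print(same_ad(a, b), same_ad(a, c))
```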

  9. Taking a Snapshot • More ads are available than can be displayed on any single page • How to determine all ads that may be fed to a user? • Reload the page multiple times • But too many reloads may lead to ad churn: old ads expire, new ads show up

  10. Determining the # of reloads • Reload every 5 seconds • Repeat for 200 queries • Curve becomes linear beyond 10 reloads (ad churn) • Use 10 reloads as the threshold
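
The curve itself is not preserved in this transcript. Assuming it plots the cumulative number of distinct ads against the number of reloads, the following sketch shows how such a curve could be computed from a list of reloads. The function names and the toy data are hypothetical.

```python
def cumulative_unique_ads(reloads):
    """For each reload, count the distinct ads seen so far."""
    seen = set()
    counts = []
    for ads in reloads:
        seen.update(ads)
        counts.append(len(seen))
    return counts


def new_ads_per_reload(counts):
    """Number of previously unseen ads contributed by each reload."""
    if not counts:
        return []
    return [counts[0]] + [later - earlier for earlier, later in zip(counts, counts[1:])]


# Toy data: each set holds the ad keys (e.g. display URLs) seen on one reload.
reloads = [{"a", "b"}, {"b", "c"}, {"a", "c"}, {"d"}]
counts = cumulative_unique_ads(reloads)
print(counts, new_ads_per_reload(counts))
```

Under this reading, a steady trickle of new ads per reload after the initial saturation would show up as the linear tail that the slide attributes to churn.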

  11. Quantifying Change • Metrics • Jaccard index: |A ∩ B| / |A ∪ B| • Extended Jaccard index (cosine similarity)
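
Both metrics can be sketched with their standard textbook definitions. The slide equates the extended Jaccard index with cosine similarity, and the sketch follows that reading; whether the authors used exactly this formula is an assumption. Ads are represented by hypothetical string keys.

```python
import math


def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B| for two sets of ads."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


def cosine_similarity(weights_a: dict, weights_b: dict) -> float:
    """Cosine of the angle between two ad-weight vectors (ad key -> weight)."""
    dot = sum(w * weights_b.get(ad, 0.0) for ad, w in weights_a.items())
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


set_overlap = jaccard({"x", "y", "z"}, {"y", "z", "w"})
weighted_overlap = cosine_similarity({"x": 1.0, "y": 1.0}, {"y": 1.0, "z": 1.0})
print(set_overlap, weighted_overlap)
```

The set version treats every ad equally, while the weighted version lets frequently shown ads count for more.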

  12. Comparing Effectiveness • Views: # of page reloads containing the ad • Value: # of page reloads scaled by the position of the ad • Overlap: Jaccard index
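
A rough sketch of the views and value measures. The slide does not say how position scales the value, so the 1/position weighting below is an assumption chosen only for illustration, as are the function name and the data layout (one ordered list of ad keys per reload, top slot first).

```python
from collections import defaultdict


def views_and_value(reloads):
    """Return (views, value) dicts keyed by ad.

    views: number of reloads containing the ad.
    value: the same count, scaled by position (assumed weight: 1 / position).
    """
    views = defaultdict(int)
    value = defaultdict(float)
    for ads in reloads:
        counted = set()
        for position, ad in enumerate(ads, start=1):
            if ad in counted:
                continue  # count each ad at most once per reload
            counted.add(ad)
            views[ad] += 1
            value[ad] += 1.0 / position
    return dict(views), dict(value)


views, value = views_and_value([["a", "b"], ["b", "a"], ["a"]])
print(views, value)
```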

  13. Comparing Effectiveness

  14. The winner is • Weight: log(views) or log(value)

  15. Avoiding artifacts • Different system parameters may lead to different ads being shown • Browsers use different DNS servers • Browsers receive different cookies • HTTP proxy

  16. Analysis • Configure two or more instances to differ by one parameter • Comparing results for • Search Ads • Website Ads • Online Social Network Ads
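
One way to read this design: the overlap between two identical control instances gives a noise floor, and a parameter matters only if changing it pushes the overlap clearly below that floor. The sketch below assumes the Jaccard index as the overlap measure; the function names and the toy ad sets are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Set overlap |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def targeting_effect(control_a: set, control_b: set, treatment: set):
    """Compare control-vs-control overlap with control-vs-treatment overlap.

    Returns (baseline, treated, drop). A drop near zero suggests the changed
    parameter has little influence on which ads are shown.
    """
    baseline = jaccard(control_a, control_b)
    treated = jaccard(control_a, treatment)
    return baseline, treated, baseline - treated


baseline, treated, drop = targeting_effect(
    {"a", "b", "c", "d"},  # instance A (control)
    {"a", "b", "c", "e"},  # instance B (control, identical setup)
    {"a", "x", "y", "z"},  # instance C (differs by one parameter)
)
print(baseline, treated, drop)
```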

  17. Search Ads • A, B: control w/o cookies • C, D: w/ cookies enabled, seeded w/ different personae • Searched Google with 730 random product-related queries for 5 days • No obvious behavioral targeting in search ads. Why? • Keyword-based ad bidding • Location targeting not studied

  18. Website Ads • Measure 15 websites that show Google ads • A, B: control in NY • C: SF; D: Germany • Location affects web ads

  19. Website Ads • A, B: control • C: browse 3 out of 15 websites • D and E: browse random websites and Google search random websites • Google does not use browsing behavior to pick ads

  20. Online social network ads • Set up three or more Facebook profiles • A, B: control and identical • C: differs from A by one profile parameter

  21. Online social network ads • All profile parameters are used to customize ads • Age and gender are the two primary factors • Diurnal patterns due to ad churn • Should it increase or decrease? • Education and relationship matter less, except for engaged and non-engaged women

  22. Checking Impact of Sexual Preference • Six profiles with different sexual preferences • Two males interested in females (male control) • Two females interested in males (female control) • One male interested in male • One female interested in female

  23. Ads differ by sexual preferences

  24. Other results • Found neutral ads targeted exclusively to gay men • Clicking would reveal to the advertiser a user’s sexual preference • 66 ads shown exclusively to gay men more than 50 times during experiments
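
A small sketch of how "shown exclusively to one profile more than 50 times" could be checked, assuming per-profile view counts keyed by ad. The function name, data layout, and toy counts are hypothetical; the slides do not describe the actual procedure.

```python
def exclusive_ads(target_views: dict, other_views: list, min_views: int = 50) -> set:
    """Ads seen more than `min_views` times by the target profile and never by others."""
    seen_elsewhere = set()
    for views in other_views:
        seen_elsewhere.update(ad for ad, count in views.items() if count > 0)
    return {
        ad
        for ad, count in target_views.items()
        if count > min_views and ad not in seen_elsewhere
    }


target = {"ad1": 60, "ad2": 80, "ad3": 10, "ad4": 50}
others = [{"ad2": 1}, {"ad5": 100}]
print(exclusive_ads(target, others))
```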

  25. Summary • Search ads are largely keyword-based so far • Website ads use location but probably not behavior • Social network ads use all profile attributes to target users

  26. Question: how can we design a privacy-preserving online advertising system?

  27. Goals • Support online advertising • A good revenue source to fund online services • Preserve user privacy

  28. PrivAd • Serving Ads from a localhost client • Actors: user, publisher, advertiser, broker, and dealer

  29. How it works • Advertisers upload ads to the broker • The user client subscribes with the broker to a set of ads matching the user’s profile • The subscription message is encrypted with the broker’s public key and contains a symmetric key • The broker sends the filtered ads to the user client • Ads are encrypted with the symmetric key • The dealer anonymizes the client’s messages to the broker

  30. Ad View/Click Reporting • When a user views or clicks an ad, the user client sends a view/click report containing the ad ID and publisher ID to the broker via the dealer • The dealer attaches a unique report ID, removes client identity information, and maps the report ID to the user’s identity

  31. Click-fraud defense • The broker gives the dealer the report IDs if it suspects click fraud • The dealer finds the user • The dealer stops relaying ads to the user if convinced • Questions not answered: how the broker detects click fraud, and what the punishment is
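
The reporting path on slide 30 and the click-fraud lookup on this slide can be sketched together. This is a toy model under stated assumptions, not PrivAd's implementation: the class and method names are hypothetical, `seal_for_broker` is a placeholder that does NOT encrypt anything (a real client would encrypt the report so the dealer cannot read it), and the dealer here blocks every user the broker flags because the slides leave the decision rule open.

```python
import json
import secrets


def seal_for_broker(report: dict) -> bytes:
    """Placeholder for encryption to the broker; this only serializes the report."""
    return json.dumps(report).encode("utf-8")


class Dealer:
    """Sees who sent a report, but is assumed unable to read its contents."""

    def __init__(self):
        self._report_to_user = {}  # report ID -> user identity, kept private
        self._blocked = set()

    def relay_report(self, user_id: str, sealed_report: bytes):
        if user_id in self._blocked:
            return None  # the dealer no longer relays for this user
        report_id = secrets.token_hex(8)
        self._report_to_user[report_id] = user_id
        return report_id, sealed_report  # the user identity is not forwarded

    def handle_fraud_complaint(self, report_ids) -> int:
        """Map suspicious report IDs back to users and block them (assumed policy)."""
        users = {self._report_to_user[r] for r in report_ids if r in self._report_to_user}
        self._blocked |= users
        return len(users)


class Broker:
    """Sees ad and publisher IDs, but only an opaque report ID for the sender."""

    def __init__(self):
        self.reports = {}

    def receive(self, report_id: str, sealed_report: bytes):
        self.reports[report_id] = json.loads(sealed_report.decode("utf-8"))


dealer, broker = Dealer(), Broker()
click = {"ad_id": "ad-7", "publisher_id": "pub-1", "event": "click"}
report_id, payload = dealer.relay_report("alice", seal_for_broker(click))
broker.receive(report_id, payload)
blocked = dealer.handle_fraud_complaint([report_id])
```

In this model the broker never handles the user's identity and the user-to-report mapping never leaves the dealer.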

  32. Defining User Privacy • Unlinkability • No single player can link the identity of a user with any piece of that user’s profile • No single player can link together more than some limited number of pieces of personalization information of a given user • The dealer learns that user A clicked on some ad • The broker learns that someone clicked on ad X • Not robust to dealer/broker collusion

  33. Scaling PrivAd • Ad churn is significant • 2GB/month of compressed ad data
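
To put the 2GB/month figure in perspective, a quick back-of-the-envelope conversion to an average transfer rate. It assumes decimal gigabytes (2 × 10^9 bytes), a 30-day month, and traffic spread evenly over time; none of these details come from the slides.

```python
def average_rate(bytes_per_month: float, days: int = 30):
    """Return (bytes per second, kilobits per second) for a monthly volume."""
    seconds = days * 24 * 60 * 60
    bytes_per_second = bytes_per_month / seconds
    return bytes_per_second, bytes_per_second * 8 / 1000


bytes_per_second, kbit_per_second = average_rate(2e9)
print(f"{bytes_per_second:.0f} B/s, {kbit_per_second:.1f} kbit/s")
```

Under these assumptions the average works out to a little under 800 bytes per second, or roughly 6 kbit/s.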

  34. Discussion • What challenges might PrivAd face in a practical deployment?