
Estimating Rates of Rare Events at Multiple Resolutions

Deepak Agarwal, Andrei Broder, Deepayan Chakrabarti, Dejan Diklic, Vanja Josifovski, Mayssam Sayyadian


Presentation Transcript


  1. Estimating Rates of Rare Events at Multiple Resolutions • Deepak Agarwal, Andrei Broder, Deepayan Chakrabarti, Dejan Diklic, Vanja Josifovski, Mayssam Sayyadian

  2. Estimation in the “tail” • Contextual Advertising • Show an ad on a webpage (“impression”) • Revenue is generated if a user clicks • Problem: Estimate the click-through rate (CTR) of an ad on a page • Most (ad, page) pairs have very few impressions, if any, and even fewer clicks • Severe data sparsity

  3. Estimation in the “tail” • Use an existing, well-understood hierarchy • Categorize ads and webpages to leaves of the hierarchy • CTR estimates of siblings are correlated • The hierarchy allows us to aggregate data • Coarser resolutions provide reliable estimates for rare events, which then influence estimation at finer resolutions

  4. System overview • Retrospective data [URL, ad, isClicked] → Crawl a sample of URLs → Classify pages and ads → Impute impressions, fix sampling bias → Rare event estimation using hierarchy

  5. Sampling of webpages • Naïve strategy: sample at random from the set of URLs • Sampling errors in impression volume AND click volume • Instead, we propose: • Crawling all URLs with at least one click, and • a sample of the remaining URLs • Variability is only in impression volume
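The sampling strategy on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_urls_to_crawl`, the `clicks_by_url` mapping, and the `sample_rate` parameter are all assumptions introduced here.

```python
import random

def select_urls_to_crawl(clicks_by_url, sample_rate, seed=0):
    """Return (clicked, sampled): every URL with at least one click,
    plus a uniform random sample of the URLs with zero clicks.

    clicks_by_url: hypothetical dict mapping URL -> click count.
    sample_rate:   hypothetical fraction of non-clicked URLs to crawl.
    """
    rng = random.Random(seed)
    clicked = [u for u, c in clicks_by_url.items() if c > 0]
    unclicked = [u for u, c in clicks_by_url.items() if c == 0]
    k = round(sample_rate * len(unclicked))
    sampled = rng.sample(unclicked, k)
    return clicked, sampled

clicks = {"a": 2, "b": 0, "c": 0, "d": 1, "e": 0, "f": 0}
clicked, sampled = select_urls_to_crawl(clicks, 0.5)
```

Because every clicked URL is kept, click counts are exact; only the impression volume of the non-clicked pool is subject to sampling error.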

  6. Imputation of impression volume • Table of page classes (rows) × ad classes (columns) • #impressions in cell ij = nij + mij + xij, with nij from the clicked pool, mij from the sampled non-clicked pool, and xij the excess impressions (to be imputed) • Row constraint: each row sums to ∑nij + K·∑mij • Column constraint: each column sums to #impressions on ads of this ad class • All cells together sum to the total impressions (known)

  7. Imputation of impression volume • Region = (page node, ad node) • Region hierarchy: a cross-product of the page hierarchy and the ad hierarchy • [Figure: page classes and ad classes, taken from the page hierarchy and the ad hierarchy, define the regions at each level, from Level 0 to Level i]

  8. Imputation of impression volume • Block constraint: the regions of a block at Level i+1 sum to their parent region at Level i

  9. Imputing xij • Iterative Proportional Fitting [Darroch+/1972] • Initialize xij = nij + mij • Iteratively scale xij values to match the row/column/block constraints • Ordering of constraints: top-down, then bottom-up, and repeat
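A minimal sketch of Iterative Proportional Fitting on a single level, using only the row and column constraints (the block constraint is left out here). The function name `ipf`, its parameters, and the toy numbers are assumptions for illustration, not values from the talk.

```python
def ipf(seed_table, row_targets, col_targets, iters=100, tol=1e-9):
    """Alternately rescale rows and columns of seed_table until the
    row sums match row_targets and the column sums match col_targets."""
    x = [row[:] for row in seed_table]  # work on a copy of the seed
    for _ in range(iters):
        # Scale each row to its target total.
        for i, row in enumerate(x):
            s = sum(row)
            if s > 0:
                x[i] = [v * row_targets[i] / s for v in row]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in x)
            if s > 0:
                for row in x:
                    row[j] *= target / s
        # Columns now match exactly; stop once the rows match too.
        if max(abs(sum(r) - t) for r, t in zip(x, row_targets)) < tol:
            break
    return x

# Toy example: rows are page classes, columns are ad classes.
fitted = ipf([[1.0, 1.0], [1.0, 1.0]], row_targets=[3.0, 7.0], col_targets=[4.0, 6.0])
```

The row and column targets must share the same grand total for the procedure to converge; zero cells in the seed stay zero.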

  10. Imputation: Summary • Given • nij (impressions in clicked pool) • mij (impressions in sampled non-clicked pool) • # impressions on ads of each ad class in the ad hierarchy • We get • Estimated impression volume Ñij = nij + mij + xij in each region ij of every level

  11. System overview • Retrospective data [page, ad, isClicked] → Crawl a sample of pages → Classify pages and ads → Impute impressions, fix sampling bias → Rare event estimation using hierarchy

  12. Rare rate modeling • Freeman-Tukey transform: • yij = F-T(clicks and impressions at ij) ≈ transformed-CTR • Variance stabilizing transformation: Var(y) is independent of E[y] → needed in further modeling
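The transcript does not preserve the exact formula, so the sketch below assumes one common Freeman-Tukey form for rates, y = ½(√(c/N) + √((c+1)/N)) with c clicks and N impressions. The function name `freeman_tukey` is hypothetical, and the formula should be read as an assumption rather than the authors' definition.

```python
import math

def freeman_tukey(clicks, impressions):
    """Assumed Freeman-Tukey form for a rate:
    0.5 * (sqrt(c / N) + sqrt((c + 1) / N))."""
    return 0.5 * (math.sqrt(clicks / impressions)
                  + math.sqrt((clicks + 1) / impressions))

# Two regions with zero clicks but different impression volumes
# receive different transformed values.
y_small = freeman_tukey(0, 100)
y_large = freeman_tukey(0, 10_000)
```

Under this form, a zero-click region with more impressions gets a smaller transformed CTR than a zero-click region with fewer impressions, so the two are no longer tied at zero.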

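  13. Rare rate modeling • Generative Model (Tree-structured Markov Model) • [Diagram: each region ij has an unobserved “state” Sij with variance Wij, covariates with coefficients βij, and an observation yij with variance Vij; the same quantities (Sparent(ij), Wparent(ij), βparent(ij), Vparent(ij), yparent(ij)) appear at the parent region]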
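One plausible reading of this diagram, written out as equations. The covariate vector $x_{ij}$ and the Gaussian noise terms $\epsilon_{ij}$ and $w_{ij}$ are notation assumed here; the slide only names the states, coefficients, and variances.

```latex
\begin{align*}
  y_{ij} &= x_{ij}^{\top}\beta_{ij} + S_{ij} + \epsilon_{ij},
         & \epsilon_{ij} &\sim \mathcal{N}(0,\, V_{ij}) \\
  S_{ij} &= S_{\mathrm{parent}(ij)} + w_{ij},
         & w_{ij} &\sim \mathcal{N}(0,\, W_{ij})
\end{align*}
```

Under this reading, $W_{ij}$ controls how far a region's state may drift from its parent's, and $V_{ij}$ is the observation noise of the transformed CTR.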

  14. Rare rate modeling • Model fitting with a 2-pass Kalman filter: • Filtering: Leaf to root • Smoothing: Root to leaf • Linear in the number of regions
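The two passes can be illustrated with a simplified scalar model, written as Gaussian message passing on a tree. This is a sketch under stated assumptions, not the authors' algorithm: covariates are dropped, V and W are shared by all regions, the root has a N(m0, P0) prior, and the name `smooth_tree` and its parameters are hypothetical.

```python
def smooth_tree(parent, y, V, W, m0=0.0, P0=1.0):
    """Two-pass smoother for S_child = S_parent + N(0, W), y = S + N(0, V).

    parent[r] is the index of r's parent, with parent[0] == -1 for the
    root and parent[r] < r otherwise. y[r] is None where unobserved.
    Returns posterior means and variances of every S_r.
    """
    n = len(parent)
    # Precision J and information h carried by the data in each subtree.
    J_up = [0.0 if obs is None else 1.0 / V for obs in y]
    h_up = [0.0 if obs is None else obs / V for obs in y]
    J_msg, h_msg = [0.0] * n, [0.0] * n
    # Filtering: leaves to root (children have larger indices).
    for r in range(n - 1, 0, -1):
        scale = 1.0 + W * J_up[r]
        J_msg[r], h_msg[r] = J_up[r] / scale, h_up[r] / scale
        J_up[parent[r]] += J_msg[r]
        h_up[parent[r]] += h_msg[r]
    # Smoothing: root to leaves.
    J_post, h_post = [0.0] * n, [0.0] * n
    J_post[0] = J_up[0] + 1.0 / P0
    h_post[0] = h_up[0] + m0 / P0
    for r in range(1, n):
        p = parent[r]
        # Parent's belief with this child's own message removed.
        J_cav, h_cav = J_post[p] - J_msg[r], h_post[p] - h_msg[r]
        pred_mean, pred_var = h_cav / J_cav, 1.0 / J_cav + W
        J_post[r] = J_up[r] + 1.0 / pred_var
        h_post[r] = h_up[r] + pred_mean / pred_var
    means = [h / J for h, J in zip(h_post, J_post)]
    variances = [1.0 / J for J in J_post]
    return means, variances

# Root (unobserved) with one observed child.
means, variances = smooth_tree([-1, 0], [None, 2.0], V=1.0, W=1.0)
```

Each region is visited once per pass, which matches the slide's claim of cost linear in the number of regions.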

  15. Experiments • 503M impressions • 7-level hierarchy of which the top 3 levels were used • Zero clicks in • 76% regions in level 2 • 95% regions in level 3 • Full dataset DFULL, and a 2/3 sample DSAMPLE

  16. Experiments • Estimate CTRs for all regions R in level 3 with zero clicks in DSAMPLE • Some of these regions R>0 get clicks in DFULL • A good model should predict higher CTRs for R>0 as against the other regions in R

  17. Experiments • We compared 4 models • TS: our tree-structured model • LM (level-mean): each level smoothed independently • NS (no smoothing): CTR proportional to 1/Ñ • Random: Assuming |R>0| is given, randomly predict the membership of R>0 out of R

  18. Experiments • [Figure: comparison of the four models; curves labeled TS, Random, and LM, NS]

  19. Experiments • Few impressions → estimates depend more on siblings • Enough impressions → little “borrowing” from siblings

  20. Related Work • Multi-resolution modeling • studied in time series modeling and spatial statistics [Openshaw+/79, Cressie/90, Chou+/94] • Imputation • studied in statistics [Darroch+/1972] • Application of such models to estimation of such rare events (rates of ~10⁻³) is novel

  21. Conclusions • We presented a method to estimate • rates of extremely rare events • at multiple resolutions • under severe sparsity constraints • Our method has two parts • Imputation → incorporates hierarchy, fixes sampling bias • Tree-structured generative model → extremely fast parameter fitting

  22. Rare rate modeling • Freeman-Tukey transform • Distinguishes between regions with zero clicks based on the number of impressions • Variance stabilizing transformation: Var(y) is independent of E[y] → needed in further modeling • [Formula: the transform is a function of # clicks in region r and # impressions in region r]

  23. Rare rate modeling • Generative Model • Sij values can be quickly estimated using a Kalman filtering algorithm • Kalman filter requires knowledge of β, V, and W • EM wrapped around the Kalman filter

  24. Rare rate modeling • Fitting using a Kalman filtering algorithm • Filtering: Recursively aggregate data from leaves to root • Smoothing: Propagate information from root to leaves • Complexity: linear in the number of regions, for both time and space

  25. Rare rate modeling • Fitting using a Kalman filtering algorithm • Filtering: Recursively aggregate data from leaves to root • Smoothing: Propagate information from root to leaves • Kalman filter requires knowledge of β, V, and W • EM wrapped around the Kalman filter

  26. Imputing xij • Iterative Proportional Fitting [Darroch+/1972] • Initialize xij = nij + mij • Top-down: • Scale all xij in every block in Z(i+1) to sum to its parent in Z(i) • Scale all xij in Z(i+1) to sum to the row totals • Scale all xij in Z(i+1) to sum to the column totals • Repeat for every level Z(i) • Bottom-up: Similar
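The first top-down step, scaling every block in Z(i+1) to sum to its parent in Z(i), might look like the sketch below. The function name `scale_blocks`, the `row_parent`/`col_parent` index lists, and the toy numbers are assumptions for illustration.

```python
def scale_blocks(child, parent, row_parent, col_parent):
    """Rescale each block of the finer table `child` so that it sums to
    the matching cell of the coarser table `parent`.

    row_parent[i] / col_parent[j] give the parent page class / ad class
    of child row i / child column j (hypothetical encoding).
    """
    # Current total of every block, keyed by its parent cell.
    sums = {}
    for i, row in enumerate(child):
        for j, v in enumerate(row):
            key = (row_parent[i], col_parent[j])
            sums[key] = sums.get(key, 0.0) + v
    # Multiply each cell by (parent total / block total).
    scaled = []
    for i, row in enumerate(child):
        new_row = []
        for j, v in enumerate(row):
            a, b = row_parent[i], col_parent[j]
            s = sums[(a, b)]
            new_row.append(v * parent[a][b] / s if s > 0 else v)
        scaled.append(new_row)
    return scaled

# One parent page class, two parent ad classes; the first ad class
# splits into two child columns, the second into one.
scaled = scale_blocks(
    child=[[1.0, 1.0, 2.0], [1.0, 1.0, 2.0]],
    parent=[[8.0, 2.0]],
    row_parent=[0, 0],
    col_parent=[0, 0, 1],
)
```

The row and column scalings listed on the slide would then follow at the same level before moving on to the next one.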
