
Collaborative filtering with privacy

Collaborative filtering with privacy. Wim Verhaegh, Aukje van Duijnhoven, Jan Korst, Pim Tuyls. IPA herfstdagen, 23 November 2004.


Presentation Transcript


  1. Collaborative filtering with privacy • Wim Verhaegh, Aukje van Duijnhoven, Jan Korst, Pim Tuyls • IPA herfstdagen, 23 November 2004

  2. Privacy issue • Personalization is key in Ambient Intelligence • requires user profiles • Privacy risks of services • untrusted server • server gets hacked • server goes bankrupt • Perform personalization on encrypted data • collaborative filtering

  3. Overview • Collaborative filtering system • Privacy requirements • CF method • calculation scheme (formulas & example) • Encryption basics • Encrypted CF method • Item-based CF • Conclusion

  4. Collaborative filtering system • System to recommend new content • recommend content that ‘similar users’ like • [Diagram: server side (database with ratings, calculate similarities, similarity values, predict missing ratings, recommend) and user side (user, music player)]

  5. Security requirements: how to perform collaborative filtering? • Nobody may know users’ ratings • not even anonymously • Nobody may know who rated what • not even anonymously • Nobody may know who resembles whom

  6. Collaborative filtering methods • Memory based • computes similarities and interpolates • user based • item based • Model based • first uses rating database to build a model (e.g. extract basic rating profiles) • uses model for prediction • Most approaches can be encrypted • [Figure: sparse rating matrix, users as columns and items as rows, x marking a known rating]

  7. Memory-based CF with user similarities • Two steps • determine similarities between users • predict missing ratings • Step 1: Pearson correlation
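
The slide's formula for step 1 did not survive in this transcript. As a minimal sketch, assuming the usual Pearson correlation restricted to the items both users have rated, with each user's average taken over all of that user's ratings, the similarity could be computed as follows. The function name `pearson` and the dict-based data layout are illustrative assumptions, not taken from the slides.

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation between two users over the items both have rated.

    ratings_a, ratings_b: dicts mapping item id -> rating (absent = unrated).
    Assumption: each user's mean is taken over all of that user's ratings.
    """
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return 0.0
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)
               * sum((ratings_b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0
```

Two users whose ratings rise and fall together get a similarity near 1, and users with opposite tastes get a value near -1.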

  8. Step 2: prediction • E.g. weighted deviations from the average • similarities are weights
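
The prediction formula itself is missing from the transcript. A hedged sketch of "weighted deviations from the average", assuming the common form in which the active user's mean is corrected by the similarity-weighted deviations of the other users who rated the item, normalized by the sum of absolute similarities. The names `predict`, `ratings` and `sims` are hypothetical.

```python
def predict(active, item, ratings, sims):
    """Predict the active user's rating for an item.

    ratings: dict user -> dict item -> rating
    sims:    dict user -> similarity between that user and the active user
    Assumption: normalization by the sum of absolute similarities.
    """
    def mean(user):
        return sum(ratings[user].values()) / len(ratings[user])

    # Only users other than the active one who actually rated the item count.
    raters = [u for u in ratings if u != active and item in ratings[u]]
    norm = sum(abs(sims[u]) for u in raters)
    if norm == 0:
        return mean(active)          # nothing to go on: fall back to the average
    deviation = sum(sims[u] * (ratings[u][item] - mean(u)) for u in raters)
    return mean(active) + deviation / norm
```

A negatively correlated user who rates the item below their own average pushes the prediction up, which is why the similarities keep their sign in the numerator.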

  9. Example • Tea and coffee flavors • 4 users • 9 items (flavors)

  10. Example • Subtract averages • Compute similarities

  11. Example • Use similarities to predict missing ratings • Prediction for Aukje, tea T3

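  12. Public key encryption scheme: Paillier • Generate keys • choose large random primes p, q (private) • calculate n = pq and a ‘generator’ g (public) • Encrypt message m by $\varepsilon(m) = g^m r^n \bmod n^2$, with r random • Homomorphism properties: $\varepsilon(m_1)\,\varepsilon(m_2) \equiv \varepsilon(m_1 + m_2)$ and $\varepsilon(m)^k \equiv \varepsilon(k\,m) \pmod{n^2}$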
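
To make the scheme concrete, here is a toy sketch of Paillier key generation, encryption and decryption. It assumes the common simplification g = n + 1 and uses tiny primes purely for illustration; the function names are hypothetical and none of this is secure for real use. It requires Python 3.8 or later for `pow(x, -1, n)`.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier keys from two primes (far too small for real security)."""
    n = p * q
    g = n + 1                                   # assumed choice of 'generator'
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    # mu is the inverse mod n of L(g^lam mod n^2), where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt m as g^m * r^n mod n^2 with a fresh random r coprime to n."""
    n, g = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(g, m, n * n) * pow(r, n, n * n) % (n * n)

def decrypt(pub, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n."""
    n, _ = pub
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n

pub, priv = keygen(1009, 1013)
n2 = pub[0] ** 2
c1, c2 = encrypt(pub, 12), encrypt(pub, 30)
total = decrypt(pub, priv, c1 * c2 % n2)        # multiplying ciphertexts adds messages
scaled = decrypt(pub, priv, pow(c1, 5, n2))     # exponentiation multiplies by a constant
```

Because r is random, encrypting the same message twice gives different ciphertexts, yet both decrypt to the same value.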

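  13. Encrypted inner product • User a: vector $(a_1, \dots, a_m)$ • User b: vector $(b_1, \dots, b_m)$ • User a encrypts each entry of its vector and sends $\varepsilon(a_1), \dots, \varepsilon(a_m)$ to b, with $\varepsilon(\cdot)$ denoting encryption under a's public key • User b computes $\prod_j \varepsilon(a_j)^{b_j} \equiv \varepsilon\bigl(\sum_j a_j b_j\bigr)$ and sends it back to a • User a decrypts it to get the inner product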
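
The exchange above can be sketched end to end with a toy Paillier implementation. This is an illustration under stated assumptions: g = n + 1, tiny primes, non-negative integer vector entries, and hypothetical function names. It is not the authors' code and is not secure as written.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier keys with g = n + 1 (assumption); returns n and the private key."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)                 # valid inverse because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    # (n + 1)^m mod n^2 equals 1 + m*n mod n^2
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n

def encrypted_inner_product(n, enc_a, b):
    """Run by user b: combine a's ciphertexts with b's own plaintext vector."""
    n2 = n * n
    result = encrypt(n, 0)               # fresh encryption of 0 re-randomizes the result
    for c, b_j in zip(enc_a, b):
        result = result * pow(c, b_j, n2) % n2   # E(a_j)^b_j = E(a_j * b_j)
    return result

# User a owns the key pair and the vector a; user b owns the vector b.
n, priv = keygen(1009, 1013)
a = [3, 0, 5, 2]
b = [4, 7, 1, 0]
enc_a = [encrypt(n, x) for x in a]              # a -> b
reply = encrypted_inner_product(n, enc_a, b)    # b -> a
inner = decrypt(n, priv, reply)                 # a learns only the inner product
```

User b never sees a's entries in the clear, and user a receives a single ciphertext rather than b's individual entries.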

  14. Encrypted CF: correlation step • Rewrite correlation as three inner products • Zeros avoid contributions from unrated items in the sums
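
The rewritten formula is missing from the transcript. One plausible reading, offered here as an assumption rather than as the authors' exact formulation, is to give every user a zero-padded vector of mean-centered ratings plus a 0/1 "rated" mask, so that the numerator and both denominator sums of the Pearson correlation each become a plain inner product over all items. The sketch below works on plaintext to show the algebra; the names are hypothetical.

```python
from math import sqrt

def centered_and_mask(ratings, items):
    """q[i] = rating minus user mean if rated, else 0; d[i] = 1 if rated, else 0."""
    mean = sum(ratings.values()) / len(ratings)
    q = [ratings[i] - mean if i in ratings else 0.0 for i in items]
    d = [1 if i in ratings else 0 for i in items]
    return q, d

def dot(x, y):
    return sum(u * v for u, v in zip(x, y))

def correlation_from_inner_products(ratings_a, ratings_b, items):
    q_a, d_a = centered_and_mask(ratings_a, items)
    q_b, d_b = centered_and_mask(ratings_b, items)
    num = dot(q_a, q_b)                        # zeros drop items missing for either user
    den_a = dot([v * v for v in q_a], d_b)     # a's squared deviations where b rated
    den_b = dot([v * v for v in q_b], d_a)     # b's squared deviations where a rated
    return num / sqrt(den_a * den_b) if den_a and den_b else 0.0
```

Each of the three inner products has one vector from user a and one from user b, which is the shape an encrypted inner product protocol needs. Since Paillier operates on integers, fractional deviations would first have to be scaled and rounded, a detail this plaintext sketch leaves out.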

  15. Encrypted CF: correlation step • Protocol • Active user knows correlation values, but not to whom • Server knows between whom, but not the correlation values • [Protocol diagram: active user, server, other users; step labelled "copy"]

  16. Encrypted CF: prediction step • Rewrite • Protocol • each user b adds random factor • [Protocol diagram: active user, server, other users; step labelled "split"]

  17. Memory-based CF with item similarities • Similarities computed between items • compare rows in the matrix • similar formulas • [Figure: sparse rating matrix, users as columns and items as rows, x marking a known rating]

  18. Memory-based CF with item similarities • Similarities • Predictions

  19. Threshold Paillier • Calculation of sums: use threshold encryption • key is shared among k users • decryption needs > t shares • [Diagram: server and more than t users]
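
The "more than t shares" property can be illustrated with Shamir secret sharing. This is explicitly not threshold Paillier: in a real threshold scheme the users would produce partial decryptions and the key would never be reassembled. The sketch only shows why t shares are not enough and t + 1 are. The modulus and function names are illustrative assumptions.

```python
import random

PRIME = 2 ** 61 - 1      # field modulus (a Mersenne prime); illustrative choice

def make_shares(secret, k, t):
    """Split secret into k shares; any t + 1 of them determine it."""
    # Random polynomial of degree t whose constant term is the secret.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t)]

    def f(x):
        return sum(c * pow(x, e, PRIME) for e, c in enumerate(coeffs)) % PRIME

    return [(x, f(x)) for x in range(1, k + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % PRIME
                den = den * (xj - xm) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = make_shares(123456, k=5, t=2)
recovered = reconstruct(shares[:3])     # any 3 = t + 1 shares suffice
```

With only t shares, every candidate secret is equally consistent with what the holders see, so no coalition of t or fewer users can decrypt on its own.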

  20. Encrypted item-correlation step • Rewrite correlation • Protocol • [Diagram: server and more than t users]

  21. Encrypted item-based prediction • Rewrite prediction formula • item average: two sums • prediction: four inner products (server & user a) • protocols as before

  22. Conclusion • Collaborative filtering can be encrypted • various correlation and prediction formulas • various CF approaches • More computations to be done at users’ sites • encryption and decryption • users have to be online • Future work • protection against more complicated attacks • peer-to-peer solution
