
Automatic for the people: Reducing inadvertent leaks by personal machines


Presentation Transcript


  1. Automatic for the people: Reducing inadvertent leaks by personal machines Landon Cox Duke University

  2. Inadvertent leaks • Usability and privacy: A Study of Kazaa ... • Good and Krekelberg, CHI, 2003 • In 12 hours, found 150 inboxes on Kazaa • Observed people downloading dummy inbox • Problem hasn’t gone away

  3. Stories from 2009

  4. Technical solution? • Servers: Asbestos, HiStar, Flume • Languages: Jif, Laminar, Resin • Desktop: PrivacyScope, TightLip • [Diagram: processes reach files, the network, and IPC through a reference monitor; policy sources: user, admin, dev, automation]

  5. Automatic policy specification • State of the art: pattern matching • Look for strings that look like SSNs, CCs, etc. • find_SSNs, Firefly, SENF, Spider, etc. • A bit brittle and error-prone • High false positive/negative rates • Let’s take a different approach
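
A minimal sketch of the pattern-matching approach described above, to show why it is brittle. The regular expression and the sample strings are hypothetical and are not taken from find_SSNs, Firefly, SENF, or Spider.

```python
import re

# Hypothetical SSN-like pattern: three digits, two digits, four digits,
# separated by dashes. Real scanners use more elaborate rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_ssn_like(text):
    """Return every substring that merely looks like an SSN."""
    return SSN_PATTERN.findall(text)

if __name__ == "__main__":
    for sample in ("SSN: 123-45-6789",      # intended match
                   "Part no. 555-12-3456",  # matches, but is not an SSN
                   "SSN 123 45 6789"):      # an SSN the pattern misses
        print(sample, "->", find_ssn_like(sample))
```

The second sample is a false positive and the third a false negative, which is the failure mode the slide points to.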

  6. Key observations 1) Personal machines often cache sensitive data 2) Servers force clients to access files using crypto 3) Crypto is a general technique, used across administrative domains and applications

  7. RedFlag overview • Identifies processes that store decrypted data • Unobtrusive (requires no user input) • Compatible with legacy applications • Compatible with existing Internet protocols • High-level insights • Stop trying to figure out what sensitive data looks like • Use heuristics of how sensitive data is handled

  8. Caveats • We cannot stop all inadvertent leaks • Stop large, important class of leaks • Trust and threat model • Uncompromised host • No IP spoofing or DNS hijacking • Correct, trusted reference monitor (take your pick) • Buggy/absent access-control policies

  9. RedFlag system overview • Monitor sockets • Inspect process • Compose rules

  10. Monitoring sockets • Goal • Try to identify incoming encrypted data • Only at application level (e.g., SSL) • Easy for most widely used apps • Look at remote port (e.g., 443 or 993) • Not always sufficient • Non-standard ports: Skype, Groove, Groupwise • XMPP sends SSL, non-SSL data to same port (5222/TCP)
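
The port check above can be sketched as a simple lookup. This is an illustration only: the names `KNOWN_ENCRYPTED_PORTS`, `PortClass`, and `classify_remote_port` are hypothetical, and treating every unlisted port as ambiguous is an assumption, not something the slides state.

```python
from enum import Enum

class PortClass(Enum):
    ENCRYPTED = "encrypted"   # well-known SSL port, no further check needed
    AMBIGUOUS = "ambiguous"   # needs a second look (e.g., an entropy score)

# Ports named on the slide as examples of well-known encrypted services.
KNOWN_ENCRYPTED_PORTS = {443, 993}

def classify_remote_port(port):
    """Classify a connection by its remote port alone."""
    if port in KNOWN_ENCRYPTED_PORTS:
        return PortClass.ENCRYPTED
    # Assumption: anything else, including XMPP on 5222, is ambiguous.
    return PortClass.AMBIGUOUS

if __name__ == "__main__":
    for port in (443, 993, 5222):
        print(port, classify_remote_port(port).value)
```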

  11. Information entropy • Compute entropy score for ambiguous ports • Negligible performance overhead • If score above threshold (~7.9 bits/byte), invoke inspection process • Can induce false positives • Compressed data sent in the clear (e.g., mp3s) • On-the-fly compression schemes (e.g., http content-coding=gzip) • Luckily, doesn’t need to be 100% accurate • Really just a performance optimization to save work • Only used as a first-pass filter • Correct any mistakes in inspection phase
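
One plausible way to compute such a score is the Shannon entropy of the byte histogram, sketched below. The slides give only the threshold (about 7.9 bits/byte); the function names and the choice to score a whole buffer at once are assumptions.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 7.9  # bits per byte, from the slide

def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def needs_inspection(data: bytes) -> bool:
    """First-pass filter: flag buffers whose entropy exceeds the threshold."""
    return entropy_bits_per_byte(data) > ENTROPY_THRESHOLD

if __name__ == "__main__":
    plaintext = b"GET / HTTP/1.1\r\n" * 50
    uniform = bytes(range(256)) * 4  # stand-in for ciphertext-like data
    print(entropy_bits_per_byte(plaintext), needs_inspection(plaintext))
    print(entropy_bits_per_byte(uniform), needs_inspection(uniform))
```

Compressed data also scores high under this measure, which is why the score works only as a first-pass filter.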

  12. RedFlag system overview • Monitor sockets • Inspect process • Compose rules

  13. Inspect process • Goals of inspection • Infer when file write depends on network read • Determine whether file write is decrypted data • Use taint-tracking • Too slow to perform in critical path of desktop apps • Perform asynchronously via deterministic replay • Fork if network monitor flags process (port or entropy) • Log libc calls in original, use log in replay process • Attach taint-tracker to replayed process (e.g., PIN) • Perform analysis on a free core in the background
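
The log-and-replay idea can be illustrated with a toy example: the original process records the result of each call, and the replayed process consumes the log instead of touching the real descriptor. This is a sketch under simplifying assumptions (one call type, one thread); `CallLog`, `recorded_read`, and `Replay` are hypothetical names and do not reflect how Jockey works.

```python
import os

class CallLog:
    """Ordered record of (call name, result) pairs from the original run."""
    def __init__(self):
        self.entries = []

    def record(self, name, result):
        self.entries.append((name, result))

def recorded_read(log, fd, nbytes):
    """Perform a real read and log its result for later replay."""
    data = os.read(fd, nbytes)
    log.record("read", data)
    return data

class Replay:
    """Serves logged results in order, without performing any I/O."""
    def __init__(self, log):
        self._entries = iter(log.entries)

    def read(self, fd, nbytes):
        name, result = next(self._entries)
        if name != "read":
            raise RuntimeError("replay diverged from the recorded run")
        return result

if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"secret")
    os.close(w)
    log = CallLog()
    original = recorded_read(log, r, 16)
    os.close(r)
    print(Replay(log).read(r, 16) == original)
```

Because the replayed run sees the same inputs, a slow analysis such as taint tracking can be attached to it without delaying the original process.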

  14. Taint tracking • Implement with PIN • Rewrite instructions to propagate taint • Record taint in shadow memory • Key questions • What are the taint sources? • What info to send to the policy composer?
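
A minimal sketch of shadow memory, assuming one label per byte of application memory and a table that maps label IDs to source descriptions. The class and method names are hypothetical; a real tracker propagates labels at the instruction level rather than through explicit `copy` calls.

```python
class ShadowMemory:
    """Maps each tainted address to a label; label 0 means untainted."""

    def __init__(self):
        self.labels = {}        # address -> label ID
        self.sources = [None]   # label ID -> source description

    def register_source(self, description):
        """Allocate a label ID for a taint source (e.g., a remote endpoint)."""
        self.sources.append(description)
        return len(self.sources) - 1

    def taint(self, addr, length, label):
        """Mark `length` bytes starting at `addr` with `label`."""
        for a in range(addr, addr + length):
            self.labels[a] = label

    def label_at(self, addr):
        return self.labels.get(addr, 0)

    def copy(self, dst, src, length):
        """Propagate taint for a memory copy; snapshot first to handle overlap."""
        snapshot = [self.label_at(src + i) for i in range(length)]
        for i, label in enumerate(snapshot):
            self.labels[dst + i] = label

if __name__ == "__main__":
    shadow = ShadowMemory()
    label = shadow.register_source("74.125.45.83:443")
    shadow.taint(0x1000, 4, label)      # bytes read from the network
    shadow.copy(0x2000, 0x1000, 4)      # application copies them elsewhere
    print(shadow.sources[shadow.label_at(0x2003)])
```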

  15. Shadow memory • [Diagram: bytes in the address space (<!DOCTYPE html PUBLIC ...) each map to a one-byte taint label in shadow memory; a label points to a source entry such as “/tmp/attach.pdf, 74.125.45.83:443”] • Fine when there is no ambiguity about the source • But what about ambiguous ports?

  16. Ambiguous ports • Search process memory for AES s-boxes • S-boxes are set by algorithm designer • S-boxes are unlikely to appear randomly • (also look for well-known transformations)
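
The s-box search can be sketched as a byte-string scan. The constant below is the first 16 entries of the standard AES forward s-box; matching on that prefix alone, rather than the full 256-byte table or derived lookup tables, is a simplifying assumption, and `find_sbox_offsets` is a hypothetical name.

```python
# First row of the AES forward s-box (fixed by the AES specification).
AES_SBOX_PREFIX = bytes([
    0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
    0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
])

def find_sbox_offsets(image: bytes):
    """Return every offset at which the s-box prefix occurs in `image`."""
    offsets = []
    start = 0
    while True:
        idx = image.find(AES_SBOX_PREFIX, start)
        if idx == -1:
            return offsets
        offsets.append(idx)
        start = idx + 1

if __name__ == "__main__":
    # Stand-in for a library data section with an embedded table.
    fake_section = b"\x00" * 100 + AES_SBOX_PREFIX + b"\x00" * 10
    print(find_sbox_offsets(fake_section))
```

A 16-byte constant of this kind is very unlikely to appear by chance, which is the property the slide relies on.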

  17. Ambiguous ports • If we find s-boxes in a library data section • Assume image is a crypto library • Vast majority of crypto libraries include AES implementation • Instrument lib to set “crypto bit” of inbound taint labels • If crypto bit == 1, network data was “routed” through crypto lib • If crypto bit == 0, assume network data was not decrypted • Also use s-boxes as taint source • Data derived from s-boxes have “AES bit” set • Can use to gauge strength of crypto algorithm • [Diagram: taint label (byte) made up of an ID index, an AES bit, and a crypto bit]
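
A one-byte label holding an ID index plus two flag bits could be packed as follows. The slides name the three fields but not their positions, so the layout here (crypto bit in bit 7, AES bit in bit 6, ID index in the low six bits) is an assumption, as are the function names.

```python
CRYPTO_BIT = 0x80   # assumed position: data passed through a crypto library
AES_BIT = 0x40      # assumed position: data derived from AES s-boxes
ID_MASK = 0x3F      # assumed: low six bits index the source table

def make_label(source_id, crypto=False, aes=False):
    """Pack a source index and the two flags into a one-byte label."""
    if not 0 <= source_id <= ID_MASK:
        raise ValueError("source_id does not fit in the ID field")
    label = source_id
    if crypto:
        label |= CRYPTO_BIT
    if aes:
        label |= AES_BIT
    return label

def decode_label(label):
    """Return (source_id, crypto_bit, aes_bit) for a one-byte label."""
    return (label & ID_MASK, bool(label & CRYPTO_BIT), bool(label & AES_BIT))

if __name__ == "__main__":
    label = make_label(1, crypto=True, aes=True)
    print(hex(label), decode_label(label))
```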

  18. RedFlag system overview • Monitor sockets • Inspect process • Compose rules

  19. Compose rules • Taint-tracking gives three pieces of info • Description of network source • If data was routed through crypto library • If data was derived from AES s-box • Can use this to compose policies

  20. Compose rules • Same source • Allow sensitive files to be copied back to their source • Raise alert otherwise • Generalize hostnames (e.g., *.google.com) • Obfuscation vs. confidentiality • Many P2P clients use crypto to obfuscate • Aren’t trying to protect data so use weak algorithms • (e.g., BitTorrent and LimeWire explicitly do not support AES) • If ambiguous port + no AES, then ignore file
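
The rules above can be sketched as a small decision function. This is an illustration under stated assumptions: `FileTaint`, `generalize`, `is_sensitive`, and `decide` are hypothetical names, data from a known encrypted port is assumed sensitive without further checks, and hostnames are generalized naively to their last two labels.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

@dataclass
class FileTaint:
    source_host: str       # hostname the file's data came from
    ambiguous_port: bool   # True if the port alone did not imply encryption
    crypto_bit: bool       # data was routed through a crypto library
    aes_bit: bool          # data was derived from AES s-boxes

def generalize(hostname):
    """Naive generalization: mail.google.com -> *.google.com."""
    parts = hostname.split(".")
    if len(parts) <= 2:
        return hostname
    return "*." + ".".join(parts[-2:])

def is_sensitive(taint):
    if not taint.ambiguous_port:
        return True  # assumption: known encrypted port implies sensitive
    # Ambiguous port: require decryption with AES; otherwise ignore the file.
    return taint.crypto_bit and taint.aes_bit

def decide(taint, dest_host):
    """Return 'ignore', 'allow', or 'alert' for sending the file to dest_host."""
    if not is_sensitive(taint):
        return "ignore"
    if fnmatch(dest_host, generalize(taint.source_host)):
        return "allow"   # same-source rule
    return "alert"

if __name__ == "__main__":
    mail = FileTaint("mail.google.com", False, False, False)
    p2p = FileTaint("peer.example.net", True, True, False)
    print(decide(mail, "docs.google.com"))
    print(decide(mail, "peer.example.org"))
    print(decide(p2p, "other.example.org"))
```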

  21. RedFlag implementation • Runs on Ubuntu 8.10 • Modified Jockey for logging/replay • Supports multi-threaded programs • User-level thread library • PIN tool for tainting • Based on sequential taint tracker from Speck • Modified to allow tainting during replay • Implemented s-box search, crypto and AES bits in taint label

  22. Evaluation • Accuracy • How well can RedFlag identify crypto libraries using s-boxes? • How well does RedFlag categorize sensitive files? • Performance • Will asynchronous taint-tracking fall behind?

  23. Identifying crypto libraries • Looked at 10 Ubuntu programs • Email: checkgmail, thunderbird • IM: pidgin • P2P: Azureus, LimeWire, Skype, Transmission • Web: Firefox, Opera, wget • Successfully identified crypto libs in all • Including custom implementations, plugins (Flash Player) • Interesting case: Opera folds crypto into executable

  24. Categorizing sensitive files • Non-sensitive files • Used Firefox • Loaded 30 most popular websites (Alexa) • RedFlag produced no false positives/negatives • Sensitive files • Downloaded 17 representative sensitive docs • Firefox, thunderbird, pidgin

  25. Categorizing sensitive files

  26. Taint-tracking performance

  27. Conclusions • RedFlag automates policy specification • Heuristic-based approach • Monitor process behavior, not file content • Sensitive files usually downloaded using crypto • Deal with ambiguous ports using entropy scores, AES s-boxes • Evaluation highlights • Automatically identified crypto libraries • Correctly categorized files in 45/47 scenarios • No false positives, three false negatives • Sufficient idle time in long-running process

  28. Thanks! I’m happy to take questions
