Explore secure dataflow processing in open distributed systems, discussing ADU, topology, and integrity attacks, with provenance-based protection and function integrity attestation techniques. Learn about system design, implementation, and experimental evaluation to safeguard data processing applications. Evaluate overhead and detection probability of basic protection schemes, including randomized data attestation. This study emphasizes maintaining integrity and confidentiality in dataflow processing applications.
Towards Secure Dataflow Processing in Open Distributed Systems Juan Du, Wei Wei, Xiaohui (Helen) Gu, Ting Yu 1/21
Outline • Introduction • Design and Algorithms • Experimental Evaluation • Related Work • Conclusion 2/21
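Dataflow Processing in Distributed Systems • [Figure: a stream of ADUs …, di, … flows through data processing components f1, f2, f3 hosted by component providers (f4 and f5 are also shown), producing …, f1(di), …, then …, f2(f1(di)), …, then …, f3(f2(f1(di))), …] • Legend: component provider; data processing component; di = ADU; dataflow 3/21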
Run in Open Distributed Systems • Dataflow Processing Applications • Network traffic monitoring • Sensor data analysis • Audio/video surveillance • Scientific data processing • Advantages in Open Distributed Systems • Highly scalable and available infrastructures • No need to maintain hardware and software • Challenges in Open Distributed Systems • Component providers come from different security domains • Not all data processing components are trustworthy 4/21
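ADU Attack • [Figure: the input stream …, d2, d1 passes through f1, giving …, f1(d2), f1(d1); downstream of a malicious component the stream no longer matches, with labels showing … f2(f1(d1)) and … f2(f1(d1)), d0] • Legend: component provider; malicious component; data processing component; di = ADU; dataflow 5/21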
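Dataflow Topology Attack • [Figure: …, f1(d2), … leaves f1; the diagram contrasts the outputs …, f3(f2(f1(d2))), … and …, f3(f5(f2(f1(d2)))), …, the latter passing through an extra component f5] • Legend: component provider; malicious component; data processing component; di = ADU; dataflow 6/21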
Function Integrity Attack • [Figure: …, f1(d2), … enters a malicious component, which outputs …, f0(f1(d2)), …, the result of a function f0 other than the expected one] • Legend: component provider; malicious component; data processing component; di = ADU; dataflow 7/21
System Design • Attack Models • ADU attack • Dataflow topology attack • Function integrity attack • Assumptions • Third-party component providers could be malicious • Composers and users are trusted • PKI is deployed in advance • Goals • Provide integrity and confidentiality for dataflow processing applications • Focus on discussing integrity issues 8/21
Provenance-based ADU Protection • [Figure: s1 sends d to s2; s2 returns a receipt] • “Receipt” packet: [sqn, session_Id, hash(d)]sign_s2 • ADU dropping attack • s2 may claim it did not receive d • s1 may claim it sent d when it did not 9/21
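The receipt packet on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and an HMAC stands in for the signature sign_s2. The slides assume a PKI, and a real deployment would need a public-key signature, because a shared-key HMAC cannot prove to a third party which side produced the receipt.

```python
import hashlib
import hmac


def make_receipt(key: bytes, sqn: int, session_id: str, adu: bytes) -> dict:
    """Build [sqn, session_id, hash(d)] and tag it with the receiver's key.

    Assumption: HMAC-SHA256 is only a stand-in for the signature sign_s2.
    """
    digest = hashlib.sha256(adu).hexdigest()
    body = f"{sqn}|{session_id}|{digest}".encode()
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {"sqn": sqn, "session_id": session_id, "hash": digest, "sig": tag}


def verify_receipt(key: bytes, receipt: dict, adu: bytes) -> bool:
    """Check the tag and that the receipt covers exactly this ADU."""
    body = f"{receipt['sqn']}|{receipt['session_id']}|{receipt['hash']}".encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, receipt["sig"])
    return tag_ok and receipt["hash"] == hashlib.sha256(adu).hexdigest()
```

With a valid receipt in hand, s1 can show that s2 acknowledged d; without one, s1 has no evidence that it sent d.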
Provenance-based ADU Protection • [Figure: d -> f1 -> f1(d) -> f2 -> f2(f1(d))] • Provenance evidence, built from the hashes of each component's input and output: [[h(d), h(f1(d))]sign_s1]key_c and [[h(f1(d)), h(f2(f1(d)))]sign_s2]key_c • Cached or carry-on evidence • Consistency verification between different components 10/21
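The consistency verification between components can be sketched as a hash-chain check: the output hash recorded by one component must equal the input hash recorded by the next. This is a simplified sketch under stated assumptions. The names `Evidence`, `make_evidence` and `first_inconsistency` are hypothetical, and the signing (sign_si) and encryption under key_c shown on the slide are omitted.

```python
import hashlib
from typing import List, Optional, Tuple

# (component name, h(input), h(output)); signatures and encryption omitted.
Evidence = Tuple[str, str, str]


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_evidence(component: str, inp: bytes, out: bytes) -> Evidence:
    return (component, h(inp), h(out))


def first_inconsistency(source: bytes, chain: List[Evidence],
                        result: bytes) -> Optional[str]:
    """Return a description of the first hop where hashes disagree, or None."""
    expected = h(source)
    for component, h_in, h_out in chain:
        if h_in != expected:
            return f"input of {component} does not match the previous output"
        expected = h_out
    if expected != h(result):
        return "delivered result does not match the last recorded output"
    return None
```

A mismatch shows only that the two neighbouring records disagree; deciding which side is at fault needs further evidence, such as the receipts from the previous slide.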
Dataflow Topology Protection • [Figure: the dataflow runs C -> s1 (f1) -> s2 (f2) -> s3 (f3) -> C; the hop entries [s1][s2][s3][C] are each signed with sig_c and wrapped under key_s1, key_s2, key_s3] • Cascading topology encryption • No component can change the dataflow topology • Each component knows only its previous hop and next hop 11/21
Dataflow Topology Protection • [Figure: the dataflow runs C -> s1 (f1) -> s2 (f2) -> s3 (f3) -> C; the hop entries [s1][s2][s3][C] are each signed with sig_c and wrapped under key_s1, key_s2, key_s3] • At s1: [s1]sig_c[s2]sig_c[s3]sig_c[C]sig_c, with layers key_s3 and key_s2 remaining • At s2: [s2]sig_c[s3]sig_c[C]sig_c, with layer key_s3 remaining • At s3: [s3]sig_c[C]sig_c • Cascading topology encryption • No component can change the dataflow topology • Each component knows only its previous hop and next hop • Onion routing [Goldschlag, et al., 1999] 12/21
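The layering idea behind cascading topology encryption can be illustrated with a toy onion. This is a rough sketch, not the scheme on the slide: `toy_crypt`, `build_onion` and `peel` are hypothetical names, the XOR keystream cipher is an insecure stand-in for encryption under each component's key, and the composer's signatures (sig_c) are left out.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def _keystream(key: bytes, n: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with a hash-derived keystream; encrypts and decrypts. NOT secure."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def build_onion(path: List[str], keys: Dict[str, bytes], final_hop: str) -> bytes:
    """Wrap the route from the inside out so each hop reads only its own layer."""
    inner = b""
    next_hop = final_hop
    for hop in reversed(path):
        layer = json.dumps({"next": next_hop, "inner": inner.hex()}).encode()
        inner = toy_crypt(keys[hop], layer)
        next_hop = hop
    return inner


def peel(key: bytes, onion: bytes) -> Tuple[str, bytes]:
    """Remove one layer: learn the next hop and the blob to forward."""
    layer = json.loads(toy_crypt(key, onion).decode())
    return layer["next"], bytes.fromhex(layer["inner"])
```

For example, with `keys = {"s1": b"k1", "s2": b"k2", "s3": b"k3"}`, calling `peel(keys["s1"], build_onion(["s1", "s2", "s3"], keys, "C"))` yields `"s2"` plus an inner blob that s1 cannot read further.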
Function Integrity Attestation • [Figure: C sends d1 and d3 through s1 (f1) and s5 (f2), and d2 through s2 (f1) and s6 (f2); the duplicate d3’ goes through s3 (f1) and s7 (f2), and the duplicate d2’ through s4 (f1) and s8 (f2); C then checks f2(f1(d2)) == f2(f1(d2’)) and f2(f1(d3)) == f2(f1(d3’))] • Randomized data attestation • Achieve scalable function integrity attack detection • Duplicate a random subset of ADUs • Send duplicates to selected functionally equivalent components • Check result consistency • Continuously perform randomized data attestation 13/21
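The attestation loop described on this slide might look roughly as follows. This is a sketch under assumptions: components are modelled as plain Python callables, `p_u` is the duplication probability, and the function name `attest` is hypothetical.

```python
import random
from typing import Callable, List, Sequence


def attest(adus: Sequence[bytes],
           primary: Callable[[bytes], bytes],
           replicas: List[Callable[[bytes], bytes]],
           p_u: float,
           rng: random.Random) -> List[int]:
    """Return indices of ADUs whose duplicate result disagreed with the primary."""
    suspicious = []
    for i, adu in enumerate(adus):
        result = primary(adu)
        # Duplicate a random subset of ADUs with probability p_u.
        if rng.random() < p_u:
            # Send the duplicate to a functionally equivalent component.
            replica = rng.choice(replicas)
            if replica(adu) != result:
                suspicious.append(i)
    return suspicious
```

A disagreement shows only that one of the two components returned a wrong result. It does not say which one, and colluding components that return the same wrong result would pass this check.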
Implementation and Experimental Setup • Implementation • Implement a prototype of the secure dataflow processing system • Follow the design of IBM System S • Experiment setup • Conduct experiments on PlanetLab • Use about 200 hosts • One host represents one component provider • Composer deployed on a pre-defined PlanetLab host 14/21
Evaluation • Overhead caused by basic protection schemes • Randomized data attestation • Overhead • in terms of dataflow processing delay • (time of d_n getting out - time of d_1 getting in) / n • Detection probability • non-collusion • collusion 15/21
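The delay metric above can be computed directly from per-ADU timestamps. This is a small sketch with a hypothetical function name; it assumes `t_in[i]` and `t_out[i]` are the entry and exit times of the i-th ADU in the same time unit.

```python
from typing import Sequence


def avg_processing_delay(t_in: Sequence[float], t_out: Sequence[float]) -> float:
    """(time d_n leaves - time d_1 enters) / n, following the slide's definition."""
    n = len(t_in)
    if n == 0 or len(t_out) != n:
        raise ValueError("need matching, non-empty timestamp lists")
    return (t_out[-1] - t_in[0]) / n
```

For instance, three ADUs entering at times 0, 1, 2 and leaving at 4, 5, 6 give (6 - 0) / 3 = 2 time units per ADU.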
Overhead of Basic Protection Schemes • The overhead is about 10-15% for both secure dataflow schemes
Overhead of Randomized Data Attestation • # of redundant components k = 5 • data size = 1 KB • data rate = 10 ADUs/sec • duration = 30 s • Avg dataflow processing delay increases with the number of redundant components used • Due to sub-optimal dataflow topology
Detection Probability • Detection probability increases with the duplication probability p_u and the number of redundant components used • Detection is harder in collusion scenarios than in non-collusion scenarios 18/21
Related Work • Distributed dataflow processing • Focuses on resource and performance management issues • Assumes that data processing components are trustworthy • Trust management in distributed systems • Distributed messaging systems [Haeberlen, et al., SOSP 2007] • Pub-sub overlay [Srivatsa, et al., CCS 2005] • None of them addresses secure and scalable dataflow processing in open distributed systems • Byzantine fault-tolerance • In wide-area networks [Amir, et al., DSN 2006] • No trusted party 19/21
Conclusion • Finished Work • The first attempt to address the integrity of dataflow processing application delivery on open distributed systems • Identify and classify major security attacks • Propose a set of effective protection schemes • Future Work • Non-linear dataflow topologies • Integrity attestation for stateful functions • Further identify malicious components 20/21
Thank you • Questions? 21/21