This presentation explores the concept of accountability in distributed systems and its critical importance in cloud computing. Key topics include defining accountability, practical implementations like PeerReview, and the associated technical challenges. The implications of multiple administrative domains, stakeholder interests, and lack of transparency are examined, alongside lessons from traditional sectors such as banking. The goal is to develop systems capable of detecting and proving faults, promoting transparency, trust, and responsible behavior among nodes within a distributed network.
Accountable distributed systems and the accountable cloud. Peter Druschel, joint work with Andreas Haeberlen¹, Petr Kuznetsov², Rodrigo Rodrigues (¹ University of Pennsylvania, ² TU Berlin/Deutsche Telekom Labs). Building and Programming the Cloud, Mysore, Jan 2010
Outline • Why accountability? • A definition • A practical implementation: PeerReview • Accountability in the Cloud • Technical Challenges • Conclusion
What is the problem? • Multiple administrative domains (federated, p2p) • Multiple stakeholders (hosting, Web) • different actors, somewhat different interests • lack of global visibility, control • Complex faults • software faults, misconfiguration, negligence, disgruntled employees, outside attacks, manipulation • Lack of transparency
Learning from the 'offline' world • Relies heavily on accountability to deal with faults, misbehavior • Example: Banking • Record can be used to (manually) • detect problems • identify the responsible party • convince others that a problem does (not) exist
What does accountability mean in distributed systems? • Tamper-evident record of each node's actions • (Automated) audit for fault detection, localization • Evidence to convince a third party that a fault has (not) occurred • Accountability provides • transparency • trust • incentives to avoid faults
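One common way to obtain a tamper-evident record is a hash chain, in which every entry commits to the hash of its predecessor. The sketch below is illustrative only and is not taken from the talk: the names `append`, `verify`, `entry_hash`, and `GENESIS`, and the sample actions, are assumptions made for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the start of the chain


def entry_hash(prev_hash: str, seq: int, action: str) -> str:
    """Hash an entry together with the hash of its predecessor."""
    payload = json.dumps([prev_hash, seq, action]).encode()
    return hashlib.sha256(payload).hexdigest()


def append(log: list, action: str) -> None:
    """Append an action, chaining it to the current head of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    seq = len(log)
    log.append({"seq": seq, "action": action,
                "hash": entry_hash(prev, seq, action)})


def verify(log: list) -> bool:
    """Recompute the chain; an edited, reordered, or removed interior entry breaks it."""
    prev = GENESIS
    for seq, entry in enumerate(log):
        expected = entry_hash(prev, seq, entry["action"])
        if entry["seq"] != seq or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


log = []
for action in ["recv X from A", "send Y to C", "write file f"]:
    append(log, action)

intact = verify(log)               # chain as originally recorded
log[1]["action"] = "send Z to C"   # tamper with a recorded action
tampered_ok = verify(log)          # chain after tampering
print(intact, tampered_ok)
```

A hash chain alone does not stop a node from rewriting its entire log or cutting off the tail; detecting that requires the node to have committed to its log head in a signed statement held by others.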
Ideal accountability • Fault := Node deviates from expected behavior • Our goal is to automatically • detect faults • identify the faulty nodes • convince others that a node is (or is not) faulty • Can we build a system that provides the following guarantee? Whenever a node is faulty, the system generates a proof of misbehavior against that node
Can we detect all faults? • Problem: Faults that affect only a node's internal state • Would require online trusted probes at each node • Focus on observable faults: • Faults that affect a correct node • Can detect observable faults without requiring trusted components
Can we always get a proof? • Problem: He-said-she-said • Three possible causes: • A never sent X • B refuses to acknowledge X • X was delayed by the network • Cannot get proof of misbehavior! • Generalize to verifiable evidence: • a proof of misbehavior, or • a challenge that a faulty node cannot answer • What if the challenged node does not respond? • Does not prove a fault, but node is suspected until it responds (Figure: A says "I sent X!", B says "I never received X!", and a third node C cannot tell which is true.)
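The distinction between a proof of misbehavior and an unanswered challenge can be sketched as a small state model. This is a minimal illustration, not the talk's implementation: the class `NodeView`, the status labels, and the challenge identifier `"ack-X"` are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    TRUSTED = "trusted"
    SUSPECTED = "suspected"   # at least one challenge is still unanswered
    EXPOSED = "exposed"       # a proof of misbehavior exists


@dataclass
class NodeView:
    """What one observer currently believes about another node."""
    open_challenges: set = field(default_factory=set)
    proofs: list = field(default_factory=list)

    def challenge(self, challenge_id: str) -> None:
        self.open_challenges.add(challenge_id)

    def answer(self, challenge_id: str) -> None:
        # A valid response clears the challenge, so suspicion is not permanent.
        self.open_challenges.discard(challenge_id)

    def add_proof(self, proof: str) -> None:
        # A proof is verifiable by anyone, so it is never retracted.
        self.proofs.append(proof)

    @property
    def status(self) -> Status:
        if self.proofs:
            return Status.EXPOSED
        if self.open_challenges:
            return Status.SUSPECTED
        return Status.TRUSTED


view_of_b = NodeView()
history = [view_of_b.status]
view_of_b.challenge("ack-X")      # "please acknowledge message X"
history.append(view_of_b.status)
view_of_b.answer("ack-X")         # B responds, e.g. after a network delay
history.append(view_of_b.status)
view_of_b.add_proof("two conflicting signed statements")
history.append(view_of_b.status)
print([s.value for s in history])
```

The asymmetry is deliberate: a slow but correct node can always clear a suspicion by answering, whereas a proof stays valid no matter what the node does later.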
Practical accountability • Requirement for an accountable distributed system: Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node • This is useful • Any (!) fault that affects a correct node is eventually detected and linked to a faulty node • It can be implemented in practice
PeerReview Adds accountability to a given system: • Implemented as a library • Provides tamper-evident record • Detects faults via state-machine replay Assumptions: • Nodes can be modeled as deterministic state machines • There is a trusted reference implementation of the state machines • Correct nodes can eventually communicate • Nodes can sign messages
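Fault detection via state-machine replay can be illustrated with a toy auditor that feeds the logged inputs to a reference implementation and compares the outputs. This is a sketch under stated assumptions, not PeerReview's actual API: the counter service `reference_step`, the `audit` function, and the log format of (input, output) pairs are all hypothetical.

```python
from typing import Callable, List, Tuple


def reference_step(state: int, msg: str) -> Tuple[int, str]:
    """Hypothetical trusted reference implementation: a deterministic counter.

    Maps (state, input) to (new_state, output).
    """
    if msg == "inc":
        return state + 1, f"ok {state + 1}"
    if msg == "get":
        return state, f"value {state}"
    return state, "error"


def audit(log: List[Tuple[str, str]],
          step: Callable[[int, str], Tuple[int, str]],
          initial_state: int = 0) -> int:
    """Replay logged inputs; return the index of the first divergent output, or -1."""
    state = initial_state
    for i, (msg, logged_output) in enumerate(log):
        state, expected = step(state, msg)
        if expected != logged_output:
            return i
    return -1


correct_log = [("inc", "ok 1"), ("inc", "ok 2"), ("get", "value 2")]
faulty_log = [("inc", "ok 1"), ("inc", "ok 2"), ("get", "value 7")]

correct_result = audit(correct_log, reference_step)
faulty_result = audit(faulty_log, reference_step)
print(correct_result, faulty_result)
```

Replay only works because the node is assumed to be deterministic: the same inputs must always produce the same outputs, so any mismatch points at the audited node rather than at chance.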
PeerReview is widely applicable • App #1: NFS server in the Linux kernel • Many small, latency-sensitive requests • Tampering with files • Lost updates • App #2: Overlay multicast • Transfers large volume of data • Freeloading • Tampering with content • App #3: P2P email • Complex, large, decentralized • Denial of service • Attacks on DHT routing • Details in [Haeberlen et al., SOSP’07] • NetReview [Haeberlen et al. NSDI’08] • Metadata corruption • Incorrect access control • Censorship
How much does PeerReview cost? • Log storage • 10 – 100 GByte per month, depending on application • Message signatures • Message latency (e.g. 1.5ms RTT with RSA-1024) • CPU overhead (embarrassingly parallel) • Log/authenticator transfer, replay overhead • Depends on # witnesses • Can be deferred to exploit bursty/diurnal load patterns
Split administration in the Cloud • What if there is a problem? • Bug in Alice's software • Subtle differences between Alice's and Bob's environments • ... • Bug in Bob's software • Insufficient resource allocation • Hacker attack • ... (Figure: Alice runs her software on Bob's cloud to serve Alice's customers.)
Split administration: Alice's perspective • If something is wrong, how will I know? • How can I tell if it's my software or the cloud? • If it's the cloud, how can I convince Bob?
Split administration: Bob's perspective • If something is wrong, how will I know? • How can I tell if it's the cloud or Alice's software? • If it's Alice's software, how can I convince Alice?
An idealized solution • What if we had an oracle that Alice and Bob could ask about problems? • Completeness: If the cloud is faulty, the oracle will say so • Accuracy: If the cloud is not faulty, the oracle will say so • Verifiability: The oracle produces evidence that would convince a third party
The accountable cloud • Idea: Make cloud accountable • Cloud records its actions in a tamper-evident log • Alice can audit the log and check for faults • Use log to construct evidence that a fault does (not) exist • Should work even if one party was compromised!
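For Alice's audit to be meaningful, the cloud must not be able to hand her a different log than the one it committed to earlier. A rough sketch of such a commitment check follows. It is an illustration under loud assumptions: `CLOUD_KEY`, `issue_authenticator`, and `consistent` are hypothetical names, and HMAC with a shared key merely stands in for a public-key signature. A real deployment would need asymmetric signatures, since with a shared key the auditor could forge commitments and a third party could not verify them.

```python
import hashlib
import hmac

CLOUD_KEY = b"demo-key"  # assumption: stand-in for the cloud's private signing key


def chain(entries):
    """Return the hash-chain head over a list of log entries."""
    head = b"\x00" * 32
    for entry in entries:
        head = hashlib.sha256(head + entry.encode()).digest()
    return head


def issue_authenticator(entries):
    """Cloud side: commit to the current log prefix as (length, head, tag)."""
    head = chain(entries)
    msg = len(entries).to_bytes(8, "big") + head
    tag = hmac.new(CLOUD_KEY, msg, hashlib.sha256).digest()
    return len(entries), head, tag


def consistent(authenticator, presented_log):
    """Auditor side: does the presented log match the earlier commitment?"""
    length, head, tag = authenticator
    msg = length.to_bytes(8, "big") + head
    expected_tag = hmac.new(CLOUD_KEY, msg, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected_tag):
        return False  # the authenticator itself is not genuine
    if len(presented_log) < length:
        return False  # the log was truncated after the commitment
    return chain(presented_log[:length]) == head


log = ["put k1=v1", "put k2=v2"]
auth = issue_authenticator(log)       # Alice keeps this commitment

log.append("get k1")                  # the honest log keeps growing
extended_ok = consistent(auth, log)

rewritten = ["put k1=EVIL", "put k2=v2", "get k1"]
rewritten_ok = consistent(auth, rewritten)
truncated_ok = consistent(auth, log[:1])
print(extended_ok, rewritten_ok, truncated_ok)
```

With real signatures, the pair of an old commitment and a log that contradicts it is itself evidence that can be shown to a third party.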
Discussion • Is this too pessimistic? Cloud isn't malicious! • Hacker attacks, software bugs, operator error, malicious client, … • Difficult to come up with a more restrictive fault model • Without provable properties, evidence has little value • Why would a provider want to deploy this? • Attractive to prospective customers (peace of mind) • Helps in handling customer complaints, resolving disputes
Is the technology ready? • Cloud accountability should • Have provable guarantees • Work for most cloud applications • Require no changes to application code • Cover a wide spectrum of properties • Have reasonable overhead • Can existing techniques deliver this? • CATS, Repeat&Compare, AIP, PeerReview, NetReview, AudIt, ... • More work is needed!
Work in progress: AVM • Goal: Provide accountability for arbitrary binary executables • Idea: Accountable virtual machine (AVM) • Cloud records enough data to enable deterministic replay • Alice can replay log against a reference implementation • Can audit any part of the hosted execution
Challenges • Complete state-machine replay expensive • limit to spot checks, investigation of suspected faults • multi-core replay is hard • replay log against an abstract model? • Checking performance properties • Checking information flow • Lots of research opportunities
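The spot-check idea can be sketched as follows: if the log is cut into segments that each start from a recorded snapshot, an auditor can replay only a sample of segments instead of the whole execution. This is a toy illustration, not the AVM design: the transition function `step`, the segment format, and the function names are assumptions made for this example.

```python
import random
from typing import List, Tuple

Segment = Tuple[int, List[int], int]  # (start_snapshot, inputs, end_snapshot)


def step(state: int, x: int) -> int:
    """Hypothetical deterministic transition: a running checksum."""
    return (state * 31 + x) % 1_000_003


def record(inputs: List[int], segment_len: int) -> List[Segment]:
    """Execute the inputs and log them as snapshot-delimited segments."""
    segments, state = [], 0
    for i in range(0, len(inputs), segment_len):
        chunk = inputs[i:i + segment_len]
        start = state
        for x in chunk:
            state = step(state, x)
        segments.append((start, chunk, state))
    return segments


def check_segment(segment: Segment) -> bool:
    """Replay one segment from its start snapshot and compare the end snapshot."""
    start, chunk, end = segment
    state = start
    for x in chunk:
        state = step(state, x)
    return state == end


def spot_check(segments: List[Segment], k: int, rng: random.Random) -> List[int]:
    """Replay k randomly chosen segments; return the indices that fail."""
    chosen = rng.sample(range(len(segments)), k)
    return sorted(i for i in chosen if not check_segment(segments[i]))


segments = record(list(range(100)), segment_len=10)
start, chunk, end = segments[4]
segments[4] = (start, chunk, end + 1)   # corrupt one segment's claimed result

full_audit = spot_check(segments, k=len(segments), rng=random.Random(0))
print(full_audit)
```

Sampling fewer segments lowers the replay cost but makes detection probabilistic: a corrupted segment is caught only if it happens to be among those chosen. A complete auditor would also confirm that each segment's end snapshot equals the next segment's start snapshot, which this sketch omits.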
Summary • Accountability is a useful capability in distributed systems • tamper-evident record • fault detection and localization • evidence • Proposal: the accountable cloud • Can verify correct operation, produce evidence • Provable guarantees: a solid foundation for both players • Challenges remain Questions?