
Using Queries for Distributed Monitoring and Forensics






Presentation Transcript


  1. Using Queries for Distributed Monitoring and Forensics
  • Atul Singh, Rice University
  • Petros Maniatis, Intel Research Berkeley
  • Timothy Roscoe, Intel Research Berkeley
  • Peter Druschel, Max Planck Institute for Software Systems

  2. Building and monitoring a system
  • Building a distributed system is a complex undertaking
    • Select properties, choose algorithms, implement, deploy
  • Then we switch to monitoring the system
    • Testing, debugging, profiling, tuning
  • Monitoring is hard and error-prone
    • Distributed state
    • Partial faults
    • Complex interactions
    • Asynchrony
    • External factors

  3. Monitoring is hard!
  • Current state of the art:
    • Expose internal state: manual insertion of "printf" statements (ad hoc, error-prone)
    • Probe the exposed state: bring logs to one place, then parse and process them with scripts (perl/python) or queries (Astrolabe); offline by nature
  • Correlate events
  • Bridge the semantic gap

  4. Declarative systems: building systems via queries
  • Declarative specification via queries, execution by a distributed query processor
  • P2 [SOSP'05]: a prototype declarative system
    • Concise specifications
    • Enables rapid prototyping
  • We present a monitoring framework for P2
    • Flexible introspection: expose internals, probe the state
    • Retains the semantics of the application
    • Online execution tracing

  5. Overview
  • Introduction
  • P2 background
  • Monitoring framework
  • Example applications / performance
  • Conclusions

  6. Dataflow graph
  • Example: the route operation in P2
    route(B,K) :- route(A,K), nextHop(A,D,B), D == K.
  • Rules have the form: action :- event, precondition.
  • The rule compiles into a rule strand of dataflow elements: Join (route.A == nextHop.A), Select (D == K), Project
  • The strand sits between the Network In and Network Out elements and probes application state (the route and nextHop tables)
  [Figure: dataflow graphs of routers A and B, each with Network In/Out, rule strands R0 and R1, and route/nextHop tables holding entries such as K -> B, K -> C, K' -> D, K' -> E]
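  To make the dataflow concrete, here is a minimal Python sketch of what the rule strand above computes: a join of the route and nextHop tables on the node attribute, a selection on D == K, and a projection that emits the new route tuple. The table layouts, function names, and example data are illustrative assumptions; P2 itself executes this as compiled dataflow elements, not Python.

    # Illustrative re-expression of: route(B,K) :- route(A,K), nextHop(A,D,B), D == K.
    route_table   = [("A", "K")]                 # route(A, K): node A has a route event for key K
    nexthop_table = [("A", "K", "B"),            # nextHop(A, D, B): at A, destination D is reachable via B
                     ("A", "K2", "C")]

    def rule_strand(route_tuples, nexthop_tuples):
        """Join -> Select -> Project, mirroring the elements of the rule strand."""
        results = []
        for (a1, k) in route_tuples:
            for (a2, d, b) in nexthop_tuples:
                if a1 != a2:            # Join: route.A == nextHop.A
                    continue
                if d != k:              # Select: D == K
                    continue
                results.append((b, k))  # Project: emit route(B, K), to be sent to node B
        return results

    print(rule_strand(route_table, nexthop_table))   # [('B', 'K')]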

  7. Overview
  • Introduction
  • Background
  • Monitoring framework
  • Example applications / performance
  • Conclusions

  8. Introspection and Logging
  • Introspection at three levels:
    • Application state level
    • Rule level
    • Dataflow level
  • Systematic instrumentation
    • The system is built from smaller, reusable components (e.g., the Join, Select, and Project elements of a rule strand)
    • Logging statements are inserted systematically
  • Logging data is in the form of tuples
    • Retains the semantics of the application logic
    • No need for translation
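  As a rough illustration of why keeping logging data in tuple form removes the translation step, here is a small Python sketch in which introspection records are just tuples appended to named tables and queried directly. The table and field names are assumptions for illustration only, not P2's actual schema.

    from collections import defaultdict

    log_tables = defaultdict(list)   # one table per kind of introspection record

    def log_tuple(table, *fields):
        log_tables[table].append(tuple(fields))

    # A tap at the rule level might record which rule fired, on what input, with what output:
    log_tuple("exec", "nodeA", "r1", ("route", "A", "K"), ("route", "B", "K"))

    # The record is immediately queryable, with no parsing or translation step:
    fired_rules = {rec[1] for rec in log_tables["exec"]}
    print(fired_rules)   # {'r1'}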

  9. Tracing rule executions
  • We want to step through the execution, where each step corresponds to a rule firing
  • Do it in an "online" fashion
  • Rule-level tracing requires tracing tuples:
    • Match each output tuple to the input that produced it
    • Track tuples as they travel over the wire
  [Figure: an execution chain spanning rules r0 and r1 on nodes A and B, involving tuples x, y, z, and w]

  10. (1) Tracing rule executions
  • Matching the input and output tuples of a rule
    • Tap elements are placed at the beginning and end of each rule strand
  • Execution tracer: tracks rule executions
    • Execution records are stored as tuples in the exec table, e.g., (ruleId r1, input x, output y, destination d)
  [Figure: a rule strand r1 (Join, Selection, Project) with taps feeding the input x and output y to the Execution Tracer, which writes to the exec table]
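  The following Python sketch illustrates the idea of the taps and the execution tracer: a wrapper around a rule strand records, for each firing, the rule ID, the input tuple's ID, the output tuple's ID, and the destination in an exec table. The function names, record layout, and example strand are illustrative assumptions, not P2's actual interface.

    import uuid

    exec_table = []   # records of the form (ruleId, input_id, output_id, dest)

    def traced(rule_id, strand):
        """Wrap a rule strand with taps that record each execution."""
        def run(input_tuple, input_id, dest):
            output_tuple = strand(input_tuple)      # the rule's actual work
            output_id = uuid.uuid4().hex[:8]        # locally unique ID for the new tuple
            exec_table.append((rule_id, input_id, output_id, dest))
            return output_tuple, output_id
        return run

    # Example strand: the projection step of the route rule from slide 6,
    # turning a joined (A, K, B) row into a route(B, K) tuple.
    r1 = traced("r1", lambda row: ("route", row[2], row[1]))

    out, out_id = r1(("A", "K", "B"), input_id="x", dest="B")
    print(out)          # ('route', 'B', 'K')
    print(exec_table)   # [('r1', 'x', <output_id>, 'B')]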

  11. (2) Tracing tuples across the wire
  • Each tuple has a locally unique ID, which is sent along with the tuple
  • Upon receipt, a new tuple is created with a different (local) ID
  • Hooks in the network in/out handling subsystem create a record containing:
    • the tuple's local ID
    • the tuple's remote ID
    • the node it came from
  • These records are stored in the tupleTable
  [Figure: node A sends tuple x through Network Out; node B receives it through Network In as tuple y and records the mapping in its tupleTable]
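  A minimal Python sketch of this wire-crossing bookkeeping, using hypothetical network_out/network_in hooks: the sender ships its local tuple ID with the tuple, and the receiver assigns a fresh local ID and records the (local ID, remote ID, origin node) triple in its tupleTable. All names and layouts are assumptions for illustration.

    import itertools

    _ids = itertools.count()

    def fresh_id(node):
        """A locally unique tuple ID (illustrative)."""
        return f"{node}-{next(_ids)}"

    tuple_table = {"A": [], "B": []}   # per-node tupleTable

    def network_out(src_node, local_id, payload):
        # Sending-side hook: ship the local tuple ID along with the tuple.
        return {"origin": src_node, "remote_id": local_id, "payload": payload}

    def network_in(dst_node, msg):
        # Receiving-side hook: create a new tuple with a different local ID and
        # record (local ID, remote ID, origin node) in the tupleTable.
        local_id = fresh_id(dst_node)
        tuple_table[dst_node].append((local_id, msg["remote_id"], msg["origin"]))
        return local_id, msg["payload"]

    msg = network_out("A", "x", ("route", "B", "K"))
    print(network_in("B", msg))   # ('B-0', ('route', 'B', 'K'))
    print(tuple_table["B"])       # [('B-0', 'x', 'A')]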

  12. Putting it all together
  • The exec records on each node, joined with the tupleTable entries created when tuples cross the wire, link individual rule executions into a complete distributed chain
  • Of course, in reality it's more complicated ...
    • Aborted rule executions
    • Pipelined rule executions
  [Figure: nodes A and B, each with rule strands (r0, r1) and their exec and tupleTable tables; the traced tuples x, y, z, and w span the two nodes]
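  To show how the two tables combine, here is a Python sketch that walks an execution chain backwards by alternately consulting a node's exec records (mapping an output tuple to the rule and input that produced it) and its tupleTable (mapping a received tuple to the sender's copy). The record layouts mirror the illustrative sketches above and are not P2's actual representation.

    exec_table = {            # per node: output_id -> (rule, input_id)
        "A": {"y": ("r0", "x")},
        "B": {"w": ("r1", "z")},
    }
    tuple_table = {           # per node: local_id -> (remote_id, origin node)
        "B": {"z": ("y", "A")},
    }

    def trace_back(node, tuple_id):
        """Yield (node, rule, input_tuple_id) steps from a tuple back to its root cause."""
        while True:
            if tuple_id in exec_table.get(node, {}):      # produced by a local rule
                rule, tuple_id = exec_table[node][tuple_id]
                yield (node, rule, tuple_id)
            elif tuple_id in tuple_table.get(node, {}):   # arrived over the wire
                tuple_id, node = tuple_table[node][tuple_id]
            else:
                return

    print(list(trace_back("B", "w")))   # [('B', 'r1', 'z'), ('A', 'r0', 'x')]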

  13. Overview
  • Introduction
  • Background
  • Monitoring framework
  • Example applications / performance
  • Conclusions

  14. Example applications (I)
  • Distributed watchpoints: trigger an event when a condition over distributed state becomes true, then possibly trace backward or forward from it (a sketch of the route-flap case follows below)
    • Oscillation of faulty or stale information (route flaps)
    • Gossiping for stabilization or updates
    • Inconsistent routing in DHTs [Pastry, Chord, ...]: each node is responsible for a unique region, so route along distinct paths and check [Bamboo, Secure Routing]
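  As a rough illustration of the route-flap watchpoint mentioned above, the Python sketch below counts routing changes per (node, key) pair over logged tuples and emits a trigger tuple when a threshold is exceeded. The table layout and threshold are assumptions, and in P2 such a watchpoint would be written as a declarative rule rather than Python.

    from collections import Counter

    # Logged routing changes as (node, key, new_next_hop) tuples gathered by the tracer.
    route_changes = [("n1", "K", "B"), ("n1", "K", "C"), ("n1", "K", "B"),
                     ("n1", "K", "C"), ("n2", "K2", "D")]

    FLAP_THRESHOLD = 3   # assumed: more changes than this for one (node, key) counts as a flap

    def watchpoint(changes):
        counts = Counter((node, key) for node, key, _ in changes)
        return [("routeFlap", node, key, n)
                for (node, key), n in counts.items() if n > FLAP_THRESHOLD]

    print(watchpoint(route_changes))   # [('routeFlap', 'n1', 'K', 4)]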

  15. Example applications (II)
  • Online execution profiling (sketched below):
    • How much time is spent in each rule?
    • Where are the bottlenecks?
    • Which rule is costlier, and which operation within it?
  • Consistent snapshots [Chandy-Lamport]:
    • Snapshot of the routing state
    • Queries over the snapshots themselves: What is the degree distribution? How many node-disjoint paths are there?
  • No more than 16 rules are needed for any of the above
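  A small Python sketch of the profiling idea, assuming that each execution record is extended with start and end timestamps: per-rule cost then falls out of a simple aggregation over the exec table. The record layout and numbers are illustrative assumptions.

    from collections import defaultdict

    # Hypothetical (rule_id, start_time_ms, end_time_ms) extension of the exec records.
    exec_records = [("r1", 0.0, 1.2), ("r2", 1.2, 1.5), ("r1", 2.0, 3.5), ("r3", 3.5, 3.6)]

    def per_rule_cost(records):
        total = defaultdict(float)
        for rule, start, end in records:
            total[rule] += end - start
        return dict(total)

    costs = per_rule_cost(exec_records)
    print(max(costs, key=costs.get), costs)   # the costliest rule (r1) and the full breakdown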

  16. Performance
  • 21-node Chord overlay in P2, with the monitored node on a separate, unloaded machine
  • Overhead of introspection: CPU utilization rises from 0.98% to 1.3%, memory from 8 MB to 13 MB
  • Consistent distributed snapshot experiment; other results in the paper
  [Figure: % CPU utilization and transmitted packets (x1000) as a function of snapshot rate (1/#sec)]

  17. Related Work
  • Management using database techniques [Hy+, ...]
  • Performance debugging [Magpie, Causeway, ...]
  • Configuration debugging for BGP, OSes [Time-travel, ...]
  • Distributed debuggers [WiDS, Pip, Replay Debugging, ...]
  • Deep embedded monitoring [IBM WebSphere, Adaptations, ...]

  18. Conclusions
  • Declarative development of systems
    • Integrated approach to building and monitoring
    • Automatic execution tracing
    • Online, in-place monitoring
  • A step towards "autonomic" distributed systems
    • Fault-finding tasks evolve with the system
  • Interesting future directions
    • User interface
    • Trade-off between monitoring accuracy and overhead
  • Questions? [Thank You]

  19. Request to EuroSys
  • Please schedule my next talk on the first day
  • Move the submission deadline away from NSDI's (last year the NSDI deadline was 19 October and EuroSys's was 20 October)

  20. Questions? Thank You!
