S2DB: A Novel Simulation-Based Debugger for Sensor Network Applications Ye Wen, Rich Wolski, Selim Gürün Department of Computer Science UC Santa Barbara EMSOFT’2006
Sensor Networks • Sensor networks: an ad-hoc community of thousands of heterogeneous, resource-constrained, tiny devices • Deployed in remote locations under extreme conditions • Image sources: Culler’01; Matt Welsh • How can we debug an application on hundreds of sensor nodes concurrently and realistically without installing them in the field?
Outline • Overview • Other debugging methods • Our approach: S2DB • Building blocks • Debugging points • Virtual instrumentation • Coordinated break • Time traveling • Evaluation • Summary and conclusion
Other Debugging Approaches • JTAG • Set breakpoints, step-execute the program, and query hardware • Not possible to synchronize I/O and program execution • Visualization tools • Sympathy, SpyGlass, Surge Network Viewer, MoteView • Display network topology and analyze the data flow • Require a data collection agent on the sensor node • Simulation-based debuggers • TOSSIM: debugs the emulated code; event-based • ATEMU, Avrora, and EmStar have similar concepts that we build on and extend in various ways
Simulation-Based Sensor Network Debugger (S2DB) • Base: a scalable distributed sensor network simulator • Fidelity: cycle-accurate full-system simulation of sensor network applications • Performance: real-time speed for hundreds of sensor nodes • Scalability: simulate thousands of sensor nodes using a cluster computer • Novel debugging facilities for sensor networks • For a single device: debugging points for device state inspection; virtual debugging hardware for software-controlled debugging • For multiple devices: coordinated break conditions for parallel debugging • For the network: time traveling for trace analysis
Internal Distributed Simulator Design • Correct, faithful execution of AVR binaries • Rich, complete hardware simulation • Simple, effective radio synchronization protocol • Automatic node partitioning
Debugging Points • Conventional debuggers expose registers, the PC, and memory • S2DB operates on debugging points: a debugging point is the access point to one of the internal states of the simulated machine • The conventional debug points in S2DB:
New Debugging Points • Further debugging points that expose the full system state • Software-defined events based on virtual debugging hardware • Synthetic high-level system events, derived from combinations of simple events and states
Debug Points: Setting and Executing • Print a variable X: >print mem( X ) • Break execution on erasing the first page of flash: >break when flash access( erase, 0x1 ) • Break execution when pc matches foo and a program variable Y is larger than 1: >break when pc() == foo && mem(Y)>1 • Slide annotations on this condition: High Overhead, Lower Overhead • In a complex expression, lower-overhead debug functions are evaluated first to optimize condition evaluation
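The "evaluate lower-overhead functions first" rule can be illustrated with a minimal Python sketch. This is not S2DB's implementation: the `Atom` class, the cost numbers, the address of `foo`, and the choice of which atom is cheaper are all assumptions made for illustration, since the slide does not say which function carries which overhead.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Atom:
    """One atomic condition of a break expression (hypothetical helper)."""
    name: str
    cost: int                       # assumed relative overhead, not measured
    check: Callable[[dict], bool]   # reads the simulated machine state


def evaluate_conjunction(atoms: List[Atom], state: dict, trace: List[str]) -> bool:
    """Evaluate an AND of atoms, cheapest first, stopping at the first False."""
    for atom in sorted(atoms, key=lambda a: a.cost):
        trace.append(atom.name)     # record which atoms were actually evaluated
        if not atom.check(state):
            return False
    return True


FOO = 0x1A2  # hypothetical address of function foo

# Models: break when pc() == foo && mem(Y) > 1
atoms = [
    Atom("mem(Y)>1", cost=5, check=lambda s: s["mem"]["Y"] > 1),
    Atom("pc()==foo", cost=1, check=lambda s: s["pc"] == FOO),
]

trace: List[str] = []
state = {"pc": 0x100, "mem": {"Y": 7}}
hit = evaluate_conjunction(atoms, state, trace)
print(hit, trace)
```

Because the cheaper atom fails here, the more expensive one is never evaluated, which is the saving the slide describes.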
Virtual Hardware Based Code Instrumentation • Three virtual registers in reserved AVR address space: command, input, output • Useful for injecting a “print” command into source code without going through the serial port • Allows custom debugging points, e.g. to monitor the nth execution of a function • The user sets the debugger to monitor ID, VALUE • A user command writes <id, value> to the output register • The debugger interrupts execution if id == ID and value == VALUE • 3 register accesses in total
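The "nth execution of a function" use case can be sketched as a toy model in Python. The `VirtualDebugPort` class, the id `7`, and the way the pair is delivered are hypothetical; the real register addresses and the three-access protocol are not modeled.

```python
class VirtualDebugPort:
    """Toy stand-in for the virtual output register (hypothetical model)."""

    def __init__(self):
        self.watch = None   # (ID, VALUE) pair the user asked the debugger to monitor
        self.log = []       # every <id, value> pair written by the application

    def monitor(self, watch_id, watch_value):
        self.watch = (watch_id, watch_value)

    def write(self, ident, value):
        """Called when simulated code writes <id, value>; True means 'break'."""
        self.log.append((ident, value))
        return self.watch == (ident, value)


FOO_ID = 7                       # hypothetical id chosen for function foo
port = VirtualDebugPort()
port.monitor(FOO_ID, 3)          # break on the third execution of foo

counter = {"foo": 0}


def foo():
    """Instrumented function: reports its own execution count to the port."""
    counter["foo"] += 1
    return port.write(FOO_ID, counter["foo"])


break_at = None
for step in range(1, 6):
    if foo():
        break_at = step
        break

print(break_at, port.log)
```

The application only reports `<id, value>`; deciding whether to stop stays in the debugger, so the monitored value can be changed without recompiling the application.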
Parallel Debugging • Debugging single nodes is useful but not enough for distributed applications • Many bugs emerge from the interactions between nodes • E.g. packet delivery failures in network protocols and race conditions in distributed applications • Goals: • Display the status of multiple devices in parallel • Break the execution on multiple nodes simultaneously • Step-execute multiple devices at the same pace • These require clock synchronization • Should be efficient and scalable
Partially Ordered Synchronization • [Figure: timelines of nodes X (labeled master) and Y with events A to E, annotated with UPDATE and WAIT] • Partially ordered synchronization is used to evaluate coordinated break conditions in the distributed simulation • One node acts as master • Other nodes always follow the master, i.e. clock_i < clock_master
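The invariant clock_i < clock_master can be illustrated with a small single-process Python sketch. The 100-cycle master quantum, the `Follower` class, and the node names are assumptions for illustration; the actual UPDATE/WAIT message exchange between simulator processes is not modeled.

```python
class Follower:
    """A non-master node that may never reach or pass the master's clock."""

    def __init__(self, name):
        self.name = name
        self.clock = 0

    def run(self, master_clock, want):
        """Advance by up to `want` cycles, staying strictly behind the master."""
        limit = master_clock - 1                 # enforces clock_i < clock_master
        self.clock = min(self.clock + want, max(limit, self.clock))
        return self.clock


master_clock = 0
followers = [Follower("node1"), Follower("node2")]
history = []

for _ in range(3):
    master_clock += 100                  # master runs a quantum, then publishes (UPDATE)
    for f in followers:
        f.run(master_clock, want=150)    # follower tries to run ahead, then stalls (WAIT)
    history.append((master_clock, [f.clock for f in followers]))

print(history)
```

Each follower asks for more cycles than the master has granted, so it stalls one cycle behind the master in every round.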
Coordinated Break Condition • [Figure: timelines for nodes X, Y, and Z with time points A, B, C, D, marking the time period when each condition is satisfied; all nodes should break at time C] • Coordinated break • Simple: break execution when all nodes’ clocks reach time T > :break when clock() == T • Conjunction of atomic conditions > :break when node1.cond1 && … && nodeX.condY • Limitation: arbitrary conditions (e.g. a disjunction of atomic conditions) are not supported • Constraints from the distributed simulation structure • Sacrifices generality for scalability and performance
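The semantics of a conjunctive coordinated break (stop at the earliest time when every node's condition holds) can be illustrated offline with a short Python sketch. The interval data and the function name are made up, and this says nothing about how S2DB evaluates the condition during distributed simulation.

```python
def earliest_common_time(intervals_per_node):
    """Return the earliest t covered by an interval [start, end) on every node.

    The earliest common time, if any, is always the start of some interval,
    so only interval starts need to be tried as candidates.
    """
    starts = sorted(s for ivs in intervals_per_node.values() for s, _ in ivs)
    for t in starts:
        if all(any(s <= t < e for s, e in ivs)
               for ivs in intervals_per_node.values()):
            return t
    return None


# Hypothetical periods (in cycles) during which each node's condition is satisfied.
satisfied = {
    "X": [(10, 40)],
    "Y": [(5, 15), (30, 50)],
    "Z": [(30, 60)],
}

t_break = earliest_common_time(satisfied)
print(t_break)
```

A disjunction would need a break as soon as any single node's condition held, which cannot be found by intersecting intervals this way.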
Time Traveling • Analyze an anomaly using trace logs • Replay events before and at the time of the anomaly • Periodic checkpointing saves the simulated network state • Snapshots of CPU, memory, and hardware components • Also: radio packet queue, receive/send queue, power status • Small footprint: ~5 KB • Flash is too large to snapshot in full, so it uses a log-based snapshot • Other mechanisms can be used to trigger a checkpoint • Debugging points • Break points
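Periodic checkpointing plus replay can be sketched in a few lines of Python. The `ToyNode` state, the checkpoint period, and the use of `copy.deepcopy` are illustrative assumptions; a real snapshot would cover the components listed above and handle flash through a log.

```python
import copy


class ToyNode:
    """Deterministic toy node: its state depends only on the steps executed."""

    def __init__(self):
        self.clock = 0
        self.mem = {"counter": 0}

    def step(self):
        self.clock += 1
        self.mem["counter"] += self.clock % 3


def run_with_checkpoints(node, until, period, checkpoints):
    """Run to `until`, saving a full snapshot every `period` cycles."""
    while node.clock < until:
        if node.clock % period == 0:
            checkpoints[node.clock] = copy.deepcopy(node)
        node.step()


def travel_to(checkpoints, target):
    """Restore the latest checkpoint at or before `target`, then replay forward."""
    base = max(t for t in checkpoints if t <= target)
    node = copy.deepcopy(checkpoints[base])
    while node.clock < target:
        node.step()
    return node


node = ToyNode()
cps = {}
run_with_checkpoints(node, until=25, period=10, checkpoints=cps)
past = travel_to(cps, 17)       # inspect the state at cycle 17 after the run ended
print(sorted(cps), past.clock, past.mem)
```

Replay only reproduces the original state if execution is deterministic, which is one reason a simulator is a convenient place for this technique.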
Debug Point Cost • [Figure: comparison of debugging point cost]
Coordinated Break Points • [Figure: coordinated break point condition with multiple devices] • The y-axis is the ratio to the execution speed on a real device without condition monitoring
Checkpointing Overhead • [Figure: checkpointing overhead] • Each curve represents a configuration, e.g. 4x1 means 4 nodes per host running on 1 host
Conclusion and Summary • S2DB contributions • Debugging points • Virtual debugging hardware • Coordinated break condition for parallel debugging • Time traveling for sensor network debugging • The debugger adds less than 10% overhead to the simulation • Ongoing work: • User interface • Plug-in for the Eclipse IDE • We look forward to feedback from the sensor network community to guide new debugging features
Related Work • Sensor Network Simulation • ATEMU, Avrora • Full system, multi-simulation, lock-step synchronization • No sensor network gateway support • EmTOS • A wrapper library for TOSSIM and EmStar • All applications must be recompiled to host machine code and linked to EmTOS • Other Simulation • SkyEye • Full-system ARM emulator including LCD and debugger • Not intended for sensor networks and multi-simulation
Peer Synchronization and Partially Ordered Synchronization • Peer synchronization: used in the base simulator • Synchronize only before a radio read; no order is enforced • Partially ordered synchronization: used to evaluate coordinated break conditions in the distributed simulation • One node acts as master • Other nodes always follow the master, i.e. clock_i < clock_master • [Figures: two timelines of nodes X and Y with events A to E, annotated with UPDATE and WAIT; one node is labeled master]
Ensemble Synchronization • Clock synchronization • Execution rates of the simulators should be proportional to those of the real devices • Lock-step method: synchronize clocks on each serial byte transfer period • Serial transfer rate: 57.6 Kbit/s (128 Mote cycles) • Ensemble simulation requires synchronizing clocks to the slowest simulation thread • The Stargate simulator is the bottleneck (most complex) • Communication • Packets are assembled using the receiver’s local clock • Packet rate: 19.2 Kbit/s
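As a rough sanity check on the rates above, the cycle counts can be worked out in Python under one assumption the slide does not state: a Mote CPU clock of 7.3728 MHz, the usual Mica2 figure. Under that assumption, 128 cycles corresponds to one serial bit time at 57.6 Kbit/s.

```python
MOTE_CLOCK_HZ = 7_372_800   # assumption: Mica2-class CPU clock, not given on the slide
SERIAL_BPS = 57_600         # serial transfer rate from the slide
RADIO_BPS = 19_200          # radio packet rate from the slide

# Mote CPU cycles that elapse while one bit is transferred on each link.
cycles_per_serial_bit = MOTE_CLOCK_HZ // SERIAL_BPS
cycles_per_radio_bit = MOTE_CLOCK_HZ // RADIO_BPS

print(cycles_per_serial_bit, cycles_per_radio_bit)
```

The radio runs at a third of the serial rate, so one radio bit spans three times as many cycles as one serial bit.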
Debugging over Simulation • Sensor network research requires substantial engineering and investment, and has a steep learning curve • Configuring and installing network devices is a hassle • Many bugs are not detected until run-time • Hardware lacks a user interface; debugging requires hardware modification • Analyzing erroneous behavior is not easy • Hard to distinguish hardware faults from software bugs • Simulation has significant advantages • + Provides a controlled environment • + Cost-effective solution • - Not the same as real-life execution