430 likes | 563 Vues
Building up to Macroprogramming: An Intermediate Language for Sensor Networks. Ryan Newton, Arvind (MIT), and Matt Welsh (Harvard) @ IPSN 2005 (IEEE Int’l Conf on Information Processing in Sensor Networks). Presented by Ryo Sugihara @ largescale seminar (2005/08/15).
E N D
Building up to Macroprogramming:An Intermediate Language for Sensor Networks Ryan Newton, Arvind (MIT), and Matt Welsh (Harvard) @ IPSN 2005 (IEEE Int’l Conf on Information Processing in Sensor Networks) Presented by Ryo Sugihara @ largescale seminar (2005/08/15)
Programming Sensor Networks • Difficult • Highly distributed system • Stringent resource constraint • Energy-efficiency is mandatory in every application • nesC: de-facto standard language • Low level • Everything is mixed: Computation, Concurrency, Communication, Efficiency • Far too difficult for end-users to make programs • Not experts in programming Necessity for software support
Previous Approach • Query engines: Sensornet as a database • TinyDB, COUGAR, … • Simple but limited use • High-level programming abstraction • Abstract Regions, Hood, EnviroTrack, … • Neighborhood-based programming primitives and operations on them • For specific types of application: target tracking • Programming toolkit • SNACK • Reusable service library and compiler • Targeting general mechanism: not very easy to use • Virtual machine • Maté • VM architecture + Intermediate code (cf. Java bytecode) • For efficient reprogramming • Similar approach, different objectives (discussed later)
“Macroprogramming” • Program sensornet as a whole • Easier than programming at the level of individual nodes • e.g) Matrix multiplication Matrix notation vs. Parallel program in MPI • Compile into node-level programs • “Regiment” [Newton & Welsh, 2004] • Purely functional macroprogramming language for sensornet • Example: tracking moving vehicle • A region stream is created that represents the value of the proximity sensor on every node in the network • Each value is also annotated with the location of the corresponding sensor. • Data items that fall below the threshold are filtered out. • The spatial centroid of the remaining collection of sensor values is computed to determine the approximate location of the object that generated the readings
Regiment TinyDB AbstRegion Hood COUGAR Intermediate Language Difficult because of large semantic gap Executable (TinyOS component) Need for Intermediate Langugage • Problem arisen in Regiment • Compilation is difficult • Large semantic gap between Regiment and nesC • Solution: Intermediate Language • Benefit: • Common abstraction for high-level abstractions • Reducing complexity for building compiler
Regiment TinyDB AbstRegion Hood COUGAR Intermediate Language Difficult because of large semantic gap Executable (TinyOS component) Requirementsfor Intermediate Language • Simplicity: • Abstract away the details of • Concurrency • Communication • Expressivity: • Capture enough details • To permit extensive optimizations by compiler
Goals • Define an intermediate language that • Provides simple and versatile abstractions for • Communication • Data dissemination • Remote execution • Constitutes a framework for network coordination • That can be used to implement sophisticated algorithms: • Leader election • Group maintenance • In-network aggregation • Note: Use by human is not a goal • But it turns out to be usable as a result
Outline • Background & Motivation • Macroprogramming • Need for intermediate language • Requirements & goals • Distributed Token Machines (DTM) Model • Token Machine Language (TML) • Examples • Evaluation / Conclusion / Future Works • Discussion
DTM (Distributed Token Machines) Model • Token-based Execution and Communication model that structures: • Concurrency • Communication • Storage • Features • Atomic execution • Simplify consistency issue • Bounded-time completion • Enable precise scheduling
DTM Model • Players • Token • Token Message • Token name + Payload • Form of token on the network • Token Object • Token + Fixed-size private memory • Only accessed by associated token handler • Sensor node • Scheduler • Dispatch Token Message to Token Handlers • Token Handler • Executed upon receiving Token Message • Shared Memory • Shared among token handlers • Token Store • Storing token objects • Sole place for dynamic memory allocation • When creating token objects
How DTM Works • Scheduler receives Token Message • Scheduler directs Token Message to corresponding Token Handler • Token Handler picks corresponding Token Object from Token Store • If none, create and initialize one • Token Handler consumes the message payload and executes atomically • Possibly read/write Token Object’s private memory and node’s shared memory # Programming in DTM = Writing Token Handlers
Token Handler • Interfaces • schedule(Ti, priority, data...) / timed_schedule(...) • Insert a new Token Message in a nodes’ local scheduling queue • “timed_...” will be executed precisely after the specified time • bcast(Ti, data...) • Broadcast a Token Message to the radio neighbors • No ACK • is_scheduled(Ti) / deschedule(Ti) • Query/removal of Token Messages waiting in the scheduler • present(Ti) / evict(Ti) • Interface into the nodes’ Token Store as a whole • Query/remove of Token Objects to/from local node’s Token Store
Outline • Background & Motivation • Macroprogramming • Need for intermediate language • Requirements & goals • Distributed Token Machines (DTM) Model • Token Machine Language (TML) • Examples • Evaluation / Conclusion / Future Works • Discussion
Token Machine Language (TML) • Realization of DTM model • DTM provides an execution model • TML fills in • Set of basic operators: • Concrete syntax for describing Token Handlers • Goals of TML • Lightweight • Otherwise, compilation from high-level lang to TML will be complex • Efficiently mapped onto TinyOS etc. • Different semantics: Event-driven • Otherwise, compiled executable will not be practically usable • Versatile • Applicability to wide range of systems • Because TML aims to be a common abstract layer • Mask the complexity • Because it is the fundamental reason for intermediate language • ... as opposed to “portability” and “safety” in JVM etc.
TML Features • Subset of C, extended with DTM interfaces • Without data pointer • No invalid memory reference • Only fixed length loops • Completion in bounded time • Sole procedure call: • “Scheduling of token” • Compiled to nesC code • Guarantee conformance to DTM execution semantics • Buildup of abstraction • Enriched by adding features (example follows next) Example of Token declaration
Building up Features: Subroutine Calls • Subroutine calls • NOT preferred by system • For small and fast atomic actions • Execution time could be unpredictable • Introduce call-stack: another dynamic structure • Preferred by users • For simplicity • Solution: CPS transformation • CPS: Continuation Passing Style
CPS Transformation • Automatically generated • Systematic conversion in TML, but difficult in nesC token int Red(int a) { stored int y = 0; schedule Blue(4); y = subcall Green(3); bcast Red(a); return y; } token int Red[id](int a) { stored int y = 0; schedule Blue(4); schedule RedK(ALLOC, a, y); schedule Green(3, RedK[id]); ... } tokenRedK[id](int mode, int[2] freevars) { stored int a, y; // static vars if (mode == ALLOC) { a = freevars[0]; // captured a y = freevars[1]; // captured y } else if (mode == INVOKE) { y = freevars[0]; // returnval bcast Yellow(a + y); } } tokenGreen(int x, tokname k) { ... returnval = ...; schedule RedK(INVOKE, returnval); } subroutine call “continuation”
Building up Features: Blocking Operation • Split-phase Blocking Operation • Split-phase TinyOS operations • Expose them as blocking operations • Using CPS transformation, as in “subcall” case TinyOS split-phase TML “sense” operation
Outline • Background & Motivation • Macroprogramming • Need for intermediate language • Requirements & goals • Distributed Token Machines (DTM) Model • Token Machine Language (TML) • Examples • Evaluation / Conclusion / Future Works • Discussion
Examples • Building up Features: • Adding “gradient” interface • TML examples • Timed data gathering • Distributed event detection • Leader election • Intermediate Lang for Macroprogramming • Compiling Regiment
Example: Adding “Gradient” Interface • “Gradient” • General purpose mechanism for breadth-first exploration from a source node • Establish a spanning tree that tells all nodes within the gradient how to route to the source as well as their hop-count • Used in “Directed Diffusion” • Interfaces (to Scheduler) • gemit(Ti, data...) • Begins gradient propagation (cf. bcast) • grelay(Ti, data...) • Continues gradient propagation (cf. bcast) • greturn(Tcall, Tvia, Taggr, data...) • Propagate data up to their roots • Via the gradient specified by Tvia • Fires Tcall token handler on the data when it reaches the root • With optional aggregation (Taggr) • dist(Ti) / version(Ti) • Get info about neighbors (in terms of gradient)
Example: Timed Data Gathering • Sample each node’s light sensor every sec • Routing tree is refreshed every 10 secs Gather and GlobalTree tokens will be scheduled when the node is first turned on. startup Gather, GlobalTree; base_startup SparkGlobal; tokenSparkGlobal() { gemitGlobalTree(); timed_scheduleSparkGlobal(10000); } tokenGlobalTree() { grelayGlobalTree(); } tokenGather() { greturn(BaseReceive, GlobalTree, NULL, subcall sense_light()); timed_scheduleGather(1000); } base_startup only applies to base-station BaseReceive is a predefined token handler and supported only on the base-station sense_light() is described like a blocking operation
Example: Distributed Event Detection shared int total_activation; tokenEventDetected () { emit AddActivation[MYID](1); schedule AddActivation[MYID](1); } tokenAddActivaton[sub] (int x) { if ( dist(self) < 2 ) relay AddActivaton(x); total_activation += x; if (total_activation > threshold) greturn(BaseReceive, GlobalTree, NULL, ALARM); timed_call SubActivation[sub](1500, x); } tokenSubActivation[sub] (int x) { total_activation -= x; if (total_activation <= 0) { evict AddActivation[sub]; evict SubActivation[sub]; } } • Alarm is raised when a quorum of sensors detect an event • Assuming unreliable sensors • Every node spread two-hop gradient when it detects an event
Example: Leader Election shared int winner; token elect_leader(tokname T) { int current = winner; if (current == 0 || current < MYID) { winner = MYID; timed_schedule Confirm_Fire[T](5000, T); emit Compete(MYID, T); } } tokenCompete(int id, tokname T) { if (winner == 0) { winner = MYID; timed_schedule Confirm_Fire[T](5000, T); } if (version(Compete) == 0 || id > winner) { winner = id; relay Compete(id, T); } } tokenConfirm_Fire[sub](tokname T) { if (MYID == winner) schedule T(); }
Example: Compiling Regiment • Tokens • membership • Defines a region • Node holding a membership token is a member of region at that time • formation • Initiate the work of constructing / discovering the region • Have constraints on where and when they need to fire • Translate region-logic to token-logic • Rather straightforward • “Let region be a set of nodes satisfying criteria of .....” • “Give membership token to a set of nodes satisfying ...”
Outline • Background & Motivation • Macroprogramming • Need for intermediate language • Requirements & goals • Distributed Token Machines (DTM) Model • Token Machine Language (TML) • Examples • Evaluation / Conclusion / Future Works • Discussion
Evaluation • Current implementation • High-level simulator • Compiler targeting nesC/TinyOS environment • Comparison with native TinyOS code • Code size: Very good • RAM, CPU usage: Bad • Overhead of running scheduler • Unnecessary copying of buffers • By excluding pointers
Conclusion / Findings • Atomic action model of concurrency is good because ... • Preclude deadlocks • Easy reasoning about timing • Communication bound to persistent storage (= tokens) is good because ... • Give a way to refer to communications that have happened through the token they leave behind
Future Directions • Dynamic retasking • As in Maté: • Support Maté’s view: • “Separate representations for • End-user programming model • Code transport layer • Execution engine” • TML code should be compiled into Maté bytecode • Focus on whole-program compilation • Numerous optimizations depend on whole program optimization • Impossible in Maté architecture (Bytecode + VM) • Incorporate token-based algorithms • Routing, Consensus,...
Discussion • Is DTM model enough capable? • To describe every mode of execution at each node • Is token an appropriate abstraction? • How is it different from messages as in MPI? • TML: explicit association between message and object • Is TML • Versatile enough? • Is macroprogramming a viable approach? • Easy? Expressive? Extensible? Flexible? • What is missing? • Refinement of code for further optimization (by hand)?
References • nesC: • David Gay, Philip Levis, Robert von Behren, Matt Welsh, Eric Brewer, and David Culler , "The nesC Language: A Holistic Approach to Network Embedded Systems" [citeseer], PLDI, 2003 • TinyOS: • Jason Hill, Robert Szewczyk, Alec Woo, Seth Hollar, David Culler and Kristofer Pister, “System architecture directions for network sensors” ASPLOS-IX, 2000 • Maté/ASVM: • Philip Levis and David Culler, "Maté: A Tiny Virtual Machine for Sensor Networks." ASPLOS-X, 2002 • Philip Levis, David Gay, and David Culler, "Active Sensor Networks“, NSDI, 2005 • Abstract Regions • Matt Welsh and Geoff Mainland , "Programming Sensor Networks Using Abstract Regions" [slides], NSDI, 2004 • Regiment • Ryan Newton and Matt Welsh, “Region Streams: Functional Macroprogramming for Sensor Networks”, DMSN, 2004 • Directed Diffusion • Chalermek Intanagonwiwat, Ramesh Govindan and Deborah Estrin, “Directed diffusion: A scalable and robust communication paradigm for sensor networks”,MobiCOM, 2000 • TinyDB: • Samuel R. Madden, Mehul A. Shah, Joseph M. Hellerstein, and Vijayshankar Raman, “Continuously Adaptive Continuous Queries over Streams”, SIGMOD, 2002 • COUGAR • Philippe Bonnet, J. E. Gehrke, and Praveen Seshadri, “Querying the Physical World”. IEEE Personal Communications, Vol. 7, No. 5, October 2000, pages 10-15 • SNACK • Ben Greenstein, Eddie Kohler, and Deborah Estrin, “A Sensor Network Application Construction Kit (SNACK)”, SenSys, 2004
TML • Unified abstraction for • Communication • Token is the only method of communication • Execution • Token Handler is the only place of execution • Network state management • Token Objects and Shared Memory
Subtoken • Indexed token object • All subtokens share a single token handler • for efficiency • cf) component parameterization in nesC • cf2) Can be seen as “multiple instances (=subtoken) of token class”
nesC [Gay, Levis, von Behren, Welsh, Brewer & Culler, 2003] • C-based language • Component-oriented • component with bidirectional interface • Event-based execution model • Concurrency model with extensive compile-time analysis • “Asynchronous code” / “Synchronous code” • “Split-phase operations” • No blocking operations: Request / Completion
TinyOS [Hill, Szewczyk, Woo, Hollar, Culler & Pister, 2000] • Implemented by nesC • Event-based programming model • Set of components • Rather than monolithic OS • Abstract hardware • Layering of components: “Command” and “Event” • Services provided such as: • RF messaging • Periodic timer events • Asynchronous access to UART data transfers • Mechanism for static, persistent storage
Abstract Regions [Welsh & Mainland, 2004] • Efficient communication primitives • Provide interface for • Neighborhood discovery • Initialize, may or may not be done continuously • Enumeration • Get set of nodes in the region • Data sharing • "Get" and "Put": tuple-space-like programming model • Reduction • Reduce a shared variable across nodes in the region using the indicated associative operator (sum, max,...) • Allow tuning the trade-off between accuracy and resource usage • By exposing low level tuning parameters: • #retransmission, timeout interval...
Directed Diffusion [Intanagonwiwat, Govindan & Estrin, 2000] • Communication abstraction for Data-centric dissemination • Sink (base-station) publishes “interests” • Source (sensor) publishes “event” • Propagation is facilitated when gradient is large • Reinforcement-based adaptation • Successful paths are reinforced • Realizes energy-efficient & robust dissemination
Maté / ASVM [Levis & Culler, 2002], [Levis, Gay & Culler, 2005] • Application specific virtual machine • To cope with different programming models for different application classes • Complete generality is not necessary for a given network • Mainly for • Efficient dynamic reprogramming • Safety • (NOT for portability as in JVM) • Expose new programming primitives by • Using application-specific opcodes
Maté Build Process Appl-specific VM Compile code (in tscript etc.) to custom instruction set Users select these three things From Philip Levis, "Rapid and Safe Programming of In-situ Sensor Networks“, Qualifying examination proposal, Jan 2004. http://www.cs.berkeley.edu/~pal/talks/quals.pdf
Maté / ASVM Features • Customizable virtual machine framework • Event-driven, thread-based, stack architecture • Customizable execution events and instruction set • Users select three things • A scripting language • A set of library functions • A set of events that trigger execution • Maté builds a VM and scripting environment • Instruction set customized for deployment • VM provides needed services: routing, code propagation • Scripts compile to custom instruction set • VM automatically distributes new programs • VM and scripter form an application-specific runtime
TML vs. Maté • (According to TML paper) • Maté ...provides only low-level radio communication directly within Maté, and uses application-specific opcodes – essentially a foreign function interface into other TinyOS components – to expose new communication primitives such as abstract regions. • In contrast, TML provides a coordinated communication and execution architecture, based on tokens, that is used to build up new communication abstractions. • Ideally, the user program should be compiled to a concise bytecode supported by a pre-installed virtual machine. We will look into targetting a virtual machine rather than native code, perhaps Maté itself. • In short, • TML can be used on top of Maté
Regiment [Newton & Welsh, 2004] • Purely functional macroprogramming language • No direct manipulation of program state • High-level program manipulates “regions” of sensor data as values • Individual nodes appear as data streams • Regions are groupings of these streams • Designated using a number of different criteria • Program operates over these streams and regions • translated into node-level actions • Example: Tracking Object • A region stream is created that represents the value of the proximity sensor on every node in the network • Each value is also annotated with the location of the corresponding sensor. • Data items that fall below the threshold are filtered out. • The spatial centroid of the remaining collection of sensor values is computed to determine the approximate location of the object that generated the readings