Querying Sensor Networks

Presentation Transcript

  1. Querying Sensor Networks • Sam Madden, UC Berkeley • October 2, 2002 @ UCLA

  2. Introduction • Programming Sensor Networks Is Hard • Especially if you want to build a “real” application • Declarative Queries Are Easy • And they can be faster and more robust than most applications!

  3. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • The Next Step

  4. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • The Next Step

  5. Declarative Queries: SQL • SQL is the traditional declarative language used in databases
     General form:
       SELECT {sel-list} FROM {tables} WHERE {pred} GROUP BY {attrs} HAVING {pred}
     Example:
       SELECT dept.name, AVG(emp.salary)
       FROM emp, dept
       WHERE emp.dno = dept.dno
         AND (dept.name = 'Accounting' OR dept.name = 'Marketing')
       GROUP BY dept.name
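The example query on this slide can be tried directly with Python's built-in `sqlite3` module. This is only a sketch: the table layouts, the `eno` column, and every row are invented for illustration, and an `ORDER BY` is added so the output order is deterministic.

```python
import sqlite3

# Hypothetical in-memory tables; schema details and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dno INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp  (eno INTEGER PRIMARY KEY, dno INTEGER, salary REAL);
    INSERT INTO dept VALUES (1, 'Accounting'), (2, 'Marketing'), (3, 'Research');
    INSERT INTO emp  VALUES (10, 1, 50000), (11, 1, 70000),
                            (12, 2, 40000), (13, 3, 90000);
""")

# The slide's query, plus ORDER BY (an addition) for a stable row order.
QUERY = """
    SELECT dept.name, AVG(emp.salary)
    FROM emp, dept
    WHERE emp.dno = dept.dno
      AND (dept.name = 'Accounting' OR dept.name = 'Marketing')
    GROUP BY dept.name
    ORDER BY dept.name
"""
rows = conn.execute(QUERY).fetchall()
print(rows)  # [('Accounting', 60000.0), ('Marketing', 40000.0)]
```

The query says nothing about how the join or the grouping is executed; that choice is left to the engine, which is the point the following slides build on.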

  6. Declarative Queries for Sensor Networks • Examples:
     Query 1:
       SELECT nodeid, light
       FROM sensors
       WHERE light > 400
       SAMPLE PERIOD 1s
     Query 2 (rooms w/ volume > 200):
       SELECT AVG(volume)
       FROM sensors
       WHERE light > 400
       GROUP BY roomNo
       HAVING AVG(volume) > 200
     Query 3 [Coming soon!]:
       ON EVENT bird_detect(loc) AS bd
       SELECT AVG(s.light), AVG(s.temp)
       FROM sensors AS s
       WHERE dist(bd.loc,s.loc) < 10m
       SAMPLE PERIOD 1s for 10

  7. General Declarative Advantages • Data Independence • Not required to specify how or where, just what. • Of course, can specify specific addresses when needed • Transparent Optimization • System is free to explore different algorithms, locations, orders for operations

  8. Data Independence In Sensor Networks • Vastly simplifies execution for large networks • Since locations are described by predicates • Operations are over groups • Enables tolerance to faults • Since system is free to choose where and when operations happen

  9. Optimization In Sensor Networks • Optimization Goal : Power! • Where to process data • In network • Outside network • Hybrid • How to process data • Predicate & Join Ordering • Index Selection • How to route data • Semantically Driven Routing

  10. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • The Next Step

  11. TinyDB • A distributed query processor for networks of Mica motes • Available today! • Goal: Eliminate the need to write C code for most TinyOS users • Features • Declarative queries • Temporal + spatial operations • Multihop routing • In-network storage

  12. TinyDB @ 10000 Ft • (Almost) All Queries are Continuous and Periodic • Written in SQL • With Extensions For: • Sample rate • Offline delivery • Temporal Aggregation • [Figure: a query propagating through nodes A to F; labels {A,B,C,D,E,F}, {B,D,E,F}, {D,E,F}]

  13. TinyDB Demo

  14. Applications + Early Adopters • Some demo apps: • Network monitoring • Vehicle tracking • “Real” future deployments: • Environmental monitoring @ GDI (and James Reserve?) • Generic Sensor Kit • Parking Lot Monitor Demo!

  15. TinyDB Architecture (Per node) • TupleRouter: • Fetches readings (for ready queries) • Builds tuples • Applies operators • Delivers results (up tree) • AggOperator: • Combines local & neighbor readings • SelOperator: • Filters readings • Schema: • “Catalog” of commands & attributes (more later) • TinyAlloc: • Reusable memory allocator! • [Figure: component diagram showing SelOperator, AggOperator, TupleRouter, Network, Radio Stack, Schema, TinyAlloc]

  16. TinyAlloc • Handle Based Compacting Memory Allocator • For Catalog, Queries
      Handle h;
      call MemAlloc.alloc(&h, 10);
      …
      (*h)[0] = "Sam";
      call MemAlloc.lock(h);
      tweakString(*h);
      call MemAlloc.unlock(h);
      call MemAlloc.free(h);
      [Figure: User Program; Free Bitmap, Master Pointer Table, and Heap shown across Compaction]
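To make "handle based" and "compacting" concrete, here is a toy Python model. It is not TinyAlloc's implementation: the class name, the first-fit policy, and the table layout are all assumptions, and the free bitmap from the figure is replaced by a scan of the pointer table. What it does show is why callers hold handles rather than raw addresses: compaction can move a block, and only the table entry changes.

```python
class HandleAllocator:
    """Toy handle-based compacting allocator over a fixed-size heap (illustrative)."""

    def __init__(self, size):
        self.heap = bytearray(size)
        self.table = {}          # handle -> [offset, length, locked]
        self.next_handle = 0

    def _find_gap(self, length):
        # First fit: walk blocks in address order looking for a big enough hole.
        cursor = 0
        for offset, block_len, _ in sorted(self.table.values()):
            if offset - cursor >= length:
                return cursor
            cursor = offset + block_len
        return cursor if len(self.heap) - cursor >= length else None

    def compact(self):
        # Slide unlocked blocks toward address 0; locked blocks stay put.
        cursor = 0
        for entry in sorted(self.table.values()):
            offset, block_len, locked = entry
            if not locked and cursor < offset:
                self.heap[cursor:cursor + block_len] = self.heap[offset:offset + block_len]
                entry[0] = cursor
            cursor = entry[0] + block_len

    def alloc(self, length):
        offset = self._find_gap(length)
        if offset is None:           # no hole large enough: compact and retry
            self.compact()
            offset = self._find_gap(length)
        if offset is None:
            raise MemoryError("heap exhausted")
        handle = self.next_handle
        self.next_handle += 1
        self.table[handle] = [offset, length, False]
        return handle

    def free(self, handle):
        del self.table[handle]

    def lock(self, handle):
        self.table[handle][2] = True     # pin: compaction must not move it

    def unlock(self, handle):
        self.table[handle][2] = False

    def write(self, handle, data):
        offset, length, _ = self.table[handle]
        if len(data) > length:
            raise ValueError("data larger than block")
        self.heap[offset:offset + len(data)] = data

    def read(self, handle):
        offset, length, _ = self.table[handle]
        return bytes(self.heap[offset:offset + length])


mem = HandleAllocator(16)
a, b, c = mem.alloc(4), mem.alloc(4), mem.alloc(4)
mem.write(c, b"Sam!")
mem.free(b)                  # leaves two separate 4-byte holes
d = mem.alloc(8)             # succeeds only because compaction merges the holes
print(mem.read(c))           # b'Sam!'  (block moved, handle still valid)
```

Locking corresponds to the `lock`/`unlock` pair in the slide's fragment: while a block is locked, code may hold a raw pointer into it because compaction will not relocate it.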

  17. Schema • Attribute & Command IF • At INIT(), components register attributes and commands they support • Commands implemented via wiring • Attributes fetched via accessor command • Catalog API allows local and remote queries over known attributes / commands. • Demo of adding an attribute, executing a command.

  18. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • Quality

  19. 3 Questions • Is this approach expressive enough? • Can this approach be efficient enough? • Are the answers this approach gives good enough?

  20. Q1: Expressiveness • Simple data collection satisfies most users • How much of what people want to do is just simple aggregates? • Anecdotally, most of it • EE people want filters + simple statistics (unless they can have signal processing) • However, we’d like to satisfy everyone!

  21. Query Language • New Features: • Joins • Event-based triggers • Via extensible catalog • In network & nested queries • Split-phase (offline) delivery • Via buffers

  22. Sample Query 1 • Bird counter:
      CREATE BUFFER birds(uint16 cnt) SIZE 1
      ON EVENT bird-enter(…)
        SELECT b.cnt+1
        FROM birds AS b
        OUTPUT INTO b
        ONCE

  23. Sample Query 2 • Birds that entered and left within time t of each other:
      ON EVENT bird-leave AND bird-enter WITHIN t
        SELECT bird-leave.time, bird-leave.nest
        WHERE bird-leave.nest = bird-enter.nest
        ONCE

  24. Sample Query 3 • Delta compression:
      SELECT s.light
      FROM buf, sensors AS s
      WHERE |s.light - buf.light| > t
      OUTPUT INTO buf
      SAMPLE PERIOD 1s
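The effect of the delta-compression query can be sketched in a few lines of Python. This is an illustration, not TinyDB code: `last` stands in for the one-row buffer `buf`, and reporting the very first reading (when the buffer is still empty) is an assumption the slide does not state.

```python
def delta_compress(readings, t):
    """Report a reading only when it differs from the last reported one by more than t."""
    reported = []
    last = None                      # stands in for the one-row buffer `buf`
    for r in readings:
        # Assumption: an empty buffer always lets the first reading through.
        if last is None or abs(r - last) > t:
            reported.append(r)
            last = r                 # OUTPUT INTO buf
    return reported


print(delta_compress([100, 102, 101, 110, 111, 95], t=5))  # [100, 110, 95]
```

Six samples shrink to three transmissions here; how much is saved in practice depends on how quickly the signal changes relative to `t`.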

  25. Sample Query 4 • Offline Delivery + Event Chaining
      CREATE BUFFER equake_data(uint16 loc, uint16 xAccel, uint16 yAccel)
        SIZE 1000 PARTITION BY NODE
      SELECT xAccel, yAccel
        FROM SENSORS
        WHERE xAccel > t OR yAccel > t
        SIGNAL shake_start(…)
        SAMPLE PERIOD 1s
      ON EVENT shake_start(…)
        SELECT loc, xAccel, yAccel
        FROM sensors
        OUTPUT INTO BUFFER equake_data(loc, xAccel, yAccel)
        SAMPLE PERIOD 10ms

  26. Event Based Processing • Enables internal and chained actions • Language Semantics • Events are inter-node • Buffers can be global • Implementation plan • Events and buffers must be local • Since n-to-n communication not (well) supported • Next: operator expressiveness

  27. Operator Expressiveness: Aggregate Framework • Standard SQL supports “the basic 5”: • MIN, MAX, SUM, AVERAGE, and COUNT • We support any function conforming to:
      Agg_n = {f_merge, f_init, f_evaluate}
      f_merge{<a1>, <a2>} → <a12>
      f_init{a0} → <a0>
      f_evaluate{<a1>} → aggregate value
      (Merge associative, commutative!)
      Partial Aggregate Example: Average
      AVG_merge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2>
      AVG_init{v} → <v, 1>
      AVG_evaluate{<S1, C1>} → S1/C1
      From Tiny AGgregation (TAG), Madden, Franklin, Hellerstein, Hong. OSDI 2002 (to appear).
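The init / merge / evaluate decomposition maps naturally onto three small functions. The Python below is a sketch of that structure only; the `Aggregate` container and the `aggregate` helper are names invented for this example, not part of TinyDB or TAG.

```python
from functools import reduce
from typing import Any, Callable, NamedTuple


class Aggregate(NamedTuple):
    """An aggregate in the slide's form: {f_init, f_merge, f_evaluate}."""
    init: Callable[[Any], Any]
    merge: Callable[[Any, Any], Any]
    evaluate: Callable[[Any], Any]


# AVG keeps a <sum, count> partial state record, as on the slide.
AVG = Aggregate(
    init=lambda v: (v, 1),
    merge=lambda a, b: (a[0] + b[0], a[1] + b[1]),
    evaluate=lambda s: s[0] / s[1],
)

# MAX needs only the running maximum as state.
MAX = Aggregate(init=lambda v: v, merge=max, evaluate=lambda s: s)


def aggregate(agg, readings):
    """Fold per-reading partial states together, then evaluate once at the end."""
    return agg.evaluate(reduce(agg.merge, map(agg.init, readings)))


print(aggregate(AVG, [10, 20, 30, 40]))  # 25.0

# Because merge is associative and commutative, two subtrees can be combined
# independently and merged later, giving the same answer.
left = reduce(AVG.merge, map(AVG.init, [10, 20]))
right = reduce(AVG.merge, map(AVG.init, [40, 30]))
print(AVG.evaluate(AVG.merge(right, left)))  # 25.0
```

The second half is what makes in-network aggregation possible: each node merges whatever partial states reach it, in whatever order, and only the root calls `evaluate`.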

  28. Isobar Finding

  29. Temporal Aggregates • TAG was about “spatial” aggregates • Inter-node, at the same time • Want to be able to aggregate across time as well • Two types: • Windowed: AGG(size,slide,attr) • Decaying: AGG(comb_func, attr) • [Figure: window with size = 4, slide = 2 over readings … R1 R2 R3 R4 R5 R6 …] • Demo!
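One plausible reading of the two temporal forms, sketched in Python. The exact TinyDB semantics are not given on the slide, so the interpretation here is an assumption: a windowed aggregate is applied to `size` consecutive readings and the window then advances by `slide`, and a decaying aggregate folds each new reading into a running value with `comb_func`.

```python
def windowed(agg, size, slide, readings):
    """Apply agg to each full window of `size` readings, advancing by `slide`."""
    return [agg(readings[start:start + size])
            for start in range(0, len(readings) - size + 1, slide)]


def decaying(comb_func, readings):
    """Fold readings into one running value; comb_func(old, new) controls the decay."""
    acc = readings[0]
    for r in readings[1:]:
        acc = comb_func(acc, r)
    return acc


# size = 4, slide = 2 over six readings R1..R6 gives windows R1-R4 and R3-R6.
print(windowed(max, 4, 2, [1, 5, 2, 8, 3, 9]))               # [8, 9]

# A hypothetical decay rule: weight old state and new reading equally.
print(decaying(lambda old, new: 0.5 * old + 0.5 * new, [4, 8, 0]))  # 3.0
```

With `slide < size` the windows overlap, so each reading contributes to more than one output value.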

  30. Expressiveness Review • Internal & nested queries • With logging of results for offline delivery • Event based processing • Extensible aggregates • Spatial & temporal • On to Question 2: What about efficiency?

  31. Q2: Efficiency • Metric: power consumption • Goal: reduce communication, which dominates cost • 800 instrs/bit! • Standard approach: in-network processing, sleeping whenever you can…

  32. But that’s not good enough… • What else can we do to bring down costs? • Sleep Even More? • Events are key • Apply automatic optimization! • Semantically driven routing • …and topology construction • Operator placement + ordering • Adaptive data delivery

  33. TAG • In-network processing • Reduces costs depending on type of aggregates • Exploitation of operator semantics • From Tiny AGgregation (TAG), Madden, Franklin, Hellerstein, Hong. OSDI 2002 (to appear).

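  34. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Depth = d • [Figure: routing tree over sensors 1 to 5]

  35. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 1 • [Figure: table of values by Sensor # and Epoch #; values shown, in source order: 1, 1, 1, 1, 1]

  36. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 2 • [Figure: table of values by Sensor # and Epoch #; values shown, in source order: 3, 1, 2, 2, 1]

  37. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 3 • [Figure: table of values by Sensor # and Epoch #; values shown, in source order: 4, 1, 3, 2, 1]

  38. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 4 • [Figure: table of values by Sensor # and Epoch #; values shown, in source order: 5, 1, 3, 2, 1]

  39. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 5 • [Figure: table of values by Sensor # and Epoch #; values shown, in source order: 5, 1, 3, 2, 1]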
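The epoch-by-epoch behaviour in slides 34 to 39 can be reproduced with a small simulation. The tree below is an assumption: the transcript does not preserve the figure's edges, so a topology was chosen (1 is the root with children 2 and 3, 3 has child 4, 4 has child 5) that is consistent with the counts shown. In each epoch a node reports 1 for itself plus whatever its children reported in the previous epoch.

```python
def pipelined_count(children, root, epochs):
    """Return the root's COUNT(*) value in each epoch of pipelined aggregation."""
    prev = {}                # partial counts each node sent in the previous epoch
    root_counts = []
    for _ in range(epochs):
        # Each node counts itself plus its children's values from one epoch ago.
        cur = {node: 1 + sum(prev.get(child, 0) for child in kids)
               for node, kids in children.items()}
        root_counts.append(cur[root])
        prev = cur
    return root_counts


# Hypothetical routing tree, chosen to match the counts in the transcript.
TREE = {1: [2, 3], 2: [], 3: [4], 4: [5], 5: []}

print(pipelined_count(TREE, root=1, epochs=5))  # [1, 3, 4, 5, 5]
```

The root's answer is wrong for the first few epochs and converges once data from the deepest node has had time to travel up the tree, which is the trade-off pipelining makes in exchange for producing a result every epoch.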

  40. Simulation Results • 2500 Nodes • 50x50 Grid • Depth = ~10 • Neighbors = ~20 • Some aggregates require dramatically more state!

  41. Taxonomy of Aggregates • TAG insight: classify aggregates according to various functional properties • Yields a general set of optimizations that can automatically be applied

  42. Optimization: Channel Sharing • Insight: Shared channel enables optimizations • Suppress messages that won’t affect aggregate • E.g., in a MAX query, sensor with value v hears a neighbor with value ≥ v, so it doesn’t report • Applies to all exemplary, monotonic aggregates • Learn about query advertisements it missed • If a sensor shows up in a new environment, it can learn about queries by looking at neighbors’ messages. • Root doesn’t have to explicitly rebroadcast query!
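The MAX suppression rule is easy to simulate. The sketch below makes a strong simplifying assumption that is not on the slide: all sensors share one broadcast domain, so every node overhears every earlier transmission, and nodes transmit in list order.

```python
def max_with_suppression(values):
    """Simulate a MAX query where a node stays silent if it overheard a value >= its own."""
    heard_max = None
    sent = []
    for v in values:                         # nodes transmit in this order
        if heard_max is not None and heard_max >= v:
            continue                         # cannot change the MAX, so suppress
        sent.append(v)
        heard_max = v
    return heard_max, sent


result, messages = max_with_suppression([3, 7, 5, 7, 9, 2])
print(result, messages)  # 9 [3, 7, 9]
```

Three of six messages are suppressed in this run. The saving depends on transmission order: if the largest value happens to go first, every other node stays silent.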

  43. Optimization: Hypothesis Testing • Insight: Root can provide information that will suppress readings that cannot affect the final aggregate value. • E.g. Tell all the nodes that the MIN is definitely < 50; nodes with value ≥ 50 need not participate. • Depends on monotonicity • How is hypothesis computed? • Blind guess • Statistically informed guess • Observation over first few levels of tree / rounds of aggregate
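A minimal sketch of the MIN example in Python. The function name and the return convention are invented for illustration; in particular, how the root recovers from a wrong hypothesis is not described on the slide, so the empty case simply returns `None` here.

```python
def min_with_hypothesis(values, bound):
    """Only nodes whose value is below the root's hypothesized bound participate."""
    participants = [v for v in values if v < bound]
    if not participants:
        # Hypothesis was wrong: nobody reported. Recovery is left unspecified.
        return None, 0
    return min(participants), len(participants)


# Root hypothesizes "MIN is definitely < 50".
print(min_with_hypothesis([72, 48, 55, 61, 30, 90], bound=50))  # (30, 2)
```

Only two of the six nodes transmit, and the answer is still exact because MIN is monotonic: a value of 50 or more could never have been the minimum once some node holds a smaller one.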

  44. Optimization: Use Multiple Parents • For duplicate insensitive aggregates • Or aggregates that can be expressed as a linear combination of parts • Send (part of) aggregate to all parents • Decreases variance • Dramatically, when there are lots of parents • [Figure: node A with parents B and C; edge weights 1 versus 1/2 and 1/2]
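The variance claim can be checked with a short exact calculation. The model is an assumption, not something stated on the slide: each link to a parent independently drops a message with probability `loss`, and a node splits its value evenly across its parents (the linear-combination case, as for COUNT or SUM).

```python
from itertools import product


def delivered_stats(value, num_parents, loss):
    """Exact mean and variance of the amount delivered when value is split evenly."""
    share = value / num_parents
    mean = 0.0
    second_moment = 0.0
    # Enumerate every combination of per-link success (1) and failure (0).
    for outcome in product([0, 1], repeat=num_parents):
        prob = 1.0
        for ok in outcome:
            prob *= (1 - loss) if ok else loss
        total = share * sum(outcome)
        mean += prob * total
        second_moment += prob * total ** 2
    return mean, second_moment - mean ** 2


# Hypothetical 10% loss rate per link.
for k in (1, 2, 4):
    mean, var = delivered_stats(1.0, k, loss=0.1)
    print(k, round(mean, 4), round(var, 4))
```

Under this model the expected delivered value is unchanged while the variance falls in proportion to the number of parents, which matches the slide's "dramatically, when there are lots of parents". For duplicate-insensitive aggregates such as MAX, the full value can be sent to every parent instead of a share.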

  45. TAG Summary • In-network query processing is a win for many aggregate functions • By exploiting general functional properties of operators, many optimizations are possible • Requires new aggregates to be tagged with their properties • Up next: non-aggregate query processing optimizations, a flavor of things to come!

  46. Attribute Driven Topology Selection • Observation: internal queries often over local area* • Or some other subset of the network • E.g. regions with light value in [10,20] • Idea: build topology for those queries based on values of range-selected attributes • Requires range attributes, connectivity to be relatively static • * Heidemann et al., Building Efficient Wireless Sensor Networks with Low-Level Naming. SOSP, 2001.

  47. Attribute Driven Query Propagation • SELECT … WHERE a > 5 AND a < 12 • Precomputed intervals == “Query Dissemination Index” • [Figure: node 4 and nodes 1, 2, 3 with intervals [1,10], [20,40], [7,15]]
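A sketch of how a query dissemination index could be used to prune forwarding. The pairing of nodes with intervals (1 with [1,10], 2 with [20,40], 3 with [7,15]) is read off the order in which labels appear in the transcript and should be treated as an assumption, as should the function names.

```python
def overlaps(lo, hi, q_lo, q_hi):
    """True if the closed interval [lo, hi] intersects the open range (q_lo, q_hi)."""
    return hi > q_lo and lo < q_hi


def children_to_forward(index, q_lo, q_hi):
    """Forward the query only to children whose value interval can satisfy it."""
    return [node for node, (lo, hi) in index.items()
            if overlaps(lo, hi, q_lo, q_hi)]


# Hypothetical index: child -> range of attribute `a` seen in that child's subtree.
INDEX = {1: (1, 10), 2: (20, 40), 3: (7, 15)}

# SELECT … WHERE a > 5 AND a < 12
print(children_to_forward(INDEX, 5, 12))  # [1, 3]
```

Node 2 never receives the query, so its whole subtree can stay asleep for this query. The index stays useful only while the attribute ranges are fairly stable, as the previous slide notes.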

  48. Attribute Driven Parent Selection • Even without intervals, expect that sending to parent with closest value will help • [3,6] ∩ [1,10] = [3,6] • [3,6] ∩ [7,15] = ø • [3,6] ∩ [20,40] = ø • [Figure: node 4 with interval [3,6] and candidate parents 1, 2, 3 with intervals [1,10], [20,40], [7,15]]
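Parent selection by interval intersection can be sketched as follows. The scoring rule is an assumption made for this example: a candidate is scored by the width of its overlap with the node's own interval, and when there is no overlap the score goes negative by the size of the gap, which gives the "closest value" fallback the slide mentions. The node-to-interval pairing is again read from the transcript's label order.

```python
def intersect(a, b):
    """Intersection of two closed intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def closeness(a, b):
    """Overlap width if the intervals intersect, else minus the gap between them."""
    return min(a[1], b[1]) - max(a[0], b[0])


def choose_parent(own, candidates):
    """Pick the candidate parent whose interval is closest to our own."""
    return max(candidates, key=lambda node: closeness(own, candidates[node]))


CANDIDATES = {1: (1, 10), 2: (20, 40), 3: (7, 15)}

print(intersect((3, 6), (1, 10)))          # (3, 6)
print(intersect((3, 6), (7, 15)))          # None
print(choose_parent((3, 6), CANDIDATES))   # 1
print(choose_parent((16, 18), CANDIDATES)) # 3, nearest interval when none overlap
```

Choosing parent 1 keeps node 4's readings inside a subtree whose advertised interval already covers them, so the index at higher levels does not need to widen.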

  49. Hot off the press…

  50. Operator Placement & Ordering • Observation: Nested queries, triggers, and joins can often be re-ordered • Ordering can dramatically affect the amount of work you do • Lots of standard database tricks here