Querying Sensor Networks

Sam Madden, UC Berkeley. October 2, 2002 @ UCLA.

Presentation Transcript


  1. Querying Sensor Networks Sam Madden UC Berkeley October 2, 2002 @ UCLA

  2. Introduction • Programming Sensor Networks Is Hard • Especially if you want to build a “real” application • Declarative Queries Are Easy • And, can be faster and more robust than most applications!

  3. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • The Next Step

  4. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • The Next Step

  5. Declarative Queries: SQL • SQL is the traditional declarative language used in databases • Template: SELECT {sel-list} FROM {tables} WHERE {pred} GROUP BY {pred} HAVING {pred} • Example: SELECT dept.name, AVG(emp.salary) FROM emp, dept WHERE emp.dno = dept.dno AND (dept.name = 'Accounting' OR dept.name = 'Marketing') GROUP BY dept.name
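
As a rough illustration of the SQL example on this slide, the sketch below runs the same query against an in-memory SQLite database. The table layouts, the sample rows, the `average_salaries` helper, and the added `ORDER BY` (used only to make the output order predictable) are assumptions made for this example; only the core query text comes from the slide.

```python
import sqlite3

# Query from the slide, with plain single quotes for string literals.
# ORDER BY is an addition for this sketch so the row order is deterministic.
QUERY = """
SELECT dept.name, AVG(emp.salary)
FROM emp, dept
WHERE emp.dno = dept.dno
  AND (dept.name = 'Accounting' OR dept.name = 'Marketing')
GROUP BY dept.name
ORDER BY dept.name
"""


def average_salaries(depts, emps):
    """Load hypothetical rows into an in-memory database and run QUERY.

    depts: iterable of (dno, name); emps: iterable of (dno, salary).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dept (dno INTEGER, name TEXT)")
    conn.execute("CREATE TABLE emp (dno INTEGER, salary REAL)")
    conn.executemany("INSERT INTO dept VALUES (?, ?)", depts)
    conn.executemany("INSERT INTO emp VALUES (?, ?)", emps)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

The caller states only what is wanted (average salary per named department); how the join and grouping are executed is left to the database engine.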

  6. Declarative Queries for Sensor Networks • Examples: • (1) SELECT nodeid, light FROM sensors WHERE light > 400 SAMPLE PERIOD 1s • (2) SELECT AVG(volume) FROM sensors WHERE light > 400 GROUP BY roomNo HAVING AVG(volume) > 200 (rooms w/ volume > 200) • (3) ON EVENT bird_detect(loc) AS bd SELECT AVG(s.light), AVG(s.temp) FROM sensors AS s WHERE dist(bd.loc,s.loc) < 10m SAMPLE PERIOD 1s for 10 [Coming soon!]

  7. General Declarative Advantages • Data Independence • Not required to specify how or where, just what. • Of course, can specify specific addresses when needed • Transparent Optimization • System is free to explore different algorithms, locations, orders for operations

  8. Data Independence In Sensor Networks • Vastly simplifies execution for large networks • Since locations are described by predicates • Operations are over groups • Enables tolerance to faults • Since system is free to choose where and when operations happen

  9. Optimization In Sensor Networks • Optimization Goal : Power! • Where to process data • In network • Outside network • Hybrid • How to process data • Predicate & Join Ordering • Index Selection • How to route data • Semantically Driven Routing

  10. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • The Next Step

  11. TinyDB • A distributed query processor for networks of Mica motes • Available today! • Goal: Eliminate the need to write C code for most TinyOS users • Features • Declarative queries • Temporal + spatial operations • Multihop routing • In-network storage

  12. TinyDB @ 10000 Ft • (Almost) all queries are continuous and periodic • Written in SQL • With extensions for: • Sample rate • Offline delivery • Temporal aggregation • [Figure: a query flowing through a routing tree of nodes A to F, annotated with the node sets {A,B,C,D,E,F}, {B,D,E,F}, and {D,E,F}]

  13. TinyDB Demo

  14. Applications + Early Adopters • Some demo apps: • Network monitoring • Vehicle tracking • “Real” future deployments: • Environmental monitoring @ GDI (and James Reserve?) • Generic Sensor Kit • Parking Lot Monitor Demo!

  15. TinyDB Architecture (Per node) • TupleRouter: • Fetches readings (for ready queries) • Builds tuples • Applies operators • Delivers results (up tree) • AggOperator: • Combines local & neighbor readings • SelOperator: • Filters readings • Schema: • “Catalog” of commands & attributes (more later) • TinyAlloc: • Reusable memory allocator! • [Figure: per-node component diagram with SelOperator, AggOperator, TupleRouter, Network, Radio Stack, Schema, and TinyAlloc]

  16. TinyAlloc • Handle Based Compacting Memory Allocator • For Catalog, Queries • User program: Handle h; call MemAlloc.alloc(&h,10); … (*h)[0] = “Sam”; call MemAlloc.lock(h); tweakString(*h); call MemAlloc.unlock(h); call MemAlloc.free(h); • [Figure: free bitmap, master pointer table, and heap, shown for the user program and for compaction]
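
To make the idea of a handle-based compacting allocator concrete, here is a minimal Python sketch. It is not TinyAlloc itself: the class name `HandleHeap`, its methods, and the bump-pointer allocation strategy are assumptions chosen for illustration, and the free bitmap from the slide is not modeled. The point it shows is that callers hold handles (indices into a master pointer table) rather than raw addresses, so unlocked blocks can be moved during compaction without invalidating them.

```python
class HandleHeap:
    """Toy handle-based compacting allocator (illustrative only)."""

    def __init__(self, size):
        self.heap = bytearray(size)
        self.table = {}        # master pointer table: handle -> [offset, length, locked]
        self.next_handle = 0
        self.top = 0           # first unused byte after the last block

    def alloc(self, length):
        # Compact only when the request does not fit at the top of the heap.
        if self.top + length > len(self.heap):
            self.compact()
        if self.top + length > len(self.heap):
            raise MemoryError("heap exhausted")
        handle = self.next_handle
        self.next_handle += 1
        self.table[handle] = [self.top, length, False]
        self.top += length
        return handle

    def free(self, handle):
        del self.table[handle]

    def lock(self, handle):
        self.table[handle][2] = True    # pinned: compaction must not move it

    def unlock(self, handle):
        self.table[handle][2] = False

    def write(self, handle, data):
        offset, length, _ = self.table[handle]
        if len(data) > length:
            raise ValueError("data larger than block")
        self.heap[offset:offset + len(data)] = data

    def read(self, handle):
        offset, length, _ = self.table[handle]
        return bytes(self.heap[offset:offset + length])

    def compact(self):
        # Slide unlocked blocks toward offset 0 in address order.
        dest = 0
        for entry in sorted(self.table.values(), key=lambda e: e[0]):
            offset, length, locked = entry
            if locked:
                dest = offset + length      # leave pinned block where it is
                continue
            if offset != dest:
                self.heap[dest:dest + length] = self.heap[offset:offset + length]
                entry[0] = dest             # update the master pointer
            dest += length
        self.top = dest
```

Locking corresponds to the `lock`/`unlock` calls in the slide's user program: while a block is locked, code may safely hold a direct reference into it.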

  17. Schema • Attribute & Command IF • At INIT(), components register attributes and commands they support • Commands implemented via wiring • Attributes fetched via accessor command • Catalog API allows local and remote queries over known attributes / commands. • Demo of adding an attribute, executing a command.
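
The registration pattern described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea: the `Catalog` class, its method names, and the example attribute and command are hypothetical and do not reflect the actual TinyDB interfaces.

```python
class Catalog:
    """Toy catalog: components register attributes and commands by name."""

    def __init__(self):
        self.attributes = {}   # name -> zero-argument accessor
        self.commands = {}     # name -> callable taking command arguments

    def register_attribute(self, name, accessor):
        self.attributes[name] = accessor

    def register_command(self, name, func):
        self.commands[name] = func

    def get(self, name):
        # Attributes are fetched by calling their accessor.
        return self.attributes[name]()

    def run(self, name, *args):
        return self.commands[name](*args)

    def describe(self):
        # What a local or remote query over the catalog could return.
        return sorted(self.attributes), sorted(self.commands)
```

A component would call `register_attribute` and `register_command` once at initialization; after that, a query processor can look up attributes by name without knowing which component provides them.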

  18. Overview • Overview of Declarative Systems • TinyDB • Features • Demo • Challenges + Research Issues • Language • Optimizations • Quality

  19. 3 Questions • Is this approach expressive enough? • Can this approach be efficient enough? • Are the answers this approach gives good enough?

  20. Q1: Expressiveness • Simple data collection satisfies most users • How much of what people want to do is just simple aggregates? • Anecdotally, most of it • EE people want filters + simple statistics (unless they can have signal processing) • However, we’d like to satisfy everyone!

  21. Query Language • New Features: • Joins • Event-based triggers • Via extensible catalog • In network & nested queries • Split-phase (offline) delivery • Via buffers

  22. Sample Query 1 Bird counter: CREATE BUFFER birds(uint16 cnt) SIZE 1 ON EVENT bird-enter(…) SELECT b.cnt+1 FROM birds AS b OUTPUT INTO b ONCE

  23. Sample Query 2 Birds that entered and left within time t of each other: ON EVENT bird-leave AND bird-enter WITHIN t SELECT bird-leave.time, bird-leave.nest WHERE bird-leave.nest = bird-enter.nest ONCE

  24. Sample Query 3 Delta compression: SELECT light FROM buf, sensors WHERE |s.light - buf.light| > t OUTPUT INTO buf SAMPLE PERIOD 1s
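
The effect of the delta-compression query can be sketched outside the query language. The Python function below is a simplified reading of it, under two assumptions not stated on the slide: the buffer holds a single last-reported value, and the very first reading is always reported because the buffer starts empty. The name `delta_compress` is made up for this example.

```python
def delta_compress(readings, threshold):
    """Report a reading only when it differs from the last reported
    value by more than `threshold`; the reported value then replaces
    the buffered one."""
    reported = []
    buffered = None                      # plays the role of `buf`
    for light in readings:
        if buffered is None or abs(light - buffered) > threshold:
            reported.append(light)
            buffered = light             # OUTPUT INTO buf
    return reported
```

For example, with a threshold of 5 the readings 100, 102, 110, 111, 90 would produce reports for 100, 110, and 90 only.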

  25. Sample Query 4 Offline Delivery + Event Chaining CREATE BUFFER equake_data( uint16 loc, uint16 xAccel, uint16 yAccel) SIZE 1000 PARTITION BY NODE SELECT xAccel, yAccel FROM SENSORS WHERE xAccel > t OR yAccel > t SIGNAL shake_start(…) SAMPLE PERIOD 1s ON EVENT shake_start(…) SELECT loc, xAccel, yAccel FROM sensors OUTPUT INTO BUFFER equake_data(loc, xAccel, yAccel) SAMPLE PERIOD 10ms

  26. Event Based Processing • Enables internal and chained actions • Language Semantics • Events are inter-node • Buffers can be global • Implementation plan • Events and buffers must be local • Since n-to-n communication not (well) supported • Next: operator expressiveness

  27. Operator Expressiveness: Aggregate Framework • Standard SQL supports “the basic 5”: • MIN, MAX, SUM, AVERAGE, and COUNT • We support any function conforming to: Agg_n = {f_merge, f_init, f_evaluate} • f_merge{<a1>, <a2>} → <a12> • f_init{a0} → <a0> • f_evaluate{<a1>} → aggregate value • (Merge must be associative and commutative!) • Partial aggregate example: Average • AVG_merge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2> • AVG_init{v} → <v, 1> • AVG_evaluate{<S1, C1>} → S1/C1 • From Tiny AGgregation (TAG), Madden, Franklin, Hellerstein, Hong. OSDI 2002 (to appear).
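
The init / merge / evaluate decomposition on this slide maps directly onto code. The sketch below expresses AVG exactly as the slide defines it and adds MAX as a second instance; the dictionary layout, the `aggregate_subtree` helper, and the tree representation are assumptions made for illustration rather than the TAG implementation.

```python
# Each aggregate is a triple of functions, as on the slide.
AVG = {
    "init": lambda v: (v, 1),                              # <v, 1>
    "merge": lambda a, b: (a[0] + b[0], a[1] + b[1]),      # <S1 + S2, C1 + C2>
    "evaluate": lambda a: a[0] / a[1],                     # S1 / C1
}

MAX = {
    "init": lambda v: v,
    "merge": max,
    "evaluate": lambda a: a,
}


def aggregate_subtree(agg, node, readings, children):
    """Return the partial state record for `node` and everything below it.

    readings: node -> local sensor value
    children: node -> list of child nodes (missing key means leaf)
    """
    state = agg["init"](readings[node])
    for child in children.get(node, []):
        child_state = aggregate_subtree(agg, child, readings, children)
        state = agg["merge"](state, child_state)
    return state
```

Because merge is associative and commutative, the order in which children report does not change the result, and only the root needs to call `evaluate`.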

  28. Isobar Finding

  29. Temporal Aggregates • TAG was about “spatial” aggregates • Inter-node, at the same time • Want to be able to aggregate across time as well • Two types: • Windowed: AGG(size,slide,attr) • Decaying: AGG(comb_func, attr) • Demo! • [Figure: a window with size = 4 and slide = 2 moving over the readings … R1 R2 R3 R4 R5 R6 …]
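
The two temporal aggregate forms can be sketched as follows. This is one plausible interpretation, not the TinyDB semantics: `windowed` emits one value per full window and advances by `slide` readings, and `decaying` folds readings into a running value with a caller-supplied combining function (and assumes at least one reading). Both function names are hypothetical.

```python
def windowed(agg, size, slide, readings):
    """Apply `agg` to each full window of `size` readings,
    moving the window forward by `slide` readings each time."""
    return [
        agg(readings[start:start + size])
        for start in range(0, len(readings) - size + 1, slide)
    ]


def decaying(comb_func, readings):
    """Fold readings into one running value; `comb_func(acc, r)`
    decides how much weight the older accumulated value keeps."""
    acc = readings[0]
    for reading in readings[1:]:
        acc = comb_func(acc, reading)
    return acc
```

With size 4 and slide 2 over six readings R1 to R6, `windowed` produces two values: one over R1..R4 and one over R3..R6.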

  30. Expressiveness Review • Internal & nested queries • With logging of results for offline delivery • Event based processing • Extensible aggregates • Spatial & temporal • On to Question 2: What about efficiency?

  31. Q2: Efficiency • Metric: power consumption • Goal: reduce communication, which dominates cost • 800 instrs/bit! • Standard approach: in-network processing, sleeping whenever you can…

  32. But that’s not good enough… • What else can we do to bring down costs? • Sleep Even More? • Events are key • Apply automatic optimization! • Semantically driven routing • …and topology construction • Operator placement + ordering • Adaptive data delivery

  33. TAG • In-network processing • Reduces costs depending on type of aggregates • Exploitation of operator semantics Tiny AGgregation (TAG), Madden, Franklin, Hellerstein, Hong. OSDI 2002 (to appear).

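  34. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Depth = d • [Figure: routing tree of sensors 1 to 5]

  35. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 1 • [Figure: routing tree of sensors 1 to 5 with a table of partial counts by sensor # and epoch #]

  36. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 2 • [Figure: routing tree of sensors 1 to 5 with a table of partial counts by sensor # and epoch #]

  37. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 3 • [Figure: routing tree of sensors 1 to 5 with a table of partial counts by sensor # and epoch #]

  38. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 4 • [Figure: routing tree of sensors 1 to 5 with a table of partial counts by sensor # and epoch #]

  39. Illustration: Pipelined Aggregation • SELECT COUNT(*) FROM sensors • Epoch 5 • [Figure: routing tree of sensors 1 to 5 with a table of partial counts by sensor # and epoch #]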
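
A small simulation can show why a pipelined COUNT takes several epochs to converge. The sketch below assumes that in every epoch each node sends its own count of 1 plus whatever partial counts its children sent in the previous epoch. The tree shape used in the usage note is hypothetical and is not taken from the figure; `pipelined_count` is an illustrative name.

```python
def pipelined_count(children, root, epochs):
    """Return the COUNT seen at `root` in each epoch.

    children: node -> list of child nodes (every node must be a key).
    In each epoch a node reports 1 (itself) plus the partial counts
    its children reported in the previous epoch.
    """
    previous = {node: 0 for node in children}   # nothing sent before epoch 1
    root_counts = []
    for _ in range(epochs):
        current = {
            node: 1 + sum(previous[child] for child in kids)
            for node, kids in children.items()
        }
        root_counts.append(current[root])
        previous = current
    return root_counts
```

For a five-node tree such as `{1: [2, 3], 2: [4], 3: [], 4: [5], 5: []}` with root 1, the root's count grows epoch by epoch until every node at the deepest level has been folded in, after which it stays at 5.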

  40. Simulation Results • 2500 nodes • 50x50 grid • Depth = ~10 • Neighbors = ~20 • Some aggregates require dramatically more state!

  41. Taxonomy of Aggregates • TAG insight: classify aggregates according to various functional properties • Yields a general set of optimizations that can automatically be applied

  42. Optimization: Channel Sharing • Insight: Shared channel enables optimizations • Suppress messages that won’t affect aggregate • E.g., in a MAX query, sensor with value v hears a neighbor with value ≥ v, so it doesn’t report • Applies to all exemplary, monotonic aggregates • Learn about query advertisements it missed • If a sensor shows up in a new environment, it can learn about queries by looking at neighbors’ messages. • Root doesn’t have to explicitly rebroadcast query!
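
The MAX suppression rule can be sketched with a deliberately simplified model: all sensors share one broadcast domain, they transmit one at a time in the order given, and every sensor overhears every transmission. Those modeling choices and the function name are assumptions for illustration only.

```python
def max_with_suppression(values):
    """Simulate a MAX query on a shared channel.

    values: sensor readings in transmission order.
    Returns (maximum heard on the channel, list of values actually sent).
    A sensor stays silent if it has already overheard a value >= its own.
    """
    heard_max = None
    sent = []
    for value in values:
        if heard_max is not None and heard_max >= value:
            continue                  # suppressed: cannot change the MAX
        sent.append(value)
        heard_max = value
    return heard_max, sent
```

The answer is unchanged by suppression, while the number of transmitted messages depends on the order in which sensors happen to report.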

  43. Optimization: Hypothesis Testing • Insight: Root can provide information that will suppress readings that cannot affect the final aggregate value. • E.g. Tell all the nodes that the MIN is definitely < 50; nodes with value ≥ 50 need not participate. • Depends on monotonicity • How is hypothesis computed? • Blind guess • Statistically informed guess • Observation over first few levels of tree / rounds of aggregate
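
A minimal sketch of hypothesis testing for MIN, using the slide's example bound of 50: only nodes whose value is below the root's hypothesis report. The function name, the dictionary of readings, and the convention of returning `None` when the hypothesis turns out to be wrong (so the root would have to retry with a looser bound) are assumptions for this example.

```python
def min_with_hypothesis(readings, bound):
    """Evaluate MIN when the root announces 'the MIN is definitely < bound'.

    readings: node -> value.
    Returns (minimum or None, dict of nodes that actually reported).
    """
    reporting = {node: v for node, v in readings.items() if v < bound}
    if not reporting:
        return None, reporting        # hypothesis was wrong; nobody reported
    return min(reporting.values()), reporting
```

The tighter the hypothesis, the fewer nodes report, but a hypothesis that is too tight silences everyone and costs an extra round.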

  44. Optimization: Use Multiple Parents • For duplicate insensitive aggregates • Or aggregates that can be expressed as a linear combination of parts • Send (part of) aggregate to all parents • Decreases variance • Dramatically, when there are lots of parents • [Figure: node A sending its whole value (1) to a single parent versus half (1/2) to each of parents B and C]
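
The variance claim can be checked numerically under a simple loss model. The sketch below assumes a node splits its value evenly across its parents and that each link independently drops its message with probability `p_loss`; it then enumerates every delivery outcome to compute the exact mean and variance of what arrives. The model and the name `delivered_stats` are assumptions, not taken from the slides.

```python
from itertools import product


def delivered_stats(value, parents, p_loss):
    """Exact mean and variance of the total delivered upstream when
    `value` is split evenly over `parents` independent lossy links."""
    share = value / parents
    mean = 0.0
    second_moment = 0.0
    for outcome in product([0, 1], repeat=parents):   # 1 = link delivered
        prob = 1.0
        for delivered in outcome:
            prob *= (1 - p_loss) if delivered else p_loss
        total = share * sum(outcome)
        mean += prob * total
        second_moment += prob * total * total
    return mean, second_moment - mean * mean
```

Under this model the expected delivered value is the same for one parent or many, while the variance shrinks in proportion to the number of parents.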

  45. TAG Summary • In-network query processing is a win for many aggregate functions • By exploiting general functional properties of operators, many optimizations are possible • Requires new aggregates to be tagged with their properties • Up next: non-aggregate query processing optimizations – a flavor of things to come!

  46. Attribute Driven Topology Selection • Observation: internal queries often over local area* • Or some other subset of the network • E.g. regions with light value in [10,20] • Idea: build topology for those queries based on values of range-selected attributes • Requires range attributes, connectivity to be relatively static * Heidemann et al., Building Efficient Wireless Sensor Networks with Low-Level Naming. SOSP, 2001.

  47. Attribute Driven Query Propagation • SELECT … WHERE a > 5 AND a < 12 • Precomputed intervals == “Query Dissemination Index” • [Figure: node 4 and nodes 1, 2, 3, the latter annotated with the intervals [1,10], [20,40], and [7,15]]
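
One way to read the query dissemination index is as an interval-overlap test: a node forwards the query only toward neighbors whose precomputed interval could contain a matching value. The sketch below assumes closed intervals per child and a strict range predicate `low < a < high`, as in the slide's query; the function name and data layout are hypothetical.

```python
def children_to_query(child_intervals, low, high):
    """Return the children whose interval [lo, hi] can contain a value
    satisfying low < a < high.

    child_intervals: child id -> (lo, hi), both ends inclusive.
    """
    return [
        child
        for child, (lo, hi) in child_intervals.items()
        if hi > low and lo < high
    ]
```

With the intervals shown on the slide and the predicate `a > 5 AND a < 12`, the query would be forwarded toward the nodes covering [1,10] and [7,15] but not the one covering [20,40].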

  48. Attribute Driven Parent Selection • Even without intervals, expect that sending to parent with closest value will help • [Figure: node 4 with interval [3,6] choosing among candidate parents 1, 2, 3 with intervals [1,10], [20,40], [7,15]] • [3,6] ∩ [1,10] = [3,6] • [3,6] ∩ [7,15] = ø • [3,6] ∩ [20,40] = ø
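
The intersections on this slide suggest a simple selection rule, sketched below: pick the candidate parent whose interval overlaps the node's own interval the most. Using overlap width as the score is an assumption made for this example (the slide only says that a parent with a close value should help), and `overlap` and `choose_parent` are hypothetical names.

```python
def overlap(a, b):
    """Intersection of two closed intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def choose_parent(own, parent_intervals):
    """Return the parent whose interval overlaps `own` the most,
    or None if no candidate overlaps at all."""
    best, best_width = None, -1
    for parent, interval in parent_intervals.items():
        common = overlap(own, interval)
        width = -1 if common is None else common[1] - common[0]
        if width > best_width:
            best, best_width = parent, width
    return best
```

For a node covering [3,6] and candidate parents covering [1,10], [20,40], and [7,15], only the first overlaps, so it is selected.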

  49. Hot off the press…

  50. Operator Placement & Ordering • Observation: Nested queries, triggers, and joins can often be re-ordered • Ordering can dramatically affect the amount of work you do • Lots of standard database tricks here
