This paper explores methods to efficiently query data from sensor networks using in-network summaries. It focuses on approximating sensor readings, such as temperature, within specified tolerances, ensuring responses meet user-defined confidence levels despite lossy communication and noisy measurements. The study examines strategies for constructing models, preserving data integrity, and minimizing traversal costs in network queries. Techniques such as Gaussian mixtures for modeling, greedy clustering for grouping nodes, and sensitivity analysis of model parameters are also discussed, along with their practical implications for real-world applications.
Approximating Sensor Network Queries Using In-Network Summaries
Alexandra Meliou, Carlos Guestrin, Joseph Hellerstein
Approximate Answer Queries
• Approximate representation of the world: discrete locations, lossy communication, noisy measurements
• Applications do not expect accurate values (tolerance to noise)
• Example: return the temperature at all locations ±1°C, with 95% confidence
• Query satisfaction: in expectation, the requested portion of sensor values lies within the error range
In-network Decisions
• Use in-network models to make routing decisions
• No centralized planning
In-network Summaries
• Spanning tree T(V, E′) plus a model Mv for every node v
• Mv represents the whole subtree rooted at v
Model Complexity
• Gaussian distributions at the leaves: good for modeling individual node measurements
• Higher up the tree, a node's model must summarize its whole subtree, so combined models grow into large Gaussian mixtures: need for compression
Talk "outline": Compression; In-network summaries (Construction, Traversal)
Collapsing Gaussian Mixtures
• Compress an m-component mixture into a k-component mixture
• Look at the simple case (k = 1)
• Minimize KL-divergence? The KL-optimal single Gaussian can place "fake" mass in regions where the original mixture has almost none (see the sketch below)
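For the k = 1 case, minimizing KL(mixture ‖ Gaussian) reduces to moment matching: the best single Gaussian shares the mixture's mean and variance, which is exactly what spreads "fake" mass between well-separated components. A minimal sketch (function and variable names are illustrative):

```python
# Minimal sketch: collapse a weighted Gaussian mixture to a single Gaussian
# (k = 1) by moment matching, which minimizes KL(mixture || Gaussian).
def collapse_to_sgm(weights, means, variances):
    total = sum(weights)
    ws = [w / total for w in weights]                         # normalized weights
    mu = sum(w * m for w, m in zip(ws, means))                # matched mean
    # matched variance: E[X^2] under the mixture, minus mu^2
    var = sum(w * (v + m * m) for w, m, v in zip(ws, means, variances)) - mu * mu
    return mu, var

# Example: two well-separated components collapse to one wide Gaussian,
# placing "fake" mass in the gap between them.
print(collapse_to_sgm([0.5, 0.5], [0.0, 10.0], [1.0, 1.0]))   # (5.0, 26.0)
```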
Quality of Compression
• Depends on the query workload: a query with acceptable error window W may be answerable from the compressed model, while a query with a narrower window W′ < W may not
Compression
• Accurate mass inside the interval
• No guarantee on the tails
Talk "outline": Compression; In-network summaries (Construction, Traversal)
Query Satisfaction
• A response R = {r1 … rn} satisfies query Q(w, δ) if, in expectation, the values of at least δn nodes lie within [ri − w, ri + w]
(Figure: a query Q posed against the in-network summary yields a response R = [r1, …, r10] within the error bounds.)
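For intuition, a minimal sketch of this test against ground-truth readings (a deterministic check; the paper's definition is in expectation over the measurement noise):

```python
# Hypothetical helper: does response R satisfy Q(w, delta) against known
# ground-truth values?
def satisfies(response, truth, w, delta):
    """response, truth: dicts mapping node id -> value."""
    hits = sum(abs(response[v] - truth[v]) <= w for v in truth)
    return hits >= delta * len(truth)
```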
Optimal Traversal
• Given: tree and models
• Find: the minimum-cost subtree whose leaf models, each reporting its mean μ for the sensors below it, form a satisfying response
• Can be computed with dynamic programming (sketched below)
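A minimal sketch of one such tree DP, assuming traversal cost is the number of nodes visited and that each node's model supplies the expected number of its leaves falling within ±w of the reported mean (field names are illustrative, not the paper's):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    expected_satisfied: float   # E[# leaves within ±w of this model's mean]
    children: list = field(default_factory=list)

def pareto(options):
    """Keep only non-dominated (cost, expected_satisfied) pairs."""
    options = sorted(options, key=lambda cs: (cs[0], -cs[1]))
    frontier, best = [], float("-inf")
    for cost, sat in options:
        if sat > best:
            frontier.append((cost, sat))
            best = sat
    return frontier

def traverse_dp(v):
    """Pareto frontier of (traversal cost, expected satisfied) for v's subtree."""
    options = [(1, v.expected_satisfied)]          # option 1: stop here, answer from M_v
    if v.children:                                 # option 2: descend and combine children
        combined = [(1, 0.0)]                      # cost 1 to visit v itself
        for c in v.children:
            combined = pareto([(cv + cc, sv + sc)
                               for cv, sv in combined
                               for cc, sc in traverse_dp(c)])
        options += combined
    return pareto(options)

def cheapest_satisfying_cost(root, delta, n):
    """Minimum traversal cost whose expected satisfied count reaches delta * n."""
    for cost, sat in traverse_dp(root):
        if sat >= delta * n:
            return cost
    return None   # no cut satisfies the query
```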
Greedy Traversal
• If the local model satisfies the query, return μ
• Else descend to the child nodes
• More conservative solution: enforces query satisfiability on every subtree instead of the whole tree (see the sketch below)
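A sketch of this greedy descent; the per-node satisfiability test is assumed precomputed from M_v, and all names are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class SummaryNode:
    mu: float                   # mean of this node's model M_v
    satisfies: bool             # does M_v meet Q(w, delta) for its own subtree?
    leaf_ids: list              # ids of the leaves under this node
    children: list = field(default_factory=list)

def greedy_traverse(v, answers):
    """Descend until the local model satisfies the query, then answer from it."""
    if v.satisfies or not v.children:
        for leaf in v.leaf_ids:          # report the local model mean for every leaf below
            answers[leaf] = v.mu
    else:
        for c in v.children:
            greedy_traverse(c, answers)
```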
Talk "outline": Compression; In-network summaries (Construction, Traversal)
Optimal Tree Construction
• Given a structure, we know how to build the models
• But how do we pick the structure?
Traversal = Cut
• Theorem: in a fixed-fanout tree, the cost of the traversal is a function of the cut size |C| and the fanout F, growing with |C|
• Intuition: minimize the cut size
• Group nodes into a minimum number of groups that satisfy the query constraints: a clustering problem
Optimal Clustering
• Given a query Q(w, δ), optimal clustering is NP-hard (related to the Group Steiner Tree problem)
• Greedy algorithm with a log(n) approximation factor: greedily pick the maximum-size cluster
• Issue: does not enforce connectivity of clusters
Greedy Clustering
• Include extra nodes to enforce connectivity
• Augment clusters only with accessible nodes (losing the log n guarantee)
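An illustrative sketch of such a connectivity-aware greedy clustering (not the paper's exact algorithm): grow the largest query-satisfying cluster reachable from each seed through adjacent nodes, then repeat on the uncovered remainder:

```python
def greedy_clusters(nodes, neighbors, fits_query):
    """nodes: iterable of node ids; neighbors: dict node -> set of adjacent nodes;
    fits_query(cluster): True if one model over `cluster` can satisfy Q(w, delta)."""
    uncovered, clusters = set(nodes), []
    while uncovered:
        best = None
        for seed in uncovered:                        # try growing from every seed
            cluster, frontier = {seed}, set(neighbors[seed])
            grown = True
            while grown:                              # expand only through accessible nodes
                grown = False
                for u in sorted((frontier & uncovered) - cluster):
                    if fits_query(cluster | {u}):
                        cluster.add(u)
                        frontier |= neighbors[u]
                        grown = True
            if best is None or len(cluster) > len(best):
                best = cluster                        # keep the max-size cluster
        clusters.append(best)
        uncovered -= best
    return clusters
```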
Clustering Comparison
• Two distributed clustering algorithms are compared against the centralized greedy clustering
Talk "outline": Compression; Enriched models; In-network summaries (Construction, Traversal)
Enriched Models (SGM = Single Gaussian Model)
• Support more complex models:
• k-mixtures: compress to a k-component mixture instead of an SGM
• Virtual nodes: every component of the k-size mixture is stored as a separate "virtual node"
• SGMs on multiple windows: maintain additional SGMs for different window sizes
• Trade-off: more space, more expensive model updates
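A minimal sketch of what an enriched per-node summary might store, assuming the three extensions above (all field names are illustrative, not the paper's):

```python
from dataclasses import dataclass, field

@dataclass
class SGM:
    mu: float
    var: float

@dataclass
class EnrichedSummary:
    base: SGM                                       # plain single-Gaussian model
    virtual: list = field(default_factory=list)     # k-mixture components, one per "virtual node"
    per_window: dict = field(default_factory=dict)  # window size w -> SGM tuned for that w

    def model_for(self, w):
        """Answer with the SGM maintained for the closest window size, else the base."""
        if not self.per_window:
            return self.base
        closest = min(self.per_window, key=lambda ww: abs(ww - w))
        return self.per_window[closest]
```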
Evaluation of Enriched Models
• The plain SGM is surprisingly effective at representing the underlying data
Talk "outline": Sensitivity analysis; Compression; In-network summaries (Construction, Traversal)
Tree Construction Parameters and Their Effect on Performance
• Confidence: performance for workloads with a different confidence than the hierarchy was designed for
• Error window: broader vs. narrower ranges of window sizes, and the assignment of windows across tree levels
• Temporal changes: how often should the models be updated?
Confidence
• For a workload of 0.95 confidence, the design confidence does not have a big impact on performance
Error Windows
• A wide range of window sizes is not always better, because it forces the traversal of more levels
Conclusions
• Analyzed compression schemes for in-network summaries
• Evaluated summary traversal
• Studied optimal hierarchy construction
• Studied models of increased complexity (enriched models); showed that simple SGMs are sufficient
• Analyzed the effect of various parameters on efficiency (sensitivity analysis)