Approximate Aggregation Techniques for Sensor Databases. John Byers, Department of Computer Science, Boston University. Joint work with Jeffrey Considine, George Kollios, Feifei Li.
Sensor Network Model • Large set of sensors distributed in a sensor field. • Communication via a wireless ad-hoc network. • Nodes and links are failure-prone. • Sensors are resource-constrained. • Limited memory, battery-powered, messaging is costly.
Sensor Databases • Treat sensor field as a distributed database • But: data is gathered, neither stored nor saved. • Perform standard queries over sensor field: • COUNT, SUM, GROUP-BY • Exemplified by work such as TAG and Cougar • For this talk: • One-shot queries • Continuous queries are a natural extension.
Tiny Aggregation (TAG) Approach [Madden, Franklin, Hellerstein, Hong] • Aggregation component of TinyDB • Follows database approach • Uses simple SQL-like language for queries • Power-aware, in-network query processing • Optimizations are transparent to end-user. • TAG supports COUNT, SUM, AVG, MIN, MAX and others
TAG (continued) • Queries proceed in two phases: • Phase 1: • Sink broadcasts desire to compute an aggregate. • Nodes create a routing tree with the sink as the root. • Phase 2: • Nodes start sending back partial results. • Each node receives the partial results of its children and computes a new partial result. • Then it forwards the new partial result to its parent. • Can compute any decomposable function • $f(v_1, v_2, \ldots, v_n) = g(f(v_1, \ldots, v_k), f(v_{k+1}, \ldots, v_n))$
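The second phase above can be sketched as a recursive walk over the routing tree. This is only an illustration, not TAG's actual code: the `children` map, the node names, and the `combine` callback (playing the role of g for an aggregate whose partial state is a single number) are all assumptions made for the example.

```python
def aggregate(children, values, node, combine):
    """Return the partial result that `node` would forward to its parent.

    children: dict mapping a node to the list of its children in the routing tree
    values:   dict mapping a node to its local sensor reading
    combine:  the merge function g for a decomposable aggregate (hypothetical name)
    """
    partial = values[node]
    for child in children.get(node, []):
        # Fold each child's partial result into this node's partial result.
        partial = combine(partial, aggregate(children, values, child, combine))
    return partial


# Hypothetical four-node tree rooted at the sink.
children = {"sink": ["a", "b"], "a": ["c"]}
values = {"sink": 1, "a": 2, "b": 3, "c": 4}

total = aggregate(children, values, "sink", lambda x, y: x + y)  # SUM
largest = aggregate(children, values, "sink", max)                # MAX
```

Because every value travels along exactly one path, the SUM is exact only when no node or link on that path fails.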
Example for SUM
[Figure: spanning tree rooted at the sink, with nodes labeled by numeric values; the sink's total is 20.]
• Sink initiates the query • Nodes form a spanning tree • Each node sends its partial result to its parent • Sink computes the total sum (20 in the figure)
Classification of Aggregates • TAG classifies aggregates according to • Size of partial state • Monotonicity • Exemplary vs. summary • Duplicate-sensitivity • MIN/MAX (cheap and easy) • Small state, monotone, exemplary, duplicate-insensitive • COUNT/SUM (considerably harder) • Small state and monotone, BUT duplicate-sensitive • Cheap if aggregating over tree without losses • Expensive with multiple paths and losses
Basic approaches to computing SUM
(1) Separate, reliable delivery of every value to sink • Extremely costly in energy consumption
(2) Aggregate values back to sink along a tree • A single fault eliminates values of an entire subtree
(3) “Split” values and route fractions separately • Send (value / k) to each of k parents • Better variance, but same expectation as approach (2)
(4) Send values along multiple paths • Duplicates need to be handled. • <ID, value> pairs have limited in-network aggregation.
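The "better variance, same expectation" remark about splitting values can be checked on a single hop. As an illustrative assumption (the slide does not state a loss model), suppose each link independently delivers its message with probability 1 − p and the node holds value v. Let X_1 be the amount delivered when the whole value goes to one parent, and X_k the amount delivered when v/k goes to each of k parents:

```latex
\begin{align*}
\mathbb{E}[X_1] &= (1-p)\,v, &
\operatorname{Var}[X_1] &= p(1-p)\,v^2, \\
\mathbb{E}[X_k] &= k \cdot (1-p)\,\frac{v}{k} = (1-p)\,v, &
\operatorname{Var}[X_k] &= k \cdot p(1-p)\,\frac{v^2}{k^2} = \frac{p(1-p)\,v^2}{k}.
\end{align*}
```

Under this assumption, splitting reduces the variance by a factor of k but leaves the expected loss unchanged, so the result is still biased low.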
Design Objectives for Robust SUM • Admit in-network aggregation of partial values • Let aggregates be both order-insensitive and duplicate-insensitive • Be agnostic to routing protocol • Trust routing protocol to be best-effort. • Routing and aggregation logically decoupled [NG ’03]. • Some routing algorithms better than others.
Design Objectives (cont) • Final aggregate is exact if at least one representative from each leaf survives to reach the sink. • This won’t happen in practice. • It is reasonable to hope for approximate results. • We argue that it is reasonable to use aggregation methods that are themselves approximate.
Outline • Motivation for sensor databases and aggregation. • COUNT aggregation via Flajolet-Martin • SUM aggregation • Experimental evaluation
Flajolet / Martin sketches [JCSS ’85] • Goal: Estimate N from a small-space representation of a set. • Sketch of a union of items is the OR of their sketches. • Insertion order and duplicates don’t matter!
Prerequisite: Let h be a random, binary hash function.
Sketch of an item: For each unique item with ID x, for each integer 1 ≤ i ≤ k in turn, compute h(x, i). Stop when h(x, i) = 1, and set bit i.
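A minimal sketch of the insertion rule above, under stated assumptions: the binary hash `h` is built here from SHA-256 purely for illustration (the slides do not specify a hash function), the `salt` parameter is a hypothetical way to obtain independent sketches, and bit positions are 0-indexed rather than the slide's 1..k.

```python
import hashlib


def h(x, i, salt=0):
    """Hypothetical binary hash: one pseudo-random bit per (salt, item, position)."""
    digest = hashlib.sha256(f"{salt}:{x}:{i}".encode()).digest()
    return digest[0] & 1


def fm_sketch(x, k=16, salt=0):
    """Sketch of a single item: set the first bit position i where h(x, i) = 1."""
    for i in range(k):
        if h(x, i, salt) == 1:
            return 1 << i
    return 0  # h was 0 at all k positions, so no bit is set


def sketch_of(items, k=16, salt=0):
    """Sketch of a set: bitwise OR of the item sketches."""
    bitmap = 0
    for x in items:
        bitmap |= fm_sketch(x, k, salt)
    return bitmap
```

Since each item always maps to the same bit and OR is commutative and idempotent, `sketch_of(["a", "b", "c"])` and `sketch_of(["c", "a", "b", "a"])` produce the same bitmap.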
Flajolet / Martin sketches (cont)
Estimating COUNT: Take the sketch of a set of N items. Let j be the position of the leftmost zero in the sketch. j is an estimator of $\log_2(0.77\,N)$.
Example: S = 1 1 1 0 1, so j = 3. Best guess: COUNT ≈ 11
• Fixable drawbacks: • Estimate has faint bias • Variance in the estimate is large.
Flajolet / Martin sketches (cont) • Standard variance reduction methods apply. • Compute m independent sketches in parallel. • Compute m independent estimates of N. • Take the mean of the estimates. Provable tradeoffs between m and variance of the estimator
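The estimator and the averaging step can be sketched together as follows. The SHA-256-based hash, the `salt` used to make the m sketches independent, and the 0-indexed bit order are assumptions for illustration; the constant 0.77351 is the usual more precise value of the 0.77 on the earlier slide. The code follows the slide in averaging the m estimates directly.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin correction constant (the slide's 0.77)


def h(x, i, salt):
    """Hypothetical binary hash: one pseudo-random bit per (salt, item, position)."""
    return hashlib.sha256(f"{salt}:{x}:{i}".encode()).digest()[0] & 1


def sketch_of(items, k, salt):
    """OR together the one-bit sketches of all items."""
    bitmap = 0
    for x in items:
        for i in range(k):
            if h(x, i, salt) == 1:
                bitmap |= 1 << i
                break
    return bitmap


def leading_ones(bitmap, k=16):
    """Position j of the lowest-order zero bit (number of leading ones)."""
    j = 0
    while j < k and (bitmap >> j) & 1:
        j += 1
    return j


def estimate_count(bitmaps, k=16):
    """Mean of the m per-sketch estimates 2^j / PHI."""
    estimates = [2 ** leading_ones(b, k) / PHI for b in bitmaps]
    return sum(estimates) / len(estimates)


# m = 20 independent 16-bit sketches of 1000 distinct items.
sketches = [sketch_of(range(1000), 16, salt) for salt in range(20)]
approx = estimate_count(sketches)
```

For the example bitmap 1 1 1 0 1 on the previous slide, `leading_ones(0b10111)` is 3. Averaging j across sketches before exponentiating is a common alternative to averaging the estimates themselves.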
Application to COUNT • Each sensor computes k independent sketches of itself (using unique ID x) • Coming next: sensor computes a sketch of its value. • Use a robust routing algorithm to route sketches up to the sink. • Aggregate the k sketches via union en route. • The sink then estimates the count.
Multipath Routing Braided Paths: Two paths from the source to the sink that differ in at least two nodes
Routing Methodologies • Considerable work on reliable delivery via multipath routing • Directed diffusion [IGE ’00] • “Braided” diffusion [GGSE ’01] • GRAdient Broadcast [YZLZ ’02] • Broadcast intermediate results along gradient back to source • Can dynamically control width of broadcast • Trade off fault tolerance and transmission costs • Our approach is similar to GRAB: • Broadcast. Grab if upstream, ignore if downstream. Common goal: try to get at least one copy to sink
Simple Upstream Routing • By expanding ring search, nodes can compute their hop distance from the sink. • Refer to nodes at distance i as level i. • At level i, gather aggregates from level i+1. • Then broadcast aggregates to level i - 1 neighbors. • Ignore downstream and sidestream aggregates.
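The level-by-level scheme above can be simulated in a few lines. This is a sketch under assumptions: a breadth-first search stands in for the expanding ring search, link losses are modeled as independent coin flips with probability `loss`, and each node's sketch is a plain integer bitmap combined with OR.

```python
import random
from collections import deque


def hop_levels(adj, sink):
    """Hop distance of every reachable node from the sink (BFS stand-in)."""
    level = {sink: 0}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level


def upstream_aggregate(adj, sink, sketches, loss=0.0, rng=None):
    """OR sketches level by level toward the sink over lossy broadcast links."""
    rng = rng or random.Random(0)
    level = hop_levels(adj, sink)
    partial = {n: sketches.get(n, 0) for n in level}
    for d in range(max(level.values()), 0, -1):
        for u in [n for n in level if level[n] == d]:
            for v in adj[u]:
                # Only level d-1 neighbors accept the broadcast; side-stream
                # and downstream neighbors ignore it.
                if level[v] == d - 1 and rng.random() >= loss:
                    partial[v] |= partial[u]
    return partial[sink]


# Hypothetical 4-node diamond: node 3 reaches the sink (0) through 1 and 2.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
sketches = {0: 0b000, 1: 0b001, 2: 0b010, 3: 0b100}
at_sink = upstream_aggregate(adj, 0, sketches, loss=0.0)
```

Node 3's sketch arrives at the sink twice in the lossless case, once via each parent, and the OR absorbs the duplicate.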
Extending Flajolet / Martin Sketches • Also interested in approximating SUM • FM sketches can handle this (albeit clumsily): • To insert a value of 500, perform 500 distinct item insertions • Our observation: We can simulate a large number of insertions into an FM sketch more efficiently. • Sensor-net restrictions • No floating point operations • Must keep memory usage and CPU time to a minimum
Simulating a set of insertions • Set all the low-order bits in the “safe” region. • The first $S = \log c - 2 \log\log c$ bits are set to 1 w.h.p. • Statistically estimate the number of trials going beyond the “safe” region. • The probability of a trial doing so is simply $2^{-S}$. • Number of trials $\sim B(c, 2^{-S})$. [Mean $= O(\log^2 c)$] • For trials and bits outside the “safe” region, set those bits manually. • Running time is O(1) for each outlying trial.
Expected running time: $O(\log c)$ + time to draw from $B(c, 2^{-S})$ + $O(\log^2 c)$
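The steps above might look as follows in a desktop simulation. Several things here are assumptions rather than the authors' implementation: the generator is seeded with the sensor ID so that a duplicated sensor reproduces the same sketch, the binomial draw uses a simple floating-point inverse-CDF loop as a placeholder for the table-driven sampler discussed on the following slides, and logarithms are base 2. A real sensor implementation would avoid floating point.

```python
import math
import random


def safe_region(c):
    """S = floor(log2 c - 2 log2 log2 c), clamped at 0."""
    if c < 4:
        return 0
    lg = math.log2(c)
    return max(0, int(lg - 2 * math.log2(lg)))


def draw_binomial(n, p, rng):
    """Inverse-CDF draw from B(n, p); placeholder for a table-based sampler."""
    if p >= 1.0:
        return n
    u = rng.random()
    pmf = (1.0 - p) ** n  # Pr[X = 0]
    cdf = pmf
    k = 0
    while u > cdf and k < n:
        pmf *= (n - k) / (k + 1) * p / (1.0 - p)  # Pr[X = k+1] from Pr[X = k]
        k += 1
        cdf += pmf
    return k


def simulate_insertions(c, sensor_id, k=16):
    """Approximate the k-bit FM sketch of c distinct insertions."""
    rng = random.Random(sensor_id)  # assumption: deterministic per sensor
    s = min(safe_region(c), k)
    sketch = (1 << s) - 1  # set every bit in the "safe" region
    for _ in range(draw_binomial(c, 2.0 ** -s, rng)):
        i = s  # this trial went beyond the safe region
        while i < k and rng.getrandbits(1) == 0:
            i += 1
        if i < k:
            sketch |= 1 << i
    return sketch
```

For a reading of 500, `safe_region(500)` is 2, so the two low-order bits are set directly and only the outlying trials are simulated one by one.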
Fast sampling from discrete pdfs • We need to generate samples from B(n, p). • General problem: sampling from a discrete pdf. • Assume we can draw uniformly at random from [0, 1]. • With an event space of size N: • O(log N) lookups are immediate. • Represent the cdf in an array of size N. • Draw from [0, 1] and do binary search. • Cleverer methods for O(log log N), O(log* N) time. Amazingly, this can be done in constant time!
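The O(log N) baseline above (store the cdf in an array, then binary search) can be sketched with the standard library. The function names are hypothetical, and the example pdf is the one used on the later Walker slide.

```python
import bisect
import itertools
import random


def build_cdf(pdf):
    """Running sums of the pdf: cdf[i] = Pr[X <= i]."""
    return list(itertools.accumulate(pdf))


def sample(cdf, u):
    """Return the first index whose cdf value exceeds u, for u in [0, 1)."""
    # The clamp guards against rounding error in the final cdf entry.
    return min(bisect.bisect_right(cdf, u), len(cdf) - 1)


cdf = build_cdf([0.40, 0.30, 0.15, 0.10, 0.05])
rng = random.Random(0)
draws = [sample(cdf, rng.random()) for _ in range(5)]
```

Each draw costs one uniform random number plus a binary search over N entries.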
Constant Time Sampling • Theorem [Walker ’77]: For any discrete pdf D over a sample space of size N, a table of size O(N) can be constructed in O(N) time that enables random variables to be drawn from D using at most two table lookups.
Sampling in O(1) time [Walker ’77] • Start with a discrete pdf over {A, B, C, D, E}: {0.40, 0.30, 0.15, 0.10, 0.05} • Construct a table of 2N entries.
Algorithm: Pick a column i at random. Pick x uniformly from [0, 1]. If x < p_i, output i. Else output Q_i.
        A      B      C      D      E
p_i     1      1      0.75   0.5    0.25
Q_i     __     __     A      B      A
In the table above: Pr[B] = 1 * 0.2 + 0.5 * 0.2 = 0.3 and Pr[C] = 0.75 * 0.2 = 0.15
Methods of [Walker ’77] (cont.) • Ok, but how do you construct the table?
Table construction: Take a “below-average” i. Choose p_i to satisfy x_i = p_i / n. Set the j with the largest x_j as Q_i. Reduce x_j accordingly. Repeat.
[Figure: worked example on the pdf {0.40, 0.30, 0.15, 0.10, 0.05}, showing the residual x values after each step and the resulting p_i / Q_i table from the previous slide.]
Linear time construction.
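For concreteness, here is a sketch of alias-table construction and sampling. It uses the common two-worklist variant (often attributed to Vose) rather than the exact "largest x_j" rule on the slide, so the table it builds can differ from the slide's while encoding the same distribution. All function names are hypothetical.

```python
import random


def build_alias(pdf):
    """Build per-column thresholds `prob` and aliases `alias` in O(N) time."""
    n = len(pdf)
    scaled = [p * n for p in pdf]  # an average column has scaled mass 1
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]  # column s keeps its own mass...
        alias[s] = l  # ...and is topped up from column l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    return prob, alias


def draw(prob, alias, rng):
    """One sample using two table lookups."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def induced_pdf(prob, alias):
    """Distribution actually encoded by the table (for checking)."""
    n = len(prob)
    mass = [0.0] * n
    for i in range(n):
        mass[i] += prob[i] / n
        mass[alias[i]] += (1.0 - prob[i]) / n
    return mass


pdf = [0.40, 0.30, 0.15, 0.10, 0.05]
prob, alias = build_alias(pdf)
samples = [draw(prob, alias, random.Random(1)) for _ in range(3)]
```

`induced_pdf(prob, alias)` recovers the original pdf up to rounding, which is the same check the slide performs by hand for Pr[B] and Pr[C].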
Back to extending FM sketches • We need to sample from $B(c, 2^{-S})$ for various values of S. • Using Walker’s method, we can sample from $B(c, 2^{-S})$ in O(1) time and O(c) space, assuming tables are pre-computed offline.
Back to extending FM sketches (cont) • With more cleverness, we can trade off space for time. Recall that: • Running time = time to sample from B + $O(\log^2 c)$ • Sampling in $O(\log^2 c)$ time leads to $O(c / \log^2 c)$ space. • With a max sensor value of $2^{16}$, saving a $\log^2 c$ term is a 256-fold space savings ($\log^2 c = 16^2 = 256$). • Tables for S = 1, 2, …, 16 together take 4600 bytes (without this optimization, tables would be >1MB)
Intermission • FM sketches require more work initially. • Need k bits to represent a single bit! • But: • Sketched values can easily be aggregated. • Aggregation operation (OR) is both order-insensitive and duplicate-insensitive. • Result is a natural fit with sensor aggregation.
Outline • Sensor database motivation • COUNT aggregation via Flajolet-Martin • SUM aggregation • Experimental evaluation
Experimental Results • We employ the publicly available TAG simulator. • Basic topologies: grid (2-D lattice) and random • Can modulate: • Grid size [default: 30 by 30] • Node, packet, or link loss rate [default: 5% link loss rate] • Number of bitmaps [default: twenty 16-bit sketches]. • Transmission radius [default: 8 neighbors on the grid]
Experimental Results • We consider four main methods. • TAG: transmit aggregates up a single tree • DAG-based TAG: Send a 1/k fraction of the aggregated values to each of k parents. • SKETCH: broadcast an aggregated sketch to all neighbors at level i – 1 • LIST: explicitly enumerate all <key, value> pairs and broadcast to all neighbors at level i – 1. • LIST vs. SKETCH measures the penalty associated with approximate values.
Message Comparison • TAG: transmit aggregates up a single tree • 1 message transmitted per node. • 1 message received per node (on average). • Message size: 16 bits. • SKETCH: broadcast a sketch up the tree • 1 message transmitted per node. • Fanout of k receivers per transmission (constant k). • Message size: 20 16-bit sketches = 320 bits.
Compressibility • The FM sketches are amenable to compression. • We employ a very basic method: • Run-length encode the initial prefix of ones. • Run-length encode the suffix of zeroes. • Represent the middle explicitly. • Method can be applied to a group of sketches. • This alone buys about a factor of 3. • Better methods exist.
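The basic three-part encoding above can be sketched on a bitmap written as a string, lowest-order bit first. The string representation and the function names are assumptions for illustration; the slides do not give the on-the-wire format, and this sketch does not attempt to reproduce the factor-of-3 figure.

```python
def compress(bits):
    """Split a bitmap into (count of leading ones, explicit middle, count of trailing zeros)."""
    ones = len(bits) - len(bits.lstrip("1"))
    rest = bits[ones:]
    middle = rest.rstrip("0")
    zeros = len(rest) - len(middle)
    return ones, middle, zeros


def decompress(ones, middle, zeros):
    """Inverse of compress."""
    return "1" * ones + middle + "0" * zeros


# A typical 16-bit sketch: a run of ones, a short mixed region, then zeros.
encoded = compress("1110100000000000")
restored = decompress(*encoded)
```

For this sketch only the two middle bits need to be sent explicitly, alongside the two run lengths.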
Future Directions • Spatio-temporal queries • Restrict queries to specific regions of space, time, or space-time. • Other aggregates • What else needs to be computed or approximated? • Better aggregation methods • FM sketches have rather high variance. • Many other sketching methods can potentially be used.