Context-Aware Sensors
Eiman Elnahrawy and Badri Nath
Department of Computer Science, Rutgers University
EWSN, January 19, 2004
Outline • Introduction, Motivations, Related Work • Context-Awareness • Approach: Modeling and Learning • Applications • Preliminary Evaluations • Challenges and Research Directions • Conclusion
Introduction • Sensors expected to become a major source of information • Applications • Monitoring: sometimes remote harsh environments • Habitat, climate, contamination • Agriculture and crops • Quality of food • Structures (response to earthquakes) • Tracking and military applications • Traffic control • Industry (control at assembly lines) • Medical (smart medicine cabinets)
Limitations of Wireless Sensor Networks • Limited battery life (the major design concern): if abused, sensors last only a few days; otherwise, they may last up to a few months • Limited communication bandwidth • Limited processing capability
High rate of packet loss • Poor communication links • Connection failures • Fading of signal strength • Packet collisions between multiple transmitters • Constant or sporadic interference • >10% of the links suffer an average loss rate >50% • Packet loss on most links fluctuates over time, with an estimated variance of 9%-17% • Topology is continuously changing (node failure, node mobility)
Limitations cause many data quality problems… 1. Outliers: serious events or bogus readings (e.g., at low battery levels) 2. Missing values • Low-level solutions to tolerate loss don't usually work; the problem persists • Limited resources: can we sample?
Inevitable! • (Uncontrollable) harsh environmental conditions, hardware and radio problems • Current technology: cheap, low-quality sensors that vary in their tolerance to quality problems • Industry's focus is even cheaper sensors -> lower quality that varies with the cost of the sensor
Serious… • Incompleteness/imperfection/uncertainty • Need to distinguish a real event from a malicious sensor • Seriously affects decision-making and triggers • False positives, false negatives, misleading answers • May cost you money • May jeopardize the application: e.g., routing based on gradient
Limitations result in many data quality problems • Missing information • Noise • Bias • "Hmm, is this a malicious sensor? Something strange, or a sensor gone bad?" • Can we sample? • "I can't rely on this sensor data anymore, it has too many problems!" • Serious for immediate decision making or actuator triggers!
General Approach • Relatively dense networks (coverage, connectivity, robustness, etc.) • Correlated and/or redundant readings • Spatial and temporal dependencies • Why don’t we exploit these spatio-temporal relationships among sensors (contextual information)?
Related Work • Spatio-temporal correlations in sensor data • Dimensions [Ganesan et al. 2002] • Premon [Goel et al. 2001] • Geospatial data analysis [Heidemann et al. 2001] • Assume the existence of such correlations without attempting to explicitly quantify them • Other data quality problems • Reducing the effect of noise [Elnahrawy et al. 2003] • Calibration (a post deployment technique) [Bychkovskiy et al. 2003]
In-network aggregation [Madden et al. 2002, 2003, Zhao et al. 2002] • Motivated our online in-network learning of relationships • Spatial and temporal data [Shekhar et al. 2003] • Graphical models in computer vision and image processing [Smyth et al. 1998, Freeman 1999]
Two Concepts • Contextual information: encodes spatial as well as temporal dependencies; enables sensors to locally predict their current readings • Context-awareness: sensors are aware of their context (neighborhood and history); given the contextual information, sensors can infer (predict) their readings
Learning the Contextual Information • Probabilistic approach based on Bayes classifiers • Mapping: learning and utilizing contextual information = learning the parameters of a Bayes classifier and then making inferences • Scalable (distributed) and energy-efficient procedure for online learning • Inference computed locally at the node
Modeling the Contextual Information • Markovian model (short-range dependencies): last reading, immediate neighbors [Diagram: sensor S, its last reading H, and neighbor readings N over time steps T, T+1, T+2]
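To make the model concrete, the Markov assumption can be stated as follows (the notation here is assumed for illustration, not taken from the slides): a sensor's current reading depends only on its own last reading and its immediate neighbors' current readings.

```latex
% Assumed notation: x_t^s is the reading of sensor s at time t,
% and N(s) is the set of immediate neighbors of s.
P\left(x_t^s \mid \text{all past readings, whole network}\right)
  = P\left(x_t^s \mid x_{t-1}^s,\ \{x_t^u : u \in N(s)\}\right)
```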
Why Bayesian? • Simple training and inference (sensors can afford it) • Bayesian models have been used in the literature (image processing, spatial data) • Gives good results and sometimes outperforms more sophisticated classifiers • Has a very nice "decomposability and progressive learning" property -> distributed learning
Bayesian and Sensor Networks • Features: h, n • Last reading h of the sensor (temporal information) • Current readings n of some immediate neighbors (spatial information) • In our preliminary work we used 2 neighbors • Quantization: R = {ri} • Divide the range of possible values into a finite set of non-overlapping subintervals, not necessarily of equal length; each subinterval = one class
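As an illustration of the quantization step, here is a minimal Python sketch (function names and bin edges are assumed for illustration, not from the paper) that maps a raw reading to one of the m classes:

```python
import bisect

def make_quantizer(edges):
    """edges: sorted interior boundaries of the non-overlapping
    subintervals (need not be of equal length); m = len(edges) + 1 classes."""
    def quantize(value):
        # bisect_right returns the index of the subinterval containing value
        return bisect.bisect_right(edges, value)
    return quantize

# Hypothetical example: a sensor range split into 4 unequal classes
quantize = make_quantizer([10.0, 18.0, 30.0])
print(quantize(15.2))  # -> 1 (second subinterval, between 10.0 and 18.0)
```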
Prediction in Bayesian Classifiers • MAP (maximum a posteriori): calculate the most likely class rMAP of the current sensor reading given • The observed features h, n (spatio-temporal information) • The parameters θ (conditional probability tables)
Naive Bayes • Features are conditionally independent given the target class • The parameters θ (under conditional independence) become • The two conditional probability tables P(h | ri) and P(n | ri) • The prior probability P(ri) of each class
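A minimal sketch of the resulting MAP rule, rMAP = argmax_i P(ri) P(h | ri) P(n | ri), assuming the tables are stored as plain Python structures (all names here are illustrative, not the authors' code):

```python
def predict_map(h, n, prior, p_h, p_n):
    """h: class of the sensor's last reading; n: tuple with the classes of
    the two neighbors' current readings (treated as an unordered pair).
    prior[i] = P(r_i); p_h[i][h] = P(h | r_i); p_n[i][pair] = P(n | r_i)."""
    pair = tuple(sorted(n))  # unordered neighbor pair
    best, best_score = None, -1.0
    for i, p in enumerate(prior):
        # Naive Bayes: score is the product of prior and feature likelihoods
        score = p * p_h[i][h] * p_n[i].get(pair, 0.0)
        if score > best_score:
            best, best_score = i, score
    return best  # rMAP: the most likely class of the current reading
```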
Parameters are just ratios of counters! • P(r1) = |r1| / |D| (frequency of class r1) • P(n = (r2, r2) | r2) = #[n = (r2, r2) and current reading r2] / |r2| • P(h = r2 | r1) = #[h = r2 and current reading r1] / |r1| • Total number of counters: 1 + m + (3/2)m² + (1/2)m³ (1 for |D|, m class counters, m² for the h table, and m²(m+1)/2 for the unordered neighbor-pair table)
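A hedged sketch of the counter bookkeeping a node might keep locally (the structure is assumed, not the authors' code); the probabilities are derived on demand as the ratios above:

```python
from collections import Counter

class LocalCounters:
    """Counters for m classes: 1 (total |D|) + m (|r_i|) + m^2 (h table)
    + m^2(m+1)/2 (unordered neighbor-pair table)."""
    def __init__(self):
        self.total = 0
        self.cls = Counter()   # |r_i|: count per class
        self.hist = Counter()  # (r_i, h) pairs: temporal feature table
        self.nbr = Counter()   # (r_i, sorted neighbor pair): spatial table

    def observe(self, r, h, n):
        self.total += 1
        self.cls[r] += 1
        self.hist[(r, h)] += 1
        self.nbr[(r, tuple(sorted(n)))] += 1

    def p_prior(self, r):
        return self.cls[r] / self.total           # P(r) = |r| / |D|

    def p_h_given(self, h, r):
        return self.hist[(r, h)] / self.cls[r]    # P(h | r)

    def p_n_given(self, n, r):
        return self.nbr[(r, tuple(sorted(n)))] / self.cls[r]  # P(n | r)
```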
Learning the Parameters • Data is free: most networks are readily used for collecting learning data (e.g., monitoring) • Two phases: learning and testing • In-network, in a distributed fashion, using in-network aggregation • Sensors collect training data and estimate the parameters locally (1 + m + (3/2)m² + (1/2)m³ counters) • Parameters (counters) are then aggregated while propagating up the routing tree (SUM aggregate), as sketched below • Overall counters are then flooded to every sensor
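Because the parameters are plain counts, the in-network SUM aggregate amounts to element-wise addition; a sketch of the merge a parent in the routing tree could perform, building on the LocalCounters sketch above:

```python
def merge(a, b):
    """Combine the counter sets of two subtrees (the SUM aggregate).
    Summing counters commutes and is associative, which is exactly what
    makes this learning step compatible with in-network aggregation."""
    out = LocalCounters()
    out.total = a.total + b.total
    out.cls = a.cls + b.cls      # Counter supports element-wise addition
    out.hist = a.hist + b.hist
    out.nbr = a.nbr + b.nbr
    return out
```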
Stationary vs. Non-Stationary • Perfect stationarity: use in-network aggregation, the most efficient option • Handling dynamic correlations requires a priori knowledge of the dynamics • Over time: re-learn the parameters dynamically at each change • Over space: cluster the network into geographical regions where the "stationarity in space" assumption holds inside each region • Time and space: hybrid approach
Analysis: In-network vs. Centralized • Both apply, but with different communication cost • Roughly measured by the size of the learning data • Varies from one application to another • Depends on accuracy and the routing mechanism • More experiments needed (future work) • Non-stationary (space): centralized is inferior
Analysis: (Imperfectly) Stationary • In-network learning: distributive summary aggregate, k × O(m³) × O(n) for k epochs, m classes, and n nodes (an O(m³) summary aggregate, k times); effectively reduces traffic • Centralized learning: centralized aggregate over the detailed set, p × O(n²) for p training instances (application-dependent), i.e., p centralized aggregates; significant traffic • Examples show the cost of centralized learning is an order of magnitude higher
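A back-of-the-envelope comparison with illustrative numbers (assumed here, not reported in the paper) shows how such an order-of-magnitude gap can arise:

```python
k, m, n, p = 10, 10, 100, 1_000   # epochs, classes, nodes, training instances
in_network  = k * m**3 * n        # k x O(m^3) x O(n) summary entries
centralized = p * n**2            # p x O(n^2) forwarded readings
print(in_network, centralized, centralized / in_network)
# 1000000 10000000 10.0 -> centralized traffic ~10x higher in this setting
```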
Applications (instances of the inference problem) • Predicting any missing value • Detecting malicious sensors • Discovering outliers • Super-resolution (sampling)
Evaluations • Synthetic data (tracking data set) • Phenomenon with sharp boundaries • Shockwave propagating around a center based on Euclidean distance • 10,000 sensors over a 100 x 100 grid • Divided the range of readings into 10 bins (classes) • Added outliers at rates of 10%-90%
As the percentage of outliers increases • The classifier takes more time (iterations) to learn • The prediction error increases and then remains constant at 7% • Sensors rely more on the temporal correlations
As the percentage of outliers increases • We were able to detect about 90% of the added outliers • Incorrect predictions were off by less than one class
Evaluations • Real data: Great Duck Island (GDI), Intel's project off the shore of Maine • Subset of the nodes (2, 12, 13, 15, 18, 24, 32, 46, 55, and 57), spatially adjacent • 5 sensors (light, temperature, thermopile, thermistor, humidity) • Readings from August 6 to September 9, 2002 (about 140,000 per sensor) • Acknowledgement: Robert Szewczyk @ Berkeley
[Result plots for the GDI data: light, temperature, thermistor, humidity]
Evaluations • The error becomes small enough in a relatively short time • >90% accuracy in most of the cases • Tested under stationarity, with random imprecision, noise, and outliers in the testing phase
Challenges • Dynamic correlations • Heterogeneity • Number of neighbors and their selection criteria • Efficient routing • Dealing with rare events • Avoiding quantization -> regression models • Multi-dimensional data
Future Work • Prototype and more evaluations • Preliminary evaluations to investigate efficiency • Extremely valuable in highlighting major design decisions and potential deployment problems • Characterization • Overall cost • Integration • Integrating noise, calibration, and context-awareness • Important to ensure learning of accurate correlations
Conclusion • Dealing with data quality problems is very important • Context-awareness: learning and making inferences • Works well • Applications: missing values, outliers, sampling • Many open problems and future work directions