CS 862 Presentation

Querying the Physical World ------ Cornell University. Event Detection Services Using Data Service Middleware in Distributed Sensor Networks ------ University of Virginia. Presented by Gary Zhou @ UVA.


Presentation Transcript


  1. CS 862 Presentation • Querying the Physical World ------ Cornell University • Event Detection Services Using Data Service Middleware in Distributed Sensor Networks ------ University of Virginia • Presented by Gary Zhou @ UVA

  2. Comparison between these two papers • Querying the Physical World: • No absolute validity interval (avi) value is attached to each data item, so it is not really real-time based. • Concentrates on individual motes • Special point of interest: represents device functions • Provides a database-like abstraction to applications • Event Detection Service: • There is an avi value for each data item, so it is really real-time based. • Group-based robust coordination • Special point of interest: provides an event detection service • Provides a database-like abstraction to applications

  3. Outline --- Querying the Physical World • Device Networks & Their Query Processing • Description of Device Networks • Three kinds of queries • Two approaches • Device Database System • Device & Function • User representation • Internal representation • Queries • Query Processing over Device Database System • Performance Metrics • Distributed Query Execution Plans • Experiments • Discussions

  4. Outline --- Event Detection Service • Motivation • Data services in sensor networks • Data Service Middleware (DSWare) • Pay more attention to Event Detection Service • Experiments and performance • Discussions

  5. Device Networks & Their Query Processing • Description of Device Network • The widespread deployment of sensors, actuators and mobile devices is transforming the physical world into a computing platform. • Emerging networking techniques ensure that devices are interconnected and accessible from local- or wide-area networks. • Using this new computing platform, users interact with portions of the physical world.

  6. Three kinds of Queries • Historical queries • These are typically aggregate queries over historical data obtained from the device network. • An example --- For each rainfall sensor in 1800 JPA, display the average level of rainfall for 1999. • Snapshot queries • These queries concern the device network at a given point in time. • An example --- Retrieve the current rainfall level for all sensors in 1800 JPA. • Long-running queries • These queries concern the device network over a time interval. • An example --- For the next 5 hours, retrieve the rainfall level for all sensors in 1800 JPA every 30 seconds.

  7. Two Approaches • The warehousing approach • Definition --- In this approach, data are extracted from the devices in a predefined way and stored in a centralized database system that is responsible for query processing. • Device database system • Definition --- A database system that enables distributed query processing over a device network.

  8. Two Approaches --- warehousing • Advantages of the warehousing approach • It is well suited for aggregate queries over historical data. • Disadvantages of the warehousing approach • It dissociates access to the devices from the query workload. • It uses valuable resources to transfer large amounts of raw data from the devices to the database server.

  9. Two Approaches --- Device database system • Device database system • Device & Function • User representation • Internal representation • Queries

  10. Device & Function • Device • Each device is a mini-server that supports a set of functions and can process portions of the queries directly at the device. • Function • A function either • Acquires, stores and processes data or • Triggers an action in the physical world • Synchronous function • It returns its result immediately, on demand. • It is used to monitor continuous phenomena, for example, a function that returns the rainfall level. • Asynchronous function • It returns its result after an arbitrary period of time. • It is used to monitor threshold events, for example, a function that detects an abnormal rainfall level.
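The difference between a synchronous and an asynchronous device function can be illustrated with a minimal sketch. The class `RainfallDevice`, its method names and the `sense` helper are hypothetical and only simulate a device in memory; they are not taken from the paper.

```python
from typing import Callable, List


class RainfallDevice:
    """Hypothetical in-memory stand-in for a device with two functions."""

    def __init__(self, threshold_mm: float) -> None:
        self.threshold_mm = threshold_mm
        self._level_mm = 0.0
        self._listeners: List[Callable[[float], None]] = []

    def get_rainfall_level(self) -> float:
        """Synchronous function: returns the current level immediately, on demand."""
        return self._level_mm

    def on_abnormal_rainfall(self, callback: Callable[[float], None]) -> None:
        """Asynchronous function: the result arrives later, when the threshold is crossed."""
        self._listeners.append(callback)

    def sense(self, level_mm: float) -> None:
        """Simulate a new physical measurement (assumption: replaces real sensing)."""
        self._level_mm = level_mm
        if level_mm > self.threshold_mm:
            for callback in self._listeners:
                callback(level_mm)


device = RainfallDevice(threshold_mm=50.0)
alerts: List[float] = []
device.on_abnormal_rainfall(alerts.append)  # register interest, nothing returned yet

device.sense(20.0)
normal_reading = device.get_rainfall_level()  # answered on demand
device.sense(75.0)                            # crosses the threshold, callback fires
```

After the second measurement, `alerts` holds the single abnormal reading, while the synchronous call was answered as soon as it was made.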

  11. User representation • Devices are represented as ADTs • Abstract Data Type (ADT) objects • ADT objects are objects that are single attribute values encapsulating a collection of related data. ADT objects provide controlled access to encapsulated data through a well-defined interface. • An example: RFSensors (Sensor,X,Y) provides Sensor.getRainfallLevel()

  12. Internal representation • Device functions are represented as virtual relations • Virtual relation • It is a tabular representation of a function. Each record contains the input arguments and the output argument of the function it is associated with. • Properties of a virtual relation • It is append-only • It is naturally partitioned across all devices represented by the same device ADT
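The two properties of a virtual relation (append-only, partitioned by device) can be sketched as follows. The names `VirtualRecord` and `VirtualRelation` and the `timestamp` field are assumptions made for illustration; the paper's actual record layout is not given on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class VirtualRecord:
    sensor: str      # input argument: the device the function was called on
    timestamp: int   # assumed field: when the call was made (simulated seconds)
    value: float     # output argument of the function


class VirtualRelation:
    """Hypothetical append-only relation, partitioned by device."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[VirtualRecord]] = {}

    def append(self, record: VirtualRecord) -> None:
        # The only mutation offered: there is no update or delete.
        self._partitions.setdefault(record.sensor, []).append(record)

    def partition(self, sensor: str) -> Tuple[VirtualRecord, ...]:
        # The records that would live on one device.
        return tuple(self._partitions.get(sensor, []))

    def scan(self) -> List[VirtualRecord]:
        # The relation as a whole is the union of its partitions.
        return [r for part in self._partitions.values() for r in part]


vr = VirtualRelation()
vr.append(VirtualRecord("s1", 0, 12.0))
vr.append(VirtualRecord("s2", 0, 48.5))
vr.append(VirtualRecord("s1", 30, 55.0))
```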

  13. Queries • Historical queries • Snapshot queries • They are naturally formulated as declarative queries in SQL • An example of a long-running query: SELECT R.Sensor.getRainfallLevel() FROM RFSensors R WHERE R.Sensor.getRainfallLevel() > 50 AND $every(30) • The function $every(30) specifies that a new record is inserted every 30 seconds into the append-only virtual relation corresponding to the function RFSensor.getRainfallLevel().
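The behaviour of the long-running query with `$every(30)` can be approximated by a small generator. This is only a sketch under assumptions: the clock is simulated (no real waiting), each sensor is modelled as a hypothetical function from time to rainfall level, and `long_running_query` is an illustrative name, not part of any device database system.

```python
from typing import Callable, Dict, Iterator, Tuple


def long_running_query(
    sensors: Dict[str, Callable[[int], float]],
    period_s: int,
    duration_s: int,
    threshold: float,
) -> Iterator[Tuple[int, str, float]]:
    """Sample every sensor each period and keep readings above the threshold."""
    for t in range(0, duration_s, period_s):      # plays the role of $every(period_s)
        for sensor_id, read in sensors.items():
            level = read(t)                       # a new virtual record is produced
            if level > threshold:                 # WHERE ... > threshold
                yield (t, sensor_id, level)


# Hypothetical sensors: s1 rises over time, s2 stays low.
sensors = {
    "s1": lambda t: 40.0 + t,
    "s2": lambda t: 10.0,
}
results = list(long_running_query(sensors, period_s=30, duration_s=90, threshold=50.0))
```

Here the query samples at simulated times 0, 30 and 60, and only the readings of `s1` at 30 and 60 pass the selection.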

  14. Query Processing over Device Database System • Performance Metrics • Traditional performance metrics • Throughput --- average number of queries processed per unit of time • Response time --- time needed by the system to produce all answer records to a query. • New performance metrics • Resource Usage --- The total amount of energy consumed by the devices when executing a query. • Reaction Time --- The interval between the time a function, called on devices, returns the value and the time the corresponding answer is produced on the front-end.

  15. Distributed Query Execution Plans • Query --- Retrieve every 30 seconds the rainfall level if it is greater than 50 mm. SELECT VR.value FROM VRFSensorsGetRainfallLevel VR, RFSensors R WHERE VR.Sensor = R.Sensor AND VR.value > 50 AND $every(30)

  16. Plan T • Data extracted from the devices are materialized in the relation VR, which is located on the front-end. • Relation R is joined with relation VR (using the join condition VR.Sensor = R.Sensor AND VR.value > 50). • Both R and VR are on the front-end, and the join is executed on the front-end.

  17. Plan A • It is a simple tree where R is joined on the front-end with relation VR partitioned across a set of devices. • The front-end asks each device to measure the rainfall level and to transfer the resulting virtual records back to the front-end. • Each virtual record arriving on the front-end is then joined with relation R. • Disadvantage --- All devices with rainfall sensors transmit data to the front-end, while the query only concerns the sensors that measure a rainfall level greater than 50.

  18. Plan B • Define a semi-join between R and the partitions of VR located on the devices. The semi-join projects out the joining attribute from R (here the device ID Sensor) and sends it to all devices. • On the devices, whenever the rainfall level is measured, a virtual record is generated and joined with the portion of relation R sent by the front-end (using the joining condition R.Sensor = VR.Sensor AND VR.value > 50). • If the joining condition is satisfied, the virtual record is sent back to the front-end to be joined with the complete records from relation R.

  19. Plan C • It only pushes the selection (VR.value > 50) onto the device. Only records that satisfy the condition are sent back to the front-end, where they are joined with relation R. • Compared to Plan B, no subset of relation R is transmitted to the devices.

  20. Resource usage for sensors located outside a flood area • With Plan B, a semi-join is pushed to the device. The condition on the rainfall level is checked on the device, and no data is sent back because the sensor is outside the flood area. • Plan B pays the initial cost of transferring a fragment of relation R to the devices. This initial cost is amortized (compared to Plan A) during the lifespan of the long-running query. • With Plan C, a selection is pushed to the device. The condition on the rainfall level is checked on the device, and again no data is sent back because the sensor is outside the flood area. • With Plan A, data is sent back to the front-end whenever it is generated.

  21. Resource usage for sensors located inside a flood area • With all plans, data is always sent back to the front-end. • The initial cost of Plan B is never amortized here, so the curve for Plan B rises rapidly as time increases. • Question: Why do Plan C and Plan A have almost identical curves? • Because the cost of performing a selection is low compared to the cost of sending data.

  22. Conclusion of Plans • Pushing a selection as in Plan C is optimal. This is intuitive, since the query filters out uninteresting events generated on the devices. • Pushing the selection allows the device database system to efficiently trade increased processing on the devices for reduced communication.
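The comparison of Plans A, B and C can be made concrete with a rough message count for a single device. This is a simplified cost model, not the paper's experiment: it assumes every transmitted record costs one message, that the fragment of R sent by Plan B costs one message, and it ignores message sizes and the energy spent on device-side processing. The function name `messages_per_plan` is hypothetical.

```python
from typing import Dict, List


def messages_per_plan(readings: List[float], threshold: float) -> Dict[str, int]:
    """Count messages between one device and the front-end under each plan."""
    matching = sum(1 for r in readings if r > threshold)
    return {
        "A": len(readings),  # every reading is shipped to the front-end
        "B": 1 + matching,   # one initial message with the fragment of R, then only matches
        "C": matching,       # selection pushed to the device: only matches are sent
    }


outside_flood = messages_per_plan([10.0, 20.0, 30.0, 15.0], threshold=50.0)
inside_flood = messages_per_plan([60.0, 70.0, 80.0, 90.0], threshold=50.0)
```

Under these assumptions, a device outside the flood area sends nothing with Plan C and only receives the initial fragment with Plan B, while inside the flood area Plans A and C send the same number of records and Plan B pays one extra message that is never recovered.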

  23. Discussions • I love the idea of using virtual relations to represent device functions. • The complete query semantics over a device database are not given here. • No avi value is attached to each data item, so it is not really real-time based. • Individual nodes are not important, and a mote's sensor may get damaged and report wrong values, so group-based coordination should be introduced.

  24. Event Detection Service

  25. Motivation • Sensor networks are data-centric and real-time based -- Abstraction of real-time data semantics needed • Individual nodes in sensor networks are unreliable -- Group-based robust coordination needed • Detection of some events relies on more than one type of sensor data -- The relationship can help to increase the reliability of data decisions

  26. Data Services in sensor networks • Queries (location, frequency, duration) • Data/Event dissemination • Data Aggregation • Data-centric Storage/Caching • Event Detection • Data Security and Access Authorization

  27. Data Service Middleware (DSWare) • [Figure: Services in Data Service Middleware. DSWare sits between the application and the sensor nodes and offers a database-like abstraction with these services: real-time scheduling, event detection, subscription, group management, aggregation, data storage, caching, authorization.] • Compare? • Data Storage: static copies & provides reliability • Caching: variable copies & improves performance • Data Storage • Map the key to a logical node • Map a logical node to multiple physical nodes • Caching • Spread copies along the routing path
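The two-step mapping used for data storage (key to logical node, logical node to several physical nodes) can be sketched as below. The slide does not say how the key is mapped, so hashing the key with SHA-256 is purely an assumption for illustration, as are the function names and the `placement` table.

```python
import hashlib
from typing import Dict, List


def logical_node_for(key: str, num_logical: int) -> int:
    """Step 1 (assumed scheme): hash the data key onto a logical node index."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_logical


def physical_nodes_for(key: str, placement: Dict[int, List[str]]) -> List[str]:
    """Step 2: a logical node stands for several physical nodes holding copies."""
    return placement[logical_node_for(key, len(placement))]


# Hypothetical placement table: logical node index -> physical node names.
placement = {
    0: ["n1", "n2"],
    1: ["n3", "n4"],
    2: ["n5", "n6", "n7"],
}
replicas = physical_nodes_for("temperature", placement)
```

Because the mapping is deterministic, any node can recompute where a key is stored, and losing one physical node still leaves the other copies of that logical node.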

  28. Problems with current event detection schemes • [Figure: atomic event reports for an explosion are sent to a node that determines the occurrence of compound events.] • An external node collects reports of atomic events and determines whether the compound event occurs • This reduces possible in-network processing and increases unnecessary concentrated traffic around the decision node • It increases detection delay (unacceptable for some time-critical applications)

  29. Event Detection Service in DSWare • [Figure: an explosion is detected in the area from high temperature, light intensity change and acoustic changes.] • Event: an activity in the environment that is of interest to the application and can be monitored or detected • Hierarchy of events • Atomic event: • detected through a single sensor's observation • e.g. high temperature, light intensity change, acoustic change • Compound event: • consists of a set of atomic events • detected based on the detection of the atomic events it consists of • e.g. explosion

  30. Event Detection Scheme in DSWare • Confidence • Every compound event detection report has a confidence value, which indicates the reliability of the report • The confidence function is designed based on data semantics • Relative importance of different atomic sub-events • Temporal continuity of events • Statistical models • Similarity among adjacent regions • Waiting Time Window • The time that an aggregation node waits for the arrival of all possible atomic event reports • When the time window (TW) expires, report a compound event if the confidence value reaches the minimum confidence requirement of this event • Avoids endless waiting when messages are lost • Enables event detection based on the partial information collected
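The waiting time window can be sketched as a decision taken once the window expires, using whatever reports have arrived by then. The function `detect_compound_event` and the fraction-based confidence used in the usage lines are hypothetical simplifications, not the DSWare implementation; reports are modelled as (arrival time, sub-event name) pairs.

```python
from typing import Callable, Iterable, Optional, Set, Tuple


def detect_compound_event(
    reports: Iterable[Tuple[float, str]],
    window_start: float,
    window_length: float,
    confidence: Callable[[Set[str]], float],
    min_confidence: float,
) -> Optional[float]:
    """Decide at window expiry; return the confidence if the event is reported."""
    deadline = window_start + window_length
    # Only reports that arrived inside the window count; late or lost ones are ignored.
    seen = {name for (t, name) in reports if window_start <= t < deadline}
    value = confidence(seen)
    return value if value >= min_confidence else None


# Assumed confidence for illustration: fraction of expected sub-events observed.
expected = {"T", "L", "A"}
fraction_seen = lambda seen: len(seen & expected) / len(expected)

arrivals = [(0.5, "T"), (1.2, "L"), (7.0, "A")]   # "A" arrives after the window
report = detect_compound_event(arrivals, 0.0, 5.0, fraction_seen, min_confidence=0.6)
strict = detect_compound_event(arrivals, 0.0, 5.0, fraction_seen, min_confidence=0.9)
```

With the looser requirement the event is reported from partial information (two of three sub-events); with the stricter one nothing is reported, but in both cases the node stops waiting when the window ends.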

  31. A Simple Example: Explosion (E) • Sub-events: high temperature (T), special light (L), acoustic changes (A) • Confidence function: f = [0.6 * BOOL(T) + 0.3 * BOOL(L) + 0.3 * BOOL(A)] * h (h: history factor, which increases if the explosion event has been detected in a previous waiting time window; assume 1≤h≤2) • Minimum confidence: 0.8 • [Figure: timelines at the group leader showing T, L and A reports arriving within a shifting time window, with confidence values f=0.3h, f=0.6h, f=0.9h and f=1.2h and the outcomes "Report E", "No reports" and "Lost".]
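The confidence function on this slide can be evaluated directly. The sketch below uses the weights, the range of h and the minimum confidence given on the slide; the function names and the default h = 1 are assumptions.

```python
MIN_CONFIDENCE = 0.8  # minimum confidence from the slide


def confidence(temp: bool, light: bool, acoustic: bool, h: float = 1.0) -> float:
    """f = [0.6*BOOL(T) + 0.3*BOOL(L) + 0.3*BOOL(A)] * h, with 1 <= h <= 2."""
    if not 1.0 <= h <= 2.0:
        raise ValueError("history factor h must satisfy 1 <= h <= 2")
    return (0.6 * int(temp) + 0.3 * int(light) + 0.3 * int(acoustic)) * h


def should_report(temp: bool, light: bool, acoustic: bool, h: float = 1.0) -> bool:
    """Report explosion E when the confidence reaches the minimum."""
    return confidence(temp, light, acoustic, h) >= MIN_CONFIDENCE


t_and_l = should_report(True, True, False)              # f = 0.9h with h = 1
t_only = should_report(True, False, False)              # f = 0.6h with h = 1
t_only_history = should_report(True, False, False, 1.5) # f = 0.6 * 1.5
```

With h = 1, temperature plus one other sub-event is enough to report, while temperature alone is not; a larger history factor can lift the same single report above the minimum.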

  32. Some other issues in event detection • Temporal resolution • Some events last much longer than the sensing interval of a sensor. So probably some applications will report a single event repetitively, which is unnecessary. • Spatial resolution • If the size of a detection group is too small compared to the event, there might be several groups in this event’s coverage that will report the same event.

  33. Performance in Reduction of Communication • Baseline: • Only one report of an environment property is generated from a group during each sensing interval. • All reports are sent to an outside node, and the entire analysis is done there. • DSWare requires less communication than the baseline.

  34. Performance in Differentiating Events and Event-like Factors • How can repeated reports of an event be differentiated from event-like factors? • How does the performance change with different time window sizes and different minimum confidence values?

  35. Discussions • The idea of an event detection service is well developed and thoroughly discussed. • In DSWare, data is replicated on multiple physical nodes that are mapped to a single logical node, so consistency among these nodes is a key issue. The paper mentions "weak consistency", but what is the definition of "weak consistency" in a sensor network? • Since multiple physical nodes are mapped to a single logical node, why is data caching needed? What are the different purposes of introducing both? • It is mentioned that the application can specify the actual scheduling scheme in the sensor network based on its most important concerns. But is that a good thing for the application to do? It does not seem to be a simple task.

  36. Discussions --- (cont.) • What is the position of real-time scheduling in the system? How is real-time behavior provided? • Two questions about Fig 5: • How can repeated reports of an event be differentiated from event-like factors? • How does the performance change with different time window sizes and different minimum confidence values? • A small typo: • In the last sentence before 5.1, "an explosion event will be reported if the Confidence_E is not less than 0.9" should be "an explosion event will be reported if the Confidence_E is no less than 0.9"
