
Streaming Knowledge Bases

Streaming Knowledge Bases. Onkar Walavalkar, Anupam Joshi, Tim Finin and Yelena Yesha. University of Maryland, Baltimore County. 27 October 2008.


Presentation Transcript


  1. Streaming Knowledge Bases • Onkar Walavalkar, Anupam Joshi, Tim Finin and Yelena Yesha • University of Maryland, Baltimore County • 27 October 2008


  4. Overview • Motivation • Streaming databases • Streaming knowledge bases • Experiments and results • Conclusions

  5. Operating Room of the Future • ORs will be awash in low-level data, much of it noisy or incomplete • Challenges include coping with the noise and interpreting the low-level data to recognize high-level events and activities [Figure labels, in extraction order: drugs, RFID, ORF, tools, AwarePoint, WiFi, patient, monitors, Bluetooth, devices, staff]

  6. Initial work in OR training • UMD Mastri Center is experimenting with OR technologies and training environments • The Human Patient Simulator from METI • Designed to react like a human • Responds to medical treatment • Generates continuous streams of data, moderated by • Initial conditions (e.g. blunt trauma multiple injuries scenario) • Human interactions

  7. Efficient Data Stream Management • Traditional DBMS: data is stored/indexed in the system; queries are applied to the stored data as they "stream through" • Stream Management System: queries are stored/indexed in the system; data is applied to the stored queries as it "streams through" • Several efforts: Tapestry, Aurora, TelegraphCQ [Figure labels: Query, Data, Results, Index Queries, Index Data]
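
The inversion described on this slide (store the queries, stream the data past them) can be illustrated with a minimal Python sketch. It is not how Tapestry, Aurora, or TelegraphCQ are implemented; the class name `ContinuousQueryEngine`, its methods, and the `heart_rate` field are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List


class ContinuousQueryEngine:
    """Toy stream manager: queries are stored, data flows past them."""

    def __init__(self) -> None:
        # Stored queries, keyed by name (hypothetical representation:
        # each query is just a predicate over one record).
        self._queries: Dict[str, Callable[[dict], bool]] = {}
        # Matching records collected per query.
        self.results: Dict[str, List[dict]] = {}

    def register(self, name: str, predicate: Callable[[dict], bool]) -> None:
        """Store a continuous query before any data arrives."""
        self._queries[name] = predicate
        self.results[name] = []

    def push(self, record: dict) -> List[str]:
        """Apply one incoming record to every stored query.

        Returns the names of the queries the record satisfied.
        The record itself is kept only in the result lists.
        """
        matched = []
        for name, predicate in self._queries.items():
            if predicate(record):
                self.results[name].append(record)
                matched.append(name)
        return matched
```

A caller would register a query once, for example `engine.register("high_hr", lambda r: r["heart_rate"] > 120)`, and then call `engine.push(record)` for each arriving monitor reading. A real system would also index the stored queries so that each record is not tested against all of them.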

  8. [Architecture diagram; labels in extraction order] • Event Detection - Level 3: Medical Encounter Record, Video Clipper, Rule Base • Event Detection - Level 2: Events, Staff, Assert facts, Patient History, Trend Analyzer, Low-Level Event Processor, Events Database, Physiological Data, Medical Supplies • Event Detection - Level 1: Stream Processor (TelegraphCQ), RFID System, Patient Monitor, Continuous Queries, Medicines, Tools, Staff

  9. What’s wrong with this picture? • We need to enhance this to support semantic interoperability for medical data & knowledge • The medical community has a long history of developing & using standard ontologies & metadata • Incoming streams of data can be in RDF and reference terms in appropriate ontologies

  10. What’s wrong with this picture? • Streaming database systems use continuous queries specified over a sliding time window • e.g., [range by ‘30 seconds’ slide by ‘10 seconds’] • Issues: • Where do we do reasoning? • How do we answer queries against a sliding window of data?
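
To make the window clause concrete, here is a small Python sketch of a window that covers the last 30 seconds and is re-evaluated every 10 seconds. It assumes timestamps in seconds that arrive in non-decreasing order and a window interval of (end - range, end]; TelegraphCQ's exact semantics may differ, and the class name `SlidingWindow` is hypothetical.

```python
from collections import deque


class SlidingWindow:
    """Sliding time window: 'range' is the span covered, 'slide' the step."""

    def __init__(self, range_s=30.0, slide_s=10.0):
        self.range_s = range_s
        self.slide_s = slide_s
        self._buffer = deque()       # (timestamp, value), oldest first
        self._next_end = slide_s     # end time of the next window to emit

    def add(self, ts, value):
        """Add one event; return every window closed by its arrival.

        Each returned item is (window_end, [values in the window]).
        Assumes ts never decreases between calls.
        """
        closed = []
        # An event at exactly window_end still belongs to that window,
        # so a window closes only once a strictly later event arrives.
        while ts > self._next_end:
            closed.append(self._close(self._next_end))
            self._next_end += self.slide_s
        self._buffer.append((ts, value))
        return closed

    def _close(self, end):
        start = end - self.range_s
        # Evict events that fell out of the (start, end] interval.
        while self._buffer and self._buffer[0][0] <= start:
            self._buffer.popleft()
        return end, [v for _, v in self._buffer]
```

Because the range is longer than the slide, consecutive windows overlap, so the same event can appear in up to three successive results. This overlap is one reason the slide's question about reasoning over a sliding window is not trivial: facts inferred from an event must expire when the event leaves the window.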

  11. RDF Stream Processing [Diagram labels: Input Triple Stream, input stream handler, Special domain rules & queries, Enhanced Stream, Query for Class of Concern, Detected Instances; Static Data Store: RangeInfo, DomainInfo, Classtree, PropertyTree, InverseInfo]
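
The slide names the static tables but not the rules applied to them. The sketch below is one plausible reading, assuming the tables drive the usual RDFS-style entailments (domain, range, subclass, subproperty) plus inverse properties. The table contents, the `ex:` terms, and the function names are all hypothetical.

```python
RDF_TYPE = "rdf:type"

# Hypothetical static data store, loaded once from the ontology.
DOMAIN_INFO = {"ex:observedSpecies": "ex:Observation"}
RANGE_INFO = {"ex:observedSpecies": "ex:Species"}
INVERSE_INFO = {"ex:observedSpecies": "ex:observedIn"}
CLASS_TREE = {  # child -> parent
    "ex:ZebraMussel": "ex:InvasiveSpecies",
    "ex:InvasiveSpecies": "ex:Species",
}
PROPERTY_TREE = {}  # child property -> parent property


def ancestors(node, tree):
    """All ancestors of node in a child -> parent table, nearest first."""
    found = []
    while node in tree and tree[node] not in found:  # guard against cycles
        node = tree[node]
        found.append(node)
    return found


def typed(subject, cls):
    """Type assertions for cls and every superclass of cls."""
    return [(subject, RDF_TYPE, c) for c in [cls] + ancestors(cls, CLASS_TREE)]


def enhance(triple):
    """Return the input triple plus the triples inferred from the tables."""
    s, p, o = triple
    if p == RDF_TYPE:
        return typed(s, o)
    out = []
    for q in [p] + ancestors(p, PROPERTY_TREE):
        out.append((s, q, o))
        if q in DOMAIN_INFO:
            out += typed(s, DOMAIN_INFO[q])
        if q in RANGE_INFO:
            out += typed(o, RANGE_INFO[q])
        if q in INVERSE_INFO:
            out.append((o, INVERSE_INFO[q], s))
    return out


def detect(stream, class_of_concern):
    """Subjects that the enhanced stream types as class_of_concern."""
    detected = []
    for triple in stream:
        for s, p, o in enhance(triple):
            if p == RDF_TYPE and o == class_of_concern and s not in detected:
                detected.append(s)
    return detected
```

Each incoming triple is expanded using table lookups only, so no general-purpose reasoner runs on the stream itself. Whether the original system chained rules in exactly this way is not stated on the slide.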

  12. Experiments and results • Three simple reasoners • Jena, in core • Pre-computed custom hash tables • Using tables in TelegraphCQ • Various scenarios • Ontology size: 118 - 23.1 MB • Number of subclasses: 49 - 57,000 • Subclass depth: 2 - 9 • Data rate: 1 - 50 triples per second
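
The "pre-computed custom hash tables" approach is not detailed on the slide. A minimal sketch of the general idea, assuming an acyclic subclass hierarchy, is to compute each class's full set of superclasses once, so that a subclass test on a streaming triple becomes a single hash lookup. The function names and the example classes are hypothetical.

```python
def precompute_superclasses(direct_parents):
    """Map every class to the frozenset of itself and all its superclasses.

    direct_parents: dict of class -> list of direct parent classes.
    Assumes the hierarchy is acyclic.
    """
    closure = {}

    def visit(cls):
        if cls not in closure:
            supers = {cls}
            for parent in direct_parents.get(cls, ()):
                supers |= visit(parent)
            closure[cls] = frozenset(supers)
        return closure[cls]

    for cls in direct_parents:
        visit(cls)
    return closure


def is_subclass(cls, candidate_super, closure):
    """Constant-time subclass test against the pre-computed table."""
    return candidate_super in closure.get(cls, frozenset({cls}))
```

The table trades memory for speed: with deep or wide hierarchies the closure can be much larger than the raw subclass list, which is the kind of trade-off the VM usage slides compare.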

  13. Domain Example • Monitor data stream looking for observations of invasive species from Bioblitz and eco-blogging data streams • Uses our Ethan ontologies for ecoinformatics • Tree of life (~340K taxons from ITIS and other sources) • Species profiles • Invasive species definitions • Observation

  14. Reasoning delay comparison for all approaches

  15. Reasoning delay comparison for all approaches

  16. Reasoning delay comparison for all approaches

  17. Reasoning delay comparison for all approaches

  18. VM Usage comparison of all 3 approaches

  19. VM Usage for Jena for different classes

  20. VM usage comparison for Hashtable and TCQ

  21. Conclusions • If the incoming triple data rate goes beyond a certain limit, reasoning starts to lag and tends to slow down the incoming stream • The speedup achieved by using TCQ and a hashtable demonstrates the value of pre-processing an ontology, particularly for fast-streaming facts

  22. http://ebiquity.umbc.edu/
