
Is the (Semantic) Web a Database?

Laks V.S. Lakshmanan, University of British Columbia, http://www.cs.ubc.ca/~laks. Joint work with Igor Naverniouk (UBC) and Fereidoon Sadri (Univ. of North Carolina @ Greensboro). Thanks to Wendy Wang and Zhimin Chen.


Presentation Transcript


  1. Is the (Semantic) Web a Database? Laks V.S. Lakshmanan, University of British Columbia, http://www.cs.ubc.ca/~laks. Joint work: Igor Naverniouk (UBC); Fereidoon Sadri, Univ. of North Carolina @ Greensboro. Thanks to: Wendy Wang, Zhimin Chen.

  2. Debunking the hype in the title • Asilomar 98 Report: the web is a huge DB! … • But the web ain’t a DB: Mendelzon, Nov. 98! • Our punchline: • Adding semantics doesn’t make it a DB! • BUT, it is a huge embedded collection of repositories of info. (and services) which could greatly benefit from a databasey abstraction • Disclaimer: • not a talk about SW. • Work in progress. SW=DB?, Keynote, IDEAS 2003

  3. Overview • What is the Semantic Web and why bother with it? • The Web and Databases • SW – technologies & tools • The X-DARES-U project @ UBC • Summary, Related Work, & Future Challenges

  4. Overview • What is the Semantic Web and why bother with it? • The Web and Databases • SW – technologies & tools • The X-DARES-U project @ UBC • Summary, Related Work, & Future Challenges

  5. Semantic Web – what and why • SW = Web + a host of technologies. • XML and XML schema • Resource Description Framework (RDF) and RDF schema • Ontologies (domain specific) • Ontology languages (DAML+OIL, OWL, …) • Description logics • Key idea: semantically mark up your data (and functionality) • So, SW = semantic view of the world (web)

  6. Semantic Web – what and why (Info. discovery) • Elaborate, precise, automated searches. E.g., a search program correctly locates a person based on partial knowledge: last name = "Cook," works for a company on your client list, and has a son attending your alma mater, Avondale Univ. • Semantics will help automate complicated processes and transactions. • Tim Berners-Lee et al., Sci. Am., May 2001.

  7. Semantic web – what and why (value chain creation) • Lucy: “Pete, how about we take turns taking Mom to her physical therapy sessions?” • Pete: “Sure.” • “Pete’s agent, Go! Turn down the volume of Pete’s TV.” • [Figure: Lucy and Pete, with Pete’s noisy TV, connected through the Semantic Web.] • Tim Berners-Lee et al., Sci. Am., May 2001.

  8. • Lucy: “Pete, I will find a clinic within a 20-mile radius of my home and set up the plan for the two of us.” • Pete: “That’s great.” • “Lucy’s agent, Go! Retrieve the related information.” • “Pete’s Agent, Go! Give Pete’s schedule to Lucy’s agent.” • Doctor1’s agent and Doctor2’s agent each supply: address & available appointment slot. • Outcome: final plan. • [Figure: Lucy, Pete, and the agents connected through the Semantic Web.] • Tim Berners-Lee et al., Sci. Am., May 2001.

  9. Semantic Web – what and why “Is this rocket science? Well, not really. The Semantic Web, like the World Wide Web, is just taking well established ideas, and making them work interoperably over the Internet. This is done with standards, which is what the World Wide Web Consortium is all about. We are not inventing relational models for data, or query systems or rule-based systems. We are just webizing them. We are just allowing them to work together in a decentralized system - without a human having to custom handcraft every connection.” -- Tim Berners-Lee, Business Case for the Semantic Web, http://www.w3.org/DesignIssues/Business

  10. Overview • What is the Semantic Web and why bother with it? • The Web and Databases • SW – technologies & tools • The X-DARES-U project @ UBC • Summary, Related Work, & Future Challenges

  11. The Web and Databases

  12. The Web and Databases • Yet, it’s worthwhile bringing a “databasey” look and feel. • Semantic web initiative • Confluence of knowledge representation, AI, IR, DB, … • Spell out semantics via semantic markup.

  13. Overview • What is the Semantic Web and why bother with it? • The Web and Databases • SW – technologies & tools • The X-DARES-U project @ UBC • Summary, Related Work, & Future Challenges

  14. SW – Technologies & Tools [Figure courtesy of Ian Horrocks, CADE 2002.]

  15. SW – Technologies & Tools • XML & XML schema • RDF & RDF schema • Ontologies & Ontology description languages (OWL) • SOAP & WSDL [enhance value of SW]

  16. SW – T & T (XML) • Relevance of XML. Example:
      <movies>
        <film>
          <fid>F1</fid>
          <title>Manhattan Murder Mystery</title>
          <genre>satire</genre>
          <genre>mystery</genre>
          <actor><name>woody allen</name> <role>…</role></actor>
          <actor> … </actor>
        </film>
        …
      </movies>
      • No rigid schema, yet self-describing • Flexible description/exchange language • But, no semantics! • “schemaless” ain’t always good!
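The "self-describing, but no semantics" point can be seen in a minimal sketch that walks the example with Python's standard `xml.etree.ElementTree`. The sample below is modeled on the slide; the role value is a placeholder (the slide elides it) and the helper name `describe` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the slide's example. The role text is a placeholder,
# because the slide elides it.
MOVIES_XML = """
<movies>
  <film>
    <fid>F1</fid>
    <title>Manhattan Murder Mystery</title>
    <genre>satire</genre>
    <genre>mystery</genre>
    <actor><name>woody allen</name><role>placeholder role</role></actor>
  </film>
</movies>
"""

def describe(elem, depth=0):
    """Return (depth, tag, text) for every element, in document order."""
    text = (elem.text or "").strip()
    rows = [(depth, elem.tag, text)]
    for child in elem:
        rows.extend(describe(child, depth + 1))
    return rows

root = ET.fromstring(MOVIES_XML)
rows = describe(root)
genres = [g.text.strip() for g in root.iter("genre")]

# The structure can be discovered without any schema ...
for depth, tag, text in rows:
    print("  " * depth + tag + (": " + text if text else ""))
# ... but nothing here says what "genre" or "role" actually mean.
```

The tags are enough to navigate the data, yet a program still has no way to tell that another source's `category` element means the same thing as `genre`.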

  17. SW – T & T (XML schema) • Plain XML: no typing and integrity constraints! • Fix: DTD (initially) and then XML schema. • Example:
      <xs:element name="film">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="fid" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="title" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="genre" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
            …
          </xs:sequence>
        </xs:complexType>
      </xs:element>
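To make the occurrence constraints concrete, here is a small hand-rolled check of the `minOccurs`/`maxOccurs` bounds from the fragment above. It is only a sketch: it is not XML Schema validation (the Python standard library has no XSD validator), it ignores element order and types, and the names `FILM_RULES` and `violations` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# (min, max) occurrence bounds copied from the schema fragment;
# None stands for "unbounded".
FILM_RULES = {"fid": (1, 1), "title": (1, 1), "genre": (0, None)}

def violations(film):
    """Return (tag, count) for every child whose count breaks its bounds."""
    problems = []
    for tag, (low, high) in FILM_RULES.items():
        count = len(film.findall(tag))
        if count < low or (high is not None and count > high):
            problems.append((tag, count))
    return problems

good = ET.fromstring(
    "<film><fid>F1</fid><title>Manhattan Murder Mystery</title></film>"
)
bad = ET.fromstring(
    "<film><fid>F1</fid><fid>F2</fid><genre>satire</genre></film>"
)

print(violations(good))  # no violations expected
print(violations(bad))   # two fids and a missing title
```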

  18. SW – T & T (RDF) • XML’s main advantage: near-universal standard for data interchange (e.g., w/ tools to publish from files, spreadsheets, DB, … sources) • Yet, offers no semantics! • Your ZIP is my Postal Code • Your “name” and my “name” don’t mean the same • Besides, XML by itself doesn’t solve info. sharing and interoperability problems • Need common unambiguous vocabulary => RDF

  19. SW – T & T (RDF) • Syntax for describing data/resources on the web & relationships in terms of classes and properties

  20. SW – T & T (RDF) • Syntax for describing data/resources on the web & relationships in terms of classes and properties • Example: [Figure: RDF graph with URI nodes http://www.myspace.ca/f1 and http://www.myspace.ca/t1, edges labeled has_actor, title, name, and role, and literal values such as “Woody Allen”. Annotations: a URI can point to a description of the resource; literal values can be URIs too.]

  21. SW – T & T (RDF) • classes and resources (subjects): e.g., p1 is a person. • properties/predicates map subjects to objects: e.g., p1 has name “Woody Allen”. • subjects and predicates – associated with URIs. • objects – URIs or literal strings. • reification: predicates as resources – e.g.: • “domain of name is person”. • relationships between classes and between predicates (how to?) -- actor is a subclass of person -- (predicate) has_actor is a subset of involves -- (predicate) directed_by is a subset of involves =>
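The subject/predicate/object model and the subclass and sub-predicate relationships listed above can be sketched with plain tuples. This is an illustrative toy, not an RDF library: the short names stand in for URIs, the vocabulary words `type`, `subClassOf`, and `subPropertyOf` are assumed labels, and `closure` is a hypothetical helper.

```python
# Triples are (subject, predicate, object). Short names stand in for URIs;
# "p1", "f1", and the class/predicate names follow the slide's examples.
TRIPLES = {
    ("p1", "type", "actor"),
    ("p1", "name", "Woody Allen"),
    ("f1", "has_actor", "p1"),
    ("actor", "subClassOf", "person"),
    ("has_actor", "subPropertyOf", "involves"),
    ("directed_by", "subPropertyOf", "involves"),
}

def closure(triples):
    """Add triples implied by subClassOf / subPropertyOf until a fixpoint."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # x type C, C subClassOf D  =>  x type D
                if p == "type" and p2 == "subClassOf" and o == s2:
                    new.add((s, "type", o2))
                # x P y, P subPropertyOf Q  =>  x Q y
                if p2 == "subPropertyOf" and p == s2:
                    new.add((s, o2, o))
        if new <= facts:
            return facts
        facts |= new

inferred = closure(TRIPLES)
print(("p1", "type", "person") in inferred)
print(("f1", "involves", "p1") in inferred)
```

Because predicates such as `has_actor` appear as subjects of other triples, statements about predicates need no special machinery, which is the point of treating predicates as resources.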

  22. SW – T & T (RDF schema) • RDFS – provides RDF vocabulary description and type system. • similar to but different from OO languages’ type systems: property-centric vs. class-centric. • ontology description languages such as OWL build on it. • Example =>

  23. SW – T & T (RDF schema)
      <rdf:RDF xml:lang="en"
               xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
        <rdf:Description ID="registeredTo">
          <rdf:type resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
          <rdfs:domain rdf:resource="#MotorVehicle"/>
          <rdfs:range rdf:resource="#Person"/>
        </rdf:Description>
        <rdf:Description ID="rearSeatLegRoom">
          <rdf:type resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
          <rdfs:domain rdf:resource="#PassengerVehicle"/>
          <rdfs:domain rdf:resource="#Minivan"/>
          <rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Number"/>
        </rdf:Description>
      </rdf:RDF>

  24. SW – T & T (OWL) • RDF/RDF schema – too weak to completely describe semantics of application/data. • Role filled by languages like DAML+OIL, OWL. • E.g., how does an application know watch in one source, wristwatch in another, and clock in a third are closely related? • How does it know that curb and kerb are essentially the same thing? • More generally, need for relating various terms used in an app. domain and their boolean combos. => ontology.

  25. SW – T & T (OWL) Example: assume “standard” name spaces. E.g., rdf := “http://www.w3.org/1999/02/22-rdf-syntax-ns#”, owl := “http://www.w3.org/2002/07/owl#”, camera := “http://www.xfront.com/owl/ontologies/camera#”, … units.domain = Interval. units.range = Thing. cost.domain = PrchsbleItem. cost.range = Money. shutterspeed.domain = Camera. shutterspeed.range = Interval. focal-length = size. f-stop = aperture. owl:: Money ⊆ Thing. currency.rdfs:domain = Money. currency.rdfs:range = Thing. Interval ⊆ Thing. min.domain = Interval. min.range = xsd:float. max.domain = Interval. max.range = xsd:float.

  26. SW – T & T (OWL) • A slightly diff. perspective: • Type declarations: Money[currency->Thing]. Interval[min->float; max->float; units->Thing]. Camera[shutterspeed->Interval; size->…; aperture->…]. • Taxonomy (figure): Thing, Money, PrchsbleItem, Interval, Camera. • Equivalences: aperture = f-stop; focal-length = size; … • But, terms can refer to different namespaces.
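One way an application might use such equivalences is to rewrite every property name to a single canonical term before comparing records from different sources. The sketch below assumes that approach; the choice of canonical term, the sample records and their values, and the helper names are all hypothetical, and namespaces are ignored for brevity.

```python
# Equivalences taken from the slide. Which side counts as "canonical"
# is an arbitrary choice made here for illustration.
EQUIVALENT = {"f-stop": "aperture", "focal-length": "size"}

def canonical(term):
    """Map a term to its canonical equivalent (or leave it unchanged)."""
    return EQUIVALENT.get(term, term)

def normalize(record):
    """Rewrite a record's property names to canonical terms."""
    return {canonical(key): value for key, value in record.items()}

# Two hypothetical sources describing the same camera with different terms.
source_a = {"f-stop": "2.8", "focal-length": "50mm"}
source_b = {"aperture": "2.8", "size": "50mm"}

print(normalize(source_a) == normalize(source_b))
```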

  27. SW – T & T (SOAP) • Protocol for exchanging info. over http. • Platform & language independent. • XML-based. • Based on request and response. • E.g., getStockPrice: specify stockName and obtain stockPrice. • Mandatory & optional functions. • Many hops possible between sender and ultimate receiver w/ obligations for intermediate nodes. • RPC for web apps. Flexible.
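The getStockPrice request/response pattern can be sketched by building and reading the XML messages with the standard library. This only constructs message bodies, without sending anything over HTTP. The envelope namespace is the SOAP 1.1 one; the service namespace `http://example.org/stock`, the sample response, and the helper names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
STOCK_NS = "http://example.org/stock"  # hypothetical service namespace

def build_request(stock_name):
    """Build a getStockPrice request envelope as a string."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{STOCK_NS}}}getStockPrice")
    ET.SubElement(call, f"{{{STOCK_NS}}}stockName").text = stock_name
    return ET.tostring(envelope, encoding="unicode")

def read_price(response_xml):
    """Extract stockPrice from a response envelope."""
    root = ET.fromstring(response_xml)
    price = root.find(f".//{{{STOCK_NS}}}stockPrice")
    return float(price.text)

# A made-up response, shaped the way such a service might answer.
SAMPLE_RESPONSE = f"""
<env:Envelope xmlns:env="{SOAP_ENV}" xmlns:s="{STOCK_NS}">
  <env:Body>
    <s:getStockPriceResponse><s:stockPrice>34.5</s:stockPrice></s:getStockPriceResponse>
  </env:Body>
</env:Envelope>
"""

request = build_request("IBM")
print(request)
print(read_price(SAMPLE_RESPONSE))
```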

  28. SW – T & T (WSDL) • Distributed computing on the web. • Invoke remote method on your data. • Invoke remote method on data from some place else. • Obtain/provide data (XML). • WSDL spec. – XML doc describing service location and operations supported (and types). • Used in tandem with SOAP (or other protocols). • UDDI – web services registry.

  29. Overview • What is the Semantic Web and why bother with it? • The Web and Databases • SW – technologies & tools • The X-DARES-U project @ UBC • Summary, Related Work, & Future Challenges

  30. The X-DARES-U Project at UBC • Vision and Goals • Architecture • XML interoperability • Current status • Based on [Lakshmanan & Sadri ISWC ’03].

  31. X-DARES-U (vision & goals) • XML Data Warehouse with Semantic Enrichment project at UBC. • Leverage semantic web to provide interoperability between data sources and services and enable resource discovery. • Enable (partly virtual) warehouse of XML data with support for semantic views (SVs). • Use hierarchies for flexible, yet powerful data modeling and management. • OLAP style analysis and mining functionalities on XML data, leveraging SVs.

  32. X-DARES-U (architecture)

  33. [Figure: X-DARES-U architecture, local query phase. A Directory Server holds a Topic Hierarchy (Root; Buildings, Animals, People; Warehouse, Towers). Ontology Servers 1–3 hold ontology descriptions in “OWL”. Sources 1–4 (RDBMS, LDAP, Spreadsheet, XML) are exposed through Semantic Views 1–4 in RDF + RDF Schema. The Coordinator takes the user query “For each state, list the warehouse information in that state.”, and the figure shows the Global Query, Local Queries, and Local Intermediate Results.]

  34. [Figure: same architecture, inter-source phase. In addition to the Global Query and Local Intermediate Results, the figure shows Inter-source Queries, Inter-source results, and Final Results.]

  35. X-DARES-U Interoperability Here’s how source 1 models its data. [Figure: tree. store → warehouse* (@id, city, state) → item* (id, name, description).]

  36. X-DARES-U Interoperability Here’s how source 2 models its data. [Figure: tree. store → items → item* (id, name, desc., wid); store → warehouses → warehouse* (id, city, state). “@” marks attribute nodes in the figure.]

  37. X-DARES-U Interoperability Here’s how source 3 models its data. [Figure: tree. store → items → i-tuple* (id, name, desc.); store → warehouses → w-tuple* (id, city, state); store → inventory → inv-tuple* (i-id, w-id).]

  38. X-DARES-U Interoperability “Find distinct items available in warehouses in each state.” @!#$%&*() [Figure: the query posed against source1, source2, and source3, each marked “?”.]

  39. X-DARES-U Interoperability • Semantic-view predicates shared by source1, source2, and source3: item-id, item-name, item-desc, item-wh, wh-wid, wh-city, wh-state.

  40. X-DARES-U Interoperability
      FOR $S IN distinct(doc(…)/wh-state/tuple/state)
      RETURN <state> {$S}
        { FOR $X IN doc(…)/wh-state/tuple[state=$S],
              $Y IN doc(…)/i-wh/tuple[wh = $X/wh],
              $Z IN doc(…)/i-id/tuple[item = $Y/item]
          RETURN <item><id>{distinct($Z/itemId)}</id></item> }
      </state>
      [Figure: the coordinator evaluates wh-state Join item-wh Join item-id over source1, source2, and source3.]
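The three-way join in the query above can be sketched over plain tuples, which may help in seeing what the coordinator has to compute. The tuples, identifiers, and the function name `items_per_state` are invented for illustration; in the talk's setting each relation would be populated from the sources' semantic views.

```python
from collections import defaultdict

# Hypothetical tuples for three semantic-view predicates.
wh_state = [("w1", "NC"), ("w2", "NC"), ("w3", "WA")]   # (warehouse, state)
item_wh = [("i1", "w1"), ("i2", "w1"),
           ("i1", "w2"), ("i3", "w3")]                  # (item, warehouse)
item_id = [("i1", "ID-001"), ("i2", "ID-002"),
           ("i3", "ID-003")]                            # (item, itemId)

def items_per_state(wh_state, item_wh, item_id):
    """wh-state JOIN item-wh JOIN item-id, grouped by state, duplicates removed."""
    ids = dict(item_id)
    items_in = defaultdict(set)
    for item, wh in item_wh:
        items_in[wh].add(item)
    result = defaultdict(set)
    for wh, state in wh_state:
        for item in items_in.get(wh, ()):
            if item in ids:
                result[state].add(ids[item])
    return {state: sorted(found) for state, found in result.items()}

answer = items_per_state(wh_state, item_wh, item_id)
print(answer)
```

Item `i1` sits in two NC warehouses but is reported once, mirroring the "distinct items" requirement. Unlike the query above, this sketch omits states that have warehouses but no items.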

  41. X-DARES-U Interoperability • But who creates these semantic views and how? • Who: Local data source administrators. • How: we envisage SV authoring tools. • Additionally, queries and applications can leverage domain specific ontologies. • Several already exist or are in the offing: • camera.owl (www.xfront.com) • Dublin core (generic ontology for docs) • GPS coordinate, security, space shuttle, … (orlando.drc.com/SemanticWeb/Topics/Ontology/Ontologies.htm)

  42. X-DARES-U Interoperability Example – SV authoring for source1 [Figure: tree. store → warehouse* (@id, city, state) → item* (id, name, description).]

  43. X-DARES-U Interoperability Example – SV authoring for source1 [Figure: same source1 tree, next step of the build.]

  44. X-DARES-U Interoperability Example – SV authoring for source1 [Figure: same source1 tree, annotated with the rule below.] item-wh($I,$W) ← source1/store/warehouse $X, $X/@id $W, $X/item/id $I

  45. X-DARES-U Interoperability • Other predicates “populated” similarly: e.g., item-name($I,$N) ← source1/store/warehouse/item $X, $X/id $I, $X/name $N • Can use URI-generating functions to make it more faithful to RDF spirit: • Make all “id”s URIs (standardized) • Relate “local” id’s used by source to such URIs, e.g.: item-id(f_I($I),$I) ← source1/store/warehouse/item $I
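The rules for item-wh and item-name can be read as small extraction programs over source1's XML. Below is a minimal sketch of that reading using `xml.etree.ElementTree`; the sample document and its values are invented, and the Python functions are an illustrative stand-in for the rule syntax, not the project's mapping tool.

```python
import xml.etree.ElementTree as ET

# Invented sample data laid out the way source1 models it:
# store -> warehouse* (@id, city, state) -> item* (id, name, description).
SOURCE1_XML = """
<store>
  <warehouse id="w1">
    <city>Greensboro</city><state>NC</state>
    <item><id>i1</id><name>lamp</name><description>desk lamp</description></item>
    <item><id>i2</id><name>chair</name><description>office chair</description></item>
  </warehouse>
  <warehouse id="w2">
    <city>Seattle</city><state>WA</state>
    <item><id>i1</id><name>lamp</name><description>desk lamp</description></item>
  </warehouse>
</store>
"""

def item_wh(store):
    """item-wh($I,$W): $X ranges over store/warehouse, $W = $X/@id, $I = $X/item/id."""
    return {(item.findtext("id"), wh.get("id"))
            for wh in store.findall("warehouse")
            for item in wh.findall("item")}

def item_name(store):
    """item-name($I,$N): $X ranges over store/warehouse/item, $I = $X/id, $N = $X/name."""
    return {(item.findtext("id"), item.findtext("name"))
            for item in store.findall("warehouse/item")}

store = ET.fromstring(SOURCE1_XML)
wh_pairs = item_wh(store)
name_pairs = item_name(store)
print(sorted(wh_pairs))
print(sorted(name_pairs))
```

Sources 2 and 3 would need different path expressions on the right-hand side, but would populate the same predicates, which is what lets a single global query run over all of them.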

  46. X-DARES-U Interoperability • XML → RDF mapping tool: • User/admin chooses arguments for RDF predicates • “glue” given or inferred (w/ possible user interaction) • XSLT mapping program generated automatically • BUT, rule-based syntax is more convenient for reasoning.

  47. X-DARES-U Local Query Rewriting • Global query: p(X,Y) || q(Y,Z) • Coordinator handles 2 kinds of queries: • Local queries (both predicates from the same source i): p_i(X,Y) || q_i(Y,Z) • Inter-source queries (predicates from different sources i and j): p_i(X,Y) || q_j(Y,Z) • [Figure labels: Global Q, Coordinator, Source Query Rewriter, Src sem. view, Source access code, Ontology, Global, Local.]
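The split into local and inter-source queries can be illustrated with a toy example. It assumes, as the subscripts suggest but the slide does not state outright, that each global predicate is the union of per-source fragments and that the queries combine the predicates by a join on Y. The data and helper names are hypothetical.

```python
from itertools import product

# Hypothetical fragments of p(X, Y) and q(Y, Z) held by sources 1 and 2.
p = {1: {("a", 1)}, 2: {("b", 2)}}
q = {1: {(1, "x"), (2, "y")}, 2: {(2, "z")}}

def join(p_rows, q_rows):
    """Join p(X, Y) with q(Y, Z) on the shared variable Y."""
    return {(x, y, z) for (x, y) in p_rows for (y2, z) in q_rows if y == y2}

def plan(sources):
    """Split all (i, j) source pairs into local (i == j) and inter-source."""
    pairs = list(product(sources, repeat=2))
    local = [(i, j) for i, j in pairs if i == j]
    inter = [(i, j) for i, j in pairs if i != j]
    return local, inter

local, inter = plan(sorted(p))
local_answers = set().union(*(join(p[i], q[j]) for i, j in local))
inter_answers = set().union(*(join(p[i], q[j]) for i, j in inter))

# Reference answer: join the unions of all fragments directly.
global_answers = join(set().union(*p.values()), set().union(*q.values()))
print(local_answers | inter_answers == global_answers)
```

Local queries can be evaluated entirely at one source, while inter-source queries force data to move, which is why the choice of evaluation strategy matters for cost.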

  48. X-DARES-U Local Query Rewriting • Generation of inter-source (IS) queries – similar. • Space of strategy options: • Materialize all predicates at coordinator & evaluate locally. • Materialize nothing but answers. • Partial materialization. • Choice must be cost-based • Dynamic programming approach • (Some) inter-source queries eliminable

  49. X-DARES-U Local Query Rewriting
      FOR $S IN distinct(doc(…)/wh-state/tuple/state)
      RETURN <state> {$S}
        { FOR $X IN doc(…)/wh-state/tuple[state=$S],
              $Y IN doc(…)/i-wh/tuple[wh = $X/wh],
              $Z IN doc(…)/i-id/tuple[item = $Y/item]
          RETURN <item><id>{distinct($Z/itemId)}</id></item> }
      </state>
      [Figure: source1 tree. store → warehouse* (@id, city, state) → item* (id, name, description).]

  50. X-DARES-U Local Query Rewriting
      FOR $S IN source1/store/warehouse/state
      RETURN <state> {$S}
        { FOR $X IN doc(…)/wh-state/tuple[state=$S],
              $Y IN doc(…)/i-wh/tuple[wh = $X/wh],
              $Z IN doc(…)/i-id/tuple[item = $Y/item]
          RETURN <item><id>{distinct($Z/itemId)}</id></item> }
      </state>
      [Figure: source1 tree. store → warehouse* (@id, city, state) → item* (id, name, description).]
