
XML and the Semi-Structured Data Model


eljah

Presentation Transcript


  1. XML and the Semi-Structured Data Model

  2. Motivation
     • We have seen that relational databases are very convenient to query. However:
     • There is a lot of data that is not in relational databases.
     • Perhaps the most widely accessed database is the web, and it certainly isn’t a relational database.

  3. Documents Vs. Databases

  4. Querying the Web
     • The web can be queried using a search engine. However, we can’t ask questions like:
       • What is the weather in Zanzibar today?
       • What is the lowest price for which a Jaguar is sold on the web?
     • Problems:
       • There are no facilities for asking complex questions, such as ones that aggregate data
       • Words have overloaded meanings (Jaguar)

  5. Understanding the Web
     • In order to query the web, we must be able to understand it.
     • Two computer science approaches:
       • Artificial Intelligence Approach
       • Database Approach

  6. Artificial Intelligence Approach
     “The web is unstructured and we must deal with it”
     • Use machine learning techniques to understand the web.
     • Example: To understand the word “Jaguar”, check whether it appears on a page with the words “car” or “automobile”, or rather with “jungle” or “rainforest”
     • Problem: Such techniques tend to be inexact and make a large percentage of mistakes

  7. Database Approach
     “The web is unstructured and we will structure it”
     • Sometimes problems that are very difficult can be solved easily by enforcing a standard
     • Encourage the use of XML as a standard for data exchange on the web

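  8. Example XML Document
     <?xml version="1.0"?>
     <transaction>
       <account>89-344</account>
       <buy shares="100">
         <ticker exch="NASDAQ">WEBM</ticker>
       </buy>
       <sell shares="30">
         <ticker exch="NYSE">GE</ticker>
       </sell>
     </transaction>
     The slide labels an opening tag, a closing tag, an element, an attribute name and an attribute value. For instance, <buy shares="100"> is an opening tag, </buy> is its closing tag, shares is an attribute name and 100 is its value.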

  9. XML Representation of a Table
     <?xml version="1.0"?>
     <ROWSET>
       <ROW num="1">
         <ENAME>KING</ENAME>
         <SAL>5000</SAL>
       </ROW>
       <ROW num="2">
         <ENAME>SCOTT</ENAME>
         <SAL>3000</SAL>
       </ROW>
     </ROWSET>
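A minimal Python sketch of how a table could be turned into this ROWSET/ROW layout. The function name `rows_to_rowset` is hypothetical, and the standard-library `xml.etree.ElementTree` module stands in for whatever tool actually produced the slide's output.

```python
import xml.etree.ElementTree as ET


def rows_to_rowset(columns, rows):
    """Build a ROWSET element: one ROW per tuple, one child element per column."""
    rowset = ET.Element("ROWSET")
    for num, row in enumerate(rows, start=1):
        row_el = ET.SubElement(rowset, "ROW", {"num": str(num)})
        for name, value in zip(columns, row):
            # Column names become upper-case tag names, as on the slide.
            ET.SubElement(row_el, name.upper()).text = str(value)
    return rowset


rowset = rows_to_rowset(["ename", "sal"], [("KING", 5000), ("SCOTT", 3000)])
xml_text = ET.tostring(rowset, encoding="unicode")
print(xml_text)
```

Each tuple becomes a ROW, and each column value becomes a child element named after its column.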

  10. Very Unstructured XML
      <?xml version="1.0"?>
      <DamageReport>
        The insured’s <Vehicle Make="Volks">Beetle</Vehicle> broke through the
        guard rail and plummeted into the ravine. The cause was determined to be
        <Cause>faulty brakes</Cause>. Amazingly there were no casualties.
      </DamageReport>

  11. XML Vs. HTML
      • XML and HTML are siblings: both derive from SGML.
      • HTML has a fixed set of tag and attribute names, each associated with a specific meaning
      • XML can have any tag and attribute names. These are not associated with any predefined meaning
      • HTML is used to specify visual style
      • XML is used to specify meaning

  12. Rules for Creating XML Documents

  13. Rule 1 – XML Declaration
      • An XML document should begin with an XML declaration. A simple declaration is:
        <?xml version="1.0"?>
      • Other things can be specified in the declaration, such as the character encoding.

  14. Rule 2 – Document Element
      • Use exactly one top-level document element.
      Legal example:
        <?xml version="1.0"?>
        <Question> This is legal </Question>
      Illegal example (two top-level elements):
        <?xml version="1.0"?>
        <Question> Is this legal? </Question>
        <Answer> No. </Answer>

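  15. Rule 3 – Match Opening and Closing Tags
      • Every opening tag needs a closing tag with exactly the same name, and elements must be properly nested.
      • XML is case sensitive.
      The following examples are both illegal:
        <Question> Is this legal? </QUESTION>
        <Question> <B> Is this legal? </Question> </B>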
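Rules 2 and 3 can be checked mechanically with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the example labels are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


examples = {
    "one document element": "<Question> This is legal </Question>",
    "two top-level elements": "<Question> Is this legal? </Question><Answer> No. </Answer>",
    "case mismatch": "<Question> Is this legal? </QUESTION>",
    "bad nesting": "<Question> <B> Is this legal? </Question> </B>",
}
results = {label: is_well_formed(text) for label, text in examples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the first example is accepted; the parser rejects the other three with a parse error.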

  16. Rule 4 – Comments
      • Comments are written between <!-- and -->.
      • Comments can’t appear as attribute values or within a tag.
      Example:
        <!-- This is a legal comment -->
        <Question <!-- This is illegal -->>
          Why is this illegal?
          <!-- This is a legal comment -->
        </Question>

  17. Rule 5 – Element Names
      • Element and attribute names must be continuous sequences of letters, digits, hyphens or underscores, with no spaces. A name must start with a letter or an underscore, not with a digit.
      Legal names:
        <_legal> <This-is-OK>
      Illegal names:
        <2-Part-Question>
        <Two Part Question>
        <Question 4You="Yes">

  18. Rule 6 – Attribute Values
      • Attribute values
        • go in opening tags
        • should be enclosed in matching quotes (' or ")
        • should contain only text and not tags
      Legal examples:
        <Question Poster="Yitzchak">Do you like XML?</Question>
        <Answer Poster='Yaakov'>I do.</Answer>

  19. Rule 6 – Continued
      Illegal examples:
        <Question Poster="Yitzchak'>Do you like XML?</Question>  (quotes do not match)
        <Question>Do you like XML?</Question Poster="Yitzchak">  (attribute in a closing tag)
        <Question Poster="<first>Yitzchak</first>">Do you like XML?</Question>  (tags inside an attribute value)

  20. Rule 7 – Empty Elements
      • Empty elements are elements that do not contain text or nested elements. They can be written in a compact syntax:
        <Person First="Shmuel" Last="Levy"></Person>
        is the same as
        <Person First="Shmuel" Last="Levy" />

  21. Abstract View of XML

  22. A Different Data Model

  23. An Example
      <?xml version="1.0"?>
      <transaction>
        <account>89-344</account>
        <buy shares="100">
          <ticker exch="NASDAQ">WEBM</ticker>
        </buy>
        <sell shares="30">
          <ticker exch="NYSE">GE</ticker>
        </sell>
      </transaction>

  24. Corresponding Tree
      transaction
        account
          89-344
        buy
          shares
            100
          ticker
            exch
              NASDAQ
            WEBM
        sell
          shares
            30
          ticker
            exch
              NYSE
            GE
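The tree view can be recovered from the document text by parsing it and walking the result. This is a sketch using Python's standard `xml.etree.ElementTree`; the `outline` helper and its `@name = value` notation for attributes are illustrative choices, not part of the slides.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<transaction>
  <account>89-344</account>
  <buy shares="100"><ticker exch="NASDAQ">WEBM</ticker></buy>
  <sell shares="30"><ticker exch="NYSE">GE</ticker></sell>
</transaction>"""


def outline(element, depth=0):
    """Return indented lines: element name, then attributes, text and children."""
    lines = ["  " * depth + element.tag]
    for name, value in element.attrib.items():
        lines.append("  " * (depth + 1) + f"@{name} = {value}")
    text = (element.text or "").strip()
    if text:
        lines.append("  " * (depth + 1) + repr(text))
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(DOC))
print("\n".join(tree_lines))
```

Elements, attributes and text all appear as nodes: attributes and text hang off the element that owns them, and nested elements become subtrees.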

  25. Using XML
      • Querying XML: There are query languages that query XML and return XML. Examples: XQuery, XPath, SQL4X
      • Displaying XML: An XML document can have an associated style sheet which specifies how the document should be translated to HTML. Examples: CSS, XSL
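As a small illustration of querying, Python's standard `xml.etree.ElementTree` supports a limited subset of XPath. The sketch below runs three path expressions against the transaction document from the earlier slides; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

DOC = """<transaction>
  <account>89-344</account>
  <buy shares="100"><ticker exch="NASDAQ">WEBM</ticker></buy>
  <sell shares="30"><ticker exch="NYSE">GE</ticker></sell>
</transaction>"""

root = ET.fromstring(DOC)

# All ticker elements, at any depth below the root.
all_tickers = [t.text for t in root.findall(".//ticker")]

# Only the tickers whose exch attribute equals NYSE.
nyse_tickers = [t.text for t in root.findall(".//ticker[@exch='NYSE']")]

# The shares attribute of every buy element directly under the root.
bought_shares = [int(b.get("shares")) for b in root.findall("./buy")]

print(all_tickers, nyse_tickers, bought_shares)
```

A full XPath or XQuery engine offers much more (functions, axes, aggregation); this subset only shows the path-and-predicate style of selection.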

  26. Namespaces
      • Namespaces are used to attach an accepted meaning to a set of tags.
      • Syntax for defining a namespace:
        <SomeElement xmlns:prefixname="namespaceURL">
        The namespace will be recognized within the SomeElement element.

  27. Example Namespace
      <irs:Form id="1040" xmlns:irs="http://www.irs.gov">
        <irs:Name>Tina Wells</irs:Name>
        <PhoneNumber>03-5655666</PhoneNumber>
      </irs:Form>
      • In order for the namespace to be recognized in all elements, the declaration should be in the document element
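A sketch of how a namespace-aware parser sees such a document, using Python's standard `xml.etree.ElementTree` and the form example written with a matching `</irs:Form>` closing tag. The parser replaces each prefix with the namespace URL, so prefixed and unprefixed tags are clearly distinguished.

```python
import xml.etree.ElementTree as ET

DOC = """<irs:Form id="1040" xmlns:irs="http://www.irs.gov">
  <irs:Name>Tina Wells</irs:Name>
  <PhoneNumber>03-5655666</PhoneNumber>
</irs:Form>"""

root = ET.fromstring(DOC)

# ElementTree writes qualified names as {namespace-URL}local-name.
tags = [root.tag] + [child.tag for child in root]

# When searching, a prefix-to-URL mapping lets us use the short prefix again.
ns = {"irs": "http://www.irs.gov"}
name = root.find("irs:Name", ns).text

print(tags)
print(name)
```

`PhoneNumber` has no prefix and there is no default namespace declaration, so it stays outside the `irs` namespace.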

  28. XSQL Pages

  29. What are XSQL Pages?
      • XSQL pages are XML documents that have SQL queries embedded in them.
      • When a user requests to view an XSQL page, the web server:
        • Dynamically computes the embedded queries
        • Translates the query results into XML
        • Inserts the results in the proper places in the document
        • Transforms the result to HTML if a stylesheet is given

  30. A Simple Example
      <?xml version="1.0"?>
      <xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql">
        SELECT sname FROM Sailors
      </xsql:query>
      • You should specify the connection and the namespace on the document element

  31. Page Seen in Browser
      <?xml version="1.0"?>
      <ROWSET>
        <ROW num="1">
          <SNAME>Rusty</SNAME>
        </ROW>
        <ROW num="2">
          <SNAME>Justin</SNAME>
        </ROW>
      </ROWSET>
      • A ROWSET element encloses the query result
      • A ROW element encloses each row
      • Each column value in a row is enclosed in a tag named after its column

  32. Another Example
      <?xml version="1.0"?>
      <RESULTS connection="scott" xmlns:xsql="urn:oracle-xsql">
        Here is something interesting:
        <xsql:query>
          SELECT sname, age + rating as ra
          FROM Sailors
          WHERE sid = 13
        </xsql:query>
      </RESULTS>

  33. Resulting Document
      <?xml version="1.0"?>
      <RESULTS>
        Here is something interesting:
        <ROWSET>
          <ROW num="1">
            <SNAME>Rusty</SNAME>
            <RA>55</RA>
          </ROW>
        </ROWSET>
      </RESULTS>

  34. Using Parameters
      • Your page can use parameters. The value of a parameter param is the first of the following that applies:
        • The value of the URL parameter param, if supplied
        • The value of the HTTP session object param, if supplied
        • The value of the closest ancestor’s attribute named param, if present
        • An empty string
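The lookup order above can be sketched as a small Python function. This only models the rule as stated on the slide; `resolve_param` and its arguments are hypothetical names, not part of XSQL.

```python
def resolve_param(name, url_params, session, ancestor_attrs):
    """Resolve a parameter using the precedence listed on the slide.

    ancestor_attrs is a list of attribute dicts, closest ancestor first.
    """
    if name in url_params:          # 1. URL parameter
        return url_params[name]
    if name in session:             # 2. HTTP session object
        return session[name]
    for attrs in ancestor_attrs:    # 3. closest ancestor attribute
        if name in attrs:
            return attrs[name]
    return ""                       # 4. empty string


ancestors = [{"sname": "Joe"}]
from_url = resolve_param("sname", {"sname": "Jim"}, {}, ancestors)
from_default = resolve_param("sname", {}, {}, ancestors)
missing = resolve_param("rating", {}, {}, ancestors)
print(repr(from_url), repr(from_default), repr(missing))
```

An attribute on an enclosing element therefore acts as a default that a URL parameter can override.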

  35. Example with Parameters
      <?xml version="1.0"?>
      <xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql" sname="Joe">
        SELECT * FROM Sailors
        WHERE sname = '{@sname}'
      </xsql:query>

  36. Evaluating the Query
      • Suppose the XSQL document is at: http://cs.huji.ac.il/~db/query1.xsql
      • Then, requesting the URL http://cs.huji.ac.il/~db/query1.xsql?sname=Jim will return all the details of Jim.
      • Requesting http://cs.huji.ac.il/~db/query1.xsql will return all the details of Joe (the default value)
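A rough Python model of the `{@param}` substitution described here. It is a sketch of the textual replacement only: the `substitute` helper is hypothetical, and the regular expression is an assumption about what a parameter name may contain.

```python
import re

TEMPLATE = "SELECT * FROM Sailors WHERE sname = '{@sname}'"


def substitute(template, params):
    """Replace each {@name} with its value, or an empty string if unknown."""
    return re.sub(
        r"\{@([\w-]+)\}",
        lambda match: params.get(match.group(1), ""),
        template,
    )


defaults = {"sname": "Joe"}            # attribute on the xsql:query element
url_params = {"sname": "Jim"}          # from ...query1.xsql?sname=Jim

with_url = substitute(TEMPLATE, {**defaults, **url_params})
without_url = substitute(TEMPLATE, defaults)
print(with_url)
print(without_url)
```

Merging the dictionaries with the URL values last reproduces the behaviour on the slide: the URL value wins when present, and the attribute value is used otherwise.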

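  37. A Strange Example
      <?xml version="1.0"?>
      <xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
                  select="*" where="1=1" order="1">
        SELECT {@select}
        FROM {@from}
        WHERE {@where}
        ORDER BY {@order}
      </xsql:query>
      Every clause is a parameter. select, where and order have default values; from has none, so it must be supplied with the request.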

  38. Customizing Results
      • The query tag can have different attributes that customize the query results. Here are some of the important options:
        • max-rows: The maximum number of rows returned
        • skip-rows: The number of rows to skip before returning rows
        • rowset-element: The name of the rowset element
        • row-element: The name of the row element

  39. Customizing Results
      <?xml version="1.0"?>
      <xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
                  skip="0" max-rows="2" skip-rows="{@skip}">
        SELECT * FROM Program
        ORDER BY url
      </xsql:query>
      By calling the same page with different values for skip, we can page through the programs two at a time.
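The combined effect of skip-rows and max-rows amounts to taking a slice of the ordered result. A minimal Python sketch, with a hypothetical `page_of_rows` helper and made-up row values:

```python
def page_of_rows(rows, skip_rows=0, max_rows=None):
    """Skip the first skip_rows rows, then return at most max_rows rows."""
    end = None if max_rows is None else skip_rows + max_rows
    return rows[skip_rows:end]


# Made-up, already ordered result rows standing in for the Program table.
programs = ["a.html", "b.html", "c.html", "d.html", "e.html"]

first_page = page_of_rows(programs, skip_rows=0, max_rows=2)
second_page = page_of_rows(programs, skip_rows=2, max_rows=2)
last_page = page_of_rows(programs, skip_rows=4, max_rows=2)
print(first_page, second_page, last_page)
```

Requesting the page with skip set to 0, 2, 4 and so on walks through the result in pages of two rows; the ORDER BY matters because paging only makes sense over a stable order.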

  40. Notes
      • An XSQL document can have many queries.
      • The queries can appear within arbitrary XML tags.
      • We can produce XML that has a more nested structure using the CURSOR function...

  41. Remembering Subqueries in the SELECT Clause
      • Subqueries in the SELECT clause must return a single value. What do we do if, for each boat, we want all the sailors who reserved that boat?
      • We want each bid to be associated with a table of Sailors data!

  42. Using the CURSOR Function
      <?xml version="1.0"?>
      <xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql">
        SELECT bid, CURSOR(SELECT S.sid, S.sname
                           FROM Sailors S, Reserves R
                           WHERE S.sid = R.sid and R.bid = B.bid) as Reservers
        FROM Boats B
      </xsql:query>

  43. Resulting Document
      Note that the alias of the CURSOR column (Reservers) names the inner elements, in place of the usual ROWSET and ROW tags.
      <?xml version="1.0"?>
      <ROWSET>
        <ROW num="1">
          <BID>113</BID>
          <RESERVERS>
            <RESERVERS_ROW num="1">
              <SID>13</SID>
              <SNAME>Joe</SNAME>
            </RESERVERS_ROW>
            <RESERVERS_ROW num="2"> .... </RESERVERS_ROW>
          </RESERVERS>
        </ROW>
      </ROWSET>
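To make the nesting convention concrete, here is a Python sketch that builds the same shape of document from already grouped data. It only mimics the output format shown on the slide: `nested_rowset` is a hypothetical helper, and the second sailor in the sample data is invented.

```python
import xml.etree.ElementTree as ET


def nested_rowset(boats, alias="RESERVERS"):
    """Build ROWSET/ROW elements with an inner <alias>/<alias>_ROW table per boat."""
    rowset = ET.Element("ROWSET")
    for num, (bid, sailors) in enumerate(boats, start=1):
        row = ET.SubElement(rowset, "ROW", {"num": str(num)})
        ET.SubElement(row, "BID").text = str(bid)
        inner = ET.SubElement(row, alias)
        for inner_num, (sid, sname) in enumerate(sailors, start=1):
            inner_row = ET.SubElement(inner, alias + "_ROW", {"num": str(inner_num)})
            ET.SubElement(inner_row, "SID").text = str(sid)
            ET.SubElement(inner_row, "SNAME").text = sname
    return rowset


# Boat 113 with two reservers; the second one is made-up sample data.
doc = nested_rowset([(113, [(13, "Joe"), (22, "Dustin")])])
reserver_names = [e.text for e in doc.findall("./ROW/RESERVERS/RESERVERS_ROW/SNAME")]
print(ET.tostring(doc, encoding="unicode"))
```

The inner table lives inside the row it belongs to, which is exactly what a flat relational result cannot express directly.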

  44. Setting Page Level Parameters
      • The following statement defines a parameter pname. The value of pname is the value in the first column of the first row of the query result.
      • The parameter pname will be recognized throughout the page.
        <xsql:set-page-param name="pname">
          SELECT Statement
        </xsql:set-page-param>

  45. Example
      <?xml version="1.0"?>
      <page connection="scott" xmlns:xsql="urn:oracle-xsql">
        <xsql:set-page-param name="num-stories">
          SELECT headings_num
          FROM user_prefs
          WHERE userid={@user}
        </xsql:set-page-param>
        <xsql:query max-rows="{@num-stories}">
          SELECT title, url
          FROM latest_news
        </xsql:query>
      </page>

  46. Another Way to Define a Page Level Parameter
      • Page level parameters can also be set with the statement:
        <xsql:set-page-param name="pname" value="val"/>
      • For example:
        <xsql:set-page-param name="num-stories" value="10"/>

  47. Additional Options
      • The set-page-param element can have the following attributes:
        • only-if-unset: If the value is “yes”, the parameter will be set only if it does not already have a value
        • ignore-empty-value: If the value is “yes”, the parameter will be set only if the new value is not an empty string
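The two options can be modelled in a few lines of Python. This is a sketch of the semantics as described on the slide, not of the XSQL implementation; `set_page_param` is a hypothetical function, and treating an empty string as "no value" is an assumption.

```python
def set_page_param(params, name, value, only_if_unset=False, ignore_empty_value=False):
    """Set params[name] unless one of the two options says to leave it alone."""
    # Assumption: a parameter that is absent or empty counts as having no value.
    if only_if_unset and params.get(name, "") != "":
        return params
    if ignore_empty_value and value == "":
        return params
    params[name] = value
    return params


params = {}
set_page_param(params, "num-stories", "10")
set_page_param(params, "num-stories", "20", only_if_unset=True)      # kept at 10
set_page_param(params, "num-stories", "", ignore_empty_value=True)   # kept at 10
after_options = dict(params)
set_page_param(params, "num-stories", "")                            # overwritten
print(after_options, params)
```

Together the options let a page supply a fallback without clobbering a value that is already set, and without replacing a value by an empty query result.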

  48. Setting Cookie Values
      • The following statement sets a cookie named pname. Its value is the value in the first column of the first row of the query result.
      • The value of pname will be recognized until the cookie expires.
        <xsql:set-cookie name="pname">
          SELECT Statement
        </xsql:set-cookie>

  49. Additional Attributes for Set-Cookie
      • The set-cookie element can have the following attributes:
        • max-age: The number of seconds before the cookie expires (by default, the cookie expires when the user exits the current browser instance)
        • only-if-unset
        • ignore-empty-value

  50. Example
      <?xml version="1.0"?>
      <page connection="scott" xmlns:xsql="urn:oracle-xsql">
        <xsql:set-cookie name="siteuser" max-age="31536000"
                         only-if-unset="yes" ignore-empty-value="yes">
          SELECT username
          FROM site_users
          WHERE username='{@username}' and password='{@password}'
        </xsql:set-cookie>
        <!-- Other Actions Here -->
      </page>
