CS 245: Database System Principles Notes 14: Coping with Limited Capabilities of Sources


Presentation Transcript


  1. CS 245: Database System Principles
     Notes 14: Coping with Limited Capabilities of Sources
     Hector Garcia-Molina

  2. Heterogeneous Databases
     [Diagram: Distributed Database System; sources DBMS1, DBMS2, legacy, web site, each with its own data]

  3. Limited Capabilities

  4. Example: Amazon.com
     Search form fields: author, title, subject, format, price
     Annotations on the form:
     • must specify at least one of these
     • this attribute not returned
     • format: menu of choices
     • price: cannot query on this attribute

  5. Example: BarnesAndNoble.com
     Search form fields: author, title, subject, format, price
     Annotations on the form:
     • must specify at least one of these
     • menu of choices
     • price: can query if one of the other attributes is specified

  6. Why Limited Capabilities?
     • Search forms
     • Security
     • Indexes
     • Legacy

  7. Capability vs. Content
     • Capability description
       • Can only search for subject = “art,” “history,” “science”
     • Content description
       • Source only contains subject = “art,” “history,” “science”

  8. Outline
     • Describing source capabilities
     • Extending source capabilities
     • How mediators cope with limited capabilities
     • Mediator capabilities
     • Other topics
     [Diagram: one mediator over three sources]

  9. Describing Query Capabilities
     R(X, Y, ... Z)
     • Adornments:
       • f: may or may not specify
       • u: cannot be specified
       • b: must be specified
       • c[S]: specified from list S
       • o[S]: optional, choose from S

  10. Describing Query Capabilities
      R(X, Y, ... Z)
      • Adornments:
        • f: may or may not specify
        • u: cannot be specified
        • b: must be specified
        • c[S]: specified from list S
        • o[S]: optional, choose from S
      • With output restriction:
        • f’
        • u’
        • b’
        • c’[S]
        • o’[S]

  11. Example
      • Relation R(X, Y, Z)
      • Description templates: bu’f, uf’c[z1, z2]
      • Answerable queries: R(x1, Y, Z), R(X, Y, z1)
      • Unanswerable queries: R(X, y1, Z), R(X, Y, z3)
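The example on slide 11 can be checked mechanically. The following is a minimal sketch, not the course's own code: the names `Adornment`, `matches`, and `answerable` are invented for illustration, an unspecified attribute is represented as `None`, and the primes (output restrictions) are ignored because the sketch only tests which bindings a template accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adornment:
    """One adornment from slide 9 (hypothetical representation)."""
    kind: str                          # one of "f", "u", "b", "c", "o"
    choices: frozenset = frozenset()   # the list S for c[S] and o[S]


def attr_ok(adornment, value):
    """Check one attribute; value is None when the query leaves it unspecified."""
    if adornment.kind == "f":          # may or may not specify
        return True
    if adornment.kind == "u":          # cannot be specified
        return value is None
    if adornment.kind == "b":          # must be specified
        return value is not None
    if adornment.kind == "c":          # must be specified, from list S
        return value in adornment.choices
    if adornment.kind == "o":          # optional, but from list S if given
        return value is None or value in adornment.choices
    raise ValueError(f"unknown adornment kind: {adornment.kind}")


def matches(template, query):
    """A query matches a template if every attribute satisfies its adornment."""
    return len(template) == len(query) and all(
        attr_ok(a, v) for a, v in zip(template, query)
    )


def answerable(templates, query):
    """A query is answerable if at least one description template accepts it."""
    return any(matches(t, query) for t in templates)


# Templates bu'f and uf'c[z1, z2] from slide 11, with primes dropped.
TEMPLATES = [
    (Adornment("b"), Adornment("u"), Adornment("f")),
    (Adornment("u"), Adornment("f"), Adornment("c", frozenset({"z1", "z2"}))),
]

print(answerable(TEMPLATES, ("x1", None, None)))   # R(x1, Y, Z)
print(answerable(TEMPLATES, (None, "y1", None)))   # R(X, y1, Z)
```

Under these assumptions the two answerable queries on the slide are accepted and the two unanswerable ones are rejected.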

  12. Other Description Mechanisms
      • Tsimmis
        • query templates
      • Information Manifold
        • capability records (# bound attrs, conditions ok, ...)
      • Disco
      • Garlic
        • black box
      • Context-free grammars

  13. Extending Source Capabilities
      Query: author = “Freud” AND price > 10
      [Diagram: the query goes to a wrapper placed in front of the amazon source]
      Source: R(author, price, ...)
      Template: b, u, ...

  14. Extending Source Capabilities
      Query: author = “Freud” AND price > 10
      • Source query (wrapper to amazon): author = “Freud”
      • Wrapper filter: price > 10
      Source: R(author, price, ...)
      Template: b, u, ...
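The split on slide 14 can be sketched as follows. This is an illustration only: `CATALOG`, `source_query`, and `wrapper_query` are hypothetical names, the catalog rows are made up, and the source is modeled as accepting nothing but a bound author (template b, u).

```python
# Made-up rows standing in for the source relation R(author, price, ...).
CATALOG = [
    {"author": "Freud", "title": "Book A", "price": 8.0},
    {"author": "Freud", "title": "Book B", "price": 15.0},
    {"author": "Jung", "title": "Book C", "price": 12.0},
]


def source_query(author):
    """Hypothetical source with template b, u: author required, price not accepted."""
    if author is None:
        raise ValueError("author must be specified")
    return [row for row in CATALOG if row["author"] == author]


def wrapper_query(author, min_price):
    """Answer author = ... AND price > min_price on top of the limited source."""
    rows = source_query(author)                               # source query
    return [row for row in rows if row["price"] > min_price]  # wrapper filter


print(wrapper_query("Freud", 10))
```

The wrapper pushes down the condition the source supports and applies the remaining condition itself, so the caller sees a source that appears to support price conditions.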

  15. Another Example
      Query: (author = “Freud” OR author = “Jung”) AND price < 10
      [Diagram: the query goes to a wrapper placed in front of the Barnes&Noble source]
      Source: R(author, price, ...)
      • No disjunctive conditions
      • Price can only be specified with author

  16. Another Example
      Query: (author = “Freud” OR author = “Jung”) AND price < 10
      • Source queries (wrapper to Barnes&Noble):
        • Q1: author = “Freud” AND price < 10
        • Q2: author = “Jung” AND price < 10
      • Wrapper: union operation over the answers
      Source: R(author, price, …)
      • No disjunctive conditions
      • Price can only be specified with author
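The rewriting on slide 16 can be sketched as follows. The names `bn_query` and `wrapper_disjunction` and the rows in `CATALOG` are assumptions made for illustration; the source is modeled as accepting one author per call, with an optional price bound that is only usable together with an author.

```python
# Made-up rows (author, price, title) standing in for R(author, price, ...).
CATALOG = [
    ("Freud", 8.0, "Book A"),
    ("Freud", 15.0, "Book B"),
    ("Jung", 9.0, "Book C"),
    ("Adler", 5.0, "Book D"),
]


def bn_query(author, max_price=None):
    """Hypothetical source: one author per call, price only alongside an author."""
    if author is None:
        raise ValueError("author must be specified")
    rows = [row for row in CATALOG if row[0] == author]
    if max_price is not None:
        rows = [row for row in rows if row[1] < max_price]
    return rows


def wrapper_disjunction(authors, max_price):
    """Split the OR into one source query per author, then union the answers."""
    seen = set()
    result = []
    for author in authors:                       # Q1, Q2, ...
        for row in bn_query(author, max_price):
            if row not in seen:                  # union removes duplicates
                seen.add(row)
                result.append(row)
    return result


print(wrapper_disjunction(["Freud", "Jung"], 10))
```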

  17. Extending Source Capabilities
      • General scheme:
        • try many query rewritings
        • check if query fragments are supported by the source
        • check if the wrapper can combine answer fragments
        • do all this very efficiently!! [See ICDE99 paper]
      • Tsimmis, Info Manifold: no disjunctive queries
      • DISCO: no query splitting
      • Garlic: only CNF queries

  18. Mediator Processing
      Query: M(5, Y, Z, W, 3)
      Mediator: M(X, Y, Z, W, U) = Join(R, T)
      Sources:
      • T(Z, W, U), adornments f, u, b
      • R(X, Y, Z), adornments f, f, b

  19. Plan 1
      Query: M(5, Y, Z, W, 3)
      Mediator: M(X, Y, Z, W, U) = Join(R, T)
      (1) R(5, Y, Z)
      (2) T(Z, W, 3)
      (3) Join answers
      Sources:
      • T(Z, W, U), adornments f, u, b
      • R(X, Y, Z), adornments f, f, b

  20. Plan 2
      Query: M(5, Y, Z, W, 3)
      Mediator: M(X, Y, Z, W, U) = Join(R, T)
      (1) P = T(Z, W, 3)
      (2) for each (z, w, u) ∈ P: R(5, Y, z)
      (3) Join answers
      Sources:
      • T(Z, W, U), adornments f, u, b
      • R(X, Y, Z), adornments f, f, b
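Plan 2 can be sketched as a dependent join in which the values of Z returned by T are fed into R. The data and the function names `source_R`, `source_T`, and `plan2` are invented for illustration, and the adornments are read as T: f, u, b and R: f, f, b. Under that reading, step (1) of Plan 1 would be rejected by R because it leaves Z unbound, which the sketch models by raising an error.

```python
# Made-up contents for R(X, Y, Z) and T(Z, W, U).
R_DATA = [(5, "a", 1), (5, "b", 2), (6, "c", 1)]
T_DATA = [(1, "w1", 3), (2, "w2", 4), (3, "w3", 3)]


def source_R(z, x=None, y=None):
    """R(X, Y, Z) with adornments f, f, b: Z is required, X and Y are optional."""
    if z is None:
        raise ValueError("R requires Z to be bound")
    return [t for t in R_DATA
            if t[2] == z and (x is None or t[0] == x) and (y is None or t[1] == y)]


def source_T(u, z=None):
    """T(Z, W, U) with adornments f, u, b: U is required, W cannot be given."""
    if u is None:
        raise ValueError("T requires U to be bound")
    return [t for t in T_DATA if t[2] == u and (z is None or t[0] == z)]


def plan2(x, u):
    """Answer M(x, Y, Z, W, u) where M = Join(R, T) on Z."""
    p = source_T(u=u)                      # (1) P = T(Z, W, u)
    r_by_z = {}
    answers = []
    for z, w, u_val in p:                  # (2) one R query per distinct z in P
        if z not in r_by_z:
            r_by_z[z] = source_R(z=z, x=x)
        for x_val, y, _ in r_by_z[z]:      # (3) join answers
            answers.append((x_val, y, z, w, u_val))
    return answers


print(plan2(5, 3))
```

Caching the R answers per distinct z is a design choice of this sketch to avoid sending the same source query twice; the slide does not specify it.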

  21. Mediator Plan Generation
      • Need feasible and efficient plan
      • Search space is huge
      • Tsimmis, Info Manifold, Garlic:
        • exponential algorithms
      • Polynomial algorithms:
        • often find optimal or near-optimal plan
        • bounded performance
        • [See ICDT99 Paper]
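To make "feasible plan" concrete, the following is a simple greedy ordering sketch. It is not the algorithm from the ICDT99 paper and makes no claim about plan cost: it only looks for an order in which every source's b attributes are already bound when the source is queried. The function name `feasible_order` and the input encoding are assumptions, and every attribute of a queried source is assumed to be returned.

```python
def feasible_order(sources, bound_by_query):
    """Return a source order in which all 'b' attributes are bound, or None.

    sources: dict mapping a source name to (attributes, adornments).
    bound_by_query: attributes the query binds to constants.
    """
    bound = set(bound_by_query)
    remaining = dict(sources)
    order = []
    while remaining:
        # A source is ready when each attribute adorned 'b' is already bound.
        ready = [
            name for name, (attrs, adornments) in remaining.items()
            if all(a in bound for a, ad in zip(attrs, adornments) if ad == "b")
        ]
        if not ready:
            return None                     # no feasible order exists
        name = ready[0]
        attrs, _ = remaining.pop(name)
        bound.update(attrs)                 # assumption: all attributes are returned
        order.append(name)
    return order


# Sources from slide 18; the query M(5, Y, Z, W, 3) binds X and U.
SOURCES = {
    "R": (("X", "Y", "Z"), ("f", "f", "b")),
    "T": (("Z", "W", "U"), ("f", "u", "b")),
}

print(feasible_order(SOURCES, {"X", "U"}))
```

Each pass scans the remaining sources once, so the loop runs in time polynomial in the number of sources and attributes. On the slide 18 input it queries T before R, which is the order Plan 2 uses.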

  22. Conclusion
      • Not all sources are created equal!
      • Need to
        • describe what sources can do
        • efficiently process queries with limited sources
        • describe what mediators can do
        • exploit content information
        • deal with unavailable sources
