Information Integration Using Logical Views
Jeffrey D. Ullman
Overview
• Information Integration Systems
• Global-as-view (GAV) vs. Local-as-view (LAV)
• Query Reformulation
• Specification of Source Description
• Adding new sources
Query Reformulation
• Problem: rewrite a user query expressed in the mediated schema into a query expressed in the source schema
• Given a query $Q$ in terms of the mediator schema relations, and descriptions of the information sources
• Find a query $Q'$ that uses only the source relations, such that
  • $Q' \subseteq Q$, and
  • $Q'$ provides all possible answers to $Q$ given the sources
Solving Queries by Views
[Figure: Mediator Relations, Source Relations]
Query Rewriting Using Views
• Query Containment: $q' \subseteq q \iff \forall D:\ q'(D) \subseteq q(D)$
• Query Equivalence: $q' \equiv q \iff q' \subseteq q \wedge q \subseteq q'$
Given query $q$ and view definitions $V = \{v_1, \ldots, v_n\}$
• $q'$ is an Equivalent Rewriting of $q$ using $V$ if
  • $q'$ refers only to views in $V$, and
  • $q' \equiv q$
• $q'$ is a Maximally-Contained Rewriting of $q$ using $V$ if
  • $q'$ refers only to views in $V$, and
  • $q' \subseteq q$, and
  • There is no rewriting $q_1$ such that $q' \subseteq q_1 \subseteq q$ and $q_1 \not\equiv q'$
Complexity of Query Containment
• Conjunctive Queries (CQ) (NP-complete)
  • Q1: p(X,Z) :- a(X,Y) & a(Y,Z)
  • Q2: p(X,Z) :- a(X,Y) & a(V,Z)
• CQ’s With Negation ($\Pi_2^p$-complete)
  • Q1: p(X,Z) :- a(X,Y) & a(Y,Z) & NOT a(X,Z)
• CQ’s With Arithmetic Comparison ($\Pi_2^p$-complete)
  • Q1: p(X,Z) :- a(X,Y) & a(Y,Z) & X<Y
• Datalog Programs
  • p(A,C) :- a(A,B) & b(B,C)
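The containment test for plain conjunctive queries can be illustrated with the Q1/Q2 pair above. The following is a minimal sketch, assuming the standard containment-mapping criterion (Q1 ⊆ Q2 exactly when some mapping of Q2's variables sends Q2's head to Q1's head and every Q2 subgoal to a Q1 subgoal). The tuple encoding and the function name `contained_in` are illustrative choices, not part of the slides, and the brute-force search over all mappings is exponential, in line with the NP-completeness noted above.

```python
from itertools import product

# An atom is (predicate, (arg, ...)); a query is (head_atom, [body_atoms]).
# Assumption: every argument is a variable (no constants), as in Q1 and Q2.

def contained_in(q1, q2):
    """Return True if q1 is contained in q2 (a containment mapping q2 -> q1 exists)."""
    (h1_pred, h1_args), body1 = q1
    (h2_pred, h2_args), body2 = q2
    if h1_pred != h2_pred or len(h1_args) != len(h2_args):
        return False
    vars2 = sorted({v for _, args in body2 for v in args} | set(h2_args))
    targets = sorted({v for _, args in body1 for v in args} | set(h1_args))
    body1_set = set(body1)
    # Try every assignment of q2's variables to q1's variables.
    for image in product(targets, repeat=len(vars2)):
        h = dict(zip(vars2, image))
        # The head of q2 must map exactly onto the head of q1.
        if tuple(h[v] for v in h2_args) != tuple(h1_args):
            continue
        # Every subgoal of q2 must map onto some subgoal of q1.
        if all((pred, tuple(h[v] for v in args)) in body1_set
               for pred, args in body2):
            return True
    return False

Q1 = (("p", ("X", "Z")), [("a", ("X", "Y")), ("a", ("Y", "Z"))])
Q2 = (("p", ("X", "Z")), [("a", ("X", "Y")), ("a", ("V", "Z"))])

print(contained_in(Q1, Q2))  # mapping V -> Y witnesses Q1 ⊆ Q2
print(contained_in(Q2, Q1))  # no mapping sends a(Y,Z) into Q2's body
```

Here Q1 asks for paths of length two, while Q2 only asks that X has some outgoing arc and Z some incoming arc, so every Q1 answer is a Q2 answer but not the other way round.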
Specification of Source Description
• Views: resources used by the integrator to help answer queries
• GAV: mediator relations defined as views over source relations
• LAV: source relations defined as views over mediator relations
Information Integration Systems
• Information Manifold (IM)
  • AT&T
  • Local-as-View (LAV)
  • Description logic
  • Source relations defined as views of mediator relations (a collection of global predicates)
• Tsimmis
  • Stanford and IBM
  • Global-as-View (GAV)
  • Mediator relations defined as views of source relations
IM Example Global Predicates: Mediator relations
IM Example (Cont.) Views: Source Relations Mediator Relations Query: “What are Sally’s phone and office?” Mediator Relations
IM Example (Cont.)
• Answer: Source Relations
• Query reformulation: Bucket Algorithm (checking query containment is NP-complete in the query length)
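The first phase of the Bucket Algorithm can be sketched as follows. This is a simplified illustration only: it builds one bucket of candidate views per query subgoal and enumerates the combinations, but omits the final step of checking each candidate for containment in the query. The relation names (`emp`, `phone`, `office`, `mgr`, `dept`), the views `View1` to `View3`, and the matching test in `fits` are assumptions made for illustration, since the slide's tables are not reproduced in this text.

```python
from itertools import product

def is_var(term):
    # Convention assumed here: variables start with an uppercase letter.
    return term[0].isupper()

def fits(q_atom, v_atom, v_head, q_head):
    """Simplified test: can view subgoal v_atom cover query subgoal q_atom?"""
    (q_pred, q_args), (v_pred, v_args) = q_atom, v_atom
    if q_pred != v_pred or len(q_args) != len(v_args):
        return False
    for qa, va in zip(q_args, v_args):
        if not is_var(qa) and not is_var(va) and qa != va:
            return False  # two different constants clash
        # Constants and answer variables of the query must be visible
        # in the view's head, otherwise the view cannot supply them.
        needs_export = (not is_var(qa)) or qa in q_head
        if needs_export and is_var(va) and va not in v_head:
            return False
    return True

def build_buckets(q_head, q_body, views):
    """One bucket (list of view names) per query subgoal."""
    return [[name for name, (v_head, v_body) in views.items()
             if any(fits(q_atom, v_atom, v_head, q_head) for v_atom in v_body)]
            for q_atom in q_body]

# Hypothetical LAV source descriptions: view name -> (head args, body).
views = {
    "View1": (("E", "P", "M"), [("emp", ("E",)), ("phone", ("E", "P")), ("mgr", ("E", "M"))]),
    "View2": (("E", "O", "D"), [("emp", ("E",)), ("office", ("E", "O")), ("dept", ("E", "D"))]),
    "View3": (("E", "P"), [("emp", ("E",)), ("phone", ("E", "P")), ("dept", ("E", "toy"))]),
}

# Query: answer(P,O) :- phone(sally,P) & office(sally,O)
q_head = ("P", "O")
q_body = [("phone", ("sally", "P")), ("office", ("sally", "O"))]

buckets = build_buckets(q_head, q_body, views)
candidates = list(product(*buckets))  # one view per subgoal
print(buckets)
print(candidates)
```

Each candidate would still need to be expanded into mediator relations and tested for containment in the query, which is where the NP-complete check enters.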
Advantages and Disadvantages (IM)
• Advantage: adding new sources
• Mediator (global predicates, source descriptions)
• Query processing
• Disadvantage: query reformulation (Bucket algorithm)
Tsimmis • OEM and MSL Mediator Relations
Tsimmis Example Source Relations Exported OEM Objects Mediator Relations Query: “What are Sally’s phone and office?” Source Relations
Advantages and Disadvantages (Tsimmis)
• Advantage
  • Query reformulation: rule unfolding
• Disadvantage
  • Mediation description
  • Adding, removing, and modifying source descriptions
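Rule unfolding under GAV can be sketched in a few lines. The example below is a Datalog-style illustration, not Tsimmis's actual MSL/OEM machinery: it assumes each mediator relation is defined by a single rule whose head arguments are distinct variables, and the names `emp_info`, `src_phone`, and `src_office` are hypothetical.

```python
from itertools import count

def is_var(term):
    # Convention assumed here: variables start with an uppercase letter.
    return term[0].isupper()

def unfold(query_body, definitions):
    """Replace each mediator atom by the body of its defining rule."""
    fresh = count(1)
    result = []
    for pred, args in query_body:
        if pred not in definitions:
            result.append((pred, args))  # already a source relation
            continue
        head_args, body = definitions[pred]
        # Bind the rule's head variables to the query atom's arguments.
        subst = dict(zip(head_args, args))
        suffix = next(fresh)
        for b_pred, b_args in body:
            new_args = tuple(
                t if not is_var(t)
                else subst.get(t, f"{t}_{suffix}")  # rename body-only variables
                for t in b_args
            )
            result.append((b_pred, new_args))
    return result

# Hypothetical GAV definition:
#   emp_info(E,P,O) :- src_phone(E,P) & src_office(E,O,B)
definitions = {
    "emp_info": (("E", "P", "O"),
                 [("src_phone", ("E", "P")), ("src_office", ("E", "O", "B"))]),
}

# Query: answer(P,O) :- emp_info(sally,P,O)
unfolded = unfold([("emp_info", ("sally", "P", "O"))], definitions)
print(unfolded)
```

Because the mediator relations are already written in terms of the sources, reformulation is a direct substitution with no containment search, which is the advantage listed above.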
IM vs. Tsimmis
• Query Reformulation
• Adding Sources
• Levels of Mediation
• Semistructured Data
• Constraints
• Automatic Generation of Components (Wrappers and Mediators)