Enhancing Data Exchange Through Bidirectional Mappings
This presentation explores bidirectional mappings and update exchange for closer collaboration between peers. It covers the mapping language, update policies, and propagation algorithms, and reports an experimental evaluation showing feasibility and benefits.
Presentation Transcript
Bidirectional Mappings for Data and Update Exchange • Grigoris Karvounarakis, Zachary G. Ives • University of Pennsylvania • WebDB 2008
Collaborative Data Sharing Systems [CIDR05] • Each peer has a local instance • Relate peers by schema mappings • Support update exchange according to different administrator policies • [Figure: Peers A, B, and C; Peer A runs a local DBMS that receives queries and edits; each peer publishes its updates ∆A+/−, ∆B+/−, ∆C+/−]
Unidirectional update exchange • ORCHESTRA [SIGMOD06, VLDB07]: • Each mapping (tgd [Fagin+03] / GLAV [Lenzerini02]) has source and target peers • Propagate effects of updates forward, to the target • Related to view maintenance • m1: R(xy) ∧ S(yz) → T(xyz) • m2: T(xyz) → U(xyz) • [Figure: example instances of R, S, T, and U, with insertions (+) and a deletion (–) propagated forward along m1 and m2]
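The forward propagation described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not ORCHESTRA's implementation: relations are plain Python sets, the names `join_m1` and `insert_forward` are hypothetical, and only insertions are handled (deletions under set semantics would also need derivation tracking).

```python
def join_m1(R, S):
    """m1: R(x, y) AND S(y, z) -> T(x, y, z), i.e. a join on y."""
    return {(x, y, z) for (x, y) in R for (y2, z) in S if y == y2}


def insert_forward(db, new_R=(), new_S=()):
    """Propagate insertions into R and S forward to T (m1) and U (m2).

    Only the inserted tuples are joined, in the spirit of incremental
    view maintenance. Returns the tuples newly added to T and to U.
    """
    dR = set(new_R) - db["R"]
    dS = set(new_S) - db["S"]
    # Delta of T: (dR join (S union dS)) union (R join dS), minus existing T.
    dT = (join_m1(dR, db["S"] | dS) | join_m1(db["R"], dS)) - db["T"]
    # m2: T(x, y, z) -> U(x, y, z) copies every new T tuple to U.
    dU = dT - db["U"]
    db["R"] |= dR
    db["S"] |= dS
    db["T"] |= dT
    db["U"] |= dU
    return dT, dU


# Illustrative data (not taken from the slide's figure).
db = {"R": {(1, 2)}, "S": set(), "T": set(), "U": set()}
dT, dU = insert_forward(db, new_S=[(2, 3)])
```

Inserting S(2, 3) joins with the existing R(1, 2) on y = 2, so the same new tuple reaches T through m1 and U through m2.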
Bidirectional mappings and update exchange • Goal: Tighter coupling between peers • Both forward and backward propagation of updates (insertions/deletions) along mappings • E.g., for mirroring of content between peers • Our contributions: • Language for specifying bidirectional mappings and update policies (for backwards propagation of deletions) • Algorithms for propagation of updates in both directions and for propagation with detection and prevention of side effects [Dayal+82] at run-time • Experimental evaluation, illustrating feasibility
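Bidirectional update exchange: insertions • m1: R(xy) ∧ S(xzw) ↔ T(xyz) ∧ V(wx) • m2: T(xyz) ↔ U(xyz) • Each bidirectional mapping is shown with its two directions: • m1 forward: R(xy) ∧ S(xzw) → T(xyz) ∧ V(wx) • m1 backward: T(xyz) ∧ V(wx) → R(xy) ∧ S(xzw) • m2 forward: T(xyz) → U(xyz) • m2 backward: U(xyz) → T(xyz) • [Figure: example instances with insertions (+) propagated along m1 and m2 in both directions. R: (1, 1), (3, 2); S: (1, 1, 4), (1, 2, 4), (3, 3, 5); T and U: (1, 1, 1), (1, 1, 2), (3, 2, 3); V: (4, 1), (5, 3)]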
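A minimal Python sketch of insertion propagation over m1 and m2 in both directions, repeated until nothing changes. It assumes set-valued relations and a naive fixpoint loop; the function names `derive` and `exchange` are hypothetical and this is not the authors' algorithm. The loop terminates here because every variable of m1 and m2 appears on both sides, so no new values are ever invented.

```python
def derive(db):
    """One round of m1 and m2, applied in both directions."""
    R, S, T, V, U = (db[k] for k in "RSTVU")
    # Left side of m1: R(x, y) AND S(x, z, w), joined on x.
    left = [(x, y, z, w) for (x, y) in R for (x2, z, w) in S if x == x2]
    # Right side of m1: T(x, y, z) AND V(w, x), joined on x.
    right = [(x, y, z, w) for (x, y, z) in T for (w, x2) in V if x == x2]
    return {
        "T": {(x, y, z) for (x, y, z, w) in left} | U,  # m1 forward, m2 backward
        "V": {(w, x) for (x, y, z, w) in left},         # m1 forward
        "R": {(x, y) for (x, y, z, w) in right},        # m1 backward
        "S": {(x, z, w) for (x, y, z, w) in right},     # m1 backward
        "U": set(T),                                    # m2 forward
    }


def exchange(db):
    """Apply the mappings repeatedly until no relation grows."""
    while True:
        changed = False
        for name, tuples in derive(db).items():
            if not tuples <= db[name]:
                db[name] |= tuples
                changed = True
        if not changed:
            return db


# Illustrative starting point: one tuple inserted at U and one at V.
db = {"R": set(), "S": set(), "T": set(), "V": {(4, 1)}, "U": {(1, 1, 1)}}
exchange(db)
```

In this run U(1, 1, 1) first flows backward to T over m2, and T(1, 1, 1) together with V(4, 1) then flows backward over m1 to produce R(1, 1) and S(1, 1, 4).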
Propagating deletions to source tuples • Need to track down and delete the source tuples from which the deleted tuples were derived • There are multiple options for propagating updates backwards over joins • User-specified update policies resolve this ambiguity by marking with * the atoms whose tuples are deleted, e.g., *R(xy) ∧ S(xzw) ↔ T(xyz) ∧ *V(wx) • Any deletion is guaranteed to be performed, as long as there is at least one * on each side
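As a rough illustration of how a policy resolves the ambiguity over a join, the sketch below deletes a T tuple and propagates the deletion backward over m1: R(xy) ∧ S(xzw) ↔ T(xyz) ∧ V(wx). The encoding of a policy as a tuple of starred relation names, and the function name `delete_backward_m1`, are assumptions made for this example; the authors' policy language and algorithm may differ.

```python
def delete_backward_m1(db, t, starred=("R",)):
    """Delete T tuple t = (x, y, z) and propagate backward over m1.

    T(x, y, z) can be derived from R(x, y) joined with S(x, z, w), so it
    can be invalidated by deleting the matching R tuple, the matching S
    tuples, or both. `starred` names the atoms chosen by the policy.
    Returns the source tuples that were removed, keyed by relation.
    """
    x, y, z = t
    db["T"].discard(t)
    removed = {}
    if "R" in starred:
        removed["R"] = {r for r in db["R"] if r == (x, y)}
    if "S" in starred:
        removed["S"] = {s for s in db["S"] if s[0] == x and s[1] == z}
    for name, tuples in removed.items():
        db[name] -= tuples
    return removed


# Illustrative instance.
db = {
    "R": {(1, 1), (3, 2)},
    "S": {(1, 1, 4), (1, 2, 4), (3, 3, 5)},
    "T": {(1, 1, 1), (1, 1, 2), (3, 2, 3)},
}
removed = delete_backward_m1(db, (1, 1, 2), starred=("R",))
```

With the policy *R, deleting T(1, 1, 2) removes R(1, 1); with a policy that stars S instead, the same deletion would remove S(1, 2, 4) and leave R untouched.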
Bidirectional update exchange: deletions • m1: *R(xy) ∧ S(xzw) ↔ T(xyz) ∧ *V(wx) • m2: *T(xyz) ↔ *U(xyz) • [Figure: example instances of R, S, T, U, and V, showing deletions propagated along m1 and m2 under these update policies; the tuples (1, 1, 2) and (3, 3, 5) are highlighted]
Avoiding side effects on T • m1: *R(xy) ∧ S(xzw) ↔ T(xyz) ∧ *V(wx) • m2: *T(xyz) ↔ *U(xyz) • [Figure: example instances of R, S, T, U, and V, illustrating a deletion whose propagation would have a side effect on T]
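One way to picture run-time side-effect detection is to compare what T derives before and after a proposed source deletion. The sketch below assumes that T is derived only through the left side of m1 (R(xy) ∧ S(xzw)) and that "prevention" means skipping the deletion; both are simplifying assumptions, and `derive_T`, `side_effects`, and `safe_delete` are hypothetical names rather than the authors' algorithm.

```python
def derive_T(R, S):
    """T tuples derivable through m1 from R(x, y) joined with S(x, z, w)."""
    return {(x, y, z) for (x, y) in R for (x2, z, w) in S if x == x2}


def side_effects(R, S, requested, del_R=(), del_S=()):
    """T tuples, other than the requested one, lost by the source deletions."""
    before = derive_T(R, S)
    after = derive_T(R - set(del_R), S - set(del_S))
    return (before - after) - {requested}


def safe_delete(R, S, requested, del_R=(), del_S=()):
    """Apply the source deletions only if they cause no side effect on T."""
    if side_effects(R, S, requested, del_R, del_S):
        return False  # prevented: R and S are left untouched
    R.difference_update(del_R)
    S.difference_update(del_S)
    return True


# Illustrative instance.
R = {(1, 1), (3, 2)}
S = {(1, 1, 4), (1, 2, 4), (3, 3, 5)}
lost = side_effects(R, S, (1, 1, 2), del_R=[(1, 1)])
```

Here deleting R(1, 1) to remove T(1, 1, 2) would also remove T(1, 1, 1), since both tuples depend on R(1, 1), so `lost` contains (1, 1, 1). Deleting S(1, 2, 4) instead removes only the requested tuple.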
Experimental evaluation • Implementation strategy: Mappings -> Datalog programs -> SQL + fixpoint • Java layer over RDBMS (DB2) • Synthetic update workload sampled from SWISS-PROT biological data set • Randomly-generated schemas and mappings • 2000 initial tuples in each peer before propagation • Questions: • Overhead of bidirectional over unidirectional (in paper) • Feasibility of deletion propagation and overhead of side effect detection
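The "Mappings -> Datalog programs -> SQL + fixpoint" strategy can be illustrated with a small stand-in. The sketch below uses Python's built-in sqlite3 module in place of the Java layer over DB2 used in the talk, and hand-writes one `INSERT ... SELECT` per Datalog rule for m1 and m2; the schema, the rule translation, and the names `RULES`, `total_rows`, and `run_to_fixpoint` are assumptions for illustration only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE R(x, y, PRIMARY KEY (x, y));
CREATE TABLE S(x, z, w, PRIMARY KEY (x, z, w));
CREATE TABLE T(x, y, z, PRIMARY KEY (x, y, z));
CREATE TABLE V(w, x, PRIMARY KEY (w, x));
CREATE TABLE U(x, y, z, PRIMARY KEY (x, y, z));
"""

# One SQL statement per Datalog rule; OR IGNORE gives set semantics.
RULES = [
    # T(x,y,z) :- R(x,y), S(x,z,w)      (m1 forward)
    "INSERT OR IGNORE INTO T SELECT R.x, R.y, S.z FROM R JOIN S ON S.x = R.x",
    # V(w,x)   :- R(x,y), S(x,z,w)      (m1 forward)
    "INSERT OR IGNORE INTO V SELECT S.w, R.x FROM R JOIN S ON S.x = R.x",
    # R(x,y)   :- T(x,y,z), V(w,x)      (m1 backward)
    "INSERT OR IGNORE INTO R SELECT T.x, T.y FROM T JOIN V ON V.x = T.x",
    # S(x,z,w) :- T(x,y,z), V(w,x)      (m1 backward)
    "INSERT OR IGNORE INTO S SELECT T.x, T.z, V.w FROM T JOIN V ON V.x = T.x",
    # U(x,y,z) :- T(x,y,z)              (m2 forward)
    "INSERT OR IGNORE INTO U SELECT x, y, z FROM T",
    # T(x,y,z) :- U(x,y,z)              (m2 backward)
    "INSERT OR IGNORE INTO T SELECT x, y, z FROM U",
]


def total_rows(conn):
    """Total number of tuples across all five relations."""
    return sum(
        conn.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
        for name in "RSTVU"
    )


def run_to_fixpoint(conn):
    """Re-run every rule until a full round adds no tuples."""
    while True:
        before = total_rows(conn)
        for rule in RULES:
            conn.execute(rule)
        if total_rows(conn) == before:
            return


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO U VALUES (1, 1, 1)")
conn.execute("INSERT INTO V VALUES (4, 1)")
run_to_fixpoint(conn)
```

This naive loop re-evaluates every rule in each round; a production system would more plausibly join only the newly derived tuples in each round.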
Related work • Data exchange [Haas+99, Miller+00, Popa+02, Fagin+03, Hernich+07], peer data exchange [Fuxman+05] • View update [Dayal+82, Bancilhon+81], Harmony [Bohannon+06], [Matsuda+07] • Incremental view maintenance [Gupta+93] • Peer data management systems (PDMS): Piazza [Halevy+03,04], Hyperion [Kementsietsidis+04], [Bernstein+02], [Calvanese+04], ...
Conclusions and future work • Contributions: • Bidirectional mappings for update exchange between peers that want closer collaboration • Combine forward and backward propagation to compute peer instances incrementally • Incorporate update policies for backward propagation • Dynamic detection and prevention of side effects • Future work: • Combine unidirectional and bidirectional mappings, to also guarantee that all initial deletions are performed • Study opportunities for performance optimization