1 / 20

Chapter 21.2 Modes of Information Integration

Chapter 21.2 Modes of Information Integration. ID: 219 Name: Qun Yu Class: CS257 219 Spring 2009 Instructor: Dr. T.Y.Lin. Content Index. 21.2 Modes of Information Integration 21.2.1 Federated Database Systems 21.2.2 Data Warehouses 21.2.3 Mediators. Federations.

kyna
Télécharger la présentation

Chapter 21.2 Modes of Information Integration

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 21.2Modes of Information Integration ID: 219 Name: Qun Yu Class: CS257 219 Spring 2009 Instructor: Dr. T.Y.Lin

  2. Content Index 21.2 Modes of Information Integration 21.2.1 Federated Database Systems 21.2.2 Data Warehouses 21.2.3 Mediators

  3. Federations • The simplest architecture for integrating several DBs • One to one connections between all pairs of DBs • n DBs talk to each other, n(n-1) wrappers are needed • Good when communications between DBs are limited

  4. Wrapper • Wrapper: a software translates incoming queries and outgoing answers. In a result, it allows information sources to conform to some shared schema.

  5. Federations Diagram DB1 DB2 2 Wrappers 2 Wrappers 2 Wrappers 2 Wrappers 2 Wrappers 2 Wrappers DB3 DB4 A federated collection of 4 DBs needs 12 components to translate queries from one to another.

  6. Example Car dealers want to share their inventory. Each dealer queries the other’s DB to find the needed car. Dealer-1’s DB relation: NeededCars(model,color,autoTrans) Dealer-2’s DB relation: Auto(Serial, model, color) Options(serial,option) wrapper Dealer-2’s DB Dealer-1’s DB wrapper

  7. Example… For(each tuple(:m,:c,:a) in NeededCars){ if(:a=TRUE){/* automatic transmission wanted */ SELECT serial FROM Autos, Options WHERE Autos.serial = Options.serial AND Options.option = ‘autoTrans’ AND Autos.model = :m AND Autos.color =:c; } Else{/* automatic transmission not wanted */ SELECT serial FROM Auto WHERE Autos.model = :m AND Autos.color = :c AND NOT EXISTS( SELECT * FROM Options WHERE serial = Autos.serial AND option=‘autoTrans’); } } Dealer 1 queries Dealer 2 for needed cars

  8. Data Warehouse • Sources are translated from their local schema to a global schema and copied to a central DB. • User transparent: user uses Data Warehouse just like an ordinary DB • User is not allowed to update Data Warehouse

  9. Warehouse Diagram User query result Warehouse Combiner Extractor Extractor Source 1 Source 2

  10. Example Construct a data warehouse from sources DB of 2 car dealers: Dealer-1’s schema: Cars(serialNo, model,color,autoTrans,cdPlayer,…) Dealer-2’s schema: Auto(serial,model,color) Options(serial,option) Warehouse’s schema: AutoWhse(serialNo,model,color,autoTrans,dealer) Extractor --- Query to extract data from Dealer-1’s data: INSERT INTO AutosWhse(serialNo, model, color, autoTans, dealer) SELECT serialNo,model,color,autoTrans,’dealer1’ From Cars;

  11. Example Extractor --- Query to extract data from Dealer-2’s data: INSERT INTO AutosWhse(serialNo, model, color, autoTans, dealer) SELECT serialNo,model,color,’yes’,’dealer2’ FROM Autos,Options WHERE Autos.serial=Options.serial AND option=‘autoTrans’; INSERT INTO AutosWhse(serialNo, model, color, autoTans, dealer) SELECT serialNo,model,color,’no’,’dealer2’ FROM Autos WHERE NOT EXISTS ( SELECT * FROM serial =Autos.serial AND option = ‘autoTrans’);

  12. Construct Data Warehouse There are mainly 3 ways to constructing the data in the warehouse: 1) Periodically reconstructed from the current data in the sources, once a night or at even longer intervals. Advantages: simple algorithms. Disadvantages: 1) need to shut down the warehouse; 2) data can become out of date.

  13. Construct Data Warehouse 2) Updated periodically based on the changes(i.e. each night) of the sources. Advantages: involve smaller amounts of data. (important when warehouse is large and needs to be modified in a short period) Disadvantages: 1) the process to calculate changes to the warehouse is complex. 2) data can become out of date.

  14. Construct Data Warehouse 3) Changed immediately, in response to each change or a small set of changes at one or more of the sources. Advantages: data won’t become out of date. Disadvantages: requires too much communication, therefore, it is generally too expensive. (practical for warehouses whose underlying sources changes slowly.)

  15. Mediators • Virtual warehouse, which supports a virtual view or a collection of views, that integrates several sources. • Mediator doesn’t store any data. • Mediators’ tasks: 1)receive user’s query, 2)send queries to wrappers, 3)combine results from wrappers, 4)send the final result to user.

  16. Result User query Query Result Result Query Query Result Query Result A Mediator diagram Mediator Wrapper Wrapper Source 1 Source 2

  17. Example Same data sources as the example of data warehouse, the mediator Integrates the same two dealers’ source into a view with schema: AutoMed(serialNo,model,color,autoTrans,dealer) When the user have a query: SELECT sericalNo, model FROM AkutoMed Where color=‘red’

  18. Example In this simple case, the mediator forwards the same query to each Of the two wrappers. Wrapper1: Cars(serialNo, model, color, autoTrans, cdPlayer, …) SELECT serialNo,model FROM cars WHERE color = ‘red’; Wrapper2: Autos(serial,model,color); Options(serial,option) SELECT serial, model FROM Autos WHERE color=‘red’; The mediator needs to interprets serial into serialNo, and then returns the union of these sets of data to user.

  19. Example There may be different options for the mediator to forward user query, for example, the user queries if there are a specific model&color car (i.e. “Gobi”, “blue”). The mediator decides 2nd query is needed or not based on the result of 1st query. That is, If dealer-1 has the specific car, the mediator doesn’t have to query dealer-2.

  20. THANK YOU ! Reference: Database Systems– The complete Book 2nd Edition, 21.2 notes from http://infolab.stanford.edu/~ullman/dscb.html

More Related