1 / 32

Database Management Systems

Database Management Systems. Chapter 11 Distributed Databases. Objectives. Why do you need a distributed database? What are distributed databases? How is data distributed with client/server systems? Can a Web approach solve the data distribution issues?

adamma
Télécharger la présentation

Database Management Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Management Systems Chapter 11 Distributed Databases

  2. Objectives • Why do you need a distributed database? • What are distributed databases? • How is data distributed with client/server systems? • Can a Web approach solve the data distribution issues? • How much data can you send to a client form? • How do you transfer data across diverse systems? • How will Sally’s employees access the database?

  3. Distributed Databases SELECT Sales FROM Britain.Sales UNION SELECT Sales FROM France.Sales UNION SELECT Sales FROM Italy.Sales • Definition • Advantages / Uses • Problems / Complications • Client-Server / SQL Server • Microsoft Access Germany Britain France Italy

  4. Distributed Database Definition • Multiple independent databases • Each DBMS is a complete DBMS (engine, queries, locking, transactions, etc.) • Usually on different machines. • Usually in different locations. • Connected by a network. • Might be different environments • Hardware • Operating System • DBMS Software Database Apollo Database Zeus England France Database Athena United States

  5. Distributed Database Rules C.J. Date Rule 0: Transparency: the user should not know or care that the database is distributed. Local autonomy. No reliance on a central site. Continuous operation. Location independence. Fragmentation independence (physical storage). Replication independence. Distributed query processing. Distributed transaction management. Hardware independence. Operating system independence. Network independence. DBMS independence.

  6. Distributed Features • Each database can continue to run even if portion fails. • Data and hardware can be moved without affecting operations or users. • Expanding operations. • Performance issues. • System expansion and upgrades. • Add new section without affecting others. • Upgrade hardware, network and DBMS.

  7. Advantages and Applications local transactions • Business operations are often distributed • Work and data are segmented by department. • Work and data are segmented by geographical location. • Improved performance • Most updates and queries are performed locally. • Maintain local control and responsibility over data. • Can still combine data across the system. • Scalability and expansion • Add on, not replacement. future expansion

  8. Creating a Distributed Database • Design administration plan. • Choose hardware and DBMS vendor, and network. • Set up network and DBMS connections. • Choose locations for data. • Choose replication strategy. • Create backup plan and strategy. • Create local views and synonyms. • Perform stress test: loads and failures.

  9. Network Transfer Rates • Drives: 60 - 400 MB per sec. (with SSD or RAID) • LANs: 10-100 MB per sec (100-1000 mbps). • WANs: 0.2 - 300 MB per sec. • Faster is possible but expensive! • Goal is to minimize transmissions. • Each system must be capable of evaluating queries--preferably SQL. • Results depend heavily on how the system joins tables. WAN 0.2 - 300 MB 10-100 MB LAN 60 – 400 MB Disk drive

  10. Distributed Query Processing NY • Example • NY: Customers: 1 M rows • LA: Production: 10 M rows • Chicago: Sales: 20 M rows • Query: List customers who bought blue products on March 1 • Bad idea #1 • Transfer all rows to Chicago • Then JOIN and select. • Better idea #2 (probably) • Transfer blue products from LA to Chicago • Better idea #3 • Get sale items on March 1 • Get blue products from LA • Send C# to NY Customers(C#, …) 1,000,000 C# list from desired P# Chicago Matching Customer data Sales(S#, C#, Sdate) 20,000,000 SaleItem(S#, P#,…) 50,000,000 P# sold on March 1 Blue P# sold on March 1 LA Products(P#, Color…) 10,000,000

  11. Data Replication: Publish/Subscribe subscribe Replica Main Published changes Changed data

  12. Data Replication Market research & data corrections. Britain Britain: Customers & Sales • Goals • Minimize transmissions • Improve performance • Support heavy multiuser access. • Problems • Updating copies • Bulk transmissions • Site unavailable • Concurrency • Easier for two people to change the same data at the same time. • Decision support systems. • Data warehouse. France: Customers & Sales Spain: Customers & Sales Periodic updates Spain Britain: Customers & Sales France: Customers & Sales Spain: Customers & Sales Update data.

  13. Concurrency and Locks • Each DBMS must maintain lock facility. • To update, each DBMS must utilize and recognize other lock mechanisms and return codes. • Each DBMS must have a deadlock resolution protocol that recognizes the distributed databases. • Random wait. • Optimistic updates. • Two-phase commit. DBMS #1 Accounts Jones 8898 Transaction A Locked Waiting Transaction B Waiting Locked DBMS #2 Accounts Jones 3561

  14. Transactions & Two-Phase Commit • Two (or more) separate lock managers. • DBMS initiating update serves as the coordinator. • Two phases • Coordinator sends message and data to all machines to “get ready.” • Local machines save data in logs, verify update status and return message. • If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message. Database 1 Initiate Transaction 1. Prepare to commit. All agree? 2. Commit Database 2 Lock tables. Save log. Database 3 Update all tables.

  15. Distributed Transaction Managers Transaction Manager Transaction Manager Resource Manager Resource Manager DBMS DBMS Transaction Manager Transaction Processing Monitor Resource Manager DBMS The distributed transaction coordinator/transaction processing monitor handles the transaction decisions and coordinates across the participating systems.

  16. Distributed Design Questions

  17. Client-Server Server Server Shared Database Front-end User Interface Clients Clients Older, but common for retail stores/checkout computers.

  18. LAN File Server: Slow File Server MyFile.mdb Forms CustID Name … 115 Jenkins … 125 Juarez ... Order ... Application and query transferred. DBMS software transferred. One row at a time transferred, until all rows are examined. SELECT * FROM Customer WHERE City = “Sandy”

  19. Client-Server Databases • One machine machine is dominant (server) and handles data for many clients. • Client machines handle front-end tasks and small data tables that are not shared. File Server DBMS SQL Server Shared Data Return matching data. Send SQL statement. SELECT . . . application

  20. Three-Tier Client-Server Database Servers • Server Databases • Client front-end • Middle • Locate databases • Business rules • Program code Databases. Transactions. Legacy applications. Database links. Business rules. Program code. Middleware Application. Front-end. User Interface. Client

  21. Database Independence on the Client Original DBMS New DBMS ADO ADO Application

  22. Database Independence with Queries Independent Application Query: works with any DBMS SELECT SaleID, SaleDate, CustomerID, CustomerName FROM SaleCustomer Saved Oracle Query SELECT SaleID, SaleDate, CustomerID, LastName || ‘, ‘ || FirstName AS CustomerName FROM Sale, Customer WHERE Sale.CustomerID=Customer.CustomerID Saved SQL Server Query SELECT SaleID, SaleDate, CustomerID, LastName + ‘, ‘ + FirstName AS CustomerName FROM Sale INNER JOIN Customer ON Sale.CustomerID = Customer.CustomerID

  23. The Internet as Client-Server Internet Router Router Server result page request Client Browser Web Server HTML forms SQL http://server.location/page Data Database Server

  24. HTML Limited Clients <HTML> <HEAD> <TITLE>My main page</TITLE></HEAD> <BODY BACKGROUND=“graphics/back0.jpg”> <P>My text goes in paragraphs.</P> <P>Additional tags set <B>boldface</B> and <I>Italic</I>. <P>Tables are more complicated and use a set of tags for rows and columns.</P> <TABLE BORDER=1> <TR><TD>First cell</TD><TD>Second cell</TD></TR> <TR><TD>Next row</TD><TD>Second column</TD></TR> </TABLE> <P>There are form tags to create input forms for collecting data. But you need CGI program code to convert and use the input data.</P> </BODY> </HTML>

  25. HTML Output

  26. Web Server Database Fundamentals DBMS Web Server SQL <body> <form id="form1" runat="server"> <asp:Label ID="PageTitleLabel" runat="server" … <asp:SqlDataSource ID="CustomerSqlDataSource“ DeleteCommand="DELETE FROM [Customer] … SelectCommand="SELECT [CustomerID], … UpdateCommand="UPDATE [Customer] SET… <DeleteParameters> <asp:Parameter Name="CustomerID" Type="Int32" /> </DeleteParameters> <asp:FormView ID="CustomersFormView" runat="server“… DataSourceID="CustomerSqlDataSource"> … </asp:FormView> </form> </body> Data Web Browser Data Object Customers CustomerID 1653 LastName FirstName … Jones Mary Save CSS Style Sheet .PageTitle { font-weight: bold; font-size: larger; text-align: center; }

  27. Database Example: Client Side 0 Request Server/Form.html Server Initial form 1 3 Results Call Web page 2

  28. Client-Server Data Transfer Order Form Order ID 1015 Jones, Martha Customer Order Date 12-Aug What if there are 10,000 customers? How much time to load the combo box? How do you refresh/reload the combo box? Alternatives?

  29. Latency Server Generate form Receive form data Transmission delay Transmission delay time Form received Client User delay

  30. Cloud Computing Data Data is replicated to multiple, connected servers in the Internet cloud. Client requests are filled from the nearest available server, spreading the bandwidth and processor demands across the network.

  31. Cloud Databases: Amazon S3 Amazon S3 Web server Bucket Name Key, Object HTML, Code, Bucket+Key HTML Page User Developer Developer opens Amazon S3 account and creates/names a bucket. Developer uploads content (objects) to bucket and assigns a key to each object. Developer writes Web server code with HTML page and background code that calls the S3 service with the bucket name and object key. Web page is constructed on request and delivered with the object.

  32. Web Databases (and Cloud) • Many cloud systems are proprietary and not relational. • Designed to handle data that rarely changes—typically through bulk uploads. • So little fear of concurrency or transaction issues. • Microsoft Azure SQL is an exception. • Need to handle concurrency (and transactions/logs) in your own code. • Always use optimistic concurrency (not pessimistic locking). • See Chapter 7, but keep the original values when you read data and use it in the WHERE clause when you alter data to see if it has changed.

More Related