PMIT-6102 Advanced Database Systems
By Jesmin Akhter, Assistant Professor, IIT, Jahangirnagar University
Lecture 04: Distributed Database Design

Outline
• Distributed Database Design
• Distributed Design Problem
• Distributed Design Issues
• Fragmentation
• Data Allocation
Distributed Design Problem
• The design of a distributed computer system involves
  • making decisions on the placement of data and programs across the sites of a computer network,
  • and possibly designing the network itself.
• In the case of distributed DBMSs, the distribution of applications involves two things:
  • distribution of the distributed DBMS software
  • distribution of the application programs that run on it.
• Neither is treated as a significant problem here, because we assume that
  • a copy of the distributed DBMS software exists at each site where data are stored
  • the network has already been designed.
• We therefore concentrate on the distribution of data.
Distribution Design Issues
The following set of interrelated questions covers the entire issue:
• Why fragment at all?
• How should we fragment?
• How much should we fragment?
• How to test correctness?
• How should we allocate?
• What is the necessary information for fragmentation and allocation?
Reasons for Fragmentation
• The important issue is the appropriate unit of distribution.
• A relation is not a suitable unit, for a number of reasons.
• First, application views are usually subsets of relations.
• Therefore, the locality of accesses of applications is defined not on entire relations but on their subsets.
• Hence it is natural to consider subsets of relations as distribution units.
Reasons for Fragmentation (continued)
• Second, if applications at different sites use the same relation and the whole relation is the unit of distribution, there are two alternatives:
  • The relation is not replicated and is stored at only one site, which results in an unnecessarily high volume of remote data accesses.
  • The relation is replicated at all or some of the sites where the applications reside, which may cause unnecessary replication; this creates problems in executing updates and may not be desirable if storage is limited.
• Finally, decomposing a relation into fragments, each treated as a unit, permits a number of transactions to execute concurrently.
• Thus fragmentation typically increases the level of concurrency and therefore the system throughput.
Fragmentation Alternatives
• Relation instances are essentially tables, so the issue is one of finding alternative ways of dividing a table into smaller ones.
• There are clearly two alternatives for this:
  • dividing it horizontally
  • dividing it vertically.
Fragmentation Alternatives – Horizontal
• PROJ1: projects with budgets less than $200,000
• PROJ2: projects with budgets greater than or equal to $200,000

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York

PROJ2
PNO  PNAME              BUDGET  LOC
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

Example of Horizontal Partitioning
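The horizontal split of PROJ can be sketched in Python as a selection on BUDGET. This is only an illustration and not part of the lecture: the `PROJ` list and the `horizontal_fragment` helper are hypothetical names, and the rows are copied from the example table.

```python
# Rows of PROJ as (PNO, PNAME, BUDGET, LOC), copied from the example table.
PROJ = [
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
    ("P5", "CAD/CAM", 500000, "Boston"),
]


def horizontal_fragment(relation, predicate):
    """Hypothetical helper: keep the whole rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Each fragment holds complete rows; the budget threshold decides which one.
PROJ1 = horizontal_fragment(PROJ, lambda row: row[2] < 200000)
PROJ2 = horizontal_fragment(PROJ, lambda row: row[2] >= 200000)

print([row[0] for row in PROJ1])  # ['P1', 'P2']
print([row[0] for row in PROJ2])  # ['P3', 'P4', 'P5']

# The union of the two fragments gives back the original relation.
assert sorted(PROJ1 + PROJ2) == sorted(PROJ)
```

Because the two predicates are complementary, every row lands in exactly one fragment, which is why a plain union is enough to rebuild PROJ in this sketch.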
Fragmentation Alternatives – Vertical
• PROJ1: information about project budgets
• PROJ2: information about project names and locations

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1
PNO  BUDGET
P1   150000
P2   135000
P3   250000
P4   310000
P5   500000

PROJ2
PNO  PNAME              LOC
P1   Instrumentation    Montreal
P2   Database Develop.  New York
P3   CAD/CAM            New York
P4   Maintenance        Paris
P5   CAD/CAM            Boston

Example of Vertical Partitioning
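The vertical split can be sketched the same way, as a projection that repeats the key PNO in both fragments. Again this is an illustration under assumptions: `ATTRS`, `vertical_fragment`, and the dictionary-based join are hypothetical names and choices made for this example, not something the lecture prescribes.

```python
# Rows of PROJ as (PNO, PNAME, BUDGET, LOC), copied from the example table.
ATTRS = ("PNO", "PNAME", "BUDGET", "LOC")
PROJ = [
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
    ("P5", "CAD/CAM", 500000, "Boston"),
]


def vertical_fragment(relation, attrs, keep):
    """Hypothetical helper: project every row onto the attributes in `keep`."""
    positions = [attrs.index(name) for name in keep]
    return [tuple(row[i] for i in positions) for row in relation]


# Both fragments carry the key PNO so that rows can be matched up again.
PROJ1 = vertical_fragment(PROJ, ATTRS, ("PNO", "BUDGET"))
PROJ2 = vertical_fragment(PROJ, ATTRS, ("PNO", "PNAME", "LOC"))

print(PROJ1[0])  # ('P1', 150000)
print(PROJ2[0])  # ('P1', 'Instrumentation', 'Montreal')

# Joining the fragments on PNO gives back the original relation.
budget_by_pno = dict(PROJ1)
rebuilt = [(pno, pname, budget_by_pno[pno], loc) for pno, pname, loc in PROJ2]
assert rebuilt == PROJ
```

Without PNO in both fragments there would be no way to tell which budget belongs to which project name, so the key is the one attribute that has to be duplicated.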
Thank You