1 / 1

Hash-Based Algorithms For Operator Load-Balancing In Database Middleware Systems

Client. Client. DSS 1. DSS 1. Q. Q. R. R. R. R. QSB 1. QSB 1. Q 1. Q 3. R 2. Q 2. R 3. R 1. QSB 2. QSB 2. QSB 3. QSB 3. QSB 4. QSB 4. 6. Q. Q 1. Q. Q 2. Q 3. Q. Project Status. R. R 1. R 2. R. R 3. R. Centralized Query Optimizer. Traditional DBMS Plan.

paxton
Télécharger la présentation

Hash-Based Algorithms For Operator Load-Balancing In Database Middleware Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Client Client DSS1 DSS1 Q Q R R R R QSB1 QSB1 Q1 Q3 R2 Q2 R3 R1 QSB2 QSB2 QSB3 QSB3 QSB4 QSB4 6 Q Q1 Q Q2 Q3 Q Project Status R R1 R2 R R3 R Centralized Query Optimizer Traditional DBMS Plan IG1 IG1 IG2 IG2 IG3 IG3 3 2 NetTraveler Architecture WAN Scenario 4 7 8 5 Future Work & Results Proposed Work Methodology References 1 Project Description Decentralized Query Optimizer Distributed and Parallel DBMS Plan Hash-Based Algorithms For Operator Load-Balancing In Database Middleware Systems Angel L. Villalain-Garcia – M.S. Student Prof. Manuel Rodriguez – Advisor ADM Group, ECE Department, University of Puerto Rico, Email: angel.villalain@ece.uprm.edu Mayagüez Campus Conventional DBMS are usually modelled as a centralized architecture with several pipelined steps. One important step is the Query Optimizer (QO). A QO is the process that determines the needed actions to solve a query. Image below represents the most important aspects of a QO and its output. Currently the system provides an environment for remote query execution and the orchestration of the several services to bring up the result for the client in the way explain by the graphic shown below. Scientific applications are becoming highly dependant on Wide Area Network environments to access data and computing resources. Traditional database middleware systems have succeeded on integrating heterogeneous data sources but fail on scaling to WANs because they are focused on centralized architectures with static characteristics. Our goal is to design a distributed database middleware system that takes into consideration the dynamic characteristics of each host on the WAN to efficiently run queries on it. We call this system NetTraveler Our goal with NetTraveler is to design an efficient query execution environment for distributed and parallel query execution to improve response time based on the characteristics of the hosts on the WAN. Due to several reasons as cited on [5] centralized QO fails to scale to WAN environment. With that in mind we proposed the idea of a decentralized QO that could create plans that exploits parallel and distributed execution on WAN environment (see image below). Actual systems support for parallel query execution of operations and static distribution of the load across participating sites. • Study Scheduling Effect and improvements • Hash Join operators and functionalities • Second Phase Optimizer Both the query optimizer and the query parallelization step would need information of the available resources on each host. Our basic assumption is that data sources are replicated, we just need to select the better host for each step of the query execution. The result is explained on the next figure. Query Service Broker (QSB) • Coordination of the Execution Process. • Interface exposed to the client • P2P behavior Registration Server (RS) Coordinates federations and acts as a Catalog Manager. Data Processing Server (DPS) Interface for grid services and sensors Information Gateway (IG) Interface for DB access • David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, and Rick Rasmussen. “The gamma database machine project”. pages 609–626, 1994. • Sumit Ganguly, Waqar Hasan, and Ravi Krishnamurthy. Query optimization for parallel execution. In SIGMOD ’92: Proceedings of the 1992 ACM SIGMOD international conference on Management of data, pages 9–18, New York, NY, USA, 1992. ACM Press. • Wei Hong and Michael Stonebraker. Optimization of parallel query execution plans in xprs. In PDIS ’91: Proceedings of the first international conference on Parallel and distributed information systems, pages 218–225, Los Alamitos, CA, USA, 1991. IEEE Computer Society Press. • Manuel Rodriguez-Martinez, Nick Roussopoulos, “MOCHA: A Self-Extensible Database Middleware System for Distributed Data Sources”, In Proceeding of the ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, pp. 213-224, May 2000. • Waqar Hasan,Daniela Florescu, Patrick Valduriez, Open Issues in Parallel Query Optimization,SIGMOD Rec., Vol. 25 No. 3 pp. 171-179, September 1996. Data Synchronization Server (DSS) • Acts as a data source when needed. • Used for recovery of queries. • Use of different Java technology for implementing our proposed architecture: • Web Services, Axis • Data access using JDBC, Hibernate

More Related