This paper discusses the need for integrated database services within Grid computing frameworks. It emphasizes the role of Database Management Systems (DBMS) in e-Science applications, pointing out the importance of standards that enable access to existing autonomous databases. The proposed development aims to create a service-based approach within the Open Grid Services Architecture (OGSA) to ensure efficient database integration. The work highlights various functionalities such as querying, updating, and delivering structured data across distributed systems, aiming for a unified interface for diverse DBMS technologies.
Database Access and Integration Services on the Grid*
Authors: N. Paton, M. Atkinson, V. Dialani, D. Pearson, T. Storey, P. Watson
Presented by: Ariel Cary, Florida International University, School of Computing and Information Sciences, Summer 2006
* http://www.cs.man.ac.uk/grid-db/papers/dbtf.pdf
DAIS Grid
Agenda
• Introduction
• Scope and Context of Proposal
• Proposed Database Services
• Database Services in OGSA
• Current DAIS Standards and Systems
• Conclusion
Introduction
• Grid research generally focuses on applications where data is stored in files
• Database management systems (DBMSs) have a central role in data organization for numerous applications, notably in e-Science: particle physics (LHC at CERN), earth sciences, bioinformatics
• There is a need to interconnect pre-existing and independently operated databases
Introduction (cont.)
• This work seeks to encourage the development of standards that can meet those needs
• A (preliminary) proposal is made for the staged development of a collection of Grid Database Services that allow access to existing, autonomous databases within the Grid
• The proposal follows a service-based approach within the OGSA framework for DBMS integration
Introduction (cont.)
• Service definitions essentially state what functionality is to be supported
• How that functionality is implemented may differ between providers (for example, in performance characteristics)
Scope and Context of Proposal
Scope
• The proposal has several characteristics:
  • It is independent of any specific Grid toolkit (which could skew and restrict it)
  • It does not propose the development of a new DBMS for the Grid, but rather wrapping existing systems behind a consistent interface and developing distributed managers
  • It is independent of any specific data model or access language
Context
• Relevant terms related to databases:
  • A Database Service is any service that supports a database interface (WSDL)
  • Service interfaces are abstract and not prescriptive about how they are supported, or about the data model that underpins a DBMS
  • Specific DBMS services could provide access to relational or object DBMSs, XML repositories, specialist storage systems …
Context
• A Grid Database Service (GDS) provides capabilities for querying, updating and evolving a database
• The interface also describes:
  • Data delivery: transmitting structured data
  • Transactions: coordinating collections of operations
  • Database metadata: accessing information about the data a DB service provides
Proposed Database Services
Database Discovery
• A service provider publishes a description (WSDL) of a service to a service registry
• The registry is later consulted by a requestor, and a binding is created that allows calls to the service
• It is assumed that a registry lookup returns a Grid Service Handle (GSH), a globally unique name for a service instance
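The publish, lookup and bind steps can be sketched as a toy in-memory registry. This is only an illustration: the `Registry` class, its method names, the `gsh://registry.example/...` handle format and the `emp-db` service are all assumptions made for the example, not part of the proposal or of any Grid toolkit.

```python
import uuid


class Registry:
    """Toy service registry: providers publish, requestors look up and bind."""

    def __init__(self):
        self._by_name = {}    # published name -> GSH
        self._by_handle = {}  # GSH -> (description, service object)

    def publish(self, name, description, service):
        # Hypothetical handle format; a real GSH is defined by the Grid framework.
        gsh = f"gsh://registry.example/{uuid.uuid4()}"
        self._by_name[name] = gsh
        self._by_handle[gsh] = (description, service)
        return gsh

    def lookup(self, name):
        """Return the globally unique handle registered under this name."""
        return self._by_name[name]

    def bind(self, gsh):
        """Resolve a handle to something the requestor can call."""
        return self._by_handle[gsh][1]


registry = Registry()
published = registry.publish("emp-db", "<WSDL description placeholder>",
                             lambda query: f"executed: {query}")
handle = registry.lookup("emp-db")
reply = registry.bind(handle)("Select * from EMP")
```

The requestor never sees the service object directly; it only holds the handle until it needs to make a call.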
Database Statements
• Statements allow queries or change operations to be sent to a DBMS
• This implies that the underlying DBMS supports a query or command language, which differs for every data model
• Thus, statements are a point of tension with the proposal being independent of the data model
Database Statements (cont.)
• The pairs (queryNotation, query), … are introduced to allow flexibility (like MIME types for e-mail attachments)
• For example:
  • queryNotation=“SQL’92”
  • query=“Select * from EMP Where Salary>1000”
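A minimal sketch of how a service might use the notation tag to decide whether it can interpret a statement. The `Statement` class, the `accept` function and the set of supported notations are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical set of notations this particular service understands.
SUPPORTED_NOTATIONS = {"SQL'92", "XPath"}


@dataclass(frozen=True)
class Statement:
    query_notation: str  # names the language, much as a MIME type names a format
    query: str           # statement text, opaque to the generic interface


def accept(statement: Statement) -> bool:
    """Return True if this service can interpret the statement's notation."""
    return statement.query_notation in SUPPORTED_NOTATIONS


sql_stmt = Statement("SQL'92", "Select * from EMP Where Salary>1000")
oql_stmt = Statement("OQL", "select e from e in EMP")
```

The generic interface never parses `query` itself; only the notation tag is inspected, which is what keeps the interface independent of any one data model.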
Database Statements (cont.)
• The final results of an operation are managed via:
  • resultHandle: generated dynamically
  • expires: an expiry time by which the result must be claimed
• The optional txHandle indicates whether the operation is part of a transaction, provided the DBMS supports transactions
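One way to picture resultHandle and expires is a store that hands out a fresh handle per result and refuses claims after the expiry time. The `ResultStore` class and its `put` and `claim` methods are assumptions for illustration; the proposal only names the two parameters.

```python
import time
import uuid


class ResultStore:
    """Holds results under dynamically generated handles until they expire."""

    def __init__(self):
        self._results = {}  # resultHandle -> (expiry timestamp, rows)

    def put(self, rows, expires_in, now=None):
        now = time.time() if now is None else now
        handle = uuid.uuid4().hex  # resultHandle: generated dynamically
        self._results[handle] = (now + expires_in, rows)
        return handle

    def claim(self, handle, now=None):
        now = time.time() if now is None else now
        # pop() raises KeyError for unknown or already claimed handles
        expiry, rows = self._results.pop(handle)
        if now > expiry:
            raise KeyError(f"result {handle} has expired")
        return rows


store = ResultStore()
fresh = store.put([("ann", 1200)], expires_in=10, now=100.0)
stale = store.put([("bob", 1500)], expires_in=10, now=100.0)
```

The explicit `now` argument exists only so the expiry behaviour can be exercised without waiting.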
Database Statements (cont.)
• The operations on a GDS will be atomic, and proceed through:
  • Preparation and validation: consistency check
  • Application: the operation is performed
  • Result delivery: results are made available to the caller
• Operations usually involve the transfer of large amounts of data, which may take a long time to execute (prone to interruptions!)
• The implementation of the DBMS service should handle such failures to achieve atomicity
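Atomicity in the face of a mid-operation failure can be sketched with an in-memory SQLite database, used here purely as a stand-in for whatever DBMS sits behind a GDS. The `apply_atomically` helper and the `emp` table are hypothetical.

```python
import sqlite3


def apply_atomically(conn, statements):
    """Apply every statement or none of them; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for sql, params in statements:
                conn.execute(sql, params)
    except sqlite3.Error:
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT PRIMARY KEY, salary INTEGER)")
conn.commit()

ok = apply_atomically(conn, [
    ("INSERT INTO emp VALUES (?, ?)", ("ann", 1200)),
    ("INSERT INTO emp VALUES (?, ?)", ("ann", 900)),  # violates the primary key
])
remaining = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```

The second insert fails, so the first is rolled back as well and the table stays empty.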
Delivery System
• The means by which (potentially large amounts of) structured data is moved from one location to one or more others
• Should be considered complementary to protocols such as GridFTP, which could be used as a delivery mechanism
Delivery System (cont.)
• A single data source to be delivered, represented as a URI
• Several destinations, each represented by a URI with an associated delivery mechanism
• The deliver operation initiates delivery of the data from the single source to the multiple destinations
• A more elaborate delivery system would include encryption, progress monitoring, etc.
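The shape of the deliver operation, one source URI fanned out to several (URI, mechanism) destinations, might look like the sketch below. The `deliver` signature, the `memory` mechanism, the in-memory `sinks` and all URIs are assumptions for illustration; a real mechanism could be GridFTP or similar.

```python
def deliver(source_uri, destinations, fetch, mechanisms):
    """Send the data behind source_uri to every (uri, mechanism) destination."""
    data = fetch(source_uri)
    report = {}
    for dest_uri, mechanism in destinations:
        send = mechanisms[mechanism]  # pick the transport for this destination
        send(dest_uri, data)
        report[dest_uri] = mechanism
    return report


# In-memory stand-ins for the source and the transports.
sources = {"gds://example.org/results/42": b"row1\nrow2\n"}
sinks = {}


def memory_send(uri, data):
    sinks[uri] = data


report = deliver(
    "gds://example.org/results/42",
    [("mem://site-a/inbox", "memory"), ("mem://site-b/inbox", "memory")],
    fetch=sources.__getitem__,
    mechanisms={"memory": memory_send},
)
```

Encryption or progress monitoring, as mentioned for a more elaborate system, would wrap `send` rather than change this outline.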
Distributed Transactions
• A minimal transaction interface confers a guaranteed unique identity on the transaction
• Given a transaction handle, other operations over a database service can be put explicitly within the context of a transaction, using the txHandle parameter
Distributed Transactions (cont.)
• For a transaction to span multiple DBMS services, the services must provide operations for use by the transaction manager that oversees the distributed transaction
• startTransaction includes an expires parameter to limit the consumption of resources
• The prepareCommit operation can be used by a two-phase commit protocol to ensure that all participating database services commit
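A minimal two-phase commit coordinator, assuming each participating service exposes a prepareCommit-style vote plus commit and rollback operations. Only prepareCommit is named in the proposal; the `Participant` class, `commit`, `rollback` and the service names are assumptions added for the sketch.

```python
class Participant:
    """In-memory stand-in for a database service taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare_commit(self, tx_handle):
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self, tx_handle):
        self.state = "committed"

    def rollback(self, tx_handle):
        self.state = "rolled back"


def two_phase_commit(tx_handle, participants):
    # Phase 1: ask every participant whether it is able to commit.
    votes = [p.prepare_commit(tx_handle) for p in participants]
    # Phase 2: commit everywhere only if all voted yes; otherwise roll back.
    if all(votes):
        for p in participants:
            p.commit(tx_handle)
        return True
    for p in participants:
        p.rollback(tx_handle)
    return False


good = [Participant("physics-db"), Participant("bio-db")]
mixed = [Participant("physics-db"), Participant("bio-db", can_commit=False)]
good_outcome = two_phase_commit("tx-1", good)
mixed_outcome = two_phase_commit("tx-2", mixed)
```

A single refusal in the first phase is enough to roll back every participant, which is what keeps the distributed transaction all-or-nothing.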
Database Metadata
• Metadata that could usefully be made accessible includes:
  • Content description: DB schema (data model, logical and physical structures, statistics), which could be obtained from the data dictionary
  • Capability description: language (query/update operations supported), transactional capabilities, protocols supported
• The metadata should be described in a standard representation, e.g. an XML document supplied by the data service provider
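To make the idea of an XML metadata document concrete, here is a small made-up example and how a client might read it. The element and attribute names are invented for illustration; the proposal does not fix a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata document; every tag and attribute name is an assumption.
METADATA_XML = """
<databaseMetadata>
  <content dataModel="relational">
    <table name="EMP">
      <column name="Name" type="VARCHAR"/>
      <column name="Salary" type="INTEGER"/>
    </table>
  </content>
  <capabilities transactions="true">
    <queryNotation>SQL'92</queryNotation>
    <deliveryProtocol>GridFTP</deliveryProtocol>
  </capabilities>
</databaseMetadata>
""".strip()

root = ET.fromstring(METADATA_XML)

# Content description: which columns does the service expose?
columns = [col.get("name") for col in root.iter("column")]
# Capability description: which notations and transactional support?
notations = [n.text for n in root.iter("queryNotation")]
supports_tx = root.find("capabilities").get("transactions") == "true"
```

A requestor could inspect `notations` before sending a statement and `supports_tx` before passing a txHandle.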
Distributed Query Service
• A query is submitted to the Distributed Query Service (DQS)
• The query is parsed and optimized
• Sub-queries are sent to the relevant databases
• Results are collected and joined by the DQS
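The collect-and-join step can be sketched with two independent in-memory SQLite databases standing in for autonomous database services. Parsing and optimization are omitted; the sub-queries are written by hand, and the table names, rows and helper functions are assumptions for the example.

```python
import sqlite3


def make_db(schema, insert, rows):
    """Create an independent in-memory database with some sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    conn.executemany(insert, rows)
    return conn


staff_db = make_db("CREATE TABLE emp (name TEXT, dept_id INTEGER)",
                   "INSERT INTO emp VALUES (?, ?)",
                   [("ann", 1), ("bob", 2)])
dept_db = make_db("CREATE TABLE dept (id INTEGER, title TEXT)",
                  "INSERT INTO dept VALUES (?, ?)",
                  [(1, "physics"), (2, "biology")])


def distributed_join():
    # Sub-queries go to the database that holds each table.
    emps = staff_db.execute("SELECT name, dept_id FROM emp").fetchall()
    depts = dict(dept_db.execute("SELECT id, title FROM dept").fetchall())
    # The query service joins the partial results itself.
    return sorted((name, depts[dept_id])
                  for name, dept_id in emps if dept_id in depts)


joined = distributed_join()
```

Neither database knows about the other; the join happens entirely in the coordinating service.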
Database Services in OGSA
Database Services in OGSA
• The Open Grid Services Architecture (OGSA) represents an evolution towards a Grid system architecture based on Web services concepts and technologies*
• The described interfaces can be used as the basis of database services through participation in the OGSA
• Many features of this architectural framework thus become available: service creation, authorization, notification, etc.
* http://www.globus.org/ogsa
Requirements from OGSA
• The secure connection and authentication mechanism underpins all GDS security and authentication
• The lifetime management model carries over unchanged as the lifetime management model for GDS
• The notification mechanism specified in OGSA appears to satisfy the GDS needs
Requirements from OGSA (cont.)
• Information about user authorization is required (potentially passed through many intermediate grid services)
• User identification services, referenced from a certificate
• Certification of the services themselves may be necessary: otherwise a discovery service could be tricked into returning a service that mimics the intended GDS and receives the data
• Some databases charge for their use, so a digital payment process needs to be supported
Current DAIS Standards and Systems
DAIS Standards
• Global Grid Forum
  • “The Global Grid Forum (GGF) is the community of users, developers, and vendors leading the global standardization effort for grid computing.” http://www.ggf.org/
• Part of the GGF: DAIS-WG
  • “The group seeks to promote standards for the development of grid database services, focusing principally on providing consistent access to existing, autonomously managed databases.” https://forge.gridforum.org/projects/dais-wg
OGSA-DAI System
• “The aim of the OGSA-DAI project is to develop middleware to assist with access and integration of data from separate sources via the grid…and is working closely with the Global Grid Forum DAIS-WG...” http://www.ogsadai.org/
• OGSA-DAI Overview: http://www.ggf.org/GGF17/materials/303/Overview.ppt
• Architecture + Extensibility: http://www.ggf.org/GGF17/materials/303/GGF17ArchitectureExtensibility.ppt
• Supported Data Resources: http://www.ggf.org/GGF17/materials/303/GGF17ArchitectureExtensibility.ppt
Conclusion
Conclusion
• This document has made a preliminary, service-oriented proposal for integrating database functionality into a Grid setting
• It is hoped that the document will provoke discussion on how databases can best be integrated with Grid middleware
• There is an established community dedicated to defining DBMS service standards, and emerging systems are adopting them