This document explores the concept of Data Grid Federation, which facilitates the controlled sharing of files and resources across multiple data grids, particularly in the context of the BaBar experiment at SLAC and in Lyon, France. It details the mechanisms for managing access control, metadata consistency, and updates within federated environments. The paper discusses various federation models, constraints, and technologies used, including SRB zones, metadata catalogs, and the integration of multiple databases and file systems. Key insights into federation approaches and management practices are presented, contributing to advancements in collaborative data management.
Data Grid Federation
Arcot Rajasekar, Michael Wan, Reagan Moore
(sekar, mwan, moore)@sdsc.edu
Data Grid Federation
- Data grids provide the ability to name, organize, and manage data on distributed storage resources
- Use case: BaBar experiment
  - Implement independent data grids at SLAC and Lyon, France
  - Implement a federation environment for controlled sharing of files between the data grids
- Data grid federation provides a way to share resources, user names, data, and metadata between multiple data grids (Virtual Organizations)
Federation Constraints
- Cross-register a digital entity from one collection into another
- Who manages the access control lists?
- Who maintains the context (metadata)?
- Consistency constraints on updates
- Who manages the updates (system or individual)?
- Do differences in constraints lead to standard federation approaches?
- What types of federations are possible?
Data Grids - SRB Zones
- Each SRB zone (data grid) uses a metadata catalog (MCAT) to manage the context associated with digital content
- Context includes:
  - Storage resource names
  - User names
  - Logical name space for files
  - Administrative, descriptive, and integrity attributes
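To make the four kinds of context concrete, here is a minimal sketch of what one zone's catalog tracks. It is an illustration only, not the MCAT schema or the SRB API: the class `ZoneCatalog`, the method `register_file`, and the sample resource, user, and path names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ZoneCatalog:
    """Toy stand-in for the context one zone's catalog holds (hypothetical)."""
    zone: str
    resources: set = field(default_factory=set)  # storage resource names
    users: set = field(default_factory=set)      # user names
    files: dict = field(default_factory=dict)    # logical path -> attributes

    def register_file(self, logical_path, resource, owner, **attributes):
        """Add a file to the logical name space with its attributes."""
        if resource not in self.resources:
            raise ValueError(f"unknown resource: {resource}")
        if owner not in self.users:
            raise ValueError(f"unknown user: {owner}")
        self.files[logical_path] = {
            "resource": resource,  # administrative attribute
            "owner": owner,        # administrative attribute
            **attributes,          # descriptive / integrity attributes
        }


# Hypothetical example values
slac = ZoneCatalog(zone="slac")
slac.resources.add("slac-disk-1")
slac.users.add("sekar")
slac.register_file("/slac/babar/run1.dat", "slac-disk-1", "sekar",
                   checksum="abc123")
```

The logical path is the key, so the file's name in the catalog stays independent of the storage resource that holds it.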
SRB Federation Constraints
Mechanisms to impose consistency and access constraints on:
- Resources
  - Controls on which zones may use a resource
- User names (user-name / domain / SRB-zone)
  - Users may be registered into another domain, but retain their home zone, similar to Shibboleth
- Data files
  - Controls on who specifies replication of data
- Context metadata
  - Controls on who manages updates to metadata
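The user-name and resource constraints above can be sketched as follows. This is an assumption-laden illustration, not SRB's implementation: the `/`-separated string form, the names `ZoneUser`, `parse_user`, `may_use`, and the `RESOURCE_ZONES` policy table are all hypothetical.

```python
from typing import NamedTuple


class ZoneUser(NamedTuple):
    """A user identified by name, domain, and home zone (hypothetical type)."""
    name: str
    domain: str
    home_zone: str


def parse_user(qualified: str) -> ZoneUser:
    """Split an assumed 'user-name / domain / SRB-zone' string into its parts."""
    parts = [p.strip() for p in qualified.split("/")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'name / domain / zone', got {qualified!r}")
    return ZoneUser(*parts)


# Hypothetical policy: which zones may use each storage resource
RESOURCE_ZONES = {
    "lyon-tape-1": {"lyon", "slac"},  # shared with the remote zone
    "slac-disk-1": {"slac"},          # local use only
}


def may_use(user: ZoneUser, resource: str) -> bool:
    """Allow access only if the user's home zone is permitted on the resource."""
    return user.home_zone in RESOURCE_ZONES.get(resource, set())


user = parse_user("sekar / sdsc / slac")
```

Because the home zone travels with the identity, a user registered in another zone is still evaluated against their home zone, which is the property the slide compares to Shibboleth.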
Data Grid Federation - zoneSRB
[Architecture diagram; layers as labeled on the slide]
- Application
- Access interfaces: C, C++, Java Libraries; Unix Shell; DLL / Python, Perl; Linux I/O; OAI, WSDL, OGSA; Java, NT Browsers; HTTP
- Federation Management
- Consistency & Metadata Management / Authorization-Authentication-Audit
- Logical Name Space; Latency Management; Data Transport; Metadata Transport
- Catalog Abstraction; Storage Abstraction
- Databases: DB2, Oracle, Sybase, Postgres, mySQL, Informix
- Databases: DB2, Oracle, Sybase, SQLserver, Postgres, mySQL, Informix
- File Systems: Unix, NT, Mac OSX
- Archives: Tape, HPSS, ADSM, UniTree, DMF, CASTOR, ADS
- ORB; SRM
Types of Federation
- Occasional Interchange: for specified users
- Replicated Catalogs: entire state information replication
- Resource Interaction: share resources
- Replicated Data Zones: no user interactions between zones
- Master-Slave Zones: slaves replicate data from master zone
- Snow-Flake Zones: hierarchy of data replication zones
- User / Data Replica Zones: user access from remote to home zone
- Nomadic Zones “SRB in a Box”: synchronize local zone to parent zone
- Free-floating “myZone”: synchronize without a parent zone
- Archival “BackUp Zone”: synchronize to an archive
SRB Version 3.0.1 released December 19, 2003
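As one illustration of these types, the master-slave case ("slaves replicate data from master zone") can be sketched as a one-way copy. This is not SRB's replication mechanism: the version counter used to decide what is stale, the dictionary layout, and the function name `replicate_from_master` are assumptions made for the example.

```python
def replicate_from_master(master: dict, slave: dict) -> list:
    """Copy entries the slave lacks or holds in an older version.

    Both catalogs map a logical path to a (version, payload) tuple.
    Data flows one way only: the master is never modified.
    Returns the sorted list of paths that were copied.
    """
    copied = []
    for path, (version, payload) in master.items():
        if path not in slave or slave[path][0] < version:
            slave[path] = (version, payload)
            copied.append(path)
    return sorted(copied)


# Hypothetical catalogs for a master zone and one slave zone
master_zone = {
    "/babar/run1.dat": (2, "data-v2"),
    "/babar/run2.dat": (1, "data-v1"),
}
slave_zone = {
    "/babar/run1.dat": (1, "data-v1"),  # stale copy
}

updated = replicate_from_master(master_zone, slave_zone)
```

A snow-flake arrangement could be approximated by chaining the same call down a hierarchy, with each slave acting as master for the next level.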
Characterizing federation approaches (1536 possible combinations)
Federation Approaches
[Diagram of federation approaches; labels only]
- Zone types: Peer-to-Peer Zones, Free Floating, Occasional Interchange, Replicated Data, Resource Interaction, Nomadic, User and Data Replica, Snow Flake, Master Slave, Replicated Catalog, Archival, Replication Zones, Hierarchical Zones
- User-ID controls: Partial User-ID Sharing, One Shared User-ID, Complete User-ID Sharing, Connection From Any Zone
- Resource controls: No Resource Sharing, Partial Resource Sharing, Complete Resource Sharing
- Metadata controls: No Metadata Synch, System Controlled Partial Metadata Synch, System Controlled Complete Metadata Synch, System Controlled Complete Synch
- Other controls: Hierarchical Zone Organization, System Set Access Controls, System Managed Replication, Super Administrator Zone Control
For More Information
Reagan W. Moore
San Diego Supercomputer Center
moore@sdsc.edu
http://www.npaci.edu/DICE
http://www.npaci.edu/DICE/SRB
http://www.npaci.edu/dice/srb/mySRB/mySRB.html