
NAREGI Data Grid: Efficient Data Management System for Grid Environments

NAREGI Data Grid is the data management layer (WP4) of the beta NAREGI software stack for grid environments. It consists of a data access management system, a metadata management system, and a data resource management system, built on a grid file system (AIST Gfarm). It supports file import into workflows, file staging, file retrieval by metadata, and grid-wide data sharing, with certificate-based access security.

Presentation Transcript


  1. NAREGI WP4 (Data Grid Environment). Hideo Matsuda, Osaka University

  2. NAREGI Software Stack (Beta Ver. 2006)
     [Diagram labels, in extraction order: Grid-Enabled Nano-Applications (WP6); Grid PSE; Grid Visualization; Grid Programming (WP2): Grid RPC, Grid MPI; WP3; Grid Workflow (WFML (Unicore + WF)); Distributed Information Service (CIM); Data (WP4); Super Scheduler; WP1; Packaging; WP4; (WSRF (GT4 + Fujitsu WP1) + GT4 and other services); Grid VM (WP1); Grid Security and High-Performance Grid Networking (WP5); SuperSINET; NII; Research Organizations; IMS; Major University Computing Centers; Computing Resources and Virtual Organizations]

  3. NAREGI Data Grid Overview
     Composed of three parts:
     • Data Access Management System
     • Metadata Management System
     • Data Resource Management System
     Functionalities:
     • Grid File System using AIST Gfarm: logical-to-physical file name mapping, file replication, etc.
     • File import to WFT (Workflow Tool)
     • File staging support for SS (Super Scheduler)
     • File retrieval by metadata: stores file attributes (owner, creation date, and creation conditions such as which application created the file), with SQL access to the Metadata DB (OGSA-DAI)

  4. NAREGI Data Grid Architecture (beta 1)
     [Diagram labels: User; WFT; SS; computation nodes, each with a Job and a File; File Import; Data Access Management; Metadata Management; Metadata DB; File Mapping; File Staging or File Access; Data Resource Management; file servers Fileserver#1 to Fileserver#3; files File1 to File3; Grid File System (AIST Gfarm)]

  5. NAREGI Data Grid Functionalities
     [Diagram labels: Grid Workflow (Job 1, Job 2, ..., Job n); Data Grid Components (Data 1, Data 2, ..., Data n); Import data into workflow; Data Access Management: place & register data on the Grid; Metadata Management: assign metadata to data; Grid-wide Data Sharing Service (metadata attached to each job's data); Data Resource Management: store data into distributed file nodes; Grid-wide File System]

  6. NAREGI WP4: Standards Employed in the Architecture
     [Diagram labels, in extraction order: Workflow (NAREGI WFML => BPEL + JSDL); Data Access Management; Import data into workflow; Tomcat 5.0.28; Place data on the Grid; Super Scheduler (SS) (OGSA-RSS); Globus Toolkit 4.0.1; Data Staging; OGSA-RSS FTS SC; Metadata Construction; Computational Nodes; GridFTP; Data Resource Management; OGSA-DAI WSRF2.0; PostgreSQL 8.0; Data Specific Metadata DB; Data Resource Information DB; Filesystem Nodes; Gfarm 1.2 PL4 (Grid FS)]

  7. Grid File System
     • Adoption of AIST Gfarm 1.2.9 (http://datafarm.apgrid.org/index.en.html)
     • UNIX file I/O API (open, close, read, write, etc.) and file/directory operations (mkdir, rmdir, ls, etc.)
     • Single virtual filesystem (logical-to-physical file name mapping)
       - Components: client library, file servers, metadata server (mapping & space balancing)
       - Maps “/gfarm/file1” to “HOST:/GFARM_DIR/file1”
       - File space is extensible by adding more file servers
     • File replication to user-specified hosts
     • File metadata (managed with PostgreSQL): file location (host, pathname), username, access/modify/creation dates, size
     • Access security with the GT4 GSI API: certificate-based file access, access using gridmapfile
     • WSRF interface to file metadata with OGSA-DAI
     • Access control not yet implemented (planned for Gfarm 2.0)
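
The logical-to-physical file name mapping on this slide can be illustrated with a small sketch. This is not Gfarm's implementation or API: the class `ReplicaCatalog`, its methods, and the host names and directories are hypothetical, and the only thing taken from the slide is the mapping shape “/gfarm/file1” to “HOST:/GFARM_DIR/file1” plus the idea that a file may be replicated to several hosts.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Toy stand-in for a metadata server's logical-to-physical table.

    Hypothetical illustration only; not the Gfarm metadata server.
    """

    # host name -> local directory on that file server (the GFARM_DIR role)
    spool_dirs: dict
    # logical path -> hosts that hold a copy of the file
    replicas: dict = field(default_factory=dict)

    def register(self, logical: str, host: str) -> None:
        """Record that `host` holds a copy of the logical file."""
        if host not in self.spool_dirs:
            raise KeyError(f"unknown file server: {host}")
        hosts = self.replicas.setdefault(logical, [])
        if host not in hosts:
            hosts.append(host)

    def resolve(self, logical: str) -> list:
        """Return every physical location as 'HOST:/dir/name'."""
        prefix = "/gfarm/"
        if not logical.startswith(prefix):
            raise ValueError("logical paths must start with /gfarm/")
        relative = logical[len(prefix):]
        return [
            f"{host}:{self.spool_dirs[host]}/{relative}"
            for host in self.replicas.get(logical, [])
        ]
```

Registering “/gfarm/file1” on two hypothetical servers and then calling `resolve("/gfarm/file1")` returns one physical location per replica, while adding a new entry to `spool_dirs` corresponds to extending the file space with another file server.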

  8. File Import & Staging
     • Files in the GFS (Grid File System) are imported into user workflows.
     • Jobs can use imported files from their local disks (file staging).
     • The SS transfers GFS files to the job nodes.
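
File staging as described here amounts to copying the imported files onto the job node's local disk before the job runs. The sketch below simulates that on a single machine with ordinary directories. It is an assumption-laden illustration: `stage_in` and `demo` are hypothetical names, a local directory stands in for the grid file system, and a plain file copy stands in for the transfer that the Super Scheduler performs in the real system.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in(gfs_root: Path, logical_paths: list, job_dir: Path) -> dict:
    """Copy each imported file into the job's local working directory.

    `gfs_root` is a local directory standing in for the grid file system.
    Files are staged under their base name, so two logical paths with the
    same file name would overwrite each other in this simplified version.
    """
    job_dir.mkdir(parents=True, exist_ok=True)
    staged = {}
    for logical in logical_paths:
        source = gfs_root / logical.lstrip("/")
        if not source.is_file():
            raise FileNotFoundError(f"not in grid file system: {logical}")
        target = job_dir / source.name
        shutil.copy2(source, target)
        staged[logical] = target
    return staged


def demo() -> str:
    """Stage one file and return what the job would read from local disk."""
    with tempfile.TemporaryDirectory() as tmp:
        gfs_root = Path(tmp) / "gfs"
        (gfs_root / "gfarm").mkdir(parents=True)
        (gfs_root / "gfarm" / "file1").write_text("input data")
        staged = stage_in(gfs_root, ["/gfarm/file1"], Path(tmp) / "job1")
        return staged["/gfarm/file1"].read_text()
```

The returned dictionary maps each logical name to its staged local path, which is the information a job needs to open its inputs without knowing where the file lives on the grid.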

  9. File Metadata Management
     • Many applications (or many instances of the same application with different parameters) can be coupled in a workflow.
     • These applications need many input and output files.
     • The system provides a metadata DB that stores application and parameter information as file attributes.
     • Users can retrieve files on the GFS by their metadata.
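
Retrieval by metadata can be pictured as a query against a flat attribute table. The sketch below uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in; the real system uses PostgreSQL accessed through OGSA-DAI. The table name `file_metadata`, its columns, and the helper functions are assumptions made for illustration, not the NAREGI schema.

```python
import sqlite3


def build_metadata_db() -> sqlite3.Connection:
    """Create an in-memory flat table of file attributes (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE file_metadata (
            logical_path TEXT PRIMARY KEY,
            owner        TEXT NOT NULL,
            created      TEXT NOT NULL,
            application  TEXT NOT NULL,
            parameters   TEXT NOT NULL
        )
        """
    )
    return conn


def register_file(conn, logical_path, owner, created, application, parameters):
    """Attach application and parameter information to a file."""
    conn.execute(
        "INSERT INTO file_metadata VALUES (?, ?, ?, ?, ?)",
        (logical_path, owner, created, application, parameters),
    )


def find_files(conn, application, parameters=None):
    """Return logical paths of files created by `application`.

    If `parameters` is given, only files produced with exactly that
    parameter string are returned.
    """
    sql = "SELECT logical_path FROM file_metadata WHERE application = ?"
    args = [application]
    if parameters is not None:
        sql += " AND parameters = ?"
        args.append(parameters)
    sql += " ORDER BY logical_path"
    return [row[0] for row in conn.execute(sql, args)]
```

With several runs of the same application registered under different parameter strings, `find_files(conn, "md_sim")` lists all of that application's files, and passing a parameter string narrows the result to one run.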

  10. [Deployment diagram, untitled. Labels in extraction order: Metadata Node; User Client; Computation Node; GridVM; GridFTP Server; Job; Local Disk; Gfarm Client; PostgreSQL; Metadata DB; OGSA-DAI; Portal Server Node; Metadata Management Frontend; OGSA-DAI Client; Gfarm Node; Gfarm Server; Data Access Management Frontend; Local Disk; File-Import Tool; Data Resource Management Frontend; OGSA-DAI Client; Gfarm Node; Gfarm Server; Data Access Management Client; Local Disk; Workflow Tool Client; Gfarm Staging Node; GridFTP Server; Local Disk; Workflow Tool (WFT) Node; WFT; File Staging / Import Interface; GRAM4; Super Scheduler (SS) Node; File Staging Tool; Gfarm Client; SS; OGSA-DAI; PostgreSQL; Gfarm DB]

  11. EGEE-NAREGI Interoperation Issues in Data Management
     • Data Transfer
       - Both groups use GridFTP?
     • Storage Element
       - NAREGI does not have this. Adoption of SRMv2.1? Or SRB?
     • Replication Catalog
       - NAREGI only provides simple file replication (flat table). Adoption of LFC? Or RLS?
     • Metadata Catalog
       - NAREGI metadata schema and structure are not fixed yet. Again, it is just a flat table. Adoption?
