This paper presents an innovative approach to data replication in grid environments via stripped replication (SR) methods, implemented as web services. By creating multiple copies of data across the grid infrastructure, we enhance data availability. The SR technique involves transferring file portions in parallel from various storage elements, significantly speeding up the replication process. We discuss our prototype implementation, experimental results, and optimizations, highlighting the adaptability of SR to varying network conditions. Future work aims to improve the management of the SR process.
Stripped replication for the Grid environment as a web service
Marek Ciglan, Ondrej Habala, Ladislav Hluchý
Institute of Informatics, Slovak Academy of Sciences
CGW 04, Stripped replication for the grid environment as a web service
Overview
• Replication in Grid environment
• Principles of stripped replication (SR) method
• Optimization of stripped replication
• Prototype Implementation as a Web Service
• Experimental Results
• Future Work
Replication in Grid environment
• Creation of multiple copies of a single data source across the Grid infrastructure
• Replication increases data availability
• RLS - Replica Location Service
• Grid monitoring services – network monitoring
Replication in Grid environment
[Diagram, built up over four slides: File 1 starts on one Storage Element and is copied until Storage Elements 1, 2 and 3 each hold a replica.]
Stripped Replication - Principles
• Transfer from multiple Grid sites, in parallel
• Transfer only a portion of the file from each Storage Element (SE)
• Different file portions (stripes) are obtained from different SEs
• Parallel transfer increases replication speed
• If SR is not managed properly, the process can be time-consuming
• Optimization of SR management is required
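The principle of fetching a different stripe from each SE can be illustrated with a minimal Java sketch. It only plans the byte ranges; it performs no transfer. The class `StripePlanner`, its `Stripe` type, and the near-equal split are assumptions made for illustration and are not taken from the prototype.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and the equal-split policy are assumptions.
class StripePlanner {
    /** A contiguous byte range [offset, offset + length) fetched from one source SE. */
    static final class Stripe {
        final int sourceIndex;
        final long offset;
        final long length;

        Stripe(int sourceIndex, long offset, long length) {
            this.sourceIndex = sourceIndex;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return "SE " + sourceIndex + ": offset=" + offset + ", length=" + length;
        }
    }

    /** Splits fileSize bytes into one near-equal stripe per source. */
    static List<Stripe> plan(long fileSize, int sources) {
        if (fileSize < 0 || sources <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and sources > 0");
        }
        List<Stripe> stripes = new ArrayList<>();
        long base = fileSize / sources;
        long remainder = fileSize % sources;
        long offset = 0;
        for (int i = 0; i < sources; i++) {
            // The first 'remainder' stripes take one extra byte so the lengths sum to fileSize.
            long length = base + (i < remainder ? 1 : 0);
            stripes.add(new Stripe(i, offset, length));
            offset += length;
        }
        return stripes;
    }

    public static void main(String[] args) {
        for (Stripe s : plan(1000, 3)) {
            System.out.println(s);
        }
    }
}
```

A fixed split like this is the simplest case. It is also the case the slide warns about: if one SE is slow, its stripe delays the whole replication, which is why SR management needs optimization.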
Stripped Replication - Optimization
[Diagram, built up over five slides: a replicated data source and its three replicas (Replica 1, Replica 2, Replica 3); the sequence Replica 1, Replica 2, Replica 3 repeats as the animation progresses.]
SR Prototype Implementation
• Java programming language
• CoG 1.2 API (GridFTP interface)
• Integrated with EDG Replica Location Service
• EDG RLS API (RLS interface)
• File Chunks – basic data units for transfer
• Implemented as a Web Service (motivation: OGSA, WSRF)
Service Workflow
[Diagram, built up over seven slides, showing the steps of the Stripped Replication Service:]
• Input: LFN (logical file name)
• Get GUID – Replica Metadata Catalog
• Get PFNs – Local Replica Catalog
• Stripped Replication Algorithm – transfers from GridFTP Site 1 . . . GridFTP Site N
• Register Replica
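The workflow order (LFN, Get GUID, Get PFNs, Stripped Replication Algorithm, Register Replica) can be sketched in Java as follows. The three interfaces are hypothetical stand-ins invented for this sketch; they are not the EDG RLS API or the CoG GridFTP API, and registering the new replica in the Local Replica Catalog is an assumption.

```java
import java.util.List;

// Illustrative only: every interface below is hypothetical, not a real Grid API.
class StrippedReplicationWorkflow {
    /** Hypothetical lookup: logical file name (LFN) to GUID. */
    interface ReplicaMetadataCatalog {
        String getGuid(String lfn);
    }

    /** Hypothetical lookup: GUID to physical file names (PFNs), plus registration. */
    interface LocalReplicaCatalog {
        List<String> getPfns(String guid);
        void registerReplica(String guid, String pfn);
    }

    /** Hypothetical striped transfer from several source PFNs to one target PFN. */
    interface StripedTransfer {
        void fetch(List<String> sourcePfns, String targetPfn);
    }

    /** Runs the workflow steps in the order shown on the slide and returns the GUID. */
    static String replicate(String lfn,
                            String targetPfn,
                            ReplicaMetadataCatalog rmc,
                            LocalReplicaCatalog lrc,
                            StripedTransfer transfer) {
        String guid = rmc.getGuid(lfn);            // Get GUID
        List<String> sources = lrc.getPfns(guid);  // Get PFNs
        if (sources.isEmpty()) {
            throw new IllegalStateException("no replicas registered for " + lfn);
        }
        transfer.fetch(sources, targetPfn);        // Stripped Replication Algorithm
        lrc.registerReplica(guid, targetPfn);      // Register Replica (assumed to go to the LRC)
        return guid;
    }
}
```

Keeping the catalogs and the transfer behind small interfaces lets the ordering be exercised with in-memory stubs before any Grid service is involved.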
Properties of Stripped Replication
• Parallel transfer from multiple sites increases the speed of the replication process
• Proposed optimization does not use network monitoring services
• SR adapts to the varying nature of network load
• SR optimally distributes network load
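One way a transfer could adapt to network load without a monitoring service is to let each source take the next file chunk as soon as it finishes its previous one, so faster sources naturally carry more chunks. The Java sketch below simulates that policy deterministically. The policy itself, the class `ChunkScheduler`, and the per-source `secondsPerChunk` figures are assumptions for illustration; the slides do not state that the prototype works this way.

```java
import java.util.PriorityQueue;

// Illustrative only: a simulated pull-based chunk assignment, not the authors' algorithm.
class ChunkScheduler {
    /**
     * Simulates handing out chunks one at a time to whichever source becomes idle first.
     * secondsPerChunk[i] is an assumed constant transfer time per chunk for source i.
     * Returns the number of chunks each source ends up transferring.
     */
    static int[] simulate(int totalChunks, double[] secondsPerChunk) {
        if (totalChunks < 0 || secondsPerChunk.length == 0) {
            throw new IllegalArgumentException("need totalChunks >= 0 and at least one source");
        }
        int[] counts = new int[secondsPerChunk.length];
        // Queue entries: {time at which the source becomes idle, source index}.
        // Ties on time are broken by the lower source index.
        PriorityQueue<double[]> idle = new PriorityQueue<>((a, b) -> {
            int byTime = Double.compare(a[0], b[0]);
            return byTime != 0 ? byTime : Double.compare(a[1], b[1]);
        });
        for (int i = 0; i < secondsPerChunk.length; i++) {
            idle.add(new double[] {0.0, i});
        }
        for (int chunk = 0; chunk < totalChunks; chunk++) {
            double[] next = idle.poll();
            int source = (int) next[1];
            counts[source]++;
            idle.add(new double[] {next[0] + secondsPerChunk[source], source});
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = simulate(12, new double[] {1.0, 2.0, 4.0});
        for (int i = 0; i < counts.length; i++) {
            System.out.println("source " + i + " transferred " + counts[i] + " chunks");
        }
    }
}
```

In this simulation the split follows observed completion times alone, which matches the slide's point that no network monitoring service is consulted.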
Experimental Results
• Motivation test case
  • File size 223.9Mb
  • Best replica transfer with standard replication tool (EDG rm) – 713 sec
  • Stripped replication (2 replicas) – 405 sec (43% time saving)
  • Stripped replication (3 replicas) – 209 sec (71% time saving)
• Average time saving
  • 2 replicas – 37% time saving
  • 3 replicas – 55% time saving
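The percentages quoted for the motivation test case are consistent with time saved relative to the 713 sec baseline, that is, (baseline - striped) / baseline. A small Java check of that arithmetic, using only the timings from the slide (the class and method names are illustrative):

```java
// Illustrative only: recomputes the time savings from the timings quoted on the slide.
class TimeSaving {
    /** Percentage of time saved relative to the baseline transfer. */
    static double savingPercent(double baselineSeconds, double stripedSeconds) {
        return 100.0 * (baselineSeconds - stripedSeconds) / baselineSeconds;
    }

    public static void main(String[] args) {
        double baseline = 713.0; // best single-replica transfer with EDG rm
        System.out.printf("2 replicas: %.1f%%%n", savingPercent(baseline, 405.0));
        System.out.printf("3 replicas: %.1f%%%n", savingPercent(baseline, 209.0));
    }
}
```

Rounded to whole numbers, the two results agree with the 43% and 71% shown on the slide.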
Future Work
• Implementation refinement
• Add logging functionality
• Refine error-state handling
• Evaluation of SR integration in Grid projects
Thank you for your attention!