
Data Grid Services for PNC




Presentation Transcript


  1. Data Grid Services for PNC
     Wei-Long, Ueng
     Academia Sinica Grid Computing Center
     wlueng@twgrid.org

  2. Outline
     • Introduction
     • Data Grid Architecture
     • Common Data Grid Services
     • Application Integration
     • Summary

  3. Information Management Technologies
     • Data collecting
       • Sensor systems, object ring buffers and portals
     • Data organization
       • Collections, manage data context
     • Data sharing
       • Data grids, manage heterogeneity
     • Data publication
       • Digital libraries, support discovery
     • Data preservation
       • Persistent archives, manage technology evolution
     • Data analysis
       • Processing pipelines, manage knowledge extraction

  4. Managing Data
     • Historically, data has been STORED rather than MANAGED
     • Problems arising from this include:
       • Scaling
       • Distribution
       • Access Control, Authentication, Security
       • Data Migration
       • Data Creation

  5. Data Management Concepts
     • Collection
       • The organization of digital entities to simplify management and access.
     • Context
       • The information that describes the data objects in a collection.
     • Content
       • The data objects in a collection.
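To make the three terms concrete, here is a minimal Python sketch of how a collection might keep content and context apart. The class names (`DataObject`, `Collection`) and the sample metadata are assumptions made for illustration; the slides do not describe an implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """Content: one digital entity (hypothetical model, not an SRB type)."""
    name: str
    payload: bytes


@dataclass
class Collection:
    """Collection: a named grouping of data objects plus their context."""
    name: str
    content: dict = field(default_factory=dict)  # object name -> DataObject
    context: dict = field(default_factory=dict)  # object name -> metadata

    def add(self, obj: DataObject, **metadata) -> None:
        # Content and context are stored separately, keyed by the same name.
        self.content[obj.name] = obj
        self.context[obj.name] = dict(metadata)

    def describe(self, name: str) -> dict:
        # Context can be read without touching the content itself.
        return self.context[name]


# Illustrative values only.
archive = Collection("rare-books")
archive.add(DataObject("page-001.tif", b"\x00\x01"), creator="scanner-A", year=1905)
description = archive.describe("page-001.tif")
```

Keeping context in its own mapping is what later allows metadata to be queried and managed independently of where the bytes live.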

  6. Data Management Challenges
     • Distributed data sources
       • Management across administrative domains
     • Heterogeneity
       • Multiple types of storage repositories
     • Scalability
       • Support for billions of digital entities and petabytes of data
     • Preservation
       • Management of technology evolution

  7. Data Grids
     • Distributed data sources
       • Inter-realm authentication and authorization
     • Heterogeneity
       • Storage repository abstraction
     • Scalability
       • Differentiation between context and content management
     • Preservation
       • Support for automated processing (migration, archival processes)

  8. Data Grid Transparencies
     • Find data without knowing the identifier
       • Descriptive attributes
     • Access data without knowing the location
       • Logical name space
     • Access data without knowing the type of storage
       • Storage repository abstraction
     • Retrieve data using your preferred API
       • Access abstraction
     • Provide transformations for any data collection
       • Data behavior abstraction
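The first three transparencies can be illustrated with a small Python sketch: a catalog maps logical names to physical replicas, keeps descriptive attributes for discovery, and reaches storage through a driver interface. Everything here (`Catalog`, `MemoryStore`, the resource name, and the paths) is hypothetical and is not the SRB API.

```python
class MemoryStore:
    """Stand-in for one kind of storage repository (hypothetical driver)."""

    def __init__(self):
        self._blobs = {}

    def write(self, path, data):
        self._blobs[path] = data

    def read(self, path):
        return self._blobs[path]


class Catalog:
    """Maps logical names to replicas and attributes (illustrative only)."""

    def __init__(self):
        self._stores = {}      # resource name -> storage driver
        self._replicas = {}    # logical name -> [(resource, physical path)]
        self._attributes = {}  # logical name -> descriptive attributes

    def register_store(self, resource, driver):
        self._stores[resource] = driver

    def put(self, logical_name, data, resource, physical_path, **attributes):
        self._stores[resource].write(physical_path, data)
        self._replicas.setdefault(logical_name, []).append((resource, physical_path))
        self._attributes.setdefault(logical_name, {}).update(attributes)

    def get(self, logical_name):
        # The caller supplies only the logical name; location and storage
        # type are resolved here (location and storage transparency).
        resource, path = self._replicas[logical_name][0]
        return self._stores[resource].read(path)

    def find(self, **wanted):
        # Discovery by descriptive attributes instead of by identifier.
        return sorted(
            name
            for name, attrs in self._attributes.items()
            if all(attrs.get(key) == value for key, value in wanted.items())
        )


catalog = Catalog()
catalog.register_store("disk-taipei", MemoryStore())
catalog.put("/pnc/maps/taipei-1920", b"map-bytes", "disk-taipei",
            "/vault/0001.dat", theme="gis", year=1920)
names = catalog.find(theme="gis")
data = catalog.get(names[0])
```

A second driver class with the same `read`/`write` methods could be registered under another resource name without changing any caller, which is the point of the storage repository abstraction.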

  9. Data Grid Goals
     • Automate all aspects of data analysis
       • Data discovery
       • Data access
       • Data transport
       • Data manipulation
     • Automate all aspects of data collections
       • Metadata generation
       • Metadata organization
       • Metadata management
       • Preservation

  10. Data Grid Components
     • Federated client-server architecture
       • Servers can talk to each other independently of the client
     • Infrastructure independent naming
       • Logical names for users, resources, files, applications
     • Collective ownership of data
       • Collection-owned data, with infrastructure independent access control lists
     • Context management
       • Record state information in a metadata catalog from data grid services such as replication
     • Abstractions for dealing with heterogeneity
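The context management bullet can be sketched as follows: each data grid operation, such as replication, writes its resulting state into a catalog table. This Python example uses an in-memory SQLite database purely as a stand-in for the metadata catalog; the table layout, function names, and status values are assumptions, not the MCAT schema.

```python
import sqlite3

# In-memory SQLite stands in for the metadata catalog (assumption).
catalog = sqlite3.connect(":memory:")
catalog.execute(
    "CREATE TABLE replica ("
    " logical_name TEXT, resource TEXT, status TEXT,"
    " PRIMARY KEY (logical_name, resource))"
)


def register(logical_name, resource):
    """Record that a current copy of the object exists on a resource."""
    with catalog:  # commits on success, rolls back on error
        catalog.execute(
            "INSERT INTO replica VALUES (?, ?, 'current')",
            (logical_name, resource),
        )


def replicate(logical_name, target_resource):
    """Replication service: the byte copy is omitted; only state is recorded."""
    register(logical_name, target_resource)


def mark_stale_except(logical_name, updated_resource):
    """After an update on one resource, flag the other replicas as stale."""
    with catalog:
        catalog.execute(
            "UPDATE replica SET status = 'stale'"
            " WHERE logical_name = ? AND resource <> ?",
            (logical_name, updated_resource),
        )


def replicas(logical_name):
    return catalog.execute(
        "SELECT resource, status FROM replica"
        " WHERE logical_name = ? ORDER BY resource",
        (logical_name,),
    ).fetchall()


register("/pnc/doc.txt", "disk-a")
replicate("/pnc/doc.txt", "tape-b")
mark_stale_except("/pnc/doc.txt", "disk-a")
state = replicas("/pnc/doc.txt")
```

Because the state lives in the catalog rather than in any one storage system, a later service (for example, synchronization) can find the stale replicas with a query.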

  11. Storage Resource Broker
     • Developed at San Diego Supercomputer Center
     • A distributed file management system (Data Grid), based on a client-server architecture.
     • Allows users to access files seamlessly across a distributed environment, based upon their attributes rather than just their names or physical locations.
     • It replicates, syncs, archives, and connects heterogeneous resources in a logical and abstracted manner.

  12. SRB Physical Structure
     [Diagram. Labels: User @ location X; SRB Server (×4); Oracle Client; Oracle RDBMS; Storage Driver (×3); Storage Space (×3); SRB Vault @ location B (×2); SRB Vault @ location D]

  13. PNC Data Grid Service Framework
     [Diagram. Labels: PNC; App (×5); PNC-SRB Multiple Servers; Web Server; MES (×4); MCAT Server; Oracle Client; Oracle RAC Database Server; DB-Instance-1; DB-Instance-2; MCAT Database; Schema1; Schema2; Schema3; Schema4; SRB Server (×8); SRB Storage Servers (×4)]

  14. For inter-organizational collaboration
     [Diagram. Labels: TWGrid Zone; PNC Zone; ECAI Zone; SRB (×3); DB (×3); Trust Relation (×2)]
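One way to read the trust relations between zones is sketched below in Python. The slide does not say which zone trusts which, nor how trust is checked, so the `Zone` class, the direction of trust, and the user name are all assumptions made for illustration.

```python
class Zone:
    """A federated zone with its own user registry (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.local_users = set()
        self.trusted = set()  # names of zones whose users this zone accepts

    def trust(self, other):
        # Trust is one-directional in this sketch (assumption).
        self.trusted.add(other.name)

    def accepts(self, user, home_zone):
        if home_zone.name == self.name:
            return user in self.local_users
        # A remote user is accepted only if the home zone is trusted and
        # the home zone actually knows the user.
        return home_zone.name in self.trusted and user in home_zone.local_users


# Zone names come from the slide; the trust direction is an assumption.
pnc, twgrid, ecai = Zone("PNC"), Zone("TWGrid"), Zone("ECAI")
twgrid.local_users.add("alice")  # hypothetical user
pnc.trust(twgrid)

alice_at_pnc = pnc.accepts("alice", twgrid)
alice_at_ecai = ecai.accepts("alice", twgrid)
```

In this model each zone keeps its own catalog and user registry, and a trust relation only lets one zone rely on another zone's authentication.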

  15. Common Services
     • Data Services
       • Data Object Service
       • Profile Service
       • Web Grid Service
       • Data I/O Service
       • Application Service
     • Security Services
       • VO Service
       • CA Service

  16. Common Services (Cont.)
     • Catalog and Archive Services
       • Catalog and Archive Web Application
     • Metadata Services
       • Metadata Service
       • Metadata Web Application
     • Data Identifier Services
     • Query Service
       • Query Expression

  17. Common Services (Cont.)
     • Server Management Services
       • Server/Network Controller
       • Server/Network Monitor
       • Server Manager Application
       • Resources Monitoring

  18. Portal

  19. Applications Integration
     • Digital Archives/Digital Library
     • Bioinformatics
     • Atmospheric Science
     • GIS
     • Earth Observation Science
     • Biodiversity
     • …

  20. Digital Archives Data Grid

  21. Statistics of DADG (till 26 Oct. 2005)

  22. Atmosphere Science Integration
     [Diagram. Labels: Applications; Portal/Web Client; Command Lines; Users; Linux; windows; MCAT / srb001; ASCC/srb002; NTU/monsoon(TB); NTNU/dms; NCU/databank; ASCC/lcg00104(TB); ASCC/lcg00105(TB); ASCC/gis212(TB); ASCC/gis252(TB); NTU/dbar_rs1, dbar_rs_2]

  23. GIS Applications Integration

  24. Web User Interface

  25. Resources Monitoring

  26. Summary
     • By integrating data grids, digital libraries, and persistent archives, we will be able to maintain the consistency of federated data collections while flowing information and data from digital entities through grid services into preservation environments.
     • Imagine what we can do for your project.

  27. Many Thanks for Your Attention
