
iRODS: Integrated Rule-Oriented Data System

Ray Idaszak, Director, Collaborative Environments, RENCI, University of North Carolina at Chapel Hill


Presentation Transcript


  1. iRODS: Integrated Rule-Oriented Data System • Ray Idaszak, Director, Collaborative Environments, RENCI, University of North Carolina at Chapel Hill

  2. iRODS • Integrated Rule-Oriented Data System • What It Is • Origins, How it works, What’s different about it • Why It Is • Context, Role it serves • Where It’s Going (Today, Future) • Funding, Key efforts

  3. iRODS Talk Outline • Integrated Rule-Oriented Data System • What is the Integrated Rule-Oriented Data System? • Origins, Technology, How it works • Why It Is • Context, Role it serves • Where It’s Going (Today, Future) • Funding, Key efforts

  4. What’s Different about iRODS? • iRODS lets you manage your data with your rules and in your way, via policies, against a backdrop of federatable community data worldwide

  5. iRODS Background • Integrated Rule-Oriented Data System • Open-source initiative that represents 12+ years of development and over $10M of NSF grant funding • Another $8M+ of funding pending (via NSF DataNet) • Collaboration between: • Data Intensive Cyber Environments (DICE) group at UNC Chapel Hill • RENCI, a state-funded cyberinfrastructure institute at UNC Chapel Hill • San Diego Supercomputer Center (SDSC)

  6. iRODS Data and Policy Virtualization • User client views and manages data • The data grid user sees a single “virtual collection”: /cuahsi/catalog, /cuahsi/modeling, /cuahsi/terrain • Physical storage behind the collection: SDSC holds /cuahsi/terrain, RENCI holds /cuahsi/modeling, Utah State Univ holds /cuahsi/catalog • The iRODS Data Grid installs in a “layer” over storage systems, so you can view, manage, access, add, and share part or all of your data in a unified collection.
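
The layering described on this slide can be sketched as a lookup table from logical paths to storage sites. This is only a minimal illustration: the paths and site names come from the slide, while the dictionary layout and the function names `list_virtual_collection` and `locate` are assumptions made for the example and are not part of any iRODS API.

```python
# Hypothetical sketch of a logical namespace layered over physical storage.
# Paths and site names are taken from the slide; everything else is an
# illustrative assumption, not iRODS code.

CATALOG = {
    "/cuahsi/terrain": "SDSC",
    "/cuahsi/modeling": "RENCI",
    "/cuahsi/catalog": "Utah State Univ",
}


def list_virtual_collection(prefix):
    """Return the logical paths under a prefix, hiding where each one lives."""
    return sorted(path for path in CATALOG if path.startswith(prefix))


def locate(logical_path):
    """Resolve a logical path to the site that physically holds it."""
    return CATALOG[logical_path]


if __name__ == "__main__":
    print(list_virtual_collection("/cuahsi"))
    print(locate("/cuahsi/modeling"))
```

The user works only with the names returned by `list_virtual_collection`; the site lookup stays behind the layer.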

  7. Using a Data Grid - Details [Diagram: iRODS servers, each with a rule engine, at SDSC, RENCI, and USU, plus the iCAT metadata catalog] • User asks for data using logical properties (client-server) • Data request goes to the 1st server • Server looks up information in the catalog (applies rules) • Catalog responds that the 3rd server has the data • 1st server asks the 3rd server, peer-to-peer, to serve up the data • 3rd server applies rules and serves the data
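
The request flow on this slide can be simulated in a few lines. The sketch below is a toy model under stated assumptions: the `Catalog` and `Server` classes, the choice of SDSC as the 1st server and USU as the 3rd, and the file name `sites.csv` are all hypothetical, and the "rule engine" only records that rules were applied.

```python
# Hypothetical simulation of the request flow; class names, method names,
# and the sample file are illustrative assumptions, not the iRODS API.

class Catalog:
    """Stands in for the iCAT metadata catalog: logical name -> server name."""

    def __init__(self, locations):
        self.locations = locations

    def lookup(self, logical_name):
        return self.locations[logical_name]


class Server:
    def __init__(self, name, catalog, storage=None):
        self.name = name
        self.catalog = catalog
        self.storage = storage or {}
        self.peers = {}
        self.log = []

    def apply_rules(self, action, logical_name):
        # Placeholder for the rule engine: it only records the event.
        self.log.append((action, logical_name))

    def request(self, logical_name):
        """Receive a request, consult the catalog, and forward to the holder."""
        self.apply_rules("lookup", logical_name)
        holder = self.catalog.lookup(logical_name)
        if holder == self.name:
            return self.serve(logical_name)
        return self.peers[holder].serve(logical_name)  # peer-to-peer hop

    def serve(self, logical_name):
        """The holding server applies its own rules, then serves the data."""
        self.apply_rules("serve", logical_name)
        return self.storage[logical_name]


def build_grid():
    """Build a three-server grid in which USU holds the only data object."""
    name = "/cuahsi/catalog/sites.csv"
    catalog = Catalog({name: "USU"})
    servers = {
        "SDSC": Server("SDSC", catalog),
        "RENCI": Server("RENCI", catalog),
        "USU": Server("USU", catalog, {name: b"site,lat,lon\n"}),
    }
    for server in servers.values():
        server.peers = {n: p for n, p in servers.items() if n != server.name}
    return servers
```

Calling `build_grid()["SDSC"].request("/cuahsi/catalog/sites.csv")` returns the bytes stored at USU, and only the two servers involved record rule activity.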

  8. Using a Data Grid - NEAR FUTURE (DB Resource) [Diagram: iRODS servers, each with a rule engine, at SDSC, RENCI, and USU, plus the iCAT metadata catalog and MySQL, PostgreSQL, and Oracle database resources] • A user not running a SQL server locally makes a query • Query goes to the 1st server • Server looks up information in the catalog (applies rules) • Catalog responds that the 3rd server has the SQL database • 1st server sends the SQL query to the 3rd server • 3rd server applies rules and serves the query result

  9. Example Clients & Client Interfaces (i.e. iRODS is client agnostic) • C library calls - Application level • .NET - Windows client API • Unix shell commands - Scripting languages • Java I/O class library (JARGON) - Web services • SAGA - Grid API • Web browser (Java-python) - Web interface • Windows browser - Windows interface • WebDAV - iPhone interface • Fedora digital library middleware - Digital library middleware • DSpace digital library - Digital library services • Parrot - Unification interface • Kepler workflow - Grid workflow • FUSE user-level file system - Unix file system • iDrop - Drag and drop GUI • User actions can be mapped to policies

  10. iRODS Policies • iRODS is described as a “policy-based” data management system • Policy definition: a proposed or adopted course of action • Ergo, iRODS associates a “course of action” with all data • Pre- and post- “Policy Enforcement Points” (PEPs) • Pre: course of action for data coming into iRODS • Post: course of action for data going out of iRODS
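
As a rough illustration of pre- and post- enforcement points as this slide describes them, the sketch below runs one list of policies on data coming in and another on data going out. The `PolicyStore` class and the two sample policies are hypothetical and do not reflect the actual iRODS rule syntax or API.

```python
# Hypothetical sketch of policy enforcement points, following the slide's
# description (pre = data coming in, post = data going out). Not iRODS code.

class PolicyStore:
    def __init__(self):
        self.pre_policies = []   # applied to data coming in
        self.post_policies = []  # applied to data going out
        self.objects = {}

    def put(self, name, data):
        """Run every pre policy, then store the result."""
        for policy in self.pre_policies:
            data = policy(name, data)
        self.objects[name] = data

    def get(self, name):
        """Fetch the stored data, then run every post policy on the copy."""
        data = self.objects[name]
        for policy in self.post_policies:
            data = policy(name, data)
        return data


def reject_empty(name, data):
    """Sample validation policy: refuse empty objects."""
    if not data:
        raise ValueError(f"policy violation: {name} is empty")
    return data


def redact_secret(name, data):
    """Sample redaction policy: mask a marker string on the way out."""
    return data.replace("SECRET", "[redacted]")
```

With `reject_empty` registered as a pre policy and `redact_secret` as a post policy, a stored object keeps its original content while every read returns the redacted form.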

  11. iRODS Policies • Retention, disposition, distribution, arrangement • Authenticity, provenance, description • Integrity, replication, synchronization • Deletion, trash cans, versioning • Archiving, staging, caching • Authentication, authorization, redaction • Access, approval, IRB, audit trails, report generation • Assessment criteria, validation • Derived data product generation, format parsing • Federation

  12. iRODS Rule Engine, Workflows • iRODS has its own built-in imperative interpreted programming language called the Rule Engine • The iRODS Rule Engine executes Microservices • An iRODS “program” is called a Workflow • A Microservice is one “step” of an iRODS Workflow • iRODS Workflows are executed on the iRODS Server • Arbitrary external WEB-SERVICES can be one “step” of an iRODS Workflow • Encapsulated as a microservice

  13. iRODS Microservices • Microservices are written in C and can do anything that can be done in C, which is part of what makes iRODS so extensible. Typically they provide: • Standard operations, e.g. file or format conversion • Queries on the metadata catalog • Interaction with web services • Triggering of external HPC workflows • Remote and delayed execution control • Microservices communicate through: • Arguments, session variables, user space variables, etc.
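
The relationship between workflows and microservices on the last two slides can be modeled as a list of functions that share a session dictionary. This is a conceptual sketch only: real microservices are written in C, and the names `run_workflow`, `msi_record_size`, and `msi_to_upper` are invented for the example.

```python
# Hypothetical model of a workflow as a chain of microservice-like steps.
# Real iRODS microservices are C functions; these names are illustrative.

def msi_record_size(session, data):
    """Step that writes a session variable for later steps to read."""
    session["size"] = len(data)
    return data


def msi_to_upper(session, data):
    """Step that transforms the data passed in as its argument."""
    return data.upper()


def run_workflow(steps, data):
    """Run each step in order, passing data as an argument and sharing
    state through a session dictionary."""
    session = {}
    for step in steps:
        data = step(session, data)
    return data, session


if __name__ == "__main__":
    result, session = run_workflow([msi_record_size, msi_to_upper], "abc")
    print(result, session)
```

Each step sees the output of the previous one as its argument, and anything it stores in `session` stays visible to the rest of the workflow.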

  14. Differentiating Workflows • iRODS data grid workflows • Low-complexity, a small number of operations compared to the number of bytes in the file • Server-side workflows • Data sub-setting, filtering, metadata extraction • Grid workflows • High-complexity, a large number of operations compared to the number of bytes in the file • Client-side workflows • Computer simulations, pixel re-projection
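
The distinction on this slide comes down to the ratio of operations to bytes. A minimal sketch of that comparison follows; the threshold of 1.0 operation per byte is an arbitrary assumption chosen for illustration, since the slide gives no number, and the function names are hypothetical.

```python
# Hypothetical helper for the operations-per-byte comparison on the slide.
# The default threshold is an assumed value, not one stated in the talk.

def operations_per_byte(num_operations, num_bytes):
    """Return the complexity ratio used to compare workflow types."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    return num_operations / num_bytes


def suggest_placement(num_operations, num_bytes, threshold=1.0):
    """Suggest where a workflow fits based on its complexity ratio."""
    ratio = operations_per_byte(num_operations, num_bytes)
    if ratio < threshold:
        return "server-side (iRODS data grid workflow)"
    return "client-side (grid workflow)"


if __name__ == "__main__":
    # Sub-setting a large file: few operations relative to its size.
    print(suggest_placement(10_000, 1_000_000_000))
    # A simulation: many operations on a small input.
    print(suggest_placement(5_000_000_000, 1_000_000))
```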

  15. A few more iRODS notes… • Authentication: GSI (PKI), Kerberos, Shibboleth, challenge-response • Authorization: roles, user groups, resource groups, policy constraints, ACLs • Transport: TCP/IP (parallel I/O streams), Reliable Blast UDP • Metadata catalog: PostgreSQL, MySQL, Oracle • Distributed rule engine: scheduler, messaging system, execution engine, rule base

  16. iRODS Talk Outline • Integrated Rule-Oriented Data System • What is the Integrated Rule-Oriented Data System? • Origins, Technology, How it works • Why is there an Integrated Rule-Oriented Data System? • Context, Role it serves • Where It’s Going (Today, Future) • Funding, Key efforts

  17. Entire Data Life Cycle: The iRODS Vision • Each data life cycle stage increases the value and usability of the original collection • Stages (collection type, data state, governing policy): • Project Collection: private, local policy • Data Grid: shared, distribution policy • Data Processing Pipeline: analyzed, service policy • Digital Library: published, description policy • Reference Collection: preserved, representation policy • Federation: sustained, re-purposing policy • Example: • Jeff gets data from a sensor • Jeff shares data with colleagues • Together with colleagues, analyzes data and produces results • Results peer-reviewed and published • Jeff et al. hit the jackpot: collection now accepted as a reference collection for decades • Hydrology data grid grows in value to ecology and biology and is federated

  18. iRODS Talk Outline • Integrated Rule-Oriented Data System • What is the Integrated Rule-Oriented Data System? • Origins, Technology, How it works • Why is there an Integrated Rule-Oriented Data System? • Context, Role it serves • Where Is iRODS going Today and in the Future? • Funding, Key efforts

  19. iRODS: Future • Pending 2011 NSF DataNet • DataNet Federation Consortium (DFC) • Includes CUAHSI!! (and several others) • RENCI: Creating an “Enterprise” version of iRODS • http://iren-web.renci.org/irods-meeting/irods@renci-2011UserMeeting-contribution.pdf

  20. Summary • iRODS fills an important niche • Differentiation: it’s a policy-driven distributed data management system formally supporting the entire data life cycle • E.g. an iRODS data grid is a vehicle for fulfilling NSF’s Data Management Plan requirement at the community scale • Classification: middleware • iRODS is not intended to be all-encompassing, but rather to work with other DataNets, workflow engines, systems like CUAHSI HIS, etc. in canvassing a national cyberinfrastructure • i.e. it falls primarily in the “Data Services/Storage” portion of NSF’s Data Enabled Science description • With iRODS, the community is still responsible for: • Schema, data formats, defining policies, defining web interfaces, building analysis and knowledge tools, etc.

  21. iRODS Credits • Principal Investigators: Richard Marciano, Reagan Moore (PI), Arcot Rajasekar • Additional Contributors: William Sims Bainbridge, Leesa Brieger, Luis Carriço, Sheau-Yen Chen, Michael Conway, Jason Coposky, Vijay Dantuluri, Antoine de Torcy, Wei Ding, Kevin Gamiel, Lucas Gilbert, Nuno Guimarães, Chien-Yi Hou, Bernard J. (Jim) Jansen, Oleg Kapeljushnik, Mounia Lalmas, Christopher A. Lee, Xia Lin, Gary Marchionini, Cathy Marshall, Jason Reilly, Meredith Ringel Morris, Stefan Rüger, Wayne Schroeder, Michael Stealey, Lisa Stilwell, Jaime Teevan, Paul Tooby, Michael Wan, Bing Zhu

  22. iRODS Credits Research Supported By • NSF ITR 0427196, Constraint-Based Knowledge Systems for Grids, Digital Libraries, and Persistent Archives (2004–2007) • NARA supplement to NSF SCI 0438741, Cyberinfrastructure; From Vision to Reality—Developing Scalable Data Management Infrastructure in a Data Grid-Enabled Digital • NARA supplement to NSF SCI 0438741, Cyberinfrastructure; From Vision to Reality—Research Prototype Persistent Archive Extension (2006–2007) • NSF SDCI 0721400, SDCI Data Improvement: Data Grids for Community Driven Applications (2007–2010) • NSF/NARA OCI-0848296, NARA Transcontinental Persistent Archive Prototype (2008–2012)

  23. iRODS Credits • For More Information: • http://www.irods.org • http://diceresearch.org/ • http://dice.unc.edu/ • http://www.renci.org/news/releases/renci-teams-with-dice

  24. Thank You. http://www.renci.org
