
Building a caGrid Node


Presentation Transcript


  1. Building a caGrid Node

  2. Getting Started • This course is self-paced but hands-on, so you will need to download the accompanying code base to complete the exercises. • The code base can be downloaded from: http://pir.georgetown.edu/~suzek/caBIG_Bootcamp_July2008/gridPIR_Training.zip • After downloading: • 1) Copy gridPIR_Training.zip to "C:\" on your machine • 2) Unzip gridPIR_Training.zip (overwrite existing files if needed) • ** Do not proceed with this course until you have downloaded and unzipped the above file. You will not be able to complete the exercises without this zip file. **

  3. Session Goals By the end of this session, you will be able to: • Use Introduce to create a node on the Grid • More specifically: • Install a public data service on caGrid (no security) • Deploy a data service using caCORE SDK-generated artifacts and the Introduce toolkit • Use caGrid Data Services • Use caGrid Metadata Service APIs • Use caGrid for Semantic Interoperability

  4. Lessons • Lesson 1: Installing caGrid for Data Service Deployment • Lesson 2: Deploying a caGrid Data Service • Lesson 3: Using caGrid Data Services • Lesson 4: Using caGrid Metadata Service APIs • Lesson 5: Using caGrid for Semantic Interoperability

  5. Lesson 1: Installing caGrid for Data Service Deployment

  6. Lesson 1: Installing caGrid for Data Service Deployment Overview • In this lesson, we will cover: • caGrid Overview • Steps involved in caGrid installation • There will be an exercise to install caGrid 1.2 on a Windows machine

  7. Lesson 1: Installing caGrid for Data Service Deployment Outline • Overview • caGrid • caGrid Infrastructure • Step-by-step caGrid Installation for Data Service Deployment

  8. Lesson 1: Installing caGrid for Data Service Deployment What is caGrid? • Development project of the Architecture Workspace • Service-oriented infrastructure that supports caBIG • An architecture that allows building a grid of your own • Enables collaborating institutions to share information and analytical resources efficiently and securely

  9. Lesson 1: Installing caGrid for Data Service Deployment caGrid Community Involvement • caGrid itself provides no real “data” or “analysis” to caBIG™; it is the enabling infrastructure that allows the community to develop analytical services and data services • Community members add value to the grid as applications, services (data/analytical), and processes • caGrid provides the necessary core services, APIs, and tooling • Community members develop end-user applications/clients that consume the resources provided on the grid

  10. Lesson 1: Installing caGrid for Data Service Deployment caGrid Infrastructure • Client and service APIs are object oriented and operate over well-defined, curated data types • Objects are defined in UML and converted into ISO/IEC 11179 Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR) • Object definitions are drawn from controlled terminology, the vocabulary is registered in the Enterprise Vocabulary Services (EVS), and their relationships are thus semantically described • Objects are serialized to XML that adheres to XML schemas registered in the Global Model Exchange (GME)

  11. Lesson 1: Installing caGrid for Data Service Deployment caGrid Infrastructure – cont’d • Service and hosting center metadata are registered in the Index Service

  12. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Before starting • Download the caGrid 1.2 installer: http://gforge.nci.nih.gov/frs/download.php/3738/caGrid-installer-1.2.zip • Set the environment variables: • JAVA_HOME: location of Java JDK 1.5.x • ANT_HOME: location of Ant 1.6.5 • CATALINA_HOME: location of Tomcat 5.0.28 • GLOBUS_LOCATION: location of Globus Toolkit 4.0.3 • If Ant, Globus Toolkit, and/or Tomcat are not available, the caGrid installer installs them • Unzip caGrid-installer-1.2.zip • Run the caGrid installer: • java -jar caGrid-installer-1.2.jar
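
Before launching the installer, it can help to confirm which of these variables are already set. The following is a minimal sketch, not part of the course material: the variable names come from the slide above, while the function name check_environment and the status strings are illustrative. It only checks that each variable points at an existing directory; it does not verify versions.

```python
import os

# Variable names are taken from the slide; the version notes are informational only.
REQUIRED_VARS = {
    "JAVA_HOME": "Java JDK 1.5.x",
    "ANT_HOME": "Ant 1.6.5",
    "CATALINA_HOME": "Tomcat 5.0.28",
    "GLOBUS_LOCATION": "Globus Toolkit 4.0.3",
}


def check_environment(env=None):
    """Map each variable to 'ok', 'unset', or 'missing directory'."""
    env = os.environ if env is None else env
    report = {}
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            report[name] = "unset"
        elif not os.path.isdir(value):
            report[name] = "missing directory"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name} ({REQUIRED_VARS[name]}): {status}")
```

An "unset" result for ANT_HOME, CATALINA_HOME, or GLOBUS_LOCATION is not necessarily a problem, since the slide notes that the installer can install those components itself.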

  13. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: License Agreement

  14. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Installation Types • Choose any combination of installation types to install one or more caGrid components • For data service deployment, choose the options “Install caGrid” and “Configure Container” • This is not a “secure” installation, since our data service is public; additional steps, such as securing the container, are required for a secure caGrid node

  15. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Service Container • Choose Tomcat or Globus as the service container

  16. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Prerequisites Ant Tomcat Globus Toolkit • Install (or reinstall) prerequisite software

  17. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Location • Provide the directory where caGrid will be installed

  18. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Target Grid • Choose one of the available grids: • NCICB Development • NCICB Production • NCICB QA • OSU Development • OSU Training • and more • Each target grid uses different URLs for the caGrid core services. For instance, the service URLs for the OSU Training Grid are:
      cagrid.master.index.service.url=http://training03.cagrid.org:6080/wsrf/services/DefaultIndexService
      cagrid.master.cadsr.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/CaDSRService
      cagrid.master.gme.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/GlobalModelExchange
      cagrid.master.gridgrouper.service.url=https://training03.cagrid.org:6443/wsrf/services/cagrid/GridGrouper
      cagrid.master.dorian.service.url=https://dorian.cagrid.org:6443/wsrf/services/cagrid/Dorian
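
Because a target grid is essentially a set of key=value service URLs, the settings can be inspected with a few lines of code. The sketch below is an illustration only: the property keys and URLs are the OSU Training Grid values from the slide, while the helper names parse_properties and service_urls are hypothetical, and the parser handles only simple key=value lines rather than the full Java properties format.

```python
OSU_TRAINING = """\
cagrid.master.index.service.url=http://training03.cagrid.org:6080/wsrf/services/DefaultIndexService
cagrid.master.cadsr.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/CaDSRService
cagrid.master.gme.service.url=http://training02.cagrid.org:6080/wsrf/services/cagrid/GlobalModelExchange
cagrid.master.gridgrouper.service.url=https://training03.cagrid.org:6443/wsrf/services/cagrid/GridGrouper
cagrid.master.dorian.service.url=https://dorian.cagrid.org:6443/wsrf/services/cagrid/Dorian
"""

PREFIX = "cagrid.master."
SUFFIX = ".service.url"


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def service_urls(props):
    """Map short service names (index, cadsr, ...) to their URLs."""
    urls = {}
    for key, value in props.items():
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            urls[key[len(PREFIX):-len(SUFFIX)]] = value
    return urls


if __name__ == "__main__":
    for name, url in service_urls(parse_properties(OSU_TRAINING)).items():
        print(f"{name}: {url}")
```

Listing the URLs this way makes it easy to see that the Index, caDSR, and GME services use plain HTTP on port 6080, while GridGrouper and Dorian use HTTPS on port 6443.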

  19. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Container Configuration • Securing the container is needed to host secure services. • Secure services are those that require clients to use one of the Globus Security Infrastructure (GSI) authentication mechanisms.

  20. Lesson 1: Installing caGrid for Data Service Deployment caGrid Installation: Completion

  21. Lesson 1: Installing caGrid for Data Service Deployment Additional Information • caGrid Wiki: • http://www.cagrid.org/mwiki/index.php?title=CaGrid • caBIG™ Architecture WS caGrid Web Page: • https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/

  22. Lesson 1: Installing caGrid for Data Service Deployment Exercise • Install caGrid 1.2 on your local machine using the instructions located at: https://gforge.nci.nih.gov/docman/view.php/196/13043/01_Installing_caGrid.doc

  23. Lesson 2: Deploying a caGrid Data Service

  24. Lesson 2: Deploying a caGrid Data Service Overview • In this lesson, we will cover: • Steps involved in deployment/creation of a caGrid Data Service using caCORE SDK 3.2.1 generated artifacts • Selecting the type of service and template • Selecting the domain model • Selecting the schema • Providing metadata (hosting site/POC etc.) • Deploying the service • There will be exercises to create and deploy gridPIR on a local caGrid node

  25. Lesson 2: Deploying a caGrid Data Service Outline • Overview • Major steps for deployment • Introduce Toolkit • Step-by-step deployment of a Data Service; gridPIR

  26. Lesson 2: Deploying a caGrid Data Service caGrid Data Service Deployment – Major steps • Provide client and service APIs that are object oriented • Provide objects that are defined in UML and registered in the Cancer Data Standards Repository (caDSR) • Provide object definitions drawn from controlled terminology and vocabulary registered in the Enterprise Vocabulary Services (EVS) • Provide XML schemas, used for XML serialization of objects, that are registered in the Global Model Exchange (GME)

  27. Lesson 2: Deploying a caGrid Data Service caGrid Data Service Deployment – Major steps (cont’d) • Provide metadata about the service and the center where the service is deployed

  28. Lesson 2: Deploying a caGrid Data Service Service Metadata (Domain Model Portion)
      <ns135:UMLAttribute dataTypeName="CHARACTER" description="UniProtKB primary accession number." name="uniprotkbPrimaryAccession" publicID="2322254" version="1.0">
        <ns135:SemanticMetadata conceptCode="C25402" conceptDefinition="A control number unique ….." conceptName="Accession Number" order="1"/>
        …..
        <ns135:ValueDomain longName="Protein UniProtKB Primary Accession Number Genomic Identifier">
          <ns135:enumerationCollection/>
        </ns135:ValueDomain>
      </ns135:UMLAttribute>
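
To show how the semantic annotations in this metadata can be read by a client, here is a minimal sketch using Python's standard XML parser. The element and attribute names come from the slide; the namespace URI bound to ns135 is a placeholder (the slide does not show the real one), the truncated conceptDefinition is left truncated, and the function name summarize_attribute is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the slide shows only the ns135 prefix, not its URI.
NS = {"dm": "urn:example:domain-model"}

FRAGMENT = """\
<ns135:UMLAttribute xmlns:ns135="urn:example:domain-model"
    dataTypeName="CHARACTER"
    description="UniProtKB primary accession number."
    name="uniprotkbPrimaryAccession" publicID="2322254" version="1.0">
  <ns135:SemanticMetadata conceptCode="C25402"
      conceptDefinition="A control number unique ..."
      conceptName="Accession Number" order="1"/>
  <ns135:ValueDomain longName="Protein UniProtKB Primary Accession Number Genomic Identifier">
    <ns135:enumerationCollection/>
  </ns135:ValueDomain>
</ns135:UMLAttribute>
"""


def summarize_attribute(xml_text):
    """Extract the attribute name, caDSR public ID, and concept annotations."""
    root = ET.fromstring(xml_text)
    concepts = [
        (sm.get("conceptCode"), sm.get("conceptName"))
        for sm in root.findall("dm:SemanticMetadata", NS)
    ]
    value_domain = root.find("dm:ValueDomain", NS)
    return {
        "name": root.get("name"),
        "publicID": root.get("publicID"),
        "dataType": root.get("dataTypeName"),
        "concepts": concepts,
        "valueDomain": value_domain.get("longName") if value_domain is not None else None,
    }


if __name__ == "__main__":
    print(summarize_attribute(FRAGMENT))
```

The summary ties the attribute uniprotkbPrimaryAccession to its public ID (2322254) and to the concept code C25402, which is the kind of link that makes the data semantically discoverable.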

  29. Lesson 2: Deploying a caGrid Data Service Introduce: Grid Service Authoring Toolkit • An open-source and extensible toolkit • Supports easy development and deployment of WS/WSRF compliant Grid services by hiding low level details of the Globus Toolkit • Enables the implementation of strongly-typed Grid services • Facilitates caGrid data service development using caCORE SDK artifacts through pluggable service styles

  30. Lesson 2: Deploying a caGrid Data Service Grid-enablement of Protein Information Resource (gridPIR) • A data service to provide comprehensive and fully annotated protein related information for genomic and proteomic cancer research • Developed using model driven approach and caCORE SDK 3.2.1 • All data is public so no security layer implemented

  31. Lesson 2: Deploying a caGrid Data Service Introduce: Create a caGrid Service • ant introduce • Modify an existing service • Deploy an existing service • Browse Data Types from caDSR or GME

  32. Lesson 2: Deploying a caGrid Data Service Introduce: Enter service information • An analytical service exposes operation(s) with input/output objects • A data service exposes objects that represent the data resource

  33. Lesson 2: Deploying a caGrid Data Service Introduce: Data Service Configuration • Different service styles (including caCORE SDK) are supported; gridPIR is generated using caCORE SDK v3.2.1 • Optional extensions for Bulk Data Transfer or Web Services Enumeration

  34. Lesson 2: Deploying a caGrid Data Service Introduce: caCORE SDK-generated Client Selection Two options for client selection: • Option 1: Use the remote API if the caCORE-like system (API) and the caGrid Data Service are on different machines • Option 2: Use the local API if both the caCORE-like system (API) and the caGrid Data Service are deployed on the same machine

  35. Lesson 2: Deploying a caGrid Data Service Introduce: Remote API Selection (Option 1) Library folder (including client jar) generated by caCORE SDK

  36. Lesson 2: Deploying a caGrid Data Service Introduce: Remote API Selection (Option 1) (cont’d) • Treat all queries case-insensitive • Use Common Security Module • Enter URL for remote caCORE-like gridPIR API (publicly accessible)

  37. Lesson 2: Deploying a caGrid Data Service Introduce: Local API Selection (Option 2) Library (including client jar) and configuration folders are generated by caCORE SDK

  38. Lesson 2: Deploying a caGrid Data Service Introduce: Local API Selection (Option 2) (cont’d) Treat all queries case-insensitive

  39. Lesson 2: Deploying a caGrid Data Service Introduce: Choose objects (model) service exposes 1. Fetch models from caDSR 2. Select gridPIR model v1.2 3. Select package from gridPIR model 4. Add selected packages

  40. Lesson 2: Deploying a caGrid Data Service Introduce: Choose XML Schema Find schemas from GME (if registered) OR Resolve schemas manually

  41. Lesson 2: Deploying a caGrid Data Service Introduce: Choose XML Schema – Manual Resolution (cont’d) XSD generated by caCORE SDK

  42. Lesson 2: Deploying a caGrid Data Service Introduce: Enter Service Description 1. Select Metadata Tab 2. Select ServiceMetadata row 3. Edit Property

  43. Lesson 2: Deploying a caGrid Data Service Introduce: Enter Service Metadata (cont’d) • Enter: • POC • Hosting Center • Address

  44. Lesson 2: Deploying a caGrid Data Service Introduce: Deploy gridPIR Data Service Deploy an existing service

  45. Lesson 2: Deploying a caGrid Data Service Introduce: Select Data Service Location in the File System Compiled service stubs Metadata files Library files XML schemas Source code for service stubs

  46. Lesson 2: Deploying a caGrid Data Service Introduce: Select Data Service Location in the File System Container information Register to Index Service? URL for Index Service

  47. Lesson 2: Deploying a caGrid Data Service Verify Deployment URL for deployed service
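
The deployed service URL can be sanity-checked from a script. The following is a minimal sketch under stated assumptions: it assumes the service follows the same /wsrf/services/cagrid/<ServiceName> path pattern as the core service URLs listed for the target grid, the host, port 8080, and service name GridPIR are hypothetical, and requesting "?wsdl" is an assumption about the container's behavior that should be confirmed for your installation.

```python
import urllib.error
import urllib.request


def service_url(host, port, service_name, secure=False):
    """Build a service URL following the /wsrf/services/cagrid/<name> pattern."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{host}:{port}/wsrf/services/cagrid/{service_name}"


def is_reachable(url, timeout=10):
    """Return True if an HTTP GET for the service's WSDL returns status 200."""
    try:
        with urllib.request.urlopen(url + "?wsdl", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error, and similar failures.
        return False


if __name__ == "__main__":
    # Hypothetical values for a local, non-secure container.
    url = service_url("localhost", 8080, "GridPIR")
    print(url, "reachable" if is_reachable(url) else "not reachable")
```

A False result only indicates that the URL did not answer; the container log remains the place to look for the actual cause.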

  48. Lesson 2: Deploying a caGrid Data Service Additional Information • caGrid Wiki: • http://www.cagrid.org/mwiki/index.php?title=CaGrid • Introduce Toolkit Wiki: • http://www.cagrid.org/mwiki/index.php?title=Introduce • caGrid Data Services Wiki: • http://www.cagrid.org/mwiki/index.php?title=Data_Services • caBIG™ Architecture WS caGrid Web Page: • https://cabig.nci.nih.gov/workspaces/Architecture/caGrid/

  49. Lesson 2: Deploying a caGrid Data Service Exercise • Deploy a caGrid data service for gridPIR on your local machine using instructions located at: https://gforge.nci.nih.gov/docman/view.php/196/13044/02_Creating_caGrid_Node.doc and https://gforge.nci.nih.gov/docman/view.php/196/13045/03_Deploying_caGrid_Node.doc

  50. Lesson 3: Using caGrid Data Services Overview • In this lesson, we will cover: • Ways to query data services • Creating and running queries in caBIG Query Language (CQL) • There will be an exercise to create and run CQL queries from the command line against gridPIR production.
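
As a preview of what a CQL query looks like, the sketch below builds a small query document with Python's standard library. Several things here are assumptions rather than course material: the namespace URI and the predicate names (EQUAL_TO, LIKE) reflect CQL version 1 as commonly documented and should be checked against the caGrid data services documentation, the target class org.example.pir.Protein and the accession value P04637 are hypothetical, and only the attribute name uniprotkbPrimaryAccession comes from the gridPIR metadata shown earlier in the course.

```python
import xml.etree.ElementTree as ET

# Assumed CQL 1 namespace; verify against the caGrid data services documentation.
CQL_NS = "http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery"


def build_cql(target_class, attribute, value, predicate="EQUAL_TO"):
    """Return a CQL query selecting target_class objects by one attribute."""
    # Serialize CQL elements in the default namespace (no prefix).
    ET.register_namespace("", CQL_NS)
    query = ET.Element(f"{{{CQL_NS}}}CQLQuery")
    target = ET.SubElement(query, f"{{{CQL_NS}}}Target", {"name": target_class})
    ET.SubElement(
        target,
        f"{{{CQL_NS}}}Attribute",
        {"name": attribute, "value": value, "predicate": predicate},
    )
    return ET.tostring(query, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical class name and sample accession value.
    print(build_cql("org.example.pir.Protein", "uniprotkbPrimaryAccession", "P04637"))
```

The query names a target class and restricts it by a single attribute; nesting further restrictions under the target is how more selective queries are typically expressed.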
