
Grid application development with gLite and P-GRADE Portal

Grid application development with gLite and P-GRADE Portal. Miklos Kozlovszky MTA SZTAKI m.kozlovszky@sztaki.hu. Presenter. MTA SZTAKI (Hungarian Academy of Sciences) Laboratory of Parallel and Distributed Systems www.lpds.sztaki.hu Miklos Kozlovszky EGEE-III (Enabling Grids for E-sciencE)


Presentation Transcript


  1. Grid application development with gLite and P-GRADE Portal Miklos Kozlovszky MTA SZTAKI m.kozlovszky@sztaki.hu

  2. Presenter • MTA SZTAKI (Hungarian Academy of Sciences), Laboratory of Parallel and Distributed Systems, www.lpds.sztaki.hu • Miklos Kozlovszky • EGEE-III (Enabling Grids for E-sciencE) • GASUC Team • Training and dissemination activities • SEE-GRID2 / SEE-GRID-SCI (South Eastern European GRID-enabled eInfrastructure Development) • Manager of “Dissemination and Training” (WP5/NA3)

  3. Introduction of LPDS (Lab of Parallel and Distr. Systems) • Research division of MTA SZTAKI since 1998 • Head: Peter Kacsuk, Prof. • 22 research fellows • Founding member of • Central European Grid Consortium (2003) • Hungarian Grid Competence Center (2003) • Participant or coordinator in many European and national Grid research, infrastructure, and educational projects (since 2000) • FP5: GridLab, DataGrid • FP6: EGEE I-II, SEE-GRID I-II, CoreGrid, ICEAGE, CancerGrid • FP7: EGEE III, SEE-GRID-SCI, EDGeS (coordinator), ETICS, S-CUBE • Central European Grid Training Center in EGEE (since 2004) www.lpds.sztaki.hu

  4. Webpage • http://indico.in2p3.fr/conferenceDisplay.py?confId=1265 • Find it from the EGEE User Forum webpage OR the EGEE Training webpage (Google: EGEE NA3) • http://www.egee.nesc.ac.uk/ → Events and registration (top menu) → ..., Paris, December 10-13 • Save the direct link! • Long-term storage of training material • Presentations in PPT • Tutorials in HTML/DOC/PDF

  5. Feedback form • Your comments and feedback are highly valuable for EGEE training • Please fill in the feedback form and return it at the end of the course • Anonymous • Scores: 1 - 6 (very bad - very good) • Comments are highly appreciated

  6. Goals of the day • Basic concepts of • Workflow • Parameter study on EGEE • Implementation in P-GRADE Portal • Further information • How to learn more • How to get access to EGEE • How to port your own application to EGEE

  7. Agenda • Application development on gLite * • Workflow and parameter study concepts on EGEE • Workload management and data services in gLite • Workflow and parameter study support in P-GRADE Portal • Hands-on • Workflow exercises • Parameter study exercises • How to learn more * = (mostly skipped, please refer to previous presentations from yesterday)

  8. Agenda • Application development on gLite • Workflow and parameter study concepts on EGEE • Workload management and data services in gLite • Workflow and parameter study support in P-GRADE Portal • Hands-on • Workflow exercises • Parameter study exercises • How to learn more

  9. Where computer science meets the application communities! • The tools and services used by the VOs' applications • NA4 RESPECT: Recommended External Software Packages for Egee CommuniTies • Current RESPECT tools: GridWay, P-GRADE Portal • http://egeena4.lal.in2p3.fr/ → “Grid software” menu • Layers (diagram): Application; Application toolkits; Command line & APIs; Higher-level gLite services (WMS, ...); Basic gLite services: CE, SE, info, security • The production infrastructure (EGEE grid, gLite middleware) contains these services • High-level services: help users build their computing infrastructure but should not be mandatory • Basic services: must be complete and robust; should not assume the use of higher-level Grid services

  10. VO concept • gLite middleware runs on each EGEE site to provide • Computation services: Computing Element • Data services: Storage Element • Security service • Sites and users form Virtual Organisations: the basis for collaboration • Each VO can / must have central software services and support groups • Diagram labels: Internet, P-GRADE Portal

  11. Basic gLite use case: Job submission • Components (diagram): User Interface, Resource Broker, Information System, File and Replica Catalog, Logging and bookkeeping, VO Management Service (DB of VO users), Site X with Computing Element and Storage Element • Arrow labels (diagram): create proxy; submit job (executable + small inputs); query; publish state; submit job; job status; logging; input file(s); process; output file(s); register file; retrieve status & (small) output files; retrieve output

  12. How can I get access to EGEE? • Obtain a certificate from a recognized CA (obtaining a certificate: annually) • www.gridpma.org – find the official CA of your country • One-year, renewable certificates • Accepted in every EGEE VO • GILDA CA – two-week, renewable certificate • Accepted only in the GILDA training VO (the VO to be used today) • Find and register at a VO (joining a VO: once) • List of VOs with usage rules: CIC Operations portal: http://cic.gridops.org/ • Scientific discipline • Geographical region • Use the VO services • Through (low-level) command line tools of gLite (not today) • Through high-level tools, e.g. P-GRADE Portal, GENIUS, GANGA, ... • Access mechanism varies from tool to tool • Diagram labels: CA, VO manager, VO Membership Service, VOMS database, Grid sites

  13. Application developer’s questions • I have a computationally intensive problem. How does it relate to this scenario? • What is a grid job for me? • How many jobs do I have, and how do they relate to each other and to my data? • What is the input / output data for each job? • How do I write a job that accesses input / output data? • How do I submit and monitor the jobs? How do I access their results? • Do I need additional services to meet my application's demands? • Answers • Now (sometimes specifically for P-GRADE Portal) • Or any time later, for general questions, from the Grid Application Support group (GASuC) www.lpds.sztaki.hu/gasuc

  14. Functional vs. data parallelism • Functional decomposition (functional parallelism) • Decomposing the problem into different jobs which can be distributed to different CEs for simultaneous execution • Different executables run on different CEs (and may or may not process the same data) • Good to use when • the data cannot be partitioned • there is no static structure or fixed number of calculations to be performed

  15. Functional decomposition • (Diagram, along a time axis) The problem → Job submission → Job 1 on Computing Element #1, Job 2 on Computing Element #2, Job 3 on Computing Element #3, Job 4 on Computing Element #4 → Job monitoring → Result download

  16. Functional decomposition in practice: workflow • Workflow manager, e.g. P-GRADE Portal server • (Diagram, along a time axis) The problem → three rounds of Job submission and Job monitoring, linked by Data dependency arrows and a Result transfer → Result download
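
The data dependencies of a workflow form a directed acyclic graph: a job can only be submitted once the jobs it depends on have finished. The following is a minimal Python sketch of that ordering, assuming a hypothetical four-job workflow (the job names and the dependency table are illustrative, not taken from the slides). It uses the standard-library `graphlib` module (Python 3.9+) and is not how P-GRADE Portal itself is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs whose output it needs.
dependencies = {
    "job1": set(),
    "job2": {"job1"},
    "job3": {"job1"},
    "job4": {"job2", "job3"},
}

def execution_waves(deps):
    """Group jobs into waves. Jobs in the same wave have no data dependency
    on each other, so they could run on different Computing Elements at once."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the dependencies are cyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose inputs are all available
        waves.append(ready)
        sorter.done(*ready)                 # mark them finished
    return waves

waves = execution_waves(dependencies)
print(waves)  # [['job1'], ['job2', 'job3'], ['job4']]
```

Each inner list corresponds to one round of job submission and monitoring; the next round starts only after the results of the previous one are available.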

  17. Functional vs. data parallelism • Data decomposition (data parallelism) • Partitioning the problem's data domain and distributing portions to multiple instances of the same job for simultaneous execution • The same executable runs on different CEs and processes different data • Good to use for problems where: • data is static (e.g. factoring, solving large matrix or finite difference calculations, parameter studies) • a dynamic data structure is tied to a single entity and that entity can be subsetted (large multi-body problems) • the domain is fixed but the computation within various regions of the domain is dynamic (fluid vortices models) • > 90% of grid applications employ data parallelism (parameter study, parametric study)

  18. Data decomposition • (Diagram, along a time axis) The problem: one algorithm and data segments 1 to 4 → Job submission → Job 1 on Computing Element #1, Job 2 on Computing Element #2, Job 3 on Computing Element #3, Job 4 on Computing Element #4 → Job monitoring → Result download

  19. Data decomposition in practice: Master-slave • Master process, e.g. P-GRADE Portal server • (Diagram) Master job: Generate inputs (Inputs) → Spawn slaves (Job submit) → Monitor slaves (four slave jobs) → Collect results (Get job output, Results) → Generate final result (Final result)
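
The master-slave steps (generate inputs, spawn slaves, collect results, generate final result) can be sketched locally in Python. This is only an illustration under stated assumptions: a thread pool stands in for job submission to Computing Elements, and the function names and the sum-of-squares workload are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def generate_inputs(data, n_segments):
    """Master: split the data domain into roughly equal segments."""
    size = max(1, -(-len(data) // n_segments))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]

def slave_job(segment):
    """Slave: the same executable applied to a different data segment."""
    return sum(x * x for x in segment)

def master(data, n_slaves=4):
    inputs = generate_inputs(data, n_slaves)
    # A local thread pool stands in for submitting jobs to Computing Elements.
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        partial_results = list(pool.map(slave_job, inputs))  # collect results
    return sum(partial_results)  # generate final result

final_result = master(list(range(10)))
print(final_result)  # 285
```

On a real grid the master would also have to monitor the slave jobs and handle failures, which this sketch leaves out.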

  20. Multi-level master-slave • (Diagram: two nested master-slave levels, each with the steps Generate inputs (Input), Spawn slaves (Job submit), Monitor slaves (four slave jobs, Check job status), Collect results (Get job output, Results); the top level ends with Generate final result (Final result))

  21. Complex master-slave • (Diagram) One master job containing three blocks, each with the steps Generate inputs (input) → Spawn slaves → Monitor slaves (four slave jobs) → Collect results (results); followed by Generate final result → Final result

  22. Complex master-slave = Parameter study workflow • 3 inputs (3 files) and 9 inputs (9 files): 3 x 9 = 27 WF, 27 outputs • (Diagram) Master job = Workflow manager, containing three blocks, each with the steps Generate local inputs (input) → Spawn slaves → Monitor slaves (four slave jobs) → Collect local results (results); followed by Generate result → Final result
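
The 3 x 9 = 27 count is the cross product of the two input sets: one workflow instance per combination of input files. A small Python sketch, assuming hypothetical file names (the slide only gives the counts):

```python
from itertools import product

# Hypothetical file names; the slide only states that there are 3 and 9 files.
inputs_a = [f"a_{i}.dat" for i in range(3)]
inputs_b = [f"b_{j}.dat" for j in range(9)]

# One workflow instance per combination of input files.
workflow_instances = [
    {"id": n, "input_a": a, "input_b": b}
    for n, (a, b) in enumerate(product(inputs_a, inputs_b))
]

print(len(workflow_instances))  # 27
print(workflow_instances[0])    # {'id': 0, 'input_a': 'a_0.dat', 'input_b': 'b_0.dat'}
```

Each instance produces its own output, which gives the 27 outputs that the final collecting step has to combine.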

  23. Defining a job • Executable (EGEE runs Scientific Linux v3 or v4) • Script: • No compilation is necessary • Can invoke a real executable which is statically installed on the CE (VOBox) • Binary: • Must be compiled on the User Interface → binary compatibility with EGEE is guaranteed • Statically linked → to avoid errors caused by library versions • Input / output data • Input files • Smaller than 20 MByte? • If YES, transfer them from the client side (“InputSandbox”) • If NO, upload them to a Storage Element before job submission • Output files • Smaller than 20 MByte? • If YES, transfer them back to the client side (“OutputSandbox”) • If NO, upload them to a Storage Element from the Computing Element
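
The 20 MByte rule amounts to a simple size check per file. A minimal Python sketch, assuming that 1 MByte means 10^6 bytes (the slide does not say which convention applies) and using hypothetical file names and sizes:

```python
# Assumption: 1 MByte = 10**6 bytes; the slide does not specify the convention.
SANDBOX_LIMIT = 20 * 1000 * 1000

def plan_transfers(file_sizes, limit=SANDBOX_LIMIT):
    """Split files into those small enough for the sandbox and those that
    should be uploaded to a Storage Element. file_sizes maps name -> bytes."""
    sandbox, storage = [], []
    for name, size in sorted(file_sizes.items()):
        (sandbox if size < limit else storage).append(name)
    return sandbox, storage

# Hypothetical input files of one job.
sizes = {"params.txt": 2_000, "mesh.dat": 150_000_000, "solver.sh": 4_096}
sandbox, storage = plan_transfers(sizes)
print(sandbox)  # ['params.txt', 'solver.sh']
print(storage)  # ['mesh.dat']
```

The same check applies to output files, with the upload to the Storage Element done from the Computing Element instead of the client side.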

  24. Distribution of large datasets • Put large files into Storage Elements and register them in the Logical File Catalog (LFC) (already covered in previous sessions) • Large files do not go through the broker • (Diagram) Master job: Inputs → Generate local inputs (LFC & SEs; Logical File Names) → Spawn slaves (Broker, Job submit) → Monitor slaves (four slave jobs, Check job status) → Collect local results (Broker, Get job output, Logical File Names) → Generate result (LFC & SEs) → Results

  25. File services in gLite • Users’ files are stored on Storage Elements • A file on an SE is identified by a Storage URL (SURL) (e.g. sfn://grid005.iucc.ac.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f) • Users refer to files by Logical File Names (LFN) • LFC = directory structure of LFNs + pointers to SURLs (files can have replicas) • Example (diagram): the LFC directory lfn:/grid/gilda/kozlovszky/run2/ contains input1, input2, input3, which point to replicas on four Storage Elements • Storage Element 1: sfn://grid005.iucc.ac.il/storage/gilda/generated/2007-06-23/fileb233d43f-5bc6-4ede-a5fe-611d48be2ba5 • Storage Element 2: srm://aliserv6.ct.infn.it/dpm/ct.infn.it/home/gilda/generated/2007-06-23/filea21ab3e2-8ff6-4a44-82a7-f2 • Storage Element 3: sfn://trigriden01.unime.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f • Storage Element 4: sfn://grid005.iucc.ac.it/flatfiles/SE00/gilda/generated/2007-06-23/filec79a9e3c-2485-4206-a2a5-235f
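
The relation between logical file names and storage URLs can be modelled as a mapping from each LFN to a list of replica SURLs. The Python class below is a toy in-memory model for illustration only; the class, its methods and the `example.org` hosts are hypothetical and are not the LFC client API.

```python
from collections import defaultdict

class LogicalFileCatalog:
    """Toy in-memory model of a catalog: LFN -> list of replica SURLs."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, lfn, surl):
        """Record that the logical file `lfn` has a replica at `surl`."""
        if not lfn.startswith("lfn:/grid/"):
            raise ValueError("LFNs are expected to start with lfn:/grid/")
        if surl not in self._replicas[lfn]:
            self._replicas[lfn].append(surl)

    def resolve(self, lfn):
        """Return all known replica SURLs of a logical file."""
        if lfn not in self._replicas:
            raise KeyError(lfn)
        return list(self._replicas[lfn])

    def list_dir(self, directory):
        """List the LFNs registered below a catalog directory."""
        prefix = directory.rstrip("/") + "/"
        return sorted(l for l in self._replicas if l.startswith(prefix))

lfc = LogicalFileCatalog()
run_dir = "lfn:/grid/gilda/kozlovszky/run2"
# Hypothetical storage hosts; one file may have replicas on several SEs.
lfc.register(f"{run_dir}/input1", "sfn://se1.example.org/gilda/file-aaa")
lfc.register(f"{run_dir}/input1", "srm://se2.example.org/gilda/file-bbb")
lfc.register(f"{run_dir}/input2", "sfn://se3.example.org/gilda/file-ccc")

print(lfc.list_dir(run_dir))
print(lfc.resolve(f"{run_dir}/input1"))
```

A job only needs to know the LFN; resolving it to one of the replicas is the catalog's task.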

  26. Name conventions • LFC has a directory tree structure: lfn:/grid/<VO_name>/<you create it> • lfn:/grid/<VO_name>/ is the LFC namespace; the rest is defined by the user • Users primarily access and manage files through “logical filenames” • Today: lfn:/grid/gilda/parisXX/. . .

  27. Managing a workload with gLite command line tools • Log in to the User Interface machine • Write your jobs. Operations in a job: • Access LFC, resolve LFN • Access SE, get file content • Process file • Write result to SE • Register file in LFC • (Compile your jobs to get the executables) • Write a job description for each job using the Job Description Language (JDL) • Text file • Specifies executable, input and output LFNs • Specifies resource requirements and preferences (which CE) • Write the description of your workload • Workflow JDL or parametric job JDL (no parametric workflow!) → myworkload.jdl • Use shell commands to • Submit the workload: glite-wms-job-submit myworkload.jdl → wlID • Monitor the status: glite-wms-job-status wlID • Get the output sandbox: glite-wms-job-output wlID • Write a program (e.g. script) to • Register input files in LFC before the workload is started • Resubmit failed jobs • Download result files from Storage Elements when the workload is finished
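
Because a JDL file is plain text, the description of a parametric job can be generated by a short script. The Python sketch below writes such a description under loud assumptions: the attribute names (`JobType`, `Parameters`, `ParameterStart`, `ParameterStep`) and the `_PARAM_` placeholder are given as recalled from the gLite user guide and should be checked against it, and the executable and file names are hypothetical.

```python
def parametric_jdl(executable, n_jobs):
    """Return the text of a parametric-job description.

    Assumption: with ParameterStart = 0 and ParameterStep = 1, the value of
    Parameters equals the number of generated jobs, and _PARAM_ is replaced
    by the parameter value in each job. Verify against the gLite user guide.
    """
    lines = [
        'JobType = "Parametric";',
        f'Executable = "{executable}";',
        'Arguments = "input_PARAM_.dat";',
        f"Parameters = {n_jobs};",
        "ParameterStart = 0;",
        "ParameterStep = 1;",
        'StdOutput = "std_PARAM_.out";',
        'StdError = "std_PARAM_.err";',
        f'InputSandbox = {{"{executable}", "input_PARAM_.dat"}};',
        'OutputSandbox = {"std_PARAM_.out", "std_PARAM_.err"};',
    ]
    return "\n".join(lines) + "\n"

# Hypothetical executable name; 27 jobs as in the parameter study example.
jdl_text = parametric_jdl("process.sh", 27)
print(jdl_text)
```

Writing `jdl_text` to myworkload.jdl would give a file of the kind the submit command on this slide expects.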

  28. Managing a workload with gLite command line tools • Log in to the User Interface machine • Write your jobs. Operations in a job: • Access LFC, resolve LFN • Access SE, get file content • Process file • Write result to SE • Register file in LFC • (Compile your jobs to get the executables) • Write a job description for each job using the Job Description Language (JDL) • Text file • Specifies executable, input and output LFNs • Specifies resource requirements and preferences (which CE) • Write the description of your workload • Workflow JDL or parametric job JDL (no parametric workflow!) → myworkload.jdl • Use shell commands to • Submit the workload: glite-wms-job-submit myworkload.jdl → wlID • Monitor the status: glite-wms-job-status wlID • Get the output sandbox: glite-wms-job-output wlID • Write a program (e.g. script) to • Register input files in LFC before the workload is started • Resubmit failed jobs • Download result files from Storage Elements when the workload is finished • Or use P-GRADE Portal
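
The "write a program to resubmit failed jobs" step can be sketched as a small Python wrapper around the three commands named on the slide. Several things here are assumptions rather than documented behaviour: that the submit command prints the workload ID on a line starting with `https://`, that the status command prints a line containing `Current Status:`, and the list of final state names. Depending on the installation, the submit command may also need extra options (for example for proxy delegation), which are not shown.

```python
import subprocess
import time

# Assumption: these state names mark a finished workload.
FINAL_STATES = ("Done", "Aborted", "Cancelled", "Cleared")

def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def submit(jdl_path, runner=run):
    out = runner(["glite-wms-job-submit", jdl_path])
    # Assumption: the workload ID is printed on a line starting with https://
    for line in out.splitlines():
        if line.strip().startswith("https://"):
            return line.strip()
    raise RuntimeError("no workload ID found in the submit output")

def status(wl_id, runner=run):
    out = runner(["glite-wms-job-status", wl_id])
    # Assumption: the state follows the text "Current Status:"
    for line in out.splitlines():
        if "Current Status:" in line:
            return line.split(":", 1)[1].strip()
    return "Unknown"

def run_workload(jdl_path, max_attempts=3, poll_seconds=60,
                 runner=run, sleep=time.sleep):
    """Submit, poll until a final state, fetch output; resubmit on failure."""
    for attempt in range(1, max_attempts + 1):
        wl_id = submit(jdl_path, runner)
        state = status(wl_id, runner)
        while not state.startswith(FINAL_STATES):
            sleep(poll_seconds)
            state = status(wl_id, runner)
        if state.startswith("Done"):
            runner(["glite-wms-job-output", wl_id])  # get the output sandbox
            return wl_id, attempt
    raise RuntimeError(f"workload failed after {max_attempts} attempts")
```

On a User Interface machine, `run_workload("myworkload.jdl")` would drive the whole cycle. The `runner` and `sleep` parameters exist so that the logic can be tried out without a grid, by passing in stand-in functions.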

  29. Further information, references • EGEE: http://www.eu-egee.org/ • gLite middleware: http://www.glite.org • gLite manuals, documentation: http://glite.web.cern.ch/glite/documentation/ (gLite user guide) • Recommended External Software Packages for EGEE Communities (RESPECT): http://egeena4.lal.in2p3.fr/ • P-GRADE Grid Portal: http://portal.p-grade.hu/ • P-GRADE Grid Portal (login here): http://portal.p-grade.hu/multi-grid
