Deployment and Wrapping of legacy applications with ProActive
Stephane Mariani
Stephane.Mariani@sophia.inria.fr
http://ProActive.ObjectWeb.org
OASIS Team
March 13th
Outline
• Motivation
• Research Goals
• Infrastructure Model
• Deploying
  • Simply deploy
  • Simply program
• Interacting
  • Concept
  • System overview
  • Programming
• Conclusion
• Future Works
Motivation
• Grid Computing: large-scale resource sharing, high performance.
• We want to run parallel programs, or couple parallel applications.
• Parallel programming requires a suitable programming model, e.g. message passing (MPI) or threads (OpenMP).
• MPI implementations for Grids are available, e.g. MPICH-G2.
  • Weakness: they are not designed for grid requirements. For example, MPICH-G2 requires the Globus toolkit to be installed in all domains (to address security and cross-domain communication).
• ProActive provides an API that simplifies the programming of Grid Computing applications on LANs, clusters and Internet Grids.
Research Goals
• Ease the deployment of, and interaction between, several MPI (legacy) codes on the Grid
• Use available resources transparently
• Manage the execution and interoperability of parallel MPI applications on multiple resources in a Grid environment (intranet desktop grids, clusters)
• High-level programming model
Infrastructure model
• Deploying
• Interacting
Deploying
[Figure: target infrastructure: two clusters and an intranet desktop grid connected through the Internet]
[Figure: the same infrastructure with one MPI application running at each site]
Deploying: Simply deploy
A key principle: Virtual Node (VN) + XML deployment file
Virtual Node (VN):
• Identified by a string name
• Used in the program source
• Configured (mapped) in the XML descriptor file --> Nodes
XML deployment file:
• Mapping of VNs to JVMs
• Register or look up VNs
• Create or acquire JVMs
[Figure: one Virtual Node per site (cluster with PBS, cluster with LSF, intranet desktop grid), each site running an MPI application, all connected through the Internet]
Deploying: Simply deploy
How to deploy? Definitions and mapping
Virtual Node definition (definition of Virtual Nodes):
    <virtualNodesDefinition>
      <virtualNode name="deskGridVN"/>
      <virtualNode name="clustLSFVN"/>
      <virtualNode name="clustPBSVN"/>
    </virtualNodesDefinition>
Mapping of Virtual Nodes:
    <map virtualNode="deskGridVN">
      <jvmSet> <vmName value="deskGridJvm"/> </jvmSet>
    </map>
    <map virtualNode="clustLSFVN">
      <jvmSet> <vmName value="clustLSFJvm"/> </jvmSet>
    </map>
    <map virtualNode="clustPBSVN">
      <jvmSet> <vmName value="clustPBSJvm"/> </jvmSet>
    </map>
Deploying: Simply deploy
How to deploy? Mapping
JVMs definition:
    <jvm name="deskGridJvm">
      <creation> <processRef id="deskGridProcess"/> </creation>
    </jvm>
    <jvm name="clustLSFJvm">
      <creation> <processRef id="clustLSFProcess"/> </creation>
    </jvm>
    <jvm name="clustPBSJvm">
      <creation> <processRef id="clustPBSProcess"/> </creation>
    </jvm>
Deploying: Simply deploy
How to deploy? Infrastructure information
Process or Service definition:
    <processDefinition id="deskGridProcess">
      <dependentProcessSequence class="org.objectweb.proactive.core.process.DependentListProcess">
        <serviceRef refid="p2pservice"/>
        <processRef refid="mpiProcessDeskGrid"/>
      </dependentProcessSequence>
    </processDefinition>
    <processDefinition id="clustLSFProcess">
      <dependentProcessSequence class="org.objectweb.proactive.core.process.DependentListProcess">
        <processRef refid="LSFProcess"/>
        <processRef refid="mpiProcessClustLSF"/>
      </dependentProcessSequence>
    </processDefinition>
    <processDefinition id="clustPBSProcess">
      <dependentProcessSequence class="org.objectweb.proactive.core.process.DependentListProcess">
        <processRef refid="PBSProcess"/>
        <processRef refid="mpiProcessClustPBS"/>
      </dependentProcessSequence>
    </processDefinition>
Deploying: Simply deploy
How to deploy? Infrastructure information
MPI process definition:
    <processDefinition id="mpiProcessClustPBS">
      <mpiProcess class="org.objectweb.proactive.core.process.mpi.MPIDependentProcess"
                  mpiFileName="myexec"
                  mpiCommandOptions="-option input.data output.res">
        <commandPath value="/opt/mpich/gnu/bin/mpirun"/>
        <mpiOptions>
          <processNumber>16</processNumber>
          <localRelativePath>
            <relativePath origin="user.home" value="MPIRep" />
          </localRelativePath>
          <remoteAbsolutePath>
            <absolutePath value="/home/smariani/MPIRep" />
          </remoteAbsolutePath>
        </mpiOptions>
      </mpiProcess>
    </processDefinition>
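The descriptor fragments in the "Simply deploy" slides chain three levels of indirection: a Virtual Node name maps to a JVM name, and the JVM refers to a process definition. As a rough illustration only, the plain-Java sketch below resolves that chain with in-memory maps. The class `DescriptorChain` and its methods are hypothetical and are not part of ProActive; only the names and ids are taken from the XML in the slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not ProActive code: models the indirection
// Virtual Node -> JVM -> process definition used by the XML descriptor.
class DescriptorChain {
    private final Map<String, String> vnToJvm = new LinkedHashMap<>();
    private final Map<String, String> jvmToProcess = new LinkedHashMap<>();

    // Corresponds to a <map virtualNode="..."> entry.
    void map(String virtualNode, String jvmName) {
        vnToJvm.put(virtualNode, jvmName);
    }

    // Corresponds to a <jvm name="..."> entry and its process reference.
    void jvm(String jvmName, String processId) {
        jvmToProcess.put(jvmName, processId);
    }

    // Follows both indirections and returns the process definition id.
    String resolve(String virtualNode) {
        String jvmName = vnToJvm.get(virtualNode);
        if (jvmName == null) {
            throw new IllegalArgumentException("Unmapped virtual node: " + virtualNode);
        }
        String processId = jvmToProcess.get(jvmName);
        if (processId == null) {
            throw new IllegalArgumentException("No process for JVM: " + jvmName);
        }
        return processId;
    }

    // Builds the three mappings shown in the slides.
    static DescriptorChain fromSlides() {
        DescriptorChain d = new DescriptorChain();
        d.map("deskGridVN", "deskGridJvm");
        d.map("clustLSFVN", "clustLSFJvm");
        d.map("clustPBSVN", "clustPBSJvm");
        d.jvm("deskGridJvm", "deskGridProcess");
        d.jvm("clustLSFJvm", "clustLSFProcess");
        d.jvm("clustPBSJvm", "clustPBSProcess");
        return d;
    }

    public static void main(String[] args) {
        DescriptorChain d = fromSlides();
        for (String vn : new String[] {"deskGridVN", "clustLSFVN", "clustPBSVN"}) {
            System.out.println(vn + " -> " + d.resolve(vn));
        }
    }
}
```

The point of the indirection is that the program source only ever names the Virtual Node; swapping PBS for LSF, or a cluster for a desktop grid, changes the descriptor and not the code.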
Deploying: Simply program
    Descriptor pad = ProActive.getDescriptor("file:DeploymentDescriptor.xml");
    VirtualNode vnDeskGrid = pad.getVirtualNode("deskGridVN");
    VirtualNode vnClustLSF = pad.getVirtualNode("clustLSFVN");
    VirtualNode vnClustPBS = pad.getVirtualNode("clustPBSVN");
    // triggers JVM creation and creates an MPISpmd Active Object from each Virtual Node
    MPISpmd deskGridSpmd = MPI.newMPISpmd(vnDeskGrid);
    MPISpmd clustLSFSpmd = MPI.newMPISpmd(vnClustLSF);
    MPISpmd clustPBSSpmd = MPI.newMPISpmd(vnClustPBS);
    // triggers execution of the MPI codes and returns a future on an MPIResult object
    MPIResult deskGridRes = deskGridSpmd.startMPI();
    MPIResult clustLSFRes = clustLSFSpmd.startMPI();
    MPIResult clustPBSRes = clustPBSSpmd.startMPI();
[Figure: one MPISpmd active object per site (two clusters and the intranet desktop grid), each controlling the MPI processes of its MPI application]
Deploying: Simply program
MPISpmd object status:
• MPI_UNSTARTED
• MPI_RUNNING
• MPI_STOPPED
• MPI_KILLED
• MPI_FINISHED
    // prints current status of object
    log(" Current Status of MPI code on Desktop Grid: " + deskGridSpmd.getStatus());
[Figure: MPISpmd state diagram. States: MPI UNSTARTED, MPI RUNNING, MPI STOPPED, MPI FINISHED, MPI KILLED. Transitions are labelled MPI.newMPISpmd(...), startMPI, stopMPI, reStartMPI, killMPI and wait-by-necessity; calls that are not allowed in the current state raise IllegalMPIStateException]
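A status life cycle like this one can be modelled as a small state machine. The sketch below is only an illustration: the exact transitions of the slide's diagram did not survive, so the transition rules here are assumptions chosen to be plausible, and `MpiLifecycle` is a hypothetical class, not ProActive's `MPISpmd`. The nested `IllegalMPIStateException` is a local stand-in that merely reuses the name shown on the slide.

```java
import java.util.EnumSet;

// Hypothetical model of an MPI job life cycle; NOT the ProActive implementation.
// Every transition rule below is an assumption made for illustration.
class MpiLifecycle {
    enum Status { MPI_UNSTARTED, MPI_RUNNING, MPI_STOPPED, MPI_KILLED, MPI_FINISHED }

    // Local stand-in for the exception named on the slide.
    static class IllegalMPIStateException extends RuntimeException {
        IllegalMPIStateException(String message) { super(message); }
    }

    private Status status = Status.MPI_UNSTARTED;

    Status getStatus() { return status; }

    // Assumption: a job can only be started once, from the unstarted state.
    void startMPI() {
        require("startMPI", EnumSet.of(Status.MPI_UNSTARTED));
        status = Status.MPI_RUNNING;
    }

    // Assumption: only a running job can be stopped.
    void stopMPI() {
        require("stopMPI", EnumSet.of(Status.MPI_RUNNING));
        status = Status.MPI_STOPPED;
    }

    // Assumption: a job that was started before (and not killed) can be restarted.
    void reStartMPI() {
        require("reStartMPI",
                EnumSet.of(Status.MPI_RUNNING, Status.MPI_STOPPED, Status.MPI_FINISHED));
        status = Status.MPI_RUNNING;
    }

    // Assumption: killing is allowed from any state.
    void killMPI() {
        status = Status.MPI_KILLED;
    }

    // Called by the model when the simulated job completes.
    void finished() {
        require("finished", EnumSet.of(Status.MPI_RUNNING));
        status = Status.MPI_FINISHED;
    }

    private void require(String call, EnumSet<Status> allowed) {
        if (!allowed.contains(status)) {
            throw new IllegalMPIStateException(call + " is not allowed in state " + status);
        }
    }

    public static void main(String[] args) {
        MpiLifecycle job = new MpiLifecycle();
        System.out.println(job.getStatus());
        job.startMPI();
        System.out.println(job.getStatus());
        job.finished();
        System.out.println(job.getStatus());
    }
}
```

Keeping the allowed source states next to each method makes it easy to adjust the rules once the real diagram is at hand.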
Deploying: Simply program
Code synchronization
    // prints result value of mpi code execution
    log(" Result of MPI code on Desktop Grid: " + deskGridRes.getReturnValue());
Wait-by-necessity blocks the thread until the deskGridRes object is available.
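For readers unfamiliar with futures, the idea behind wait-by-necessity can be approximated with the standard `java.util.concurrent.CompletableFuture`. This is a sketch under assumptions, not ProActive code: ProActive futures are transparent (the caller uses the result object directly), whereas here the blocking point is the explicit `join()` call. `WaitByNecessityDemo` and `startJob` are hypothetical names, and the job is simulated with a short sleep.

```java
import java.util.concurrent.CompletableFuture;

// Illustrative only: approximates wait-by-necessity with a standard Java future.
class WaitByNecessityDemo {
    // Stand-in for a long-running MPI job; completes with a fake return value of 0.
    static CompletableFuture<Integer> startJob() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(200); // simulated computation time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 0;
        });
    }

    public static void main(String[] args) {
        // The call returns immediately; the job runs in the background.
        CompletableFuture<Integer> result = startJob();
        System.out.println("Job started, caller keeps working...");

        // The caller blocks only here, when the value is actually needed.
        int value = result.join();
        System.out.println("Result of job: " + value);
    }
}
```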
Interacting
Interacting: Concept
• Just a few changes in the source code
• No extension to MPI
• Usage of vendor-implemented MPI for internal communication
• Usage of ProActive for external communication
[Figure: two MPI applications, each with numbered processes, communicating across the Internet]
Interacting: System overview
[Figure: creating or acquiring JVMs on the two clusters and the intranet desktop grid]
[Figure: deploying Active Objects (AO) on the JVMs using Groups]
[Figure: loading the library and initializing message queues. On each JVM, an Active Object with its request queue reaches native code through JNI (Java Native Interface) and a pair of message queues, C2S QUEUE and S2C QUEUE; the two clusters shown are idJob=0 and idJob=1]
Interacting: System overview
IPC: message queues
Basic idea:
• Two or more processes can exchange information via access to a common system message queue.
• The queue can be seen as an internal linked list stored inside the kernel's memory space.
• New messages are added at the end of the queue.
Efficiency:
• Synchronization is provided automatically by the kernel, unlike with shared memory.
• Messages may be obtained from the queue either in FIFO order (the default) or by requesting a specific type of message.
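The two retrieval modes can be illustrated with a toy in-memory model. The sketch below is plain Java and does not touch the kernel; it only mimics the retrieval rule of System V message queues, where a requested type of 0 returns the oldest message and a positive type returns the oldest message of that type. `TypedMessageQueue` is a hypothetical class written for this example, and returning `null` on an empty queue is a simplification (a real receiver would normally block).

```java
import java.util.Iterator;
import java.util.LinkedList;

// Toy model of message-queue retrieval semantics; not a real IPC queue.
class TypedMessageQueue {
    static final class Message {
        final long type;
        final String body;

        Message(long type, String body) {
            this.type = type;
            this.body = body;
        }
    }

    private final LinkedList<Message> messages = new LinkedList<>();

    // New messages are appended at the end of the queue.
    synchronized void send(long type, String body) {
        messages.addLast(new Message(type, body));
    }

    // type == 0: remove and return the oldest message (FIFO).
    // type  > 0: remove and return the oldest message with that type.
    // Returns null when no matching message is queued.
    synchronized Message receive(long type) {
        Iterator<Message> it = messages.iterator();
        while (it.hasNext()) {
            Message m = it.next();
            if (type == 0 || m.type == type) {
                it.remove();
                return m;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TypedMessageQueue q = new TypedMessageQueue();
        q.send(1, "first, type 1");
        q.send(2, "second, type 2");
        q.send(1, "third, type 1");
        System.out.println(q.receive(2).body); // selects by type, skipping older messages
        System.out.println(q.receive(0).body); // plain FIFO
    }
}
```

The `synchronized` methods stand in for the synchronization that the kernel provides for a real queue.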
Interacting: System overview
[Figure: triggering MPI execution using the API. On each JVM an Active Object starts the MPI application; the application's IPC module exchanges data with the Active Object through JNI and the C2S/S2C message queues, while MPI uses the cluster network]
[Figure: registering the Active Object with the MPI process rank. Each MPI process passes its rank through the IPC module, and the Active Object calls Register(idJob, rank)]
[Figure: point-to-point inter-communication. Data leaves an MPI process of idJob=0 through its IPC module, its Active Object calls send(idJob, rank, data), and the Active Object on the destination cluster (idJob=1) delivers the data to the target MPI process]
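The system overview figures name two operations, Register(idJob, rank) and send(idJob, rank, data). One way to picture them is a routing table keyed by the (idJob, rank) pair. The sketch below is a guess at the general shape of that idea, not the ProActive implementation: `JobRouter` is hypothetical, and an in-memory `BlockingQueue` stands in for the Active Object that would forward data to the MPI process.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: routes data to a receiver identified by (idJob, rank).
class JobRouter {
    private final Map<String, BlockingQueue<byte[]>> inboxes = new HashMap<>();

    private static String key(int idJob, int rank) {
        return idJob + ":" + rank;
    }

    // Registers the receiver for (idJob, rank) and returns its inbox.
    synchronized BlockingQueue<byte[]> register(int idJob, int rank) {
        return inboxes.computeIfAbsent(key(idJob, rank), k -> new LinkedBlockingQueue<>());
    }

    // Delivers data to the receiver registered for (idJob, rank).
    synchronized void send(int idJob, int rank, byte[] data) {
        BlockingQueue<byte[]> inbox = inboxes.get(key(idJob, rank));
        if (inbox == null) {
            throw new IllegalArgumentException("No receiver registered for job "
                    + idJob + ", rank " + rank);
        }
        inbox.add(data);
    }

    public static void main(String[] args) throws InterruptedException {
        JobRouter router = new JobRouter();
        BlockingQueue<byte[]> inbox = router.register(1, 0); // job 1, rank 0
        router.send(1, 0, "hello".getBytes());
        System.out.println(new String(inbox.take()));
    }
}
```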
Interacting: Programming
Using IPCModule functions: point-to-point communication
[Figure: three MPI jobs (idJob=0, idJob=1, idJob=2) on two clusters and the intranet desktop grid, connected through JVMs over the WAN; numbered arrows show the order of the messages]
Reading the three programs together, the first argument of ProActiveMPI_Send and ProActiveMPI_Recv appears to be the idJob of the peer, so the message travels from job 0 rank 0 to job 2, then to job 1, and back to job 0 rank 1.
idJob=0:
    MPI_Init();
    ProActiveMPI_Init(myrank);
    if (myrank==0) {
      ProActiveMPI_Send(2, &bufSnd, cnt, myrank, MPI_CHAR, 0, 0);
    } else if (myrank==1) {
      ProActiveMPI_Recv(1, &bufRcv, cnt, 1, MPI_CHAR, 0);
    }
    ProActiveMPI_Finalize();
    MPI_Finalize();
idJob=1:
    MPI_Init();
    ProActiveMPI_Init(myrank);
    if (myrank==0) {
      ProActiveMPI_Recv(2, &buf, cnt, 1, MPI_CHAR, 0);
      MPI_Send(&buf, cnt, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (myrank==1) {
      MPI_Recv(&buf, cnt, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
      ProActiveMPI_Send(0, &buf, cnt, myrank, MPI_CHAR, 1, 0);
    }
    ProActiveMPI_Finalize();
    MPI_Finalize();
idJob=2:
    MPI_Init();
    ProActiveMPI_Init(myrank);
    if (myrank==0) {
      ProActiveMPI_Recv(0, &buf, cnt, 0, MPI_CHAR, 0);
      MPI_Send(&buf, cnt, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (myrank==1) {
      MPI_Recv(&buf, cnt, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
      ProActiveMPI_Send(1, &buf, cnt, myrank, MPI_CHAR, 0, 0);
    }
    ProActiveMPI_Finalize();
    MPI_Finalize();
Interacting: Programming
Global communication: Broadcast
[Figure: three MPI jobs (idJob=0, idJob=1, idJob=2) on two clusters and the intranet desktop grid, connected through JVMs over the WAN; numbered arrows show rank 0 of idJob=0 broadcasting to the other two jobs]
idJob=0:
    MPI_Init();
    ProActiveMPI_Init(myrank);
    if (myrank==0) {
      ProActiveMPI_AllSend(2, &buf, cnt, myrank, MPI_CHAR, 0);
      ProActiveMPI_AllSend(1, &buf, cnt, myrank, MPI_CHAR, 0);
    }
    ProActiveMPI_Finalize();
    MPI_Finalize();
idJob=1:
    MPI_Init();
    ProActiveMPI_Init(myrank);
    ProActiveMPI_AllRecv(0, &buf, cnt, MPI_CHAR, 0);
    ProActiveMPI_Finalize();
    MPI_Finalize();
idJob=2:
    MPI_Init();
    ProActiveMPI_Init(myrank);
    ProActiveMPI_AllRecv(0, &buf, cnt, MPI_CHAR, 0);
    ProActiveMPI_Finalize();
    MPI_Finalize();
Interacting: Programming
    Descriptor pad = ProActive.getDescriptor("file:DeploymentDescriptor.xml");
    VirtualNode vnDeskGrid = pad.getVirtualNode("deskGridVN");
    VirtualNode vnClustLSF = pad.getVirtualNode("clustLSFVN");
    VirtualNode vnClustPBS = pad.getVirtualNode("clustPBSVN");
    // triggers JVM creation and creates an MPISpmd Active Object from each Virtual Node
    MPISpmd deskGridSpmd = MPI.newMPISpmd(vnDeskGrid);
    MPISpmd clustLSFSpmd = MPI.newMPISpmd(vnClustLSF);
    MPISpmd clustPBSSpmd = MPI.newMPISpmd(vnClustPBS);
    // creates a JobManager
    JobManager j_m = (JobManager) ProActive.newActive(JobManager.class.getName());
    ArrayList myJobs = new ArrayList();
    myJobs.add(deskGridSpmd);
    myJobs.add(clustLSFSpmd);
    myJobs.add(clustPBSSpmd);
    // starts the JobManager with a list of jobs
    j_m.startJobManager(myJobs);
Conclusion
• Non-interacting context
  • Lets the user run an MPI executable automatically and transparently, without worrying about resource booking, just by completing an XML descriptor file and writing 3 lines of code.
• Interacting context
  • Just a few changes in the source code
  • No extension to MPI
  • Usage of vendor-implemented MPI for intra-communication
  • Usage of ProActive for external communication
• Ongoing work
  • Specification of an API for point-to-point and collective operations
  • Fortran interface