GARUDA National Grid Computing Initiative N. Mohan Ram Chief Investigator – GARUDA 9th February 2006 Kolkata
Presentation Outline • Overview • Technologies and Research Initiatives • Communication Fabric • Resources • Partners • Applications
Project Overview • Precursor to the National Grid Computing Initiative • Test bed for grid technologies, concepts and applications • Provides inputs for the main grid proposal • Major Deliverables • Technologies, Architectures, Standards & Research Initiatives • Nation-wide high-speed communication fabric • Aggregation of Grid Resources • Deployment of select applications of national importance • Grid Strategic User Group
Technologies, Architectures, Standards and Research Initiatives
Deliverables • Technologies • Garuda Component Architecture & Deployment • Access Portal • Problem Solving Environments • Collaborative Environments • Program Development Environments • Management and Monitoring • Middleware and Security • Resource Management and Scheduling • Data Management • Clustering Technologies • Research Initiatives • Integrated Development Environments • Resource Brokers & Meta Schedulers • Mobile Agent Framework • Semantic Grid Services (MIT Chennai) • Network Simulation
GARUDA Components (architecture stack) • Grid Access: C-DAC Grid Portal, Problem Solving Environments • Methods & Applications: C-DAC Benchmarks, Grid Probes, Grid Applications • PDE: DIViA for Grid, IDE, Workflow, Profilers, Cactus • Collaborative Environment: Access GRID, Video Conferencing over IP • Monitoring & Management: C-DAC GridMon, NMS • Storage & Visualization: Storage Resource Broker, Visualization Software • Middleware & Security: GLOBUS 2.x/4.x, Semantic Grid Services, Resource Broker, MPICH-G2, MDS, Grid Schedulers, Certificate Authority, Ganglia, SUN Grid Engine, LoadLeveler, Grid Security • Integration & Engineering (cross-cutting) • Legend: C-DAC Development & Deployment / Collaborations / Research Initiatives / Open Source / Commercial
Garuda Resource Deployment (at C-DAC centres) • End users (Bangalore, Pune and other locations) access the grid through the Garuda Access Portal • Compute resources: AIX cluster at Bangalore; Solaris and Linux clusters at Pune; Linux clusters at Hyderabad and Chennai • Shared user space and shared data space across centres • Managed by the Resource Manager for Grids, configured for high availability
Garuda Access Portal • Addresses the usability challenges of the Grid • Supports submission of parallel and sequential jobs (see the sketch below) • Support for accounting • Integration with the Grid Scheduler is in progress
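To make the submission path concrete, here is a minimal sketch of the kind of GT2 command-line job submission such a portal front-ends, assuming the standard globusrun client is on the PATH; the gatekeeper contact string, executable path and CPU count are hypothetical.

```python
# Minimal sketch of a GT2 batch submission of the kind the Garuda
# Access Portal wraps. Contact string and executable are hypothetical.
import subprocess

GATEKEEPER = "xyz.cdacb.ernet.in/jobmanager-loadleveler"  # hypothetical contact
RSL = "&(executable=/home/guser/a.out)(count=8)(jobtype=mpi)"

def submit_batch(contact: str, rsl: str) -> str:
    # globusrun -b submits in batch mode and prints a job contact
    # that can later be polled with `globusrun -status <contact>`
    out = subprocess.run(["globusrun", "-b", "-r", contact, rsl],
                         check=True, capture_output=True, text=True)
    return out.stdout.strip()

if __name__ == "__main__":
    job_contact = submit_batch(GATEKEEPER, RSL)
    print("submitted:", job_contact)
```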
Collaborative Environments • Enable a collaborative environment for Grid developers, users and partners; will facilitate development team meetings and collaborative project design/progress reviews • IP-based video conferencing over the high-speed communication fabric • Initial target: enable all C-DAC centres participating in Garuda development & deployment to collaborate through video conferencing • Also exploring the Access Grid environment
Program Development Environment • Enables users to carry out the entire program development life cycle for the Grid • DIViA for the Grid • Features • Supports MPICH-G2 debugging • Communication and computational statistics in different graphical formats • Identification of potential bottlenecks • Unique method of tracing yields enhanced information with reduced log file size (see the illustration below) • Debugger in design phase
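DIViA's actual trace format is not documented here, so the following is a purely generic illustration of the idea behind compact tracing: aggregating per-rank message counts and volumes instead of logging every event, which is one way a tracer keeps log files small while still yielding communication statistics. The record layout (src rank, dst rank, bytes) is hypothetical.

```python
# Generic illustration (not DIViA's format): aggregating MPI trace
# records into per-rank communication statistics. Each input line is
# assumed to be "src_rank dst_rank nbytes" - a hypothetical layout.
from collections import defaultdict

def comm_stats(lines):
    msgs = defaultdict(int)    # (src, dst) -> message count
    volume = defaultdict(int)  # (src, dst) -> total bytes
    for line in lines:
        src, dst, nbytes = line.split()
        key = (int(src), int(dst))
        msgs[key] += 1
        volume[key] += int(nbytes)
    return msgs, volume

if __name__ == "__main__":
    trace = ["0 1 4096", "0 1 4096", "1 0 128"]
    msgs, volume = comm_stats(trace)
    for (src, dst), n in msgs.items():
        print(f"rank {src} -> rank {dst}: {n} msgs, {volume[(src, dst)]} bytes")
```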
Management and Monitoring • Monitors status & utilization of the Grid components: compute, network, software etc. (see the sketch below) • Used by system administrators and end users • Being deployed at the Grid Monitoring and Management Centre (GMMC) • User-friendly interface
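Ganglia appears in the monitoring layer of the component stack; as one hedged example of how a monitoring front-end can poll it, the sketch below reads the XML dump that a standard Ganglia gmond daemon serves on TCP port 8649. The host name is hypothetical.

```python
# Hedged sketch: pulling node metrics from a Ganglia gmond daemon,
# one of the monitoring components in the Garuda stack.
# gmond's XML dump on TCP 8649 is standard Ganglia behaviour.
import socket
import xml.etree.ElementTree as ET

def gmond_metrics(host: str, port: int = 8649) -> dict:
    # gmond dumps its full cluster state as XML on connect
    with socket.create_connection((host, port)) as s:
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    root = ET.fromstring(b"".join(chunks))
    # Collect the one-minute load average reported for every host
    return {h.get("NAME"): m.get("VAL")
            for h in root.iter("HOST")
            for m in h.iter("METRIC") if m.get("NAME") == "load_one"}

if __name__ == "__main__":
    print(gmond_metrics("gridmon.cdacb.ernet.in"))  # hypothetical GMMC node
```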
Middleware & Security • Deployed using Globus Toolkit, commercial and C-DAC developed components • GT2 for operational requirements • GT4 for research projects • Resource Management and Scheduling • Moab from Cluster Resources for Grid Scheduling • Local scheduling using LoadLeveler for AIX clusters and Torque for Solaris and Linux clusters (a submission sketch for both follows below) • Data Management • Storage Resource Broker from Nirvana for Data Grid functionality
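A hedged sketch of what local submission looks like on the two scheduler families named above; job names, resource limits and the application command are hypothetical.

```python
# Hedged sketch of local submission on the two scheduler families in the
# Garuda stack: Torque (qsub) on Linux/Solaris, LoadLeveler (llsubmit)
# on AIX. Job names and resource limits are hypothetical.
import subprocess
import tempfile

PBS_SCRIPT = """#!/bin/sh
#PBS -N garuda_test
#PBS -l nodes=4:ppn=2
mpirun -np 8 ./a.out
"""

LL_SCRIPT = """#@ job_name = garuda_test
#@ job_type = parallel
#@ node = 4
#@ tasks_per_node = 2
#@ queue
mpirun -np 8 ./a.out
"""

def submit(script: str, command: str) -> None:
    # Write the job script to a temp file and hand it to the scheduler
    with tempfile.NamedTemporaryFile("w", suffix=".job", delete=False) as f:
        f.write(script)
        path = f.name
    subprocess.run([command, path], check=True)

if __name__ == "__main__":
    submit(PBS_SCRIPT, "qsub")      # Torque cluster
    submit(LL_SCRIPT, "llsubmit")   # LoadLeveler cluster
```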
Resource Management and Scheduling • Grid Scheduler from Cluster Resources • Industry-leading scheduler • Components include Moab Workload Manager, Moab Grid Scheduler and Moab Cluster Manager • Integrates with Globus • Data management through GASS and GridFTP • Job staging with GRAM/Gatekeeper services • User management through Globus user mapping files • Security through X.509-based client authentication • Grid Scheduler Features • Intelligent Data Staging • Co-Allocation & Multi-Sourcing (see the sketch below) • Service Monitoring and Management • Sovereignty (Local vs. Central Management Policies) • Virtual Private Cluster and Virtual Private Grid • Local Resource Managers • LoadLeveler on AIX • Torque on Solaris/Linux clusters
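Co-allocation in a Globus/MPICH-G2 stack is commonly expressed as an RSL multirequest spanning two gatekeepers, as in this hedged sketch; the contact strings and executable path are hypothetical, and the multirequest follows the pattern in the MPICH-G2 documentation rather than Garuda's actual configuration.

```python
# Hedged sketch: co-allocating one MPICH-G2 job across two clusters with
# a Globus RSL multirequest, passed to MPICH-G2's mpirun via -globusrsl.
# Gatekeeper contacts and the executable path are hypothetical.
import subprocess
import tempfile

RSL = """+
( &(resourceManagerContact="gk1.cdacb.ernet.in/jobmanager-loadleveler")
   (count=8)(jobtype=mpi)(label="subjob 0")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0))
   (executable=/home/guser/myapp) )
( &(resourceManagerContact="gk2.cdacp.ernet.in/jobmanager-pbs")
   (count=8)(jobtype=mpi)(label="subjob 1")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 1))
   (executable=/home/guser/myapp) )
"""

with tempfile.NamedTemporaryFile("w", suffix=".rsl", delete=False) as f:
    f.write(RSL)
    rsl_file = f.name

# MPICH-G2's mpirun accepts a prebuilt RSL file via -globusrsl
subprocess.run(["mpirun", "-globusrsl", rsl_file], check=True)
```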
Wide Area Grid • Administrators set policies and manage their own clusters via the Cluster Manager, and Grid-wide policies via the Grid Resource Manager • The Grid Resource Manager interacts with GridFTP to stage data to each of the clusters • The Grid Resource Manager leverages the security and access control provided by Globus • End users (in multiple user spaces: User Space 1 … User Space N) submit jobs via the Garuda Grid Access Portal
Local Area Grid (C-DAC Bangalore) • Single user space over a unified data space spanning the Linux, AIX and Solaris cluster head nodes (each with its own OS and communication stack) • Administrators set policies and manage via the Moab Cluster Manager, which acts as the interface, using wizards and forms to improve ease of use and to unify the interface to the workload and resource managers • The Moab Workload Manager enforces policies, monitors workload and controls submissions through the resource managers: Torque on the Linux and Solaris clusters, LoadLeveler on the AIX cluster • End users (in a single user space) submit jobs via the web-form interface of the Garuda Access Portal
Data Management • Enable data-oriented applications via an integrated but distributed storage and data management infrastructure • Requirements • Heterogeneous Data Access across Multiple Locations • Data Security • Reliability and Consistency of Data • Support for Unified Namespace and Multiple File Systems • Optimal turnaround for Data Access • Parallel I/O • Bulk Operations • Intelligent Resource Selection and Data Routing • Latency Minimization • Vertical and Horizontal Scalability • Garuda Data Grid • Storage Resource Broker from Nirvana (see the sketch below)
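As a sketch of the data-grid workflow such an SRB layer exposes, assuming SDSC-style S-command clients are installed; the collection paths and file names are hypothetical.

```python
# Hedged sketch of SRB data-grid operations via the classic S-commands
# (Sinit/Sput/Sls/Sget/Sexit). Collection and file names are hypothetical.
import subprocess

def srun(*cmd: str) -> None:
    # Thin wrapper: run one S-command and fail loudly on error
    subprocess.run(list(cmd), check=True)

if __name__ == "__main__":
    srun("Sinit")                                               # authenticate to the SRB server
    srun("Sput", "input.dat", "/garuda/home/guser/input.dat")   # upload into a collection
    srun("Sls", "/garuda/home/guser")                           # browse the unified namespace
    srun("Sget", "/garuda/home/guser/result.dat", "result.dat") # fetch a result
    srun("Sexit")                                               # close the session
```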
Clustering Technologies • Software • High Performance Compilers • Message Passing Libraries • Performance and Debugging Tools • I/O Libraries, Parallel File System • Cluster Management Software • Available for AIX, Solaris and Linux Clusters • Hardware • 5 Gbps SAN Technologies completed • Reconfigurable Computing Systems for bioinformatics & cryptanalysis in progress
Research Initiatives • Resource Broker • Standards are yet to be formulated • Match the user requirements with the available resources (illustrated below) • Address co-allocation of computation and communication • Forecasting the availability of resources • Grid IDE • Writing and enabling applications to exploit the Grid • Compiling/cross-compiling across different platforms • Seamless integration of complex functionalities • Support for multiple programming interfaces • Semantic Grid Services (MIT, Chennai) • Publishing Grid Services • Intelligent discovery of Grid services • Integration with Garuda Portal • Mobile Agent Framework • Monitoring of resources in the Grid • Grid software deployment and maintenance • Network Simulation • Inputs for the next-phase fabric architecture • To study the impact of changes in the traffic profile on performance
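To make the broker's matchmaking step concrete, here is a purely illustrative sketch, not the actual Garuda broker: resources advertise attributes, the broker filters on hard requirements, then ranks the survivors by a simple load estimate standing in for a real availability forecast. All names and numbers are hypothetical.

```python
# Purely illustrative matchmaking sketch (not the actual Garuda broker):
# filter advertised resources against job requirements, then rank by a
# simple load estimate. All attribute values are hypothetical.
RESOURCES = [
    {"name": "aix-blr",  "os": "aix",     "cpus": 128, "free_cpus": 40, "load": 0.6},
    {"name": "sol-pune", "os": "solaris", "cpus": 64,  "free_cpus": 10, "load": 0.8},
    {"name": "lin-chn",  "os": "linux",   "cpus": 16,  "free_cpus": 16, "load": 0.1},
]

def match(job, resources):
    # Keep resources that satisfy the hard requirements...
    feasible = [r for r in resources
                if r["os"] == job["os"] and r["free_cpus"] >= job["cpus"]]
    # ...then rank by current load, a stand-in for a real forecast
    # of resource availability.
    return sorted(feasible, key=lambda r: r["load"])

if __name__ == "__main__":
    job = {"os": "linux", "cpus": 8}
    for r in match(job, RESOURCES):
        print("candidate:", r["name"])
```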
Objectives & Deliverables • Objective • Provide an ultra-high-speed multi-service communication fabric connecting user organizations across 17 cities in the country • Provide seamless & high-speed access to the compute, data & other resources on the Grid • In collaboration with ERNET • Deliverables • High-speed communication fabric connecting 17 cities • Grid Management & Monitoring Centre • IP-based collaborative environment among select centres
Features • Ethernet-based high-bandwidth capacity • Scalable over the entire geographic area • High levels of reliability • Fault tolerance and redundancy • Interference resilience • High security • Effective network management
Grid Management & Monitoring Centre (GMMC) • To provide an integrated Grid Resource Management & Monitoring Framework • Network Traffic Analysis and Congestion Management • Change and Configuration Management
Objective and Deliverables • Objective • Provide heterogeneous resources in the Grid including Compute, Data, Software and Scientific Instruments • Deploy test facilities for Grid-related research and development activities • Deliverables • Grid enablement of C-DAC resources at Bangalore and Pune • Aggregation of Partner Resources • Setting up of PoC Test Bed and Grid Labs at Bangalore, Pune, Hyderabad and Chennai
Resources • HPC Clusters & Storage from C-DAC • Bangalore: 128 CPU AIX Cluster, 5 TB Storage • Pune: 64 CPU Solaris Cluster + 16 CPU Linux Cluster, 4 TB Storage • Chennai: 16 CPU Linux Cluster, 2 TB Storage • Hyderabad: 16 CPU Linux Cluster, 2 TB Storage • The proposed 5 TF system to be part of the Grid • Satellite Terminals from SAC Ahmedabad • 2 TF Computing Cycles from IGIB Delhi • 32-way SMP from Univ. of Hyderabad • 64 CPU cluster from MIT, Chennai • 64 CPU cluster from PRL, Ahmedabad
Motivation and Status • Motivation • Set up a User Group to collaborate on research and engineering of technologies, architectures, standards and applications in HPC and Grid Computing • To contribute to the aggregation of resources in the Grid • Current Status • 37 research & academic institutions in 17 cities have agreed in principle to participate • ERNET-HQ in Delhi • 7 centres of C-DAC • Total of 45 institutions
Partner Participation • Institute of Plasma Research, Ahmedabad • Physical Research Laboratory, Ahmedabad • Space Applications Centre, Ahmedabad • Harish Chandra Research Institute, Allahabad • Motilal Nehru National Institute of Technology, Allahabad • Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore • Indian Institute of Astrophysics, Bangalore • Indian Institute of Science, Bangalore • Institute of Microbial Technology, Chandigarh • Punjab Engineering College, Chandigarh • Madras Institute of Technology, Chennai • Indian Institute of Technology, Chennai • Institute of Mathematical Sciences, Chennai
Partner Participation (Contd.) • Indian Institute of Technology, Delhi • Jawaharlal Nehru University, Delhi • Institute for Genomics and Integrative Biology, Delhi • Indian Institute of Technology, Guwahati • Gauhati University, Guwahati • University of Hyderabad, Hyderabad • Centre for DNA Fingerprinting and Diagnostics, Hyderabad • Jawaharlal Nehru Technological University, Hyderabad • Indian Institute of Technology, Kanpur • Indian Institute of Technology, Kharagpur • Saha Institute of Nuclear Physics, Kolkata • Central Drug Research Institute, Lucknow • Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow
Partner Participation (Contd.) • Bhabha Atomic Research Centre, Mumbai • Indian Institute of Technology, Mumbai • Tata Institute of Fundamental Research, Mumbai • IUCAA, Pune • National Centre for Radio Astrophysics, Pune • National Chemical Laboratory, Pune • Pune University, Pune • Indian Institute of Technology, Roorkee • Regional Cancer Centre, Thiruvananthapuram • Vikram Sarabhai Space Centre, Thiruvananthapuram • Institute of Technology, Banaras Hindu University, Varanasi
Objectives and Deliverables • Objectives • Enable applications of national importance requiring aggregation of geographically distributed resources • Deliverables • Grid enablement of illustrative applications and demonstrations, such as • Bioinformatics • Disaster Management
Bioinformatics • Bioinformatics Resources & Applications Facility (BRAF) on PARAM Padma • Supports highly optimized bioinformatics codes on the PARAM Padma • Web computing portal providing the computational facilities needed to solve bioinformatics problems
Disaster Management Grid (architecture diagram) • Flight data is transmitted from a nearby airport over a high-speed link into the Grid communication fabric • PARAM Padma at Bangalore and a partner resource at Pune process the data • Results are disseminated to user agencies across the Grid
Disaster Management (contd.) • Requirements • Timely dissemination of disaster information to user agencies • Organize logistics around an automated and secure workflow and database • Challenges • Widely spread application resources and types: disaster sensors, compute, application experts • Turnaround time for the workflow