
Cluster

Cluster. Trần Hữu Lộc (00706140) Nguyễn Thành Trung(00706151). Outline. Introduction Cluster architectures System Design Parallel Programming Environments and Tools Cluster Applications. Introduction.


Presentation Transcript


  1. Cluster Trần Hữu Lộc (00706140) Nguyễn Thành Trung (00706151)

  2. Outline • Introduction • Cluster architectures • System Design • Parallel Programming Environments and Tools • Cluster Applications

  3. Introduction • Solving grand challenge applications using computer modeling, simulation and analysis (weather forecasting, military applications, simulation, astrophysics …) • Minicomputers were large and expensive • The development of powerful microprocessors • High-speed LANs

  4. How to Run Applications Faster? • Use faster hardware • Use optimized algorithms and techniques to solve computational tasks • Use multiple computers to solve a particular task

  5. History • The first clusters are thought to date from the 1960s, or even the late 1950s • Research clusters developed hand in hand with both networks and the Unix operating system from the early 1970s • The first commercial clustering product was ARCnet, developed by Datapoint in 1977 • VAXcluster in 1984 • Tandem Himalaya and the IBM S/390 Parallel Sysplex in 1994 • …

  6. What is a Cluster? • A cluster is a type of parallel or distributed processing system that consists of a collection of interconnected stand-alone computers working together cooperatively as a single, integrated computing resource. • A node: a single-processor or multiprocessor system with memory, I/O facilities, and an OS • A cluster: • Generally 2 or more computers (nodes) connected together, either in a single cabinet or physically separated and connected via a LAN • Provides a cost-effective way to gain features and benefits such as high performance, scalability and high availability

  7. Cluster Architecture [Figure: layered cluster architecture. Labels: Sequential Applications; Parallel Applications; Parallel Programming Environment; Cluster Middleware (Single System Image and Availability Infrastructure); several PC/Workstation nodes, each with its own Communications Software and Network Interface Hardware; Cluster Interconnection Network/Switch.]

  8. System Design • Performance Requirements • Hardware Platforms • Operating Systems • Single System Image (SSI) • Middleware

  9. Performance Requirements • Common Cluster Modes • High Performance (dedicated) • High Throughput (idle cycle harvesting) • High Availability (fail-over) • A Unified System – High Performance (HP) and High Availability (HA) within the same cluster

  10. Performance Requirements • The Need for Performance Evaluation • Hardware – Idle processors due to conflicts over memory access & communications paths. • Operating System – Inefficient internal scheduler, file systems and memory allocation/de-allocation. • Middleware – Inefficient distribution and coordination of tasks, high inter-processor communications latency due to inefficient middleware. • Applications – Inefficient algorithms that do not exploit the natural concurrency of a problem.

  11. Performance Requirements • Some indices for global measurements • Execution rate: the machine output per unit of time, measured for example in MIPS (million instructions per second) • Speedup (Sp) • Efficiency (Ep)
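The slide names speedup and efficiency without defining them. The sketch below assumes the usual definitions, Sp = T1 / Tp and Ep = Sp / p, where T1 is the run time on one processor and Tp the run time on p processors. The function names and the timings are illustrative and do not come from the presentation.

```python
def mips(instruction_count: int, seconds: float) -> float:
    """Execution rate in millions of instructions per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return instruction_count / (seconds * 1e6)


def speedup(t1: float, tp: float) -> float:
    """Speedup Sp = T1 / Tp (assumed standard definition)."""
    if t1 <= 0 or tp <= 0:
        raise ValueError("run times must be positive")
    return t1 / tp


def efficiency(t1: float, tp: float, p: int) -> float:
    """Efficiency Ep = Sp / p: the fraction of ideal linear speedup achieved."""
    if p < 1:
        raise ValueError("processor count must be at least 1")
    return speedup(t1, tp) / p


if __name__ == "__main__":
    # Hypothetical timings: 120 s on one node, 20 s on eight nodes.
    print(speedup(120.0, 20.0))        # 6.0
    print(efficiency(120.0, 20.0, 8))  # 0.75
```

An efficiency below 1 points at the overheads listed on the previous slide: idle processors, scheduling, and communication latency.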

  12. Hardware Platforms • Multiple High Performance Computers • PCs • Workstations • SMPs (CLUMPS)

  13. Hardware Platforms • Processors • Intel x86 Processors • Pentium Pro and Pentium Xeon • AMD x86, Cyrix x86, etc. • Digital Alpha • The Alpha 21364 processor integrates processing, memory controller and network interface into a single chip • IBM PowerPC • Sun SPARC • SGI MIPS • HP PA-RISC

  14. Network Technology • Communication Protocols • Connection-oriented or connectionless • Offer various levels of reliability, from delivery fully guaranteed and in order (reliable) to not guaranteed (unreliable) • Unbuffered (synchronous) or buffered (asynchronous) • Internet Protocols: TCP/IP, UDP • Low-latency Protocols: Active Messages, Fast Messages, the VMMC (Virtual Memory-Mapped Communication) system, U-Net, and Basic Interface for Parallelism (BIP)
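The difference between connectionless and connection-oriented protocols can be sketched with the two Internet protocols the slide names. This is a minimal illustration using Python's standard socket module, assuming both endpoints run on the loopback interface of a single host; the function names are invented for the example.

```python
import socket

LOOPBACK = "127.0.0.1"  # assumption: both endpoints live on one host for the demo


def udp_roundtrip(payload: bytes) -> bytes:
    """Connectionless (UDP): each datagram is addressed on its own, no handshake."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind((LOOPBACK, 0))      # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data


def tcp_roundtrip(payload: bytes) -> bytes:
    """Connection-oriented (TCP): a connection is set up before data flows."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((LOOPBACK, 0))
        listener.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(2.0)
            client.connect(listener.getsockname())   # handshake happens here
            server_side, _addr = listener.accept()
            with server_side:
                server_side.settimeout(2.0)
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)      # signal end of stream
                chunks = []
                while True:
                    chunk = server_side.recv(4096)
                    if not chunk:                    # empty read means EOF
                        break
                    chunks.append(chunk)
                return b"".join(chunks)


if __name__ == "__main__":
    print(udp_roundtrip(b"ping"))
    print(tcp_roundtrip(b"hello cluster"))
```

TCP delivers a reliable, ordered byte stream at the cost of connection setup, while UDP skips the setup and leaves loss handling to the application. The low-latency protocols on the slide are not covered by this sketch.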

  15. Network Technology • Hardware Products • Ethernet (10 Mbps) • Fast Ethernet (100 Mbps) • Gigabit Ethernet (1 Gbps) • SCI (Scalable Coherent Interface; 12 µs MPI latency) • ATM (Asynchronous Transfer Mode) • Myrinet (1.28 Gbps) • QsNet (Quadrics Supercomputers World; 5 µs latency for MPI messages) • Digital Memory Channel • FDDI (Fiber Distributed Data Interface) • InfiniBand
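To put the nominal link rates above in perspective, the following sketch computes how long it takes just to push a message onto the wire. It assumes the quoted rates are fully usable and ignores protocol overhead; the 1 MB message size and the function names are illustrative choices, not figures from the presentation.

```python
# Nominal link rates quoted on the slide, in bits per second.
NOMINAL_RATE_BPS = {
    "Ethernet": 10e6,
    "Fast Ethernet": 100e6,
    "Gigabit Ethernet": 1e9,
    "Myrinet": 1.28e9,
}


def serialization_time(message_bytes: int, rate_bps: float) -> float:
    """Seconds needed to put the message on the wire at the given bit rate."""
    return message_bytes * 8 / rate_bps


def transfer_time(message_bytes: int, rate_bps: float, latency_s: float = 0.0) -> float:
    """Simple model: fixed start-up latency plus serialization time."""
    return latency_s + serialization_time(message_bytes, rate_bps)


if __name__ == "__main__":
    one_megabyte = 1_000_000  # hypothetical message size
    for name, rate in NOMINAL_RATE_BPS.items():
        print(f"{name}: {serialization_time(one_megabyte, rate) * 1e3:.2f} ms")
```

Under this model a 1 MB message needs 800 ms on 10 Mbps Ethernet but about 6.25 ms on Myrinet. For an 8-byte message the serialization time at 1.28 Gbps is only 0.05 µs, so start-up latencies of a few microseconds, like the 5 µs and 12 µs MPI figures quoted on the slide, dominate small-message performance.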

  16. Operating Systems • The operating system for a cluster runs on every node • Two fundamental services for users • Make the computer hardware easier to use, by creating a virtual machine that differs markedly from the real machine • Share hardware resources among users, e.g. the processor through multitasking • A newer concept in OS services: multithreading, i.e. support for multiple threads of control within a single process, giving parallelism within a process
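Multiple threads of control within one process can be sketched with Python's standard threading module. The example below is illustrative only: the function name and the chunking scheme are assumptions, and in standard CPython builds the global interpreter lock limits CPU-bound threads, so this shows the structure of multithreading rather than a real speedup.

```python
import threading


def parallel_sum(values, num_threads=4):
    """Sum a list using several threads that share one address space."""
    total = 0
    lock = threading.Lock()  # protects the shared accumulator

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)   # independent work, no shared state touched
        with lock:             # only one thread updates the total at a time
            total += partial

    # Deal the values out round-robin, one chunk per thread.
    chunks = [values[i::num_threads] for i in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()          # wait for every thread to finish
    return total


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))  # 5050
```

All threads see the same `total` variable because they live in the same process, which is what distinguishes multithreading from multitasking between separate processes.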

  17. Operating Systems

  18. Operating Systems • Node Operating System • Linux Clusters (e.g., Beowulf) • Solaris Clusters (e.g., Berkeley NOW) • NT Clusters (e.g., HPVM) • AIX Clusters (e.g., IBM SP2) • SCO/Compaq Clusters (Unixware) • Digital VMS Clusters • HP-UX clusters • Microsoft Wolfpack clusters

  19. Single System Image (SSI) • Hides the heterogeneous and distributed nature of the available resources and presents them to users and applications as a single unified computing resource • High availability • Transparency of resource management • Scalable performance

  20. Single System Image (SSI) • Services and Benefits • Single entry point • Single user interface • Single process space • Single I/O space (SIOS) • Single file hierarchy • Single virtual networking • Single job-management system • Single control point and management • Checkpointing and Process Migration

  21. Middleware [Figure: the layered cluster architecture diagram, shown again under the title "Middleware". Labels: Sequential Applications; Parallel Applications; Parallel Programming Environment; Cluster Middleware (Single System Image and Availability Infrastructure); several PC/Workstation nodes, each with its own Communications Software and Network Interface Hardware; Cluster Interconnection Network/Switch.]

  22. Middleware • Introduction + A layer of software sandwiched between the operating system and applications. + A means of integrating software applications running in a heterogeneous environment. • Heterogeneity + Hardware platforms have become heterogeneous + Middleware must support very different applications • Overview + Helps application developers overcome these heterogeneities. + Provides services for the management and administration of a heterogeneous system

  23. Middleware – Technological scope • Message-based Middleware • RPC-based Middleware • CORBA • OLE/COM • Internet Middleware • Java Technologies • Cluster Management Software

  24. Middleware – Technological scope • Message-based Middleware + Uses a common communications protocol to exchange data between applications, hiding low-level message-passing primitives from the application developer + Examples: Parallel Virtual Machine (PVM) and MPI • RPC-based Middleware + Remote Procedure Call (RPC) lets a requesting process directly execute a procedure on another machine and receive a response + Uses marshalling to transfer data structures from one machine to the other + Examples: Network Information Service (NIS) [9] and Network File System (NFS) [10]
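The RPC request/response cycle and the role of marshalling can be sketched in a few lines. This is an in-process illustration only: JSON stands in for a real marshalling format, the "transport" is a direct function call instead of a network, and the procedure table and all function names are hypothetical.

```python
import json

# Hypothetical procedures exposed by the "remote" side.
PROCEDURES = {
    "add": lambda a, b: a + b,
    "node_count": lambda nodes: len(nodes),
}


def marshal_call(name, *args) -> bytes:
    """Client stub: flatten the call into bytes that could cross a network."""
    return json.dumps({"proc": name, "args": list(args)}).encode("utf-8")


def handle_request(request: bytes) -> bytes:
    """Server stub: unmarshal, run the procedure, marshal the reply."""
    message = json.loads(request.decode("utf-8"))
    proc = PROCEDURES.get(message["proc"])
    if proc is None:
        reply = {"error": f"unknown procedure {message['proc']!r}"}
    else:
        reply = {"result": proc(*message["args"])}
    return json.dumps(reply).encode("utf-8")


def call(name, *args, transport=handle_request):
    """Looks like a local call; the transport hides where the work happens."""
    reply = json.loads(transport(marshal_call(name, *args)).decode("utf-8"))
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


if __name__ == "__main__":
    print(call("add", 2, 3))                        # 5
    print(call("node_count", ["n1", "n2", "n3"]))   # 3
```

Swapping `transport` for a function that sends the bytes over a socket would turn the same stubs into a networked RPC, which is the point of separating marshalling from transport.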

  25. Middleware – Technological scope • CORBA + An architectural framework that specifies the mechanisms for processing distributed objects + Object Management Architecture (OMA): Object Request Broker (ORB), Object services, Application services, Application objects. • OLE/COM + Object Linking and Embedding (OLE): a highly generic object model and a set of object-oriented interfaces that allow applications to intercommunicate + The Component Object Model (COM) defines mechanisms for creating objects and for communication between clients and objects that are distributed across a distributed environment.

  26. Middleware – Technological scope • Internet Middleware + Hypertext Transfer Protocol (HTTP), Common Gateway Interface (CGI), etc. • Java Technologies + Java Remote Method Invocation (RMI) + Jini: a set of APIs and network protocols used to create and deploy distributed systems organized as federations of services • Cluster Management Software + Administers and manages jobs submitted to workstation clusters + Optimizes the use of the available resources: sets priorities, harvests idle CPU cycles, migrates tasks, and ensures that tasks complete
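As a toy illustration of what cluster management software does with priorities and available resources, the sketch below places jobs on nodes. It is not how any particular product works: the policy (highest-priority job first, always onto the least-loaded node), the job tuple layout, and the node names are all assumptions made for the example.

```python
import heapq


def schedule(jobs, nodes):
    """Place jobs on nodes.

    jobs:  list of (name, priority, cost); a lower priority number runs first.
    nodes: list of node names.
    Returns a dict mapping each node to the ordered list of jobs it receives.
    """
    # Priority queue of jobs; submission order breaks priority ties.
    pending = [(prio, order, name, cost)
               for order, (name, prio, cost) in enumerate(jobs)]
    heapq.heapify(pending)

    # Priority queue of nodes keyed by the load assigned so far.
    load = [(0.0, node) for node in nodes]
    heapq.heapify(load)

    placement = {node: [] for node in nodes}
    while pending:
        _prio, _order, name, cost = heapq.heappop(pending)
        node_load, node = heapq.heappop(load)        # least-loaded node
        placement[node].append(name)
        heapq.heappush(load, (node_load + cost, node))
    return placement


if __name__ == "__main__":
    jobs = [("render", 2, 4.0), ("backup", 3, 1.0), ("simulate", 1, 6.0)]
    print(schedule(jobs, ["node1", "node2"]))
```

Real cluster managers add the other duties named on the slide, such as harvesting idle cycles and migrating tasks, which this sketch leaves out.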

  27. System Administration • Introduction • Manageability of a system: how usable it is in terms of actually producing computational value, and the “comfort level” it offers users • Computer science research: performance testing, benchmarking, and software tuning. • Production-computing environment: provide reliable computing cycles with dependable networking, application software, and OS • Good systems manageability equates directly to better results

  28. System Administration • System Planning • Hardware Considerations: workstations with a low cost per compute cycle • Performance Specifications: performance testing, benchmarking, and software tuning. • Memory speed and interleave • Processor core speed vs. bus speed • PCI bus speed and width • Multiprocessor issues: single- or multiprocessor building blocks • Cluster Interconnect Considerations: efficient data transfers, the drain on processor cycles associated with transfers, and highly optimized network interconnects

  29. System Administration • Software Considerations • Remote Access: Windows (Telnet, Terminal Services, IIS), Unix (SSH, Telnet, X Windows, FTP). • System Installation: Windows (Remote Installation Service; third-party tools: Norton Ghost, ImageCast), Unix (Linux Utility for cluster Install (LUI) from IBM, VA SystemImager from VA Linux) • System Monitoring & Remote Control of Nodes • Probing by direct access to kernel memory • Probing by file system interface • Collecting the performance information • Scalability • Optimizing the network traffic • Reducing the intrusiveness
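Probing through a file system interface can be sketched as follows, assuming a Linux node that exposes load information in `/proc/loadavg` (three load averages, a runnable/total count of scheduling entities, and the most recent PID). The function names and the returned dictionary keys are illustrative choices.

```python
from pathlib import Path


def parse_loadavg(text: str) -> dict:
    """Parse the contents of /proc/loadavg, e.g. '0.20 0.18 0.12 1/80 11206'."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load_1min": float(fields[0]),
        "load_5min": float(fields[1]),
        "load_15min": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
    }


def probe_local_node(path: str = "/proc/loadavg"):
    """Read the node's load; returns None where the file does not exist."""
    source = Path(path)
    if not source.exists():
        return None  # e.g. a non-Linux system
    return parse_loadavg(source.read_text())


if __name__ == "__main__":
    print(probe_local_node())
```

Reading a small text file is cheap, which matters for the last two bullets: a monitor that polls every node should add as little load and network traffic as it can.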

  30. System Administration

  31. System Administration • Remote Management: Tools and Technology • Remote monitoring and control of nodes, copy/move/remove files, remote shutdown, restart, security maintenance, parallel execution • Scheduling Systems

  32. Parallel Programming Environments and Tools • Threads (PCs, SMPs, NOW, ...) • POSIX Threads • Java Threads • MPI (Message Passing Interface) • Available on Linux, NT, and many supercomputers • PVM (Parallel Virtual Machine) • Parametric Programming
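The message-passing style used by MPI and PVM can be sketched without either library. In the example below, threads with private mailboxes stand in for processes on separate nodes; this is an assumption made so the sketch runs with the standard library alone, and it uses no real MPI or PVM calls. Workers never touch shared data; they only receive a chunk and send back a result.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """One 'process': receive a chunk, compute, send the partial result back."""
    chunk = inbox.get()                              # blocking receive
    outbox.put((rank, sum(x * x for x in chunk)))    # send to the master


def sum_of_squares(values, num_workers=3):
    """Master: scatter the data, then gather and reduce the partial sums."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for rank, inbox in enumerate(inboxes):
        inbox.put(values[rank::num_workers])         # scatter one chunk per rank
    total = sum(results.get()[1] for _ in range(num_workers))  # gather + reduce
    for thread in threads:
        thread.join()
    return total


if __name__ == "__main__":
    print(sum_of_squares([1, 2, 3, 4, 5, 6]))  # 91
```

On a real cluster the mailboxes would be replaced by the library's send and receive operations, and each rank would run on its own node.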

  33. Parallel Programming Environments and Tools • Software DSMs (Shmem) • Compilers • C/C++/Java • Parallel programming with C++ (MIT Press book) • RAD (rapid application development) tools • GUI-based tools for PP modeling • Debuggers • Performance Analysis Tools • Visualization Tools

  34. Applications • Sequential • Parallel / Distributed (cluster-aware applications) • Grand Challenge applications • Weather Forecasting • Quantum Chemistry • Molecular Biology Modeling • Engineering Analysis (CAD/CAM) • … • PDBs, web servers, data mining

  35. Operational Benefits • High Performance: aggregate computing power across nodes to solve a problem faster. • Expandability and Scalability: easy to expand by increasing the number of nodes. • High Throughput: harness the ever-growing power of desktop computing resources while protecting the rights and needs of their interactive users. • High Availability: keep services available when individual nodes fail (fail-over).

  36. References • Mark Baker, Cluster Computing White Paper, University of Portsmouth, UK • Richard S. Morrison, Cluster Computing: Architectures, Operating Systems, Parallel Processing & Programming Languages • Hai Jin and Raj Buyya, High Performance Cluster Computing: Architectures and Systems (slides)
