Connection Machine Architecture Greg Faust, Mike Gibson, Sal Valente CS-6354 Computer Architecture Fall 2009
Historic Timeline • 1981: MIT AI-Lab Technical Memo on CM • 1982: Thinking Machines Inc. Founded • 1985: Danny Hillis wins ACM “Best PhD” Award • 1986: CM-1 Ships • 1987: CM-2 Ships • 1991: CM-5 Announced • 1991: CM-5 Ships • 1994: TMI Chapter 11 – Sun/Oracle pick bones • Heavily DARPA funded/backed: $16M+ in Direct Contracts plus subsidized CM sales
Involved Notables • Danny Hillis – CM inventor and TMI Founder • Charles Leiserson – Fat tree inventor • Richard Feynman – Nobel Prize winning Physicist • Marvin Minsky – MIT AI Lab “Visionary” • Guy Steele – Common Lisp, Grace Hopper Award • Stephen Wolfram – Mathematica inventor • Doug Lenat – AI researcher, creator of Cyc • Greg Papadopoulos – MIT Media Lab, Sun CTO • various others
CM-1 and CM-2 Architecture • Original design goal: support neuron-like simulations • Up to 64K single-bit processors (actually 3 bits in and 2 out) • 16 Processors/chip, 32 chips/PCB, 16 PCBs/cube, 8 cubes/hypercube • Hypercube architecture – Each 16-Proc chip is a hyper-node • Each proc has 4K bits of bit-addressable RAM • Distributed Physical Memory • Global Memory Addresses • Up to 4 front-end computers talk to sequencers via a 4x4 crossbar • “Sequencers” issue SIMD instructions over a Broadcast Network • Bit procs communicate via 2D local HW grid connections (“NEWS”) • Bit procs communicate via the hypercube network using MSG passing • Lots of Twinkling Lights!!
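The packaging numbers on this slide can be checked with a little arithmetic, and the hypercube wiring rule (two nodes are neighbors when their addresses differ in exactly one bit) fits in a few lines. The sketch below is only an illustration: the constants come from the slide, while the function names are made up here and are not part of any CM software.

```python
# Packaging figures are taken from the slide; all names are illustrative.
PROCS_PER_CHIP = 16
CHIPS_PER_PCB = 32
PCBS_PER_CUBE = 16
CUBES_PER_MACHINE = 8


def total_processors() -> int:
    """Number of 1-bit processors in a fully populated machine."""
    return PROCS_PER_CHIP * CHIPS_PER_PCB * PCBS_PER_CUBE * CUBES_PER_MACHINE


def total_chips() -> int:
    """Number of 16-processor chips, i.e. hypercube nodes."""
    return CHIPS_PER_PCB * PCBS_PER_CUBE * CUBES_PER_MACHINE


def hypercube_dimension(nodes: int) -> int:
    """Dimensions of a hypercube with `nodes` nodes (must be a power of two)."""
    if nodes <= 0 or nodes & (nodes - 1):
        raise ValueError("node count must be a positive power of two")
    return nodes.bit_length() - 1


def hypercube_neighbors(node: int, dims: int) -> list[int]:
    """Neighbors differ from `node` in exactly one address bit."""
    return [node ^ (1 << d) for d in range(dims)]


def hop_distance(a: int, b: int) -> int:
    """Minimum hops between two nodes: the number of differing address bits."""
    return bin(a ^ b).count("1")
```

With these figures, `total_processors()` gives 65536 (the "64K" on the slide) and `hypercube_dimension(total_chips())` gives 12, so each chip would have 12 hypercube neighbors.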
CM-1 and CM-2 Programming • ISA supports: • Bit-oriented operations • Arbitrary precision multi-bit scalar Ops using bit-serial implementation on bit procs • Full Multi-Dimensional Vector Ops • “Virtual Processor” idea similar to CUDA threads, but they are statically allocated • OS and Programming Tools run on front-ends • *Lisp as the initial programming language • Later C* and CM-Fortran
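To make "bit-serial implementation" concrete, here is a minimal software sketch of multi-bit addition done one bit per step, the way a 1-bit ALU with a carry flag would do it. It is a simulation under simplifying assumptions: the helper names are hypothetical, and a Python list stands in for the array of processors that would all execute the same step at once.

```python
def to_bits(value, width):
    """Little-endian bit list of a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def bit_serial_add(a_bits, b_bits):
    """Add two little-endian bit lists one bit position per step."""
    width = max(len(a_bits), len(b_bits))
    carry = 0
    out = []
    for i in range(width):
        a = a_bits[i] if i < len(a_bits) else 0
        b = b_bits[i] if i < len(b_bits) else 0
        out.append(a ^ b ^ carry)            # sum bit of a full adder
        carry = (a & b) | (carry & (a ^ b))  # carry into the next position
    out.append(carry)                        # final carry widens the result
    return out


def simd_add(xs, ys, width):
    """Apply the same bit-serial add to every (x, y) pair, one pair per
    simulated processor."""
    return [
        from_bits(bit_serial_add(to_bits(x, width), to_bits(y, width)))
        for x, y in zip(xs, ys)
    ]
```

The cost grows with operand width rather than being fixed, which is why precision can be arbitrary: wider numbers simply take more steps.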
CM-2 Improvements • 1 Weitek IEEE FP coprocessor per 32 1-bit procs • Up to 256K bits of memory per processor • Added ECC to Memory • Implemented the IO subsystem • Up to 80 GByte RAID array called “Data Vault” uses 39 Striped Disks and ECC, plus spare disks on standby • High Speed Graphics Output • En-route MSG combining in H-Cube router • New implementation of Multi-Dimensional NEWS on top of H-Cube (special addressing mode)
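The net effect of en-route message combining can be sketched in software: messages headed for the same destination are merged with a combining operator, so the destination receives one value instead of many. This is only a model of the outcome, not of the router hardware; the function name and the (destination, value) message format are assumptions made for illustration.

```python
from operator import add


def combine_messages(messages, combine=add):
    """Merge (destination, value) messages that share a destination.

    Returns a dict mapping each destination to its combined value.
    """
    combined = {}
    for dest, value in messages:
        if dest in combined:
            combined[dest] = combine(combined[dest], value)
        else:
            combined[dest] = value
    return combined
```

For example, `combine_messages([(3, 1), (5, 10), (3, 4)])` leaves a single message of value 5 for destination 3, and passing `combine=max` keeps the largest value instead.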
CM-5 vs CM-1 and CM-2 • Significant departure from CM-1 and CM-2 • Targeted at more scientific and business applications • More Commercial Off-The-Shelf components (“COTS”) • Large Array of SPARC Processing Nodes • 1-bit processors are abandoned • Abandoned “NEWS” Grid and Hypercube Networks • Delivered 1024 node machine, with claims that 16K nodes are possible • Even More Twinkling Lights!
CM-5 Overall Architecture • “Coordinated Homogeneous Array of RISC Processors” or “CHARM” • Asymmetric CoProcessors Model • Large Array of Processor Nodes • Small Collection of Control Nodes • 2 Separate scalable networks • One for data • One for control and synchronization • Still uses striped RAID for high disk bandwidth
Division of Labor • Processor Nodes can be assigned to a “Partition” • One Control Node per Partition • Control Node runs scalar code, then broadcasts parallel work to Processor Nodes • Processor Nodes receive a program, not an instruction stream, and have their own Program Counter • Processor Nodes can access other nodes' memory by reading or writing a global memory address • Processor Nodes also communicate via MSG passing • Processor Nodes cannot issue system calls
Control Nodes • Full Sun Workstations • Running UNIX • Connected to the “Outside World” • Handles Partition Time Sharing • Connected to both data and control networks • Performs System Diagnostics
Processor Nodes • Each node is a 5-chip microprocessor • Off the Shelf SPARC processor @ 40 MHz • 32 MBytes local node memory • Multi-port memory controller for added BW • “Caching techniques do not perform as well on large parallel machines” • Proprietary 4-FPU Vector coprocessor • Proprietary network controller
Data Network Architecture • Point to Point Inter-node communication and I/O • Implemented as a Fat Tree • Fat Trees invented by TMI employee Charles Leiserson • Claim: Onsite Bandwidth Expandable • Delivering 5GB/sec Bisection BW on 1024 node machine • Data router chip is an 8x8 crossbar switch • Faulty nodes are mapped out of network • Programs cannot assume a network topology • Network can be flushed when Time Share swaps occur • Network, not processors, guarantees end to end delivery
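In any tree-shaped network, a message climbs from the source leaf to the lowest common ancestor of source and destination, then descends. The sketch below counts those links for leaves numbered 0..N-1. The arity of 4 is an assumption chosen for illustration (the slide does not state the tree's branching factor), and the function names are hypothetical. The last helper simply divides the quoted bisection bandwidth evenly across nodes.

```python
def lca_height(src, dst, arity=4):
    """Levels a message must climb before src and dst share an ancestor.

    Leaves are numbered 0..N-1; `arity` is an assumed branching factor.
    """
    height = 0
    while src != dst:
        src //= arity   # move each endpoint up to its parent switch
        dst //= arity
        height += 1
    return height


def route_hops(src, dst, arity=4):
    """Links traversed: up to the common ancestor, then the same count down."""
    return 2 * lca_height(src, dst, arity)


def per_node_bisection_share(bisection_bytes_per_sec, nodes):
    """Even split of bisection bandwidth across all nodes, in bytes/sec."""
    return bisection_bytes_per_sec / nodes
```

Under these assumptions, two leaves under the same switch are 2 hops apart, while leaves 0 and 1023 of a 1024-leaf tree meet only at the root, 5 levels up. Splitting 5 GB/sec across 1024 nodes gives roughly 4.9 MB/sec per node, which is the rate each node could sustain if every node sent across the bisection at once.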
Separate Control Network • Synchronization & control network • Complete Binary Tree organization • Provides broadcast capability • Implements barrier operations • Implements interrupts for timesharing • Performs reduction operators (Sum, Max, AND, OR, Count, etc)
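A reduction over a complete binary tree combines values pairwise, level by level, so N inputs need about log2(N) combining steps. Here is a minimal software sketch of that pattern; it illustrates the idea only and is not the control network's actual logic, and the function name is made up for this example.

```python
from operator import add, and_, or_


def tree_reduce(values, op):
    """Combine values pairwise, level by level, as a binary tree would.

    Returns (result, levels), where `levels` is the number of combining steps.
    """
    if not values:
        raise ValueError("need at least one value")
    level = list(values)
    levels = 0
    while len(level) > 1:
        # Combine adjacent pairs; an unpaired last element passes through.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        levels += 1
    return level[0], levels
```

A barrier can be viewed as an AND-reduction over "I have arrived" flags, and Count as a Sum over 0/1 flags, which is one way to see why a single tree can serve both purposes.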
CM-5 Programming • Supports multiple Parallel High Level Languages and Programming Styles • Including Data Parallel Model from CM-1 and CM-2 • Goal: Hide many decisions from programmers • CM-1, CM-2 vs CM-5 ISA changes • Use of Processor Node CPU vs Vector CoProcessors • Partition Wide Synchronizations generated by Compiler • Is it MIMD, SPMD, SIMD? • “Globally Synchronized MIMD”
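One way to picture "Globally Synchronized MIMD": every node runs the same program with its own program counter, so nodes may take different branches, but a partition-wide barrier separates phases so that data written before the barrier is safe to read after it. The sketch below imitates this with Python threads, where `threading.Barrier` stands in for the hardware synchronization. The function name and the toy computation are assumptions for illustration only.

```python
import threading


def run_spmd(node_count):
    """Run the same two-phase program on `node_count` simulated nodes."""
    barrier = threading.Barrier(node_count)
    partial = [0] * node_count   # phase-1 output, one slot per node
    result = [0] * node_count    # phase-2 output, one slot per node

    def node_program(rank):
        # Phase 1: independent control flow; nodes take different branches.
        if rank % 2 == 0:
            partial[rank] = rank * rank
        else:
            partial[rank] = sum(range(rank + 1))
        # Global synchronization point: nobody proceeds until all arrive.
        barrier.wait()
        # Phase 2: now safe to read a neighbor's phase-1 value.
        result[rank] = partial[rank] + partial[(rank + 1) % node_count]

    threads = [
        threading.Thread(target=node_program, args=(rank,))
        for rank in range(node_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Without the barrier, a fast node could read a neighbor's slot before it had been written. The synchronization is what lets independently running nodes behave like one data-parallel step.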
Sample CM Apps • Machine Learning • Neural Nets, concept clustering, genetic algorithms • VLSI Design • Geophysics (Oil Exploration), Plate Tectonics • Particle Simulation • Fluid Flow Simulation • Computer Vision • Computer Graphics, Animation • Protein Sequence Matching • Global Climate Model Simulation