This presentation explores the intersection of computing, media, and the physical sciences, emphasizing the role of matter in computational processes. It discusses the double implication of the fact that our physical world executes computations: the computing landscape is shaped by physical law, and understanding these computations lets us control and program physical systems. With a focus on spatial challenges in modern computing components and systems, it details paradigms like distributed robotics and self-assembling materials, and addresses the inadequacy of traditional abstractions in handling spatial issues in computation.
Computing Media and Languages for Space-Oriented Computations (opening remarks)
Big Idea: Matter Computes
• Our physical world implements computations
• Double implication:
  • Computing landscape determined by laws of physical world
  • Understand our physical world in terms of the computation it performs
  • Control our physical world by programming the computation it performs
Convergence of Concerns
• Dealing with space as a physical issue when implementing modern computing components and systems
  • Deep-submicron (DSM) ICs, sublithographic effects
• Realizing shapes/behaviors/properties using computations
  • Distributed robotics, programmable matter
• Programming physical systems
  • Self-assembly, protein networks, …
Viewpoint
• Traditional/mainstream abstractions, models, algorithms, and languages have not adequately dealt with spatial issues
  • Either as optimization
  • Or as computational goal
• Now have several communities approaching this from different perspectives
Monday
  9:00am  Opening Remarks
  9:15am  DeHon – Spatial Computing
 11:15am  Coore – Amorphous Computing
 12:15pm  LUNCH
  1:30pm  Goldstein – Programmable Matter
  2:30pm  Gruau – Blob Computing
  3:30pm  Coffee/Cake
  4:00pm  Giavitto – Data Structures as Space
Challenges and Opportunities for Spatial Computing
André DeHon <andre@acm.org>
Message
• Opportunity
  • Large and capable computing systems
  • Continued scaling, primarily in spatial capacity
  • Performance capabilities from parallelism
  • Dynamically (re)programmable/adaptive
• Spatial Challenges
  • Distance = Delay
  • Communications take up space and energy
• Demands
  • New models/abstractions/algorithms
Outline
• Scaling
• Spatial vs. Temporal Computation
• Ground spatial examples: FPGAs, nanoPLA
• Spatial Challenges
  • Scaling, interconnect delay and requirements
  • Defects, faults, lifetime effects
• Opportunities
  • Capacity, parallelism, scaling, adaptation
• Why not C, VHDL, …?
• Design Patterns
• System Architectures
Spatial Capacity Scaling Continues
• Tλ² of substrate (10¹² λ², where λ is half the minimum feature size) by the end of the silicon roadmap
  • Another decade
  • Over a billion gates
• Molecular-scale promises two orders of magnitude more than that
• All of this in 2D; still have the third dimension to exploit
  • [paper at NanoNets 2006 in 2 weeks]
Implication
• Qualitatively not in the same world we were in 1945-1985
• Orders of magnitude shift in resources
  • Suggests dramatic changes in strategy
• We have been on an exponential curve
  • But… up to about 1990, we were shrinking the same kind of computers down to one chip
Example
• Compute: y = Ax² + Bx + C
Temporal Implementation
• Single operator, reused in time
• Store instructions
• Store intermediates
• Communication across time
• One cycle per operation
Spatial Implementation
• One operator for every operation
• Instruction per operator
• Communication in space
• Computation in a single cycle
  • (see the sketch below contrasting the two styles)
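To ground the contrast, here is a minimal C sketch (illustrative only, not from the slides; the "cycle" comments mimic hardware scheduling) evaluating y = Ax² + Bx + C both ways:

```c
/* Illustrative sketch: temporal vs. spatial evaluation of
 * y = A*x^2 + B*x + C. Names and cycle counts are hypothetical. */
#include <stdio.h>

/* Temporal style: one multiplier and one adder reused across cycles,
 * with intermediates stored between steps; one operation per "cycle". */
int poly_temporal(int A, int B, int C, int x) {
    int t, u;
    t = x * x;     /* cycle 1: reuse the single multiplier */
    t = A * t;     /* cycle 2 */
    u = B * x;     /* cycle 3 */
    t = t + u;     /* cycle 4: reuse the single adder */
    t = t + C;     /* cycle 5 */
    return t;      /* 5 cycles, one operator active at a time */
}

/* Spatial style: one operator per operation; in hardware all five
 * operators would be instantiated side by side and evaluate in a
 * single cycle, with data flowing through space between them. */
int poly_spatial(int A, int B, int C, int x) {
    return A * x * x + B * x + C;  /* 3 mults + 2 adds, all instantiated */
}

int main(void) {
    printf("%d %d\n", poly_temporal(2, 3, 4, 5), poly_spatial(2, 3, 4, 5));
    return 0;
}
```

In sequential C both functions take similar time, of course; the sketch only illustrates the resource/scheduling contrast: reuse one operator over five cycles versus instantiate five operators for one cycle.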
Conventional Processors
• Have temporal organization
• Large instruction and data memory per active processing element
• Economize on instruction memory with a word-wide SIMD organization
[Figure: w-bit-wide datapath, one instruction driving several op units]
Conventional Processors
• Economize on area
• Pack a large computation into a small area
  • By storing the description of the computation compactly
  • Reusing a small number of processing elements in time
• Trade time for area
• Absolutely the right thing for 1985 silicon (and pre-integrated-circuit technology)
Early Challenge
• How do I make my large program fit on an economical computer?
  • Can compute with 10K vacuum tubes?
  • Fit in caches that hold 100 instructions?
  • 64K address space
• Heavy sequentialization was a good engineering solution… for 1945-1990
Field-Programmable Gate Array (FPGA)
• K-input LUT (typically K = 4)
  • LUT = Look-Up Table
• Compute block with optional output flip-flop
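As a software model of the K-LUT (a sketch assuming K = 4, not from the slides), a 4-LUT is just a 16-bit truth table indexed by its four input bits; any 4-input boolean function is a different 16-bit configuration:

```c
/* Illustrative sketch: evaluating a 4-LUT in software. */
#include <stdint.h>
#include <stdio.h>

/* 'config' holds the truth table, one bit per input combination;
 * the four inputs select which bit to read out. */
static int lut4(uint16_t config, int a, int b, int c, int d) {
    int index = (d << 3) | (c << 2) | (b << 1) | a;
    return (config >> index) & 1;
}

int main(void) {
    /* Build the config for (a AND b) XOR (c OR d) by enumerating inputs. */
    uint16_t config = 0;
    for (int i = 0; i < 16; i++) {
        int a = i & 1, b = (i >> 1) & 1, c = (i >> 2) & 1, d = (i >> 3) & 1;
        config |= (uint16_t)(((a & b) ^ (c | d)) << i);
    }
    printf("f(1,1,0,0) = %d\n", lut4(config, 1, 1, 0, 0)); /* prints 1 */
    return 0;
}
```

Programming the FPGA amounts to loading each LUT's `config` bits (its "instruction") once; thereafter the logic evaluates continuously in space rather than being fetched and interpreted in time.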
Field-Programmable Gate Arrays
• Have spatial computing organization
• Small instruction area per active operator
  • Pack more computation onto a die
• Bit-level control
  • Use more of the available operations
Field-Programmable Gate Arrays
• Put more area into computing
  • Have more compute elements per die
  • Support more computation per cycle
• Trade area for time
• With more capacity, more applications fit spatially
• More appropriate for 2000+
Component Example (single die in 0.35µm)
• XC4085XL-09 FPGA: 3,136 CLBs, 4.6ns cycle, 682 bit-ops/ns
• Alpha (1996): 2 64-bit ALUs, 2.3ns cycle, 55.7 bit-ops/ns
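A quick sanity check of the density arithmetic, assuming each CLB contributes one bit-operation per cycle (an assumption; the slide does not state the CLB's width): 3,136 / 4.6 ns ≈ 682 bit-ops/ns for the FPGA, and 2 × 64 bits / 2.3 ns ≈ 55.7 bit-ops/ns for the Alpha, roughly a 12x raw-density advantage for the spatial part.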
[Figure: empirical raw-density comparison: computational density (ALU bit-ops/λ²·s) over time]
Spatial Computing
• Enabled by high capacity
• Has a density advantage
• Now have sufficient capacity to hold a large range of interesting problems
  • 100,000 bit-level operators on a single chip
  • More on the way
• Can exploit the kinds of capacities now becoming available
Today's FPGAs
• 100,000s of LUTs
• Embedded blocks
• Many small distributed memories
  • Megabits of memory
  • Data rates ~10 Tb/s (10-100x over µP)
• Operate at 100s of MHz
• Easily scale up spatially (step and repeat)
Simple Nanowire-Based PLA
• NOR-NOR = AND-OR PLA logic
• [FPGA 2004]
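A minimal sketch (not from the slides) of why two NOR planes give AND-OR logic, via De Morgan's laws: a NOR of complemented literals is an AND of the literals, and a NOR of product terms followed by an inversion is their OR. The exhaustive check below verifies this for the example function (a AND b) OR c:

```c
/* Illustrative sketch: NOR-NOR (with input/output inversions) equals
 * AND-OR, checked exhaustively over all 3-bit inputs. */
#include <stdio.h>

int main(void) {
    for (int i = 0; i < 8; i++) {
        int a = i & 1, b = (i >> 1) & 1, c = (i >> 2) & 1;
        /* AND-OR (sum of products): (a AND b) OR c */
        int and_or = (a & b) | c;
        /* NOR plane 1: NOR of complemented inputs = AND of inputs */
        int p1 = !((!a) | (!b));       /* = a AND b */
        int p2 = !(!c);                /* = c (single-literal term) */
        /* NOR plane 2 then invert: complement of NOR = OR */
        int nor_nor = !(!(p1 | p2));   /* = p1 OR p2 */
        if (and_or != nor_nor) { printf("mismatch!\n"); return 1; }
    }
    printf("NOR-NOR matches AND-OR on all inputs\n");
    return 0;
}
```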
Tile into Arrays
• [FPGA 2005]
nanoPLA Capacity
• 10µm × 5µm subarrays
  • Millions on a modest single-layer die
• 100 product terms per subarray
• Include memory blocks
• Stack in 3D
Interconnect Challenge
• With 100,000 processing elements cooperating on a task
  • (can get this today with FPGAs)
• Must communicate
• Interconnect becomes dominant
  • Area, delay, energy
• Replaces memory for communications
• Less heavily studied
Large Memories
• Build a larger memory
• Simple model: multiplex together more cells
Delay vs. Memory (1)
• How does delay grow with memory size N?
• Tmem = Tdecode + Tcell + Tmux
• Tmux = log₄(N) · Tmux4 (a tree of 4-input muxes has log₄(N) levels)
Delay vs. Memory (2)
• Tmem = O(log N)
  • Does this make sense for large N?
  • Speed of light?
• Tmem = Tlogic + Twire
• 2D memory: Twire = O(√N)
• Tmem = C₁·log(N) + C₂·√N
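A small sketch (not from the slides; the constants are made up for illustration) evaluating this delay model shows why the wire term must eventually dominate the logic term:

```c
/* Illustrative sketch: Tmem = C1*log2(N) + C2*sqrt(N), with
 * hypothetical constants, to show sqrt(N) overtaking log(N). */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double C1 = 1.0;    /* hypothetical logic-delay constant */
    const double C2 = 0.01;   /* hypothetical wire-delay constant */
    for (double n = 1e3; n <= 1e12; n *= 1e3) {
        double logic = C1 * log2(n);
        double wire  = C2 * sqrt(n);
        printf("N=%8.0e  logic=%6.1f  wire=%10.1f  total=%10.1f\n",
               n, logic, wire, logic + wire);
    }
    return 0;
}
/* Even with a tiny C2, sqrt(N) overtakes log(N): at N=1e6 the wire
 * term is ~10 vs ~20 for logic; by N=1e9 it dominates completely. */
```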
Chips >> Cycles
• Chips growing
• Gate delays shrinking
• Wire delays aren't scaling down
• Will take many cycles to cross a chip
Clock Cycle Radius
• Radius logic can reach in one cycle (at 45 nm): ~10 PE widths
  • A few hundred PEs within reach
• Chip side: 600-700 PEs
  • 400-500 thousand PEs per chip
• 100s of cycles to cross the chip
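A quick check of these numbers, under the stated assumptions: a radius-10 disc holds about π·10² ≈ 314 PEs (the "few hundred"); a side of ~650 PEs gives 650² ≈ 422,500 PEs (the 400-500 thousand); and at 10 PEs of reach per cycle, crossing a 650-PE side takes ~65 cycles one way, so round trips and routed (non-straight-line) paths land in the hundreds of cycles.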
Communication Expensive
• What if we just built a crossbar?
  • Interconnect area scales as N²
• Must exploit typical locality in design to reduce area
• Rent's Rule: IO = c·N^p
  • (0.5 ≤ p ≤ 0.75 typical)
• How well can we engineer low p? (see the sketch below)
• Where does this show up in algorithm/computation design?
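To see what engineering a low Rent exponent buys, here is a sketch (not from the slides; the constant c is made up) tabulating IO = c·N^p across the typical range of p:

```c
/* Illustrative sketch: Rent's Rule IO = c * N^p relates a block's
 * gate count N to its external IO; lower p means more locality and
 * cheaper interconnect. The constant below is hypothetical. */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double c = 2.5;                  /* hypothetical Rent constant */
    const double ps[] = {0.5, 0.65, 0.75}; /* typical exponent range */
    for (int i = 0; i < 3; i++) {
        for (double n = 1e2; n <= 1e6; n *= 1e2) {
            printf("p=%.2f  N=%8.0f  IO=%8.0f\n",
                   ps[i], n, c * pow(n, ps[i]));
        }
    }
    return 0;
}
/* At N=1e6: p=0.5 needs ~2.5e3 IOs while p=0.75 needs ~7.9e4, about a
 * 30x gap, which is why lowering p pays off in wiring area and energy. */
```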
Optimizing
• Must exploit physical locality (placement)
• Reduce wire requirements (reduce p)
• Reduce distance traveled over wires
  • A new meaning for "spatial locality"
• Interconnect must show up in our design
  • Run-time management
  • Algorithms
Clock Cycle Scaling has Ended
[Figure: probability distribution of delay vs. clock cycle, old vs. new]
• Up to ~2000, scaled down the clock cycle
• Architecture scaling: fewer gates per clock
  • Now down to ~10 gates/clock
• Energy-limited computation
  • Could run a few devices faster… but not all of them
• Variation at the nanoscale is diminishing clock frequencies
• Future scaling is spatial
Atomic-Scale Physical Effects
• As our devices approach the atomic scale, we must deal with statistical effects governing the placement and behavior of individual atoms and electrons.
Three Atomic-Scale "Problems"
• Defects: manufacturing imperfections
  • Occur before operation; persistent
  • Shorts, breaks, bad contacts
• Faults:
  • Occur during operation; transient
  • A node's value flips: crosstalk, ionizing particles, bad timing, tunneling, thermal noise
• Operational/lifetime defects:
  • Parts become bad, or slower, during the operational lifetime
  • Fatigue, electromigration, burnout…
  • NBTI, hot-carrier injection
Message
• Opportunity
  • Large and capable computing systems
  • Continued scaling, primarily in spatial capacity
  • Performance capabilities from parallelism
  • Dynamically (re)programmable/adaptive
• Spatial Challenges
  • Distance = Delay
  • Communications take up space and energy
• Demands
  • New models/abstractions/algorithms