
Computing Media and Languages for Space-Oriented Computations






Presentation Transcript


  1. Computing Media and Languages for Space-Oriented Computations (opening remarks)

  2. Big Idea: Matter Computes • Our physical world implements computations • Double implication • Computing landscape determined by laws of physical world • Understand our physical world in terms of the computation it performs • Control our physical world by programming the computation it performs.

  3. Convergence of Concerns • Dealing with space as a physical issue when implementing modern computing components and systems • DSM IC, sublithographic effects • Realizing shapes/behaviors/properties using computations • Distributed robotics, programmable matter • Programming physical systems • Self-assembly, protein networks, …

  4. Viewpoint • Traditional/mainstream • Abstractions, models, algorithms, languages • Have not adequately dealt with spatial issues • Either as optimization • Or as computational goal • Now have several communities approaching this from different perspectives

  5. Week Outline

  6. Monday 9:00am Opening Remarks 9:15am DeHon – Spatial Compute 11:15am Coore – Amorphous Computing 12:15pm LUNCH 1:30pm Goldstein – Programmable Matter 2:30pm Gruau – Blob Computing 3:30pm Coffee/Cake 4:00pm Giavitto – Data Structures as Space

  7. Challenges and Opportunities for Spatial Computing André DeHon <andre@acm.org>

  8. Message • Opportunity • Large and capable computing systems • Continued scaling → primarily in spatial capacity • Performance capabilities from parallelism • Dynamically (re)programmable/adaptive • Spatial Challenges • Distance = Delay • Communications take up space and energy • Demands • New models/abstractions/algorithms

  9. Convergence of Concerns • Dealing with space as a physical issue when implementing modern computing components and systems • DSM IC, sublithographic effects • Realizing shapes/behaviors/properties using computations • Distributed robotics, programmable matter • Programming physical systems • Self-assembly, protein networks, …

  10. Outline • Scaling • Spatial vs. Temporal Computation • Ground spatial examples: FPGAs, nanoPLA • Spatial Challenges: • Scaling, Interconnect Delay and Requirements • Defects, faults, lifetime effects • Opportunities • Capacity, parallelism, scaling, adaptation • Why not: C, VHDL… • Design Patterns • System Architectures

  11. Capacity

  12. Capacity Scaling from Intel

  13. Area Perspective

  14. Spatial Capacity Scaling Continues • Tλ² (tera-λ²) by end of Silicon roadmap • Another decade • Over a billion gates • Molecular-scale promises • Two orders of magnitude more than that • All 2D → still have third dimension to exploit • [paper at NanoNets2006 in 2 weeks]

  15. Implication • Qualitatively not in the same world we were in 1945–1985 • Orders of magnitude shift in resources • Suggests dramatic changes in strategy • We have been on an exponential curve • But…up to about 1990 • Shrinking the same kind of computers down to one chip

  16. Temporal vs. Spatial

  17. Example • Compute: Y=Ax2+Bx+c

  18. Temporal Implementation • Single Operator • Reuse in time • Store instructions • Store intermediates • Communication across time • One cycle per operation

  19. Spatial Implementation • One operator for every operation • Instruction per operator • Communication in space • Computation in single cycle
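The temporal/spatial contrast for Y = Ax² + Bx + C can be sketched in code. This is an illustrative model only (the instruction encoding and cycle counts are assumptions, not the deck's hardware): a temporal machine reuses one operator over five cycles, while a spatial machine dedicates one operator per operation and finishes in a single cycle.

```python
# Illustrative sketch, not the slides' actual hardware model.
# Temporal: one operator reused in time, driven by stored instructions.
def temporal_eval(A, B, C, x):
    regs = {"A": A, "B": B, "C": C, "x": x}
    program = [                     # one instruction per cycle
        ("mul", "x", "x", "t1"),    # t1 = x*x
        ("mul", "A", "t1", "t2"),   # t2 = A*x^2
        ("mul", "B", "x", "t3"),    # t3 = B*x
        ("add", "t2", "t3", "t4"),  # t4 = A*x^2 + B*x
        ("add", "t4", "C", "y"),    # y  = A*x^2 + B*x + C
    ]
    cycles = 0
    for op, s1, s2, dst in program:
        a, b = regs[s1], regs[s2]
        regs[dst] = a * b if op == "mul" else a + b
        cycles += 1                 # intermediates "communicate across time"
    return regs["y"], cycles        # 5 cycles: one per operation

# Spatial: one operator per operation; the whole dataflow graph
# computes at once, so the result is ready in a single cycle.
def spatial_eval(A, B, C, x):
    y = A * x * x + B * x + C       # each * and + is its own physical unit
    return y, 1
```

Both produce the same value; they trade instruction/intermediate storage (temporal) against operator area (spatial).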

  20. Conventional Processors • Have temporal organization • Large instruction and data memory per active processing element • Economize on instruction memory with word-wide SIMD organization

  21. Conventional Processors • Economize on Area • Pack Large computation • Into small area • By storing description of computation compactly • Reusing small number of processing elements in time • Trade time for area • Absolutely the right thing for 1985 Silicon • (and pre-integrated circuits)

  22. Early Challenge • How do I make my large program fit on an economical computer? • Can I compute with 10K vacuum tubes? • Fit in caches that hold 100 instructions? • In a 64K address space? • Heavy sequentialization was a good engineering solution…for 1945–1990

  23. Field-Programmable Gate Array (FPGA) • K-input LUT (typically K=4) • LUT = Look-Up Table • Compute block with optional output Flip-Flop
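A K-LUT is just a 2^K-entry truth table addressed by its inputs, so it can implement any K-input boolean function. A minimal sketch (the helper names are illustrative, not from the slides):

```python
# Sketch of a K-input LUT: any K-input boolean function is a
# 2^K-entry table, indexed by the input bits.
def make_lut(truth_table):
    """truth_table: list of 2^K output bits, indexed by packed inputs."""
    def lut(*inputs):
        index = 0
        for bit in inputs:              # pack input bits into a table index
            index = (index << 1) | bit
        return truth_table[index]
    return lut

# Example: 2-input XOR as a LUT (K=2, so a 4-entry table).
xor2 = make_lut([0, 1, 1, 0])
```

In an FPGA the table contents are the configuration bits; the optional flip-flop simply registers the LUT output each clock cycle.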

  24. Field-Programmable Gate Arrays • Have spatial computing organization • Small instruction area per active operator → pack more computation on die • Bit-level control → use more of the available ops

  25. Field-Programmable Gate Arrays • Put more area into computing • Have more compute elements per die • Support more computation per cycle • Trade area for time • With more capacity • More applications fit spatially • More appropriate for 2000+

  26. Component Example • Single die in 0.35µm • XC4085XL-09: 3,136 CLBs, 4.6ns cycle → 682 Bit Ops/ns • Alpha (1996): 2 × 64b ALUs, 2.3ns cycle → 55.7 Bit Ops/ns
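The slide's density figures follow from simple arithmetic; a quick check (assuming, as the slide implies, roughly one bit-level operation per CLB per cycle):

```python
# Checking the slide's raw-density numbers (bit operations per ns).
fpga_ops = 3136                 # XC4085XL CLBs; ~1 bit-op each per cycle (assumption)
fpga_cycle_ns = 4.6
fpga_density = fpga_ops / fpga_cycle_ns      # ~682 bit ops/ns

alpha_ops = 2 * 64              # Alpha (1996): two 64-bit ALUs
alpha_cycle_ns = 2.3
alpha_density = alpha_ops / alpha_cycle_ns   # ~55.7 bit ops/ns

advantage = fpga_density / alpha_density     # ~12x raw density advantage
```

So on the same-era process, the spatial organization delivers roughly an order of magnitude more raw bit-level throughput.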

  27. Empirical Raw Density Comparison • [figure: computational density (ALU bit-ops per λ²·s) vs. time]

  28. Spatial Computing • Enabled by high capacity • Has a density advantage • Now have sufficient capacity to hold large range of interesting problems • 100,000 bit-level operators on a single chip • More on the way • Can exploit the kinds of capacities now becoming available

  29. Spatially Programmable FPGA

  30. Ground Examples

  31. Today's FPGAs • 100,000s of LUTs • Embedded blocks • Many small distributed memories • Megabits of memory • Data rates ~10Tb/s • 10–100x over µP • Operate at 100s of MHz • Easily scale up spatially • Step and repeat

  32. Simple Nanowire-Based PLA • NOR-NOR = AND-OR PLA Logic • [FPGA 2004]

  33. Tile into Arrays • [FPGA 2005]

  34. nanoPLA Capacity • 10µm × 5µm subarrays • Millions on a modest single-layer die • 100 Product Terms per subarray • Include memory blocks • Stack in 3D

  35. Interconnect

  36. Interconnect Challenge • With 100,000 processing elements cooperating on a task • (can get today with FPGAs) • Must communicate • Interconnect becomes dominant • Area, delay, energy • Replaces memory for communications • Less heavily studied

  37. Motivating Example: Memories (Memory from mux bits)

  38. Large Memories • Build larger memory • Simple model: multiplex together more cells

  39. Delay vs. Memory (1) • How does delay grow with memory size (N)? • Tmem = Tdecode + Tcell + Tmux • Tmux = log4(N) × Tmux4

  40. Delay vs. Memory (2) • Tmem = O(log(N)) • Does this make sense for large N? • Speed of light? • Tmem = Tlogic + Twire • 2D memory: side length grows as √N • Twire = O(√N) • Tmem = C1·log(N) + C2·√N
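The two delay models on the slide can be compared directly. A small sketch (C1 and C2 are made-up constants chosen only for illustration): for small N the mux-tree log term dominates, but for large N the 2D wire term √N takes over.

```python
import math

# Sketch of the slide's two delay models; C1, C2 are illustrative constants.
C1, C2 = 1.0, 0.05

def t_logic_only(n):
    """Tmem = C1*log(N): mux-tree depth only, ignoring wires."""
    return C1 * math.log2(n)

def t_with_wires(n):
    """Tmem = C1*log(N) + C2*sqrt(N): add 2D wire distance."""
    return C1 * math.log2(n) + C2 * math.sqrt(n)

# At N = 2^20 the wire term already dominates: sqrt(N) = 1024,
# so C2*sqrt(N) = 51.2 vs. a log term of only 20.
```

This is why the pure O(log N) model stops making sense for large N: physical distance, not logic depth, sets the delay.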

  41. Chips >> Cycles • Chips growing • Gate delays shrinking • Wire delays aren’t scaling down • Will take many cycles to cross chip

  42. Clock Cycle Radius • Radius logic can reach in one cycle (45 nm): ~10 PEs • Few hundred PEs reachable per cycle • Chip side: 600–700 PEs • 400–500 thousand PEs total • ~100 cycles to cross the chip

  43. Communication Expensive • What if we just built a crossbar? • Interconnect area scales as N² • Must exploit typical locality in design to reduce area • Rent's Rule: IO = c·N^p • (0.5 ≤ p ≤ 0.75 typical) • How well can we engineer low p? • Where does this show up in algorithm/computation design?
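Rent's Rule quantifies how IO demand grows with block size. A minimal sketch (the function name and default c, p are illustrative):

```python
# Rent's Rule sketch: IO = c * N**p, with 0.5 <= p <= 0.75 typical.
def rent_io(n, c=1.0, p=0.6):
    """IO terminals needed by a block of n gates under Rent's Rule."""
    return c * n ** p

# The exponent p matters enormously at scale: for N = 100,000 gates,
# p = 0.5 needs ~316 IOs while p = 0.75 needs ~5,623.
low  = rent_io(100_000, p=0.5)
high = rent_io(100_000, p=0.75)
```

Engineering a lower p (better locality) is what keeps interconnect area from overwhelming the design.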

  44. Optimizing • Must exploit physical locality (placement) • Reduce wire requirement (reduce p) • Reduce distance traveled over wires • → new meaning to spatial locality • Interconnect must show up in our design • Run-time management • Algorithms

  45. Clock Cycle Scaling has Ended • Up to ~2000, scaled down clock cycle • Architecture scaling: fewer gates/clock • Now down to ~10 gates/clock • Energy-limited computation • Could run a few devices faster…but not all of them • Variation at nanoscale diminishing clock frequencies → future scaling is spatial • [figure: old vs. new delay probability distributions against the clock cycle]

  46. Atomic-Scale Physical Effects • As our devices approach the atomic scale, we must deal with statistical effects governing the placement and behavior of individual atoms and electrons.

  47. Three Atomic-Scale “Problems” • Defects: Manufacturing imperfection • Occur before operation; persistent • Shorts, breaks, bad contact • Faults: • Occur during operation; transient • node X value flips: crosstalk, ionizing particles, bad timing, tunneling, thermal noise • Operational/lifetime defects: • Parts become bad during operational lifetime • Fatigue, electromigration, burnout… • …or get slower • NBTI, Hot Carrier injection

  48. Message • Opportunity • Large and capable computing systems • Continued scaling → primarily in spatial capacity • Performance capabilities from parallelism • Dynamically (re)programmable/adaptive • Spatial Challenges • Distance = Delay • Communications take up space and energy • Demands • New models/abstractions/algorithms

  49. Questions so far?

  50. Challenge
