Samira Khan University of Virginia Jan 23, 2019

ADVANCED COMPUTER ARCHITECTURE Fundamental Concepts: Computing Models. Samira Khan University of Virginia Jan 23, 2019. The content and concept of this course are adapted from CMU ECE 740. AGENDA. Review from last lecture Fundamental concepts Computing models Data flow architecture.


Presentation Transcript


  1. ADVANCED COMPUTER ARCHITECTURE Fundamental Concepts: Computing Models Samira Khan University of Virginia Jan 23, 2019 The content and concept of this course are adapted from CMU ECE 740

  2. AGENDA • Review from last lecture • Fundamental concepts • Computing models • Data flow architecture

  3. THE VON NEUMANN MODEL/ARCHITECTURE • Also called the stored-program computer (instructions in memory). Two key properties: • Stored program • Instructions stored in a linear memory array • Memory is unified between instructions and data • The interpretation of a stored value depends on the control signals • Sequential instruction processing • One instruction processed (fetched, executed, and completed) at a time • Program counter (instruction pointer) identifies the current instruction • Program counter is advanced sequentially except for control transfer instructions • Question: When is a value interpreted as an instruction?
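The two properties above can be illustrated with a toy interpreter. This is a minimal sketch, not anything from the course: the four-operation ISA, the `ENCODE` macro, and `vn_run` are all invented for illustration. It assumes a single accumulator and one unified memory array, and it shows one possible answer to the slide's question: a stored word is treated as an instruction only when the program counter selects it for fetch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ISA, invented for this sketch: opcode in the top 8 bits,
   memory address in the low 24 bits of each instruction word. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3, OP_JUMP = 4 };

#define ENCODE(op, addr) (((uint32_t)(op) << 24) | ((uint32_t)(addr) & 0xFFFFFFu))

/* Runs from `pc` until HALT (or an unknown opcode, or the step limit) and
   returns the accumulator. Instructions and data share the array `mem`. */
uint32_t vn_run(uint32_t *mem, uint32_t mem_words, uint32_t pc, uint32_t max_steps)
{
    uint32_t acc = 0;
    for (uint32_t step = 0; step < max_steps; step++) {
        assert(pc < mem_words);
        uint32_t word = mem[pc];          /* fetch: this word is an instruction */
        uint32_t op   = word >> 24;
        uint32_t addr = word & 0xFFFFFFu;
        pc++;                             /* sequential advance by default */

        switch (op) {
        case OP_LOAD:                     /* the word at addr is read as data */
            assert(addr < mem_words);
            acc = mem[addr];
            break;
        case OP_ADD:
            assert(addr < mem_words);
            acc += mem[addr];
            break;
        case OP_STORE:
            assert(addr < mem_words);
            mem[addr] = acc;
            break;
        case OP_JUMP:                     /* control transfer overrides pc */
            pc = addr;
            break;
        default:                          /* OP_HALT or unknown opcode */
            return acc;
        }
    }
    return acc;
}
```

In this sketch, a program that loads one word, adds another, and stores the sum can sit in the same array as its operands; nothing in a word itself marks it as code or data.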

  4. THE DATA FLOW MODEL (OF A COMPUTER) • Von Neumann model: An instruction is fetched and executed in control flow order • As specified by the instruction pointer • Sequential unless explicit control flow instruction • Dataflow model: An instruction is fetched and executed in data flow order • i.e., when its operands are ready • i.e., there is no instruction pointer • Instruction ordering specified by data flow dependence • Each instruction specifies “who” should receive the result • An instruction can “fire” whenever all operands are received • Potentially many instructions can execute at the same time • Inherently more parallel

  5. VON NEUMANN VS DATAFLOW • Consider a Von Neumann program • What is the significance of the program order? • What is the significance of the storage locations? • Which model is more natural to you as a programmer? • Sequential program (inputs a, b; output z): v <= a + b; w <= b * 2; x <= v - w; y <= v + w; z <= x * y • [Figure: the same computation drawn as a dataflow graph with nodes +, *2, -, +, and *, inputs a and b, and output z]

  6. MORE ON DATA FLOW • In a data flow machine, a program consists of data flow nodes • A data flow node fires (is fetched and executed) when all its inputs are ready • i.e., when all inputs have tokens • [Figure: a data flow node and its ISA representation]
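The firing rule can be sketched for a single two-input node. The `Node` struct, the `Port` enum, and the `deliver` function are hypothetical names chosen for this illustration, and the sketch assumes at most one token per input arc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-input dataflow node; names invented for illustration. */
typedef enum { PORT_LEFT = 0, PORT_RIGHT = 1 } Port;

typedef struct {
    char op;            /* '+', '-', or '*' */
    int  value[2];      /* operand slots, one per input arc */
    bool has_token[2];  /* true while the arc holds a token */
} Node;

/* Places a token on one input arc. Returns true and writes *result if the
   node fires, i.e. if both inputs now hold tokens. Firing consumes both. */
bool deliver(Node *n, Port p, int v, int *result)
{
    assert(!n->has_token[p]);   /* assumption: one token per arc at a time */
    n->value[p] = v;
    n->has_token[p] = true;

    if (!(n->has_token[PORT_LEFT] && n->has_token[PORT_RIGHT]))
        return false;           /* still waiting for the other operand */

    int l = n->value[PORT_LEFT];
    int r = n->value[PORT_RIGHT];
    switch (n->op) {
    case '+': *result = l + r; break;
    case '-': *result = l - r; break;
    default:  *result = l * r; break;
    }
    n->has_token[PORT_LEFT] = false;
    n->has_token[PORT_RIGHT] = false;
    return true;
}
```

Note that the order in which the two tokens arrive does not matter; only their presence does, which is why no instruction pointer is needed to sequence the node.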

  7. DATA FLOW NODES

  8. An Example

  9. What does this model perform? val = a ^ b

  10. What does this model perform? val = a ^ b; val != 0

  11. What does this model perform? val = a ^ b; val != 0; val &= val - 1

  12. What does this model perform? val = a ^ b; val != 0; val &= val - 1; dist = 0; dist++

  13. Hamming Distance
      int hamming_distance(unsigned a, unsigned b) {
          int dist = 0;
          unsigned val = a ^ b;
          // Count the number of bits set
          while (val != 0) {
              // A bit is set, so increment the count and clear the bit
              dist++;
              val &= val - 1;
          }
          // Return the number of differing bits
          return dist;
      }

  14. Hamming Distance •  Number of positions at which the corresponding symbols are different. • The Hamming distance between: • "karolin" and "kathrin" is 3 • 1011101 and 1001001 is 2 • 2173896 and 2233796 is 3
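The string examples on this slide can be checked with a character-by-character comparison. This is a small sketch; `hamming_distance_str` is a hypothetical helper that does not appear in the slides, and returning -1 for unequal lengths is a choice made here because the distance is defined only for sequences of equal length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the slides: number of positions at which
   two equal-length strings differ. Returns -1 if the lengths differ. */
int hamming_distance_str(const char *s, const char *t)
{
    assert(s != NULL && t != NULL);
    size_t n = strlen(s);
    if (strlen(t) != n)
        return -1;

    int dist = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i] != t[i])
            dist++;             /* symbols at position i differ */
    }
    return dist;
}
```

Called on the slide's pairs, `hamming_distance_str("karolin", "kathrin")` gives 3 and `hamming_distance_str("1011101", "1001001")` gives 2, matching the values listed above.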

  15. RICHARD HAMMING • Best known for the Hamming code • Won the Turing Award in 1968 • Was part of the Manhattan Project • Worked at Bell Labs for 30 years • "You and Your Research" is mainly his advice to other researchers • He gave the talk many times during his lifetime • http://www.cs.virginia.edu/~robins/YouAndYourResearch.html

  16. HOW TO BUILD A DATAFLOW MACHINE?

  17. Monsoon Dataflow Processor 1990

  18. Review Set 2 • Due Jan 30 • Choose 2 from a set of four • Dennis and Misunas, “A Preliminary Architecture for a Basic Data Flow Processor,” ISCA 1974. • Arvind and Nikhil, “Executing a Program on the MIT Tagged-Token Dataflow Architecture,” IEEE TC 1990. • H. T. Kung, “Why Systolic Architectures?,” IEEE Computer 1982. • Annaratone et al., “Warp Architecture and Implementation,” ISCA 1986.

  19. Dataflow Graphs • Example: {x = a + b; y = b * 7 in (x-y) * (x+y)} • [Figure: dataflow graph with inputs a and b and nodes 1 (+), 2 (*7), 3 (-), 4 (+), 5 (*); node 1 produces x and node 2 produces y] • Values in dataflow graphs are represented as tokens • A token is <ip, p, v>: instruction ptr, port, data (e.g., ip = 3, p = L) • An operator executes when all its input tokens are present; copies of the result token are distributed to the destination operators • No separate control flow

  20. Control Flow vs. Data Flow

  21. Static Dataflow • Allows only one instance of a node to be enabled for firing • A dataflow node is fired only when all of the tokens are available on its input arcs and no tokens exist on any of its output arcs • Dennis and Misunas, “A Preliminary Architecture for a Basic Data Flow Processor,” ISCA 1974.

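  22. Static Dataflow Machine: Instruction Templates • [Figure: dataflow graph for x = a + b, y = b * 7, (x-y) * (x+y), with nodes numbered 1 to 5] • Template fields: Opcode, Operand 1, Operand 2, Destination 1, Destination 2, with presence bits on the operands • Template 1: +, destinations 3L, 4L • Template 2: *, destinations 3R, 4R • Template 3: -, destination 5L • Template 4: +, destination 5R • Template 5: *, destination out • Each arc in the graph has an operand slot in the program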
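The instruction templates can be sketched as an array of records with presence bits. This is an illustrative model under stated assumptions, not the Dennis and Misunas design: the `Template` layout, the `constant` flag used for the literal 7, and the function names are all hypothetical, and the scan loop stands in for whatever hardware detects enabled templates. Templates are indexed from 0 here, so the slide's "3L" becomes index 2, left port.

```c
#include <assert.h>
#include <stdbool.h>

enum { LEFT = 0, RIGHT = 1, N_TEMPLATES = 5, OUT = -1, NONE = -2 };

typedef struct { int node; int port; } Dest;  /* node OUT = program result */

typedef struct {
    char op;
    int  operand[2];
    bool present[2];   /* presence bits, one per operand slot */
    bool constant[2];  /* hypothetical: literal operand, never consumed */
    Dest dest[2];
} Template;

static void load_program(Template t[N_TEMPLATES])
{
    const Template program[N_TEMPLATES] = {
        /* 1: x = a + b    -> 3L, 4L */
        { '+', {0, 0}, {false, false}, {false, false}, {{2, LEFT},  {3, LEFT}}  },
        /* 2: y = b * 7    -> 3R, 4R */
        { '*', {0, 7}, {false, true},  {false, true},  {{2, RIGHT}, {3, RIGHT}} },
        /* 3: x - y        -> 5L */
        { '-', {0, 0}, {false, false}, {false, false}, {{4, LEFT},  {NONE, 0}}  },
        /* 4: x + y        -> 5R */
        { '+', {0, 0}, {false, false}, {false, false}, {{4, RIGHT}, {NONE, 0}}  },
        /* 5: (x-y)*(x+y)  -> out */
        { '*', {0, 0}, {false, false}, {false, false}, {{OUT, 0},   {NONE, 0}}  },
    };
    for (int i = 0; i < N_TEMPLATES; i++)
        t[i] = program[i];
}

/* Static firing rule: all operands present and every destination slot empty. */
static bool enabled(const Template *t, int i)
{
    if (!(t[i].present[LEFT] && t[i].present[RIGHT]))
        return false;
    for (int d = 0; d < 2; d++) {
        Dest dst = t[i].dest[d];
        if (dst.node >= 0 && t[dst.node].present[dst.port])
            return false;
    }
    return true;
}

/* Evaluates (x-y)*(x+y) with x = a + b and y = b * 7. */
int static_dataflow_run(int a, int b)
{
    Template t[N_TEMPLATES];
    int result = 0;
    load_program(t);

    /* Inject the input tokens: a and b into template 1, b into template 2. */
    t[0].operand[LEFT]  = a; t[0].present[LEFT]  = true;
    t[0].operand[RIGHT] = b; t[0].present[RIGHT] = true;
    t[1].operand[LEFT]  = b; t[1].present[LEFT]  = true;

    bool fired = true;
    while (fired) {                      /* repeat until nothing is enabled */
        fired = false;
        for (int i = 0; i < N_TEMPLATES; i++) {
            if (!enabled(t, i))
                continue;
            int l = t[i].operand[LEFT], r = t[i].operand[RIGHT];
            int v = (t[i].op == '+') ? l + r : (t[i].op == '-') ? l - r : l * r;
            for (int p = 0; p < 2; p++)  /* consume the input tokens */
                if (!t[i].constant[p])
                    t[i].present[p] = false;
            for (int d = 0; d < 2; d++) {/* send copies of the result */
                Dest dst = t[i].dest[d];
                if (dst.node == OUT) {
                    result = v;
                } else if (dst.node >= 0) {
                    t[dst.node].operand[dst.port] = v;
                    t[dst.node].present[dst.port] = true;
                }
            }
            fired = true;
        }
    }
    return result;
}
```

Because each arc has exactly one operand slot, a second invocation of the same graph cannot start until the first has drained its slots, which is the limitation the later slides on re-entrancy address.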

  23. Static Dataflow Machine (Dennis+, ISCA 1974) • [Figure labels: Receive; Instruction Templates (Op, dest1, dest2, p1, src1, p2, src2); several functional units (FU); Send <s1, p1, v1>, <s2, p2, v2>] • Many such processors can be connected together • Programs can be statically divided among the processors

  24. Static Data Flow Machines • Mismatch between the model and the implementation • The model requires unbounded FIFO token queues per arc but the architecture provides storage for one token per arc • The architecture does not ensure FIFO order in the reuse of an operand slot • The static model does not support • Reentrant code • Function calls • Loops • Data Structures

  25. Problems with Re-entrancy • Assume this was in a loop • Or in a function • And operations took variable time to execute • How do you ensure the tokens that match are of the same invocation?

  26. Dynamic Dataflow Architectures • Allocate instruction templates, i.e., a frame, dynamically to support each loop iteration and procedure call • Termination detection needed to deallocate frames • The code can be shared if we separate the code and the operand storage • A token is <fp, ip, port, data>, where fp is the frame pointer and ip is the instruction pointer

  27. A Frame in Dynamic Dataflow • [Figure labels: Program with instructions 1: + (3L, 4L), 2: * (3R, 4R), 3: - (5L), 4: + (5R), 5: * (out); a Frame with slots 1 to 5; token <fp, ip, p, v>] • Need to provide storage for only one operand/operator

  28. Monsoon Processor (ISCA 1990) • [Figure labels: Code (op r d1,d2); Instruction Fetch (ip); Operand Fetch (fp+r); Frames; ALU; Form Token; Token Queue; Network]
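The separation of shared code from per-invocation frames can be sketched as follows. This loosely follows the stage names on the slide (instruction fetch by ip, operand fetch at fp+r, ALU, form token) but is only an illustration: the `Token`, `Instr`, and `Slot` layouts, the `has_const` field for the literal 7, and the fixed-size linear queue are assumptions made here, not details of the real Monsoon hardware.

```c
#include <assert.h>
#include <stdbool.h>

enum { PORT_L = 0, PORT_R = 1, OUT = -1, NONE = -2 };
enum { N_INSTR = 5, N_FRAMES = 2, QUEUE_CAP = 64 };

typedef struct { int fp, ip, port, value; } Token;   /* <fp, ip, port, data> */
typedef struct { int ip, port; } Dest;

typedef struct {
    char op;
    int  r;            /* frame slot offset for the waiting operand */
    bool has_const;    /* hypothetical: right operand is the literal konst */
    int  konst;
    Dest d[2];
} Instr;

/* Shared code for {x = a + b; y = b * 7 in (x-y) * (x+y)}, 0-based indices. */
static const Instr code[N_INSTR] = {
    { '+', 0, false, 0, {{2, PORT_L}, {3, PORT_L}} },
    { '*', 1, true,  7, {{2, PORT_R}, {3, PORT_R}} },
    { '-', 2, false, 0, {{4, PORT_L}, {NONE, 0}}   },
    { '+', 3, false, 0, {{4, PORT_R}, {NONE, 0}}   },
    { '*', 4, false, 0, {{OUT, 0},    {NONE, 0}}   },
};

typedef struct { bool present; int value; } Slot;

/* Processes tokens until the queue drains; result[fp] receives each
   invocation's output. Tokens match only within the same frame. */
void dynamic_dataflow_run(const Token *input, int n_input, int result[N_FRAMES])
{
    Slot  frames[N_FRAMES][N_INSTR] = {0};
    Token queue[QUEUE_CAP];
    int head = 0, tail = 0;

    for (int i = 0; i < n_input; i++) {
        assert(tail < QUEUE_CAP);
        queue[tail++] = input[i];
    }

    while (head < tail) {
        Token t = queue[head++];
        assert(t.fp >= 0 && t.fp < N_FRAMES && t.ip >= 0 && t.ip < N_INSTR);
        const Instr *in = &code[t.ip];       /* instruction fetch (ip)    */
        Slot *s = &frames[t.fp][in->r];      /* operand fetch (fp + r)    */
        int l, r;

        if (in->has_const) {                 /* no partner needed         */
            l = t.value;
            r = in->konst;
        } else if (!s->present) {            /* first operand: wait       */
            s->present = true;
            s->value = t.value;
            continue;
        } else {                             /* second operand: match     */
            l = (t.port == PORT_L) ? t.value : s->value;
            r = (t.port == PORT_L) ? s->value : t.value;
            s->present = false;
        }

        int v = (in->op == '+') ? l + r : (in->op == '-') ? l - r : l * r;

        for (int k = 0; k < 2; k++) {        /* form tokens, same fp      */
            if (in->d[k].ip == OUT) {
                result[t.fp] = v;
            } else if (in->d[k].ip >= 0) {
                assert(tail < QUEUE_CAP);
                Token out = { t.fp, in->d[k].ip, in->d[k].port, v };
                queue[tail++] = out;
            }
        }
    }
}
```

Two invocations with different frame pointers can have their tokens interleaved in the queue without interfering, since the waiting operand is looked up by frame pointer plus slot offset rather than by instruction alone.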

  29. Concept of Tagging • Each invocation receives a separate tag

  30. Procedure Linkage Operators • [Figure labels: function f and arguments a1 ... an; get frame; extract tag; change Tag 0, change Tag 1, ..., change Tag n; Fork; Graph for f; token in frame 0; token in frame 1] • Like standard call/return but caller & callee can be active simultaneously

  31. ADVANCED COMPUTER ARCHITECTURE Fundamental Concepts: Computing Models Samira Khan University of Virginia Jan 23, 2019 The content and concept of this course are adapted from CMU ECE 740
