
Chapter One Introduction to Pipelined Processors


corby

Presentation Transcript


  1. Chapter One Introduction to Pipelined Processors

  2. Principle of Designing Pipeline Processors (Design Problems of Pipeline Processors)

  3. Register Tagging

  4. Example : IBM Model 91 : Floating Point Execution Unit

  5. Example : IBM Model 91-FPU • The floating point execution unit consists of : • Data registers • Transfer paths • Floating Point Adder Unit • Multiply-Divide Unit • Reservation stations • Common Data Bus

  6. Example : IBM Model 91-FPU • There are 3 reservation stations for the adder, named A1, A2 and A3, and 2 for the multiply-divide unit, named M1 and M2 • Each station has source and sink registers with their tag and control fields • The stations hold operands for the next execution

  7. Example : IBM Model 91-FPU • 3 store data buffers (SDBs) and 4 floating point registers (FLRs) are tagged • The busy bit of an FLR indicates that the register is awaiting a result on which subsequent instructions depend • The Common Data Bus (CDB) transfers operands

  8. Example : IBM Model 91-FPU • There are 11 units that supply information to the CDB: 6 FLBs, 3 adder stations and 2 multiply/divide stations • Tags for these stations are :

  9. Example : IBM Model 91-FPU • Internal forwarding can be achieved with the tagging scheme on the CDB • Example: • Let F refer to an FLR and FLBi stand for the ith FLB, with contents (F) and (FLBi) • Consider the instruction sequence: • ADD F,FLB1 : F ← (F) + (FLB1) • MPY F,FLB2 : F ← (F) × (FLB2)

  10. Example : IBM Model 91-FPU • During addition : • The busy bit of F is set to 1 • The contents of F and FLB1 are sent to adder A1 • The tag of F is set to 1010 (tag of adder)

  11. [Figure: Model 91 FPU block diagram showing the storage bus, instruction unit, floating point buffers (FLB) 1 to 6, control, decoder, adder, multiplier and common data bus]

  12. Example : IBM Model 91-FPU • Meanwhile, the decode of MPY reveals that F is busy, so: • The tag of F, 1010 (tag of adder), is copied into M1, which will wait for the adder's result • The tag of F is changed to 1000 (tag of multiplier) • The content of FLB2 is sent to M1

  13. [Figure: Model 91 FPU block diagram, state before addition: storage bus, instruction unit, floating point buffers (FLB) 1 to 6, control, decoder, adder, multiplier and common data bus]

  14. [Figure: Model 91 FPU block diagram, state after addition: storage bus, instruction unit, floating point buffers (FLB) 1 to 6, control, decoder, adder, multiplier and common data bus]

  15. Example : IBM Model 91-FPU • When the addition is done, the result is placed on the CDB with tag 1010, and M1, which is waiting on that tag, receives it • The multiplication is performed once both operands are available
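
The ADD/MPY walk-through in slides 9 to 15 can be traced with a small simulation. This is only a sketch: the tag values 1010 and 1000 come from the slides, while the `Slot` class, the `broadcast` helper and the sample operand values are hypothetical names and numbers chosen for illustration, not the Model 91 hardware design.

```python
# Tag values quoted in the slides; everything else here is illustrative.
TAG_A1 = 0b1010   # adder station
TAG_M1 = 0b1000   # multiplier station


class Slot:
    """A register or station operand: holds a value or waits on a tag."""

    def __init__(self, value=None, tag=None):
        self.value = value
        self.tag = tag  # tag of the unit that will produce the value, or None

    @property
    def busy(self):
        return self.tag is not None


def broadcast(tag, value, slots):
    """Common data bus: every slot waiting on `tag` captures `value`."""
    for slot in slots:
        if slot.tag == tag:
            slot.value, slot.tag = value, None


# Arbitrary sample contents (assumed numbers).
F = Slot(value=2.0)
FLB1, FLB2 = 3.0, 4.0

# ADD F,FLB1 is issued to A1: operands are copied, F now waits on the adder.
a1_sink, a1_src = F.value, FLB1
F.tag = TAG_A1

# MPY F,FLB2 is decoded while F is busy: M1 inherits F's tag, F waits on M1.
m1_sink = Slot(tag=F.tag)
m1_src = FLB2
F.tag = TAG_M1

# The adder finishes; only M1 is waiting on tag 1010, so F is bypassed.
broadcast(TAG_A1, a1_sink + a1_src, [F, m1_sink])

# The multiplier finishes; F captures the product and is no longer busy.
broadcast(TAG_M1, m1_sink.value * m1_src, [F, m1_sink])
result = F.value  # (2.0 + 3.0) * 4.0
```

Note that the intermediate sum never lands in F: it goes straight from the adder to M1, which is the internal forwarding the slides describe.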

  16. Hazard Detection and Resolution

  17. Hazard Detection and Resolution • Hazards are caused by resource usage conflicts among various instructions • They are triggered by inter-instruction dependencies • Terminology: • Resource Objects: set of working registers, memory locations and special flags

  18. Hazard Detection and Resolution • Data Objects: contents of resource objects • Each instruction can be considered as a mapping from a set of data objects to a set of data objects • Domain D(I): set of resource objects whose data objects may affect the execution of instruction I (e.g. source registers)

  19. Hazard Detection and Resolution • Range R(I): set of resource objects whose data objects may be modified by the execution of instruction I (e.g. destination register) • An instruction reads from its domain and writes to its range

  20. Hazard Detection and Resolution • Consider the execution of instructions I and J, where J appears immediately after I • There are 3 types of data-dependent hazards: • RAW (Read After Write) • WAW (Write After Write) • WAR (Write After Read)

  21. RAW (Read After Write) • The necessary condition for this hazard is R(I) ∩ D(J) ≠ ∅

  22. RAW (Read After Write) • Example: I1 : LOAD r1,a I2 : ADD r2,r1 • I2 cannot be correctly executed until r1 is loaded • Thus I2 is RAW dependent on I1

  23. WAW (Write After Write) • The necessary condition is R(I) ∩ R(J) ≠ ∅

  24. WAW (Write After Write) • Example: I1 : MUL r1,r2 I2 : ADD r1,r4 • Here I1 and I2 write to the same destination and hence they are said to be WAW dependent

  25. WAR (Write After Read) • The necessary condition is D(I) ∩ R(J) ≠ ∅

  26. WAR(Write After Read) • Example: • I1 : MUL r1,r2 • I2 : ADD r2,r3 • Here I2 has r2 as destination while I1 uses it as source and hence they are WAR dependent

  27. Hazard Detection and Resolution • Hazards can be detected in the fetch stage by comparing the domain and range of the incoming instruction with those of the instructions already in the pipe • Once a hazard is detected, there are two methods of resolution: • Generate a warning signal to prevent the hazard • Allow the incoming instruction through the pipe and distribute detection to all pipeline stages
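
The domain/range comparison can be sketched with set intersections. The `hazards` function and the dictionary layout below are hypothetical, and the operand encoding is an assumption: each two-operand instruction is taken to read both operands and write the first one, except LOAD, which reads memory location `a` and writes `r1`.

```python
def hazards(i, j):
    """Return the hazard types that are possible when J follows I.

    Each instruction is a dict with a 'domain' set (resources read)
    and a 'range' set (resources written). A non-empty intersection is
    only a necessary condition, so this reports potential hazards.
    """
    found = set()
    if i["range"] & j["domain"]:
        found.add("RAW")   # J reads what I writes
    if i["range"] & j["range"]:
        found.add("WAW")   # J writes what I writes
    if i["domain"] & j["range"]:
        found.add("WAR")   # J writes what I reads
    return found


# Slide 22: LOAD r1,a followed by ADD r2,r1
load = {"domain": {"a"}, "range": {"r1"}}
add_r2 = {"domain": {"r2", "r1"}, "range": {"r2"}}
raw_case = hazards(load, add_r2)

# Slide 24: MUL r1,r2 followed by ADD r1,r4
mul = {"domain": {"r1", "r2"}, "range": {"r1"}}
add_r1 = {"domain": {"r1", "r4"}, "range": {"r1"}}
waw_case = hazards(mul, add_r1)

# Slide 26: MUL r1,r2 followed by ADD r2,r3
add_r2b = {"domain": {"r2", "r3"}, "range": {"r2"}}
war_case = hazards(mul, add_r2b)
```

Under the assumed encoding, the WAW example also satisfies the RAW and WAR conditions, because r1 is both read and written by each instruction; a real detector would use the actual operand semantics of the instruction set.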

  28. Job Sequencing and Collision Prevention

  29. Job Sequencing and Collision Prevention • Consider the reservation table given below, with the first initiation made at t = 1

  30. Job Sequencing and Collision Prevention • Consider the next initiation made at t = 2 • The second initiation fits in the reservation table without conflict

  31. Job Sequencing and Collision Prevention • Now consider the case when the first initiation is made at t = 1 and the second at t = 3 • Here the markings A1 and A2 fall in the same stage during the same time unit; this is called a collision and must be avoided
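
A collision check of this kind can be sketched by overlaying two copies of a reservation table shifted by the latency between initiations. The slide's own table is not reproduced in this transcript, so `TABLE` below is a hypothetical three-stage table, and `collides` is an illustrative helper name.

```python
# Hypothetical reservation table (not the one from the slides):
# stage name -> clock cycles, relative to initiation, in which it is used.
TABLE = {
    "S1": {0, 3},
    "S2": {1, 2},
    "S3": {4},
}


def collides(table, latency):
    """True if two initiations `latency` cycles apart need the same
    stage in the same clock cycle."""
    return any(
        (t + latency) in cycles      # second job's use lands on the first's
        for cycles in table.values()
        for t in cycles
    )


# Latencies 1..4 checked against the hypothetical table.
results = {lat: collides(TABLE, lat) for lat in range(1, 5)}
```

For this table, a second initiation 2 or 4 cycles after the first fits without conflict, while 1 or 3 cycles would collide in S2 or S1 respectively.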

  32. Terminologies

  33. Terminologies • Latency: Time difference between two initiations in units of clock period • Forbidden Latency: Latencies resulting in collision • Forbidden Latency Set: Set of all forbidden latencies

  34. General Method of finding Latency Considering all initiations: • Forbidden Latencies are 3 and 6

  35. Shortcut Method of finding Latency • Forbidden Latency Set = {1, 6} ∪ {1, 3} ∪ {1, 3} = {1, 3, 6}
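
One common way to obtain the forbidden latency set is to take, for each row of the reservation table, the distances between every pair of marks, and then form the union over all rows. The sketch below assumes that row-by-row reading of the shortcut; `forbidden_latencies` is a hypothetical helper, and `TABLE` is an invented table rather than the one on the slide.

```python
from itertools import combinations

# Hypothetical reservation table (not the one from the slides):
# stage name -> clock cycles, relative to initiation, in which it is used.
TABLE = {
    "S1": {0, 3},
    "S2": {1, 2},
    "S3": {4},
}


def forbidden_latencies(table):
    """Union, over all stages, of the distances between pairs of marks."""
    forbidden = set()
    for cycles in table.values():
        forbidden |= {abs(a - b) for a, b in combinations(cycles, 2)}
    return forbidden


per_row = {stage: forbidden_latencies({stage: c}) for stage, c in TABLE.items()}
forbidden = forbidden_latencies(TABLE)
```

Here S1 contributes {3}, S2 contributes {1} and S3, with a single mark, contributes nothing, so the forbidden latency set of this table is {1, 3}.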

  36. Terminologies • Latency Sequence: sequence of latencies between successive initiations • For a reservation table (RT), the number of valid initiation sequences, and hence of latency sequences, is infinite

  37. Terminologies • Latency Cycle: • Among the infinitely many possible latency sequences, the periodic ones are significant. E.g. {1, 3, 3, 1, 3, 3, …} • The subsequence that repeats itself is called the latency cycle. E.g. {1, 3, 3}

  38. Terminologies • Period of cycle: the sum of the latencies in a latency cycle (1 + 3 + 3 = 7) • Average Latency (AL): the average taken over the latency cycle (AL = 7/3 ≈ 2.33) • To design a pipeline, we need a control strategy that maximizes the throughput (number of results per unit time) • Maximizing throughput is equivalent to minimizing AL
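
The arithmetic for the latency cycle {1, 3, 3} can be checked in a few lines. The function name `average_latency` is illustrative, and throughput is taken here as the reciprocal of the average latency, in initiations per clock period.

```python
from fractions import Fraction


def average_latency(cycle):
    """Sum of the latencies in the cycle divided by their count."""
    return Fraction(sum(cycle), len(cycle))


cycle = (1, 3, 3)
period = sum(cycle)              # 1 + 3 + 3
al = average_latency(cycle)      # exact value as a fraction
throughput = 1 / al              # initiations per clock period
```

Using `Fraction` keeps the values exact, so cycles with equal average latency compare as equal instead of differing by rounding.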

  39. Terminologies • A control strategy based on an aperiodic latency sequence is impractical to design • Thus the design problem is to arrive at a latency cycle with minimal average latency
