
OOO vs. EPIC





  1. OOO vs. EPIC • Yingmin Li, Ting Yan, Qi Zhao

  2. Outline • “Advantages” of EPIC • Critique • Conclusion

  3. EPIC: Main Idea • “Smart compiler, dumb machine” • Finding parallelism • Processor → compiler • Software/hardware synergy • Processor design • Avoid complexity and difficulty • ILP, SMT & CMP

  4. EPIC: Predication • In OOO: dynamic branch prediction. • Larger basic blocks. • Control dep. → data dep. • Eliminate mispredictions & penalties.

  5. EPIC: Speculation • OOO: dynamic hardware • Data speculation & control speculation • Bigger window • Reduce impact of memory latencies

  6. EPIC: Large Register Set • OOO: register renaming. • Easier to design than register renaming. • “Real” registers benefit some apps. • Encryption algorithms, numerical algorithms. • Avoid loss of invisible registers. • Interruptions in OOO.

  7. EPIC: Unique Features • Register Stack Engine (RSE). • To deal with call/return costs. • Appears as an unlimited stack of physical registers. • Rotating register file. • Software pipelining. • Multiple loops at the same time.
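The rotating register file can be modeled in a few lines. This is a minimal sketch, not the IA-64 implementation: the region size `NROT`, the `rot_file` type, and the function names are assumptions made for illustration. It shows only the renaming effect that software pipelining relies on: after one rotation, a value written under logical name 0 is read under logical name 1, so each loop iteration gets fresh names without copy instructions.

```c
#include <assert.h>

#define NROT 8  /* size of the rotating region (assumed for this sketch) */

typedef struct {
    int phys[NROT];  /* physical registers */
    int base;        /* rotating base, moved by rotate() */
} rot_file;

/* Map a logical register name to a physical slot. */
int phys_index(const rot_file *f, int logical) {
    assert(logical >= 0 && logical < NROT);
    return (logical + f->base) % NROT;
}

void rot_write(rot_file *f, int logical, int v) {
    f->phys[phys_index(f, logical)] = v;
}

int rot_read(const rot_file *f, int logical) {
    return f->phys[phys_index(f, logical)];
}

/* One rotation: the base steps down by one (modulo NROT), so every
 * live value appears under the next higher logical name. */
void rotate(rot_file *f) {
    f->base = (f->base + NROT - 1) % NROT;
}
```

In a pipelined loop, `rotate` would run once per iteration, so the stage that consumes a value simply reads a higher-numbered register than the stage that produced it.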

  8. Function Call • Register saving/restoring • Processor? • Compiler? • Register file • Expensive • Always idle

  9. Predication • Computation of the branch condition is on the critical path • Increases I-cache footprint • Only half of the functional units are effectively used if both “then” and “else” paths are scheduled • Full predication is hard to implement in an out-of-order machine

  10. Predication • To compute if (a) x = t+1:
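The slide's original figure is not in the transcript. As a stand-in, here is a hedged C sketch of what if-conversion does to this statement; the function names are hypothetical, and unsigned arithmetic is used so that always computing `t + 1` is well defined. The point to notice is that in the predicated form the add no longer waits on a branch, but the final commit still depends on the compare that produces the predicate, which is the critical-path concern raised on the previous slide.

```c
#include <assert.h>

/* Branching form: the assignment is control-dependent on a. */
unsigned update_branch(int a, unsigned x, unsigned t) {
    if (a) {
        x = t + 1;
    }
    return x;
}

/* If-converted form: t + 1 is always computed; the predicate p only
 * decides whether the result is committed to x. */
unsigned update_predicated(int a, unsigned x, unsigned t) {
    int p = (a != 0);        /* compare sets the predicate */
    unsigned tmp = t + 1;    /* executes regardless of p */
    return p ? tmp : x;      /* commit is data-dependent on p */
}

/* Sanity check: both forms agree for a given input. */
void check_equivalent(int a, unsigned x, unsigned t) {
    assert(update_branch(a, x, t) == update_predicated(a, x, t));
}
```

A C compiler is free to turn either form into a branch or a conditional move, so the code models the transformation rather than guaranteeing any particular machine code.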

  11. Control Speculation • Why not just use prefetch, which cannot raise an unexpected exception? • Techniques that exploit control speculation, such as superblocks, increase code length
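For readers unfamiliar with the mechanism being criticized, control speculation can be sketched as follows. This is an illustrative model only: `spec_reg`, `load_speculative`, and `add_if_used` are hypothetical names, and a null pointer stands in for any faulting address. The load is hoisted above the branch that guards it; a fault is recorded in a flag (IA-64 calls it the NaT bit) instead of being raised, and it surfaces only if the guarded path actually consumes the value.

```c
#include <assert.h>
#include <stddef.h>

/* A register value plus a deferred-exception flag. */
typedef struct {
    int value;
    int nat;  /* 1 if the speculative load would have faulted */
} spec_reg;

/* Speculative load: records a fault instead of raising it.
 * NULL stands in for any faulting address (assumption of the sketch). */
spec_reg load_speculative(const int *addr) {
    spec_reg r = {0, 0};
    if (addr == NULL) {
        r.nat = 1;
    } else {
        r.value = *addr;
    }
    return r;
}

/* Original code: if (use) sum += *p;
 * Here the load is hoisted above the branch. */
int add_if_used(int use, const int *p, int sum, int *faulted) {
    assert(faulted != NULL);
    spec_reg r = load_speculative(p);  /* runs even when use == 0 */
    *faulted = 0;
    if (use) {
        if (r.nat) {       /* check: the deferred fault surfaces here */
            *faulted = 1;  /* recovery code would redo the load and fault */
            return sum;
        }
        sum += r.value;
    }
    return sum;
}
```

The check and the recovery path are extra instructions that a plain prefetch would not need, which is the code-length cost the slide points to.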

  12. Control prediction

  13. Data Speculation • Moving a load above a possibly conflicting store • An advanced load and a checking load (IA64) • A run-time predictor
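The advanced-load and check-load pair can be modeled with a one-entry tracking table. This is a simplified sketch under stated assumptions: the `alat_entry` type and all function names are hypothetical, and real hardware tracks many entries and matches addresses less exactly. The load is moved above a store that might alias it; the store invalidates the entry on a match, and the check load reloads only when that happened.

```c
#include <assert.h>
#include <stddef.h>

/* One-entry stand-in for the table that tracks advanced loads. */
typedef struct {
    const int *addr;  /* watched address, NULL once invalidated */
} alat_entry;

/* Advanced load: perform the load early and start watching addr. */
int load_advanced(alat_entry *e, const int *addr) {
    assert(addr != NULL);
    e->addr = addr;
    return *addr;
}

/* Store: writes memory and invalidates the entry on a conflict. */
void store_checked(alat_entry *e, int *addr, int value) {
    *addr = value;
    if (e->addr == addr) {
        e->addr = NULL;
    }
}

/* Check load: keep the early value if the entry survived, else reload. */
int load_check(const alat_entry *e, const int *addr, int early_value) {
    if (e->addr == addr) {
        return early_value;
    }
    return *addr;
}

/* Original order: *q = v; return *p + 1;
 * Here the load of *p is hoisted above the store to *q. */
int store_then_load(int *q, int v, int *p) {
    alat_entry e;
    int x = load_advanced(&e, p);
    store_checked(&e, q, v);
    x = load_check(&e, p, x);
    return x + 1;
}
```

When `p` and `q` do not alias, the early value is used and the load latency is hidden behind the store; when they do alias, the check pays for a second load.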

  14. Data speculation

  15. Software Pipelining • For high performance technical computing • High trip-count loops • For commercial applications • Low trip-count loops
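One plausible reading of the trip-count contrast is fill and drain overhead, which the slide does not spell out. The sketch below is an assumption-laden illustration with a hypothetical function name: a two-stage pipeline in which the multiply of iteration i overlaps the add of iteration i - 1. The function returns the number of pipeline steps, which is n + 1 for n iterations, so only n - 1 of those steps keep both stages busy.

```c
#include <assert.h>

/* Computes out[i] = in[i] * 3 + 1 with a two-stage software pipeline.
 * Stage A (multiply) of iteration i overlaps stage B (add) of i - 1.
 * Returns the number of pipeline steps taken. */
int scale_add_pipelined(const int *in, int *out, int n) {
    assert(n >= 0);
    if (n == 0) {
        return 0;
    }
    int steps = 0;

    int prod = in[0] * 3;           /* prologue: stage A of iteration 0 */
    steps++;

    for (int i = 1; i < n; i++) {   /* kernel: both stages busy */
        int next = in[i] * 3;       /* stage A of iteration i */
        out[i - 1] = prod + 1;      /* stage B of iteration i - 1 */
        prod = next;
        steps++;
    }

    out[n - 1] = prod + 1;          /* epilogue: stage B of last iteration */
    steps++;
    return steps;
}
```

With n = 1000 almost every step is a full kernel step; with n = 2 only one of the three steps is, so the overlap buys little for low trip-count loops.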

  16. EPIC: at least not a breakthrough • Design objective of EPIC: • Moving hardware complexity to the compiler

  17. EPIC: at least not a breakthrough • The failure of EPIC: • The compilation techniques used for EPIC apply almost equally well to OOO • The hardware simplification is not clearly large enough to offset EPIC’s overhead • Without dynamic information, the compiler cannot do some things well enough

  18. The tragedy of cycle time • Why there is no obvious improvement in cycle time • Mechanisms like the RSE increase die complexity • Compare and dependent branch in one cycle • Predicated execution depends on the availability of many functional units

  19. Dynamic path length: hey, IA64, you waste too much here • Speculation • Half of the predicated instructions discarded • Restricted bundling • One base register • No sign-extended loads • No integer multiply or divide in the general registers

  20. CPI • No dynamic prediction • Longer code (more general registers, predicate registers, template bits, restricted bundling, recovery code) burdens instruction fetch • Recovery code may cause I-cache pollution or even a page fault
