
REGISTER FILE ACCESS REDUCTION BY DATA REUSE

Hiroshi Takamura, Koji Inoue, Vasily G. Moshnyaga. Dept. of Electronics Engineering and Computer Science, Fukuoka University, Japan. Overview of the talk: motivation of this work, the Data-Reuse approach, experimental results, conclusion.




Presentation Transcript


  1. REGISTER FILE ACCESS REDUCTION BY DATA REUSE. Hiroshi Takamura, Koji Inoue, Vasily G. Moshnyaga. Dept. of Electronics Engineering and Computer Science, Fukuoka University, Japan

  2. Overview of the talk • Motivation of this work • The Data-Reuse approach • Experimental Results • Conclusion

  3. Motivation of this work: extending battery lifetime and lowering cost. Reducing the energy consumption of microprocessors is necessary.

  4. Power distribution in Motorola’s M-core (source: D. Gonzales, IEEE Micro, 19(4), 1999): Clock 36%, Data path 36%, Controller 28%, Total 100%. The register file takes 16% of the total power and 42% of the data path power!

  5. Register File Energy Dissipation: $E = (N_{read} + N_{write}) \cdot E_{acc}$, where $N_{read}$ is the total number of RF reads, $N_{write}$ is the total number of RF writes, and $E_{acc}$ is the average energy per RF access. Assumption: a read and a write consume equal energy. Our goal: to lower $N$, according to operand variation, by architectural optimizations.
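Because the per-access energy is assumed equal for reads and writes, it cancels when an optimized design is compared against the conventional one, so the relative energy saving depends only on access counts. A short derivation, where the primed symbols are notation introduced here (not from the slides) for the access counts after optimization:

```latex
\[
\frac{E - E'}{E}
  = 1 - \frac{\left(N'_{read} + N'_{write}\right) E_{acc}}
             {\left(N_{read} + N_{write}\right) E_{acc}}
  = 1 - \frac{N'_{read} + N'_{write}}{N_{read} + N_{write}}
\]
```

For instance, the six-instruction example later in the talk goes from 17 total accesses to 7, which under this model corresponds to a saving of 1 - 7/17, roughly 59%. This holds only under the equal-energy assumption stated on the slide.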

  6. Problem of conventional RF operation. Operand fields: destination operand, first source operand, second source operand. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii). [Diagram: Register file, Rs and Rt latches, ALU; 4 read accesses.] The source values ($s1, $t1) are not updated between (i) and (ii). Therefore the RF reads for (ii) are unnecessary.

  7. Problem of conventional RF operation. Almost all results are passed to the following instructions via forwarding units, so they are consumed before the RF write. Therefore there is unnecessary RF writing.

  8. Register file access reduction approach (reuse of the same source operand value). R-mode. Operand fields: destination operand, first source operand, second source operand. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii). [Diagram: Register file, Rs and Rt latches, ALU, control.]

  9. Register file access reduction approach (operand swapping). S-mode. Operand fields: destination operand, first source operand, second source operand. add $t0, $s1, $t1 (i); mul $t3, $t1, $s1 (ii). [Diagram: Register file, Rs and Rt latches, a MUX on each ALU input, ALU, control.]

  10. RF access reduction approach (delayed operand reuse). J-mode. Operand fields: destination operand, first source operand, second source operand. sub $t3, $s1, $t1 (i); lw $t2, 20($s2) (ii); sub $t4, $t2, $t1 (iii). [Diagram: Register file, Rs and Rt latches, ALU, control.]

  11. Reduction of RF writing (application of writing-operation omission). Operand fields: destination operand, first source operand, second source operand. add $t1, $t1, $s1 (i); sub $t1, $s1, $t1 (ii). [Pipeline diagram over c.c.1 to c.c.6 showing instructions (i) and (ii) passing through the IM, Reg, DM, Reg stages, with the useless writing access marked.]
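The write-omission idea on this slide can be sketched as a simple count over destination registers. This is a minimal, name-based illustration and not the authors' hardware: the function name `count_rf_writes` and the `window` parameter are hypothetical, and the sketch assumes that any instruction reading the value before it is overwritten receives it through the forwarding units.

```python
from typing import List, Optional


def count_rf_writes(dests: List[Optional[str]], window: int = 1) -> int:
    """Count RF writes, skipping a write whose destination register is
    overwritten by one of the next `window` instructions.

    Assumption (not from the slides): readers in between get the value
    via forwarding, so the skipped write is never observed in the RF.
    """
    writes = 0
    for i, dest in enumerate(dests):
        if dest is None:  # instruction without a destination register
            continue
        following = dests[i + 1 : i + 1 + window]
        if dest in following:  # value is dead before it is read from the RF
            continue
        writes += 1
    return writes


# Destinations of: add $t1, $t1, $s1 (i); sub $t1, $s1, $t1 (ii)
example = ["$t1", "$t1"]
print(count_rf_writes(example, window=0))  # 2 (no omission)
print(count_rf_writes(example, window=1))  # 1 (write of (i) omitted)
```

With `window=0` the function behaves like a conventional register file; a larger window trades more omitted writes for a stronger reliance on forwarding.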

  12. An example: number of accesses in a conventional register file. Columns: Dest., Source1, Source2, number of accesses. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii); add $t1, $t1, $s1 (iii); sub $t1, $s1, $t1 (iv); lw $t2, 20($s1) (v); sub $t4, $s1, $t1 (vi)

  13. Operand reuse between consecutive instructions. Columns: Dest., Source1, Source2, number of accesses. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii); add $t1, $t1, $s1 (iii); sub $t1, $s1, $t1 (iv); lw $t2, 20($s1) (v); sub $t4, $s1, $t1 (vi). Number of accesses: 7 reads, 6 writes.

  14. Operand swapping. Columns: Dest., Source1, Source2, number of accesses. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii); add $t1, $t1, $s1 (iii); sub $t1, $s1, $t1 (iv); lw $t2, 20($s1) (v); sub $t4, $s1, $t1 (vi). Number of accesses: 3 reads, 6 writes.

  15. Operand reuse between non-consecutive instructions. Columns: Dest., Source1, Source2, number of accesses. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii); add $t1, $t1, $s1 (iii); sub $t1, $s1, $t1 (iv); lw $t2, 20($s1) (v); sub $t4, $s1, $t1 (vi). Number of accesses: 2 reads, 6 writes.

  16. Writing-operation omission. Columns: Dest., Source1, Source2, number of accesses. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii); add $t1, $t1, $s1 (iii); sub $t1, $s1, $t1 (iv); lw $t2, 20($s1) (v); sub $t4, $s1, $t1 (vi). Number of accesses: 2 reads, 5 writes.

  17. RF accesses by the proposed technique. add $t0, $s1, $t1 (i); mul $t3, $s1, $t1 (ii); add $t1, $t1, $s1 (iii); sub $t1, $s1, $t1 (iv); lw $t2, 20($s1) (v); sub $t4, $s1, $t1 (vi). Number of reads: 11 → 2. Number of writes: 6 → 5. Total number of accesses: 17 → 7.
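The read counts in this worked example can be reproduced with a small latch model. The sketch below is an assumption-laden illustration, not the authors' simulator: `count_rf_reads` and its flags are hypothetical names, matching is by register name only, and it assumes a value rewritten by an earlier instruction (here $t1) reaches the latches through forwarding rather than through an RF read.

```python
from typing import List, Optional, Tuple

# (rs, rt) source registers per instruction; rt is None when unused (lw).
Instr = Tuple[str, Optional[str]]


def count_rf_reads(prog: List[Instr], reuse: bool = False,
                   swap: bool = False, delayed: bool = False) -> int:
    """Count RF reads under a name-based model of the Rs/Rt input latches.

    reuse   : reuse an operand already held in the same-position latch
    swap    : also reuse an operand held in the opposite latch (via a MUX)
    delayed : a latch keeps its value across instructions that skip it
    """
    latch_rs: Optional[str] = None
    latch_rt: Optional[str] = None
    reads = 0
    for rs, rt in prog:
        for pos, reg in (("rs", rs), ("rt", rt)):
            if reg is None:
                continue
            same = latch_rs if pos == "rs" else latch_rt
            other = latch_rt if pos == "rs" else latch_rs
            hit = reuse and (reg == same or (swap and reg == other))
            if not hit:
                reads += 1
        # Assumed: latches now hold the operands presented to the ALU.
        latch_rs = rs
        if rt is not None:
            latch_rt = rt
        elif not delayed:
            latch_rt = None  # without delayed reuse the unused latch is lost
    return reads


# Source operands of instructions (i) to (vi) from the slide.
PROG: List[Instr] = [
    ("$s1", "$t1"),  # add $t0, $s1, $t1
    ("$s1", "$t1"),  # mul $t3, $s1, $t1
    ("$t1", "$s1"),  # add $t1, $t1, $s1
    ("$s1", "$t1"),  # sub $t1, $s1, $t1
    ("$s1", None),   # lw  $t2, 20($s1)
    ("$s1", "$t1"),  # sub $t4, $s1, $t1
]

print(count_rf_reads(PROG))                                        # 11
print(count_rf_reads(PROG, reuse=True))                            # 7
print(count_rf_reads(PROG, reuse=True, swap=True))                 # 3
print(count_rf_reads(PROG, reuse=True, swap=True, delayed=True))   # 2
```

The four printed values match the read counts shown on the slides for the conventional register file and for each added reuse mode; a real implementation would also need to track when a latched value becomes stale.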

  18. Experimental Evaluation • Flexible Architecture Simulation Tool • Cycle-accurate instruction simulation on a 5-stage RISC-type microprocessor (similar to MIPS) • Traces user-level instructions and records RF access information as well as the total number of operand reuses • 32-entry RF (1 write, 2 reads) • SPEC95 and MediaBench benchmarks: adpcm_c, adpcm_d, compress, go, mpeg_d, mpeg_e, pegwit_g, pegwit_enc, pegwit_dec • We described a simple RISC microprocessor in Verilog-HDL and synthesized it with Synopsys Design Compiler • A 0.35 μm process technology was assumed • SUN UltraSparc-3 environment

  19. Reduction rate (%) for RF reads. RF access reduction: 62.7% (maximum)!

  20. Reduction rate (%) for RF writes. RF access reduction: 60% (maximum)!

  21. Reduction rate (%) for reads and writes combined. RF access reduction: 61% (maximum)!

  22. Area comparison. Hardware overhead: +3.2% (maximum)!

  23. Conclusion • We proposed a technique to reduce the energy dissipation of the register file by operand reuse • Energy savings vary by application: • Read: 62% (max), 29% (avg.) • Write: 60% (2 instr.), 55% (1 instr.) • Total: 61% (max), 39% (avg.) • Hardware overhead: • Read: 1.7%, Read & Write: 3.2%. Future Work • Verification at a cycle level • Evaluation based on detailed energy models • A detailed estimation of the control circuitry overhead
