
Energy Efficient and High Speed On-Chip Ternary Bus

Chunjie Duan, Mitsubishi Electric Research Labs, Cambridge, MA, USA
Sunil P. Khatri, Texas A&M University, College Station, TX, USA


Presentation Transcript


  1. Energy Efficient and High Speed On-Chip Ternary Bus
     Chunjie Duan, Mitsubishi Electric Research Labs, Cambridge, MA, USA
     Sunil P. Khatri, Texas A&M University, College Station, TX, USA

  2. Motivation
     • Trends in VLSI design
       • Shrinking feature size
       • Deep SubMicron (DSM) and Very Deep SubMicron (VDSM) processes
       • Scaling down supply voltage
       • Increasing die size (e.g. SoC, NoC, CMP)
     • Impacts
       • Smaller gate delay (high-speed logic)
       • Lower switching power per gate
       • High complexity (> 1 billion gates)
       • Increasing power consumption
       • Higher leakage current (standby power)
       • Reduced noise margin
       • Increasing interconnect delay
       • Interconnect delay >> gate delay
       • Global interconnect becomes the performance bottleneck

  3. On-chip Bus Interconnects
     • The impact of DSM / VDSM:
       • W↓, P↓
       • L↑, T↑ (to avoid a quadratic increase in the resistance of the wire)
       • Inter-wire capacitance C_I is much greater than substrate capacitance C_L → crosstalk becomes dominant
       • λ = C_I / C_L > 10 for metal 4 in a 0.1 µm CMOS process
     [Figure: wire cross-sections for an earlier process and a DSM process, labeled with W, P, T, C_I and C_L]

  4. Ternary Bus and Mapping
     • Advantage of a ternary bus
       • Low voltage step: Vdd/2 instead of Vdd
     • We propose a bit-to-bit binary-ternary mapping scheme
       • Each binary bit is mapped directly to a line on the ternary bus.
       • A binary 0 is mapped to the middle value on the ternary bus, i.e. 0b → 0t.
       • A binary 1 is mapped to either the high or the low value on the ternary bus, i.e. 1b → +t or 1b → -t.
     • Disadvantage: lower bit density (1 bit/line vs. 1.58 bit/line for a true ternary bus)
     • Advantages: direct mapping and flexible polarity
       • Direct mapping avoids true ternary-to-binary conversion, which is very slow and complex.
       • Flexible polarity results in low crosstalk, e.g. the ternary vectors +0+, -0-, +0- and -0+ all represent the same binary value 101.
     • Each ternary value is represented by the polarity Pj and the magnitude Dj.
     [Table: ternary driver truth table]
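The direct mapping on this slide can be illustrated with a minimal sketch. It assumes each line value is written as one of the characters '+', '0' or '-', following the notation of the slide's own example; the function names are hypothetical and do not come from the presentation.

```python
import math

# Assumed notation: each bus line carries '+', '0' or '-' (high, middle, low).
TERNARY_SYMBOLS = ("+", "0", "-")


def ternary_to_binary(vector: str) -> str:
    """Decode line by line: middle level -> 0, high or low level -> 1."""
    for symbol in vector:
        if symbol not in TERNARY_SYMBOLS:
            raise ValueError(f"unknown ternary symbol: {symbol!r}")
    return "".join("0" if symbol == "0" else "1" for symbol in vector)


def binary_to_ternary(bits: str, polarities: str) -> str:
    """Encode bits with a caller-chosen polarity ('+' or '-') for every 1.

    The polarity of a 0 bit is ignored, since 0 always maps to the middle level.
    """
    if len(bits) != len(polarities):
        raise ValueError("bits and polarities must have the same length")
    return "".join(p if b == "1" else "0" for b, p in zip(bits, polarities))


# The four vectors from the slide all carry the binary value 101.
decoded = [ternary_to_binary(v) for v in ("+0+", "-0-", "+0-", "-0+")]

# Bit density: 1 bit/line for direct mapping vs. log2(3) for a true ternary code.
true_ternary_density = math.log2(3)
```

The free choice of polarity in `binary_to_ternary` is the degree of freedom that the coding rules on the later slides exploit.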

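  5. Crosstalk in a Multi-valued Bus
     • Define the effective crosstalk as [equation not preserved in the transcript]
       • where dj,k = sgn(dj) ΔVk is the normalized voltage change, and [expression not preserved]. NOL is the number of logic levels.
     • Delay can be approximated as [equation not preserved]
       • for λ >> 1, [equation not preserved]
     • Energy consumption is [equation not preserved]
       • when λ >> 1, [equation not preserved]
     • For a ternary bus, Vstep = Vdd/2, and
       • max(Xeff,j) = 8
       • min(Xeff,j) = 0
     • Bus speed/power is highly data-pattern dependent!
     Table 1. Examples of Total Crosstalk [table not preserved in the transcript]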
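The defining equation for the effective crosstalk did not survive in this transcript. The sketch below therefore assumes a commonly used form, X_eff,j = |2·Δj − Δj-1 − Δj+1|, with each Δ measured in units of Vstep and missing neighbors at the bus edges treated as static wires. This form is an assumption rather than the authors' stated definition; it is chosen because it reproduces the slide's bounds of 0 and 8 for a ternary bus. All names are hypothetical.

```python
from itertools import product

LEVELS = (-1, 0, 1)  # ternary levels in units of Vstep = Vdd/2


def effective_crosstalk(prev, curr, j):
    """Assumed model: |2*d_j - d_(j-1) - d_(j+1)|, with d in Vstep units."""
    def delta(k):
        # Lines outside the bus are treated as static wires (assumption).
        return curr[k] - prev[k] if 0 <= k < len(prev) else 0

    return abs(2 * delta(j) - delta(j - 1) - delta(j + 1))


def crosstalk_range(n_lines=3, j=1):
    """Enumerate every transition of an n-line bus and return (min, max) on line j."""
    states = list(product(LEVELS, repeat=n_lines))
    values = [effective_crosstalk(p, c, j) for p in states for c in states]
    return min(values), max(values)


# Worst case: the victim swings by two steps against both neighbors.
worst = effective_crosstalk((-1, 1, -1), (1, -1, 1), 1)
```

Under this assumed model, a line that moves together with both neighbors sees no effective crosstalk, while opposite full swings give the maximum, which is the data-pattern dependence the slide points out.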

  6. A Low Power, High Speed 4X Ternary Bus
     • Using direct bit-to-bit mapping
     • Coding rules:
       • Rule #1: A direct - ↔ + transition is prohibited.
       • Rule #2: A 1b → 0b transition is mapped as -t → 0t or +t → 0t, depending only on the current polarity of the 1b.
       • Rule #3: For a 0b → 1b transition on bj, if bj-1 is transitioning, Pj is coded so that both lines transition in the same direction.
       • Rule #4: For a 0b → 1b transition on bj, if bj-1 is not transitioning and bj+1 is transitioning from 1 to 0, Pj is coded so that the jth and (j+1)th lines transition in the same direction.
       • Rule #5: For a 0b → 1b transition on bj, if there is no transition on either neighbor, Pj is coded so that Pj = Pj-1 or Pj = Pj+1, with Pj = Pj-1 having the higher priority.
     • The 1st rule guarantees max(Xeff,j) = 4, and therefore a 2X speed-up over a conventional binary bus.
     • The other rules are designed to lower the probability of high Xeff,j values occurring on the bus.
     • Identical encoder/decoder logic for each bit
     [Figure: an example of 4X ternary sequences]
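The effect of Rule #1 on the worst case can be checked by brute force. This is a sketch under an assumed crosstalk model, X_eff = |2·Δ1 − Δ0 − Δ2| on the middle line of a 3-line bus with Δ in units of Vstep; the model and all function names are illustrative assumptions, not taken from the presentation. Rules #2 to #5 are not modeled here.

```python
from itertools import product

LEVELS = (-1, 0, 1)  # '-', '0', '+' in units of Vstep = Vdd/2


def obeys_rule_1(prev, curr):
    """Rule #1: no line may jump directly between the - and + levels."""
    return all(abs(c - p) <= 1 for p, c in zip(prev, curr))


def middle_line_crosstalk(prev, curr):
    """Assumed model for a 3-line bus: |2*d_1 - d_0 - d_2|, d in Vstep units."""
    d = [c - p for p, c in zip(prev, curr)]
    return abs(2 * d[1] - d[0] - d[2])


def worst_case(rule_1_enforced):
    """Largest crosstalk over all transitions, optionally filtered by Rule #1."""
    states = list(product(LEVELS, repeat=3))
    return max(
        middle_line_crosstalk(p, c)
        for p in states
        for c in states
        if not rule_1_enforced or obeys_rule_1(p, c)
    )


unrestricted_max = worst_case(rule_1_enforced=False)
rule_1_max = worst_case(rule_1_enforced=True)
```

With every line limited to a single-step change, the three terms of the assumed expression can contribute at most 2 + 1 + 1, which is where the bound of 4 comes from.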

  7. An Even Faster 3X Ternary Bus
     • Partition the bus into 5-bit groups
     • Insert a shield wire between groups
     • Apply the same rules as for the 4X bus
     • It can be proven that such a configuration guarantees max(Xeff) = 3
       • An additional 33% speed-up over the 4X ternary bus
       • At the cost of 20% additional wires
     [Figure: 4X bus encoder and driver circuit; 3X bus encoder and driver circuit]
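The two percentages on this slide follow from simple arithmetic, sketched below. The sketch assumes that worst-case delay scales with max(Xeff), as the λ >> 1 approximation suggests, and that exactly one shield wire sits between adjacent 5-bit groups with none at the bus edges. The function names and the 32-bit example width are hypothetical.

```python
import math


def speedup_percent(x_before, x_after):
    """Relative speed gain, assuming worst-case delay scales with max(Xeff)."""
    return (x_before / x_after - 1) * 100


def shield_overhead_percent(n_bits, group_size=5):
    """Extra wires when one shield is inserted between adjacent groups."""
    groups = math.ceil(n_bits / group_size)
    shields = groups - 1  # assumption: no shields at the outer edges
    return shields / n_bits * 100


gain_3x_over_4x = speedup_percent(4, 3)       # max(Xeff) drops from 4 to 3
overhead_32_bit = shield_overhead_percent(32)  # 7 groups, 6 shield wires
```

For a finite bus the overhead stays slightly below 20% (18.75% for the 32-bit example) and approaches 20% as the bus gets wider.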

  8. Circuit Implementations
     • Encoder implemented based on the 5 rules
     • Decoder is extremely simple (implemented with two 2-input gates)
     • Ternary driver and receiver can be implemented in current or voltage mode
       • Current mode is more power hungry (static current)
       • Voltage mode requires a low-impedance Vdd/2 supply
     [Figure: encoder, ternary driver and receiver circuits: (A) current mode; (B) voltage mode]

  9. Experimental Results
     • The power saving comes from the redistribution of the Xeff
       • More transitions are pushed towards lower Xeff
     • The average power saving is ~27%
     [Figure: crosstalk distribution and normalized energy consumption comparison (coded ternary vs. half-swing binary)]
     Legend: 4X: ternary bus using 4X code; HB: half-swing binary bus; RP: ternary bus with random polarity; TT: true ternary bus

  10. Experimental Results
     • The proposed 4X and 3X buses are advantageous over other bus coding schemes.
       • EF: normalized total energy
       • PDP: power-delay product
     [Table: bus performance comparison]
     Legend: 4XT: ternary bus using 4X code; 3XT: ternary bus with 3X code; SB: binary bus with shielding; HB: half-swing binary bus; RP: ternary bus with random polarity; TT: true ternary bus

  11. Experimental Results
     [Figure: eye diagrams for uncoded and coded buses (10 mm)]

  12. Summary
     • Crosstalk classification was extended to multi-valued buses.
     • We proposed a direct bit-to-bit binary-ternary mapping scheme, which results in a simple CODEC design.
     • We proposed a 4X coding scheme that allows us to double the speed of a conventional ternary bus and save energy.
     • We proposed a coding scheme (3X coding) to attain an additional 33% speed gain at the cost of 20% area overhead.
     • We designed and implemented the CODEC and ternary driver/receiver.
     • Our experimental results show significant power saving (27%) and speed gain (2X or more) over other schemes.
