
A comparison of DSP Architectures BlackFin ADSP-BFXXX Compute Unit

A comparison of DSP architectures: the BlackFin ADSP-BFXXX compute unit. Based on an ENEL619.23 white paper prepared by Darrell Anklovitch. Overview: architecture overview, register map, ALU features and sample instructions, multiplier features and sample instructions.


Presentation Transcript


  1. A comparison of DSP Architectures: BlackFin ADSP-BFXXX Compute Unit. Based on an ENEL619.23 white paper prepared by Darrell Anklovitch. Blackfin Compute Unit REV B

  2. Overview
     • Architecture Overview
     • Register Map
     • ALU features and sample instructions
     • Multiplier features and sample instructions
     • Shifter features and sample instructions

  3. References
     • ADSP-BF535 Blackfin Processor Hardware Reference, Rev 2, April 2004, Analog Devices. Section 2.
     • Blackfin Processor Instruction Set Reference, Rev 2, May 2003, Analog Devices. Sections 8 to 10, 14 and 15.
     • A number of the figures in this presentation are based on figures found in the ADSP-BF535 Blackfin Processor Hardware Reference.

  4. ADSP-2106x Core Architecture
     [Block diagram not reproduced. Labelled blocks: program sequencer with 32 x 48 instruction cache; DAG 1 (8 x 4 x 32) and DAG 2 (8 x 4 x 24); timer; flags; JTAG test & emulation; PM address bus (24) and DM address bus (32); PM data bus (48) and DM data bus (40) with bus connect; 16 x 40 register file; 32-bit floating-point & fixed-point multiplier with fixed-point accumulator; barrel shifter; floating & fixed-point ALU.]

  5. Register File and COMPUTE Units
     Key issues:
     • 5 data paths FROM COMPUTE units
     • 5 data paths TO COMPUTE units
     • Highly parallel operations UNDER THE RIGHT CONDITIONS

  6. BF533 Memory Accesses
     Under the right conditions, 4 memory accesses can occur at the same time:
     • 64 bit instruction fetch, 2 x 32 bit data loads, 32 bit data store
     • PLUS up to 2 ALU (32 bit) and 2 MAC (16 bit) operations at the same time
     • PLUS background DMA activity

  7. Compute Unit Architecture
     [Diagram not reproduced. Labelled blocks: register file, 2 multipliers, 2 ALUs, 1 set of video ALUs, 1 shifter.]

  8. Register File
     • 8 x 32 bit OR 16 x 16 bit data registers
     • 2 x 40 bit accumulators
     DATA REGISTER SYNTAX:
     • R0, R1 etc. refer to 32 bit registers
     • R0.L refers to the low 16 bits of the R0 32 bit register
     • R0.H refers to the high 16 bits of the R0 register
     ACCUMULATOR SYNTAX:
     • A0.L => low 16 bits
     • A0.H => next 16 bits
     • A0.W => least significant 32 bit word
     • A0.X => most significant 8 bit extension
     SHARC comparison: 16 32-bit data registers, integer and float. There is a pair of SHARC accumulator registers too.
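The register and accumulator naming on this slide can be illustrated with a few bit masks. This is a minimal Python sketch, not Blackfin code: the helper names (`reg_low`, `reg_high`, `acc_fields`) are hypothetical, and registers are modelled as plain unsigned integers.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF
MASK40 = 0xFFFFFFFFFF  # accumulators are 40 bits wide


def reg_low(r):
    """Model of Rn.L: the low 16 bits of a 32 bit data register."""
    return r & MASK16


def reg_high(r):
    """Model of Rn.H: the high 16 bits of a 32 bit data register."""
    return (r >> 16) & MASK16


def acc_fields(a):
    """Split a 40 bit accumulator value into the .L, .H, .W and .X fields."""
    a &= MASK40
    return {
        "L": a & MASK16,          # low 16 bits
        "H": (a >> 16) & MASK16,  # next 16 bits
        "W": a & MASK32,          # least significant 32 bit word
        "X": (a >> 32) & 0xFF,    # most significant 8 bit extension
    }


r0 = 0x12345678
print(hex(reg_low(r0)), hex(reg_high(r0)))
print({k: hex(v) for k, v in acc_fields(0xAB12345678).items()})
```

Note that .W overlaps .L and .H: it is the same 32 bits viewed as one word, with .X holding the 8 bits above it.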

  9. ALU Data Flow
     • 2 x 32 bit paths to dual Multiplier/ALU units
     • 2 x 32 bit paths back to register file

  10. Sample instructions

  11. ALU Features
     ALU operations can be:
     • Single 16 bit OPS
     • Dual 16 bit OPS
     • Dual 16 bit Cross
     • Single 32 bit OPS
     [Register diagrams for Rm, Rp and Rn not reproduced.]

  12. ALU Sample Instructions
     Instruction groups shown: single 16 bit ops, dual 16 bit ops, quad 16 bit ops, single 32 bit ops, dual 32 bit ops.
     [Instruction examples not reproduced. Callouts on the examples: "Does not work in parallel"; "Must have this option"; "Operator order is important: + must come before -".]
     • A & B registers must stay on the same side of the '|' for both instructions
     • For dual and quad 16 bit operations the (CO) option causes the destination registers to cross
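A dual 16 bit operation and the (CO) option can be sketched in Python as follows. This is an illustration under stated assumptions, not the hardware definition: the function name `dual16` is hypothetical, results wrap modulo 2^16 (no saturation option is modelled), and "cross" is taken to mean that the two 16 bit results swap halves in the destination.

```python
def dual16(rs0, rs1, op_hi="+", op_lo="+", cross=False):
    """Model of Rd = Rs0 <op_hi>|<op_lo> Rs1 on packed 16 bit halves.

    op_hi is applied to the high halves, op_lo to the low halves.
    Assumption: results wrap to 16 bits; saturation is not modelled.
    """
    def apply(op, a, b):
        return (a + b if op == "+" else a - b) & 0xFFFF

    hi = apply(op_hi, (rs0 >> 16) & 0xFFFF, (rs1 >> 16) & 0xFFFF)
    lo = apply(op_lo, rs0 & 0xFFFF, rs1 & 0xFFFF)
    if cross:
        # Assumed meaning of (CO): swap the two results in the destination.
        hi, lo = lo, hi
    return (hi << 16) | lo


r2, r4 = 0x00030004, 0x00010002
print(hex(dual16(r2, r4, "+", "+")))              # sums of both halves
print(hex(dual16(r2, r4, "-", "-")))              # differences of both halves
print(hex(dual16(r2, r4, "+", "-", cross=True)))  # sum and difference, crossed
```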

  13. Multiply Data Flow
     • 2 x 32 bit paths to dual Multiplier/ALU units
     • Multipliers share the same operand/result buses as the ALUs
     • 2 x 40 bit accumulators
     • 2 x 32 bit paths back to register file

  14. Multiply Features
     [Diagram not reproduced: operand half combinations H x H, L x L, H x L, H x L.]
     • Multiplies are signed fractional by default
     • Signed fractional multiply result is automatically left shifted 1 bit
     • Signed fractional multiply != signed integer multiply
     • Rounding is available on fractional number multiplies and, as a special option, on integer number multiplies
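The difference between the two multiply modes comes down to the 1 bit left shift. The Python sketch below compares them on 16 bit operands; the helper names are hypothetical, and clamping the single overflow case (0x8000 x 0x8000) to 0x7FFFFFFF is an assumption about the hardware rather than something stated on the slide.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mul_int(a, b):
    """Signed integer multiply: plain 16 x 16 -> 32 bit product."""
    return to_signed16(a) * to_signed16(b)


def mul_frac(a, b):
    """Signed fractional (1.15 x 1.15 -> 1.31) multiply.

    The product is shifted left 1 bit to remove the duplicate sign bit.
    Assumption: -1.0 * -1.0 (0x8000 * 0x8000) is clamped to 0x7FFFFFFF.
    """
    product = (to_signed16(a) * to_signed16(b)) << 1
    return min(product, 0x7FFFFFFF)


half = 0x4000  # 0.5 in 1.15 format, 16384 as an integer
print(hex(mul_int(half, half)))       # integer product
print(hex(mul_frac(half, half)))      # fractional product, twice as large
print(mul_frac(half, half) / 2**31)   # the same result read as a 1.31 fraction
```

Read as fractions, 0.5 x 0.5 gives 0.25 only in the shifted result; the unshifted integer product would read as 0.125 in 1.31 format.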

  15. Rounding
     [Diagrams not reproduced. 2 cases are shown: in each, 0x8000 is added to a 32 bit result formed from Rm and Rp, and the top 16 bits go to the destination register Rd.]
     Rounding adds 0x8000 to the 32 bit multiplier result or accumulator value before extracting a 16 bit value to the destination register.
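The rounding rule above amounts to one addition before the shift. A minimal Python sketch, with hypothetical function names, follows; overflow or saturation of the sum and any alternative rounding modes are not modelled.

```python
def extract_truncated(value32):
    """Take the top 16 bits of a 32 bit value with no rounding."""
    return (value32 >> 16) & 0xFFFF


def extract_rounded(value32):
    """Add 0x8000 (half of the discarded range), then take the top 16 bits.

    Overflow of the sum is not modelled in this sketch.
    """
    return ((value32 + 0x8000) >> 16) & 0xFFFF


for v in (0x12347FFF, 0x12348000):
    print(hex(v), hex(extract_truncated(v)), hex(extract_rounded(v)))
```

A discarded low half of 0x8000 or more pushes the extracted value up by one; anything below leaves it unchanged.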

  16. Fractional Multiply
     Fractional Multiply != Integer Multiply
     • When extracting a 16 bit fractional value from an accumulator, the high 16 bits are taken
     • Where in the destination register the value goes depends on which accumulator is being extracted from

  17. Integer Multiply
     Fractional Multiply != Integer Multiply
     • When extracting a 16 bit integer value from an accumulator, the low 16 bits are taken
     • Where in the destination register the 16 bit value goes depends on which accumulator is being extracted from
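The two extraction rules (high 16 bits for fractional values, low 16 bits for integer values) and the accumulator-dependent placement can be sketched in Python. The function names are hypothetical. The placement mapping (A1 to the high half, A0 to the low half) is inferred from the sample instruction `R3.H = (A1 += ...), R3.L = (A0 += ...)` on the next slide and should be read as an assumption.

```python
def extract16(acc, fractional=True):
    """Pick 16 bits out of the low 32 bit word of an accumulator value.

    Fractional mode takes bits 31..16; integer mode takes bits 15..0.
    Rounding and saturation are not modelled.
    """
    return (acc >> 16) & 0xFFFF if fractional else acc & 0xFFFF


def place16(dest, value, acc_index):
    """Write a 16 bit value into one half of a 32 bit destination.

    Assumption: A1 results land in the high half, A0 results in the low half.
    """
    if acc_index == 1:
        return (dest & 0x0000FFFF) | ((value & 0xFFFF) << 16)
    return (dest & 0xFFFF0000) | (value & 0xFFFF)


acc = 0x0012345678
r3 = 0xAAAABBBB
print(hex(place16(r3, extract16(acc, fractional=True), acc_index=1)))
print(hex(place16(r3, extract16(acc, fractional=False), acc_index=0)))
```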

  18. Multiply Sample Instructions
     Multi-issue MAC instruction examples:
       A1 += R1.H * R2.L , A0 += R1.L * R2.L;
     16 bit extraction from ACC 1 and from ACC 0:
       R3.H = (A1 += R1.H * R2.L) , R3.L = (A0 += R1.L * R2.L);
     Any combination of .H and .L in the 2 operands is allowed.
     32 bit extraction:
       R3 = (A1 += R1.H*R2.L), R2 = (A0 += R1.L * R2.L);
     Where destination registers must be paired as follows: R[1,0], R[3,2], R[5,4] and R[7,6].
     Extraction from one accumulator only:
       R3.H = (A1 += R1.H * R2.L), A0 += R1.L * R2.L;
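The effect of the dual MAC instruction `A1 += R1.H * R2.L , A0 += R1.L * R2.L;` can be modelled step by step in Python. This is a sketch under assumptions: the names `s16`, `frac_product` and `dual_mac` are hypothetical, the default signed fractional mode is used, accumulators wrap at 40 bits, and saturation and rounding are not modelled.

```python
MASK40 = 0xFFFFFFFFFF  # accumulators are 40 bits wide


def s16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def frac_product(a, b):
    """Signed fractional product: 16 x 16 multiply, then shift left 1 bit."""
    return (s16(a) * s16(b)) << 1


def dual_mac(a1, a0, r1, r2):
    """Model of: A1 += R1.H * R2.L , A0 += R1.L * R2.L;"""
    r1_h, r1_l, r2_l = r1 >> 16, r1, r2  # s16() masks each to 16 bits
    a1 = (a1 + frac_product(r1_h, r2_l)) & MASK40
    a0 = (a0 + frac_product(r1_l, r2_l)) & MASK40
    return a1, a0


r1 = 0x40002000  # R1.H = 0.5, R1.L = 0.25 in 1.15 format
r2 = 0x00004000  # R2.L = 0.5
a1 = a0 = 0
for _ in range(2):
    a1, a0 = dual_mac(a1, a0, r1, r2)

# 16 bit fractional extraction: high 16 bits of each accumulator word
r3 = (((a1 >> 16) & 0xFFFF) << 16) | ((a0 >> 16) & 0xFFFF)
print(hex(a1), hex(a0), hex(r3))
```

After two accumulations A1 holds 2 x 0.25 = 0.5 and A0 holds 2 x 0.125 = 0.25, which is what the packed R3 halves show in 1.15 format.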

  19. Shifter Sample Instructions
     [Instruction examples not reproduced. Groups shown: arithmetic shift; 3 op register shift; 3 op immediate shift; 2 operator register shifts; 2 operator immediate shifts.]

  20. Parallel Instruction Examples
     • In general there are 16 and 32 bit versions of the arithmetic instructions
     • Most of the 32 bit instructions can be executed in parallel with 2 x 16 bit memory/index operations
     • Exceptions are DIVS, DIVQ and MULTIPLY with 32 bit operands
     • || means parallel
     Examples:
       A1=R2.L*R1.L, A0=R2.H*R1.H || R2.H=W[I2++] || [I3++]=R3;
       R2=R2+|+R4, R4=R2-|-R4 || I0+=M0 || R1=[I0];
