IXP 2400
4 Types of registers: • General purpose. • Synchronous Random Access Memory transfer (SRAM). • Dynamic Random Access Memory transfer (DRAM). • Next Neighbor (NN). Registers
256 for each Microengine. • 32 Bits. • Divided in 2 banks of registers (A and B). • 2 Modes of addressing: Thread-Local or Absolute mode. General Purpose Register
Each thread accesses a unique set of GPRs. • If configured to execute 8 threads, a total of 32 GPRs are allocated to each thread (16 in bank A and 16 in bank B). • If configured to execute 4 threads, a total of 64 GPRs are allocated to each thread (32 in bank A and 32 in bank B). Thread-Local
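The per-thread split above is plain division of each 128-register bank by the thread count. The sketch below works through that arithmetic. The function names are hypothetical, and the contiguous-block layout in `absolute_index` is an assumption made for illustration, not something the slides state about the hardware.

```python
GPRS_PER_BANK = 128  # 256 GPRs per microengine, split into banks A and B


def gprs_per_thread(num_threads):
    """Return (registers per bank, total registers) owned by each thread."""
    if num_threads not in (4, 8):
        raise ValueError("the slides describe only 4- and 8-thread modes")
    per_bank = GPRS_PER_BANK // num_threads
    return per_bank, 2 * per_bank


def absolute_index(thread_slot, relative_reg, num_threads):
    """Map a thread-local register number to an index within one bank.

    Assumption: each thread owns one contiguous block of the bank.
    """
    per_bank, _ = gprs_per_thread(num_threads)
    if not 0 <= thread_slot < num_threads:
        raise ValueError("thread slot out of range")
    if not 0 <= relative_reg < per_bank:
        raise ValueError("register number out of range for this mode")
    return thread_slot * per_bank + relative_reg
```

With 8 threads, `gprs_per_thread(8)` gives 16 registers per bank and 32 in total; with 4 threads it gives 32 and 64, matching the figures on the slide.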
A GPR is accessible by any thread on the microengine. • Absolute registers are useful for inter-thread communication within a microengine. • Each GPR can be accessed in either absolute or thread-local mode, as determined at compile time by the programmer. Absolute mode.
Context-relative name: Gpr123_gab. • Absolute name: @gpr123_fab. • Example “hello_world”: • “ .reg entries_left current_old_entry ” Syntax
Logical width: 4 Bytes. • Size: 128 MB in IXP2400. • Approx. latency: 90 clks. • Off-chip. • SRAM Unit (2 on the IXP2400) • Atomic operations support (thread Sync). Synchronous Random Access Memory transfer (SRAM).
256 for each Microengine. • 32 Bits. • Used to read from and write to all functional units on the IXP 2400 except for DRAM. • 2 Modes of addressing: Thread-Local or Global mode. • Used to read and write data to and from: • SRAM Unit. • SHaC Unit. • Media Switch Fabric Unit. • PCI Interfaces. Synchronous Random Access Memory transfer (SRAM).
When data is read from other functional units, it is placed in SRAM transfer registers. • When the Microengine writes data to the functional units, it must first be placed in transfer registers. • Divided into 2 types of registers: write registers and read registers. • Declaring a transfer register in assembly declares both the write and the read register. They share the same name but are physically separate registers.
Context-relative name: $_xfer • Indexed name: *$index • Special Indexing: *$index++, *$index-- Example Code: SRAM
Example Code:
// Set up a value to write into memory
.reg $my_xfer_reg
// Set bit 31 in the write transfer register
alu_shf[$my_xfer_reg, --, B, 1, <<31]
alu_shf[$my_xfer_reg, $my_xfer_reg, OR, ctx]
The first ALU shift modifies the write register named $my_xfer_reg. The second instruction logically ORs the context number with the read transfer register (not with the value just written) and places the result into the write transfer register.
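The pitfall in this example is easier to see with a small model. The sketch below is a hypothetical Python stand-in for one transfer register name backed by two physical registers; the class and method names are invented for illustration, and the read side is assumed to start at zero.

```python
class TransferRegister:
    """One name, two physical registers: a write side and a read side."""

    def __init__(self):
        self.write_side = 0  # value the microengine will send to a unit
        self.read_side = 0   # value last delivered by a functional unit

    def store(self, value):
        """Register used as an instruction destination: hits the write side."""
        self.write_side = value & 0xFFFFFFFF

    def load(self):
        """Register used as an instruction source: reads the read side."""
        return self.read_side

    def deliver(self, value):
        """A functional unit returns data into the read side."""
        self.read_side = value & 0xFFFFFFFF


def set_bit_then_or(ctx):
    """Mimic the two instructions of the slide's example."""
    reg = TransferRegister()
    reg.store(1 << 31)            # first instruction: set bit 31
    reg.store(reg.load() | ctx)   # second: source operand is the READ side
    return reg.write_side
```

In this model `set_bit_then_or(3)` leaves 3 in the write register rather than 0x80000003, because the source operand never sees the bit set by the first instruction.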
Logical width: 8 Bytes. • Size: 1 GB addressable in IXP2400. • Approx. latency: 120 clks. • Off-chip. • DRAM Unit (1 on the IXP2400) • No atomic operations support. • Unique functionality: ability to move data to and from the MSF unit without the data going through the microengines. Dynamic Random Access Memory transfer (DRAM).
256 for each Microengine. • 32 Bits. • Used to read from and write to DRAM functional unit on the IXP 2400. • Same 2 types of registers as the SRAM: write and read registers physically separated. • 2 Modes of addressing: Thread-Local or Global mode. Dynamic Random Access Memory transfer (DRAM).
Context-relative name: $$tmp • Indexed name: *$$index • Special Indexing: *$$index++, *$$index-- Example Code: DRAM
Global Addressing in SRAM and DRAM. • Global addressing is done using the T_INDEX register. • Only one T_INDEX register exists per microengine. It simultaneously refers to four different transfer registers: • SRAM read-only, SRAM write-only, DRAM read-only, DRAM write-only. • Any thread can modify the T_INDEX register, so each thread should reload it with the desired value before using it and after every context switch. Addressing the Registers.
The T_INDEX register is first loaded with the number of the transfer register to access (0-127); the pseudo-register *$index is then used to access the SRAM transfer register that T_INDEX points to.
.reg xfer_index
shf_left(xfer_index, &$array_data[0], 2)
add_shf_left(xfer_index, xfer_index, entries_to_move, 2)
sub_shf_left(xfer_index, xfer_index, 1, 2)
local_csr_wr[T_INDEX, xfer_index]
Example Code: Hello_World
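The three macro calls above can be followed step by step with ordinary integer arithmetic. The sketch below assumes that `shf_left(dst, src, n)` computes `src << n` and that `add_shf_left` / `sub_shf_left` add or subtract their last operand shifted left by `n`; those semantics and the function name `t_index_value` are assumptions for illustration.

```python
def t_index_value(array_base_reg, entries_to_move):
    """Reproduce the value the example writes to T_INDEX.

    array_base_reg  -- register number of $array_data[0] (hypothetical input)
    entries_to_move -- number of array entries being moved
    """
    xfer_index = array_base_reg << 2     # shf_left(xfer_index, base, 2)
    xfer_index += entries_to_move << 2   # add_shf_left(..., entries_to_move, 2)
    xfer_index -= 1 << 2                 # sub_shf_left(..., 1, 2)
    return xfer_index
```

Under these assumptions the result equals `(array_base_reg + entries_to_move - 1) << 2`, that is, the number of the last array entry placed two bits up in the value written to T_INDEX. For a base register of 8 and 4 entries, `t_index_value(8, 4) >> 2` is 11.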
128 for each Microengine. • 32 Bits. Two modes of operation: • Mode one: extra GPRs. • Mode two: data written to these registers is available in the next microengine. • “If ME 0 writes into a next-neighbor register, code on microengine 1 can read the data from its next-neighbor register.” Next-neighbor Registers.
Configured in mode 2, two CSRs in each microengine allow the code to treat the next-neighbor register as a 128-entry queue. • NN_GET and NN_PUT can be used as consumer and producer indexes into the array of 128 next-neighbor registers.
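The producer/consumer arrangement described here behaves like a ring buffer. The sketch below is a software model only: `nn_put` and `nn_get` stand in for the NN_PUT and NN_GET CSRs, and the `count` field used to detect full and empty states is an assumption of the model, not a description of how the hardware tracks occupancy.

```python
NN_REGS = 128  # next-neighbor registers per microengine


class NextNeighborRing:
    """Model of the next-neighbor registers used as a 128-entry queue."""

    def __init__(self):
        self.regs = [0] * NN_REGS
        self.nn_put = 0  # producer index (written by the previous microengine)
        self.nn_get = 0  # consumer index (read by this microengine)
        self.count = 0   # occupancy tracking, an assumption of this model

    def put(self, value):
        """Producer side: store a 32-bit value and advance nn_put."""
        if self.count == NN_REGS:
            raise OverflowError("ring is full")
        self.regs[self.nn_put] = value & 0xFFFFFFFF
        self.nn_put = (self.nn_put + 1) % NN_REGS
        self.count += 1

    def get(self):
        """Consumer side: fetch the oldest value and advance nn_get."""
        if self.count == 0:
            raise IndexError("ring is empty")
        value = self.regs[self.nn_get]
        self.nn_get = (self.nn_get + 1) % NN_REGS
        self.count -= 1
        return value
```

Values come out in the order they were put in, and both indexes wrap back to 0 after register 127.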
Each memory has a logical width that determines the minimum number of bytes accessed during any memory operation. • All accesses to a particular memory must be aligned to the memory’s logical width. • The hardware controller accepts byte addresses, but then simply masks off some number of the least-significant bits in the address to align it. • SRAM, local memory, and scratchpad: 32 bits (Bytes 0, 4, 8, etc.) • DRAM: 64 bits (Bytes 0, 8, 16, etc.). Two 32-bit transfer registers are required to read/write data from a single DRAM address. Logical Width
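The masking step can be shown in a few lines. The sketch below uses the widths from the slide (4 bytes for SRAM, local memory, and scratchpad; 8 bytes for DRAM); the dictionary keys and function names are hypothetical.

```python
LOGICAL_WIDTH_BYTES = {
    "sram": 4,
    "local_memory": 4,
    "scratchpad": 4,
    "dram": 8,
}


def align_address(memory, byte_address):
    """Mask off the low address bits, as the slide says the controller does."""
    width = LOGICAL_WIDTH_BYTES[memory]
    return byte_address & ~(width - 1)


def transfer_registers_needed(memory):
    """Number of 32-bit transfer registers that hold one logical word."""
    return LOGICAL_WIDTH_BYTES[memory] // 4
```

For byte address 0x1007, `align_address("sram", 0x1007)` yields 0x1004 while `align_address("dram", 0x1007)` yields 0x1000, and `transfer_registers_needed("dram")` is 2, matching the two-register requirement stated on the slide.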