Infiniband Architecture

Presentation Transcript


  1. Infiniband Architecture Aniruddha Bohra CS 545 - Distributed Systems

  2. Distributed Applications and Data Transfer • Traditional distributed applications • Need low latency message delivery • Data volume in transfers between nodes is not too high • Server applications • Need low latency and high bandwidth data transfers • Data volumes in transfers are high, e.g. in cluster-based storage or streaming multimedia servers • Need reliable and available services • Need easy maintenance

  3. Traditional message send • [Figure labels: Application, System Call, Memory buffers, TCP, sendmsg, Copy from user space, Kernel, IP and lower layers, Backup buffers, To NIC] • One kernel boundary crossing • Two memory copies
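The send path on this slide can be made concrete with a minimal sketch, assuming an ordinary stream socket on a POSIX-like system. The local `socketpair` stands in for two networked hosts, and the names `sender`, `receiver` and `payload` are illustrative only. The kernel-side copies are not visible in the code; they happen inside the system calls.

```python
import socket

# A connected pair of local sockets stands in for two networked hosts.
sender, receiver = socket.socketpair()

payload = b"x" * 4096  # user-space buffer owned by the application

# sendall() issues the send system call: the process crosses into the
# kernel, which copies the payload out of user space into its own buffers.
sender.sendall(payload)

# recv() crosses the kernel boundary on the receiving side and copies
# the data back out into a new user-space bytes object.
received = b""
while len(received) < len(payload):
    received += receiver.recv(len(payload) - len(received))

sender.close()
receiver.close()
```

Every message pays for the kernel crossing and the copies, which is the overhead that user-level networking and IBA channel adapters aim to remove.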

  4. Lessons from parallel computing • Co-processors that can access memory directly are used for communication • FLASH, J-Machine, Alewife • User level networking • Virtual Memory Mapped Communication • U-Net • VMMC • VIA

  5. Interconnect bottleneck • Servers require high data transfer rate • CPUs operate at GHz speed • Gigabit Ethernet is commonly used in cluster-based servers • Data volumes are high • PCI bus is much slower • operates at 32 bit/33 MHz or 64 bit/66 MHz • the next generation bus PCI-X operates at 133 MHz
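The bus figures above translate into peak bandwidth with simple arithmetic: bus width times clock rate. The sketch below uses nominal clock rates and ignores protocol overhead and bus sharing; the 64-bit width for PCI-X is an assumption, since the slide gives only its clock rate.

```python
def peak_bandwidth_mbytes(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus: one transfer of width_bits per clock."""
    return width_bits * clock_mhz / 8  # Mbit/s converted to MB/s


buses = {
    "PCI 32 bit/33 MHz": (32, 33),
    "PCI 64 bit/66 MHz": (64, 66),
    "PCI-X 64 bit/133 MHz": (64, 133),  # 64-bit width is an assumption
}

# One Gigabit Ethernet link, one direction, for comparison.
GIGABIT_ETHERNET_MBYTES = 1000 / 8

for name, (width, clock) in buses.items():
    print(f"{name}: {peak_bandwidth_mbytes(width, clock):.0f} MB/s peak")
print(f"Gigabit Ethernet: {GIGABIT_ETHERNET_MBYTES:.0f} MB/s per direction")
```

The basic 32 bit/33 MHz bus peaks at 132 MB/s, shared among all devices on it, which is barely above what a single Gigabit Ethernet link can carry in one direction.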

  6. Some solutions • HyperTransport • Runs at 800 MHz full duplex • Bridges with current buses and other HyperTransport buses • 3GIO • Switch based • Provides a layered implementation • Promises more than 40 Gb/s transfer rate

  7. More problems with bus based interconnects • Cannot keep up with the increasing CPU and peripheral speed • Bus is shared between all peripherals • The pin count is high – PCB space is limited • Buses cannot extend over long distances • Do not support a large number of devices

  8. Outline • Motivation and background • Infiniband architecture • Infiniband components • Infiniband operation • Other Infiniband features • Status • Summary

  9. Infiniband Architecture • Provides switch based interconnect • Increased reliability • Scalable and easily maintainable • Supports memory to memory communication • Low latency communication • Provides support for “out of box” components • Scalable • Easier to manage and operate • Is complementary to the 3GIO and HyperTransport buses

  10. What is Infiniband? • Infiniband Architecture (IBA) defines a System Area Network (SAN) • IBA SAN is a communications and management infrastructure for I/O and IPC • IBA defines a switched communications fabric • high bandwidth and low latency • protected, remotely managed environment • IBA hardware offloads much of the I/O communication work from the CPU

  11. An IBA SAN

  12. Outline • Motivation and background • Infiniband architecture • Infiniband components • Infiniband operation • Other Infiniband features • Status • Summary

  13. Topologies and components • IBA serves as an interconnect for endnodes • A node can be a processor node, an I/O unit and/or a router to another network • [Figure: nodes attached to the Infiniband fabric]

  14. Topologies and Components • An IBA network is subdivided into subnets interconnected by routers • Endnodes can attach to a single subnet or to multiple subnets • An IBA subnet is composed of endnodes, switches, routers and subnet managers • Each IBA device may attach to a single switch or multiple switches and/or directly to other devices

  15. IBA device – processor node • [Figure labels: Consumer (three), Verbs, Message and Data Service, Channel Adapter (endnode) (two), Port (four)]

  16. Processor node • Each channel adapter constitutes a node on the fabric • Architecture supports multiple channel adapters per unit with each adapter providing one or more ports to the fabric • Message and Data service is an OS component • Verbs describe the functions to configure, manage and operate a host channel adapter • Verbs are not an API but provide the framework for the OS to specify one

  17. Channel Adapter • An IBA channel adapter (CA) is a programmable DMA engine with special protection features that allow DMA operations to be initiated locally and remotely • Host Channel Adapter (HCA) provides a consumer interface providing the functions specified by IBA verbs • Target Channel Adapter (TCA) provides an interface to the device

  18. Channel Adapter

  19. Addressing in IBA • Each endnode has one or more CAs and each CA has one or more ports • Each Queue Pair (QP) has a QP number (QPN) assigned by the CA • Each port has a Local ID (LID) that is unique within its subnet and at least one Global ID (GID), an address in IPv6 format
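The addressing hierarchy on this slide can be pictured as a small data model. This is a sketch only: the class names `Port` and `ChannelAdapter`, the method `create_qp`, and the example LID and GID values are hypothetical, not taken from the slides or any real verbs API. QPNs start at 2 here because QP0 and QP1 are reserved for management traffic in IBA.

```python
from dataclasses import dataclass, field
from ipaddress import IPv6Address
from typing import Dict, List


@dataclass
class Port:
    lid: int                  # Local ID, unique within the subnet
    gids: List[IPv6Address]   # at least one Global ID in IPv6 format


@dataclass
class ChannelAdapter:
    ports: List[Port]
    qps: Dict[int, Port] = field(default_factory=dict)  # QPN -> owning port
    next_qpn: int = 2         # QP0 and QP1 are reserved for management

    def create_qp(self, port_index: int) -> int:
        """Allocate a queue pair on a port; the CA assigns the QPN."""
        qpn = self.next_qpn
        self.next_qpn += 1
        self.qps[qpn] = self.ports[port_index]
        return qpn


# Illustrative values only.
ca = ChannelAdapter(ports=[Port(lid=0x0004, gids=[IPv6Address("fe80::2:c903:0:1")])])
qpn_a = ca.create_qp(0)
qpn_b = ca.create_qp(0)

# Within a subnet, a communication endpoint is identified by (LID, QPN).
address = (ca.ports[0].lid, qpn_a)
```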

  20. Switches • Do not generate or consume packets – pass them along based on the destination address • Are the routing components for intra-subnet routing – support unicast or multicast • Every destination is configured with one or more unique Local IDs (LIDs) • Subnet manager configures switches including loading their forwarding tables
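A switch's job, as described above, reduces to a table lookup on the destination LID. The sketch below is a simplified model: the `Switch` class, the dictionary-based packet and the `dlid` field name are hypothetical, and multicast is left out.

```python
class Switch:
    def __init__(self, name):
        self.name = name
        self.forwarding_table = {}  # destination LID -> output port number

    def load_forwarding_table(self, table):
        """Called by the subnet manager; the switch does not learn routes itself."""
        self.forwarding_table = dict(table)

    def forward(self, packet):
        """Return the output port for a packet, or None if the LID is unknown."""
        return self.forwarding_table.get(packet["dlid"])


switch = Switch("sw1")
switch.load_forwarding_table({4: 1, 5: 1, 9: 3})
out_port = switch.forward({"dlid": 9, "payload": b"data"})
```

Because the table is loaded centrally, the switch itself stays simple: it neither generates nor consumes data packets, it only relays them.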

  21. Routers • Routers are inter-subnet routing elements • Routers forward packets based on the packet’s global route header • Routers expose one or more ports between which packets are relayed • IPv6 specifies the protocol performed between routers to derive their routing tables

  22. Subnet Managers • A Subnet Manager (SM) is an entity attached to a subnet responsible for its management • Tasks • Discover topology • Configure each CA port with a range of LIDs, GIDs, subnet prefix and Partition_Keys • Maintain LID/GID resolution tables
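The first two subnet manager tasks can be sketched as a graph sweep. This is a deliberately simplified model, not the actual discovery protocol: it assumes the topology is already available as an adjacency map, assigns a single LID per node rather than a range per port, and the function and node names are hypothetical.

```python
from collections import deque


def discover_and_assign_lids(links, start, first_lid=1):
    """Breadth-first sweep from the SM's node; returns {node: lid}."""
    lids = {}
    queue = deque([start])
    seen = {start}
    next_lid = first_lid
    while queue:
        node = queue.popleft()
        lids[node] = next_lid      # assign the next free LID on discovery
        next_lid += 1
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return lids


# Hypothetical subnet: two switches, three hosts.
links = {
    "sm_host": ["switch1"],
    "switch1": ["sm_host", "host_a", "switch2"],
    "switch2": ["switch1", "host_b"],
    "host_a": ["switch1"],
    "host_b": ["switch2"],
}
lids = discover_and_assign_lids(links, "sm_host")
```

The resulting map is the kind of information the SM would then use to build each switch's forwarding table.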

  23. Outline • Motivation and background • Infiniband architecture • Infiniband components • Infiniband operation • Other Infiniband features • Status • Summary

  24. Communication • Queuing • Consumer queues up a set of instructions for hardware to execute (Work Queue) • Work queues are created in pairs (Queue Pairs – QP) for send and receive operations • Each work queue has a corresponding Completion Queue
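The queuing model can be illustrated with a software stand-in for the channel adapter. In this sketch the `QueuePair` class, the `process_one` function and the tuple layout of completion entries are all hypothetical; real hardware processes work requests asynchronously, while here one step is executed explicitly.

```python
from collections import deque


class QueuePair:
    def __init__(self, qpn, completion_queue):
        self.qpn = qpn
        self.send_queue = deque()     # work requests waiting to be sent
        self.receive_queue = deque()  # buffers posted for incoming messages
        self.cq = completion_queue

    def post_send(self, wr_id, data):
        self.send_queue.append((wr_id, data))

    def post_receive(self, wr_id, buffer):
        self.receive_queue.append((wr_id, buffer))


def process_one(sender, receiver):
    """Stand-in for the hardware: execute one SEND against one posted receive."""
    send_id, data = sender.send_queue.popleft()
    recv_id, buffer = receiver.receive_queue.popleft()
    if len(data) > len(buffer):
        raise ValueError("posted receive buffer is too small")
    buffer[:len(data)] = data                      # deliver into the posted buffer
    sender.cq.append(("send", send_id))            # completion for the sender
    receiver.cq.append(("recv", recv_id, len(data)))  # completion for the receiver


cq_a, cq_b = deque(), deque()
qp_a, qp_b = QueuePair(2, cq_a), QueuePair(3, cq_b)

recv_buf = bytearray(16)
qp_b.post_receive(wr_id=100, buffer=recv_buf)  # receiver posts a buffer first
qp_a.post_send(wr_id=7, data=b"hello")
process_one(qp_a, qp_b)
```

The consumer never blocks in a system call: it posts work, continues, and later polls the completion queue to learn the outcome.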

  25. Work Queue Operations • Send operations • SEND • Specifies a block in memory space to send to the destination • RDMA • RDMA_READ, RDMA_WRITE, ATOMIC • Memory Binding • Alters the memory binding relationship – gives the R_KEY to components, which allows secure DMA • Receive operation • Specifies a receive data buffer
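How an R_KEY makes remote DMA safe can be shown with a small model of an RDMA write. The names `RegisteredMemory`, `RemoteAccessError` and `rdma_write` are hypothetical, and the random 32-bit key is a stand-in for whatever value a real channel adapter would hand out at registration time.

```python
import secrets


class RemoteAccessError(Exception):
    """Raised when a remote access fails the protection checks."""


class RegisteredMemory:
    def __init__(self, size):
        self.buffer = bytearray(size)
        self.r_key = secrets.randbits(32)  # handed to peers allowed to access it


def rdma_write(region, r_key, offset, data):
    """Write into remote memory without involving the remote CPU."""
    if r_key != region.r_key:
        raise RemoteAccessError("R_KEY mismatch")
    if offset < 0 or offset + len(data) > len(region.buffer):
        raise RemoteAccessError("access outside the registered region")
    region.buffer[offset:offset + len(data)] = data


region = RegisteredMemory(8)
rdma_write(region, region.r_key, 2, b"ab")  # succeeds: key and bounds are valid
```

A peer that does not hold the key, or that addresses memory outside the registered region, is rejected before any byte is written.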

  26. Work Queue Operations

  27. Communication Stack

  28. Keys • Keys are used to provide isolation and protection • M_KEY • Enforces the control of a master Subnet Manager • B_KEY • Enforces control of a baseboard Subnet Manager • P_KEY • Enforces membership in a partition • Q_KEY • Enforces access rights for reliable or unreliable datagram service • L_KEY and R_KEY • Provide access rights to local (L_KEY) and remote (R_KEY) registered memory

  29. Outline • Motivation and background • Infiniband architecture • Infiniband components • Infiniband operation • Other Infiniband features • Status • Summary

  30. Virtual Lanes • A virtual lane represents a set of transmit and receive buffers in a port • VL15 is used for subnet management • Each port must have at least one data VL • Separate flow control is maintained over each VL

  31. Service Levels • Service levels (SLs) are maintained by attaching a VL to an SL • IBA does not specify any QoS levels (e.g. best effort) • The SMA must keep a mapping of Service Level to Virtual Lane and propagate it through the switch
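The SL-to-VL mapping can be sketched as a lookup table. The round-robin policy below is an assumption chosen for illustration, since IBA leaves the QoS policy open; the function name `build_sl_to_vl` is hypothetical. The sketch relies on the service level being a 4-bit field (16 levels) and on data traffic using VL0 to VL14, with VL15 reserved for subnet management.

```python
MANAGEMENT_VL = 15   # reserved for subnet management traffic
NUM_SERVICE_LEVELS = 16  # the SL is a 4-bit field


def build_sl_to_vl(num_data_vls):
    """Spread the service levels over the data VLs a port implements."""
    if not 1 <= num_data_vls <= 15:
        raise ValueError("a port has between 1 and 15 data VLs")
    # Round-robin assignment: an illustrative policy, not mandated by IBA.
    return {sl: sl % num_data_vls for sl in range(NUM_SERVICE_LEVELS)}


sl_to_vl = build_sl_to_vl(4)  # a port with data VLs 0..3
```

With four data VLs, service levels 0, 4, 8 and 12 share VL0 and therefore share its buffers and flow control, while traffic on other VLs is unaffected.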

  32. Status • Intel Developer Forum had several status talks • http://www.intel.com/idf/us • IBA enabled network storage has been demonstrated at industry shows • Banderacom • Windriver • The first products are expected to be on the market by the middle of 2002

  33. Summary • Future bandwidth requirements for servers will make the interconnect a bottleneck – IBA is an attempt to alleviate the problem • IBA provides a thorough migration from a bus based to a switch based architecture while maintaining interoperability • Further deployment is needed to reveal other issues that would arise in operation
