InfiniBand is a switched fabric communications link primarily used in high-performance computing. Its features include quality of service and failover, and it is designed to be scalable. The InfiniBand architecture specification defines a connection between processor nodes and high-performance I/O nodes such as storage devices. It is a superset of the Virtual Interface Architecture.
Description
Effective theoretical throughput in different configurations
| | Single | Double | Quad |
|---|---|---|---|
| 1X | 2 Gbit/s | 4 Gbit/s | 8 Gbit/s |
| 4X | 8 Gbit/s | 16 Gbit/s | 32 Gbit/s |
| 12X | 24 Gbit/s | 48 Gbit/s | 96 Gbit/s |
Like Fibre Channel, PCI Express, Serial ATA, and many other modern interconnects, InfiniBand is a point-to-point bidirectional serial link intended for the connection of processors with high-speed peripherals such as disks. It supports several signalling rates and, as with PCI Express, links can be bonded together for additional bandwidth.
Signalling rate
The serial connection's signalling rate is 2.5 gigabits per second (Gbit/s) in each direction per connection. InfiniBand also supports double and quad data rates, for 5 Gbit/s or 10 Gbit/s respectively.
Links use 8B/10B encoding, in which every 10 bits sent carry 8 bits of data, so the useful data transmission rate is four-fifths of the raw rate. Thus single, double, and quad data rates carry 2, 4, or 8 Gbit/s respectively.
Links can be aggregated in units of 4 or 12, called 4X or 12X. A quad-rate 12X link therefore carries 120 Gbit/s raw, or 96 Gbit/s of useful data. Most systems today use a 4X 2.5 Gbit/s connection, though the first 5 Gbit/s products are already entering the market. Larger systems with 12X links are typically used for cluster and supercomputer interconnects and for inter-switch connections.
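The arithmetic in this section can be checked with a short sketch. The Python below is illustrative only: the constant and function names are invented for this example and are not part of any InfiniBand API. It assumes the 2.5 Gbit/s base signalling rate and the 8B/10B encoding described above.

```python
# Hypothetical helper names; only the numbers come from the text above.
BASE_SIGNALLING_GBIT = 2.5   # raw single-rate signalling per lane, per direction
RATE_MULTIPLIER = {"single": 1, "double": 2, "quad": 4}
LINK_WIDTHS = (1, 4, 12)     # 1X, 4X and 12X links


def raw_rate(width, rate):
    """Raw signalling rate in Gbit/s for a link of the given width and data rate."""
    return BASE_SIGNALLING_GBIT * RATE_MULTIPLIER[rate] * width


def data_rate(width, rate):
    """Useful data rate in Gbit/s: 8B/10B carries 8 data bits per 10 bits sent."""
    return raw_rate(width, rate) * 8 / 10


if __name__ == "__main__":
    # Print one row per link width, one column per data rate.
    for width in LINK_WIDTHS:
        cells = ", ".join(
            f"{rate}: {data_rate(width, rate):g} Gbit/s" for rate in RATE_MULTIPLIER
        )
        print(f"{width}X  {cells}")
```

For example, `raw_rate(12, "quad")` gives 120.0 and `data_rate(12, "quad")` gives 96.0, matching the quad-rate 12X figures quoted in the text.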
Latency
Current switch chips have a latency of 200 nanoseconds, but the end-points contribute far more to the time needed to send a message. Total latency therefore ranges from 1.29 microseconds (QLogic InfiniPath HTX HCAs) to 2.6 microseconds (Voltaire Grid Switch ISR 6000 with Mellanox DDR HCAs).
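To make the split between switch and end-point latency concrete, here is a rough sketch. It assumes a purely additive model (total = switch hops × 200 ns + end-point overhead) and a single switch hop. Both are assumptions made for illustration, since the text does not state how many switches the quoted measurements traversed, and the function names are hypothetical.

```python
SWITCH_LATENCY_NS = 200  # per-switch figure quoted in the text


def endpoint_latency_ns(total_ns, switch_hops=1):
    """Latency left for the end-points under an assumed additive model."""
    fabric_ns = switch_hops * SWITCH_LATENCY_NS
    if fabric_ns > total_ns:
        raise ValueError("switch latency exceeds the total latency")
    return total_ns - fabric_ns


def endpoint_share(total_ns, switch_hops=1):
    """Fraction of the total latency attributed to the end-points."""
    return endpoint_latency_ns(total_ns, switch_hops) / total_ns


if __name__ == "__main__":
    # Totals quoted in the text: 1.29 and 2.6 microseconds.
    for total_ns in (1290, 2600):
        print(total_ns, endpoint_latency_ns(total_ns), round(endpoint_share(total_ns), 2))
```

Under these assumptions the end-points account for 1090 ns of the 1.29 microsecond figure and 2400 ns of the 2.6 microsecond figure, which is well over four-fifths of the total in both cases.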
Topology
InfiniBand uses a switched fabric topology, as opposed to a hierarchical switched network like Ethernet.
Like the channel model used in most mainframe computers, all transmissions begin or end at a channel adapter. Each processor contains a host channel adapter (HCA) and each peripheral has a target channel adapter (TCA). These adapters can also exchange information for security or quality of service.
Messages
Data is transmitted in packets of up to 4 kB that are taken together to form a message. A message can be:
★ a direct memory access read from, or write to, a remote node (RDMA)
★ a channel send or receive
★ a transaction-based operation (that can be reversed)
★ a multicast transmission
★ an atomic operation
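The relationship between packets and messages can be sketched as follows. This is a simplified illustration, not the wire format: it assumes 4 kB means 4096 bytes of payload, ignores packet headers and checksums entirely, and uses hypothetical function names.

```python
MAX_PAYLOAD_BYTES = 4096  # assumed interpretation of "up to 4 kB"


def split_into_packets(message, max_payload=MAX_PAYLOAD_BYTES):
    """Cut a message into consecutive payload chunks of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # An empty message yields no chunks in this sketch.
    return [message[i:i + max_payload] for i in range(0, len(message), max_payload)]


def reassemble(packets):
    """Concatenate in-order payload chunks back into the original message."""
    return b"".join(packets)


if __name__ == "__main__":
    message = bytes(10000)
    packets = split_into_packets(message)
    print(len(packets), [len(p) for p in packets])
    print(reassemble(packets) == message)
```

A 10,000-byte message is split here into three packets of 4096, 4096 and 1808 bytes, and joining them in order restores the original message.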
Programming
One caveat is that InfiniBand has no standard programming interface. The standard only lists a set of "verbs", functions that must exist. The syntax of these functions is left to the vendors. The most common to date has been the verbs API ("VAPI") from Mellanox. The OpenFabrics Alliance is creating an open source software stack for InfiniBand and iWARP that includes the "IBVerbs" library.
History
InfiniBand is the result of merging two competing designs: Future I/O, developed by Compaq, IBM, and Hewlett-Packard, and Next Generation I/O (ngio), developed by Intel, Microsoft, and Sun. From the Compaq side, the roots were derived from Tandem's ServerNet. For a short time before the group came up with a new name, InfiniBand was called System I/O.
InfiniBand was originally envisioned as a comprehensive "system area network" that would connect CPUs and provide all high-speed I/O for "back-office" applications. In this role it would potentially replace just about every datacenter I/O standard, including PCI, Fibre Channel, and various networks like Ethernet. Instead, all of the CPUs and peripherals would be connected into a single pan-datacenter switched InfiniBand fabric. This vision offered a number of advantages in addition to greater speed, not the least of which was that the I/O workload would be largely lifted from computers and storage devices. In theory, this should make the construction of clusters much easier, and potentially less expensive, because more devices could be shared and they could be easily moved around as workloads shifted. Proponents of a less comprehensive vision saw InfiniBand as a pervasive, low-latency, high-bandwidth, low-overhead interconnect for commercial datacenters, albeit one that might only connect servers and storage to each other, while leaving more local connections to other protocols and standards such as PCI.
So far, InfiniBand has seen more limited use. It is used today mostly for performance-focused computer cluster applications, and there are some efforts to adapt InfiniBand as a "standard" interconnect between low-cost machines for either commercial or technical applications. However, a number of the TOP500 supercomputers have used InfiniBand, including the low-cost System X built by Virginia Tech. In another example of InfiniBand use within high-performance computing, the Cray XD1 uses built-in Mellanox InfiniBand switches to create a fabric between Opteron and HyperTransport-based compute nodes.
SGI, among others, has also released storage utilizing LSI products with InfiniBand "target adapters". This product essentially competes with architectures such as Fibre Channel, iSCSI, and other traditional storage area networks. Such target adapter-based disks would become a part of the fabric of a given network, in a fashion similar to DEC VMS clustering. The advantage of this configuration would be lower latency and higher availability to nodes on the network (because of the fabric nature of the network).
The cable InfiniBand uses (CX4) is also commonly used to connect Serial Attached SCSI (SAS) HBAs to external SAS disk arrays.
See also
★ InfiniBand Trade Association
★ List of device bandwidths
External links
★ An Introduction to the InfiniBand Architecture
★ The InfiniBand Trade Association homepage
★ An InfiniBand™ Technology Overview
★ Is InfiniBand poised for a comeback?
★ InfiniBand edging into storage market
★ OpenFabrics Project SVN
★ OpenFabrics Alliance
★ InfiniBand simulation model