What is Remote Direct Memory Access (RDMA)?
Remote Direct Memory Access, RDMA for short, makes it possible to transfer data quickly from one computer’s main memory to the main memory of another computer. This is especially helpful when dealing with large data sets or complex processes, such as those required for machine learning. But what exactly is RDMA and how does it work?
Definition: Remote Direct Memory Access (RDMA)
As with the related direct memory access (DMA), remote direct memory access (RDMA) transfers data directly from one computer’s main memory (working memory) to the main memory of another computer. The operating systems, CPUs and caches are bypassed for the data transfer itself, so fewer hardware resources are burdened. For this purpose, the transfer runs via a private buffer in which the application places its data. The technology relies on the systems’ network cards, which process the transfer via Ethernet or InfiniBand.
Remote Direct Memory Access is a technology that builds on direct memory access. It enables data to be transferred from one computer’s working memory to the working memory of another system without burdening the operating systems, caches or CPUs in the process.
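The idea of writing into a remote buffer through the network card can be pictured with a small Python model. This is only a conceptual sketch under stated assumptions: `MemoryRegion`, `ToyNic`, `rdma_write` and the `rkey` value are hypothetical names chosen for illustration, nothing here talks to real RDMA hardware, and no real RDMA library API is being reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRegion:
    """A buffer that has been registered for remote access (hypothetical model)."""
    buffer: bytearray
    rkey: int  # token a remote peer must present to touch this buffer


@dataclass
class ToyNic:
    """Stands in for an RDMA-capable network card (hypothetical model)."""
    regions: dict = field(default_factory=dict)

    def register(self, size: int, rkey: int) -> MemoryRegion:
        # Registration tells the card which memory remote peers may access.
        region = MemoryRegion(bytearray(size), rkey)
        self.regions[rkey] = region
        return region

    def handle_remote_write(self, rkey: int, offset: int, payload: bytes) -> None:
        # The card places the payload straight into the registered buffer;
        # no application code on the receiving side runs for this transfer.
        region = self.regions[rkey]
        if offset < 0 or offset + len(payload) > len(region.buffer):
            raise ValueError("write outside the registered region")
        region.buffer[offset:offset + len(payload)] = payload


def rdma_write(remote_nic: ToyNic, rkey: int, offset: int, payload: bytes) -> None:
    """Sender side: hand the payload to the remote card (the 'wire' is a call)."""
    remote_nic.handle_remote_write(rkey, offset, payload)


receiver_nic = ToyNic()
target = receiver_nic.register(size=16, rkey=0x1234)
rdma_write(receiver_nic, rkey=0x1234, offset=0, payload=b"hello")
print(bytes(target.buffer[:5]))
```

In this model the receiving application only registers its buffer once; afterwards the data simply appears in `target.buffer`, which mirrors the way RDMA keeps the remote CPU out of the transfer path.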
Many products (hardware and software) already support RDMA. Among these solutions are:
- Apache Hadoop
- Spark
- Baidu Paddle
- Dell EMC PowerEdge Server
- Intel Xeon Scalable Processors
- Microsoft Windows Server (2012 and more recent versions)
How does RDMA work?
For data exchange via RDMA to work, the network cards must implement a suitable set of protocols. Depending on the variant, the transport is based on TCP/IP, on Ethernet or on InfiniBand. Only with a suitable transport protocol can the technology support features such as zero copy networking. With zero copy networking, data moves directly between application memory and the network card without being copied through intermediate operating system buffers, so the processor is not tied up with copy work. Where both systems support Remote Direct Memory Access, data transfer between them is considerably faster than between systems without RDMA support.
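The difference between copying data and handing over a reference to the same memory can be illustrated inside a single Python process. This is an analogy for the zero copy idea only, not RDMA itself; the buffer size and variable names are arbitrary choices for the example.

```python
# Application buffer holding the data to be sent.
payload = bytearray(b"x" * 1_000_000)

# Copying path: slicing a bytearray allocates a new object and duplicates the bytes.
copied = payload[:500_000]

# Zero copy path: a memoryview is a window onto the same underlying memory.
view = memoryview(payload)[:500_000]

# Change the original buffer and see which of the two notices.
payload[0] = ord("y")
print(copied[0] == ord("y"))  # False: the copy is a separate block of memory
print(view[0] == ord("y"))    # True: the view shares the original buffer
```

A zero copy network stack aims for the second behaviour: the network card works on the application’s own memory instead of on duplicates made by the processor.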
Remote Direct Memory Access is especially helpful in the case of parallel processes on high-performance computers.
For a smooth transfer with RDMA, the following network technologies and interfaces offer the best conditions:
- RDMA over Converged Ethernet (RoCE): RoCE enables RDMA via an Ethernet connection.
- iWARP (often expanded as Internet Wide Area RDMA Protocol): For the data transfer, iWARP relies on the transport protocol TCP or, alternatively, on the Stream Control Transmission Protocol (SCTP). iWARP was standardised by the Internet Engineering Task Force (IETF) so that applications can read from and write to the memory of applications on another system directly.
- InfiniBand: InfiniBand is a communication standard for high-performance computing that transfers data with low latency. It is frequently used in data centres to connect computer clusters with one another. RDMA via InfiniBand is one of the most popular methods for quick data exchange.
Remote Direct Memory Access can also be used in conjunction with flash or SSD storage media and NVDIMMs (non-volatile dual in-line memory modules).
The evolution of RDMA is still in full swing. With RDMA over Fabrics, its next potential application is already in the starting blocks. A fabric is the infrastructure that interconnects several servers and computers. Fabrics support data transfer via Fibre Channel networks (storage area networks) and PCI Express (a standard for high-speed connections).
What are the advantages and disadvantages of Remote Direct Memory Access?
One of the main advantages of RDMA is its outstanding speed when compared with other technologies and protocols for data transfer such as iSCSI (the SCSI protocol over TCP), Fibre Channel (FC) or Fibre Channel over Ethernet (FCoE). However, the actual speed of the data exchange also depends on the RDMA variant. Ethernet and InfiniBand are especially popular, as these enable transfer speeds of 10 to 100 gigabits per second. This makes RDMA especially suitable for areas of application that require high computing power, such as distributed databases, big data analyses or applications in data centres.
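To get a feeling for what the range of 10 to 100 gigabits per second means in practice, the following back-of-the-envelope calculation estimates how long a transfer takes at each end of that range. The 1 TB data set is a hypothetical example, and the calculation assumes an ideal link with no protocol overhead, so real transfers will take longer.

```python
def transfer_seconds(size_bytes: float, link_gbit_per_s: float) -> float:
    """Ideal transfer time: payload size in bits divided by the line rate."""
    bits = size_bytes * 8
    return bits / (link_gbit_per_s * 1e9)


dataset_bytes = 1e12  # hypothetical 1 TB data set (decimal terabyte)

for rate in (10, 100):
    seconds = transfer_seconds(dataset_bytes, rate)
    print(f"{rate} Gbit/s: {seconds:.0f} s")
```

Under these assumptions the same data set needs roughly 800 seconds at 10 Gbit/s and roughly 80 seconds at 100 Gbit/s.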
However, RDMA also has disadvantages when compared with Fibre Channel, which is still used by many companies. To introduce RDMA, companies must make significant investments, as the technology requires the acquisition of new hardware and protocol components. The costs of Remote Direct Memory Access are thus considerably higher than those for FC or FCoE. In addition, the speedy data transfer via RDMA only works if all participating systems support the technology.