This section introduces Ethernet flow control on Intel® Ethernet 800 Series Network Adapters with the RDMA driver (irdma), with a focus on best practices for Linux RDMA traffic.
It includes:
Number of Adapter Ports | Traffic Class Recommendation | RDMA |
1, 2, or 4 | Up to four TCs, with one of them enabled with PFC. | Supported |
More than 4 | No DCB Support | Not Supported |
By design, Ethernet is an unreliable protocol with no guarantee that packets arrive at their destination correctly and in order. Instead, Ethernet relies on upper-layer protocols (such as TCP) or applications to provide reliable service and error correction.
The 802.3x standard introduced flow control to the Ethernet protocol, defining a mechanism for throttling the flow of data between two directly connected full-duplex network devices. If the sender transmits data faster than the receiver can accept it, the overwhelmed receiver can send a pause signal (Xoff or transmit off) to the sender, requesting that the sender stop transmitting data for a specified period of time. The sender resumes transmission either after the timeout period expires or if the receiver indicates that it is ready to accept more data by sending an Xon (transmit on) signal.
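As a rough illustration of the pause mechanism described above, the following Python sketch builds an 802.3x PAUSE frame by hand. It relies on the standard frame layout: the reserved MAC Control destination address 01:80:C2:00:00:01, EtherType 0x8808, opcode 0x0001, and a 16-bit pause time counted in quanta of 512 bit times, where a value of zero acts as Xon. The function names, the source MAC address, and the 100 Gb/s link speed are assumptions made for this example only; in practice the adapter generates these frames, not an application.

```python
import struct

PAUSE_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
QUANTUM_BITS = 512  # one pause quantum equals 512 bit times


def build_pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Return a 60-byte PAUSE frame, excluding the FCS that hardware appends."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("quanta must fit in 16 bits")
    header = PAUSE_DST_MAC + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH", PAUSE_OPCODE, quanta)
    # Pad with zeros up to the minimum Ethernet frame size (without FCS).
    return (header + payload).ljust(60, b"\x00")


def pause_duration_seconds(quanta: int, link_bps: float) -> float:
    """Convert a pause time in quanta to seconds for a given link speed."""
    return quanta * QUANTUM_BITS / link_bps


src = bytes.fromhex("020000000001")  # hypothetical locally administered MAC
xoff = build_pause_frame(src, 0xFFFF)  # Xoff: request the longest possible pause
xon = build_pause_frame(src, 0)        # Xon: a pause time of zero resumes traffic

# Assumed link speed of 100 Gb/s, chosen only for illustration.
print(f"max pause at 100 Gb/s: {pause_duration_seconds(0xFFFF, 100e9) * 1e6:.1f} us")
```

Because the longest pause a single frame can request is short at high link speeds, a receiver that remains congested has to keep refreshing the pause before it expires.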
Without flow control, data might be lost or need to be re-transmitted by an upper-layer protocol (ULP) or application, which can significantly affect performance.
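The effect of missing flow control can be sketched with a small, idealized simulation of a bounded receive buffer. This is a toy model, not a description of adapter behavior: the `simulate` function, the buffer size, the rates, and the Xoff/Xon thresholds are all hypothetical, and the pause is assumed to take effect instantly, whereas a real link needs buffer headroom for frames already in flight.

```python
def simulate(ticks, send_rate, drain_rate, capacity, xoff_at=None, xon_at=None):
    """Return (delivered, dropped) frame counts for a bounded receive buffer.

    When xoff_at and xon_at are None, the sender is never paused.
    """
    queued = delivered = dropped = 0
    paused = False
    for _ in range(ticks):
        if not paused:
            # Frames that do not fit in the buffer are dropped.
            accepted = min(send_rate, capacity - queued)
            dropped += send_rate - accepted
            queued += accepted
        # The receiver drains at its own, slower rate.
        drained = min(drain_rate, queued)
        queued -= drained
        delivered += drained
        if xoff_at is not None:
            if queued >= xoff_at:
                paused = True    # receiver signals Xoff
            elif queued <= xon_at:
                paused = False   # receiver signals Xon
    return delivered, dropped


# Hypothetical numbers: the sender is 2.5x faster than the receiver.
no_fc = simulate(ticks=100, send_rate=10, drain_rate=4, capacity=40)
with_fc = simulate(ticks=100, send_rate=10, drain_rate=4, capacity=40,
                   xoff_at=20, xon_at=8)
print("without flow control (delivered, dropped):", no_fc)    # (400, 564)
print("with flow control    (delivered, dropped):", with_fc)  # (400, 0)
```

In this model both runs deliver the same number of frames, because the receiver is the bottleneck either way. The difference is that the run without flow control drops every frame that overflows the buffer, and those are the frames a ULP or application would have to re-transmit.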
The 800 Series supports both iWARP and RoCEv2 RDMA transports. Flow control is strongly recommended for RoCEv2, but iWARP also benefits.
RDMA Transport | Base Transport | Flow Control Requirements |
iWARP | TCP |
• iWARP runs over TCP, a reliable protocol that implements its own flow control.
• TCP's flow control might be relatively slow to respond in a high-performance, low-latency RDMA environment, especially under bursty traffic patterns.
• Ethernet flow control is optional, but can be beneficial for iWARP.
• iWARP mode requires a VLAN to be fully configured to enable PFC.
RoCEv2 | UDP |
• RoCEv2 runs over UDP, an unreliable protocol with no built-in flow control.
• RoCEv2 therefore requires a lossless Ethernet network to ensure packet delivery.
• If the irdma driver is in RoCEv2 mode and detects no flow control, it automatically de-tunes, causing lower performance.
• Flow control is always recommended for RoCEv2.
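Ethernet standards define two types of flow control:
• Link-level flow control (LFC), defined in IEEE 802.3x
• Priority flow control (PFC), defined in IEEE 802.1Qbb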
Both types use Xon/Xoff pause frames to control data transmission. The primary difference is that LFC pauses all traffic on a link, but PFC supports Quality-of-Service (QoS) by defining different traffic priorities that can be indiv