• iRDMA Flow Control Introduction


    1.0       Introduction

    This document introduces Ethernet flow control on Intel® Ethernet 800 Series Network Adapters that use the RDMA driver (irdma), with a focus on best practices for Linux RDMA traffic.

    It includes:

    • Background on Ethernet flow control (FC) and Data Center Bridging (DCB).
    • Differences between Link-level Flow Control (LFC) and Priority Flow Control (PFC).
    • Configuration steps for each type on 800 Series Linux hosts.
    • Verification tips.

    1.1         QoS/Flow Control Limitations on the 800 Series

    • Although the 800 Series hardware supports eight Traffic Classes (TCs), the maximum supported configuration is four TCs per port. Only one TC can have Priority Flow Control enabled per port.

    Number of Adapter Ports    Traffic Class Recommendation                          RDMA
    1, 2, or 4                 Up to four TCs, with one of them enabled with PFC.    Supported
    More than 4                No DCB Support                                        Not Supported

    • In RoCEv2 mode, if no flow control is detected (either LFC or PFC), the driver automatically de-tunes. This is an intentional design to allow RoCEv2 to operate without flow control, but with lower performance.
    • When the 800 Series is in firmware Link Layer Discovery Protocol (LLDP) mode, only three application priorities are supported. Software LLDP supports 32. These limits refer to entries in the LLDP APP TLV; see man lldptool-app for more information.
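
    The traffic-class limits listed above can be expressed as a small pre-flight check. This is a minimal sketch, not part of the irdma driver or any Intel tool: the function name check_dcb_plan and its parameters are hypothetical, and it encodes only the limits stated in this section (up to four TCs per port, one PFC-enabled TC, no DCB on adapters with more than four ports).

```python
MAX_TCS_PER_PORT = 4       # supported maximum, although the hardware has eight TCs
MAX_PFC_TCS_PER_PORT = 1   # only one TC per port may have PFC enabled
MAX_PORTS_WITH_DCB = 4     # adapters with more than four ports have no DCB support


def check_dcb_plan(num_ports, num_tcs, pfc_tcs):
    """Return a list of problems with a planned configuration (empty if none).

    num_ports: number of ports on the adapter
    num_tcs:   number of traffic classes planned per port
    pfc_tcs:   indices of the traffic classes that should have PFC enabled
    """
    problems = []
    if num_ports > MAX_PORTS_WITH_DCB:
        # Nothing else is worth checking: DCB is unavailable on this adapter.
        problems.append("more than 4 ports: no DCB support, RDMA not supported")
        return problems
    if not 1 <= num_tcs <= MAX_TCS_PER_PORT:
        problems.append(f"{num_tcs} TCs requested; supported range is 1 to {MAX_TCS_PER_PORT}")
    if len(set(pfc_tcs)) > MAX_PFC_TCS_PER_PORT:
        problems.append("PFC requested on more than one TC")
    if any(tc < 0 or tc >= num_tcs for tc in pfc_tcs):
        problems.append("PFC requested on a TC that is not configured")
    return problems


if __name__ == "__main__":
    print(check_dcb_plan(num_ports=2, num_tcs=4, pfc_tcs=[3]))
    print(check_dcb_plan(num_ports=2, num_tcs=5, pfc_tcs=[0, 1]))
```

    The table lists 1, 2, or 4 ports as supported; the sketch does not treat a 3-port count specially because the section does not mention it.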


       

    2.0         Background

                                                                                                       

    2.1           Ethernet Flow Control

    By design, Ethernet is an unreliable protocol with no guarantee that packets arrive at their destination correctly and in order. Instead, Ethernet relies on upper-layer protocols (such as TCP) or applications to provide reliable service and error correction.

    The 802.3x standard introduced flow control to the Ethernet protocol, defining a mechanism for throttling the flow of data between two directly connected full-duplex network devices. If the sender transmits data faster than the receiver can accept it, the overwhelmed receiver can send a pause signal (Xoff or transmit off) to the sender, requesting that the sender stop transmitting data for a specified period of time. The sender resumes transmission either after the timeout period expires or if the receiver indicates that it is ready to accept more data by sending an Xon (transmit on) signal.
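
    To make the pause signal concrete, the following sketch builds an 802.3x PAUSE frame and converts its timer value into wall-clock time. The field values (reserved destination address 01:80:C2:00:00:01, MAC Control EtherType 0x8808, opcode 0x0001, and a 16-bit timer counted in quanta of 512 bit times) come from the IEEE 802.3 MAC Control definition rather than from this guide; the function names are hypothetical. A timer value of zero serves as the Xon signal.

```python
import struct

PAUSE_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
QUANTUM_BITS = 512  # one pause quantum is 512 bit times


def build_pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Build a PAUSE frame without the trailing FCS; quanta=0 means Xon."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("quanta must fit in 16 bits")
    header = PAUSE_DST_MAC + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH", PAUSE_OPCODE, quanta)
    # Pad to the 60-byte minimum frame size (64 bytes once the FCS is added).
    return (header + payload).ljust(60, b"\x00")


def pause_duration_us(quanta: int, link_gbps: float) -> float:
    """Requested pause time in microseconds at the given link speed."""
    return quanta * QUANTUM_BITS / (link_gbps * 1e3)


if __name__ == "__main__":
    xoff = build_pause_frame(bytes.fromhex("020000000001"), 0xFFFF)
    print(xoff.hex())
    print(pause_duration_us(0xFFFF, 100.0))  # longest pause on a 100 Gb/s link
```

    Because the timer is expressed in bit times, the same timer value requests a shorter pause on a faster link.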

    Without flow control, data might be lost and then need to be retransmitted by an upper-layer protocol (ULP) or application, which can significantly affect performance.
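
    The effect of Xoff/Xon on packet loss can be illustrated with a toy model of one sender, one receiver buffer, and two watermarks. This is a simplified sketch under stated assumptions: the rates, buffer size, and watermark values are invented for illustration, the pause takes effect instantly (real links need extra headroom for data already in flight), and the function name simulate is hypothetical.

```python
def simulate(ticks, send_rate, drain_rate, buf_size, xoff_at=None, xon_at=None):
    """Return (delivered, dropped) packet counts for a single link.

    With xoff_at and xon_at set, the receiver pauses the sender when its
    queue reaches the high watermark and resumes it at the low watermark.
    With both left as None, the link runs without flow control.
    """
    queued = delivered = dropped = 0
    paused = False
    for _ in range(ticks):
        if not paused:
            accepted = min(send_rate, buf_size - queued)
            dropped += send_rate - accepted   # overflow is lost
            queued += accepted
        drained = min(drain_rate, queued)     # receiver consumes at its own pace
        queued -= drained
        delivered += drained
        if xoff_at is not None:
            if queued >= xoff_at:
                paused = True                 # receiver sends Xoff
            elif queued <= xon_at:
                paused = False                # receiver sends Xon
    return delivered, dropped


if __name__ == "__main__":
    print("no flow control:", simulate(1000, 10, 6, 64))
    print("with Xoff/Xon:  ", simulate(1000, 10, 6, 64, xoff_at=40, xon_at=16))
```

    With these example numbers the sender outpaces the receiver, so the run without flow control overflows the buffer and drops packets, while the run with watermarks drops none because the high watermark plus one tick of traffic still fits in the buffer.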

    2.2           Flow Control in RDMA Networks

    The 800 Series supports both iWARP and RoCEv2 RDMA transports. Flow control is strongly recommended for RoCEv2, but iWARP also benefits.

    RDMA Protocol    Base Transport    Flow Control Requirements

    iWARP            TCP               iWARP runs over TCP, a reliable protocol that implements its own flow control.
                                       TCP's flow control might be relatively slow to respond in a high-performance, low-latency RDMA environment, especially under bursty traffic patterns.
                                       Ethernet flow control is optional, but can be beneficial for iWARP.
                                       iWARP mode requires VLAN to be configured fully to enable PFC.

    RoCEv2           UDP               RoCEv2 runs over UDP, an unreliable protocol with no built-in flow control.
                                       RoCEv2 therefore requires a lossless Ethernet network to ensure packet delivery.
                                       If the irdma driver is in RoCEv2 mode and detects no flow control, it automatically de-tunes, causing lower performance.
                                       Flow control is always recommended for RoCEv2.
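
    Since RoCEv2 performance depends on flow control being present, it can help to confirm the link-level pause state of a port. The sketch below parses text in the format printed by ethtool -a <interface>. The sample text and the interface name eth0 are illustrative assumptions, not output captured from an 800 Series adapter, and the helper names are hypothetical. The check covers LFC only and says nothing about PFC.

```python
# Illustrative sample in the layout printed by "ethtool -a eth0" (assumed values).
SAMPLE = """\
Pause parameters for eth0:
Autonegotiate:\toff
RX:\t\ton
TX:\t\ton
"""


def parse_pause_params(text):
    """Map each 'Name: on|off' line to a boolean, keyed by lower-case name."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # The heading line ends in ':' with no on/off value, so it is skipped.
        if sep and value in ("on", "off"):
            params[key.strip().lower()] = (value == "on")
    return params


def lfc_enabled(params):
    """Treat LFC as enabled only when both RX and TX pause are on."""
    return bool(params.get("rx")) and bool(params.get("tx"))


if __name__ == "__main__":
    parsed = parse_pause_params(SAMPLE)
    print(parsed)
    print("LFC enabled:", lfc_enabled(parsed))
```

    To use it against a live system, the text could be obtained by running ethtool -a through subprocess.run and passing the captured standard output to parse_pause_params.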

    2.3          Types of Flow Control: LFC vs. PFC

    Ethernet standards define two types of flow control:

    • Link-level Flow Control (LFC)
    • Priority Flow Control (PFC)

    Both types use Xon/Xoff pause frames to control data transmission. The primary difference is that LFC pauses all traffic on a link, but PFC supports Quality-of-Service (QoS) by defining different traffic priorities that can be indiv

  • Original article: https://blog.csdn.net/mounter625/article/details/134553500