• Running the OSU Micro Benchmark over Soft-RoCE


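    An earlier article described how to run the benchmark, but that run was over TCP. Now I want results over RoCEv2. The virtual machine has no InfiniBand-capable NIC, so Soft-RoCE will have to do.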

    Installing and debugging Soft-RoCE

    • System version information
    admin@osu-1:~$ uname -a
    Linux osu-1 5.11.0-44-generic #48~20.04.2-Ubuntu SMP Tue Dec 14 15:36:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
    
    • Install rdma-core and the verbs utilities
    admin@osu-1:~$ sudo apt install rdma-core ibverbs-utils -y
    
    • Add an RDMA link named ib5 on top of the existing network interface ens8 (type rxe is the Soft-RoCE driver)
    admin@osu-1:~$ sudo  rdma link add ib5 type rxe netdev ens8
    admin@osu-1:~$  rdma link show
    link ib5/1 state ACTIVE physical_state LINK_UP netdev ens8 
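
The check on the `rdma link show` output above can be scripted. The following is only a minimal sketch: the helper name `rxe_link_state` is hypothetical (it is not part of iproute2), and it assumes nothing beyond the line layout shown in the output above, which it reads on standard input.

```shell
# Hypothetical helper: print the state of the RDMA link named $1.
# Reads `rdma link show` style text on stdin and assumes the layout
# "link NAME/PORT state STATE ..." seen in the output above.
rxe_link_state() {
    awk -v name="$1" '
        $1 == "link" {
            split($2, dev, "/")              # "ib5/1" -> dev[1] = "ib5"
            if (dev[1] != name) next         # not the link we were asked about
            for (i = 3; i < NF; i++)
                if ($i == "state") { print $(i + 1); exit }
        }'
}

# Example with the line captured above; prints ACTIVE.
echo 'link ib5/1 state ACTIVE physical_state LINK_UP netdev ens8' | rxe_link_state ib5
```

On the VM itself this would be used as `rdma link show | rxe_link_state ib5`. As far as I know, links created with `rdma link add` are not kept across reboots, so the add command may need to be repeated after a restart.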
    

    Installing and debugging MPI

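    • There are many MPI implementations to choose from: openmpi/mpich/mvapich
    • After various tests and setbacks I settled on mvapich2; after all, it comes from the same group as the OSU Micro Benchmark
    • Install the software needed during the build beforehand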
    admin@osu-1:~$  sudo apt install byacc -y
    
    • Get the source code
    admin@osu-1:~$ wget http://mvapich.cse.ohio-state.edu/download/mvapich/mv2/mvapich2-2.3.7-1.tar.gz
    
    • Extract it and enter the directory
    admin@osu-1:~$ tar zxvf mvapich2-2.3.7-1.tar.gz 
    admin@osu-1:~$ cd mvapich2-2.3.7-1/
    admin@osu-1:~/mvapich2-2.3.7-1$ 
    
    • When running configure, note the options that must be passed (the ch3:mrail device with the gen2 verbs interface)
    admin@osu-1:~/mvapich2-2.3.7-1$ ./configure --with-device=ch3:mrail --with-rdma=gen2
    
    • Then build and install
    admin@osu-1:~/mvapich2-2.3.7-1$ make -j$(nproc) 
    admin@osu-1:~/mvapich2-2.3.7-1$ sudo make install
    
    • The benchmarks have already been built as part of the same build
    admin@osu-1:~/mvapich2-2.3.7-1$ cd osu_benchmarks/mpi/pt2pt/
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ 
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ ls -lt
    total 320
    -rwxrwxr-x 1 admin admin  6332 Nov 17 10:40 osu_multi_lat
    -rwxrwxr-x 1 admin admin  6342 Nov 17 10:40 osu_latency_mt
    -rwxrwxr-x 1 admin admin  6312 Nov 17 10:40 osu_latency
    -rwxrwxr-x 1 admin admin  6262 Nov 17 10:40 osu_bw
    -rwxrwxr-x 1 admin admin  6342 Nov 17 10:40 osu_latency_mp
    -rwxrwxr-x 1 admin admin  6302 Nov 17 10:40 osu_mbw_mr
    -rwxrwxr-x 1 admin admin  6282 Nov 17 10:40 osu_bibw
    -rw-rw-r-- 1 admin admin 11904 Nov 17 10:40 osu_bibw.o
    -rw-rw-r-- 1 admin admin 18072 Nov 17 10:40 osu_mbw_mr.o
    -rw-rw-r-- 1 admin admin 16872 Nov 17 10:40 osu_latency_mt.o
    -rw-rw-r-- 1 admin admin 11456 Nov 17 10:40 osu_bw.o
    -rw-rw-r-- 1 admin admin 10976 Nov 17 10:40 osu_latency_mp.o
    -rw-rw-r-- 1 admin admin  9688 Nov 17 10:40 osu_latency.o
    -rw-rw-r-- 1 admin admin  9872 Nov 17 10:40 osu_multi_lat.o
    -rw-rw-r-- 1 admin admin 28374 Nov 17 10:23 Makefile
    -rw-r--r-- 1 admin admin 28795 May 24 01:46 Makefile.in
    -rw-r--r-- 1 admin admin  1446 May 17  2022 Makefile.am
    -rw-r--r-- 1 admin admin 13925 May 17  2022 osu_bibw.c
    -rw-r--r-- 1 admin admin 13046 May 17  2022 osu_bw.c
    -rw-r--r-- 1 admin admin  9926 May 17  2022 osu_latency.c
    -rw-r--r-- 1 admin admin  7763 May 17  2022 osu_latency_mp.c
    -rw-r--r-- 1 admin admin 12654 May 17  2022 osu_latency_mt.c
    -rw-r--r-- 1 admin admin 19056 May 17  2022 osu_mbw_mr.c
    -rw-r--r-- 1 admin admin 10070 May 17  2022 osu_multi_lat.c
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ 
    
    • Make sure the MPI install path is in PATH
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ which mpirun
    /usr/local/bin/mpirun
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ 
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ echo $PATH
    /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ 
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ PATH=$PATH:/usr/local/bin
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ 
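
Since `/usr/local/bin` already appears in the PATH printed above, the final assignment only adds a duplicate entry. A minimal sketch of appending a directory only when it is missing could look like this; the function name `path_with` is made up for this illustration.

```shell
# Hypothetical helper: print the PATH-style list $2 with directory $1
# appended, unless $1 is already one of its entries.
path_with() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;      # already present: unchanged
        *)        printf '%s\n' "$2:$1" ;;   # missing: append at the end
    esac
}

# Example: the directory is added once and not duplicated on a second call.
p=$(path_with /usr/local/bin /usr/bin:/bin)
p=$(path_with /usr/local/bin "$p")
echo "$p"
```

In an interactive shell this would be applied as `PATH=$(path_with /usr/local/bin "$PATH")`.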
    

    Running

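    • Clone a virtual machine identical to the one above; the two VMs can ping each other at 5.5.5.3 and 5.5.5.4
    • Run osu_latency; it prints some WARNINGs at the start, which can be ignored for now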
    admin@osu-1:~/mvapich2-2.3.7-1/osu_benchmarks/mpi/pt2pt$ mpirun_rsh -np 2 5.5.5.3 5.5.5.4 MV2_USE_RoCE=1 MV2_IBA_HCA=ib5 ./osu_latency
    [osu-1:mpi_rank_0][rdma_find_network_type] Unable to find the numa process is bound to. Disabling process placement aware hca mapping.
    [osu-1:mpi_rank_0][mv2_get_hca_type] **********************WARNING***********************
    [osu-1:mpi_rank_0][mv2_get_hca_type] Failed to automatically detect the HCA architecture.
    [osu-1:mpi_rank_0][mv2_get_hca_type] This may lead to subpar communication performance.
    [osu-1:mpi_rank_0][mv2_get_hca_type] ****************************************************
    [osu-1:mpi_rank_0][mv2_get_hca_type] **********************WARNING***********************
    [osu-1:mpi_rank_0][mv2_get_hca_type] Failed to automatically detect the HCA architecture.
    [osu-1:mpi_rank_0][mv2_get_hca_type] This may lead to subpar communication performance.
    [osu-1:mpi_rank_0][mv2_get_hca_type] ****************************************************
    [osu-1:mpi_rank_0][mv2_get_hca_type] **********************WARNING***********************
    [osu-1:mpi_rank_0][mv2_get_hca_type] Failed to automatically detect the HCA architecture.
    [osu-1:mpi_rank_0][mv2_get_hca_type] This may lead to subpar communication performance.
    [osu-1:mpi_rank_0][mv2_get_hca_type] ****************************************************
    [osu-1:mpi_rank_0][rdma_open_hca] Unknown HCA type: this build of MVAPICH2 does not fully support the HCA found on the system (try with other build options)
    [osu-1:mpi_rank_0][mv2_new_get_hca_type] **********************WARNING***********************
    [osu-1:mpi_rank_0][mv2_new_get_hca_type] Failed to automatically detect the HCA architecture.
    [osu-1:mpi_rank_0][mv2_new_get_hca_type] This may lead to subpar communication performance.
    [osu-1:mpi_rank_0][mv2_new_get_hca_type] ****************************************************
    [osu-2:mpi_rank_1][rdma_find_network_type] Unable to find the numa process is bound to. Disabling process placement aware hca mapping.
    [osu-2:mpi_rank_1][rdma_open_hca] Unknown HCA type: this build of MVAPICH2 does not fully support the HCA found on the system (try with other build options)
    [osu-1:mpi_rank_0][rdma_param_handle_heterogeneity] All nodes involved in the job were detected to be homogeneous in terms of processors and interconnects. Setting MV2_HOMOGENEOUS_CLUSTER=1 can improve job startup performance on such systems. The following link has more details on enhancing job startup performance. http://mvapich.cse.ohio-state.edu/performance/job-startup/.
    [osu-1:mpi_rank_0][rdma_param_handle_heterogeneity] To suppress this warning, please set MV2_SUPPRESS_JOB_STARTUP_PERFORMANCE_WARNING to 1
    # OSU MPI Latency Test v5.9
    # Size          Latency (us)
    0                     139.61
    1                     144.72
    2                     141.35
    4                     140.04
    8                     139.94
    16                    140.42
    32                    139.10
    64                    137.50
    128                   142.40
    256                   143.07
    512                   140.62
    1024                  143.64
    2048                  175.03
    4096                  222.74
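
To pull a single number out of a run like the one above, the result rows have to be separated from the warning lines. The sketch below assumes the two-column `Size` / `Latency (us)` layout shown in the output; `osu_latency_at` is a hypothetical helper name, not something shipped with the benchmarks.

```shell
# Hypothetical helper: print the latency reported for message size $1.
# Reads osu_latency output on stdin; warning lines (starting with "[")
# and header lines (starting with "#") never match the numeric test.
osu_latency_at() {
    awk -v size="$1" '
        NF == 2 && $1 ~ /^[0-9]+$/ && $1 == size { print $2; exit }'
}

# Example with a few lines copied from the run above; prints 222.74.
osu_latency_at 4096 <<'EOF'
[osu-1:mpi_rank_0][mv2_get_hca_type] This may lead to subpar communication performance.
# OSU MPI Latency Test v5.9
# Size          Latency (us)
0                     139.61
2048                  175.03
4096                  222.74
EOF
```

Against a live run, the output of the `mpirun_rsh` command shown above would be piped into `osu_latency_at 4096`, with `2>&1` added in case the warnings arrive on standard error.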
    
    • Meanwhile, run tcpdump on ens8 on the other machine; it captures packets whose UDP destination port is 4791, which is exactly what RoCEv2 packets use
    10:51:43.782588 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 222: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 180
    10:51:43.782725 52:54:00:3c:a8:a3 > 52:54:00:28:f8:36, ethertype IPv4 (0x0800), length 62: 5.5.5.3.63843 > 5.5.5.4.4791: UDP, length 20
    10:51:43.782857 52:54:00:3c:a8:a3 > 52:54:00:28:f8:36, ethertype IPv4 (0x0800), length 222: 5.5.5.3.63843 > 5.5.5.4.4791: UDP, length 180
    10:51:43.782865 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 62: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 20
    10:51:43.782885 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 222: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 180
    10:51:43.783040 52:54:00:3c:a8:a3 > 52:54:00:28:f8:36, ethertype IPv4 (0x0800), length 62: 5.5.5.3.63843 > 5.5.5.4.4791: UDP, length 20
    10:51:43.783146 52:54:00:3c:a8:a3 > 52:54:00:28:f8:36, ethertype IPv4 (0x0800), length 222: 5.5.5.3.63843 > 5.5.5.4.4791: UDP, length 180
    10:51:43.783154 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 62: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 20
    10:51:43.783173 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 222: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 180
    10:51:43.783312 52:54:00:3c:a8:a3 > 52:54:00:28:f8:36, ethertype IPv4 (0x0800), length 62: 5.5.5.3.63843 > 5.5.5.4.4791: UDP, length 20
    10:51:43.783423 52:54:00:3c:a8:a3 > 52:54:00:28:f8:36, ethertype IPv4 (0x0800), length 222: 5.5.5.3.63843 > 5.5.5.4.4791: UDP, length 180
    10:51:43.783431 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 62: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 20
    10:51:43.783451 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 222: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 180
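
The original does not give the capture command; something along the lines of `sudo tcpdump -i ens8 -e -nn udp dst port 4791` would produce output in this shape, so treat that command as an assumption. As a rough sketch, the RoCEv2 packets in such a text capture can be counted by looking for destination port 4791; `count_rocev2` is a hypothetical helper name.

```shell
# Hypothetical helper: count lines of tcpdump text output (on stdin)
# whose destination address ends in UDP port 4791, the RoCEv2 port.
count_rocev2() {
    awk '
        {
            # The destination is the field right after a ">" separator;
            # with -e the first ">" separates MAC addresses and is skipped
            # because a MAC address does not end in ".4791:".
            for (i = 1; i < NF; i++)
                if ($i == ">" && $(i + 1) ~ /\.4791:$/) { n++; break }
        }
        END { print n + 0 }'
}

# Example with two lines copied from the capture above; prints 2.
count_rocev2 <<'EOF'
10:51:43.782588 52:54:00:28:f8:36 > 52:54:00:3c:a8:a3, ethertype IPv4 (0x0800), length 222: 5.5.5.4.63843 > 5.5.5.3.4791: UDP, length 180
10:51:43.782725 52:54:00:3c:a8:a3 > 52:54:00:28:f8:36, ethertype IPv4 (0x0800), length 62: 5.5.5.3.63843 > 5.5.5.4.4791: UDP, length 20
EOF
```

Matching on the port alone only shows that traffic is addressed to the RoCEv2 port; confirming the transport type still needs a dissector such as Wireshark.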
    
    • If the packets are written to a file and dissected with Wireshark, they show up as RoCEv2 RC (Reliable Connection) packets
  • Original article: https://blog.csdn.net/ljyfree/article/details/127902406