            Open Fabrics Enterprise Distribution (OFED)
                          Version 4.8-2
                          Release Notes
                          February 2018

===============================================================================
Table of Contents
===============================================================================
1. Overview, which includes:
   - OFED Distribution Rev 4.8-2 Contents
   - Supported Platforms and Operating Systems
   - Supported HCA and RNIC Adapter Cards and Firmware Versions
   - Tested Switch Platforms
   - Third party Test Packages
   - OFED sources
2. Change log
3. Known Issues
4. Generic Notes

===============================================================================
1. Overview
===============================================================================
These are the release notes of OpenFabrics Enterprise Distribution (OFED)
release 4.8-2. The OFED software package is composed of several software
modules, and is intended for use on a computer cluster constructed as an
InfiniBand Fabric, an iWARP Network or a RoCE Fabric.

Note: If you plan to upgrade the OFED package on your cluster, please upgrade
all of its nodes to this new version.

1.1 OFED 4.8-2 Contents
-----------------------
The OFED package contains the following components:
 - OpenFabrics core and ULPs:
    - IB HCA drivers (mthca, mlx4, mlx5, qib, ehca)
    - iWARP RNIC drivers (cxgb3, cxgb4, nes, i40iw, qedr)
    - RoCE drivers (mlx4, mlx5, ocrdma, qedr, bnxt_re)
    - IB core
    - Upper Layer Protocols:
       - IPoIB, SRP Initiator, iSER, uDAPL, qlgc_vnic
       - NVMe-oF (kernel 4.8 only)
       - NFS-RDMA (SLES12.2 and RH7.3 only).
 - OpenFabrics utilities:
    - OpenSM (OSM): InfiniBand Subnet Manager
    - Diagnostic tools
    - Performance tests
 - Extra packages:
    - infinipath-psm: Performance-Scaled Messaging API, an accelerated
      interface to Intel(R) HCAs
    - libfabric - library that exports interfaces for fabric services to
      applications
 - Sources of all software modules (under conditions mentioned in the modules'
   LICENSE files)
 - Documentation

1.2 Supported Platforms and Operating Systems
---------------------------------------------
  o CPU architectures:
     - x86_64
     - x86
     - ppc64

  o Linux Operating Systems:
     - RedHat EL7.0      3.10.0-123.el7
     - RedHat EL7.1      3.10.0-210.el7
     - RedHat EL7.2      3.10.0-327.el7
     - RedHat EL7.3      3.10.0-514.el7
     - RedHat EL7.4      3.10.0-693.el7
     - SLES12            3.12.28-4
     - SLES12.1          3.12.49-11.1
     - SLES12.2          4.4.21-69-default
     - SLES12.3          4.4.73-5-default
     - kernel.org        4.8 *

     * Minimal QA for these versions.

1.3 HCAs and RNICs Supported
----------------------------
This release supports IB HCAs by IBM, Intel and Mellanox Technologies, iWARP
RNICs by Chelsio Communications and Intel and RoCE adapters by Emulex, IBM,
Mellanox, Cavium, and Broadcom.

InfiniBand Adapters
  o IBM HCAs:
     - GX Dual-port SDR 4x IB HCA
     - GX Dual-port SDR 12x IB HCA
     - GX Dual-port DDR 4x IB HCA
     - GX Dual-port DDR 12x IB HCA

  o Intel (formerly QLogic) HCAs:
     - Intel(R) True Scale DDR PCIe x8 and x16 HCAs
     - Intel(R) True Scale QDR PCIe x8 Gen2 HCAs

  o Mellanox Technologies HCAs (FDR and FDR10 Modes are Supported):
     - ConnectX-3 (Rev 2.33.5100 and above)
     - ConnectX-3 Pro (Rev 2.33.5100 and above)
     - Connect-IB (Rev 10.10.5054 and above)

  o Mellanox Technologies HCAs (EDR, FDR, FDR10 Modes are Supported):
     - ConnectX-4 (Rev 12.12.x and above)

     For official firmware versions please see:
     http://www.mellanox.com/content/pages.php?pg=firmware_download

iWARP Adapters
  o Chelsio RNICs:
     - S310/S320 10GbE Storage Accelerators
     - R310/R320 10GbE iWARP Adapters
     - T4: T420-CR, T440-CR, T422-CR, T404-BT, T440-LP-CR, T420-LL-CR, T420-CX
     - T5: T502-BT, T580-CR, T580-LP-CR, T520-LL-CR, T520-CR, T522-CR, T540-CR
     - T6: T62100-CR, T62100-LP-CR, T6225-CR

  o Intel RNICs:
     - NE020 10Gb iWARP Adapter
     - i40iw 10Gb iWARP Adapter (kernel 4.8 only)

RoCE Adapters
  o Emulex - using ocrdma
     - Emulex OCe14102 2-port 10 GbE RoCE
     - Emulex OCe14401 1-port 40 GbE RoCE

  o IBM
     - IBM Flex System EN4132 2-port 10 GbE RoCE
     - IBM EL27 PCIe LP 2-Port 10GbE RoCE SFP+ adapter
     - IBM EC28 PCIe 2-Port 10GbE RoCE SFP+ adapter

  o Mellanox
     - ConnectX-2 EN (Rev 2.9.1200 and above)
     - ConnectX-3 EN (Rev 2.31.5050 and above)

  o Cavium
     - QL4xxxx series of converged network adapters (RoCEv2)

  o Broadcom
     - NetXtreme Ethernet Network Adapters (RoCEv2)

Other Drivers
  - VMware Paravirtual RDMA Driver (vmw_pvrdma)
  - Software RDMA over Ethernet (RoCEv2)
  - NVMe-oF Host and Target (kernel 4.8 only)

1.4 Switches Supported
----------------------
This release was tested with switches and gateways provided by the following
companies:

InfiniBand Switches
  o Flextronics
     - F-X430044

  o Intel (formerly QLogic)
     - 12200

  o Mellanox
     - MLNX-OS MSX6036/SX6025 w/w MLNX-OS version 3.3.4304
     - Grid Director 4036 w/w Grid Director version 3.9.2-992
     - FabricIT EFM IS5035 w/w FabricIT EFM version 1.1.3000
     - FabricIT BXM MBX5020 w/w FabricIT BXM version 2.1.2000

iWARP Switches
  o Fujitsu
     - XG2000C 10Gb Ethernet Switch

RoCE Switches
  o Arista
  o BLADE Network Technologies (BNT)
  o Mellanox
     - SX1036
     - SX1024
     - SX1016

1.5 Third Party Packages
------------------------
The following third party packages have been tested with OFED 4.8-1:
  - Open MPI - 1.8
  - Intel MPI Library 2017
  - MVAPICH2

1.6 OFED Sources
----------------
All sources are located under git://git.openfabrics.org/

Linux:
------
URL: git://git.openfabrics.org/compat-rdma/linux-4.8.git
Branch: master

- Linux kernel sub-tree that includes files relevant for the OFED project only.
  Based on v4.8. Used to shorten git clone time.
  Note: the regular Linux git tree can be used as well.

compat:
-------
URL: git://git.openfabrics.org/compat-rdma/compat.git
Branch: ofed

- Based on compat project (https://github.com/mcgrof/compat). The compat module
  provides functionality introduced in newer kernels to older kernels through a
  set of header files and exported symbols.
  See https://github.com/mcgrof/compat/wiki for details.
- Used to replace kernel_addons in the previous OFED kernel tree.

compat-rdma:
------------
URL: git://git.openfabrics.org/compat-rdma/compat-rdma.git
Branch: master

User level Sources are downloaded from http://www.openfabrics.org/downloads/
as written in the BUILD_ID

The kernel sources are based on Linux 4.8 mainline kernel. Its patches
are included in the OFED sources directory.
For details see HOWTO.build_ofed.

The list of maintainers is available under:
http://www.openfabrics.org/downloads/MAINTAINERS

===============================================================================
2. Change log
===============================================================================

OFED-4.8-2-rc1 Main Changes from OFED-4.8-1
-------------------------------------------------------------------------------
1. compat-rdma
   - bnxt_en: Fix Max MTU setting on SLES 12 SP3
   - qede: SLE12SP3 Backport fix use core min max MTU check
   - cxgb: SLE12SP3 Backport fix: use net core MTU range checking
   - Remove vmw_pvrdma tech preview patches
   - Fixed mlx4 backport
   - qedr cherry pick: lower message verbosity
   - Update qib backport for SLES 12.3
   - bnxt_en: backport for RH 7.0 and 7.1
   - compat-rdma.spec: Added compat-rdma-firmware subpackage
   - bnxt_en: Backporting for RHEL 7.4 and SLES12SP3
   - qedr: Add backports for RHEL 7.4 and SLES 12 SP3
   - configure: Add parameters for RXE and RDMAVT
   - ib_iser: Added support for SLES12 SP3
   - mlx5: Added support for SLES12 SP3
   - mlx4: Added support for SLES12 SP3
   - ib_core: Added support for SLES12 SP3
   - mlx5: Added RHEL7.4 support
   - mlx4: Added RHEL7.4 support
   - configure/makefile: Rename CONFIG_INFINIBAND_RXE -> CONFIG_RDMA_RXE to
     match upstream

2. Updated packages
   - rdma-core-v16.2
   - fabtests-1.5.3
   - libfabric-1.5.3
   - perftest-4.1-0.2.g770623f

3. ofed_scripts
   - add --with-rdmavt-mod for qib
   - Add VMware Paravirtual RDMA Driver as an installable driver
   - Remove vmw_pvrdma tech preview install option
   - install.pl: Added rdma-core-devel dependency for the perftest build
   - Revert "install.pl: Enable bnxt_re on supported OSes only"
   - install.pl: Added compat-rdma-firmware package
   - install.pl: Enable bnxt_re on supported OSes only
   - install.pl: Updated rdma-core dependencies
   - install.pl: Remove requirement on the specific cmake version
   - install.pl: Added support for SLES12 SP3
   - install.pl: Added RHEL7.4 support

OFED-4.8-1 Main Changes from OFED-4.8
-------------------------------------------------------------------------------
1. compat-rdma
   - openibd: rpcrdma added to the list of modules to unload
   - bnxt_en/bnxt_re: Support ulp_shutdown hook
   - bnxt_re: BZ 2655 Fix nfs client hang with a call trace on stress traffic
   - bnxt_re: BZ 2654 Fix traffic failure after changing ip address
   - bnxt_re: BZ 2656 fix a crash in qp error event processing
   - bnxt_en: BZ 2649 Enabling DCB parameter for bnxt_en
   - Fixes for i40iw which have been included in kernels 4.12 and 4.13
   - qedr: fix BZ 2642 - parse VLAN ID and priority correctly
   - bnxt_en: Bug 2645 - Set default completion ring for async events
   - bnxt_re: bz 2646 -
     0012-bnxt_re-Make-room-for-mapping-beyond-32-entries.patch
   - ib_srp: Fixed backport - issue 2635
   - bnxt_en: Misc bug fixes
   - bnxt_re: Fix bnxt_en installation in the latest daily builds
   - bnxt_re: Adding bug fix patches
   - bnxt_re: Backports for SLES12SP0/SP1/SP2 and RH7.0/7.1
   - NFS/RDMA backport patch for SLES12SP2
   - NFS/RDMA backport patch to revert source files to 4.6 kernel in order to
     facilitate dependency on distro SUNRPC. Include fix to use correct
     ib_map_mr_sg signature from OFED4.8.
   - makefile: Added CONFIG_BNXT parameter to override kernel's default
   - compat-rdma.spec: Fixed QED firmware installation
   - configure: Fix typos introduced by
     b8520c5751cc6c40bb2f5ba8533a34dc260f86f2
   - configure: Fix typo
   - Merge branch 'master' of https://github.com/selvintxavier/compat-rdma
   - Fixed SRP support on RHEL7.2 by using blk_iopoll API in ib_core
   - bnxt_re: Add bnxt_re backports
   - bnxt_re: Enable bnxt_re building
   - Removed previously applied patch
     0057-qed-Fix-error-in-the-dcbx-app-meta-data.patch
   - Add qedr

2. Updated packages
   - rdma-core stable-v15
   - libfabric-1.5.2
   - fabtests-1.5.2
   - mstflint-4.8.0

3. ofed_scripts
   - Added support for rdma-core new packaging format
   - install.pl: Added support for using source RPM per Distro
   - install.pl: Enable nfsrdma on RHEL7.3 and SLES12 SP2
   - Fix syntax error after libbnxt_re-rdmav2
   - Add bnxt_re support
   - Add qedr

OFED-4.8 Main Changes from OFED 3.18-3 GA
-------------------------------------------------------------------------------
1. Added support for RHEL7.3 and SLES12 SP2

2. compat-rdma:
   - BACKPORT-ib_srp: synced with MLNX_OFED ib_srp backport
   - IB/srp: force reconnect_delay module param to exceed fast_io_fail_tmo
   - Final updates to ccl ABI between host and card to match OFED 4.8 and
     kernel 4.9.
   - Fix rstream timeouts on 4k_lat test
   - Support for Kernel 4.8 in QIB on OFED-4.8
   - compat-rdma: Add i40iw to openibd
   - Interop related fixes for iWarp drivers i40iw and nes
   - configure: Fix typo
   - compat: Cleanup auto-generated defines
   - iw_cxgb4: Guard against null cm_id in dump_ep/qp
   - Xeon Phi updates
   - Fixes for i40iw which have been included in kernels > 4.8
   - Updated 0003-add-the-ibp-client-and-server-drivers.patch

3. Updated packages:
   - compat-rdma-4.8 - based on linux-4.8
   - infiniband-diags-2.0.0
   - Removed libibmad (deprecated)
   - libfabric-1.4.2
   - fabtests-1.4.2
   - rdma-core v13
   - libibscif-1.1.3
   - rdma-core-12-1
   - dapl-2.1.10
   - fabtests-1.4.1rc1
   - infiniband-diags-1.6.7
   - libfabric-1.4.1rc1
   - libibmad-1.3.13
   - libibscif-1.1.2
   - mstflint-4.5.0-1.17.g8a0c39d
   - perftest-3.4-0.9.g98a9a17

4. install.pl
   - Disable inband support for mstflint

===============================================================================
3. Known Issues
===============================================================================
The following is a list of general limitations and known issues of the various
components of the OFED 4.8-1 release.

01. When upgrading from an earlier OFED version, the installation script does
    not stop the earlier OFED version prior to uninstalling it.
    Workaround: Stop the old OFED stack (/etc/init.d/openibd stop) before
    upgrading to OFED 4.8-1 or reboot the server after OFED installation.

02. Memory registration by the user is limited according to administrator
    setting. See "Pinning (Locking) User Memory Pages" in OFED_tips.txt for
    system configuration.

03. Fork support from kernel 2.6.12 and above is available provided that
    applications do not use threads. fork() is supported as long as the parent
    process does not run before the child exits or calls exec().
    The former can be achieved by calling wait(childpid), and the latter can be
    achieved by application specific means. The POSIX system() call is
    supported.

04. The qib driver is supported only on 64-bit platforms.

05. IPoIB: brctl utilities do not work on IPoIB interfaces. The reason is that
    these utilities support devices of type Ethernet only.

06. If uninstall fails, check the error log and remove the remaining RPMs
    manually using 'rpm -e '.

07. RDS is not supported.

08. Bug 2515: when an Intel HCA is attached directly to a Mellanox ConnectX3
    and the OpenSM is started on the Intel HCA, the link will not go to active.
    The workaround is to start the OpenSM on the Mellanox HCA.

09. Bug 2553: Unable to run UD traffic using qperf on a RoCE R-NIC

10. Bug 2561: Devices do not receive the echo reply of the first ping sequence
    when the payload size is 65493 or higher. This is still an issue in the
    4.3 kernel and should be resolved in a future OFED 4.x release.

11. Bug 2574: Installing OFED (with --with_xeon_phi and qib=y in ofed.conf
    file) makes the access to the scif char device fail.
    psmd uses scif functions to allow psm to connect to the Phi devices. It
    also makes sure it can connect to them when the service is started. Scif
    functions cannot be called unless the mpss service is running. Since
    openibd starts the psmd service, it should be loaded after mpss.

12. The ipath and eHCA drivers will be deprecated in the next OFED build that
    uses a kernel >= 4.3

13. Bug 2583: "iwpm_mapping_error_cb: Received msg seq = 381240 err code = 12
    client = 3" seen intermittently

14. Bug 2587: autoconf is needed by ibacm-1.1.0-1.x86_64

15. Bug 2596: Host crash related to mlx5 driver, ConnectIB HCAs, ib_srp, and
    performing IO

16. cmake and ninja packages are required to build rdma-core package.
    ninja binary RPMs can be downloaded from
    https://software.opensuse.org/package/ninja
    On SLES12 cmake-3.5 or higher is required. The working cmake version can
    be taken from SLES12 SP2 and installed on SLES12 SP0 or SP1.

17. Bug 2622: ib_qib cannot be loaded after installing OFED 4.8 on RHEL 7.3
    with stock kernel (3.10.0-514). modprobe fails with "Invalid argument".
    The kernel must be updated to 3.10.0-514.10.2 before installing OFED.

18. Bug 2619: IPoIB interface fails on RHEL 7.2 using FW release 12.18.2000
    with mlx5 MT_2180110032 adapter
    Issue: Trying to bring up the ib0 interface fails with
      mlx5_0:mlx5_ib_post_send: ib0: multicast join failed for
      ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
    Workaround: revert MT_2180110032 firmware from 12.18.2000 to 12.12.1100.

19. Bug 2628: rstream timeouts on 4k_lat test, qib driver

20. Bug 2644 - qedr: Changing Ethernet MTU prevents future RoCE traffic.
    Changing the MTU size of the Ethernet interface will stop and prevent RoCE
    traffic on that interface. To overcome this, reload the qedr driver:
      $ rmmod qedr
      $ modprobe qedr

21. Bug 2639 - ib_isert/ib_srpt report "Unknown symbol" after OFED-4.8-1
    installation:
    The ib_srpt and ib_isert kernel modules are not included in OFED-4.8-x.
    After OFED installation the in-box ib_core is replaced by the one that
    comes with OFED, so the in-box ib_srpt and ib_isert kernel modules fail
    to load over OFED's ib_core.
    To use an iSER or SRP target, recompile iSERT/SRPT over OFED.

22. Bug 2640 - openibd fails to start when the system is coming up:
    The in-box kernel modules are loaded from the initrd, so the initrd needs
    to be rebuilt:
      # dracut -f -v

23. Bug 2641 - IPoIB: Missing first ping after clearing ARP cache:
    This is a known issue by IPoIB design

24. Bug 2661 - rstream intermittent error: Connection refused (stale
    connection)
    1. Two QLogic/Cavium NICs back to back
    2. Command: rstream -S all -T a
       on the second connect in the test: Connection refused (stale connection)

Note: See the release notes of each component for additional issues.

===============================================================================
4. Generic Notes
===============================================================================
1. install.pl
   - The '--without-depcheck' parameter should be used when the required
     dependencies are satisfied using the 'make install' command and not in
     RPM form, so they cannot be verified by the install.pl script.
     Using '--without-depcheck' when a dependency is actually not present may
     lead to undesired behaviour at both compile time and run time.

2. ULP and Driver restrictions:
   - NVMe-oF (kernel 4.8 only)
   - NFS-RDMA (SLES12.2 and RH7.3 only).
   - i40iw 10GbE iWARP Adapter (kernel 4.8 only)
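
The "kernel 4.8 only" restrictions listed under ULP and Driver restrictions
can be checked before installation. The following is a minimal sketch and is
not part of OFED: the function names is_kernel_4_8 and
kernel_restricted_components are hypothetical, and the sketch assumes that
"kernel 4.8 only" covers any kernel release string in the 4.8 series. The
NFS-RDMA restriction depends on the distribution rather than on the kernel
series and is not covered here.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OFED): succeed when the given kernel
# release string belongs to the 4.8 series, e.g. "4.8", "4.8.0", "4.8.17-custom".
# Assumption: "kernel 4.8 only" in the notes means any 4.8.x release.
is_kernel_4_8() {
    case "$1" in
        4.8|4.8.*|4.8-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the kernel-restricted components from the
# Generic Notes that the given kernel release can use, or "none".
kernel_restricted_components() {
    if is_kernel_4_8 "$1"; then
        echo "NVMe-oF i40iw"
    else
        echo "none"
    fi
}
```

To check the running system, call the helper with the output of uname -r,
for example: kernel_restricted_components "$(uname -r)"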