Open Fabrics Enterprise Distribution (OFED)
- Version 1.4.1-rc4
+ Version 1.4.1-rc5
Release Notes
- April 2009
+ May 2009
===============================================================================
all of its nodes to this new version.
-1.1 OFED 1.4 Contents
+1.1 OFED 1.4.1 Contents
-----------------------
The OFED package contains the following components:
- OpenFabrics core and ULPs:
- Qlogic
- Flextronics
- Sun
+ - Mellanox
1.5 Third Party Packages
------------------------
- NFS/RDMA: In beta quality with backports for RHEL 5.2, 5.3 and SLES 10 SP2
- Updated MPI packages:
mvapich-1.1.0-3143
- Open MPI 1.3.1
-- Updated bonding package: ib-bonding-0.9.0-38
-- Updated DAPL: compat-dapl-1.2.13 and dapl-2.0.16
+ Open MPI 1.3.2
+- Updated bonding package: ib-bonding-0.9.0-40
+- Updated DAPL: compat-dapl-1.2.14 and dapl-2.0.19
- Updated opensm version to include critical bug fixes
- Fixed RDS iWARP support
- Low level drivers updated: ehca, mlx4, cxgb3, nes, ipath
Open Fabrics Enterprise Distribution (OFED)
- IPoIB in OFED 1.4 Release Notes
+ IPoIB in OFED 1.4.1 Release Notes
- December 2008
+ May 2009
===============================================================================
5. The ib-bonding driver
6. Bug Fixes and Enhancements Since OFED 1.3
7. Bug Fixes and Enhancements Since OFED 1.3.1
-8. Performance tuning
+8. Bug Fixes and Enhancements Since OFED 1.4
+9. Performance tuning
===============================================================================
1. Overview
11. The IPoIB module uses a Linux implementation for Large Receive Offload
(LRO) in kernel 2.6.24 and later. These kernels require installing the
"inet_lro" module.
+
+12. ConnectX only: If you have a port configured as ETH, and are running IPoIB
+ in connected mode -- and then change the port type to IB, the IPoIB mode
+ changes to datagram mode.
+
+13. When working with iSCSI, you must disable LRO (even if you are working in
+ connected mode). This is because there is a bug in older kernels which causes
+ a kernel panic.
+
+
===============================================================================
4. DHCP Support of IPoIB
Notes:
* Using /etc/infiniband/openib.conf to create a persistent configuration is
no longer supported
+* On RHEL4_U7, cannot set a slave interface as primary.
===============================================================================
- Bonding: Set default number of grat. ARP after failover to three (was one)
===============================================================================
-8. Performance tuning
+8. Bug Fixes and Enhancements Since OFED 1.4
+===============================================================================
+- Performance tuning is enabled by default for IPoIB CM.
+- Clear IPOIB_FLAG_ADMIN_UP if ipoib_open fails
+- disable napi while cq is being drained (bugzilla #1587)
+- rdma_cm: Use rate from ipoib broadcast when joining ipoib multicast
+ When joining IPoIB multicast group, use the same rate as in the broadcast
+ group. Otherwise, if rdma_cm creates this group before IPoIB does, it might get
+ a different rate. This will cause IPoIB to fail joining to the same group later
+ on, because IPoIB has a strict rate selection.
+- fix unprotected use of priv->broadcast in ipoib_mcast_join_task.
+- Do not join broadcast group if interface is brought down
+
+
+===============================================================================
+9. Performance tuning
===============================================================================
-- In IPoIB connected mode, the throughput of medium and large messages can be
- increased by setting the following TCP parameters as follows:
-
- /sbin/sysctl -w net.ipv4.tcp_timestamps=0
- /sbin/sysctl -w net.ipv4.tcp_sack=0
- /sbin/sysctl -w net.core.netdev_max_backlog=250000
- /sbin/sysctl -w net.core.rmem_max=16777216
- /sbin/sysctl -w net.core.wmem_max=16777216
- /sbin/sysctl -w net.core.rmem_default=16777216
- /sbin/sysctl -w net.core.wmem_default=16777216
- /sbin/sysctl -w net.core.optmem_max=16777216
- /sbin/sysctl -w net.ipv4.tcp_mem="16777216 16777216 16777216"
- /sbin/sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
- /sbin/sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
+When IPoIB is configured to run in connected mode, TCP parameter tuning is
+performed at driver startup to improve the throughput of medium and large
+messages.
+The driver startup scripts set the following TCP parameters:
+
+ net.ipv4.tcp_timestamps=0
+ net.ipv4.tcp_sack=0
+ net.core.netdev_max_backlog=250000
+ net.core.rmem_max=16777216
+ net.core.wmem_max=16777216
+ net.core.rmem_default=16777216
+ net.core.wmem_default=16777216
+ net.core.optmem_max=16777216
+ net.ipv4.tcp_mem="16777216 16777216 16777216"
+ net.ipv4.tcp_rmem="4096 87380 16777216"
+ net.ipv4.tcp_wmem="4096 65536 16777216"
+
+This tuning is effective only for connected mode. If you run in datagram mode,
+it actually reduces performance.
+
+If you change the IPoIB run mode to "datagram" while the driver is running,
+the tuned parameters do not get reset to their default values. We therefore
+recommend that you change the IPoIB mode only while the driver is down
+(by changing the line "SET_IPOIB_CM=yes" to "SET_IPOIB_CM=no" in the file
+/etc/infiniband/openib.conf, and then restarting the driver).
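
The check described in the performance tuning section can be sketched as a small shell script. This is an illustrative sketch only, not part of OFED: the interface name ib0 and the helper names ipoib_mode, compare_value and report are assumptions. It reads the IPoIB mode from sysfs and reports whether a few of the parameters listed above currently hold the tuned values.

```shell
#!/bin/sh
# Illustrative sketch, not shipped with OFED.
# Assumptions: the IPoIB interface is ib0; all helper names are hypothetical.

# Print the IPoIB mode recorded in a sysfs-style mode file, or "unknown"
# if the file cannot be read (for example, when the driver is down).
ipoib_mode() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo unknown
    fi
}

# Print "tuned" if the current value ($2) equals the value from the
# release notes ($1), otherwise "differs". Tabs and repeated spaces are
# collapsed first, because sysctl separates multi-value entries with tabs.
compare_value() {
    want=$(printf '%s' "$1" | tr -s ' \t' '  ')
    have=$(printf '%s' "$2" | tr -s ' \t' '  ')
    if [ "$want" = "$have" ]; then
        echo tuned
    else
        echo differs
    fi
}

# Report the mode of one interface and the state of selected parameters.
report() {
    echo "mode: $(ipoib_mode "/sys/class/net/$1/mode")"
    while read -r key want; do
        have=$(/sbin/sysctl -n "$key" 2>/dev/null) || have=unavailable
        echo "$key: $(compare_value "$want" "$have")"
    done <<EOF
net.ipv4.tcp_timestamps 0
net.ipv4.tcp_sack 0
net.core.rmem_max 16777216
net.core.wmem_max 16777216
net.ipv4.tcp_rmem 4096 87380 16777216
net.ipv4.tcp_wmem 4096 65536 16777216
EOF
}

report ib0
```

If the script reports "mode: datagram" while the parameters still show "tuned", the mode was probably changed while the driver was running, which is the situation the note above recommends avoiding.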
Open Fabrics Enterprise Distribution (OFED)
ConnectX driver (mlx4) in OFED 1.4 Release Notes
- December 2008
+ May 2009
===============================================================================
1. Overview
2. Supported Firmware Versions
3. VPI (Virtual Protocol Interconnect)
-4. Infiniband new features and bug fixes
-5. Known Issues
+4. Infiniband new features and bug fixes since OFED 1.3.1
+5. Infiniband (mlx4_ib) new features and bug fixes since OFED 1.4
+6. Eth (mlx4_en) new features and bug fixes since OFED 1.4
+7. Known Issues
===============================================================================
1. Overview
===============================================================================
- This release was tested with FW 2.6.000.
- The minimal version to use is 2.3.000.
-- To use both IB and Ethernet use FW version 2.6.0
+- To use both IB and Ethernet (VPI) use FW version 2.6.0
===============================================================================
3. VPI (Virtual Protocol Interconnect)
===============================================================================
-4. Infiniband new features and bug fixes
+4. Infiniband new features and bug fixes since OFED 1.3.1
===============================================================================
Features that are enabled with FW 2.5.0 only:
- Send with invalidate and Local invalidate send queue work requests.
===============================================================================
-5. Known Issues
+5. Infiniband new features and bug fixes since OFED 1.4
+===============================================================================
+- Enable setting 4K MTU for ConnectX ports.
+- Support optimized registration of huge-page-backed memory.
+ With this optimization, the number of MTT entries used is significantly
+ lower than for regular memory, so the HCA will access registered memory with
+ fewer cache misses and improved performance.
+ For more information on this topic, please refer to Linux documentation file:
+ Documentation/vm/hugetlbpage.txt
+- Do not enable blueflame sends if write combining is not available
+- Add write combining support for PPC64, and thus enable blueflame sends.
+- Unregister IB device before executing CLOSE_PORT.
+
+===============================================================================
+6. Eth (mlx4_en) new features and bug fixes since OFED 1.4
+===============================================================================
+- Yevgeni - ...
+
+===============================================================================
+7. Known Issues
===============================================================================
- mlx4_en driver is not supported on PPC64 and IA64
- The mlx4_en module uses a Linux implementation for Large Receive Offload
options mlx4_en parameter=<value>
mlx4_core parameters:
+ set_4k_mtu: attempt to set 4K MTU to all ConnectX ports (default 0)
msi_x: attempt to use MSI-X if nonzero (default 1)
enable_qos: Enable Quality of Service support in the HCA if > 0, (default 0)
block_loopback Block multicast loopback packets if > 0 (default: 1)
4. Known Issues
===============================================================================
+* In the very unlikely event that you get the following error message when
+ running mstflint:
+ Warning: memory access to device 0a:00.0 failed: Input/output error.
+ Warning: Fallback on IO: much slower, and unsafe if device in use.
+ *** buffer overflow detected ***: mstflint terminated
+
+ simply run "mst start" and then re-run mstflint.
Open Fabrics Enterprise Distribution (OFED)
- OSU MPI MVAPICH-1.1.0, in OFED 1.4.0 Release Notes
+ OSU MPI MVAPICH-1.1.0, in OFED 1.4.1 Release Notes
- December 2008
+ May 2009
===============================================================================
===============================================================================
5. Known Issues
===============================================================================
+- Shared memory broadcast optimization is disabled by default.
+
- MVAPICH MPI compiled on AMD x86_64 does not work with MVAPICH MPI compiled
on Intel X86_64 (EM64t).
Workaround:
Version: OpenSM 3.2.x
Repo: git://git.openfabrics.org/~sashak/management.git
-Date: Dec 2008
+Date: May 2009
1 Overview
----------
OpenSM prints list of "Invalid Cached Option" error messages.
This does not affect OpenSM functionality.
+* SMs do not hand-over when running on ConnectX in a switch-based topology.
+
3 Unsupported IB Compliance Statements
--------------------------------------
The following section lists all the IB compliance statements which
* Don't startup automatically on SuSE based systems
+* Discovery bug, where some ports were left unlinked (without remote side).
+
4.2 Other Bug Fixes
* opensm/osm_console.c: fix seg fault when running "portstatus ca" in
* Other less critical or visible bugs were also fixed.
+* opensm: update LFTs when entering master
+
+* opensm: invalidate routing cache when entering master state
+
+* opensm/osm_port_info_rcv.c: don't clear sw->need_update if port 0 is active
+
+
5 Main Verification Flows
-------------------------
Open Fabrics Enterprise Distribution (OFED)
- SDP in OFED 1.4 Release Notes
+ SDP in OFED 1.4.1 Release Notes
- December 2008
+ May 2009
Table of Contents
===============================================================================
1. Overview
-2. Bug Fixes and Enhancements
-3. Known Issues
-4. Verification Applications/Flows/Tests
+2. Bug Fixes and Enhancements since OFED 1.3
+3. Bug Fixes and Enhancements since OFED 1.4
+4. Known Issues
+5. Verification Applications/Flows/Tests
===============================================================================
1. Overview
===============================================================================
-SDP in OFED is at GA level for OFED 1.4.
+SDP in OFED is at GA level for OFED 1.4.1
===============================================================================
-2. Bug Fixes and Enhancements
+2. Bug Fixes and Enhancements since OFED 1.3
===============================================================================
* Cleanup
- Compilation warnings
- Having now full windows interoperability.
+===============================================================================
+3. Bug Fixes and Enhancements since OFED 1.4
+===============================================================================
+SDP:
+- BUG1311 - Netpipe fails with an IB_WC_LOC_LEN_ERR.
+- BUG1472 - clean socket timeouts and refcount when device is removed
+- BUG1502 - scheduling while atomic
+- BUG1309 - SDP close is slow + fix recv buffer initial size setting
+- BUG1087 - fixed recovery from failing rdma_create_qp()
+
===============================================================================
3. Known Issues
===============================================================================
- TCP allows connecting to IP_ANY - 0.0.0.0 (as a destination address!). SDP
does not allow - and will reject the connection.
-- BUG1309 - sometimes SDP close connection takes longer than TCP close.
-
-- BUG1256 - libsdp does not support epoll
-
-- BUG1087 - sometimes libsdp does not recover well when host is running out of QPs.
-
- Each SDP socket currently consumes up to 2 MBytes of memory. If this value
is high for your installation, it is possible to trade off performance
for lower memory utilization per socket by reducing the value of the
- Various Java client server applications (SUN:jre, BEA:jrockit/WebLogic, GNU:gij/gcj)
- Many UNIX utilities to verify that pre-load did not harm the applications
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Open Fabrics Enterprise Distribution (OFED)
- SDP in OFED 1.4 Release Notes
-
- December 2008
-
-
-
-===============================================================================
-Table of Contents
-===============================================================================
-1. Overview
-2. Bug Fixes and Enhancements
-3. Known Issues
-4. Verification Applications/Flows/Tests
-
-===============================================================================
-1. Overview
-===============================================================================
-SDP in OFED is at GA level for OFED 1.3.
-
-
-===============================================================================
-2. Bug Fixes and Enhancements
-===============================================================================
-* Fixes for SDP specification compliance
- - OOB data not marked as solicited (bug 596)
- - DisConn, ChRcvBuf, ChRcvBufAck marked solicited (bug 644)
- - Do not send DisConn if only 1 credit (bug 646)
- - Validate ChRcvBuf range (bug 647)
-
-* Cleanup
- - Compilation warnings
- - New kernel support
-
-* New function
- - SIOCOUTQ ioctl support
- - Add keepalive support
- - New /sys options: sdp_keepalive_probes_sent, sdp_keepalive_time
- - New options: SOCK_KEEPALIVE, TCP_KEEPIDLE
- - Add Zero copy bcopy support (bzcopy)
- - New /sys option: sdp_zcopy_thresh
-
-* Bugs fixed
- - Resize buffers if out of credits (bug 556)
- - Resize using skb_put (bug 620)
- - Move to accept queue on RTU drop and DREQ (bug 645)
- - Modify memory allocation to support in kernel users
- - Fix reference count but that prevents driver unload
- - connect() now allows AF_INET_SDP and AF_INET (bug 294)
- - poll() always returns POLLOUT on non-blocking socket (bug 829)
- - Executing netperf with TCP_CORK never ends (bug 837)
-
-
-===============================================================================
-3. Known Issues
-===============================================================================
-- Each SDP socket currently consumes up to 2 MBytes of memory. If this value
- is high for your installation, it is possible to trade off performance
- for lower memory utilization per socket by reducing the value of the
- "rcvbuf_scale" module parameter (default: 16).
-
- Note: the minimum legal value for this parameter is 1.
- At this parameter value, each socket will consume approximately 128 KBytes.
-
-- Small message size performance is low when messages are sent by client
- at a rate lower than the rate at which they are consumed by server,
- and when TCP_CORK is not set. This is observed, for example, with iperf
- benchmark. As a workaround, set the TCP_CORK socket option
- to ensure data is sent in at least 32K byte chunks.
-
-- Performance is low on 32-bit kernels, as SDP utilizes high memory
- to ease memory pressure. Moving to a 64-bit kernel solves this
- problem even if the application remains a 32-bit one.
-
-- By default, SDP utilizes a 2 Kbyte MTU size. This may cause PCI-X cards
- using Mellanox Technologies "Infinihost" HCAs to experience low bandwidth.
- Workaround: reset the MTU size to 1K in this situation, using either of
- the two methods below:
-
- 1. Activate the "tavor quirk" workaround in opensm:
- a. Create an opensm options cache file (/var/cache/osm/opensm.opts):
- > opensm --cache-options -o
- b. Add the following line to /var/cache/osm/opensm.opts:
- enable_quirks TRUE
- c. Rerun opensm using your usual command line options to activate
- the opensm quirk option.
-
- 2. Activate the "tavor quirk" workaround in cma:
- set the tavor_quirk module parameter of the rdma_cm module to value 1
- (default: 0).
-
-- The new BZCOPY mode is only effective for large block transfers.
- By setting the /sys parameter 'sdp_zcopy_thresh' to a non-zero value, a
- non-standard SDP speedup is enabled. All messages longer than
- 'sdp_zcopy_thresh' bytes in length will cause the user space buffer to
- be pinned and the data sent directly from the original buffer. This
- results in less CPU use and, on many systems, much better bandwidth.
- The default 64K value for 'sdp_zcopy_thresh' is sometimes too low for
- some systems. You must experiment with your hardware to select the
- best value.
-
-- Windows interoperability
- The Windows version of SDP does not support resizing buffers using the
- standard protocol messages. There will sometimes be inter-operability
- problems for this reason.
-
-===============================================================================
-4. Verification Applications/Flows/Tests
-===============================================================================
-See the corresponding section in the SDP release notes above.
-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Open Fabrics Enterprise Distribution (OFED)
- libsdp v. 9382 in OFED 1.4 Release Notes
+ libsdp v. 9382 in OFED 1.4.1 Release Notes
- December 2008
+ May 2009
===============================================================================
1. Overview
2. New Features
3. Bug Fixes
-4. Known Issues
-5. Verification Applications/Flows/Tests
+4. Bug Fixes and Enhancements since OFED 1.4
+5. Known Issues
+6. Verification Applications/Flows/Tests
===============================================================================
1. Overview
* Add libsdp-devel sub-package
-
===============================================================================
-3 Bug Fixes
+3. Bug Fixes
===============================================================================
The following list of bugs were fixed. Note that other less critical
or visible bugs were also fixed.
returning -1.
===============================================================================
-4. Known Issues
+4. Bug Fixes and Enhancements since OFED 1.4
+===============================================================================
+libsdp:
+* Enable building libsdp on Solaris
+* BUG1256 - Add epoll support
+
+sdpnetstat:
+* BUG1513 - sdpnetstat is not showing all the listening processes on ipv6 sockets.
+
+===============================================================================
+5. Known Issues
===============================================================================
* libsdp cannot provide its socket switch functionality for executables
statically linked with libc.
===============================================================================
-5. Verification Applications/Flows/Tests
+6. Verification Applications/Flows/Tests
===============================================================================
See the corresponding section in the SDP release notes above.