Sean Hefty [Fri, 5 Feb 2010 22:25:15 +0000 (14:25 -0800)]
librdmacm: transition QP to RTS before sending reply
In order to handle a race condition where the passive side of
a connection can receive data on a QP before the connection established
event has been received, transition the QP to RTS before sending the reply.
This allows a user to send a response to any received message immediately,
rather than waiting until the connection established event has been
processed.
A similar fix was applied to the kernel rdma_cm a while ago.
Simply duplicate the fix in the user space library.
Or Gerlitz [Mon, 9 Nov 2009 22:46:20 +0000 (14:46 -0800)]
librdmacm/mckey: enforce local binding for unmapped multicast address
Enforce that mckey is bound to a local address when specifying
an unmapped multicast address. Otherwise, mckey crashes when
attempting to use cmd_id->verbs pointer.
Update documentation on using unmapped MGIDs.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Jason Gunthorpe [Wed, 21 Oct 2009 15:02:43 +0000 (08:02 -0700)]
librdmacm: returns errors from the library consistently
Remove the return of -errno and always return codes via errno.
The librdmacm documentation already states that these libraries
return -1 to indicate that the error code is in errno.
Update rping to show correct error reporting methodology.
Also fix errant return of 0 if the read/write syscalls return 0.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
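The convention described above can be sketched as follows. This is only an
illustration, not the library's code: err_ret() and write_cmd() are
hypothetical names, and ENODATA for a short write is this sketch's choice.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: store the code in errno and return -1. */
static int err_ret(int code)
{
	errno = code;
	return -1;
}

/*
 * Hypothetical wrapper around write(): succeed only if the whole
 * buffer was written. A short write (including a 0 return from the
 * syscall) is reported as an error rather than leaking out as 0.
 */
static int write_cmd(int fd, const void *buf, size_t len)
{
	ssize_t ret = write(fd, buf, len);

	if (ret < 0)
		return -1;			/* errno already set by write() */
	if ((size_t) ret != len)
		return err_ret(ENODATA);	/* assumed error code */
	return 0;
}
```

A caller then tests for -1 and reads errno (for example with perror())
instead of negating the return value.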
Sean Hefty [Mon, 19 Oct 2009 14:21:42 +0000 (07:21 -0700)]
librdmacm: fix race initializing library
Multi-threaded code can race with ucma_init(). When ucma_init() is called
it can set cma_dev_cnt before it is done initializing the library. If
a second thread checks cma_dev_cnt and finds it non-zero, then it will
skip the call to ucma_init() and assume that the library is ready for use.
This can lead to an application crash during startup.
Do not set cma_dev_cnt until the end of ucma_init(), after all
initialization has completed. Adjust the error handling code in
ucma_init() accordingly.
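A minimal sketch of the "publish the count last" pattern, assuming a mutex
around initialization. All names here (lib_init, dev_cnt, dev_table,
discover_devices) are hypothetical stand-ins, and the C11 atomics are an
addition of this sketch, not something taken from the library.

```c
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t init_mut = PTHREAD_MUTEX_INITIALIZER;
static atomic_int dev_cnt;	/* non-zero means "fully initialized" */
static int dev_table[4];	/* stand-in for per-device state */

/* Stand-in for device discovery: fill the table, return the count. */
static int discover_devices(void)
{
	int i;

	for (i = 0; i < 4; i++)
		dev_table[i] = i + 1;
	return 4;
}

static int lib_init(void)
{
	int cnt, ret = 0;

	/* Fast path: safe only because dev_cnt is written last. */
	if (atomic_load_explicit(&dev_cnt, memory_order_acquire))
		return 0;

	pthread_mutex_lock(&init_mut);
	if (!atomic_load_explicit(&dev_cnt, memory_order_relaxed)) {
		cnt = discover_devices();
		if (cnt <= 0)
			ret = -1;	/* leave dev_cnt at 0 on failure */
		else
			/* Publish only after all state is set up. */
			atomic_store_explicit(&dev_cnt, cnt,
					      memory_order_release);
	}
	pthread_mutex_unlock(&init_mut);
	return ret;
}
```

A second thread that sees a non-zero dev_cnt is then guaranteed to also
see the filled-in dev_table.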
Sean Hefty [Wed, 14 Jan 2009 22:31:34 +0000 (14:31 -0800)]
cmatose: avoid missing completions
cmatose uses a single CQ for send and receive completions. It then
counts completions to determine if all sends and receives are done.
It's possible for a receive completion to be polled when the intent
is to count send completions. (See server side polling for sends
done, then receives done. The poll will get up to 8 completions,
which can lead to sends and receives being polled together.)
Fix this by separating the send and receive completions into their
own CQs to avoid any issues knowing what type of completions have been
polled from the CQ.
Or Gerlitz [Wed, 2 Jul 2008 18:53:49 +0000 (11:53 -0700)]
librdmacm: implement address change event
The RDMA_CM_EVENT_ADDR_CHANGE event can be used by librdmacm consumers
that wish to have their RDMA sessions always use the same links
(e.g. <hca/port>) as the IP stack does. In the current code, this
does not happen when bonding is used and a fail-over has occurred
while the IB link used by an already existing session is still
operating fine.
The kernel rdma-cm code was enhanced to use netevent notification
to sense that a change has happened in the IP stack, and to deliver
this event for any ID that is misaligned in that respect with the IP
stack. The user can act on the event or just ignore it.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Thu, 22 May 2008 14:35:13 +0000 (07:35 -0700)]
librdmacm: fix license text
The license text references a third software license
that was inadvertently copied in. Update the license to match that used
by openfabrics. This update was based on a request from HP.
Sean Hefty [Wed, 20 Feb 2008 16:53:19 +0000 (08:53 -0800)]
librdmacm: add rdma_migrate_id
This is based on user feedback from Doug Ledford at RedHat:
Events that occur on an rdma_cm_id are reported to userspace through
an event channel. Connection request events are reported
on the event channel associated with the listen. When the
connection is accepted, a new rdma_cm_id is created and automatically
uses the listen event channel. This is suboptimal where the user
only wants listen events on that channel.
Additionally, it may be desirable to have events related to
connection establishment use a different event channel than those
related to already established connections.
Allow the user to migrate an rdma_cm_id between event channels.
Roland Dreier [Mon, 21 Jan 2008 18:21:44 +0000 (10:21 -0800)]
Update %install section of librdmacm spec file
Change from using the %makeinstall macro to using "make install"
directly. The page <http://fedoraproject.org/wiki/Packaging/Guidelines>
has this to say:
"Fedora's RPM includes a %makeinstall macro but it must NOT be used
when make install DESTDIR=%{buildroot} works. %makeinstall is a
kludge....
It is error-prone and can have unexpected effects....
It can trigger unnecessary and wrong rebuilds....
....it can cause broken *.la files to be installed....
Instead, Fedora packages should use: make DESTDIR=%{buildroot}
install or make DESTDIR=$RPM_BUILD_ROOT install"
The librdmacm package uses automake, which means that the "make
DESTDIR=... install" method works fine, so we should use it.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Roland Dreier [Mon, 21 Jan 2008 18:21:41 +0000 (10:21 -0800)]
Updated License: field in librdmacm spec file
Update License: field to match the exact format given in
http://fedoraproject.org/wiki/Packaging/LicensingGuidelines
for a package available under a choice of GPL or BSD license.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Thu, 3 Jan 2008 23:36:14 +0000 (15:36 -0800)]
librdmacm/cm: override default responder_resources with user value
By default, the responder_resources parameter is set to that received
in a connection request. The passive side may override this value
when accepting the connection. Use the value provided by the passive
side when transitioning the QP to RTR state, rather than the value
given in the connect request. Without this change, the RTR transition
may fail if the passive side supports fewer responder_resources than
that in the request.
Sean Hefty [Tue, 6 Nov 2007 19:07:55 +0000 (11:07 -0800)]
librdmacm/man: fix-up man pages
Fix a couple of errors in the man page documentation and add
infiniband specific text about QP configuration settings. This
is in response to user questions about various settings based
on feedback from Or.
Sean Hefty [Tue, 16 Oct 2007 21:59:21 +0000 (14:59 -0700)]
librdmacm/cma: provide sanity checks for max outstanding rdma ops
Ensure that the responder_resources and initiator_depth values
provided by the user are supported by the local hardware. This
traps errors sooner during connection establishment (when calling
rdma_connect), rather than waiting until the modify QP fails
(after calling rdma_accept).
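The check described above amounts to comparing the user's values against
cached device limits. A minimal sketch, assuming hypothetical structures
(dev_limits, conn_req) and a hypothetical helper name; the real library
structures and field names may differ.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-device limits, e.g. cached from a device query. */
struct dev_limits {
	uint8_t max_responder_resources;
	uint8_t max_initiator_depth;
};

/* Hypothetical subset of the user's connection parameters. */
struct conn_req {
	uint8_t responder_resources;
	uint8_t initiator_depth;
};

/* Return 0 if the request fits the device, else -1 with errno set. */
static int check_conn_limits(const struct dev_limits *dev,
			     const struct conn_req *req)
{
	if (req->responder_resources > dev->max_responder_resources ||
	    req->initiator_depth > dev->max_initiator_depth) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}
```

Running such a check before any message is sent reports the problem to
the caller directly, instead of as a later QP modify failure.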
Sean Hefty [Tue, 16 Oct 2007 16:47:22 +0000 (09:47 -0700)]
librdmacm/man: update man pages to clarify connection request params
Document connection requests parameters in rdma_connect(),
rdma_accept(), and rdma_get_cm_event(), specifically regarding
initiator_depth and responder_resources.
The private_data_len on the receive side is the size of the data
buffer, and not the size of the private data sent by the remote side.
For IB, the size of the sent data is not known, so private_data_len
is the size of the private data field carried in the CM message.
Dotan Barak [Tue, 14 Aug 2007 16:04:05 +0000 (09:04 -0700)]
librdmacm: Fix memory leak reported by valgrind
Fix memory leak reported by valgrind:
==6239== 16 bytes in 1 blocks are definitely lost in loss record 2 of 10
==6239== at 0x4A04CBF: calloc (vg_replace_malloc.c:279)
==6239== by 0x4E386C4: ibv_get_device_list@@IBVERBS_1.1 (device.c:65)
==6239== by 0x4C2D868: ucma_init (cma.c:221)
==6239== by 0x4C2F831: rdma_create_event_channel (cma.c:299)
==6239== by 0x4018A5: main (cmatose.c:650)
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Fri, 20 Apr 2007 22:16:07 +0000 (15:16 -0700)]
librdmacm: update datagram tests to abort if msg size > MTU
If the user specifies a message size for mckey or udaddy tests that's
larger than the active MTU of the bound port, the sends will fail.
Detect this condition and abort the test if the message size is >
the active MTU of the port.
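A sketch of the size check, assuming the verbs MTU encoding in which the
enum values 1 through 5 stand for 256 through 4096 bytes. The enum is
mirrored locally under hypothetical names so the example builds without
libibverbs; it is not the test programs' actual code.

```c
#include <stddef.h>

/* Local mirror of the verbs MTU encoding (assumed: 1 = 256 bytes). */
enum sketch_mtu {
	SKETCH_MTU_256  = 1,
	SKETCH_MTU_512  = 2,
	SKETCH_MTU_1024 = 3,
	SKETCH_MTU_2048 = 4,
	SKETCH_MTU_4096 = 5
};

/* Convert the encoded MTU to bytes: 256 << (mtu - 1). */
static size_t mtu_to_bytes(enum sketch_mtu mtu)
{
	return (size_t) 1 << (mtu + 7);
}

/* Return non-zero if a datagram of msg_size fits in the active MTU. */
static int msg_fits_mtu(size_t msg_size, enum sketch_mtu active_mtu)
{
	return msg_size <= mtu_to_bytes(active_mtu);
}
```

A test program would query the port, call msg_fits_mtu() with the
requested message size, and abort with an error if it returns 0.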
Sean Hefty [Fri, 6 Apr 2007 17:14:02 +0000 (10:14 -0700)]
RDMA/cma: fix 32-bit user / 64-bit kernel mismatch issue
A 64-bit kernel will pad the size of the event structure to a
multiple of 8-bytes. When using a 32-bit userspace library, the structure is left
aligned to a 4-byte boundary. This results in the userspace event structure
being too small because of the padding. Fix this by increasing the padding
at the end of the userspace event structure.
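The size mismatch is plain round-up arithmetic. The sketch below assumes a
hypothetical structure with 20 bytes of members; the real event structure's
size is not given in this message.

```c
#include <stddef.h>

/*
 * Round size up to the next multiple of align.
 * align must be a power of two.
 */
static size_t padded_size(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}
```

With the assumed 20 bytes of members, padded_size(20, 8) gives 24 for an
ABI that aligns the structure to 8 bytes, while padded_size(20, 4) gives
20 for one that aligns it to 4. A 24-byte copy from the kernel would then
overrun a 20-byte userspace buffer unless the structure carries explicit
trailing padding.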
Sean Hefty [Wed, 31 Jan 2007 01:37:12 +0000 (17:37 -0800)]
Update mckey test program to join specific multicast groups.
Add options to mckey to allow a user to specify a given multicast group.
This allows the user to join a group using IP address 0, get back the
actual group MGID that was created for the user, and join that group from
a separate copy of mckey.
Sean Hefty [Fri, 26 Jan 2007 18:21:17 +0000 (10:21 -0800)]
Allow unicast traffic over IPOIB port space.
Adjust the RDMA_PS_IPOIB to allow unicast traffic. This requires
changing how QPs are initialized in order to get the correct qkey
to use. We need to call into the kernel to get the initial QP
attributes.
Update the udaddy unicast test program to test this capability.
Sean Hefty [Wed, 24 Jan 2007 00:11:45 +0000 (16:11 -0800)]
Add support to join IPOIB multicast groups
Add to the librdmacm an IPOIB port space that allows interoperability with
IPOIB multicast traffic. Use of the RDMA_PS_IPOIB is limited to multicast
join/leave.
Rename the RDMA_UD_QKEY to RDMA_UDP_QKEY to signify that the qkey is only
used with the RDMA_PS_UDP port space. Update mckey test program based on
patch from Or Gerlitz.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Mon, 8 Jan 2007 22:27:43 +0000 (14:27 -0800)]
librdmacm: return 0 if rdma_leave_multicast is successful.
Return 0 if leave multicast succeeds, rather than the size of the write.
Update mckey test program to call rdma_leave_multicast to verify its
operation.
Steve Wise [Fri, 15 Dec 2006 22:56:06 +0000 (16:56 -0600)]
librdmacm: pass back the status or errno in RDMA CM events
The librdmacm code isn't passing back the errno in all events.
For example, if a connection request times out the kernel CMA will pass
up event RDMA_CM_EVENT_UNREACHABLE with the status set to -ETIMEDOUT.
This errno isn't currently passed back to the librdmacm user in the event.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Tue, 5 Dec 2006 03:17:44 +0000 (19:17 -0800)]
Update librdmacm to support ABI version 3
rdma_ucm ABI 3 adds support for reporting connection information with
connect events and UDP port space support. Update the librdmacm to
take advantage of these features, and update test programs as well.
Sean Hefty [Tue, 5 Sep 2006 19:32:38 +0000 (19:32 +0000)]
r9272: Update documentation for rdma_destroy_id to call out that it will cancel
any outstanding asynchronous operation on the id. Suggested clarification
by Or.
Sean Hefty [Tue, 29 Aug 2006 22:03:47 +0000 (22:03 +0000)]
r9183: Need to poll both send and receive completions from the CQ on the passive side
of the connection. Without polling the sends first, we exit before receiving
all replies.
Sean Hefty [Thu, 24 Aug 2006 17:36:24 +0000 (17:36 +0000)]
r9105: Older versions of Linux do not have the misc device. Backport patches create
the RDMA CM abi_version file under /sys/class/infiniband_ucma, so we'll
automatically look there for the abi_version if we don't find it in its normal
location.
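The fallback lookup can be sketched as trying a list of candidate files in
order. read_abi_version() is a hypothetical helper, not the library's
function; the caller supplies the paths.

```c
#include <stdio.h>

/*
 * Return the ABI version read from the first candidate file that
 * exists and starts with an integer, or -1 if none does.
 */
static int read_abi_version(const char *const paths[], size_t npaths)
{
	size_t i;

	for (i = 0; i < npaths; i++) {
		FILE *f = fopen(paths[i], "r");
		int ver, ok;

		if (!f)
			continue;	/* not here, try the next location */
		ok = (fscanf(f, "%d", &ver) == 1);
		fclose(f);
		if (ok)
			return ver;
	}
	return -1;
}
```

A caller would list the normal location first (assumed here to be
/sys/class/misc/rdma_cm/abi_version, which this message does not state)
and then /sys/class/infiniband_ucma/abi_version as the backport fallback.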