Roland Dreier [Wed, 15 Dec 2010 20:29:27 +0000 (12:29 -0800)]
librdmacm: Add a check for libpthread during librdmacm configure.
Add a check for libpthread during librdmacm configure. This will add
libpthread to the list of libraries that librdmacm is linked to.
Currently librdmacm gets libpthread implicitly through libibverbs, but
this breaks when using a linker that does not implicitly link with such
dependencies; e.g. the new gold linker is such a linker:
Sean Hefty [Mon, 6 Dec 2010 21:17:03 +0000 (13:17 -0800)]
librdmacm: Support RAI_NUMERICHOST and no delay options
Add support similar to getaddrinfo AI_NUMERICHOST. This
indicates that lengthy address resolution protocols should
not be used. Also allow a caller of rdma_getaddrinfo to
indicate that lengthy route resolution protocols should not
be used.
Since rdma_getaddrinfo is a synchronous call, this allows a
user to obtain locally available data only without long
delays that may block an application thread. Callers can then
use the asynchronous librdmacm calls to complete any missing
information.
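The flag is modeled on AI_NUMERICHOST, so the standard getaddrinfo
behavior is the easiest way to see what "no lengthy resolution" means.
The sketch below only uses getaddrinfo from libc; it does not call
rdma_getaddrinfo, and the helper name resolve_numeric_only is
hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve `node` only if it is already a numeric
 * address, so that no name-service lookup can block the calling thread.
 * Returns 0 on success, or a getaddrinfo() EAI_* error code. */
static int resolve_numeric_only(const char *node, const char *service)
{
    struct addrinfo hints, *res = NULL;
    int ret;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;    /* refuse host names outright */

    ret = getaddrinfo(node, service, &hints, &res);
    if (ret == 0)
        freeaddrinfo(res);
    return ret;
}
```

A numeric string such as "127.0.0.1" resolves immediately, while a host
name such as "localhost" is rejected instead of being looked up.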
Sean Hefty [Thu, 2 Dec 2010 18:15:11 +0000 (10:15 -0800)]
librdmacm: support non-default ACM port number
By default, ACM uses port 6125. The actual port number
used is now published in /var/run/ibacm.port. Attempt to
obtain the correct port number from here, and if that fails
revert to using the default port number of 6125.
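A minimal sketch of the lookup-with-fallback described above. It
assumes the port file holds a single decimal number, which the commit
text does not state; the helper name acm_read_port is hypothetical and
this is not the library's actual code.

```c
#include <stdio.h>

#define ACM_DEFAULT_PORT 6125   /* default port named in the commit text */

/* Hypothetical helper: read a port number from `path`. Falls back to the
 * default when the file is missing, unparsable, or out of range.
 * Assumption: the file contains one decimal integer. */
static int acm_read_port(const char *path)
{
    FILE *f = fopen(path, "r");
    int port = 0;
    int parsed;

    if (!f)
        return ACM_DEFAULT_PORT;

    parsed = fscanf(f, "%d", &port);
    fclose(f);

    if (parsed != 1 || port <= 0 || port > 65535)
        return ACM_DEFAULT_PORT;
    return port;
}
```

A caller would pass "/var/run/ibacm.port" and use the returned value
when connecting to the service.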
Sean Hefty [Mon, 1 Nov 2010 18:12:13 +0000 (11:12 -0700)]
librdmacm/rping: Make sure CQ event thread exits before destroying the CQ
It is possible for the CQ event thread to poll the CQ after it has been
destroyed which can result in a seg fault on T3 interfaces. This patch
waits for the thread to exit before destroying the CQ.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Jonathan Rosser [Mon, 1 Nov 2010 16:38:22 +0000 (09:38 -0700)]
librdmacm: fix make install
make install fails if the include files in the install prefix
include/rdma,infiniband already exist. install claims that the <src>
and <destination> file are the same and exits with an error.
This patch modifies Makefile.am so that the rdma and infiniband include
files explicitly reference the source directory rather than the build
directory.
Also, EXTRA_DIST now only lists files that are not referenced anywhere
else in Makefile.am
Signed-off-by: Jonathan Rosser <jrosser@rd.bbc.co.uk>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Jason Gunthorpe [Mon, 1 Nov 2010 16:03:47 +0000 (09:03 -0700)]
librdmacm: Fix autotools to include the necessary M4 files
Otherwise running autogen.sh with a new version of autotools and then
building on a system with an older version tends to explode.
Unfortunately this is sometimes necessary since the new version is
required by the package.
This is how GNU envisions this mess works at least..
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Mon, 18 Oct 2010 16:53:53 +0000 (09:53 -0700)]
librdmacm: do not modify qp_init_attr in rdma_get_request
rdma_create_qp modifies the qp_init_attr structure passed in
by the user to return the actual QP capabilities that were
allocated. If the qp_init_attr does not specify CQs, the
librdmacm will allocate CQs for the user and return them.
rdma_get_request will allocate a QP off a newly connected rdma_cm_id
if the corresponding listen request is associated with a
qp_init_attr structure. The librdmacm passes the
listen->qp_init_attr structure into the rdma_create_qp call.
rdma_create_qp ends up modifying the qp_init_attr associated
with the listen. The result is that future calls to
rdma_get_request will use the modified qp attributes, rather
than those specified by the user.
Fix this by having rdma_get_request pass in a copy of the
qp_init_attr, rather than modifying those associated with the
listen. Also update the man page for rdma_create_qp to indicate
that the qp_init_attr structure may be modified on output.
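The fix can be illustrated in miniature. Everything below is a
hypothetical stand-in (the *_sketch names and the trimmed-down
structures are not the real struct ibv_qp_init_attr or rdma_cm_id); it
only shows why passing a copy keeps the user's original request intact.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the QP init attributes. */
struct qp_attr_sketch {
    unsigned max_send_wr;
    void *send_cq;              /* NULL asks the callee to allocate a CQ */
};

/* Hypothetical stand-in for the listening id and its saved attributes. */
struct listen_sketch {
    struct qp_attr_sketch qp_init_attr;   /* what the user asked for */
};

static int dummy_cq;            /* placeholder for an allocated CQ */

/* Mimics a create call that reports what it really allocated by writing
 * the results back into *attr. */
static void create_qp_sketch(struct qp_attr_sketch *attr)
{
    if (attr->max_send_wr < 16)
        attr->max_send_wr = 16;  /* pretend the device rounds up */
    if (!attr->send_cq)
        attr->send_cq = &dummy_cq;
}

/* The fix in miniature: hand the callee a copy, so the attributes stored
 * with the listen are never modified. Returns the send queue depth that
 * was actually allocated. */
static unsigned get_request_sketch(const struct listen_sketch *listen)
{
    struct qp_attr_sketch attr = listen->qp_init_attr;  /* struct copy */

    create_qp_sketch(&attr);
    return attr.max_send_wr;
}
```

Each call starts from the attributes the user originally supplied, no
matter how many requests were handled before it.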
Sean Hefty [Mon, 4 Oct 2010 23:35:25 +0000 (16:35 -0700)]
librdmacm: only allocate qp in rdma_create_ep if qp_attr provided
The comments and documentation for rdma_create_ep indicate that
it will only allocate a QP if initial QP attributes are provided.
However, the code always attempts to create a QP off an associated
active rdma_cm_id endpoint.
By _not_ allocating the QP, this allows a user to first determine
what RDMA device a rdma_cm_id was associated with. The user can
then create a QP that references an existing CQ, SRQ, or PD.
Sean Hefty [Wed, 18 Aug 2010 19:39:58 +0000 (12:39 -0700)]
librdmacm: expand support for hints to rdma_getaddrinfo
If a user passes hints into rdma_getaddrinfo, they can
specify resolved source and destination addresses. In this
case, there's no need for the user to specify the node or
service parameters. This differs from getaddrinfo, which
indicates that either node or service must be provided, but
is useful if rdma_getaddrinfo is being used to obtain
routing data.
Supporting this option allows the librdmacm to call
rdma_getaddrinfo internally from rdma_resolve_route when IB ACM
is enabled.
In addition to specifying the source and destination addresses
as part of the hints, a user could instead specify partial
routing data and rdma_getaddrinfo can resolve the full route.
This helps to support MPI applications that exchange endpoint
data, such as LIDs, out of band, but require SL data from the
SA to avoid potential deadlock conditions.
Sean Hefty [Tue, 20 Jul 2010 21:07:03 +0000 (14:07 -0700)]
librdmacm: fix all calls to set errno
The librdmacm documentation (rdma_cm.7 man page) specifies that librdmacm
functions return 0 on success and -1 on error, with errno set correctly.
Update places in the code where errno is not set correctly and
clean up setting the error code.
Sean Hefty [Tue, 20 Jul 2010 22:24:55 +0000 (15:24 -0700)]
librdmacm: convert ibv error values to errno
The librdmacm documentation (rdma_cm.7 man page) specifies that librdmacm
functions return 0 on success and -1 on error, with errno set correctly.
The libibverbs abstractions simply pass the libibverbs return codes
through to the user. Since libibverbs may return errno directly through
a given call, convert the return status to use errno where appropriate.
This fixes an issue with rdma_get_send_comp and rdma_get_recv_comp, where
a return value of 1 could indicate both success (1 completed request
returned) and failure (EPERM error).
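A minimal sketch of the convention described in the two errno commits
above, assuming the wrapped call returns 0 on success or a positive
errno value on failure. The helper names are hypothetical and this is
not presented as the library's implementation.

```c
#include <errno.h>

/* Hypothetical helper: convert a "0 or positive errno" return value
 * into the documented "0 on success, -1 with errno set" convention. */
static int to_errno_convention(int ret)
{
    if (ret) {
        errno = ret;
        ret = -1;
    }
    return ret;
}

/* Hypothetical helper for a call that reports a completion count on
 * success. A count of 1 and EPERM (numerically 1 on Linux) can no
 * longer be confused, because every failure surfaces as -1. */
static int completion_result(int count, int err)
{
    if (err)
        return to_errno_convention(err);
    return count;
}
```

With this shape a caller only has to test for a negative return and
then read errno.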
Sean Hefty [Mon, 17 May 2010 21:37:30 +0000 (14:37 -0700)]
librdmacm: define struct ibv_path_record if not defined
librdmacm relies on struct ibv_path_record from libibverbs. However,
to support older versions of libibverbs, define struct ibv_path_record
if it is not already defined.
Sean Hefty [Tue, 11 May 2010 18:15:47 +0000 (11:15 -0700)]
librdmacm: disable AF_IB support
To avoid potential compatibility issues, disable AF_IB support
until it has been queued for inclusion upstream. We will re-enable
AF_IB support after releasing version 1.0.12 for OFED 1.5.2, which
will include support for IB ACM.
Sean Hefty [Thu, 6 May 2010 22:51:38 +0000 (15:51 -0700)]
librdmacm: use IB ACM to resolve IB path
Starting with 2.6.33, the kernel supports the ability to
manually specify the path record that a connection should
use. Allow the librdmacm to contact the IB ACM to acquire
path record data, even if rdma_getaddrinfo is not used and
the kernel does not support AF_IB.
Sean Hefty [Thu, 6 May 2010 22:50:57 +0000 (15:50 -0700)]
librdmacm/rdma_server: add new sample server application
Provide a simple server application to demonstrate the minimal
amount of coding needed to accept a connection request from
a client and exchange messages.
Sean Hefty [Thu, 6 May 2010 22:50:54 +0000 (15:50 -0700)]
librdmacm/rdma_client: add new client sample
Provide a very simple client application that shows the
minimal coding needed to establish a connection and exchange
messages with a server. The client makes use of the new
rdma_getaddrinfo and rdma_create_ep calls, plus rdma verbs
abstractions.
Sean Hefty [Thu, 6 May 2010 22:50:50 +0000 (15:50 -0700)]
librdmacm: provide abstracted verb calls
Provide abstractions to the verb calls to simplify the user
interface for more casual verbs consumers. Users still have
access to the full range of verbs functionality by calling
verbs directly.
Sean Hefty [Thu, 6 May 2010 22:50:43 +0000 (15:50 -0700)]
librdmacm: format IB CM private data RDMA CM header
When IB ACM is used, the address and route resolution is
done entirely in user space. Before converting AF_INET or
AF_INET6 addresses to AF_IB, format the connection private
data for IB CM REQ messages.
Sean Hefty [Thu, 6 May 2010 22:49:55 +0000 (15:49 -0700)]
librdmacm: use sockaddr_ib addressing if IB ACM is in use
If IB ACM route resolution succeeds, provide the user with
AF_IB addresses, rather than AF_INET or AF_INET6 addressing.
AF_IB identifies the local and remote devices directly,
eliminating the need to perform address resolution a second
time via rdma_resolve_addr.
AF_IB addresses are returned using sockaddr_ib as part of
rdma_getaddrinfo.
Sean Hefty [Thu, 6 May 2010 22:48:53 +0000 (15:48 -0700)]
librdmacm: add support for IB ACM service
Allow the librdmacm to contact a service via sockets to obtain
address mapping and path record data. The use of the service
is controlled through a build option (with-ib_acm). If the
library fails to contact the service, it falls back to using
the kernel services to resolve address and routing data.
Sean Hefty [Thu, 6 May 2010 22:47:23 +0000 (15:47 -0700)]
librdmacm: specify qp_type when creating id
To support AF_IB / PS_IB, we need to specify the qp type when
creating the rdma_cm_id. The kernel requires this in order
to select the correct type of operation to perform (e.g. SIDR
versus REQ).
Sean Hefty [Thu, 6 May 2010 22:46:37 +0000 (15:46 -0700)]
librdmacm: add rdma_get_request
To simplify passive side operation and better support synchronous
operations, add rdma_get_request(). This function is called on the
listening side to retrieve a connection request event.
Sean Hefty [Thu, 6 May 2010 22:46:32 +0000 (15:46 -0700)]
librdmacm: add rdma_getaddrinfo
Provide a call similar to getaddrinfo for RDMA devices and
connections. rdma_getaddrinfo is modeled after getaddrinfo, with
the following modifications:
- A source address is returned as part of the call to allow the
  user to allocate the necessary resources for connections.
- Optional routing information may be returned to support
  Infiniband fabrics. IB routing information includes necessary
  path record data.
Sean Hefty [Thu, 6 May 2010 22:45:05 +0000 (15:45 -0700)]
librdmacm: allow user to specify max RDMA resources
Allow the user to indicate that the library should select the
maximum RDMA read values available when
establishing a connection. The library selects the maximum
based on local hardware limitations and connection request
data.
Sean Hefty [Thu, 6 May 2010 22:44:57 +0000 (15:44 -0700)]
librdmacm: make CQs optional for rdma_create_qp
Allow the user to specify NULL for the send and receive CQs when
creating a QP through rdma_create_qp. The librdmacm will automatically
create CQs for the user, along with a completion channel.
Sean Hefty [Thu, 6 May 2010 22:44:52 +0000 (15:44 -0700)]
librdmacm: allow pd parameter to be optional
Allow the user to create a QP using rdma_create_qp without
specifying a PD. If a PD is not given, a default PD will be
used instead. This simplifies the user interface.
Sean Hefty [Thu, 6 May 2010 22:44:16 +0000 (15:44 -0700)]
librdmacm: replace query_route call with separate queries
To support other address families and multiple path records,
replace the query_route call with specific query calls to obtain
only the desired information.
Sean Hefty [Wed, 7 Apr 2010 18:01:52 +0000 (11:01 -0700)]
librdmacm: support querying AF_IB addresses
The current query route command returns path record data and address
information. The latter is restricted to sizeof(sockaddr_in6). In
order to support AF_IB, modify the library to use the new query addr
command, which supports larger address sizes and avoids querying for
path record data when none is available.
Sean Hefty [Wed, 7 Apr 2010 18:01:43 +0000 (11:01 -0700)]
librdmacm: name changes to indicate only IP addresses supported
Several commands to the kernel RDMA CM only support IP addresses
because of limitations in the structure definition. Update
the library to match the name changes in the kernel and indicate
that only IP addresses can be used with the current commands.
Sean Hefty [Fri, 5 Feb 2010 22:25:15 +0000 (14:25 -0800)]
librdmacm: transition QP to RTS before sending reply
In order to handle a race condition where the passive side of
a connection can receive data on a QP before the connection established
event has been received, transition the QP to RTS before sending the reply.
This allows a user to send a response to any received message immediately,
rather than waiting until the connection established event has been
processed.
A similar fix was applied to the kernel rdma_cm a while ago.
Simply duplicate the fix in the user space library.
Or Gerlitz [Mon, 9 Nov 2009 22:46:20 +0000 (14:46 -0800)]
librdmacm/mckey: enforce local binding for unmapped multicast address
Enforce that mckey is bound to a local address when specifying
an unmapped multicast address. Otherwise, mckey crashes when
attempting to use the cmd_id->verbs pointer.
Update documentation on using unmapped MGIDs.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Jason Gunthorpe [Wed, 21 Oct 2009 15:02:43 +0000 (08:02 -0700)]
librdmacm: return errors from the library consistently
Remove the return of -errno and always return codes via errno.
The librdmacm documentation already states that library calls
return -1 to indicate that the error code is in errno.
Update rping to show correct error reporting methodology.
Also fix errant return of 0 if the read/write syscalls return 0.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Mon, 19 Oct 2009 14:21:42 +0000 (07:21 -0700)]
librdmacm: fix race initializing library
Multi-threaded code can race with ucma_init(). When ucma_init() is called
it can set cma_dev_cnt before it is done initializing the library. If
a second thread checks cma_dev_cnt and finds it non-zero, then it will
skip the call to ucma_init() and assume that the library is ready for use.
This can lead to an application crash during startup.
Do not set cma_dev_cnt until the end of ucma_init(), after all
initialization has completed. Adjust the error handling code in
ucma_init() accordingly.
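The ordering rule can be sketched as follows. This is not the
ucma_init() code: the names are hypothetical, and C11 atomics are used
here so that the unlocked fast-path check is well defined, which is an
assumption of the sketch rather than something the commit describes.

```c
#include <pthread.h>
#include <stdatomic.h>

#define SKETCH_DEVS 4

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int dev_cnt;              /* stays 0 until setup has finished */
static int dev_table[SKETCH_DEVS];      /* hypothetical library state */

/* Hypothetical initializer: the "ready" counter is published last, so a
 * thread that sees it non-zero also sees fully initialized state. */
static int init_sketch(void)
{
    int i;

    /* Fast path: safe only because dev_cnt is stored after setup. */
    if (atomic_load_explicit(&dev_cnt, memory_order_acquire))
        return 0;

    pthread_mutex_lock(&init_lock);
    if (!atomic_load_explicit(&dev_cnt, memory_order_relaxed)) {
        for (i = 0; i < SKETCH_DEVS; i++)
            dev_table[i] = i + 1;       /* stand-in for real setup work */

        /* Publish only after all initialization has completed. */
        atomic_store_explicit(&dev_cnt, SKETCH_DEVS, memory_order_release);
    }
    pthread_mutex_unlock(&init_lock);
    return 0;
}
```

Setting the counter before the loop would reintroduce the race: a
second thread could take the fast path and read dev_table while it is
still being filled in.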
Sean Hefty [Wed, 14 Jan 2009 22:31:34 +0000 (14:31 -0800)]
cmatose: avoid missing completions
cmatose uses a single CQ for send and receive completions. It then
counts completions to determine if all sends and receives are done.
It's possible for a receive completion to be polled when the intent
is to count send completions. (See server side polling for sends
done, then receives done. The poll will get up to 8 completions,
which can lead to sends and receives being polled together.)
Fix this by separating the send and receive completions to their
own CQs to avoid any issues knowing what type of completions have been
polled from the CQ.
Or Gerlitz [Wed, 2 Jul 2008 18:53:49 +0000 (11:53 -0700)]
librdmacm: implement address change event
RDMA_CM_EVENT_ADDR_CHANGE event can be used by librdmacm consumers
that wish to have their RDMA sessions always use the same links
(e.g. <hca/port>) as the IP stack does. In the current code, this
does not happen when bonding is used and a fail-over has happened,
but the IB link used by an already existing session is operating fine.
The kernel rdma-cm code was enhanced to use netevent notification
for sensing that a change has happened in the IP stack, and to deliver
this event for any ID that is misaligned in that respect with the IP
stack. The user can act on the event or just ignore it.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Thu, 22 May 2008 14:35:13 +0000 (07:35 -0700)]
librdmacm: fix license text
The license text references a third software license
that was inadvertently copied in. Update the license to match that used
by openfabrics. This update was based on a request from HP.
Sean Hefty [Wed, 20 Feb 2008 16:53:19 +0000 (08:53 -0800)]
librdmacm: add rdma_migrate_id
This is based on user feedback from Doug Ledford at RedHat:
Events that occur on an rdma_cm_id are reported to userspace through
an event channel. Connection request events are reported
on the event channel associated with the listen. When the
connection is accepted, a new rdma_cm_id is created and automatically
uses the listen event channel. This is suboptimal where the user
only wants listen events on that channel.
Additionally, it may be desirable to have events related to
connection establishment use a different event channel than those
related to already established connections.
Allow the user to migrate an rdma_cm_id between event channels.