Sean Hefty [Tue, 22 May 2012 01:46:36 +0000 (18:46 -0700)]
rsocket preload: Use environment variable to set QP size
Allow the user to specify the size of the send/receive queues
and inline data size through environment variables:
RS_SQ_SIZE, RS_RQ_SIZE, and RS_INLINE.
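
As an illustration only, a preload library could read such variables
along the following lines. The helper names parse_size() and env_size()
and the fallback argument are hypothetical. The actual defaults, and how
the values are applied to the queue pair, are not shown in this log.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert a decimal string into a positive int.
 * Returns def when the string is missing, empty, malformed,
 * non-positive, or out of range.
 */
static int parse_size(const char *val, int def)
{
	char *end;
	long n;

	if (!val || !*val)
		return def;

	errno = 0;
	n = strtol(val, &end, 10);
	if (errno || end == val || *end != '\0' || n <= 0 || n > INT_MAX)
		return def;

	return (int) n;
}

/* Hypothetical helper: look up an environment variable such as RS_SQ_SIZE. */
static int env_size(const char *name, int def)
{
	assert(name != NULL);
	return parse_size(getenv(name), def);
}
```

With this sketch, env_size("RS_SQ_SIZE", def) returns def unchanged
whenever the variable is unset or does not hold a positive decimal number.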
Sean Hefty [Sat, 26 May 2012 07:02:47 +0000 (00:02 -0700)]
rsocket: Merge nonblock test with test() routine in rs_process_cq
rs_process_cq() takes two parameters that control its operation:
nonblock and test(). If nonblock is true, rs_process_cq() exits
without arming the CQ or waiting on a CQ event. rs_process_cq() also
exits if test() returns true. The two cases differ only in the value
that rs_process_cq() returns.
We can simplify the code by merging the nonblock test into the
caller's provided test() routine. The test() routine simply needs
to return the correct value for rs_process_cq(). This will also
simplify fixing an issue where a caller may block indefinitely
in send() or recv() after an rsocket has been disconnected. That
fix is in a subsequent patch.
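
The idea can be sketched with a small simulation. This is not the
rsocket code: struct fake_sock, poll_once(), test_have_data(),
process() and the KEEP_WAITING convention are all assumptions made for
illustration. The point is only that once the caller's test folds in the
nonblocking check, the processing loop needs no separate nonblock
argument.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an rsocket and its completion queue. */
struct fake_sock {
	bool nonblock;   /* socket is in nonblocking mode */
	int  pending;    /* nonzero once a completion has arrived */
	int  polls_left; /* polls remaining until a completion arrives */
};

/* Assumed convention: the test asks the loop to continue waiting. */
enum { KEEP_WAITING = 1 };

/* Simulated CQ poll: the completion shows up after polls_left polls. */
static void poll_once(struct fake_sock *s)
{
	if (s->polls_left > 0 && --s->polls_left == 0)
		s->pending = 1;
}

/*
 * Caller-provided test with the nonblocking check merged in.
 * Returns 0 when the wait is satisfied, -1 with errno set to
 * EWOULDBLOCK when a nonblocking socket has nothing ready, and
 * KEEP_WAITING otherwise.
 */
static int test_have_data(struct fake_sock *s)
{
	if (s->pending)
		return 0;
	if (s->nonblock) {
		errno = EWOULDBLOCK;
		return -1;
	}
	return KEEP_WAITING;
}

/* The loop simply returns whatever final value the test produced. */
static int process(struct fake_sock *s, int (*test)(struct fake_sock *))
{
	int ret;

	for (;;) {
		poll_once(s);
		ret = test(s);
		if (ret != KEEP_WAITING)
			return ret;
		/* A real implementation would arm the CQ and wait for an event here. */
	}
}
```

In this sketch a blocking socket keeps polling until the completion
arrives, while a nonblocking one returns -1 with EWOULDBLOCK after the
first empty poll.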
Sean Hefty [Sat, 26 May 2012 00:24:08 +0000 (17:24 -0700)]
rsocket: Fix hang in rrecv/rsend after disconnecting
If a user calls rrecv() after a blocking rsocket has been disconnected,
it will hang. This problem and its cause were reported by Sridhar Samudrala
<samudrala@us.ibm.com>. It can be reproduced by running netserver -f -D
using the rs-preload library. A similar issue exists with rsend().
Fix this by not blocking on a CQ unless we're connected.
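
A minimal sketch of that rule, using a simulated socket rather than the
real rsocket structures. struct fake_rsocket, wait_for_event() and
fake_recv() are hypothetical names, and returning 0 after a disconnect
is an assumption modeled on how recv() reports an orderly shutdown on
standard sockets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connected rsocket. */
struct fake_rsocket {
	bool   connected;   /* false once the peer has disconnected */
	size_t bytes_ready; /* data already received and buffered */
	int    events_left; /* simulated events before the disconnect */
};

/*
 * Stand-in for blocking on a CQ event. In this simulation the last
 * remaining event is always the disconnect notification.
 */
static void wait_for_event(struct fake_rsocket *rs)
{
	if (rs->events_left == 0 || --rs->events_left == 0)
		rs->connected = false;
}

/*
 * Returns the number of buffered bytes consumed, or 0 when no data is
 * buffered and the socket is no longer connected.
 */
static size_t fake_recv(struct fake_rsocket *rs)
{
	size_t n;

	while (rs->bytes_ready == 0) {
		/* Only block while connected; otherwise nothing can arrive. */
		if (!rs->connected)
			return 0;
		wait_for_event(rs);
	}

	n = rs->bytes_ready;
	rs->bytes_ready = 0;
	return n;
}
```

Data that was buffered before the disconnect is still returned, because
the connection check only guards the blocking wait.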
Sean Hefty [Sat, 26 May 2012 00:24:08 +0000 (17:24 -0700)]
rsocket: Fix hang in rrecv() after disconnecting
If a user calls rrecv() after a blocking rsocket has been disconnected,
it will hang. This problem was reported by Sridhar Samudrala
<samudrala@us.ibm.com>. It can be reproduced by running netserver -f -D
using the rs-preload library.
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Sean Hefty [Fri, 25 May 2012 19:42:12 +0000 (12:42 -0700)]
rs-preload: Handle recursive socket() calls
When ACM support is enabled in the librdmacm, it will attempt to
establish a socket connection to the ACM daemon. When the rsocket
preload library is in use, this can trigger a recursive call to
socket() that hangs the library. The resulting call stack is:
The second call to ucma_init() hangs because initialization is
still pending.
Fix this by checking for recursive calls to socket() in the preload
library. When one is detected, call the real socket() instead of
directing the call back into rsocket(). Since rsockets is a part
of the librdmacm, it can call rsockets directly if it wants to use
rsockets instead of standard sockets.
This problem and its cause were reported by Chet Murthy <chet@watson.ibm.com>.
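
One way such a check could look is a per-thread guard flag, sketched
below. Everything here is an assumption for illustration: the names
wrapped_socket(), fake_real_socket() and fake_rdma_socket(), the use of
C11 _Thread_local, and the returned descriptor values. A real preload
library would resolve and call the genuine socket() rather than a stub.

```c
/* Per-thread flag: nonzero while this thread is inside wrapped_socket(). */
static _Thread_local int in_socket;

/* Counters so the sketch can show which path each call took. */
static int real_calls;
static int rdma_calls;

static int wrapped_socket(int domain, int type, int protocol);

/* Stub standing in for the real socket() call. */
static int fake_real_socket(int domain, int type, int protocol)
{
	(void) domain; (void) type; (void) protocol;
	real_calls++;
	return 3; /* made-up descriptor */
}

/*
 * Stub standing in for the RDMA-backed path. While initializing it
 * opens a socket of its own, which re-enters the wrapper.
 */
static int fake_rdma_socket(int domain, int type, int protocol)
{
	rdma_calls++;
	(void) wrapped_socket(domain, type, protocol);
	return 100; /* made-up descriptor */
}

/* Intercepted socket(): recursive calls go straight to the real call. */
static int wrapped_socket(int domain, int type, int protocol)
{
	int ret;

	if (in_socket)
		return fake_real_socket(domain, type, protocol);

	in_socket = 1;
	ret = fake_rdma_socket(domain, type, protocol);
	in_socket = 0;
	return ret;
}
```

The outer call takes the RDMA path once, and the nested call made during
initialization is served by the real-socket stub instead of recursing.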
Sean Hefty [Fri, 25 May 2012 17:48:47 +0000 (10:48 -0700)]
librdmacm: Delay ACM connection until resolving an address
Avoid creating a connection to the ACM service when
it's not needed. For example, if the user of the librdmacm
is a server application, it will not use ACM services.
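
The deferral can be pictured as lazy initialization, as in the sketch
below. The names fake_connect(), get_service_fd(), resolve_address() and
server_listen() are hypothetical and do not correspond to librdmacm
functions. The sketch is also not thread-safe; real code would need
locking or a once-style primitive around the first connection.

```c
/* Descriptor of the (simulated) service connection, -1 until opened. */
static int service_fd = -1;

/* Counts how many times a connection was actually opened. */
static int connect_count;

/* Stub standing in for connecting to the address resolution service. */
static int fake_connect(void)
{
	connect_count++;
	return 5; /* made-up descriptor */
}

/* Open the connection on first use and reuse it afterwards. */
static int get_service_fd(void)
{
	if (service_fd < 0)
		service_fd = fake_connect();
	return service_fd;
}

/* Client-side path: resolving an address is what needs the service. */
static int resolve_address(const char *name)
{
	int fd = get_service_fd();

	(void) name;
	return fd >= 0 ? 0 : -1;
}

/* Server-side path: never touches the service connection. */
static int server_listen(void)
{
	return 0;
}
```

A process that only calls server_listen() never opens the connection,
while the first resolve_address() call opens it and later calls reuse it.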