Sean Hefty [Thu, 23 Sep 2010 19:25:29 +0000 (12:25 -0700)]
ibal: send drep in response to unmatched dreq
If a DREQ is received that cannot be matched with an
existing CEP, issue a DREP in response. It's possible
that the targeted CEP issued a DREP which was lost.
The CEP then transitioned through the timewait state
before another DREQ was received, and is no longer accessible.
To prevent the remote side from timing out waiting on a DREP,
send one so that it can complete its disconnection.
This fixes an issue in iMPI where one side of a connection
ends up waiting up to two minutes on disconnection.
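The decision described above can be sketched roughly as follows. This is an illustrative model only, not the ibal code: the struct cep layout, the find_cep and handle_dreq helpers, and the cm_action enum are all hypothetical names invented for this sketch, and the CEP table is reduced to a flat array keyed by communication IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; these are not the ibal data structures. */
enum cm_action { CM_DELIVER_DREQ, CM_SEND_DREP };

struct cep {
    uint32_t local_comm_id;
    uint32_t remote_comm_id;
};

/* Look up the CEP that a DREQ targets; returns NULL when none matches. */
static const struct cep *find_cep(const struct cep *table, size_t count,
                                  uint32_t local_id, uint32_t remote_id)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].local_comm_id == local_id &&
            table[i].remote_comm_id == remote_id)
            return &table[i];
    }
    return NULL;
}

/* Decide what to do with an incoming DREQ. */
static enum cm_action handle_dreq(const struct cep *table, size_t count,
                                  uint32_t local_id, uint32_t remote_id)
{
    assert(table != NULL || count == 0);

    if (find_cep(table, count, local_id, remote_id) != NULL)
        return CM_DELIVER_DREQ; /* normal path: the CEP still exists */

    /* No matching CEP (for example, it already left timewait): answer
     * with a DREP anyway so the remote side does not wait for a timeout. */
    return CM_SEND_DREP;
}
```

In this model the only behavioral change is the fall-through case: an unmatched DREQ yields a reply instead of being dropped.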
Sean Hefty [Fri, 17 Sep 2010 21:17:20 +0000 (14:17 -0700)]
dapl/ibal: delay QP transition until user disconnects
The ibal provider calls ib_cm_drep in response to receiving
a dreq. The result is that the user's QP is transitioned
through the error state, which fails any outstanding send
operations and flushes all receives. The disconnect request
is then reported to the user.
Since a user can receive errors from the QP before they are
aware of a pending disconnect request, the application may
respond to the errors as, well, actual errors. Fix this by
delaying the QP transition until the user responds to the
dreq.
This fixes an error with Intel MPI running over the ibal
dapl provider with a 'spawn' test.
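The ordering change described above can be modeled with a small state sketch. It is a simplification under stated assumptions, not the dapl/ibal provider code: struct conn, the qp_state enum, and the two handler names are hypothetical, and only the passive side (the one receiving the DREQ) is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced QP state; a real QP has more states. */
enum qp_state { QP_RTS, QP_ERROR };

struct conn {
    enum qp_state qp;
    bool dreq_pending;   /* DREQ received, user has not responded yet */
    bool dreq_reported;  /* disconnect request was reported to the user */
};

/* A DREQ arrives: record it and report it, but leave the QP untouched
 * so outstanding work requests are not flushed prematurely. */
static void on_dreq_received(struct conn *c)
{
    assert(c != NULL);
    c->dreq_pending = true;
    c->dreq_reported = true;
}

/* The user responds to the disconnect request: only now would the DREP
 * be sent and the QP moved to the error state. Returns false when there
 * is no pending DREQ to answer. */
static bool on_user_disconnect(struct conn *c)
{
    assert(c != NULL);
    if (!c->dreq_pending)
        return false;
    c->dreq_pending = false;
    c->qp = QP_ERROR;    /* in a real provider, the DREP call goes here */
    return true;
}
```

With this ordering, any flush errors on the QP can only appear after the user has already seen the disconnect request.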