Stan Smith [Fri, 9 Apr 2010 22:15:47 +0000 (22:15 +0000)]
[WIX] simplify by creating %systemdrive%\DAT in a single place (moved from ofed.wxs files).
Create DAT SDK include file to match uDAT ver 2 conventions (DAT --> DAT2)
Include missing DAT file dat_ib_extensions.h
Stan Smith [Thu, 8 Apr 2010 20:35:13 +0000 (20:35 +0000)]
[WINOF] Fix isolated installer crash. In CheckDriversOK() remove usage of Wsh.Sleep() as it fails in the installer environment. Wait by invoking 'cmd.exe /C timeout /T 2'. Need to wait in a small # of isolated cases for the IPoIB driver (device IBA\IPOIB) to show up in the Windows device database before installing ND/WSD. Generally the IPoIB device is present prior to ND/WSD provider installation.
Waiting for the IPoIB device ensures NetworkDirect/IBAL & Winsock Direct providers install correctly. Wait 30 seconds max, output timed nastyGram about possible IPoIB driver initialization failure (no errors returned).
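
The bounded wait described above can be sketched as a polling loop. This is a minimal illustration, not the installer's actual script: wait_for_device, device_present and pause are hypothetical names, and the default pause simply shells out to the 'timeout' command quoted in the commit message (Windows only).

```python
import subprocess


def wait_for_device(device_present, max_wait_s=30, step_s=2, pause=None):
    """Poll device_present() until it returns True or max_wait_s has elapsed.

    Returns True if the device showed up, False if the wait timed out.
    All names here are illustrative assumptions, not the installer's API.
    """
    if pause is None:
        # Assumed default: block by running the command named in the commit.
        def pause(seconds):
            subprocess.call(["cmd.exe", "/C", "timeout", "/T", str(seconds)])

    waited = 0
    while not device_present():
        if waited >= max_wait_s:
            return False  # caller would print the warning; no error is returned
        pause(step_s)
        waited += step_s
    return True
```

Passing pause as a parameter keeps the loop testable without actually sleeping; a caller would print the warning about a possible IPoIB driver initialization failure when the function returns False.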
Stan Smith [Thu, 8 Apr 2010 20:29:26 +0000 (20:29 +0000)]
[OFED] More WinOF --> OFED name changes. In CheckDriversOK() remove usage of Wsh.Sleep() as it fails in the installer environment. Wait by invoking 'cmd.exe /C timeout /T 2'. Need to wait in a small # of isolated cases for the IPoIB driver (device IBA\IPOIB) to show up in the Windows device database. Waiting ensures NetworkDirect/IBAL & Winsock Direct providers install correctly. Wait 30 seconds max, output timed nastyGram about possible IPoIB driver initialization failure (no errors returned).
Sean Hefty [Fri, 2 Apr 2010 23:32:39 +0000 (23:32 +0000)]
winverbs/nd: do not convert timeout status value
wv_io_timeout is an error value, but nd_timeout is a
success value. This means that an overlapped request
can be completed in error, but GetOverlappedResult can
return a success value. Instead, if a timeout is fatal,
report the status as wv_io_timeout, but if the timeout
can be retried, then report the status as wv_timeout.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2763 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Sean Hefty [Fri, 2 Apr 2010 23:31:13 +0000 (23:31 +0000)]
winverbs/ep: allow failed connections to be retried
Allow a user to retry a connection request if it fails. Report a
'success' timeout value, rather than an error timeout, reset the
EP state, and allow the request to be retried from user space.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2761 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Thu, 18 Mar 2010 19:27:13 +0000 (19:27 +0000)]
[ND/IBAL] ND provider INDAdapter::Query busted.
The INDAdapter::Query implementation doesn't set the maximum transfer lengths properly.
This patch fixes this, and allows MSMPI to chunk large transfers properly.
Signed-off-by: Fab Tillier <ftillier@microsoft.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2753 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Thu, 18 Mar 2010 19:13:50 +0000 (19:13 +0000)]
[IPOIB/IPOIB_NDIS6_CM] install 32-bit version of ND winverbs provider in syswow64\ for 64-bit builds.
updated netipoib-xp32.inf also to minimize diffs such that XP specific changes can easily be identified.
This patch is in 2.2 release and working well for MS MPI.
Stan Smith [Thu, 18 Mar 2010 18:16:19 +0000 (18:16 +0000)]
[OFED] simplify code - remove wide-spread VersionNT usage. When checking if HCA install is OK, skip warning if no HCA hardware in system (aka, SW install).
Stan Smith [Thu, 11 Mar 2010 20:15:30 +0000 (20:15 +0000)]
[ND/IBAL] ND provider INDAdapter::Query busted.
The INDAdapter::Query implementation doesn't set the maximum transfer lengths properly.
This patch fixes this, and allows MSMPI to chunk large transfers properly.
Signed-off-by: Fab Tillier <ftillier@microsoft.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2736 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Wed, 10 Mar 2010 22:32:22 +0000 (22:32 +0000)]
[DAPL2] scm: CM linking to EP must be done before socket write in accept_user.
scm accept RTU was processing the conn object in the cr_thread
before the user accept thread bound the EP to the CM object.
The linking must be done before the socket write to ensure
proper linking and state during accept_rtu processing.
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2732 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Wed, 10 Mar 2010 22:26:01 +0000 (22:26 +0000)]
[WINVERBS/ND] do not convert wv_io_timeout to nd_timeout
wv_io_timeout is an error value, but nd_timeout is a
success value. This means that an overlapped request
can be completed in error, but GetOverlappedResult can
return a success value. Instead, if a timeout is fatal,
report the status as wv_io_timeout, but if the timeout
can be retried, then report the status as wv_timeout.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2730 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Wed, 10 Mar 2010 22:24:02 +0000 (22:24 +0000)]
[WINVERBS/ND] allow retrying ND:Connect()
The ND documentation specifies that ND:Connect() should be retry-able.
Add this support to the winverbs ND provider.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2729 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Wed, 10 Mar 2010 22:19:37 +0000 (22:19 +0000)]
[WINVERBS] winverbs/ep: allow failed connection requests to be retried
Allow a user to retry a connection request if it fails. Report a
'success' timeout value, rather than an error timeout, reset the
EP state, and allow the request to be retried from user space.
winverbs: allow WV:Disconnect() to be retried
ND for some odd reason wants a successful return code for
a failed disconnect call. If a Disconnect() call fails,
allow it to be retried. Return STATUS_TIMEOUT - a 'successful'
failure, rather than STATUS_IO_TIMEOUT, which is a 'failed' failure.
(I love Windows, really, I do.) A subsequent call to
EP:Disconnect() after a timeout will force the QP into the error
state and force the EP into the disconnected state.
This change is needed to prevent ndping and other ND tests from
reporting a failure. They only allow disconnect calls to fail
'successfully' with STATUS_TIMEOUT. With the status mapping
removed from the user space WV ND library, we need to return the
desired value directly from the kernel.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2728 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
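
The 'successful' versus 'failed' timeout distinction in the commit above comes from the severity bits of an NTSTATUS value: STATUS_TIMEOUT (0x00000102) has the top bit clear and passes an NT_SUCCESS-style check, while STATUS_IO_TIMEOUT (0xC00000B5) does not. The sketch below only illustrates that check; disconnect_timeout_status is a hypothetical helper, not a function in winverbs.

```python
STATUS_TIMEOUT = 0x00000102     # success severity: a 'successful' failure
STATUS_IO_TIMEOUT = 0xC00000B5  # error severity: a 'failed' failure


def nt_success(status):
    """Mimic the NT_SUCCESS macro: true when the sign bit of the 32-bit value is clear."""
    return (status & 0x80000000) == 0


def disconnect_timeout_status(retryable):
    """Hypothetical helper: choose the status to report for a timed-out request."""
    return STATUS_TIMEOUT if retryable else STATUS_IO_TIMEOUT
```

A caller that only checks for success, as the ND tests are described as doing, treats the retryable timeout as a pass and the fatal one as an error.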
Stan Smith [Tue, 23 Feb 2010 23:03:33 +0000 (23:03 +0000)]
[WINOF] update with trunk commits for 2.2 release
2703 ibat/resolve: retry ibat resolution
2702 libibverbs: release wvprovider reference from ibvwv_acquire_windata
2701 dapl: locking cleanup and fixes
2689 libibverbs/device: destroy completion channel when closing device
2688 dapl: use private_data_len for mem copies
2687 dapl/cma: fix referencing freed address
2686 librdmacm: set private_data_len
2685 dapl: move close device after async thread is done using it
2683 [core] Improved error message.
2682 [core] Release the RDMA interface if ib_register_ca fails.
Stan Smith [Fri, 19 Feb 2010 18:13:20 +0000 (18:13 +0000)]
[IPoIB_ndis6_cm] migrate IPoIB shutter code into the trunk - match 2.2 release.
shutter code stops IPoIB from receiving packets during device shutdown.
Stan Smith [Fri, 19 Feb 2010 17:46:36 +0000 (17:46 +0000)]
[WINOF] delete orphaned files on uninstall
Added new CustomAction to wait for IPoIB device to appear in order to ensure ND & WSD provider install success.
Stan Smith [Fri, 19 Feb 2010 17:13:41 +0000 (17:13 +0000)]
[DAPL2] Cleanup CM object lock before freeing CM object memory
Running Windows Application Verifier for uDAPL validation
for all 3 providers. Cleanup memory lock leaks found
by verifier.
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2706 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Fri, 19 Feb 2010 17:09:58 +0000 (17:09 +0000)]
[DAPL2] ucm, scm, cma: destroy verbs completion channels created via ia_open or ep_create.
Completion channels are created with ia_open for CNO events and with ep_create in cases where DAT allows EP(qp) to be created with no EVD(cq) and IB doesn't. These completion channels need to be destroyed at close along with a CQ for the "EP without EVD" case.
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2705 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Sean Hefty [Thu, 18 Feb 2010 21:51:19 +0000 (21:51 +0000)]
ibat/resolve: retry ibat resolution
Winverbs ND scale out testing showed that IBAT::Resolve() can
return E_PENDING, which requires that the resolution be retried.
A similar issue to this was seen when testing with the librdmacm.
Rather than duplicating retry logic in the winverbs ND provider,
add new functionality to ibat, with retry capability. To
avoid breaking the ibat.dll interface, extend the API with a
new call ResolvePath() that takes a timeout value.
ResolvePath() automatically retries Resolve() while the result
is E_PENDING, until the request times out. Modify the winverbs
ND provider to call ResolvePath(). Also update other places
where Resolve() is called in a loop: the librdmacm and wsd.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2703 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
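
The retry behaviour described for ResolvePath() can be sketched as follows. This is an illustration under stated assumptions, not the ibat implementation: resolve_path, its parameters and the retry interval are hypothetical, and returning E_PENDING on timeout is an assumption since the commit message does not say what the real call returns in that case.

```python
import time

S_OK = 0x00000000
E_PENDING = 0x8000000A  # HRESULT meaning the result is not yet available


def resolve_path(resolve, timeout_s, retry_interval_s=0.1,
                 clock=time.monotonic, sleep=time.sleep):
    """Call resolve() repeatedly while it returns E_PENDING, until timeout_s elapses.

    resolve is a zero-argument callable standing in for IBAT::Resolve().
    Returns the first non-pending result, or E_PENDING on timeout (assumed).
    """
    deadline = clock() + timeout_s
    while True:
        hr = resolve()
        if hr != E_PENDING:
            return hr  # success or a real error: stop retrying
        if clock() >= deadline:
            return E_PENDING
        sleep(retry_interval_s)
```

Keeping the loop in one helper is the point of the change: callers such as the ND provider, the librdmacm and wsd no longer need their own copies of the retry logic.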
Sean Hefty [Thu, 18 Feb 2010 21:49:14 +0000 (21:49 +0000)]
dapl: locking cleanup and fixes
Cleanup allocated completion channels. Destroy cm_ptr locks before freeing the cm_ptr to avoid memory leaks. And avoid accessing the cm_ptr after queuing it for destruction with the cr_thread to avoid use after free errors.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2701 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
Stan Smith [Wed, 3 Feb 2010 23:04:54 +0000 (23:04 +0000)]
[IPOIB_NDIS6_CM] use SHUTTER data structure to eliminate the problem where IPoIB continues to receive packets during the shutdown/halt process.
signed-off-by: Alex Naslednikov [xalex@mellanox.co.il]
Stan Smith [Mon, 1 Feb 2010 18:13:47 +0000 (18:13 +0000)]
[OFED] switch references to WinOF --> OFED.
Support target-OS & target-arch in build scripts & etc\clean-build.bat.
Migrate WIX package definitions from ofed.wxs files to new file common\Package.inc (easier maintenance).
DAT.conf,dt-cli.bat,dt-svr.bat: WinOF --> OFED as binaries are now installed into %ProgramFiles%\OFED\ .
Sean Hefty [Fri, 29 Jan 2010 05:06:11 +0000 (05:06 +0000)]
dapl: use private_data_len for mem copies
From: Sean Hefty <sean.hefty@intel.com>
When copying private_data out of rdma_cm events, use the
reported private_data_len for the size, and not IB maximums.
This fixes a bug running over the librdmacm on windows, where
DAPL accessed invalid memory.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2688 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
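
The idea behind the private_data fix above, copying only the number of bytes the event reports rather than a fixed maximum, can be sketched as below. This is an illustration only: copy_private_data and DEST_CAPACITY are hypothetical, and Python slicing cannot reproduce the invalid memory access that the C code hit.

```python
DEST_CAPACITY = 64  # illustrative buffer size, not an IB-defined constant


def copy_private_data(event_data, private_data_len, dest_capacity=DEST_CAPACITY):
    """Copy at most private_data_len bytes of event_data into a fresh buffer.

    Returns (buffer, bytes_copied). The length is also clamped to the source
    and destination sizes so the copy never reads or writes past either one.
    """
    n = min(private_data_len, len(event_data), dest_capacity)
    dest = bytearray(dest_capacity)
    dest[:n] = event_data[:n]
    return dest, n
```

For example, with a 15-byte source whose reported length is 5, only the first 5 bytes are copied and the rest of the destination stays zeroed.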
Sean Hefty [Fri, 29 Jan 2010 05:05:03 +0000 (05:05 +0000)]
dapl/cma: fix referencing freed address
From: Sean Hefty <sean.hefty@intel.com>
DAPL uses a pointer to reference the local and remote addresses
of an endpoint. It expects that those addresses are located
in memory that is always accessible. Typically, for the local
address, the pointer references the address stored with the DAPL
HCA device. However, for the cma provider, it changes this pointer
to reference the address stored with the rdma_cm_id.
This causes a problem when that endpoint is connected on the
passive side of a connection. When connect requests are given
to DAPL, a new rdma_cm_id is associated with the request. The
DAPL code replaces the current rdma_cm_id associated with a
user's endpoint with the new rdma_cm_id. The old rdma_cm_id is
then deleted. But the endpoint's local address pointer still
references the address stored with the old rdma_cm_id. The
result is that any reference to the address will access freed
memory.
Fix this by keeping the local address pointer always pointing
to the address associated with the DAPL HCA device. This is about
the best that can be done given the DAPL interface design.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2687 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86