shefty [Tue, 9 Jun 2009 16:40:06 +0000 (16:40 +0000)]
winverbs: process connect and accept asynchronously
Allow processing of EP:Connect and EP:Accept calls asynchronously. The
librdmacm uses events to report the completion of rdma_connect and
rdma_accept calls, which allows users of that interface to take advantage
of asynchronous operation. Modify the winverbs kernel driver to
queue connect/accept calls to a system thread for better parallelism.
This improves the measured connection rate of rdma_cmatose by 3%. The
connection rate includes address resolution, route resolution, PD/CQ/QP
creation and state transitions, memory registration, posting of receive
buffers, and CM message exchanges. This patch effectively only improves
the parallelism of modify QP.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2239 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Thu, 4 Jun 2009 18:51:56 +0000 (18:51 +0000)]
netdirect: replace 'localhost' with '..localmachine'
Using 'localhost' with getaddrinfo does not return all available IP addresses, but '..localmachine' does. A bug was submitted to MS regarding the issue.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2235 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Wed, 3 Jun 2009 17:44:16 +0000 (17:44 +0000)]
winverbs: export WvGetObject as extern C
If a standard C program or library tries to link against winverbs, it will fail
because of C++ name decoration. Export the WvGetObject call using C naming
conventions.
This allows DAPL to link directly against winverbs.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2232 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
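The export change described in the commit above follows a common header idiom, sketched here. ExGetObject and its parameters are hypothetical stand-ins and do not reproduce the real WvGetObject signature. When the header is compiled as C++, the guard switches off name decoration so that a plain C caller can resolve the symbol at link time.

```c
#include <stddef.h>

/* Hypothetical shared header section. In a C++ translation unit the guard
 * disables name decoration; in a C translation unit it expands to nothing. */
#ifdef __cplusplus
extern "C" {
#endif

/* Illustrative stand-in, not the winverbs export. */
int ExGetObject(int id, void **object);

#ifdef __cplusplus
}
#endif

static int ex_object_storage = 42;

/* Returns 0 and stores a pointer to the object for id 1, -1 otherwise. */
int ExGetObject(int id, void **object)
{
    if (id != 1 || object == NULL)
        return -1;
    *object = &ex_object_storage;
    return 0;
}
```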
shefty [Wed, 3 Jun 2009 17:43:33 +0000 (17:43 +0000)]
winverbs: export WvGetObject as extern C
If a standard C program or library tries to link against winverbs, it will fail
because of C++ name decoration. Export the WvGetObject call using C naming
conventions.
This allows DAPL to link directly against winverbs.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2231 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
leonidk [Tue, 2 Jun 2009 17:49:32 +0000 (17:49 +0000)]
[IBAL] workaround for reference count leakage bugs. [mlnx: 4404]
IBAL still has bugs that leak reference counts, which stops the cascading destruction of IBAL resources.
This in turn causes IBBUS to freeze on HCA disable or system power down.
On checked builds, IBAL forces destruction of the objects after a timeout.
On free builds, it waits endlessly.
This patch makes free builds behave like checked builds, while also sending a message to the System Event Log.
leonidk [Tue, 2 Jun 2009 17:48:10 +0000 (17:48 +0000)]
[IBBUS] fix bug in Control device implementation (patches 4218,4280). [mlnx: 4396]
The Control Device was created in DriverEntry and removed in the DrvUnload routine.
However, the PnP Manager won't call DrvUnload until the Control Device is removed,
so IBBUS never gets unloaded.
leonidk [Tue, 2 Jun 2009 17:32:41 +0000 (17:32 +0000)]
[IBAL] Summary: Ill-defined mechanism of event propagation. [mlnx: 4412]
Bug description and reproduction:
1. Connect two machines (A and B) via an IB switch
2. Run a subnet manager (say, opensm) on B
3. Kill opensm and clear the arp tables
4. Rerun opensm - ping will no longer work
5. That's because the new opensm instance clears the old multicast groups, and side A is not aware of the opensm restart and does not request to join the new MCAST group
Explanations:
There are 2 types of events relevant in our case: PnP and AE.
The problem happened because:
1. During an opensm restart, the port generates an AE event: IB_EVENT_LID_CHANGE or (in other cases) IB_EVENT_CLIENT_REREGISTER
These events are generated even when the SM was restarted but the LID did not actually change.
2. All PnP events were handled properly, but these events were mapped to IB_AE_FATAL
This patch fixes that by mapping IB_EVENT_* events to the appropriate IB_AE_* events and then to IB_PNP_* events
3. The function force_smi_poll() now updates its subscribers about a LID change event only if the LID actually changed.
So we still have a problem when opensm is restarted and none of the port attributes change.
This patch generates the appropriate IB_PNP event to resolve this issue.
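The two-stage mapping described above can be sketched roughly as follows. Every name here is a hypothetical stand-in for the IB_EVENT_*, IB_AE_* and IB_PNP_* families; the real enumerations and the exact pairings chosen by the patch are not shown in this log. The point is only that a LID change or client reregister event no longer falls through to the fatal case.

```c
/* Hypothetical stand-ins for the three event families. */
typedef enum {
    EX_EVENT_LID_CHANGE,
    EX_EVENT_CLIENT_REREGISTER,
    EX_EVENT_OTHER
} ex_event;

typedef enum {
    EX_AE_LID_CHANGE,
    EX_AE_CLIENT_REREGISTER,
    EX_AE_FATAL
} ex_ae;

typedef enum {
    EX_PNP_LID_CHANGE,
    EX_PNP_CLIENT_REREGISTER,
    EX_PNP_NONE
} ex_pnp;

/* Stage 1: low-level event to asynchronous event (AE). */
ex_ae ex_event_to_ae(ex_event event)
{
    switch (event) {
    case EX_EVENT_LID_CHANGE:        return EX_AE_LID_CHANGE;
    case EX_EVENT_CLIENT_REREGISTER: return EX_AE_CLIENT_REREGISTER;
    default:                         return EX_AE_FATAL;
    }
}

/* Stage 2: asynchronous event to PnP event reported to subscribers. */
ex_pnp ex_ae_to_pnp(ex_ae ae)
{
    switch (ae) {
    case EX_AE_LID_CHANGE:        return EX_PNP_LID_CHANGE;
    case EX_AE_CLIENT_REREGISTER: return EX_PNP_CLIENT_REREGISTER;
    default:                      return EX_PNP_NONE;
    }
}
```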
shefty [Wed, 27 May 2009 19:13:36 +0000 (19:13 +0000)]
winverbs: transition QP to error on disconnect
The QP transition into the error state must occur after a DREQ has been
received and acknowledged by the user (by a subsequent call to Disconnect),
or after a DREP has been received. The current winverbs API requires the
user to call QP:Modify after their NotifyDisconnect completes. This
presents challenges to implementing an ND provider, which expects a single
function call to perform both operations.
Unlike during connection establishment, the QP transition to error must
sometimes be delayed until after a CM callback. And since CM callbacks
are at dispatch, we must queue the modify call to a system thread.
Regardless of the outcome of the disconnect attempt or other failures,
the driver tries to transition the QP to error. This results in some
minor checks to ensure that the correct status is reported to the user.
A couple of additional changes were made to the Accept path to keep the
code consistent, since both Accept and Disconnect have active/passive
code paths.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2214 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Wed, 27 May 2009 15:59:36 +0000 (15:59 +0000)]
etc/work_queue: abstraction to manage a small pool of IO_WORKITEMs
Create an abstraction for managing a small pool of IO_WORKITEMs that
can be used to process a queue of work requests at passive level.
To prevent starvation of other work items and ensure fairness of system
threads, only a single work request is processed each time a work
item is queued. If more work remains, the work item is requeued.
Using a pool of work items, rather than a single work item, allows for
some parallelism of tasks.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2203 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
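The scheduling policy described in the work_queue commit above can be modelled in user space as follows. This is a sketch under stated assumptions, not the driver code: the ex_ names are hypothetical, the system thread pool is reduced to a counter of scheduled work items, and no locking is shown. Each run of a work item handles exactly one request and then either requeues itself or returns to the idle pool.

```c
#include <stddef.h>

#define EX_MAX_REQUESTS 16

/* Hypothetical user-space model; the real code manages IO_WORKITEMs. */
typedef struct ex_work_queue {
    int requests[EX_MAX_REQUESTS];  /* FIFO of pending request ids */
    size_t head;                    /* next request to process */
    size_t tail;                    /* one past the last request */
    int idle_items;                 /* work items available in the pool */
    int queued_items;               /* work items scheduled to run */
    int processed[EX_MAX_REQUESTS]; /* log of handled requests */
    size_t processed_count;
} ex_work_queue;

void ex_wq_init(ex_work_queue *wq, int pool_size)
{
    wq->head = 0;
    wq->tail = 0;
    wq->idle_items = pool_size;
    wq->queued_items = 0;
    wq->processed_count = 0;
}

/* Add a request and schedule an idle work item if one is available. */
int ex_wq_insert(ex_work_queue *wq, int request)
{
    if (wq->tail == EX_MAX_REQUESTS)
        return -1;
    wq->requests[wq->tail++] = request;
    if (wq->idle_items > 0) {
        wq->idle_items--;
        wq->queued_items++;
    }
    return 0;
}

/* One run of a scheduled work item: process at most one request, then
 * requeue if more work remains. Returns 1 if the item was requeued. */
int ex_wq_run_one(ex_work_queue *wq)
{
    if (wq->queued_items == 0)
        return 0;
    wq->queued_items--;
    if (wq->head < wq->tail)
        wq->processed[wq->processed_count++] = wq->requests[wq->head++];
    if (wq->head < wq->tail) {
        wq->queued_items++;  /* more work remains: requeue this item */
        return 1;
    }
    wq->idle_items++;        /* nothing left: return the item to the pool */
    return 0;
}
```

With a pool of two items and three requests, two items are scheduled at once, which is where the limited parallelism comes from; a run that finds the queue already drained simply returns its item to the pool.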
shefty [Wed, 27 May 2009 15:58:27 +0000 (15:58 +0000)]
librdmacm: fix event reporting when destroying listen
Do not report connect request events if a user is in the process of destroying an associated listen request. Ensure that the listen request continues to exist while any callbacks exist.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2202 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
stansmith [Thu, 14 May 2009 17:13:07 +0000 (17:13 +0000)]
[WinOF] support additional OFED Diags (winverbs_OFED.inc)
ibqueryerrors iblinkinfo saquery sminfo smpdump smpquery venstat ibportstate perfquery mcm_rereg_test
libibnetdisc
cleanup makebin.bat with single env var defs of what's to be copied.
ipoib.inc - no ND for ia64
shefty [Tue, 12 May 2009 17:25:44 +0000 (17:25 +0000)]
MAD: fix issues routing vendor MADs
Only dispatch received vendor defined MADs to the HCA driver if the
management class is one of the MLX vendor defined classes.
When dispatching MADs locally that are not handled by the HCA driver,
copy the sent MAD data into the received MAD buffer. Also initialize
the address information of the dispatched MAD, so that replies can be
routed correctly back to the sender. If a MAD is not handled by the
HCA driver and cannot be dispatched, return the MAD to the MAD pool
to avoid leaking MADs.
Finally, we simplify the MAD dispatch code.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2174 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
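The routing decision described in the MAD commit above might be reduced to something like the following sketch. The class values and all ex_ names are placeholders assumed for illustration; the actual MLX vendor class constants and the dispatch code are not shown in this log. The sketch only captures the three outcomes: hand the MAD to the HCA driver, dispatch it to a MAD client, or return it to the pool so it is not leaked.

```c
#include <stdint.h>

/* Placeholder management class values, assumed for illustration only. */
#define EX_MLX_VENDOR_CLASS1 0x09
#define EX_MLX_VENDOR_CLASS2 0x0A

typedef enum {
    EX_ROUTE_HCA_DRIVER,     /* handled by the HCA driver */
    EX_ROUTE_MAD_CLIENT,     /* dispatched locally to a registered client */
    EX_ROUTE_RETURN_TO_POOL  /* cannot be dispatched: give the MAD back */
} ex_route;

static int ex_is_mlx_vendor_class(uint8_t mgmt_class)
{
    return mgmt_class == EX_MLX_VENDOR_CLASS1 ||
           mgmt_class == EX_MLX_VENDOR_CLASS2;
}

/* Decide where a received vendor-defined MAD should go. */
ex_route ex_route_vendor_mad(uint8_t mgmt_class, int client_registered)
{
    if (ex_is_mlx_vendor_class(mgmt_class))
        return EX_ROUTE_HCA_DRIVER;
    if (client_registered)
        return EX_ROUTE_MAD_CLIENT;
    return EX_ROUTE_RETURN_TO_POOL;
}
```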
shefty [Tue, 12 May 2009 17:23:03 +0000 (17:23 +0000)]
mlx4/mthca: define common vendor mgmt class
Replace the separate MLX4 and MTHCA vendor MAD classes with common
MLX vendor classes. This more easily allows us to determine if a vendor
defined management class should be routed to the HCA driver or dispatched
to a MAD client.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2172 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Tue, 12 May 2009 17:21:47 +0000 (17:21 +0000)]
winmad: support registration for unsolicited MADs
To support ibping, winmad needs to support registering for unsolicited
MADs. We just need to change the MAD service context from referencing
the WMProvider to the WMRegistration.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2171 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Tue, 12 May 2009 17:17:54 +0000 (17:17 +0000)]
mad: deregister MR earlier in destroy path
When restarting the stack, the MAD pool tries to deregister
its memory during the cleanup phase. This results in an
error because of an invalid h_mr handle. Fix the error by
deregistering the memory during the destroying callback.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2170 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Tue, 12 May 2009 17:16:33 +0000 (17:16 +0000)]
ib/cm: update port attributes earlier in destruction path
The CM tries to modify the port attributes during the cleanup
phase of port cep destruction. However, if the stack is
being brought down, by the time ib_modify_hca is called, the
h_ca handle is invalid.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2169 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Tue, 12 May 2009 17:14:08 +0000 (17:14 +0000)]
mlx4: change prints from error to information
TRACE_LEVEL_ERROR is intended for "Severe errors". Change
ref/deref interface and DriverEntry exit log messages from
error to informational only. This avoids displaying
misleading "***ERROR***" messages on the debug terminal.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2168 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
leonidk [Tue, 12 May 2009 13:24:37 +0000 (13:24 +0000)]
[HW, WvVerbs] Pass through the user's specified max inline data value when creating a QP.
Currently, if the user specifies a value, it gets ignored when the QP is created. The inline data value supported is then returned to the user, which may be less than requested.
This fixes a failure running dtest over mthca adapters. Mlx4 adapters end up working for userspace apps because the max_inline_data is passed from the UVP to the kernel via the UMV buffer. However, for completeness, fixup kernel calls for QP creation from IBAL for mlx4.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2166 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
leonidk [Tue, 12 May 2009 13:05:24 +0000 (13:05 +0000)]
[INC, WinVerbs] Define additional async event types for GID, LID, PKey, and SM changes. These are reported by winverbs, and at least appear to be handled by the HCA drivers.
This avoids converting LID change events seen by the mlx4 driver into local fatal errors.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2165 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Wed, 6 May 2009 06:11:14 +0000 (06:11 +0000)]
ND: fixup build.txt
Update the build.txt document for building ulp/nd.
Replace use of ND_INC and ND_INC_S variables with a single, user defined
ND_SDK_PATH environment variable. The change makes it consistent
with the existing PLATFORM_SDK_PATH variable.
The makefile checks for ND_SDK_PATH, rather than ND_INC when determining if ND
should be built. ND_INC indicates that the SDK has been installed, but is not
useful for building in the WDK.
The hard-coded paths in the ND sources file are removed, since those paths are
specific to individual installations. PLATFORM_SDK_PATH_S is replaced with
the existing PLATFORM_SDK_PATH variable.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2155 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Wed, 6 May 2009 06:06:21 +0000 (06:06 +0000)]
We need to return all MADs to IBAL before calling close_al. To protect
against queuing MADs during deregistration, set the MAD service handle
to NULL when deregistering and check that it is still valid before queuing
any received MADs.
This fixes a hang when using ctrl-C to kill a process running ibping.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2152 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
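The guard described in the commit above can be illustrated with a small model. The structure and function names are hypothetical, and the locking that a real driver would need around the handle check is omitted. Deregistration clears the handle first, so any MAD that arrives afterwards is returned instead of being queued.

```c
#include <stddef.h>

/* Hypothetical registration state; a real driver would protect it with a lock. */
typedef struct ex_registration {
    void *mad_svc;      /* NULL once deregistration has started */
    int queued_mads;    /* MADs held for the user */
    int returned_mads;  /* MADs handed back to the lower layer */
} ex_registration;

/* Receive path: queue the MAD only while the service handle is still valid. */
void ex_receive_mad(ex_registration *reg)
{
    if (reg->mad_svc == NULL) {
        reg->returned_mads++;
        return;
    }
    reg->queued_mads++;
}

/* Deregistration: invalidate the handle, then return everything queued. */
void ex_deregister(ex_registration *reg)
{
    reg->mad_svc = NULL;
    reg->returned_mads += reg->queued_mads;
    reg->queued_mads = 0;
}
```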
leonidk [Mon, 4 May 2009 12:42:20 +0000 (12:42 +0000)]
[IBAL] fix memory leak on power down/power up flow. [mlnx: 4289]
port_mgr_port_add() allocates a port_pnp_ctx_t context, which is saved by IBAL to be used later in port_mgr_port_remove().
But in the hibernation flow, port_mgr_port_remove() doesn't release this context, which causes a memory leak in IBBUS.
It was caught by Verifier during the WHQL Common Scenario Stress test.
leonidk [Sun, 3 May 2009 12:47:37 +0000 (12:47 +0000)]
[IBAL] crash when disabling IBBUS during MAD traffic. [mlnx: 4275]
__ioc_query_sa takes references on the IOC PnP service before sending the node and path_record requests.
But these references are released at the end of __node_rec_cb and __path_rec_cb, while the __process_sweep routine, which performs the IOU sweep, is only scheduled to run in an async thread.
If the test happens to unload the driver after __node_rec_cb and __path_rec_cb complete but before __process_sweep starts to run, the IOC PnP service is released and __process_sweep crashes.
The patch takes a reference on the IOC PnP service before scheduling a thread for __process_sweep and releases the reference at the end of __process_sweep.
(Note that __process_sweep schedules a thread for itself twice while moving through its FSM:
SWEEP_IOU_INFO --> SWEEP_IOC_PROFILE --> SWEEP_SVC_ENTRIES --> SWEEP_COMPLETE)
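The reference pattern in the fix above can be modelled in a few lines. This is a simplified sketch with hypothetical names and no real threads: scheduling the sweep takes a reference, and the sweep itself drops that reference when it finishes, so an unload between the two cannot destroy the service early.

```c
/* Hypothetical model of a reference-counted service. */
typedef struct ex_service {
    int ref_count;
    int pending_sweeps;  /* sweeps scheduled but not yet run */
    int sweeps_run;
    int destroyed;
} ex_service;

void ex_ref(ex_service *svc)
{
    svc->ref_count++;
}

/* Drop a reference; the service is destroyed when the last one goes. */
void ex_deref(ex_service *svc)
{
    if (--svc->ref_count == 0)
        svc->destroyed = 1;
}

/* Take a reference before handing work to the async thread. */
void ex_schedule_sweep(ex_service *svc)
{
    ex_ref(svc);
    svc->pending_sweeps++;
}

/* Body of the async sweep: do the work, then release the reference. */
int ex_run_sweep(ex_service *svc)
{
    if (svc->pending_sweeps == 0)
        return 0;
    svc->pending_sweeps--;
    svc->sweeps_run++;
    ex_deref(svc);
    return 1;
}
```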
winverbs: release CM interface only once per reference
The CM interface is not bound to a device, and is only acquired once by the winverbs driver. Release the CM interface only once after all devices have been removed, not once per hardware device.
This should fix issues enabling/disabling HCA drivers with multiple HCAs present in a single system.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2141 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
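The release rule in the commit above can be sketched as a device counter. The names are hypothetical, and the point at which the driver acquires the CM interface is an assumption of this sketch. The interface is released exactly once, when the last device is removed, rather than once per device.

```c
/* Hypothetical driver-wide state. */
typedef struct ex_driver {
    int device_count;      /* hardware devices currently present */
    int cm_held;           /* 1 while the CM interface is acquired */
    int cm_release_calls;  /* how many times the interface was released */
} ex_driver;

/* Assumed to happen once for the whole driver, not per device. */
void ex_acquire_cm(ex_driver *drv)
{
    drv->cm_held = 1;
}

void ex_add_device(ex_driver *drv)
{
    drv->device_count++;
}

/* Release the CM interface only when the last device goes away. */
void ex_remove_device(ex_driver *drv)
{
    if (drv->device_count == 0)
        return;
    if (--drv->device_count == 0 && drv->cm_held) {
        drv->cm_held = 0;
        drv->cm_release_calls++;
    }
}
```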