leonidk [Tue, 17 Nov 2009 15:03:51 +0000 (15:03 +0000)]
[MLX4] g_stat: added more prints and flags
added support for g_stat structure for MLX4_BUS driver + MAD tracing mechanism.
The MAD tracing mechanism is controlled by flags set in the new mlx4_bus Registry parameter StatFlags.
The mechanism prints to the debugger the IB headers of packets sent over MLX transport, i.e. on QP0 and QP1. [mlnx: 5138]
The flags are:
0x0001 - print LRH
0x0002 - print BTH
0x0004 - print DETH
0x0008 - print GRH (it won't print if the GRH is absent)
0x0010 - print some WQE info
0x0020 - print some more UD header info
0x0040 - print some send WR info
0x0080 - MLX WQE dump
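As a rough illustration of how such a flag word can be tested, here is a minimal sketch in C. The flag values come from the list above; the macro and function names are hypothetical and are not taken from the mlx4_bus sources.

```c
#include <stdint.h>

/* Flag values as listed above; the macro names are hypothetical. */
#define STAT_FLAG_LRH       0x0001u  /* print LRH */
#define STAT_FLAG_BTH       0x0002u  /* print BTH */
#define STAT_FLAG_DETH      0x0004u  /* print DETH */
#define STAT_FLAG_GRH       0x0008u  /* print GRH, if present */
#define STAT_FLAG_WQE_INFO  0x0010u  /* print some WQE info */
#define STAT_FLAG_UD_INFO   0x0020u  /* print some more UD header info */
#define STAT_FLAG_SEND_WR   0x0040u  /* print some send WR info */
#define STAT_FLAG_WQE_DUMP  0x0080u  /* MLX WQE dump */

/* Returns nonzero when the given trace flag is set in stat_flags. */
static int stat_flag_enabled(uint32_t stat_flags, uint32_t flag)
{
    return (stat_flags & flag) != 0;
}

/* Hypothetical helper: the GRH is printed only when its flag is set
 * and the packet actually carries a GRH. */
static int should_print_grh(uint32_t stat_flags, int grh_present)
{
    return stat_flag_enabled(stat_flags, STAT_FLAG_GRH) && grh_present;
}
```

Under this sketch, a StatFlags value of 0x0003 would enable LRH and BTH printing only.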
leonidk [Tue, 17 Nov 2009 14:12:21 +0000 (14:12 +0000)]
[IPOIB, IBBUS] g_stat: first patch of several, adding and populating global g_stat structure to the drivers of IB stack
For debug purposes.
Usable in both checked and free version.
Usage: open Watch Window in Debugger and add ipoib!g_stat or ibbus!g_stat.
An example of a real problem: MLX4 driver is stuck on unload.
It can happen when IBAL is stuck on resource reclamation.
The reclamation is being done in one of IBAL threads, which are started on IBAL's start up.
So you'd like to look into this thread.
But where is it?
You can find it by printing ALL the threads of System process and looking for ibbus in their stack traces.
It can take minutes.
After this patch you can do that in 3 seconds this way:
1. open Watch Window in WinDbg and add ibbus!g_stat.
2. open
ibbus!g_stat->Drv->Gp_async_obj_mgr->Thread_pool->P_thread[0]->Osd->P_thread
This field contains the address of the thread you wanted.
stansmith [Mon, 16 Nov 2009 18:44:54 +0000 (18:44 +0000)]
[TRUNK] Change the cl_pfn_fmap_cmp_t compare function to return an 'int' instead of 'intn_t' as int is a better return value match for standard compare functions like memcmp(), cl_memcmp(), qsort() and the like.
Additionally the change removes some usage of the non-standard C type 'intn_t' in favor of the basic C data type 'int'.
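A minimal sketch of why an 'int' return type fits: a comparator with this shape can forward directly to memcmp() and be handed to qsort() without casts. The typedef and key type below are hypothetical stand-ins and are not the complib declarations.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical comparator type with an 'int' return, matching the
 * convention used by memcmp() and qsort(). */
typedef int (*key_cmp_fn)(const void *p_key1, const void *p_key2);

/* Hypothetical fixed-size key, compared bytewise. */
typedef struct {
    unsigned char raw[8];
} demo_key_t;

/* Negative, zero, or positive result, exactly as memcmp() reports it. */
static int demo_key_cmp(const void *p_key1, const void *p_key2)
{
    return memcmp(p_key1, p_key2, sizeof(demo_key_t));
}

/* The same comparator can be passed straight to qsort(). */
static void sort_keys(demo_key_t *keys, size_t count, key_cmp_fn cmp)
{
    qsort(keys, count, sizeof(keys[0]), cmp);
}
```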
Files impacted:
inc\complib\cl_fleximap.h line #185 cl_pfn_fmap_cmp_t function returns 'int' instead of 'intn_t'.
core\al\kernel\al_ioc_pnp.c
core\al\kernel\al_pnp.c
ulp\wsd\user\ibsp_ip.c
ulp\ipoib\kernel\ipoib_port.c
ulp\ipoib\kernel\ipoib_port.cpp
ulp\wsd\user\ibspproto.h
Tested by:
building WinOF installers for wlh, win7, wnet & wxp
installing newly built installers.
Running DAPL tests, IPoIB tests & opensm tests.
Signed-off-by: stan smith <stan.smith@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2559 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
stansmith [Thu, 12 Nov 2009 23:19:40 +0000 (23:19 +0000)]
[WINOF] remove Windows 7 existence check as it's now a default OS flavor. Sign .exe files used during a WinOF install to eliminate win7 popup notifiers asking to proceed. ndinstall.exe, installsp.exe for install, devman.exe on uninstall.
shefty [Mon, 9 Nov 2009 20:09:35 +0000 (20:09 +0000)]
etc/docs: developer installation scripts
The following patch series implements a series of scripts that can be used
by developers to build and install the winof drivers across an HPC cluster.
The scripts are intended to allow quick building and replacement of specific
drivers and libraries. The process can be automated more by layering additional
scripts over those provided.
This patch documents the anticipated build and installation process. Follow
on patches in the series implement the various scripts.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2549 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
shefty [Mon, 9 Nov 2009 20:07:12 +0000 (20:07 +0000)]
winmad/inf: create inf file under bin/kernel
Winmad currently creates its inf file under core/winmad/kernel/obj*.
Move the inf file to bin/kernel/obj*. This is the location where all
other inf files in the tree are created.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2548 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
stansmith [Tue, 3 Nov 2009 22:25:45 +0000 (22:25 +0000)]
[OPENSM_3] relocated OSM_MAX_LOG_NAME_SIZE to where it is used (winosm_common.c). Wired syslog() calls into OutputDebugStringA() so one can view syslog() writes using a DebugView Monitor. Simplified osm_strdup() to handle multiple environment vars in a path/filename.
stansmith [Tue, 3 Nov 2009 22:21:48 +0000 (22:21 +0000)]
[OPENSM_3] Added OpenSM local Service control handling to reset/zero OSM log file (code 128) along with code 129 to start a heavy sweep. Emulates OFED Opensm receiving SIGUSR1 & SIGHUP.
sc control OpenSM 128 reset OSM log file
sc control OpenSM 129 start a Heavy sweep.
Added above text to usage() message under --help.
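The dispatch this implies can be sketched in portable C as below. The control codes are those shown in the 'sc control' lines above (Windows reserves 128 through 255 for user-defined service control codes); the enum, macro, and function names are hypothetical, and a real service would perform this mapping inside its registered control handler.

```c
/* Hypothetical action identifiers for the two user-defined controls. */
enum osm_action {
    OSM_ACTION_NONE,
    OSM_ACTION_RESET_LOG,
    OSM_ACTION_HEAVY_SWEEP
};

/* Codes as shown in the 'sc control OpenSM' lines; macro names are hypothetical. */
#define OSM_CTRL_RESET_LOG    128ul
#define OSM_CTRL_HEAVY_SWEEP  129ul

/* Maps a service control code to the action the service should take.
 * Unknown codes map to OSM_ACTION_NONE so they can be ignored safely. */
static enum osm_action osm_action_for_control(unsigned long control)
{
    switch (control) {
    case OSM_CTRL_RESET_LOG:
        return OSM_ACTION_RESET_LOG;
    case OSM_CTRL_HEAVY_SWEEP:
        return OSM_ACTION_HEAVY_SWEEP;
    default:
        return OSM_ACTION_NONE;
    }
}
```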
shefty [Tue, 3 Nov 2009 16:45:38 +0000 (16:45 +0000)]
In order to support opensm running over winmad (via the libibumad),
we need to set the IsSM PortInfo capability bit when it is present.
We do this in the winmad driver based on the user registering for
unsolicited directed route SMPs. The bit is unset when that user goes
away.
In order to set the capability bit, we need to add ib_modify_ca()
to the IB_AL interface. The interface GUID is updated as a result.
For opensm, a call to umad_register (directly or indirectly through
another library), should result in setting the IsSM capability bit
correctly. No additional work is required, such as calling
umad_get_issm_path and opening a separate file, as is done on Linux.
This will require platform-specific handling in the opensm code.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
git-svn-id: svn://openib.tc.cornell.edu/gen1@2536 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86
stansmith [Sat, 31 Oct 2009 00:27:53 +0000 (00:27 +0000)]
[DAPL] Sync with latest 2.0.24 OFED release
Summary of changes since last release:
v2 - winof: Utilize WinOF version of inet_ntop() for Windows OSes which do not support inet_ntop().
v2 - winof: ucm windows build issue with new CQ completion channel
v2 - winof: add ucm provider to windows build
v2 - winof: add missing build files for ibal, scm
v2 - scm: connection peer resets under heavy load, incorrect event on error
v2 - ucm: increase default reply and rtu timeout values.
v2 - ucm: change some debug message levels and add check for valid UD REPLY during retries.
v2 - ucm: increase timers during subsequent retries
v2 - ucm, scm: address handles need destroyed when freeing Endpoints with UD QP's.
v2 - openib_common: ignore pd free errors, clear pd_handle and return.
v2 - ucm: using UD type QP's, ucm reports wrong reject event when user rejects AH resolution request.
v2 - ucm, scm, cma: Fix CNO support on DTO type EVD's
v2 - ucm: fix lock init bug in ucm_cm_find
v2 - ucm: fix build problem with latest windows ucm changes
v2 - ucm: HCA should not be closed until all resources have been released.
v2 - ucm: build warning when compiling on 32-bit systems.
v2 - ucm: trying to deregister the same memory region twice
v2 - dat: reduce debug message level when parsing for location of dat.conf
v2 - ucm: update ucm provider for windows environment
v2 - ucm: add timer/retry CM logic to the ucm provider
stansmith [Thu, 29 Oct 2009 20:59:21 +0000 (20:59 +0000)]
[WINOF] enhance driver file copy error reporting by incorporating a for() loop to copy ipoib & qlgcvnic drivers. Include Winverbs ND provider in binary tree creation.
leonidk [Mon, 26 Oct 2009 10:30:10 +0000 (10:30 +0000)]
[MLX4] Allocate and map sufficient ICM memory for EQ context. [mlnx: 4946]
The current implementation allocates a single host page for EQ context
memory, which was OK when we only allocated a few EQs. However, since
we now allocate an EQ for each CPU core, this patch removes the
hard-coded limit (which we exceed with 4 KB pages and 128 byte EQ
context entries with 32 CPUs) and uses the same ICM table code as all
other context tables, which ends up simplifying the code quite a bit
while fixing the problem.
This problem was actually hit in practice on a dual-socket Nehalem box
with 16 real hardware threads and sufficiently odd ACPI tables that it
shows on boot
SMP: Allowing 32 CPUs, 16 hotplug CPUs
so num_possible_cpus() ends up 32, and mlx4 ends up creating 33 MSI-X
interrupts and 33 EQs. This mlx4 bug means that mlx4 can't even
initialize at all on this quite mainstream system.
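The arithmetic behind the limit can be checked with a small sketch. The page size, entry size, and EQ count are the figures quoted above; the function names are hypothetical and this is not the driver's allocation code.

```c
#include <stddef.h>

/* Number of EQ context entries that fit in a single host page. */
static size_t eq_contexts_per_page(size_t page_size, size_t entry_size)
{
    return page_size / entry_size;
}

/* Number of whole pages needed to hold num_eqs context entries (rounded up). */
static size_t eq_context_pages_needed(size_t num_eqs, size_t page_size,
                                      size_t entry_size)
{
    size_t bytes = num_eqs * entry_size;
    return (bytes + page_size - 1) / page_size;
}
```

With 4096-byte pages and 128-byte entries, one page holds 32 contexts, so the 33 EQs created on the system described above need a second page.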
leonidk [Mon, 26 Oct 2009 10:14:38 +0000 (10:14 +0000)]
[MLX4] limit the process of reading VPD with timeout, but continue to work on error. [mlnx: 4879]
This patch fixes a driver freeze that occurs when the FW doesn't provide VPD.
(In fact, it's a workaround for a FW bug.)
VPD is not used today in IB drivers.
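The general pattern (bounded wait, then carry on regardless) might look like the sketch below. All names are hypothetical, and a poll count stands in for the time-based timeout a real driver would use; this is not the mlx4 code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true once the VPD data is available. */
typedef bool (*vpd_ready_fn)(void *ctx);

/* Polls at most max_polls times. Returns 0 on success, -1 on timeout. */
static int wait_for_vpd(vpd_ready_fn ready, void *ctx, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; ++i) {
        if (ready(ctx))
            return 0;
    }
    return -1;
}

/* Initialization records whether VPD was read, but never fails because of it. */
static int init_device(vpd_ready_fn ready, void *ctx, unsigned max_polls,
                       bool *vpd_valid)
{
    *vpd_valid = (wait_for_vpd(ready, ctx, max_polls) == 0);
    return 0; /* continue to work even if the VPD read timed out */
}

/* Demo stub: firmware that never provides VPD. */
static bool demo_never_ready(void *ctx)
{
    (void)ctx;
    return false;
}

/* Demo stub: VPD becomes available on the third poll; ctx counts polls. */
static bool demo_ready_on_third_poll(void *ctx)
{
    unsigned *polls = ctx;
    return ++*polls >= 3;
}
```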
leonidk [Mon, 26 Oct 2009 10:05:59 +0000 (10:05 +0000)]
[CORE,HW] replace using of Paged pool by NonPaged one. [mlnx: 4836]
From time to time we see BSODs at shutdown under heavy traffic load.
This can be attributed to the fact that some of the structures used in PnP and Power Management are allocated in Paged Pool.
Since these structures are small, it is safer to always allocate them from NonPagedPool.
It makes the driver more robust.
leonidk [Mon, 26 Oct 2009 09:35:50 +0000 (09:35 +0000)]
[IBBUS,HW] add standby/hibernation support to IBBUS. [mlnx: 4750]
Mellanox HW supports neither standby nor hibernation.
To simulate such support, the low-level driver resets the HCA on power down and starts it up on power up.
IBBUS, which continues to work with the HCA in the meantime, produces BSODs.
This patch deregisters HCA from IBAL on power down and re-registers it on power up.
stansmith [Sat, 24 Oct 2009 00:32:28 +0000 (00:32 +0000)]
[INC] add defines and inline functions from OFED management ib_types.h in order to build OpenSM 3.3.2 using only trunk\inc\*. Tested by building & installing a WinOF release using current openSM and newer openSM; no observed differences.
stansmith [Fri, 23 Oct 2009 02:26:00 +0000 (02:26 +0000)]
[WIX] Added explicit Windows Volume (%SystemDrive%\) for DAT config & SDK directories. Required as older installers do not default TARGET dir to Windows Volume.