git.openfabrics.org - ~ardavis/dapl.git/log
~ardavis/dapl.git
11 years agoImproved error messages with SCIF related calls.
Jianxin Xiong [Wed, 12 Sep 2012 20:53:31 +0000 (13:53 -0700)]
Improved error messages with SCIF related calls.

11 years agoAdd instructions for installation without root privilege.
Jianxin Xiong [Mon, 10 Sep 2012 18:30:52 +0000 (11:30 -0700)]
Add instructions for installation without root privilege.

11 years agoAdd instructions for cluster deployment.
Jianxin Xiong [Thu, 6 Sep 2012 17:21:47 +0000 (10:21 -0700)]
Add instructions for cluster deployment.

11 years agoAllow changing the SCIF port id via the envar DAPL_MCM_PORT_ID so that
Jianxin Xiong [Wed, 5 Sep 2012 18:15:57 +0000 (11:15 -0700)]
Allow changing the SCIF port id via the envar DAPL_MCM_PORT_ID so that
the proxy service can be started by a regular user.
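
As a rough illustration of the override described above, the following C sketch reads DAPL_MCM_PORT_ID from the environment and falls back to a caller-supplied default. The helper name mcm_port_id, the default passed in, and the accepted range of 1 to 65535 are assumptions made for this example; the provider's actual parsing may differ.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: return the SCIF port id taken from the
 * DAPL_MCM_PORT_ID environment variable, or dflt when the variable
 * is unset, empty, malformed, or outside the assumed 1..65535 range. */
static int mcm_port_id(int dflt)
{
    const char *s = getenv("DAPL_MCM_PORT_ID");
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return dflt;

    errno = 0;
    v = strtol(s, &end, 0);   /* base 0 accepts decimal, hex and octal */
    if (errno != 0 || *end != '\0' || v <= 0 || v > 65535)
        return dflt;

    return (int)v;
}
```

A caller would invoke mcm_port_id() once at start-up, passing whatever port the build uses by default.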

11 years agoMake the default config file consistent with the initialization code.
Jianxin Xiong [Wed, 5 Sep 2012 18:10:49 +0000 (11:10 -0700)]
Make the default config file consistent with the initialization code.

11 years agoPrevent dereferencing a NULL pointer.
Jianxin Xiong [Wed, 5 Sep 2012 17:54:15 +0000 (10:54 -0700)]
Prevent dereferencing a NULL pointer.

11 years agoAdd a readme file for the MCM provider.
Jianxin Xiong [Mon, 20 Aug 2012 18:59:29 +0000 (11:59 -0700)]
Add a readme file for the MCM provider.

11 years agoAdd "buffer_segment_size" to the configuration file.
Jianxin Xiong [Mon, 20 Aug 2012 18:52:57 +0000 (11:52 -0700)]
Add "buffer_segment_size" to the configuration file.

11 years agoBug fix: scif_unregister returns -ENODEV because the length is not page aligned.
Jianxin Xiong [Sat, 18 Aug 2012 20:43:30 +0000 (13:43 -0700)]
Bug fix: scif_unregister returns -ENODEV because the length is not page aligned.

11 years agoThere is a race condition between dapl_rbuf_add() and dapl_rbuf_remove() when
Jianxin Xiong [Sat, 18 Aug 2012 05:30:06 +0000 (22:30 -0700)]
There is a race condition between dapl_rbuf_add() and dapl_rbuf_remove() when
the ring buffer is empty or full. This could lead to an incorrect item being
dequeued and the correct item being discarded.

A temporary workaround is provided to ensure the pending event queue works
correctly. Without this fix dapl_evd_dequeue() could return the wrong event
and cause various application errors (hangs, assertion failures, etc.).

Ultimately the ring buffer mechanism needs to be fixed.
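
For readers unfamiliar with the empty/full ambiguity mentioned above, here is a minimal sketch of one common way to avoid it: a single-producer, single-consumer ring that keeps one slot unused, so that head == tail always means empty and never full. This is not the dapl_rbuf code and not the workaround from this commit; the names ring, ring_add and ring_remove, the slot count, and the one-producer/one-consumer restriction are all assumptions of the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8   /* usable capacity is RING_SLOTS - 1 */

struct ring {
    void *slot[RING_SLOTS];
    atomic_size_t head;   /* next slot to read; written by the consumer only */
    atomic_size_t tail;   /* next slot to write; written by the producer only */
};

static void ring_init(struct ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: returns false when the ring is full. */
static bool ring_add(struct ring *r, void *item)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t next = (tail + 1) % RING_SLOTS;

    if (next == atomic_load_explicit(&r->head, memory_order_acquire))
        return false;                       /* full: one slot stays unused */

    r->slot[tail] = item;
    /* Publish the slot contents before moving tail forward. */
    atomic_store_explicit(&r->tail, next, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_remove(struct ring *r, void **item)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

    if (head == atomic_load_explicit(&r->tail, memory_order_acquire))
        return false;                       /* empty */

    *item = r->slot[head];
    /* Release the slot only after the item has been read. */
    atomic_store_explicit(&r->head, (head + 1) % RING_SLOTS,
                          memory_order_release);
    return true;
}
```

With several producers or consumers this scheme is no longer sufficient, and a lock or compare-and-swap loop would be needed.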

11 years agoAdd environment variable DAPL_MCM_ALWAYS_PROXY to allow proxying non-MIC
Jianxin Xiong [Sat, 18 Aug 2012 05:13:37 +0000 (22:13 -0700)]
Add environment variable DAPL_MCM_ALWAYS_PROXY to allow proxying non-MIC
initiated connections. Good for debugging.

11 years agoHandle the (unlikely) corner case that the last sge has length 0.
Jianxin Xiong [Fri, 17 Aug 2012 18:44:27 +0000 (11:44 -0700)]
Handle the (unlikely) corner case that the last sge has length 0.

11 years agoMinor code cleanup.
Jianxin Xiong [Fri, 17 Aug 2012 18:35:25 +0000 (11:35 -0700)]
Minor code cleanup.

11 years agoRecount the actual pending WRs from time to time.
Jianxin Xiong [Fri, 17 Aug 2012 06:27:34 +0000 (23:27 -0700)]
Recount the actual pending WRs from time to time.

11 years agoDon't waste the 64 bytes in the proxy buffer between segments.
Jianxin Xiong [Fri, 17 Aug 2012 05:59:20 +0000 (22:59 -0700)]
Don't waste the 64 bytes in the proxy buffer between segments.

11 years agoBug fix: the calculation of 'total_offset' could be wrong if a segmented sge is not...
Jianxin Xiong [Thu, 16 Aug 2012 19:34:30 +0000 (12:34 -0700)]
Bug fix: the calculation of 'total_offset' could be wrong if a segmented sge is not the last one.

11 years agoPrevent poll_cnt from becoming negative, which could lead to huge delay.
Jianxin Xiong [Thu, 16 Aug 2012 18:31:22 +0000 (11:31 -0700)]
Prevent poll_cnt from becoming negative, which could lead to huge delay.

11 years agoDon't flood the log with messages about deferred WR posting.
Jianxin Xiong [Thu, 16 Aug 2012 18:28:42 +0000 (11:28 -0700)]
Don't flood the log with messages about deferred WR posting.

11 years agoProxy large writes in segments.
Jianxin Xiong [Wed, 15 Aug 2012 05:32:09 +0000 (22:32 -0700)]
Proxy large writes in segments.
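
The idea of proxying a large write in segments can be sketched as below. The struct seg type and the split_segments helper are hypothetical names for this example; the segment size is whatever the caller passes in (for instance a configured value such as "buffer_segment_size"), and the real proxy code may track segments quite differently.

```c
#include <stddef.h>

/* One piece of a larger write: byte offset into the source and length. */
struct seg {
    size_t off;
    size_t len;
};

/* Hypothetical helper: describe a write of `total` bytes as consecutive
 * segments of at most `seg_size` bytes. Up to `max` descriptors are
 * stored in out[]; the return value is the number of segments needed,
 * which may be larger than `max`. Returns 0 if seg_size is 0. */
static size_t split_segments(size_t total, size_t seg_size,
                             struct seg *out, size_t max)
{
    size_t n = 0;
    size_t off = 0;

    if (seg_size == 0)
        return 0;

    while (off < total) {
        size_t left = total - off;
        size_t len = left < seg_size ? left : seg_size;

        if (n < max) {
            out[n].off = off;
            out[n].len = len;
        }
        n++;
        off += len;
    }
    return n;
}
```

Only the last segment can be shorter than seg_size; every other segment is exactly seg_size bytes.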

11 years agoReplace scif_writeto with scif_readfrom, preparing for segmentation support.
Jianxin Xiong [Tue, 14 Aug 2012 20:00:58 +0000 (13:00 -0700)]
Replace scif_writeto with scif_readfrom, preparing for segmentation support.

11 years agoMake both address and size 64-byte aligned when calling scif_writeto().
Jianxin Xiong [Tue, 14 Aug 2012 05:23:41 +0000 (22:23 -0700)]
Make both address and size 64-byte aligned when calling scif_writeto().
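
A minimal sketch of aligning both ends of a transfer to 64 bytes follows. It widens the range by rounding the start address down and the end address up; whether the commit widens the range or handles the unaligned edges some other way is not stated here, so treat align_range and struct aligned_range as illustrative names only. Widening is safe only when the caller owns the extra bytes on both sides.

```c
#include <stddef.h>
#include <stdint.h>

#define XFER_ALIGN 64u   /* alignment named in the commit message */

struct aligned_range {
    uintptr_t addr;   /* start, rounded down to a 64-byte boundary */
    size_t    len;    /* length, so that addr + len is also 64-byte aligned */
};

/* Hypothetical helper: smallest 64-byte-aligned range covering
 * [addr, addr + len). */
static struct aligned_range align_range(uintptr_t addr, size_t len)
{
    const uintptr_t mask = (uintptr_t)XFER_ALIGN - 1;
    struct aligned_range r;
    uintptr_t end = addr + len;

    r.addr = addr & ~mask;            /* round start down */
    end = (end + mask) & ~mask;       /* round end up */
    r.len = (size_t)(end - r.addr);
    return r;
}
```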

11 years agoPort numbers need to be unique within each physical IB device (md),
Jianxin Xiong [Fri, 10 Aug 2012 00:55:04 +0000 (17:55 -0700)]
Port numbers need to be unique within each physical IB device (md),
otherwise connection requests between multiple clients of the same
proxy server can be matched incorrectly.

11 years agoCorrect many mismatches between the log messages and the actual operations.
Jianxin Xiong [Tue, 7 Aug 2012 15:53:18 +0000 (08:53 -0700)]
Correct many mismatches between the log messages and the actual operations.

11 years agoCorrect the MR size calculation when registering with SCIF.
Jianxin Xiong [Tue, 7 Aug 2012 05:53:19 +0000 (22:53 -0700)]
Correct the MR size calculation when registering with SCIF.

11 years agoMPXYD: only post an incoming write WR when there is no pending WR on the QP,
Jianxin Xiong [Sat, 4 Aug 2012 00:32:31 +0000 (17:32 -0700)]
MPXYD: only post an incoming write WR when there is no pending WR on the QP,
otherwise there might be ordering issues.

11 years agoFix the (nil) "wr_id" output of posted pending work request.
Jianxin Xiong [Sat, 4 Aug 2012 00:25:44 +0000 (17:25 -0700)]
Fix the (nil) "wr_id" output of posted pending work request.

11 years agoBug fix: connection error with "unexpected DAPL event 0x4003" when more than one...
Jianxin Xiong [Fri, 3 Aug 2012 03:30:34 +0000 (20:30 -0700)]
Bug fix: connection error with "unexpected DAPL event 0x4003" when more than one client exists.

11 years agoCorrect the range of valid ops in mix_op_str().
Jianxin Xiong [Thu, 2 Aug 2012 21:38:43 +0000 (14:38 -0700)]
Correct the range of valid ops in mix_op_str().

11 years agoFix connection setup deadlock with more than one client connecting to MPXYD.
Jianxin Xiong [Thu, 2 Aug 2012 18:45:50 +0000 (11:45 -0700)]
Fix connection setup deadlock with more than one client connecting to MPXYD.

11 years agoAdd wr_end field for entry count and hd/tl adjustments
Arlin Davis [Wed, 25 Jul 2012 06:24:59 +0000 (23:24 -0700)]
Add wr_end field for entry count and hd/tl adjustments

MPI pingpong now works. Stress testing still does not go well.
Some hang conditions still need to be debugged.

11 years agodtest update for unaligned/aligned data option, unaligned by default
sean.hefty@intel.com [Mon, 23 Jul 2012 22:20:08 +0000 (15:20 -0700)]
dtest update for unaligned/aligned data option, unaligned by default

set -a for page alignment.

return EINVAL for anything other than sends or writes.

11 years agordma write working both directions. MIC to XEON, XEON to MIC.
Arlin Davis [Mon, 23 Jul 2012 06:42:16 +0000 (23:42 -0700)]
rdma write working both directions. MIC to XEON, XEON to MIC.

updated memory registration to adjust for SCIF page alignment

12 years agoadd progress thread processing of delayed WR's
Arlin Davis [Tue, 17 Jul 2012 17:44:19 +0000 (10:44 -0700)]
add progress thread processing of delayed WR's

12 years agordma writes working
Arlin Davis [Tue, 17 Jul 2012 05:04:18 +0000 (22:04 -0700)]
rdma writes working

12 years agordma write support coded, compiled not tested
sean.hefty@intel.com [Sun, 15 Jul 2012 07:11:56 +0000 (00:11 -0700)]
rdma write support coded, compiled not tested

12 years agoSends/receives working, Xeon to MIC
Arlin Davis [Wed, 11 Jul 2012 16:33:19 +0000 (09:33 -0700)]
Sends/receives working, Xeon to MIC

12 years agoupdate debug message
Arlin Davis [Thu, 5 Jul 2012 16:05:36 +0000 (09:05 -0700)]
update debug message

12 years agoMCM development
Arlin Davis [Thu, 5 Jul 2012 16:01:47 +0000 (09:01 -0700)]
MCM development

12 years agoupdated for QP, CM services
Arlin Davis [Tue, 26 Jun 2012 18:35:53 +0000 (11:35 -0700)]
updated for  QP, CM services

12 years agoadd mcm
Arlin Davis [Wed, 13 Jun 2012 01:33:45 +0000 (18:33 -0700)]
add mcm

12 years agoupdate makefile, dat.conf
Arlin Davis [Wed, 13 Jun 2012 01:33:14 +0000 (18:33 -0700)]
update makefile, dat.conf

12 years agoadd extension.h update mpxyd.c
Arlin Davis [Wed, 6 Jun 2012 19:19:19 +0000 (12:19 -0700)]
add extension.h update mpxyd.c

12 years agoadd mcm.c and mix.c, MIC indirect CM code and MIC indirect eXchange messager
Arlin Davis [Mon, 4 Jun 2012 23:00:41 +0000 (16:00 -0700)]
add mcm.c and mix.c, MIC indirect CM code and MIC indirect eXchange messager

12 years agoupdate for chkconfig
Arlin Davis [Thu, 31 May 2012 22:56:20 +0000 (15:56 -0700)]
update for chkconfig

12 years agoupdate makefile
Arlin Davis [Thu, 31 May 2012 22:11:07 +0000 (15:11 -0700)]
update makefile

12 years agoupdate spec file
Arlin Davis [Thu, 31 May 2012 21:47:00 +0000 (14:47 -0700)]
update spec file

12 years agomakefile update
Arlin Davis [Thu, 31 May 2012 21:44:16 +0000 (14:44 -0700)]
makefile update

12 years agoupdate Makefile move daplmpxy to mpxyd.init.in
Arlin Davis [Thu, 31 May 2012 21:41:03 +0000 (14:41 -0700)]
update Makefile move daplmpxy to mpxyd.init.in

12 years agoupdate makefile
Arlin Davis [Thu, 31 May 2012 21:38:40 +0000 (14:38 -0700)]
update makefile

12 years agoadd mpxyd.conf
Arlin Davis [Tue, 22 May 2012 18:55:06 +0000 (11:55 -0700)]
add mpxyd.conf

12 years agoupdate makefile for mpxyd.conf
Arlin Davis [Tue, 22 May 2012 18:31:58 +0000 (11:31 -0700)]
update makefile for mpxyd.conf

12 years agoupdate makefile
Arlin Davis [Tue, 22 May 2012 17:47:31 +0000 (10:47 -0700)]
update makefile

12 years agoupdate makefile
Arlin Davis [Tue, 22 May 2012 17:44:12 +0000 (10:44 -0700)]
update makefile

12 years agoadd mpxyd.c
Arlin Davis [Mon, 21 May 2012 22:40:30 +0000 (15:40 -0700)]
add mpxyd.c

12 years agofix makefile
Arlin Davis [Mon, 21 May 2012 22:37:12 +0000 (15:37 -0700)]
fix makefile

12 years agofix makefile
Arlin Davis [Mon, 21 May 2012 22:34:19 +0000 (15:34 -0700)]
fix makefile

12 years agofix makefile
Arlin Davis [Mon, 21 May 2012 22:32:27 +0000 (15:32 -0700)]
fix makefile

12 years agoconfigure changes.
Arlin Davis [Mon, 21 May 2012 22:27:53 +0000 (15:27 -0700)]
configure changes.

12 years agoupdate configure.in
Arlin Davis [Mon, 21 May 2012 22:24:35 +0000 (15:24 -0700)]
update configure.in

12 years agoChange names to mpxy
Arlin Davis [Mon, 21 May 2012 22:22:00 +0000 (15:22 -0700)]
Change names to mpxy

12 years agopxd.c
Arlin Davis [Wed, 16 May 2012 23:23:16 +0000 (16:23 -0700)]
pxd.c

12 years agopxd.c
Arlin Davis [Wed, 16 May 2012 23:22:33 +0000 (16:22 -0700)]
pxd.c

12 years agopxd.c
Arlin Davis [Wed, 16 May 2012 23:21:37 +0000 (16:21 -0700)]
pxd.c

12 years agomakefile
Arlin Davis [Wed, 16 May 2012 23:20:00 +0000 (16:20 -0700)]
makefile

12 years agoupdate pxd.c
Arlin Davis [Wed, 16 May 2012 23:16:07 +0000 (16:16 -0700)]
update pxd.c

12 years agomakefile
Arlin Davis [Wed, 16 May 2012 23:14:52 +0000 (16:14 -0700)]
makefile

12 years agomakefile.am
Arlin Davis [Wed, 16 May 2012 23:09:56 +0000 (16:09 -0700)]
makefile.am

12 years agoadd svc/pxm.c for daemon
Arlin Davis [Wed, 16 May 2012 23:06:43 +0000 (16:06 -0700)]
add svc/pxm.c for daemon

12 years agomakefile.am
Arlin Davis [Wed, 16 May 2012 23:02:53 +0000 (16:02 -0700)]
makefile.am

12 years agoMakefile.am
Arlin Davis [Wed, 16 May 2012 22:59:39 +0000 (15:59 -0700)]
Makefile.am

12 years agoconfigure.in
Arlin Davis [Wed, 16 May 2012 22:56:45 +0000 (15:56 -0700)]
configure.in

12 years agoupdate configure.in
Arlin Davis [Wed, 16 May 2012 22:54:44 +0000 (15:54 -0700)]
update configure.in

12 years agoadd daplpdx.init.in
Arlin Davis [Wed, 16 May 2012 22:51:17 +0000 (15:51 -0700)]
add daplpdx.init.in

12 years agoupdate for daplpxd daemon
Arlin Davis [Wed, 16 May 2012 22:48:55 +0000 (15:48 -0700)]
update for daplpxd daemon

12 years agoadd libdaplopcm.map
Arlin Davis [Wed, 16 May 2012 19:01:24 +0000 (12:01 -0700)]
add libdaplopcm.map

12 years agofix makefile.am
Arlin Davis [Wed, 16 May 2012 18:54:51 +0000 (11:54 -0700)]
fix makefile.am

12 years agofix makefile.am
Arlin Davis [Wed, 16 May 2012 18:53:14 +0000 (11:53 -0700)]
fix makefile.am

12 years agoadd pcm files
Arlin Davis [Wed, 16 May 2012 18:46:59 +0000 (11:46 -0700)]
add pcm files

12 years agofix configure.in
Arlin Davis [Wed, 16 May 2012 18:42:42 +0000 (11:42 -0700)]
fix configure.in

12 years agoAdd configuration files for new Proxy RDMA/SCIF Provider
Arlin Davis [Wed, 16 May 2012 18:28:09 +0000 (11:28 -0700)]
Add configuration files for new Proxy RDMA/SCIF Provider

This proxy RDMA provider is an interbus service within a platform
that enables offloading of RDMA writes from a small core (MIC) to a
large core resource. Data sourced on a MIC that is destined for
remote nodes on the fabric will be transferred to the large core
via SCIF and then to the remote node.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agocommon: allow qp modify in init state
Arlin Davis [Mon, 14 May 2012 21:51:38 +0000 (14:51 -0700)]
common: allow qp modify in init state

Allow consumer to modify attributes via dat_ep_modify
in init state.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agocommon: check for valid states during ep posting
Arlin Davis [Thu, 10 May 2012 21:57:31 +0000 (14:57 -0700)]
common: check for valid states during ep posting

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agodat.conf: keep list of providers in order for backward compatibility
Arlin Davis [Thu, 10 May 2012 20:35:55 +0000 (13:35 -0700)]
dat.conf: keep list of providers in order for backward compatibility

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoucm: record and silently drop a duplicate reject CM message
Arlin Davis [Thu, 10 May 2012 17:49:09 +0000 (10:49 -0700)]
ucm: record and silently drop a duplicate reject CM message

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agowindows: new version of getlocalipaddr not portable
Arlin Davis [Wed, 25 Apr 2012 20:37:53 +0000 (13:37 -0700)]
windows: new version of getlocalipaddr not portable

revert to the original getaddrinfo method for windows

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agodapltest: DFLT_QLEN is defined in multiple tests
Arlin Davis [Wed, 25 Apr 2012 20:36:52 +0000 (13:36 -0700)]
dapltest: DFLT_QLEN is defined in multiple tests

add #ifdef checking in transaction test.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoRelease 2.0.35 dapl-2.0.35-1
Arlin Davis [Wed, 25 Apr 2012 20:10:39 +0000 (13:10 -0700)]
Release 2.0.35

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoconfig/build: remove post/postun hacking used to modify dat.conf
Arlin Davis [Wed, 25 Apr 2012 20:07:10 +0000 (13:07 -0700)]
config/build: remove post/postun hacking used to modify dat.conf

Return to the tried and true method of managing configuration
files via %config directive and remove ugly sed editing methods.
The dat.conf includes both v1 and v2 device entries to ensure
backward compatibility. Add doc/dat.conf

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoconfig: clean up help option displays with ext-type options
Arlin Davis [Mon, 23 Apr 2012 17:35:24 +0000 (10:35 -0700)]
config: clean up help option displays with ext-type options

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agowindows: Provide auto-detect between RoCE and Infiniband for Windows.
stan smith [Mon, 23 Apr 2012 17:32:00 +0000 (10:32 -0700)]
windows: Provide auto-detect between RoCE and Infiniband for Windows.

For RoCE, enable transport global ID use.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoucm: update UD cm provider to support new CM stat and error counters
Arlin Davis [Fri, 20 Apr 2012 00:40:45 +0000 (17:40 -0700)]
ucm: update UD cm provider to support new CM stat and error counters

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoscm: update socket cm provider to support new CM stat and error counters
Arlin Davis [Fri, 20 Apr 2012 00:40:03 +0000 (17:40 -0700)]
scm: update socket cm provider to support new CM stat and error counters

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agocommon: add cm, link, and diag event counters in IB extended builds
Arlin Davis [Fri, 20 Apr 2012 00:15:22 +0000 (17:15 -0700)]
common: add cm, link, and diag event counters in IB extended builds

Add additional event monitoring capabilities during runtime to help
isolate issues during scaling in lieu of logging/printing warning
messages. Counters have been added to provider CM services and counters
have been added and mapped to sysfs ib_cm, device port and device
diag counters. ibdev_path is used for device sysfs counters.

uDAPL CM events are tracked on a per IA instance via internal
provider counters. The ib_cm, link, and diag events are tracked on a
per platform basis via sysfs. For these running counters a start
and stop function is provided for sampling and mapping to DAPL
64 bit counters. All counters, along with new start and stop functions,
are provided via dat_ib_extensions.h. New IB extension version is 2.0.7

New DCNT_IA_xx counters include 40 cm, 9 link, and 9 diag types.

To enable new counters (default build is disabled):
./configure --enable-counters

New bitmappings have been added to DAPL_DBG_TYPE environment
variable to automatically start/stop counters and log
errors if counters are enabled. The following will control
CM, LINK, and DIAG respectively:

   DAPL_DBG_TYPE_CM_ERRS = 0x080000,
   DAPL_DBG_TYPE_LINK_ERRS = 0x100000,
   DAPL_DBG_TYPE_DIAG_ERRS = 0x400000,
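
As a small worked example of combining these bits, the sketch below ORs the three values listed above into a single mask and formats it as an environment assignment. The enum values are copied from this commit message; the helper names dbg_mask_all_errs and format_dbg_env are hypothetical, and the assumption that DAPL_DBG_TYPE accepts a hexadecimal string should be checked against the library documentation.

```c
#include <stddef.h>
#include <stdio.h>

/* Bit values as listed in the commit message. */
enum {
    DAPL_DBG_TYPE_CM_ERRS   = 0x080000,
    DAPL_DBG_TYPE_LINK_ERRS = 0x100000,
    DAPL_DBG_TYPE_DIAG_ERRS = 0x400000
};

/* Hypothetical helper: mask enabling CM, LINK and DIAG error counters. */
static unsigned int dbg_mask_all_errs(void)
{
    return DAPL_DBG_TYPE_CM_ERRS |
           DAPL_DBG_TYPE_LINK_ERRS |
           DAPL_DBG_TYPE_DIAG_ERRS;
}

/* Hypothetical helper: write "DAPL_DBG_TYPE=0x..." into buf.
 * Returns the snprintf result (length that was or would be written). */
static int format_dbg_env(char *buf, size_t len, unsigned int mask)
{
    return snprintf(buf, len, "DAPL_DBG_TYPE=0x%x", mask);
}
```

Any subset of the three bits can be ORed in the same way to enable only the counter groups of interest.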

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoscm: use ioctl SIOCIFCONF to get complete list of configured netdev interfaces
Arlin Davis [Tue, 17 Apr 2012 22:24:22 +0000 (15:24 -0700)]
scm: use ioctl SIOCIFCONF to get complete list of configured netdev interfaces

Replace usage of getaddrinfo, since it doesn't actually return bound addresses
and can return the loopback address in some configurations. Some
systems may not have eth0 configured, so you cannot assume eth0 as a non-loopback
default netdev.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoucm: UD send failures at scale, ucm_send ERR: get_smsg(hd=149,tl=150)
Arlin Davis [Fri, 17 Feb 2012 18:28:48 +0000 (10:28 -0800)]
ucm: UD send failures at scale, ucm_send ERR: get_smsg(hd=149,tl=150)

Full sendq should retry polling completions instead of failing.
When sendq is full and all requests are pending the get send message
code should retry polling for completions and not return error on first
empty CQ attempt. Give HCA a chance to complete some batched requests.
Also, clean up the send message error logging.
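
The retry-instead-of-fail behaviour described above could look roughly like the following bounded polling loop. The poll_fn callback type, the wait_for_send_slot helper, and the retry limit are assumptions for this sketch, and demo_poll is only a stand-in for polling a real completion queue; none of these names come from the ucm provider.

```c
#include <stddef.h>

/* Hypothetical callback: reap completions and return how many were
 * found, 0 if the CQ was empty, or a negative value on error. */
typedef int (*poll_fn)(void *ctx);

/* When the send queue is full, keep polling for completions instead of
 * failing on the first empty poll. Returns 0 once at least one
 * completion has freed a slot, -1 on error or after max_retries
 * empty polls. */
static int wait_for_send_slot(poll_fn poll, void *ctx, int max_retries)
{
    int i;

    for (i = 0; i < max_retries; i++) {
        int n = poll(ctx);

        if (n < 0)
            return -1;    /* hard error from the poll routine */
        if (n > 0)
            return 0;     /* a slot is available again */
    }
    return -1;            /* still full after all retries */
}

/* Stand-in poll routine for demonstration: ctx points to an int that
 * counts how many empty polls remain before one completion appears. */
static int demo_poll(void *ctx)
{
    int *empty_polls_left = ctx;

    if (*empty_polls_left > 0) {
        (*empty_polls_left)--;
        return 0;
    }
    return 1;
}
```

A real implementation would likely also yield or back off between polls so the HCA has time to complete batched requests.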

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoscm: fix retry count on connection pending timeout
Arlin Davis [Mon, 6 Feb 2012 22:04:37 +0000 (14:04 -0800)]
scm: fix retry count on connection pending timeout

Retry count was not being decremented on connection TIMEOUT.
Also, clean up log messages on CONN and REP pending and
add local port to output.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agoucm: cleanup debug message, ntohl on p_size is incorrect
Arlin Davis [Mon, 6 Feb 2012 22:03:20 +0000 (14:03 -0800)]
ucm: cleanup debug message, ntohl on p_size is incorrect

private data size is a short, change to ntohs on log message
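
To make the ntohs versus ntohl point concrete: a 16-bit field carried in network byte order must be converted with ntohs, because ntohl treats its argument as 32 bits and on a little-endian host moves the meaningful bytes into the upper half of the result. The helper p_size_to_host below is a hypothetical name used only for this sketch.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper: convert a 16-bit private-data size from network
 * byte order to a host-order value suitable for printing with %u. */
static unsigned int p_size_to_host(uint16_t p_size_net)
{
    return (unsigned int)ntohs(p_size_net);
}
```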

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agocma, scm, ucm: allow EP (QP) creation without EVD (CQ)
Arlin Davis [Mon, 30 Jan 2012 18:19:29 +0000 (10:19 -0800)]
cma, scm, ucm: allow EP (QP) creation without EVD (CQ)

Provide the ability to create an EP/QP with no EVD/CQ on either the
request or receive queue. The current implementation allows this on the
receive queue but not the request queue. Not all OFA devices support
a null CQ, so if necessary create a dummy CQ at the time of
QP creation. Also, if no CQ is specified, set the appropriate QP
max wr/sge attributes to zero.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agocommon: add DAPL_DBG_TYPE_CM_STATS (0x40000) to debug log options
Arlin Davis [Mon, 30 Jan 2012 18:09:42 +0000 (10:09 -0800)]
common: add DAPL_DBG_TYPE_CM_STATS (0x40000) to debug log options

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
12 years agocommon: dapls_ep_flush_cq will segfault when no CQ is attached to EP
Arlin Davis [Wed, 25 Jan 2012 19:54:29 +0000 (11:54 -0800)]
common: dapls_ep_flush_cq will segfault when no CQ is attached to EP

add check for NULL request/receive EVD (cq) before flushing.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>