git.openfabrics.org - ~ardavis/dapl.git/log
~ardavis/dapl.git
11 years agoRelease intel-mic-ofed-dapl-2.0.36.1-1
Arlin Davis [Mon, 26 Nov 2012 18:56:46 +0000 (10:56 -0800)]
Release intel-mic-ofed-dapl-2.0.36.1-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agoadd new package spec file - intel-mic-ofed-dapl.spec.in
Arlin Davis [Mon, 26 Nov 2012 18:29:40 +0000 (10:29 -0800)]
add new package spec file - intel-mic-ofed-dapl.spec.in

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agopackage update: rename package for Intel MPSS release
Arlin Davis [Mon, 26 Nov 2012 18:23:29 +0000 (10:23 -0800)]
package update: rename package for Intel MPSS release

dapl- renamed to intel-mic-ofed-dapl-

new package definitions will obsolete the previous dapl version

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agoRelease 2.0.36.1
Arlin Davis [Thu, 15 Nov 2012 22:14:14 +0000 (14:14 -0800)]
Release 2.0.36.1

Change version to include sub-minor numbers. Add options
to specfile to include CCFLAGS and LDFLAG options via rpmbuild.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agoinstall: correct the path to /etc
Patrick Mccormick [Thu, 15 Nov 2012 02:49:16 +0000 (18:49 -0800)]
install: correct the path to /etc

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agompxyd: don't segfault if pid file is not found when attempting to kill running daemon
Patrick Mccormick [Thu, 15 Nov 2012 02:46:07 +0000 (18:46 -0800)]
mpxyd: don't segfault if pid file is not found when attempting to kill running daemon

11 years agompxyd: misc cleanups for consistency
Patrick Mccormick [Thu, 15 Nov 2012 02:44:34 +0000 (18:44 -0800)]
mpxyd: misc cleanups for consistency

11 years agompxyd: expose CM request and reply timers and retry count
Arlin Davis [Thu, 15 Nov 2012 02:36:56 +0000 (18:36 -0800)]
mpxyd: expose CM request and reply timers and retry count

add entries in the mpxyd.conf for timers and retry. Start with
larger default timers given small cores are processing messages
and they are proxied via SCIF.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agocommon: 2 new debug logging levels for low system memory and package info
Arlin Davis [Mon, 29 Oct 2012 18:54:46 +0000 (11:54 -0700)]
common: 2 new debug logging levels for low system memory and package info

DAPL_DBG_TYPE_SYS_WARN = 0x800000
DAPL_DBG_TYPE_VER = 0x1000000

Setting DAPL_DBG_SYS_MEM (for example, export DAPL_DBG_SYS_MEM=5) makes the
library check for system memory below that percentage (5% here) when
DAPL_DBG_TYPE has the DAPL_DBG_TYPE_SYS_WARN bit set.

The package must be built with --enable-counters for memory checking to
be enabled.

In addition, if DAPL_DBG_TYPE is set with bit DAPL_DBG_TYPE_VER, then
the package rev and build date will be sent to stdout during library
init.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
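
The two debug bits and the percentage threshold described above can be sketched in C as follows. This is a minimal illustration, not the library's code: only the two bit values come from the commit message, while the helper names dbg_parse_mask() and dbg_low_memory() and their parameters are hypothetical.

```c
#include <stdlib.h>

/* Bit values as listed in the commit message. */
#define DAPL_DBG_TYPE_SYS_WARN 0x800000UL
#define DAPL_DBG_TYPE_VER      0x1000000UL

/* Hypothetical helper: parse a debug mask such as "0x1800000".
 * Base 0 lets strtoul accept hex, octal, or decimal input. */
static unsigned long dbg_parse_mask(const char *text)
{
    return text ? strtoul(text, NULL, 0) : 0UL;
}

/* Hypothetical helper: return 1 when the SYS_WARN bit is set and the
 * free memory is strictly below pct percent of the total, else 0. */
static int dbg_low_memory(unsigned long mask,
                          unsigned long long free_kb,
                          unsigned long long total_kb,
                          unsigned int pct)
{
    if (!(mask & DAPL_DBG_TYPE_SYS_WARN) || total_kb == 0)
        return 0;
    return free_kb * 100ULL < total_kb * (unsigned long long)pct;
}
```

Under these assumptions, a mask of 0x1800000 enables both the memory warning and the version output, since it is the bitwise OR of the two values.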
11 years agomcm: process QP create errors from mpxyd
Arlin Davis [Mon, 29 Oct 2012 18:50:04 +0000 (11:50 -0700)]
mcm: process QP create errors from mpxyd

add checking and reporting for QP errors back to mix client
on the MIC adapter

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agocommon: add register memory to logging on non-debug build.
Arlin Davis [Mon, 29 Oct 2012 18:48:49 +0000 (11:48 -0700)]
common: add register memory to logging on non-debug build.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agocommon: turn down debug level during ep free
Arlin Davis [Mon, 29 Oct 2012 18:36:49 +0000 (11:36 -0700)]
common: turn down debug level during ep free

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agomcm: cleanup build issues with unused variables on --disable-debug
Arlin Davis [Fri, 19 Oct 2012 16:57:45 +0000 (09:57 -0700)]
mcm: cleanup build issues with unused variables on --disable-debug

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agocommon: cleanup/remove debug code in post send
Arlin Davis [Fri, 19 Oct 2012 16:56:05 +0000 (09:56 -0700)]
common: cleanup/remove debug code in post send

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agoAdd some rpm targets
Patrick McCormick [Mon, 15 Oct 2012 18:51:32 +0000 (11:51 -0700)]
Add some rpm targets

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agoEnable mcm on the proxy branch
Patrick McCormick [Mon, 15 Oct 2012 18:51:08 +0000 (11:51 -0700)]
Enable mcm on the proxy branch

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agoCorrect mpxyd.conf file location
Patrick McCormick [Mon, 15 Oct 2012 18:50:33 +0000 (11:50 -0700)]
Correct mpxyd.conf file location

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
11 years agoMerge cst-linux:~jxiong/scm/dapl into proxy
Arlin Davis [Wed, 3 Oct 2012 22:19:24 +0000 (15:19 -0700)]
Merge cst-linux:~jxiong/scm/dapl into proxy

11 years agoKill before daemonizing
Jianxin Xiong [Tue, 2 Oct 2012 17:16:04 +0000 (10:16 -0700)]
Kill before daemonizing

From: Patrick McCormick <patrick.m.mccormick@intel.com>

11 years agoUse libc's daemon function and check if it fails
Jianxin Xiong [Tue, 2 Oct 2012 17:11:10 +0000 (10:11 -0700)]
Use libc's daemon function and check if it fails

From: Patrick McCormick <patrick.m.mccormick@intel.com>

11 years agoIf scif port 0 is specified (let system select the port), print out the port chosen
Jianxin Xiong [Mon, 1 Oct 2012 17:49:44 +0000 (10:49 -0700)]
If scif port 0 is specified (let system select the port), print out the port chosen

From: Patrick McCormick <patrick.m.mccormick@intel.com>

11 years agoAdd the "debug mode" which manages the pid file (/tmp/mpxyd.pid.<uid>), defaults
Jianxin Xiong [Mon, 1 Oct 2012 17:40:01 +0000 (10:40 -0700)]
Add the "debug mode" which manages the pid file (/tmp/mpxyd.pid.<uid>), defaults
to not daemonizing, and writes the log to stdout.

From: pmmccorm <patrick.m.mccormick@intel.com>

11 years agoAllow the init script to be run as a regular user.
Jianxin Xiong [Fri, 28 Sep 2012 22:34:38 +0000 (15:34 -0700)]
Allow the init script to be run as a regular user.

From: pmmccorm <patrick.m.mccormick@intel.com>

11 years agoConvert mpxyd.conf from DOS format to UNIX format.
Jianxin Xiong [Fri, 28 Sep 2012 20:10:39 +0000 (13:10 -0700)]
Convert mpxyd.conf from DOS format to UNIX format.

11 years agoAdd an option to kill the running daemon.
Jianxin Xiong [Fri, 28 Sep 2012 05:05:29 +0000 (22:05 -0700)]
Add an option to kill the running daemon.

11 years agoCatch SIGINT/SIGTERM and cleanup the lock file.
Jianxin Xiong [Fri, 28 Sep 2012 03:28:42 +0000 (20:28 -0700)]
Catch SIGINT/SIGTERM and cleanup the lock file.

11 years agoChange some default option values.
Jianxin Xiong [Fri, 28 Sep 2012 03:00:07 +0000 (20:00 -0700)]
Change some default option values.

11 years agoOutput configuration file name in the log.
Jianxin Xiong [Fri, 28 Sep 2012 02:55:34 +0000 (19:55 -0700)]
Output configuration file name in the log.

11 years agoOutput more error messages.
Jianxin Xiong [Fri, 28 Sep 2012 01:44:13 +0000 (18:44 -0700)]
Output more error messages.

11 years agoInit script: get the lock file name from the configuration file.
Jianxin Xiong [Thu, 27 Sep 2012 17:26:42 +0000 (10:26 -0700)]
Init script: get the lock file name from the configuration file.

From: pmmccorm <patrick.m.mccormick@intel.com>

11 years agoGive an error message when the lock file cannot be opened.
Jianxin Xiong [Thu, 27 Sep 2012 16:49:05 +0000 (09:49 -0700)]
Give an error message when the lock file cannot be opened.

11 years agoAdd a configuration option for the maximum message size.
Jianxin Xiong [Thu, 27 Sep 2012 16:27:28 +0000 (09:27 -0700)]
Add a configuration option for the maximum message size.

11 years agoDefault to /var/run for .pid files
Jianxin Xiong [Wed, 26 Sep 2012 21:09:58 +0000 (14:09 -0700)]
Default to /var/run for .pid files

From: pmmccorm <patrick.m.mccormick@intel.com>

11 years agoOrchestrate the handling of pid and lock files
Jianxin Xiong [Wed, 26 Sep 2012 21:03:58 +0000 (14:03 -0700)]
Orchestrate the handling of pid and lock files

From: pmmccorm <patrick.m.mccormick@intel.com>

11 years agoIncrease the proxy send queue size according to the segmentation setting.
Jianxin Xiong [Tue, 25 Sep 2012 22:36:43 +0000 (15:36 -0700)]
Increase the proxy send queue size according to the segmentation setting.

11 years agoAdd option to allow the proxy buffer to be shared among connections from
Jianxin Xiong [Fri, 21 Sep 2012 04:13:16 +0000 (21:13 -0700)]
Add option to allow the proxy buffer to be shared among connections from
the same client. This can greatly reduce the memory consumption of
the proxy server when the number of connections is large.

11 years agoRevamp the instructions for the MCM provider.
Jianxin Xiong [Thu, 20 Sep 2012 20:45:21 +0000 (13:45 -0700)]
Revamp the instructions for the MCM provider.

11 years agoChange the default per connection proxy RDMA buffer from 64MB to 32MB for
Jianxin Xiong [Thu, 20 Sep 2012 02:53:40 +0000 (19:53 -0700)]
Change the default per connection proxy RDMA buffer from 64MB to 32MB for
better scalability. It can still be changed in mpxyd.conf.

11 years agoBug fix: lock/log file path is truncated to 32 characters.
Jianxin Xiong [Wed, 19 Sep 2012 17:07:00 +0000 (10:07 -0700)]
Bug fix: lock/log file path is truncated to 32 characters.

11 years agoIncrease the default queue length for scif_listen() call to 64, and
Jianxin Xiong [Tue, 18 Sep 2012 01:11:44 +0000 (18:11 -0700)]
Increase the default queue length for scif_listen() call to 64, and
add an option to the configuration file to override this setting.

11 years agoUpdate the non-root setup instructions.
Jianxin Xiong [Sat, 15 Sep 2012 00:50:42 +0000 (17:50 -0700)]
Update the non-root setup instructions.

11 years agoChange "MIC Indirect" to "CCL-proxy" in the readme files.
Jianxin Xiong [Fri, 14 Sep 2012 21:03:22 +0000 (14:03 -0700)]
Change "MIC Indirect" to "CCL-proxy" in the readme files.

11 years agoBug fix: MPI connection failure with 64+ ranks due to REJ packet not being delivered...
Jianxin Xiong [Fri, 14 Sep 2012 06:22:58 +0000 (23:22 -0700)]
Bug fix: MPI connection failure with 64+ ranks due to REJ packet not being delivered to the client.

11 years agoCatch proxy server segfault when scaling up.
Jianxin Xiong [Wed, 12 Sep 2012 22:18:23 +0000 (15:18 -0700)]
Catch proxy server segfault when scaling up.

11 years agoImproved error messages with SCIF related calls.
Jianxin Xiong [Wed, 12 Sep 2012 20:53:31 +0000 (13:53 -0700)]
Improved error messages with SCIF related calls.

11 years agoAdd instructions for installation without root privilege.
Jianxin Xiong [Mon, 10 Sep 2012 18:30:52 +0000 (11:30 -0700)]
Add instructions for installation without root privilege.

11 years agoAdd instructions for cluster deployment.
Jianxin Xiong [Thu, 6 Sep 2012 17:21:47 +0000 (10:21 -0700)]
Add instructions for cluster deployment.

11 years agoAllow changing the SCIF port id via the envar DAPL_MCM_PORT_ID so that
Jianxin Xiong [Wed, 5 Sep 2012 18:15:57 +0000 (11:15 -0700)]
Allow changing the SCIF port id via the envar DAPL_MCM_PORT_ID so that
the proxy service can be started by a regular user.
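
Reading a port override from the environment can be sketched in C as below. This is an assumption-laden illustration rather than the provider's code: the helper name port_from_env(), the fallback argument, and the 16-bit range check are all hypothetical; only the variable name DAPL_MCM_PORT_ID comes from the commit message.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: return the port named by environment variable
 * `name`, or `fallback` when the variable is unset, empty, not a plain
 * decimal number, or outside an assumed 16-bit port range. */
static unsigned int port_from_env(const char *name, unsigned int fallback)
{
    const char *val = getenv(name);
    char *end;
    unsigned long port;

    if (val == NULL || *val == '\0')
        return fallback;

    errno = 0;
    port = strtoul(val, &end, 10);
    if (errno != 0 || *end != '\0' || port > 65535UL)
        return fallback;

    return (unsigned int)port;
}
```

A caller would pass "DAPL_MCM_PORT_ID" as the name together with whatever default port the build uses; rejecting malformed values keeps a typo in the environment from silently selecting port 0.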

11 years agoMake the default config file consistent with the initialization code.
Jianxin Xiong [Wed, 5 Sep 2012 18:10:49 +0000 (11:10 -0700)]
Make the default config file consistent with the initialization code.

11 years agoPrevent dereferencing a NULL pointer.
Jianxin Xiong [Wed, 5 Sep 2012 17:54:15 +0000 (10:54 -0700)]
Prevent dereferencing a NULL pointer.

11 years agoAdd a readme file for the MCM provider.
Jianxin Xiong [Mon, 20 Aug 2012 18:59:29 +0000 (11:59 -0700)]
Add a readme file for the MCM provider.

11 years agoAdd "buffer_segment_size" to the configuration file.
Jianxin Xiong [Mon, 20 Aug 2012 18:52:57 +0000 (11:52 -0700)]
Add "buffer_segment_size" to the configuration file.

11 years agoBug fix: scif_unregister returns -ENODEV because the length is not page aligned.
Jianxin Xiong [Sat, 18 Aug 2012 20:43:30 +0000 (13:43 -0700)]
Bug fix: scif_unregister returns -ENODEV because the length is not page aligned.

11 years agoThere is a race condition between dapl_rbuf_add() and dapl_rbuf_remove() when
Jianxin Xiong [Sat, 18 Aug 2012 05:30:06 +0000 (22:30 -0700)]
There is a race condition between dapl_rbuf_add() and dapl_rbuf_remove() when
the ring buffer is empty or full. This could lead to the wrong item being
dequeued and the correct item being discarded.

A temporary workaround is provided to ensure the pending event queue works
correctly. Without this fix, dapl_evd_dequeue() could return the wrong event
and cause various application errors (hangs, assertion failures, etc.).

Ultimately, the ring buffer mechanism needs to be fixed.
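
One common way to keep the empty and full states unambiguous is a single-producer, single-consumer ring with free-running head and tail counters. The C11 sketch below illustrates that general technique only; it is not the dapl_rbuf code and not the workaround from this commit, and the names struct rbuf, rbuf_init(), rbuf_add(), rbuf_remove(), and RBUF_CAP are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RBUF_CAP 4 /* illustrative capacity; must be a power of two */

struct rbuf {
    void *slot[RBUF_CAP];
    atomic_size_t head; /* count of items dequeued so far (consumer) */
    atomic_size_t tail; /* count of items enqueued so far (producer) */
};

static void rbuf_init(struct rbuf *rb)
{
    atomic_init(&rb->head, 0);
    atomic_init(&rb->tail, 0);
}

/* Producer side: returns 0 on success, -1 when the ring is full. */
static int rbuf_add(struct rbuf *rb, void *item)
{
    size_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&rb->head, memory_order_acquire);

    if (tail - head == RBUF_CAP)
        return -1; /* full: tail is exactly one lap ahead of head */

    rb->slot[tail % RBUF_CAP] = item;
    /* Publish the slot only after it has been written. */
    atomic_store_explicit(&rb->tail, tail + 1, memory_order_release);
    return 0;
}

/* Consumer side: returns the oldest item, or NULL when empty. */
static void *rbuf_remove(struct rbuf *rb)
{
    size_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&rb->tail, memory_order_acquire);
    void *item;

    if (head == tail)
        return NULL; /* empty: nothing published yet */

    item = rb->slot[head % RBUF_CAP];
    /* Release the slot only after it has been read. */
    atomic_store_explicit(&rb->head, head + 1, memory_order_release);
    return item;
}
```

Because the counters never wrap back to an index, head == tail can only mean empty and tail - head == RBUF_CAP can only mean full. The sketch is safe for exactly one producer thread and one consumer thread; multiple producers or consumers would need a lock or a different algorithm.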

11 years agoAdd environment variable DAPL_MCM_ALWAYS_PROXY to allow proxying non-MIC
Jianxin Xiong [Sat, 18 Aug 2012 05:13:37 +0000 (22:13 -0700)]
Add environment variable DAPL_MCM_ALWAYS_PROXY to allow proxying non-MIC
initiated connections. Good for debugging.

11 years agoHandle the (unlikely) corner case that the last sge has length 0.
Jianxin Xiong [Fri, 17 Aug 2012 18:44:27 +0000 (11:44 -0700)]
Handle the (unlikely) corner case that the last sge has length 0.

11 years agoMinor code cleanup.
Jianxin Xiong [Fri, 17 Aug 2012 18:35:25 +0000 (11:35 -0700)]
Minor code cleanup.

11 years agoRecount the actual pending WRs from time to time.
Jianxin Xiong [Fri, 17 Aug 2012 06:27:34 +0000 (23:27 -0700)]
Recount the actual pending WRs from time to time.

11 years agoDon't waste the 64 bytes in the proxy buffer between segments.
Jianxin Xiong [Fri, 17 Aug 2012 05:59:20 +0000 (22:59 -0700)]
Don't waste the 64 bytes in the proxy buffer between segments.

11 years agoBug fix: the calculation of 'total_offset' could be wrong if a segmented sge is not...
Jianxin Xiong [Thu, 16 Aug 2012 19:34:30 +0000 (12:34 -0700)]
Bug fix: the calculation of 'total_offset' could be wrong if a segmented sge is not the last one.

11 years agoPrevent poll_cnt from becoming negative, which could lead to huge delay.
Jianxin Xiong [Thu, 16 Aug 2012 18:31:22 +0000 (11:31 -0700)]
Prevent poll_cnt from becoming negative, which could lead to huge delay.

11 years agoDon't flood the log with messages about deferred WR posting.
Jianxin Xiong [Thu, 16 Aug 2012 18:28:42 +0000 (11:28 -0700)]
Don't flood the log with messages about deferred WR posting.

11 years agoProxy large writes in segments.
Jianxin Xiong [Wed, 15 Aug 2012 05:32:09 +0000 (22:32 -0700)]
Proxy large writes in segments.

11 years agoReplace scif_writeto with scif_readfrom, preparing for segmentation support.
Jianxin Xiong [Tue, 14 Aug 2012 20:00:58 +0000 (13:00 -0700)]
Replace scif_writeto with scif_readfrom, preparing for segmentation support.

11 years agoMake both address and size 64 bytes aligned when calling scif_writeto().
Jianxin Xiong [Tue, 14 Aug 2012 05:23:41 +0000 (22:23 -0700)]
Make both address and size 64 bytes aligned when calling scif_writeto().
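
Aligning both the address and the size to 64 bytes usually means rounding the start down and the end up so the aligned range still covers the original one. The C sketch below shows that arithmetic under stated assumptions: the struct span type and the align_span_64() helper are hypothetical, and the commit itself may compute the alignment differently.

```c
#include <stdint.h>

#define ALIGN_BYTES 64u /* alignment named in the commit message */

struct span {
    uint64_t addr; /* aligned start address or offset */
    uint64_t len;  /* aligned length in bytes */
};

/* Hypothetical helper: widen [addr, addr + len) to 64-byte boundaries. */
static struct span align_span_64(uint64_t addr, uint64_t len)
{
    const uint64_t mask = (uint64_t)ALIGN_BYTES - 1;
    struct span out;
    uint64_t end;

    out.addr = addr & ~mask;          /* round the start down */
    end = (addr + len + mask) & ~mask; /* round the end up */
    out.len = end - out.addr;
    return out;
}
```

For example, a 10-byte range starting at offset 100 widens to the 64-byte range starting at offset 64, while a range that is already aligned is returned unchanged.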

11 years agoPort numbers need to be unique within each physical IB device (md),
Jianxin Xiong [Fri, 10 Aug 2012 00:55:04 +0000 (17:55 -0700)]
Port numbers need to be unique within each physical IB device (md),
otherwise connection requests between multiple clients of the same
proxy server can be matched incorrectly.

11 years agoCorrect many mismatches between the log messages and the actual operations.
Jianxin Xiong [Tue, 7 Aug 2012 15:53:18 +0000 (08:53 -0700)]
Correct many mismatches between the log messages and the actual operations.

11 years agoCorrect the MR size calculation when registering with SCIF.
Jianxin Xiong [Tue, 7 Aug 2012 05:53:19 +0000 (22:53 -0700)]
Correct the MR size calculation when registering with SCIF.

11 years agoMPXYD: only post an incoming write WR when there is no pending WR on the QP,
Jianxin Xiong [Sat, 4 Aug 2012 00:32:31 +0000 (17:32 -0700)]
MPXYD: only post an incoming write WR when there is no pending WR on the QP,
otherwise there might be ordering issues.

11 years agoFix the (nil) "wr_id" output of posted pending work request.
Jianxin Xiong [Sat, 4 Aug 2012 00:25:44 +0000 (17:25 -0700)]
Fix the (nil) "wr_id" output of posted pending work request.

11 years agoBug fix: connection error with "unexpected DAPL event 0x4003" when more than one...
Jianxin Xiong [Fri, 3 Aug 2012 03:30:34 +0000 (20:30 -0700)]
Bug fix: connection error with "unexpected DAPL event 0x4003" when more than one client exists.

11 years agoCorrect the range of valid ops in mix_op_str().
Jianxin Xiong [Thu, 2 Aug 2012 21:38:43 +0000 (14:38 -0700)]
Correct the range of valid ops in mix_op_str().

11 years agoFix connection setup deadlock with more than one client connecting to MPXYD.
Jianxin Xiong [Thu, 2 Aug 2012 18:45:50 +0000 (11:45 -0700)]
Fix connection setup deadlock with more than one client connecting to MPXYD.

11 years agoturn down debug level on mix_write
Arlin Davis [Thu, 26 Jul 2012 05:43:08 +0000 (22:43 -0700)]
turn down debug level on mix_write

11 years agoAdd wr_end field for entry count and hd/tl adjustments
Arlin Davis [Wed, 25 Jul 2012 06:24:59 +0000 (23:24 -0700)]
Add wr_end field for entry count and hd/tl adjustments

MPI pingpong now works, but it still does not hold up well under stress
testing. Some hang conditions need to be debugged.

11 years agodtest update for unaligned/aligned data option, unaligned by default
sean.hefty@intel.com [Mon, 23 Jul 2012 22:20:08 +0000 (15:20 -0700)]
dtest update for unaligned/aligned data option, unaligned by default

set -a for page alignment.

return EINVAL for anything other than sends or writes.

11 years agordma write working both directions. MIC to XEON, XEON to MIC.
Arlin Davis [Mon, 23 Jul 2012 06:42:16 +0000 (23:42 -0700)]
rdma write working both directions. MIC to XEON, XEON to MIC.

updated memory registration to adjust for SCIF page alignment

12 years agoadd progress thread processing of delayed WR's
Arlin Davis [Tue, 17 Jul 2012 17:44:19 +0000 (10:44 -0700)]
add progress thread processing of delayed WR's

12 years agordma writes working
Arlin Davis [Tue, 17 Jul 2012 05:04:18 +0000 (22:04 -0700)]
rdma writes working

12 years agordma write support coded, compiled not tested
sean.hefty@intel.com [Sun, 15 Jul 2012 07:11:56 +0000 (00:11 -0700)]
rdma write support coded, compiled not tested

12 years agoSends/receives working, Xeon to MIC
Arlin Davis [Wed, 11 Jul 2012 16:33:19 +0000 (09:33 -0700)]
Sends/receives working, Xeon to MIC

12 years agoupdate debug message
Arlin Davis [Thu, 5 Jul 2012 16:05:36 +0000 (09:05 -0700)]
update debug message

12 years agoMCM development
Arlin Davis [Thu, 5 Jul 2012 16:01:47 +0000 (09:01 -0700)]
MCM development

12 years agoupdated for QP, CM services
Arlin Davis [Tue, 26 Jun 2012 18:35:53 +0000 (11:35 -0700)]
updated for QP, CM services

12 years agoadd mcm
Arlin Davis [Wed, 13 Jun 2012 01:33:45 +0000 (18:33 -0700)]
add mcm

12 years agoupdate makefile, dat.conf
Arlin Davis [Wed, 13 Jun 2012 01:33:14 +0000 (18:33 -0700)]
update makefile, dat.conf

12 years agoadd extension.h update mpxyd.c
Arlin Davis [Wed, 6 Jun 2012 19:19:19 +0000 (12:19 -0700)]
add extension.h update mpxyd.c

12 years agoadd mcm.c and mix.c, MIC indirect CM code and MIC indirect eXchange messager
Arlin Davis [Mon, 4 Jun 2012 23:00:41 +0000 (16:00 -0700)]
add mcm.c and mix.c, MIC indirect CM code and MIC indirect eXchange messager

12 years agoupdate for chkconfig
Arlin Davis [Thu, 31 May 2012 22:56:20 +0000 (15:56 -0700)]
update for chkconfig

12 years agoupdate makefile
Arlin Davis [Thu, 31 May 2012 22:11:07 +0000 (15:11 -0700)]
update makefile

12 years agoupdate spec file
Arlin Davis [Thu, 31 May 2012 21:47:00 +0000 (14:47 -0700)]
update spec file

12 years agomakefile update
Arlin Davis [Thu, 31 May 2012 21:44:16 +0000 (14:44 -0700)]
makefile update

12 years agoupdate Makefile move daplmpxy to mpxyd.init.in
Arlin Davis [Thu, 31 May 2012 21:41:03 +0000 (14:41 -0700)]
update Makefile move daplmpxy to mpxyd.init.in

12 years agoupdate makefile
Arlin Davis [Thu, 31 May 2012 21:38:40 +0000 (14:38 -0700)]
update makefile

12 years agoadd mpxyd.conf
Arlin Davis [Tue, 22 May 2012 18:55:06 +0000 (11:55 -0700)]
add mpxyd.conf

12 years agoupdate makefile for mpxyd.conf
Arlin Davis [Tue, 22 May 2012 18:31:58 +0000 (11:31 -0700)]
update makefile for mpxyd.conf

12 years agoupdate makefile
Arlin Davis [Tue, 22 May 2012 17:47:31 +0000 (10:47 -0700)]
update makefile

12 years agoupdate makefile
Arlin Davis [Tue, 22 May 2012 17:44:12 +0000 (10:44 -0700)]
update makefile

12 years agoadd mpxyd.c
Arlin Davis [Mon, 21 May 2012 22:40:30 +0000 (15:40 -0700)]
add mpxyd.c

12 years agofix makefile
Arlin Davis [Mon, 21 May 2012 22:37:12 +0000 (15:37 -0700)]
fix makefile