git.openfabrics.org - ~ardavis/dapl.git/log
Arlin Davis [Sat, 2 Feb 2013 01:33:17 +0000 (17:33 -0800)]
mpxyd: cm scaling bug fixes and profiling

New CM thread to help with CM scale-out. Tested with dtestcm using
thousands of connections and with MPI at up to 60 ppn on KNC nodes.

Add new disconnect timers and disconnect logging for debug.
Add cleanup for IB device during service termination.
Add profiling of device and CM operations to help debug scaling issues.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Sat, 2 Feb 2013 01:07:29 +0000 (17:07 -0800)]
README: update package and mcm readme files

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Sat, 2 Feb 2013 00:54:58 +0000 (16:54 -0800)]
dtestcm: add more detailed debug during disconnect phase

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Sat, 2 Feb 2013 00:53:27 +0000 (16:53 -0800)]
common: add debug logging to dat_lmr_create

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Sat, 2 Feb 2013 00:49:57 +0000 (16:49 -0800)]
common: add ia stats logging and logging bit

Added a new DAPL_DBG_TYPE bit, 0x2000000, that prints non-zero
IA stats during dat_ia_close.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Sat, 2 Feb 2013 00:48:23 +0000 (16:48 -0800)]
scif: add scif ucm provider entry in dat.conf

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 21 Jan 2013 22:56:44 +0000 (14:56 -0800)]
Release intel-mic-ofed-dapl-2.0.36.4-1

Arlin Davis [Mon, 21 Jan 2013 22:55:17 +0000 (14:55 -0800)]
README: add environment variable settings/defaults

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 21 Jan 2013 20:51:42 +0000 (12:51 -0800)]
mpxyd: TX thread can miss pending requests with multiple clients

The pending data variable is overwritten when multiple SCIF clients are
bound to one HCA, causing rdma_write requests to stall and never be posted
on the IB device. MPI running multiple ranks on a KNC can stall.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Wed, 16 Jan 2013 22:26:32 +0000 (14:26 -0800)]
Release intel-mic-ofed-dapl-2.0.36.3-1

Arlin Davis [Wed, 16 Jan 2013 22:11:09 +0000 (14:11 -0800)]
package: add README and README.mcm and update content

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Wed, 16 Jan 2013 21:39:59 +0000 (13:39 -0800)]
mcm,mpxyd: add multi-endpoint, multi-threaded, and CPU affinity support for mpxyd and mcm clients

For performance reasons, separate EPs and separate threads have been incorporated.
3 SCIF EPs (operation, events, and transmit) are created for every device open.
2 threads per MIC adapter, one for operations and one for RDMA operations.
CPU affinity support has been added to assist in HCA to MIC locality
for optimum performance. This fixes some performance issues seen at scale
on HT systems.

Also added some performance profiling to help with future tuning on
various platforms.

The CPU affinity and profiling are set via new mpxyd.conf parameters.
Defaults are affinity=1, affinity base cpu_id=0, profiling=0:

mcm_affinity, mcm_affinity_base, mcm_profile

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Wed, 16 Jan 2013 21:38:18 +0000 (13:38 -0800)]
common: add support for ia name during dat_ia_query

the device name was not being updated during a query. Copy
the hca name into ia_attr->adapter_name for consumers.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Wed, 16 Jan 2013 21:36:18 +0000 (13:36 -0800)]
dtest: changes to support signaling rates during rdma_write testing

To support larger iterations without huge TX queues we need
to signal. Also add unidirectional and bidirectional performance
results.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
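
The signaling idea in the dtest commit above can be sketched as follows. This is a minimal illustration and not the dtest code: `should_signal`, `count_signaled`, and `signal_rate` are hypothetical names, and the rule used here (request a completion on every Nth write and on the final one) is an assumption about how a signaling rate is commonly applied.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether the i-th RDMA write (0-based) out of
 * `total` should request a completion, given a signaling rate.
 * Unsignaled writes produce no completion, so their send queue entries are
 * only reclaimed once a later signaled write completes. The queue then needs
 * room for roughly `signal_rate` outstanding writes rather than `total`.
 */
bool should_signal(unsigned i, unsigned total, unsigned signal_rate)
{
    assert(signal_rate > 0 && i < total);
    if (i + 1 == total)               /* always signal the last write */
        return true;
    return (i + 1) % signal_rate == 0;
}

/* Count how many completions a run of `total` writes would generate. */
unsigned count_signaled(unsigned total, unsigned signal_rate)
{
    unsigned n = 0;

    for (unsigned i = 0; i < total; i++)
        if (should_signal(i, total, signal_rate))
            n++;
    return n;
}
```

With this rule, a larger iteration count only increases the number of completions to reap, not the required TX queue depth.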
Arlin Davis [Wed, 16 Jan 2013 21:32:23 +0000 (13:32 -0800)]
mcm: register memory on tx endpoint.

Processing is now separated across multiple EPs.
Operations and SCIF DMA run on different EPs, so
register memory on the tx_ep for proxy-enabled providers.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Wed, 16 Jan 2013 21:30:28 +0000 (13:30 -0800)]
mcm: fix proxy cq_poll, return only empty or number of completions

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 10 Jan 2013 17:36:55 +0000 (09:36 -0800)]
ucm: reduce log level of CM warning on UNKNOWN state message

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
sean.hefty@intel.com [Mon, 10 Dec 2012 16:56:40 +0000 (08:56 -0800)]
mpxyd: proxy segfaulting running as daemon, IB recv len incorrect

The receive path didn't account for the IB UD packet length being
larger than the actual CM message due to the 40-byte GRH added
to each received packet. Adjust each recv pkt
len according to the size of the GRH.

Change default log to /tmp from stdout
Cleanup debug logs

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
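
The length adjustment described in the commit above can be sketched like this. It is a minimal, self-contained illustration, not the mpxyd code: `ud_payload` and `GRH_LEN` are hypothetical names, and the sketch assumes the reported receive length covers a 40-byte GRH area at the start of the receive buffer followed by the CM message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRH_LEN 40u   /* size of the InfiniBand Global Route Header */

/*
 * Hypothetical helper: given the start of a posted UD receive buffer and the
 * byte count reported for the receive (GRH area + payload), return a pointer
 * to the CM message and store its real length in *msg_len.
 * Returns NULL if the reported length is too short to contain a GRH.
 */
const uint8_t *ud_payload(const uint8_t *recv_buf, uint32_t byte_len,
                          uint32_t *msg_len)
{
    assert(recv_buf != NULL && msg_len != NULL);
    if (byte_len < GRH_LEN)
        return NULL;
    *msg_len = byte_len - GRH_LEN;   /* strip the GRH from the length */
    return recv_buf + GRH_LEN;       /* payload follows the GRH area */
}
```

Using the unadjusted length as the message size would read 40 bytes past the end of the CM message, which is the kind of overrun that can surface as a segfault.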
Arlin Davis [Thu, 6 Dec 2012 22:30:43 +0000 (14:30 -0800)]
Release intel-mic-ofed-dapl-2.0.36.2-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 6 Dec 2012 20:28:30 +0000 (12:28 -0800)]
decrease debug log level on modify qp during close

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 6 Dec 2012 20:27:22 +0000 (12:27 -0800)]
mpxyd: scif_connect return checking incorrect

check for -1 on errors, port_id returned on success.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 6 Dec 2012 01:27:17 +0000 (17:27 -0800)]
mcm: support for rdma writes with immediate plus associated fixes

Segmented writes fixed to return proper length in work completion
Mpxyd segmentation size sync'ed with configuration file
Reduced proxy WR depth for segmentation, limit to x8.
Add debug info to help profile performance stalls

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Tue, 4 Dec 2012 17:52:22 +0000 (09:52 -0800)]
mpxyd: add common scif_send_msg inline function to handle partial writes

scif_send in blocking mode will not always block until the
entire message is sent as documented. It will sometimes
return with partial sends. Create a common inline send
function that will handle this condition and only return failure
on errors and not on partial writes.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
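
The partial-send handling described above follows a common "send until done" pattern, sketched below. This is not the scif_send_msg implementation: to stay self-contained it does not call SCIF at all, and `send_fn`, `send_all`, `fake_link`, and `fake_send` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a blocking send call: returns bytes sent, or -1 on error.
 * This typedef is an assumption for the sketch, not the SCIF API. */
typedef int (*send_fn)(void *ctx, const void *buf, int len);

/*
 * Keep calling `send` until all `len` bytes are out.
 * Returns 0 when the whole message was sent and -1 on failure, so callers
 * never have to deal with a partial send themselves.
 */
int send_all(send_fn send, void *ctx, const void *msg, int len)
{
    const char *p = msg;
    int off = 0;

    assert(send != NULL && len >= 0);
    while (off < len) {
        int ret = send(ctx, p + off, len - off);
        if (ret <= 0)
            return -1;   /* error, or no progress: give up */
        off += ret;      /* partial send: advance and retry */
    }
    return 0;
}

/* Fake transport for demonstration: accepts at most 3 bytes per call and
 * records how many bytes and calls it has seen. */
struct fake_link {
    int bytes;
    int calls;
};

int fake_send(void *ctx, const void *buf, int len)
{
    struct fake_link *l = ctx;
    int n = len < 3 ? len : 3;

    (void)buf;
    l->bytes += n;
    l->calls++;
    return n;
}
```

Sending a 10-byte message through `fake_send` takes four calls (3 + 3 + 3 + 1 bytes), but `send_all` reports a single success to its caller.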
Arlin Davis [Tue, 4 Dec 2012 17:50:30 +0000 (09:50 -0800)]
mcm/mpxyd: add string print helper function for QP states

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Tue, 4 Dec 2012 17:46:31 +0000 (09:46 -0800)]
config: add scif device to dat.conf static configuration, remove v1.2 devices

v1.2 is no longer supported.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 29 Nov 2012 23:09:27 +0000 (15:09 -0800)]
specfile: add definition to disable debug package

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 29 Nov 2012 22:15:54 +0000 (14:15 -0800)]
specfile: remove debug package

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 29 Nov 2012 22:14:46 +0000 (14:14 -0800)]
mpxyd: add exposed cm timers to configuration file

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 26 Nov 2012 18:56:46 +0000 (10:56 -0800)]
Release intel-mic-ofed-dapl-2.0.36.1-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 26 Nov 2012 18:29:40 +0000 (10:29 -0800)]
add new package spec file - intel-mic-ofed-dapl.spec.in

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 26 Nov 2012 18:23:29 +0000 (10:23 -0800)]
package update: rename package for Intel MPSS release

dapl- renamed to intel-mic-ofed-dapl-

new package definitions will obsolete the previous dapl version

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Thu, 15 Nov 2012 22:14:14 +0000 (14:14 -0800)]
Release 2.0.36.1

Change version to include sub-minor numbers. Add options
to specfile to include CCFLAGS and LDFLAG options via rpmbuild.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Patrick Mccormick [Thu, 15 Nov 2012 02:49:16 +0000 (18:49 -0800)]
install: correct the path to /etc

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Patrick Mccormick [Thu, 15 Nov 2012 02:46:07 +0000 (18:46 -0800)]
mpxyd: don't segfault if pid file is not found when attempting to kill running daemon

Patrick Mccormick [Thu, 15 Nov 2012 02:44:34 +0000 (18:44 -0800)]
mpxyd: misc cleanups for consistency

Arlin Davis [Thu, 15 Nov 2012 02:36:56 +0000 (18:36 -0800)]
mpxyd: expose CM request and reply timers and retry count

add entries in the mpxyd.conf for timers and retry. Start with
larger default timers given small cores are processing messages
and they are proxied via SCIF.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 29 Oct 2012 18:54:46 +0000 (11:54 -0700)]
common: 2 new debug logging levels for low system memory and package info

DAPL_DBG_TYPE_SYS_WARN = 0x800000
DAPL_DBG_TYPE_VER = 0x1000000

export DAPL_DBG_SYS_MEM=5 sets the low-memory check threshold to 5%
when DAPL_DBG_TYPE is set with bit DAPL_DBG_TYPE_SYS_WARN.

The package must be built with --enable-counters for memory checking to
be enabled.

In addition, if DAPL_DBG_TYPE is set with bit DAPL_DBG_TYPE_VER, then
the package rev and build date will be sent to stdout during library
init.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
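
The debug-level bits named in the commit above combine as a bitmask. The sketch below shows one way such a mask could be parsed and tested. The two bit values are taken from the commit message; `parse_dbg_mask` and `dbg_enabled` are hypothetical helpers, and parsing with `strtoul` is an assumption, not a description of how the library reads DAPL_DBG_TYPE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Bit values quoted in the commit message above. */
#define DAPL_DBG_TYPE_SYS_WARN 0x800000UL
#define DAPL_DBG_TYPE_VER      0x1000000UL

/* Parse a DAPL_DBG_TYPE-style string ("0x1800000", "25165824", ...) into a
 * bitmask. NULL or an unparsable string yields 0 (nothing enabled). */
unsigned long parse_dbg_mask(const char *value)
{
    if (value == NULL)
        return 0;
    return strtoul(value, NULL, 0);   /* base 0: accepts hex or decimal */
}

/* True when every bit in `bits` is set in `mask`. */
bool dbg_enabled(unsigned long mask, unsigned long bits)
{
    assert(bits != 0);
    return (mask & bits) == bits;
}
```

A caller could pass `getenv("DAPL_DBG_TYPE")` to `parse_dbg_mask`; a value of 0x1800000 enables both the low-memory warning and the version banner.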
Arlin Davis [Mon, 29 Oct 2012 18:50:04 +0000 (11:50 -0700)]
mcm: process QP create errors from mpxyd

add checking and reporting for QP errors back to mix client
on the MIC adapter

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 29 Oct 2012 18:48:49 +0000 (11:48 -0700)]
common: add register memory to logging on non-debug build.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Mon, 29 Oct 2012 18:36:49 +0000 (11:36 -0700)]
common: turn down debug level during ep free

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Fri, 19 Oct 2012 16:57:45 +0000 (09:57 -0700)]
mcm: cleanup build issues with unused variables on --disable-debug

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Fri, 19 Oct 2012 16:56:05 +0000 (09:56 -0700)]
common: cleanup/remove debug code in post send

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Patrick McCormick [Mon, 15 Oct 2012 18:51:32 +0000 (11:51 -0700)]
Add some rpm targets

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Patrick McCormick [Mon, 15 Oct 2012 18:51:08 +0000 (11:51 -0700)]
Enable mcm on the proxy branch

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Patrick McCormick [Mon, 15 Oct 2012 18:50:33 +0000 (11:50 -0700)]
Correct mpxyd.conf file location

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Arlin Davis [Wed, 3 Oct 2012 22:19:24 +0000 (15:19 -0700)]
Merge cst-linux:~jxiong/scm/dapl into proxy

Jianxin Xiong [Tue, 2 Oct 2012 17:16:04 +0000 (10:16 -0700)]
Kill before daemonizing

From: Patrick McCormick <patrick.m.mccormick@intel.com>

Jianxin Xiong [Tue, 2 Oct 2012 17:11:10 +0000 (10:11 -0700)]
Use libc's daemon function and check if it fails

From: Patrick McCormick <patrick.m.mccormick@intel.com>

Jianxin Xiong [Mon, 1 Oct 2012 17:49:44 +0000 (10:49 -0700)]
If scif port 0 is specified (let system select the port), print out the port chosen

From: Patrick McCormick <patrick.m.mccormick@intel.com>

Jianxin Xiong [Mon, 1 Oct 2012 17:40:01 +0000 (10:40 -0700)]
Add the "debug mode" which manages the pid file (/tmp/mpxyd.pid.<uid>), defaults
to not daemonizing, and writes the log to stdout.

From: pmmccorm <patrick.m.mccormick@intel.com>

Jianxin Xiong [Fri, 28 Sep 2012 22:34:38 +0000 (15:34 -0700)]
Allow the init script to be run as a regular user.

From: pmmccorm <patrick.m.mccormick@intel.com>

Jianxin Xiong [Fri, 28 Sep 2012 20:10:39 +0000 (13:10 -0700)]
Convert mpxyd.conf from DOS format to UNIX format.

Jianxin Xiong [Fri, 28 Sep 2012 05:05:29 +0000 (22:05 -0700)]
Add an option to kill the running daemon.

Jianxin Xiong [Fri, 28 Sep 2012 03:28:42 +0000 (20:28 -0700)]
Catch SIGINT/SIGTERM and cleanup the lock file.
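
A common shape for this kind of signal-driven cleanup is sketched below. It is an illustration under stated assumptions, not the mpxyd code: `stop_requested`, `on_stop_signal`, `install_stop_handlers`, and `cleanup_lock_file` are hypothetical names, and deferring the file removal to the main loop rather than doing it in the handler is a design choice made for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Set by the handler; polled by the main loop. */
volatile sig_atomic_t stop_requested = 0;

void on_stop_signal(int signo)
{
    (void)signo;
    stop_requested = 1;   /* keep the handler minimal */
}

/* Install the handler for SIGINT and SIGTERM. Returns 0 on success. */
int install_stop_handlers(void)
{
    struct sigaction sa;

    sa.sa_handler = on_stop_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGTERM, &sa, NULL);
}

/* Called from the main loop once stop_requested is set.
 * `lock_path` is whatever lock file the daemon created at startup. */
int cleanup_lock_file(const char *lock_path)
{
    assert(lock_path != NULL);
    return unlink(lock_path);
}
```

The main loop would check `stop_requested` on each iteration, call `cleanup_lock_file` with its lock file path, and then exit normally.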

Jianxin Xiong [Fri, 28 Sep 2012 03:00:07 +0000 (20:00 -0700)]
Change some default option values.

Jianxin Xiong [Fri, 28 Sep 2012 02:55:34 +0000 (19:55 -0700)]
Output configuration file name in the log.

Jianxin Xiong [Fri, 28 Sep 2012 01:44:13 +0000 (18:44 -0700)]
Output more error messages.

Jianxin Xiong [Thu, 27 Sep 2012 17:26:42 +0000 (10:26 -0700)]
Init script: get the lock file name from the configuration file.

From: pmmccorm <patrick.m.mccormick@intel.com>

Jianxin Xiong [Thu, 27 Sep 2012 16:49:05 +0000 (09:49 -0700)]
Give an error message when failing to open the lock file.

Jianxin Xiong [Thu, 27 Sep 2012 16:27:28 +0000 (09:27 -0700)]
Add a configuration option for the maximum message size.

Jianxin Xiong [Wed, 26 Sep 2012 21:09:58 +0000 (14:09 -0700)]
Default to /var/run for .pid files

From: pmmccorm <patrick.m.mccormick@intel.com>

Jianxin Xiong [Wed, 26 Sep 2012 21:03:58 +0000 (14:03 -0700)]
Orchestrate the handling of pid and lock files

From: pmmccorm <patrick.m.mccormick@intel.com>

Jianxin Xiong [Tue, 25 Sep 2012 22:36:43 +0000 (15:36 -0700)]
Increase the proxy send queue size according to the segmentation setting.

Jianxin Xiong [Fri, 21 Sep 2012 04:13:16 +0000 (21:13 -0700)]
Add an option to allow the proxy buffer to be shared among connections from
the same client. This can greatly reduce the memory consumption of
the proxy server when the number of connections is large.

Jianxin Xiong [Thu, 20 Sep 2012 20:45:21 +0000 (13:45 -0700)]
Revamp the instructions for the MCM provider.

Jianxin Xiong [Thu, 20 Sep 2012 02:53:40 +0000 (19:53 -0700)]
Change the default per connection proxy RDMA buffer from 64MB to 32MB for
better scalability. It can still be changed in mpxyd.conf.

Jianxin Xiong [Wed, 19 Sep 2012 17:07:00 +0000 (10:07 -0700)]
Bug fix: lock/log file path is truncated to 32 characters.

Jianxin Xiong [Tue, 18 Sep 2012 01:11:44 +0000 (18:11 -0700)]
Increase the default queue length for scif_listen() call to 64, and
add an option to the configuration file to override this setting.

Jianxin Xiong [Sat, 15 Sep 2012 00:50:42 +0000 (17:50 -0700)]
Update the non-root setup instructions.

Jianxin Xiong [Fri, 14 Sep 2012 21:03:22 +0000 (14:03 -0700)]
Change "MIC Indirect" to "CCL-proxy" in the readme files.

Jianxin Xiong [Fri, 14 Sep 2012 06:22:58 +0000 (23:22 -0700)]
Bug fix: MPI connection failure with 64+ ranks due to REJ packet not being delivered to the client.

Jianxin Xiong [Wed, 12 Sep 2012 22:18:23 +0000 (15:18 -0700)]
Catch proxy server segfault when scaling up.

Jianxin Xiong [Wed, 12 Sep 2012 20:53:31 +0000 (13:53 -0700)]
Improved error messages with SCIF related calls.

Jianxin Xiong [Mon, 10 Sep 2012 18:30:52 +0000 (11:30 -0700)]
Add instructions for installation without root privilege.

Jianxin Xiong [Thu, 6 Sep 2012 17:21:47 +0000 (10:21 -0700)]
Add instructions for cluster deployment.

Jianxin Xiong [Wed, 5 Sep 2012 18:15:57 +0000 (11:15 -0700)]
Allow changing the SCIF port id via the envar DAPL_MCM_PORT_ID so that
the proxy service can be started by a regular user.

Jianxin Xiong [Wed, 5 Sep 2012 18:10:49 +0000 (11:10 -0700)]
Make the default config file consistent with the initialization code.

Jianxin Xiong [Wed, 5 Sep 2012 17:54:15 +0000 (10:54 -0700)]
Prevent dereferencing a NULL pointer.

Jianxin Xiong [Mon, 20 Aug 2012 18:59:29 +0000 (11:59 -0700)]
Add a readme file for the MCM provider.

Jianxin Xiong [Mon, 20 Aug 2012 18:52:57 +0000 (11:52 -0700)]
Add "buffer_segment_size" to the configuration file.

Jianxin Xiong [Sat, 18 Aug 2012 20:43:30 +0000 (13:43 -0700)]
Bug fix: scif_unregister returns -ENODEV because the length is not page aligned.

Jianxin Xiong [Sat, 18 Aug 2012 05:30:06 +0000 (22:30 -0700)]
There is a race condition between dapl_rbuf_add() and dapl_rbuf_remove() when
the ring buffer is empty or full. This could lead to an incorrect item being
dequeued and the correct item being discarded.

A temporary workaround is provided to ensure the pending event queue works
correctly. Without this fix, dapl_evd_dequeue() could return the wrong event
and cause various application errors (hangs, assertion failures, etc.).

Ultimately the ring buffer mechanism needs to be fixed.
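
The commit message does not describe the workaround itself. As a point of comparison only, the sketch below shows a deliberately simple ring buffer that avoids the empty/full ambiguity by keeping an explicit item count and serializing add and remove with a mutex. It swaps in a lock where DAPL's ring buffer may well be lock-free, and `struct ring`, `ring_init`, `ring_add`, and `ring_remove` are hypothetical names, not DAPL functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define RING_SLOTS 4   /* small capacity, for illustration only */

struct ring {
    void *slot[RING_SLOTS];
    size_t head;            /* next slot to remove from */
    size_t count;           /* number of queued items */
    pthread_mutex_t lock;   /* serializes add and remove */
};

void ring_init(struct ring *r)
{
    assert(r != NULL);
    r->head = 0;
    r->count = 0;
    pthread_mutex_init(&r->lock, NULL);
}

/* Returns 0 on success, -1 if the ring is full. */
int ring_add(struct ring *r, void *item)
{
    int ret = -1;

    pthread_mutex_lock(&r->lock);
    if (r->count < RING_SLOTS) {
        r->slot[(r->head + r->count) % RING_SLOTS] = item;
        r->count++;
        ret = 0;
    }
    pthread_mutex_unlock(&r->lock);
    return ret;
}

/* Returns the oldest item, or NULL if the ring is empty. */
void *ring_remove(struct ring *r)
{
    void *item = NULL;

    pthread_mutex_lock(&r->lock);
    if (r->count > 0) {
        item = r->slot[r->head];
        r->head = (r->head + 1) % RING_SLOTS;
        r->count--;
    }
    pthread_mutex_unlock(&r->lock);
    return item;
}
```

Because the empty and full checks happen under the same lock as the index updates, a remove can never observe a half-completed add, at the cost of lock contention on a hot path.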

Jianxin Xiong [Sat, 18 Aug 2012 05:13:37 +0000 (22:13 -0700)]
Add environment variable DAPL_MCM_ALWAYS_PROXY to allow proxying non-MIC
initiated connections. Good for debugging.

Jianxin Xiong [Fri, 17 Aug 2012 18:44:27 +0000 (11:44 -0700)]
Handle the (unlikely) corner case that the last sge has length 0.

Jianxin Xiong [Fri, 17 Aug 2012 18:35:25 +0000 (11:35 -0700)]
Minor code cleanup.

Jianxin Xiong [Fri, 17 Aug 2012 06:27:34 +0000 (23:27 -0700)]
Recount the actual pending WRs from time to time.

Jianxin Xiong [Fri, 17 Aug 2012 05:59:20 +0000 (22:59 -0700)]
Don't waste the 64 bytes in the proxy buffer between segments.

Jianxin Xiong [Thu, 16 Aug 2012 19:34:30 +0000 (12:34 -0700)]
Bug fix: the calculation of 'total_offset' could be wrong if a segmented sge is not the last one.

Jianxin Xiong [Thu, 16 Aug 2012 18:31:22 +0000 (11:31 -0700)]
Prevent poll_cnt from becoming negative, which could lead to huge delay.

Jianxin Xiong [Thu, 16 Aug 2012 18:28:42 +0000 (11:28 -0700)]
Don't flood the log with messages about deferred WR posting.

Jianxin Xiong [Wed, 15 Aug 2012 05:32:09 +0000 (22:32 -0700)]
Proxy large writes in segments.
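
Segmenting a large write amounts to walking the transfer in fixed-size pieces with a shorter final piece. The sketch below shows that arithmetic only. It is not the mpxyd implementation: `struct segment` and `split_write` are hypothetical names, and `seg_size` stands in for whatever segment size the proxy is configured with.

```c
#include <assert.h>
#include <stddef.h>

/* One piece of a larger write. */
struct segment {
    size_t offset;   /* byte offset from the start of the write */
    size_t len;      /* bytes in this piece */
};

/*
 * Fill `out` with up to `max_out` segments covering `total` bytes in pieces
 * of at most `seg_size` bytes. Returns the number of segments needed, which
 * may exceed `max_out` (only the first `max_out` are stored).
 */
size_t split_write(size_t total, size_t seg_size,
                   struct segment *out, size_t max_out)
{
    size_t n = 0;
    size_t off = 0;

    assert(seg_size > 0);
    while (off < total) {
        size_t left = total - off;
        size_t len = left < seg_size ? left : seg_size;

        if (n < max_out) {
            out[n].offset = off;
            out[n].len = len;
        }
        off += len;
        n++;
    }
    return n;
}
```

Each segment can then be staged and posted independently, which bounds the proxy buffer space one write needs to roughly one segment at a time.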

Jianxin Xiong [Tue, 14 Aug 2012 20:00:58 +0000 (13:00 -0700)]
Replace scif_writeto with scif_readfrom, preparing for segmentation support.

Jianxin Xiong [Tue, 14 Aug 2012 05:23:41 +0000 (22:23 -0700)]
Make both address and size 64-byte aligned when calling scif_writeto().
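
One way to satisfy a 64-byte alignment requirement on both address and size is to widen the transfer range, as sketched below. This illustrates the arithmetic only and is an assumption about the approach, since the commit message does not say how the alignment is achieved. `DMA_ALIGN`, `struct aligned_range`, and `align_range` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_ALIGN 64u   /* alignment named in the commit message */

struct aligned_range {
    uint64_t start;   /* aligned-down start offset */
    uint64_t len;     /* aligned-up length covering the original range */
};

/* Widen [offset, offset + len) so that both its start and its length are
 * multiples of DMA_ALIGN. */
struct aligned_range align_range(uint64_t offset, uint64_t len)
{
    struct aligned_range r;
    uint64_t mask = (uint64_t)DMA_ALIGN - 1;
    uint64_t end;

    assert(offset <= UINT64_MAX - len);          /* no wrap-around */
    assert(offset + len <= UINT64_MAX - mask);   /* room to round up */
    end = (offset + len + mask) & ~mask;         /* round end up */
    r.start = offset & ~mask;                    /* round start down */
    r.len = end - r.start;
    return r;
}
```

Widening the range means bytes outside the caller's original range are also transferred, so both the source and destination buffers must be able to accommodate the padding.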

Jianxin Xiong [Fri, 10 Aug 2012 00:55:04 +0000 (17:55 -0700)]
Port numbers need to be unique within each physical IB device (md),
otherwise connection requests between multiple clients of the same
proxy server can be matched incorrectly.

Jianxin Xiong [Tue, 7 Aug 2012 15:53:18 +0000 (08:53 -0700)]
Correct many mismatches between the log messages and the actual operations.

Jianxin Xiong [Tue, 7 Aug 2012 05:53:19 +0000 (22:53 -0700)]
Correct the MR size calculation when registering with SCIF.

Jianxin Xiong [Sat, 4 Aug 2012 00:32:31 +0000 (17:32 -0700)]
MPXYD: only post an incoming write WR when there is no pending WR on the QP,
otherwise there might be ordering issues.

Jianxin Xiong [Sat, 4 Aug 2012 00:25:44 +0000 (17:25 -0700)]
Fix the (nil) "wr_id" output of posted pending work request.

Jianxin Xiong [Fri, 3 Aug 2012 03:30:34 +0000 (20:30 -0700)]
Bug fix: connection error with "unexpected DAPL event 0x4003" when more than one client exists.

Jianxin Xiong [Thu, 2 Aug 2012 21:38:43 +0000 (14:38 -0700)]
Correct the range of valid ops in mix_op_str().