git.openfabrics.org - ~ardavis/dapl.git/log
10 years agomcm: serialize dapls_evd_cqe_to_event calls with evd lock
Arlin Davis [Tue, 15 Jul 2014 20:42:12 +0000 (13:42 -0700)]
mcm: serialize dapls_evd_cqe_to_event calls with evd lock

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agocommon: init rbuf memory, assign hd/tl pos in range
Arlin Davis [Tue, 15 Jul 2014 20:41:10 +0000 (13:41 -0700)]
common: init rbuf memory, assign hd/tl pos in range

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoIB extension: segfault in create collective group with non-vector type IA handle
Arlin Davis [Wed, 2 Jul 2014 21:49:53 +0000 (14:49 -0700)]
IB extension: segfault in create collective group with non-vector type IA handle

The dats_get_ia_handle call was changed in 2.0.34 to convert an IA handle
both from vector to handle and from handle to vector, to fix query calls that
incorrectly returned IA handles in non-vector form. If a caller uses a
non-vector IA handle, it gets converted incorrectly to a vector and causes
a segfault. Add a check to verify the IA handle type before calling
get ia handle to avoid the incorrect translation.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agobuild: remove library check for mverbs with --enable-fca
Arlin Davis [Thu, 26 Jun 2014 22:40:46 +0000 (15:40 -0700)]
build: remove library check for mverbs with --enable-fca

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agobuild: change configure help to correctly state collective default=none
Arlin Davis [Tue, 24 Jun 2014 22:48:38 +0000 (15:48 -0700)]
build: change configure help to correctly state collective default=none

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm/mpxyd: cleanup ahead of master branch merge
Arlin Davis [Wed, 2 Jul 2014 20:37:18 +0000 (13:37 -0700)]
mcm/mpxyd: cleanup ahead of master branch merge

combine mpxy.h and dat_mic_extensions.h into dapl_mic_commom.h
since the MIC message and cm protocol is internal only and
is not an exposed extension.

update copyright dates.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: general cleanup of extra debug code
Arlin Davis [Wed, 2 Jul 2014 16:19:43 +0000 (09:19 -0700)]
mcm: general cleanup of extra debug code

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: serialize affinity cpu_id selection for threads
Arlin Davis [Thu, 26 Jun 2014 22:37:45 +0000 (15:37 -0700)]
mpxyd: serialize affinity cpu_id selection for threads

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoRelease dapl-2.0.42.2-1
Arlin Davis [Mon, 16 Jun 2014 21:37:41 +0000 (14:37 -0700)]
Release dapl-2.0.42.2-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: alltoall hangs on scale with MXS,MSS,HST intranode configurations
Arlin Davis [Mon, 16 Jun 2014 16:36:23 +0000 (09:36 -0700)]
mcm: alltoall hangs on scale with MXS,MSS,HST intranode configurations

An HST-based MCM provider can drop consumer (MPI) request events
if the consumer shares CQs across HST->MSS and HST->MXS
connections and the CQ events are processed in the PI progress
thread.

Change the mcm_rcv_pi_event function to mcm_dto_event and add
support to process both direct RW, RW_imm, and SND requests
(HST->MSS or HST) and proxy-in RW_imm requests (HST->MXS).

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: GUID matches with MXS->HST inside platform, make random without MPXYD_LOCAL_SU...
Arlin Davis [Mon, 16 Jun 2014 16:36:15 +0000 (09:36 -0700)]
mpxyd: GUID matches with MXS->HST inside platform, make random without MPXYD_LOCAL_SUPPORT

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: change readfrom/writeto to RMA_USECPU for IO <256 bytes
Arlin Davis [Mon, 16 Jun 2014 16:33:16 +0000 (09:33 -0700)]
mpxyd: change readfrom/writeto to RMA_USECPU for IO <256 bytes

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: improve logging for scale-up debug
Arlin Davis [Mon, 16 Jun 2014 16:31:27 +0000 (09:31 -0700)]
mcm: improve logging for scale-up debug

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoset default rr_signal rate to 1 from 10
Arlin Davis [Mon, 16 Jun 2014 16:30:37 +0000 (09:30 -0700)]
set default rr_signal rate to 1 from 10

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoRelease dapl-2.0.42.1-1
Arlin Davis [Thu, 22 May 2014 22:28:23 +0000 (15:28 -0700)]
Release dapl-2.0.42.1-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodapltest: increase DTO evd size to prevent CQ overflow on limit_rpost test
Arlin Davis [Tue, 15 Apr 2014 21:48:54 +0000 (14:48 -0700)]
dapltest: increase DTO evd size to prevent CQ overflow on limit_rpost test

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoCreation of reserved SP moves EP state to DAT_EP_STATE_RESERVED even in failure cases...
Arlin Davis [Tue, 15 Apr 2014 20:44:16 +0000 (13:44 -0700)]
Creation of reserved SP moves EP state to DAT_EP_STATE_RESERVED even in failure cases. Reserve EP after successfully binding the listening port.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodapl: fix string bug in dapls_dto_op_str
Dave Goodell [Mon, 24 Mar 2014 21:07:37 +0000 (14:07 -0700)]
dapl: fix string bug in dapls_dto_op_str

This led to indexing off the end of the array and gave surprising
results for OP_RECV_UD.
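
The commit does not show the fix itself. As a minimal sketch of the kind of guard that prevents indexing off the end of an opcode name table, assuming a hypothetical enum and table (none of the `demo_` names come from DAPL):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode list; the real DAPL opcodes and their order may differ. */
enum demo_dto_op {
    DEMO_OP_SEND,
    DEMO_OP_RECV,
    DEMO_OP_RDMA_WRITE,
    DEMO_OP_RECV_UD,
    DEMO_OP_COUNT
};

/* One name per opcode, in enum order. */
static const char *const demo_op_names[] = {
    "OP_SEND",
    "OP_RECV",
    "OP_RDMA_WRITE",
    "OP_RECV_UD",
};

const char *demo_dto_op_str(int op)
{
    /* The table and the enum must stay the same length. */
    assert(sizeof(demo_op_names) / sizeof(demo_op_names[0]) ==
           (size_t)DEMO_OP_COUNT);

    /* Reject anything outside the table instead of reading past it. */
    if (op < 0 || op >= DEMO_OP_COUNT)
        return "OP_UNKNOWN";
    return demo_op_names[op];
}
```

With this shape, an opcode added to the enum without a matching string trips the assertion rather than returning whatever happens to follow the array in memory.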

10 years agompxyd: change affinity to avoid overlapping cores with 2 MICs in same socket
Arlin Davis [Thu, 22 May 2014 21:56:07 +0000 (14:56 -0700)]
mpxyd: change affinity to avoid overlapping cores with 2 MICs in same socket

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: add mpxyd.conf option to disable proxy-in service
Arlin Davis [Thu, 22 May 2014 21:42:28 +0000 (14:42 -0700)]
mpxyd: add mpxyd.conf option to disable proxy-in service

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: set default seg size to 128KB, down from 256KB
Arlin Davis [Thu, 22 May 2014 20:57:42 +0000 (13:57 -0700)]
mpxyd: set default seg size to 128KB, down from 256KB

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoinstall: base package, without mpss, incorrectly tries to install /usr/sbin as file
Arlin Davis [Thu, 22 May 2014 20:16:31 +0000 (13:16 -0700)]
install: base package, without mpss, incorrectly tries to install /usr/sbin as file

modify specfile and makefile to only install /usr/sbin/mpxyd
if built with mpss present. If files don't exist, mpss not
installed, make sure /usr/sbin is not incorrectly installed as a file.

Signed-off-by: Patrick Mccormick <patrick.m.mccormick@intel.com>
Ack-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: scale-up with MPI dapl:dapl hits low mem issue with 1 byte traffic patterns
Arlin Davis [Thu, 22 May 2014 16:30:59 +0000 (09:30 -0700)]
mpxyd: scale-up with MPI dapl:dapl hits low mem issue with 1 byte traffic patterns

PI and PO buffer management use the last byte offset of work requests
and assume a non-zero value. However, a 1 byte rdma at offset 0
results in m_idx being set to zero, which is a valid offset,
and the buffer is never marked complete.

Clean up buffer management: add error reporting on setting,
serialize on PO by setting in the post_send op thread, and set the
start location at a cacheline when the buffer ring wraps instead of 0.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
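
To make the offset-0 problem concrete, here is a rough sketch of a ring allocator that wraps to the first cacheline instead of to 0, so the last-byte offset of any request is never zero. The function names, the 64-byte constant, and the absence of tail tracking are all assumptions for illustration; the mpxyd code is more involved.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CACHELINE 64u  /* assumed cacheline size */

/*
 * Reserve len bytes in a ring of ring_size bytes and return the start
 * offset. *hd is the next free offset and must start at DEMO_CACHELINE.
 * On wrap the head restarts at DEMO_CACHELINE, never at 0.
 * This sketch does not check against a tail, so it cannot detect overwrite.
 */
uint32_t demo_ring_alloc(uint32_t *hd, uint32_t ring_size, uint32_t len)
{
    assert(ring_size > DEMO_CACHELINE);
    assert(len > 0 && len <= ring_size - DEMO_CACHELINE);
    assert(*hd >= DEMO_CACHELINE);

    if (*hd + len > ring_size)
        *hd = DEMO_CACHELINE;   /* wrap past offset 0 */

    uint32_t start = *hd;
    *hd += len;
    return start;
}

/* Offset of the last byte of a request; at least DEMO_CACHELINE here. */
uint32_t demo_last_byte(uint32_t start, uint32_t len)
{
    return start + len - 1u;
}
```

Because every start offset is at least one cacheline in, a 1 byte request can no longer produce a last-byte offset of 0, which lets 0 keep its meaning of "not set".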
10 years agompxyd: add MIC client and device id to logging
Arlin Davis [Thu, 22 May 2014 16:30:52 +0000 (09:30 -0700)]
mpxyd: add MIC client and device id to logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoadd new po-pi rdma write perf profile
Arlin Davis [Thu, 22 May 2014 16:29:00 +0000 (09:29 -0700)]
add new po-pi rdma write perf profile

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agochange default wrc entries from 1024 to 512
Arlin Davis [Thu, 22 May 2014 16:27:35 +0000 (09:27 -0700)]
change default wrc entries from 1024 to 512

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: MIC scale-up issue with MPI gather workloads, I_MPI_FABRICS=dapl:dapl
Arlin Davis [Fri, 16 May 2014 17:04:21 +0000 (10:04 -0700)]
mpxyd: MIC scale-up issue with MPI gather workloads, I_MPI_FABRICS=dapl:dapl

issue with shared proxy-in buffer pool when rdma reads complete
out of order across QP's. The tail adjustment when read completes
fails to walk entire queue and process head entry.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: remove unnecessary logging
Arlin Davis [Fri, 16 May 2014 16:26:35 +0000 (09:26 -0700)]
mpxyd: remove unnecessary logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: mpxyd error event of m_pi_prep_rcv_q: ERR: ib_qp == 0
Arlin Davis [Fri, 16 May 2014 16:21:26 +0000 (09:21 -0700)]
mcm: mpxyd error event of m_pi_prep_rcv_q: ERR: ib_qp == 0

When an incorrect ep_mode is provided by the consumer, the proxy-in
service modes get set up incorrectly. Validate the remote address
ep mode and set it to unknown if out of range.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoRelease dapl-2.0.41.2-1
Arlin Davis [Thu, 1 May 2014 20:08:04 +0000 (13:08 -0700)]
Release dapl-2.0.41.2-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoSUSE install/uninstall issues with latest spec file
Arlin Davis [Thu, 1 May 2014 19:47:43 +0000 (12:47 -0700)]
SUSE install/uninstall issues with latest spec file

For installation, remove obsolete/conflict with self, only obsolete
the old intel-mic-ofed-dapl package name.

Add a %preun check that the mpxyd service exists first; otherwise
uninstallation fails on non ccl-proxy systems.

Signed-off-by: Patrick McCormick <patrick.m.mccormick@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd,mcm: increase default CM timers for better oob scaling
Arlin Davis [Thu, 24 Apr 2014 19:35:53 +0000 (12:35 -0700)]
mpxyd,mcm: increase default CM timers for better oob scaling

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd,mcm: changes for backward compatibility with older v4 MIC clients
Arlin Davis [Wed, 30 Apr 2014 18:13:35 +0000 (11:13 -0700)]
mpxyd,mcm: changes for backward compatibility with older v4 MIC clients

Allow mpxyd service to run with older MIC clients that support only proxy-out
and not proxy-in capabilities. Define minimal and compatible versions and
sync to MIC client during device open.

Create and use dat_mcm_msg_compat, dat_mix_mr_compat, and dat_mix_cm_compat
messages and operations with older v4 clients.

Move current MIX command version to v5.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years ago allow proxy_out WR stalls instead of immediate error
Arlin Davis [Wed, 30 Apr 2014 18:13:29 +0000 (11:13 -0700)]
 allow proxy_out WR stalls instead of immediate error

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: increase max open files limit for proxy service
Arlin Davis [Wed, 23 Apr 2014 18:07:14 +0000 (11:07 -0700)]
mpxyd: increase max open files limit for proxy service

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoRelease dapl-2.0.41.1-1
Arlin Davis [Mon, 21 Apr 2014 19:52:49 +0000 (12:52 -0700)]
Release dapl-2.0.41.1-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodapltest: change server port, from 45278 to 62000, out of registered IANA range
Arlin Davis [Fri, 14 Mar 2014 17:47:06 +0000 (10:47 -0700)]
dapltest: change server port, from 45278 to 62000, out of registered IANA range

The existing port 45278 is in the registered port range.

RFC 6335:
 System Ports, well known, 0-1023 (assigned by IANA)
 User Ports, registered, 1024-49151 (assigned by IANA)
 Dynamic Ports, private or Ephemeral, 49152-65535 (never assigned)

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
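
The RFC 6335 ranges quoted above can be expressed as a small classifier. This is only an illustration of the ranges; the enum and function names are made up and are not part of dapltest.

```c
#include <assert.h>

enum demo_port_class {
    DEMO_PORT_SYSTEM,   /* 0-1023, assigned by IANA */
    DEMO_PORT_USER,     /* 1024-49151, assigned by IANA */
    DEMO_PORT_DYNAMIC   /* 49152-65535, never assigned */
};

/* Classify a TCP/UDP port number according to the RFC 6335 ranges. */
enum demo_port_class demo_classify_port(unsigned port)
{
    assert(port <= 65535u);

    if (port <= 1023u)
        return DEMO_PORT_SYSTEM;
    if (port <= 49151u)
        return DEMO_PORT_USER;
    return DEMO_PORT_DYNAMIC;
}
```

The old default, 45278, falls in the User (registered) range, while the new default, 62000, falls in the Dynamic range.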
10 years agodapltest: set default limit max to 1000
Arlin Davis [Tue, 4 Mar 2014 18:48:55 +0000 (10:48 -0800)]
dapltest: set default limit max to 1000

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodapltest: update scripts for regression testing purposes
Arlin Davis [Mon, 3 Mar 2014 23:04:12 +0000 (15:04 -0800)]
dapltest: update scripts for regression testing purposes

cl.sh and srv.sh updated to provide better examples and
a method to quickly regression test any dapltest changes.

 usage: srv.sh devicename
   where devicename is provider (default = ofa-v2-mlx4_0-1)

 usage: cl.sh hostname testname devicename
   where testname
     stop - request DAPLtest server to exit.
     conn - simple connection with limited data transfer
     trans - single transaction test
     transm - transaction test: multiple transactions [RW SND, RDMA]
     transt - transaction test: multi-threaded, single transaction
     transme - transaction test: multi-endpoints per thread
     transmet - transaction test: multi: threads and endpoints per thread
     transmete - transaction test: multi threads == endpoints
     perf - Performance test
     threads - multi-threaded single transaction test.
     threadsm - multi: threads and endpoints, single transaction test.
     rdma-write - RDMA write
     rdma-read - RDMA read
     bw - bandwidth
     latb - latency tests, blocking for events
     latp - latency tests, polling for events
     lim - limit tests.
     regression - loop over a collection of all tests.
   where devicename is provider (default = ofa-v2-mlx4_0-1)

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodapltest: Add final send/recv "sync" for transaction tests.
swise@opengridcomputing.com [Mon, 3 Mar 2014 22:35:43 +0000 (14:35 -0800)]
dapltest: Add final send/recv "sync" for transaction tests.

The transaction tests need both sides to send a sync message after running the test.  This ensures that all remote operations are complete before dapltest deregisters memory and disconnects the endpoints.

Without this logic, we see intermittent async errors on iwarp devices because a read response or write arrives after the rmr has been destroyed.
I believe this is more likely to happen with iWARP than IB because iWARP completions only indicate the local buffer can be reused.  It doesn't imply that the message has even arrived at the peer, let alone been placed in the peer application's memory.

Changes from V1:

- allocate new send/recv buffers for the Final Sync message.

- post the Final Sync recv buffer at the beginning of the final iteration of a test.

- tests ok on cxgb4 and mlx4 devices.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
10 years agompxyd: append _free to dqconn/dqlisten for readability, improve logging
Arlin Davis [Fri, 18 Apr 2014 00:05:58 +0000 (17:05 -0700)]
mpxyd: append _free to dqconn/dqlisten for readability, improve logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: check for shared CQs in PI mode
Arlin Davis [Fri, 18 Apr 2014 00:04:26 +0000 (17:04 -0700)]
mcm: check for shared CQs in PI mode

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: add dev_id to all mpxyd commands
Arlin Davis [Fri, 18 Apr 2014 00:03:27 +0000 (17:03 -0700)]
mcm: add dev_id to all mpxyd commands

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: scale-up improvements to support 200-300 processes per MIC
Arlin Davis [Tue, 15 Apr 2014 22:28:10 +0000 (15:28 -0700)]
mpxyd: scale-up improvements to support 200-300 processes per MIC

Change scif_send_msg from blocking to non-blocking.
Serialize listen port space across MIC devices on same IB device.
Set CM to reject state after sending or recving user reject.
Serialize usage of scif_ev_ep on CM and DTO events across TX and CM threads.
Increase default scif listen backlog.
Reduce default proxy buffer, WR, and WC queues.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: serialize MD cm port space usage, add swap to rej call, resend dropped rej
Arlin Davis [Tue, 15 Apr 2014 00:09:32 +0000 (17:09 -0700)]
mpxyd: serialize MD cm port space usage, add swap to rej call, resend dropped rej

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: serialize CM cmds on ev_ep, add dev_id, increase dev_open listen backlog
Arlin Davis [Mon, 14 Apr 2014 23:55:47 +0000 (16:55 -0700)]
mcm: serialize CM cmds on ev_ep, add dev_id, increase dev_open listen backlog

user thread or cm thread could be processing CM commands
and events so use of scif_ev_ep needs locking.

Add dev_id to req_id for client to mpxyd device open linking
and increase backlog on MCM scif_listen for mpxyd to avoid
connection refused scenarios during device open.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoopenib: return open failure if port not active
Arlin Davis [Thu, 3 Apr 2014 23:04:59 +0000 (16:04 -0700)]
openib: return open failure if port not active

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: disable inside box support via scif only, use IB
Arlin Davis [Tue, 25 Mar 2014 20:49:31 +0000 (13:49 -0700)]
mpxyd: disable inside box support via scif only, use IB

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: add host to mic cross socket support to proxy-in service to improve performance
Arlin Davis [Tue, 25 Mar 2014 18:36:37 +0000 (11:36 -0700)]
mcm: add host to mic cross socket support to proxy-in service to improve performance

mcm provider running on host will now connect and move data via
the remote mpxyd proxy in (PI) service when connecting to a MIC
cross socket from HCA.

CM protocol/service enhanced to connect multi-pathed QP's
between a non-MIC and MICs for optimized speed paths per
direction.

HOST-> MSS (same socket) will connect QP2 directly to remote MIC rcv
QP1 for send data and connect QP1 to MPXYD PO service QP2 on remote
host for recv data.

HOST-> MXS (cross socket) will connect QP2 to MPXYD PI service QP1
on remote host for send data and connect QP1 to MPXYD PO service QP2
on remote host for recv data.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodat: reduce log level on library load failures
Arlin Davis [Tue, 25 Mar 2014 18:36:22 +0000 (11:36 -0700)]
dat: reduce log level on library load failures

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomic: take cm_msg rsvd byte for segment size exchange, power of 2
Arlin Davis [Thu, 13 Mar 2014 18:30:08 +0000 (11:30 -0700)]
mic: take cm_msg rsvd byte for segment size exchange, power of 2

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
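
The commit does not say how the segment size is packed into the reserved byte. One plausible reading, since sizes are restricted to powers of 2, is to send the exponent. The sketch below assumes that encoding; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a power-of-2 segment size as its base-2 exponent (fits in a byte). */
uint8_t demo_seg_encode(uint32_t seg_size)
{
    /* Must be non-zero and have exactly one bit set. */
    assert(seg_size != 0 && (seg_size & (seg_size - 1u)) == 0);

    uint8_t exp = 0;
    while (seg_size > 1u) {
        seg_size >>= 1;
        exp++;
    }
    return exp;
}

/* Decode the exponent back into a segment size in bytes. */
uint32_t demo_seg_decode(uint8_t exp)
{
    assert(exp < 32u);
    return (uint32_t)1u << exp;
}
```

Under this assumption a 128KB segment size travels as the single byte 17.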
10 years agompxyd: init proxy WC buffer queues for new queue management
Arlin Davis [Sun, 9 Mar 2014 18:49:48 +0000 (11:49 -0700)]
mpxyd: init proxy WC buffer queues for new queue management

change device destroy to dequeue SMD immediately.
change TX thread to run whenever active device is queued.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: simplify WR and WC queue management and fix inline post send
Arlin Davis [Sun, 9 Mar 2014 18:44:59 +0000 (11:44 -0700)]
mpxyd: simplify WR and WC queue management and fix inline post send

power of 2 depths, TL = HD is empty, tl + 1.

inline post sends need locking for TX proxy buffer
and will need to flush inline data from scif message
on failures.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
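
The terse note "power of 2 depths, TL = HD is empty, tl + 1" reads like a conventional ring queue. A minimal sketch of that convention follows, assuming the head and tail both advance before the slot is used; the struct and function names are illustrative and not taken from mpxyd.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DEPTH 8u   /* must be a power of 2 */

struct demo_ring {
    uint32_t hd;                /* last slot written */
    uint32_t tl;                /* last slot consumed */
    int slot[DEMO_DEPTH];
};

/* Advance an index with a mask; valid only for power-of-2 depths. */
static uint32_t demo_next(uint32_t i)
{
    return (i + 1u) & (DEMO_DEPTH - 1u);
}

/* The queue is empty when tail has caught up with head. */
int demo_empty(const struct demo_ring *r)
{
    assert(r);
    return r->hd == r->tl;
}

/* Returns 0 on success, -1 if full (one slot is always left unused). */
int demo_push(struct demo_ring *r, int v)
{
    assert(r);
    uint32_t n = demo_next(r->hd);
    if (n == r->tl)
        return -1;
    r->slot[n] = v;
    r->hd = n;
    return 0;
}

/* Returns 0 and stores the oldest entry (at tl + 1), or -1 if empty. */
int demo_pop(struct demo_ring *r, int *v)
{
    assert(r && v);
    if (demo_empty(r))
        return -1;
    r->tl = demo_next(r->tl);
    *v = r->slot[r->tl];
    return 0;
}
```

Keeping one slot unused is what lets `hd == tl` mean empty without a separate count, at the cost of a usable capacity of depth minus one.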
10 years agompxyd: 64KB segment sizes hang with MPI IMB pingpong cross socket
Arlin Davis [Thu, 6 Mar 2014 17:24:04 +0000 (09:24 -0800)]
mpxyd: 64KB segment sizes hang with MPI IMB pingpong cross socket

proxy out work request processing rounds down starting address
and rounds up size to 64 byte cacheline. The case where rounded
up from non-64 byte resulted in a 0 byte RDMA segment. Add checking
for actual len versus rounded up l_len for last segment.

Add additional perf profiling via MCM_PROFILE_DBG.
Signal on LS if not marked via signal rate modulo.
Add support for new M_READ_FROM_DONE work request state.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
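
The arithmetic behind this fix can be sketched as follows. The exact failure path is not spelled out in the commit, so this is an assumed reconstruction: when the rounded-up length spills into an extra segment, that segment contains only padding, and a length check against the actual len catches it. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CL 64u  /* cacheline size used for rounding */

/* Round an address down to the start of its cacheline. */
uint64_t demo_round_down(uint64_t addr)
{
    return addr & ~(uint64_t)(DEMO_CL - 1u);
}

/* Round a length up to a whole number of cachelines. */
uint64_t demo_round_up(uint64_t len)
{
    return (len + DEMO_CL - 1u) & ~(uint64_t)(DEMO_CL - 1u);
}

/*
 * Bytes of real payload for the segment that starts seg_off bytes into
 * a request of len bytes. Returns 0 when the segment would hold only
 * rounding padding; the caller should then skip the RDMA post.
 */
uint32_t demo_seg_rdma_len(uint32_t len, uint32_t seg_off, uint32_t seg_size)
{
    assert(seg_size > 0);

    if (seg_off >= len)
        return 0;
    uint32_t left = len - seg_off;
    return left < seg_size ? left : seg_size;
}
```

For example, a 65536 byte request that starts 32 bytes into a cacheline rounds up to 65600 bytes, which is two 64KB segments by rounded length, yet the second segment carries no payload.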
10 years agompxyd: add new M_READ_FROM_DONE state for send WR's and add more profiling options
Arlin Davis [Thu, 6 Mar 2014 05:23:27 +0000 (21:23 -0800)]
mpxyd: add new M_READ_FROM_DONE state for send WR's and add more profiling options

new state added to work request flag along with a m_qp->wr_tl_rf field
to limit wr pending thread processing to just RF pending entries
and avoiding needless processing of M_SEND_POSTED entries.

Add more perf profiling capabilities to defer IB RDMA until after all the post_send
scif_readfrom's, first to last segment, are complete.

disable MCM_PROFILE_DBG compile option by default

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: purge posted send data only if inline
Arlin Davis [Thu, 6 Mar 2014 03:55:36 +0000 (19:55 -0800)]
mpxyd: purge posted send data only if inline

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: move to CONN state immediately on RTU_IN
Arlin Davis [Wed, 5 Mar 2014 22:07:43 +0000 (14:07 -0800)]
mpxyd: move to CONN state immediately on RTU_IN

reject RW_imm messages from remote proxy-in if
in disconnected state.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodtest: fix options on query times printf
Arlin Davis [Tue, 25 Feb 2014 18:29:49 +0000 (10:29 -0800)]
dtest: fix options on query times printf

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: fix ibctx leak with device open
Arlin Davis [Tue, 25 Feb 2014 18:28:51 +0000 (10:28 -0800)]
mpxyd: fix ibctx leak with device open

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodtest: add new -Q option to get provider list and query and time each
Arlin Davis [Mon, 24 Feb 2014 17:44:32 +0000 (09:44 -0800)]
dtest: add new -Q option to get provider list and query and time each

3 modes for -Q added to dtest:

-Q 1 == open/query/close, normal method
-Q 2 == open_query/close_query mode, new extensions
-Q 3 == both modes

Individual query and total times are reported. With
the new -Q option, dtest will query and exit without
connection or data transfer tests

If query extensions are not supported open_query will
simply return DAT_NOT_IMPLEMENTED.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoopenib: cleanup init/fini and device close on all providers
Arlin Davis [Mon, 24 Feb 2014 17:40:05 +0000 (09:40 -0800)]
openib: cleanup init/fini and device close on all providers

fd leak with init/fini with pipe for device thread.

Add check on close to ensure HCA is on thread process
list before marking and waiting for destroy.

modify dev_list parsing and use device count returned
from IB verbs call.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodat: dat_ia_open should close provider after failure
Arlin Davis [Mon, 24 Feb 2014 17:11:30 +0000 (09:11 -0800)]
dat: dat_ia_open should close provider after failure

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: sync PI WC trigger to PO MP_SIG
Arlin Davis [Mon, 24 Feb 2014 17:03:15 +0000 (09:03 -0800)]
mpxyd: sync PI WC trigger to PO MP_SIG

The PI and PO segment comp signal was out of sync so the WC update
back from the PI incorrectly updated the m_po_buf_tl
on the PO side. Set signaling on both sides based
on the rdma_write initiator setting via the WR M_SEND_MP_SIG bit.

Modify some log levels and add check for m_idx during tail update.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: improve QP destruction to manage QP1 and QP2 variations
Arlin Davis [Wed, 19 Feb 2014 00:02:18 +0000 (16:02 -0800)]
mpxyd: improve QP destruction to manage QP1 and QP2 variations

With proxy-in and proxy-out connection combinations the
proxy agent sometimes manages 2 QPs. Change QP flush
and destruction to manage all combinations of QPs.

QP can also be on both tx and rx link-list for proxy-in
and proxy-out processing. QP free needs to be modified
to serialize and remove QP object from all lists.

Remove QPN option from mix_get_qp call.

Proxy-in RX_IMM message processing changed to validate
CM connected state and IB QP state before reposting.

Proxy-in pending_wr processing should send WC's to release
proxy buffers more frequently instead of on last segment.
With multiple QP's sharing proxy buffer it could stall
waiting for last segment WC's. It will now signal on last
segment or every 10th segment by default.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
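
The signaling rule in the last paragraph (last segment, or every 10th segment by default) amounts to a small predicate. The sketch below is an assumed formulation with made-up names; the real code ties the rate to a configurable setting.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decide whether segment seg_idx (0-based) out of seg_cnt should send a
 * work completion. rate is the signal interval; 0 disables periodic
 * signaling so that only the last segment signals.
 */
int demo_should_signal(uint32_t seg_idx, uint32_t seg_cnt, uint32_t rate)
{
    assert(seg_cnt > 0 && seg_idx < seg_cnt);

    if (seg_idx + 1u == seg_cnt)
        return 1;                       /* last segment always signals */
    return rate != 0 && (seg_idx + 1u) % rate == 0;
}
```

With a rate of 10 and a 25-segment transfer, completions go out after segments 10, 20, and 25, so shared proxy buffers are released without waiting for the final segment.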
10 years agompxyd: proxy out doesn't release proxy buffer as quickly as necessary
Arlin Davis [Tue, 18 Feb 2014 23:23:37 +0000 (15:23 -0800)]
mpxyd: proxy out doesn't release proxy buffer as quickly as necessary

Change proxy-out WC processing to release the proxy-out buffer
during every event, not just consumer signaled events.
The remote proxy-in will only send a WC if this WR segment has been
completely moved and is ready for reuse.

Change WR and proxy memory stall logic to limit retries to 5 seconds.
Print warning messages every 100 retries with appropriate queue info.
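
A bounded stall loop of the kind described (give up after a time limit, warn every N retries) might look like the sketch below. The callback interface and all names are assumptions, and a real implementation would likely yield or sleep between attempts and include queue details in the warning.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Example operation: fails until the counter behind arg reaches zero. */
int demo_countdown(void *arg)
{
    int *remaining = arg;
    if (*remaining > 0) {
        (*remaining)--;
        return -1;
    }
    return 0;
}

/*
 * Call op(arg) until it returns 0 or limit_sec seconds have elapsed.
 * Prints a warning every warn_every retries.
 * Returns the number of retries on success, or -1 on timeout.
 */
long demo_retry(int (*op)(void *), void *arg, double limit_sec,
                unsigned warn_every)
{
    assert(op && warn_every > 0);

    time_t start = time(NULL);
    long retries = 0;

    while (op(arg) != 0) {
        if (difftime(time(NULL), start) >= limit_sec)
            return -1;
        retries++;
        if (retries % (long)warn_every == 0)
            fprintf(stderr, "warning: stalled, %ld retries\n", retries);
    }
    return retries;
}
```

Calling `demo_retry(demo_countdown, &n, 5.0, 100)` matches the limits in the commit text: a 5 second cap with a warning on every 100th retry.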

10 years agodtest: add times for open_query, remove sleep
Arlin Davis [Tue, 18 Feb 2014 22:47:02 +0000 (14:47 -0800)]
dtest: add times for open_query, remove sleep

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agocommon: add provider name and len to DTO error logging
Arlin Davis [Tue, 18 Feb 2014 22:45:18 +0000 (14:45 -0800)]
common: add provider name and len to DTO error logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agonew lightweight open_query/close_query IB extension for fast attribute query
Arlin Davis [Wed, 12 Feb 2014 22:55:25 +0000 (14:55 -0800)]
new lightweight open_query/close_query IB extension for fast attribute query

Consumers that need provider attributes must do a full device open
in order to get any provider/device information. With so many static device
entries in /etc/dat.conf consumers are building classification
mechanisms to identify provider type, locality, name, device
mode, and decide which device is appropriate. The existing DAT interface
doesn't provide a lightweight mechanism for queries.

The following fast query functions have been added to dat_ib_extensions.h:

dat_ib_open_query(name, ia_handle, ia_mask, ia_attr, prov_mask, prov_attr)
dat_ib_close_query(ia_handle)

In addition, DAT extension interface, dat_extension_op, has been
expanded to include new internal calls to handle quick provider load
and function linkage via udat_extension_open, and udat_extension_close
functions. Extended operations needing DAT open/close services need
to be defined from a DAT_OPEN_EXTENSION_BASE or DAT_CLOSE_EXTENSION_BASE
respectively.

NOTE: The ia_handle returned by open_query must be closed with a subsequent
close_query and not used with any other dat_ia_ operations. Attribute
storage from open_query is not valid after the close_query call.

The IB extensions have been rolled to version 2.0.8 with this new API.
The changes are backward compatible.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: need CM to QP linking with CM references
Arlin Davis [Wed, 12 Feb 2014 21:41:37 +0000 (13:41 -0800)]
mpxyd: need CM to QP linking with CM references

Complete coding support for ref_cnt on CM to allow for
proper destruction of CM resources. Ref count for CM alloc,
QP linking, and queue list. List dequeue will trigger CM
free, move to destroy state, and dealloc if ref_cnt is zero.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
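
The reference-counting scheme described above can be sketched in a few lines. This is a single-threaded illustration with hypothetical names; the real CM object would need locking or atomics around ref_cnt.

```c
#include <assert.h>
#include <stdlib.h>

enum demo_cm_state { DEMO_CM_ACTIVE, DEMO_CM_DESTROY };

struct demo_cm {
    int ref_cnt;
    enum demo_cm_state state;
};

/* Allocation takes the first reference. */
struct demo_cm *demo_cm_alloc(void)
{
    struct demo_cm *cm = calloc(1, sizeof(*cm));
    if (cm) {
        cm->ref_cnt = 1;
        cm->state = DEMO_CM_ACTIVE;
    }
    return cm;
}

/* Take a reference, e.g. when linking to a QP or adding to a list. */
void demo_cm_get(struct demo_cm *cm)
{
    assert(cm && cm->ref_cnt > 0);
    cm->ref_cnt++;
}

/* Drop a reference; frees the object and returns 1 when it was the last. */
int demo_cm_put(struct demo_cm *cm)
{
    assert(cm && cm->ref_cnt > 0);
    if (--cm->ref_cnt == 0) {
        free(cm);
        return 1;
    }
    return 0;
}

/* List dequeue: move to destroy state, then drop the list's reference. */
int demo_cm_dequeue(struct demo_cm *cm)
{
    assert(cm);
    cm->state = DEMO_CM_DESTROY;
    return demo_cm_put(cm);
}
```

The object survives a dequeue for as long as a QP link or the allocator still holds a reference, and is deallocated by whichever holder drops the last one.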
10 years agodist: ib collective and MIC extension include files missing
Arlin Davis [Tue, 11 Feb 2014 22:31:49 +0000 (14:31 -0800)]
dist: ib collective and MIC extension include files missing

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodapltest: the quit command is missing changes for -n option.
Arlin Davis [Tue, 11 Feb 2014 18:19:05 +0000 (10:19 -0800)]
dapltest: the quit command is missing changes for -n option.

Server-port was not being set properly during param init phase on the client side.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoNULL undefined on Fedora, incorrectly using kernel stddef.h
Arlin Davis [Tue, 11 Feb 2014 18:17:04 +0000 (10:17 -0800)]
NULL undefined on Fedora, incorrectly using kernel stddef.h

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoMerge branch 'proxy' of ssh://beany.openfabrics.org/home/ardavis/scm/dapl into proxy
Arlin Davis [Mon, 10 Feb 2014 17:45:35 +0000 (09:45 -0800)]
Merge branch 'proxy' of ssh://beany.openfabrics.org/home/ardavis/scm/dapl into proxy

10 years agoucm: fix CM service, initial rcv msg posts incorrect
Arlin Davis [Tue, 4 Feb 2014 03:17:33 +0000 (19:17 -0800)]
ucm: fix CM service, initial rcv msg posts incorrect

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoucm: add/cleanup debug log information
Arlin Davis [Tue, 4 Feb 2014 03:15:57 +0000 (19:15 -0800)]
ucm: add/cleanup debug log information

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoscm: add/cleanup debug log information
Arlin Davis [Tue, 4 Feb 2014 03:14:31 +0000 (19:14 -0800)]
scm: add/cleanup debug log information

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomakefile: update for MCM proxy-in changes
Arlin Davis [Tue, 4 Feb 2014 03:13:01 +0000 (19:13 -0800)]
makefile: update for MCM proxy-in changes

add mpxyd.h to dist files
separate functionality into multiple source files,
util.c, mix.c, mcm.c, mpxy_out.c, mpxy_in.c

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agodtest: update for ep_mode on MCM providers
Arlin Davis [Tue, 4 Feb 2014 03:12:06 +0000 (19:12 -0800)]
dtest: update for ep_mode on MCM providers

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd.conf: updated for proxy-in parameters
Arlin Davis [Tue, 4 Feb 2014 03:09:44 +0000 (19:09 -0800)]
mpxyd.conf: updated for proxy-in parameters

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: proxy-in added to proxy-out service to increase cross socket performance
Arlin Davis [Tue, 4 Feb 2014 02:58:04 +0000 (18:58 -0800)]
mpxyd: proxy-in added to proxy-out service to increase cross socket performance

Proxy-in service added to MCM dapl providers to
improve cross socket MIC adapter performance.

Additional RX thread created to handle PI service,
new CM wire protocol to exchange WR and WC references,
and new DTO wire protocol to Read remote PO data
segments and forward via SCIF writeto.

In order to maintain DAT API compatibility the IB
MR addr, rkeys are translated to SCIF addresses
and TPT entries created on the MPXYD to handle inbound
rdma writes targeted to MIC adapters.

Code broken out into separate source files:
mpxy_in.c - proxy_in service
mpxy_out.c - proxy_out service
util.c - general utilities
mix.c - MIC to HOST operations
mpxyd.c - device open, RX, TX, OP, CM threads.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: add proxy in support to MCM provider and MPXYD interface
Arlin Davis [Tue, 4 Feb 2014 02:49:07 +0000 (18:49 -0800)]
mcm: add proxy in support to MCM provider and MPXYD interface

Add dapli_mix_post_recv, dapli_mix_mr_create, dapli_mix_mr_free
no QPr exist on MIC with MXS to MXS connections
cm addr becomes addr1, save all QPr addr1 info during rejects
verify CM service exists before freeing port space, could be on mpxyd
system guid support to verify locality to inside/outside the box
change UD mode checking on EP instead of QP, QP doesn't exist on MXS
add reject support
Fix for CM service RX posting, walking queue doesn't include GRH.
Add system_guid field to MCM provider ib_hca_transport struct

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoopen_ib common: qp, cq, and post_recv changes for proxy-in
Arlin Davis [Tue, 4 Feb 2014 02:37:34 +0000 (18:37 -0800)]
open_ib common: qp, cq, and post_recv changes for proxy-in

Modify common QP, CQ, and DTO services to support proxy-in
service that eliminates the need for local QP and CQ resources
on the MIC adapter.

Change WR UD type check to support no QP mode.
Add dapli_mix_post_recv functionality for PI, QPr on mpxyd.
Store platform unique guid for EP locality - inside/outside.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agocommon: add lmr support for proxy in service
Arlin Davis [Tue, 4 Feb 2014 02:31:31 +0000 (18:31 -0800)]
common: add lmr support for proxy in service

Registration details must be transferred to proxy service
to enable proxy-in data transfers. IB registration
and SCIF registration are sent to mpxyd for inbound
rdma write TPT services for IB RW store and SCIF writeto
forward capabilities. Extend DAT LMR to include
scif information and ID. If proxy service is
in use call new functions dapli_mix_mr_create/free
to sync with mpxyd.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agonew definitions and states for CCL Proxy-in support
Arlin Davis [Tue, 4 Feb 2014 02:26:53 +0000 (18:26 -0800)]
new definitions and states for CCL Proxy-in support

MCM proxy data limits, new CM free state, EP mode support for EP locality
New ep mapping field in address structure
New MIX ops mix_recv and mix_cm_reject_user
Expanded MIX ops mr structure to include IB and SCIF details
Changed dat_mix_send struct name to dat_mix_sr for send and recv

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoRelease dapl-2.0.39.1-1
Arlin Davis [Thu, 5 Dec 2013 17:43:44 +0000 (09:43 -0800)]
Release dapl-2.0.39.1-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoRename back to dapl, small fixes for spec file
pmmccorm [Fri, 22 Nov 2013 20:31:34 +0000 (12:31 -0800)]
Rename back to dapl, small fixes for spec file

10 years agoRelease intel-mic-ofed-dapl-2.0.36.12-1
Arlin Davis [Wed, 25 Sep 2013 22:14:19 +0000 (15:14 -0700)]
Release intel-mic-ofed-dapl-2.0.36.12-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoucm, scm: UD mode triggers list_head assert with large scale alltoall test
Arlin Davis [Wed, 25 Sep 2013 22:10:56 +0000 (15:10 -0700)]
ucm, scm: UD mode triggers list_head assert with large scale alltoall test

1024+ ranks, IMB alltoall may hit assert when running Intel MPI in UD mode.

CR clean up was implemented with EP to CR references still linked.
During cr_accept, the CR remote_ia_address is linked to EP object
by mistake with UD mode. UD mode may have multiple CRs per EP so
no direct mappings to CR memory can exist unless RC mode which
always has one EP to CR mapping.

In scm, ucm: for CM object free with CR references the search and
unlinking from SP must be under SP lock to serialize. Also,
cleanup thread wakeup logic to only trigger the thread if
reference count indicates the need for more processing.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: ERR: stalled, insufficient proxy memory
Arlin Davis [Fri, 13 Sep 2013 22:12:05 +0000 (15:12 -0700)]
mpxyd: ERR: stalled, insufficient proxy memory

When scaling up/out with lots of QPs using a shared
proxy buffer the rdma writes can block waiting for
memory to free. The signal rate on the posted
writes must be reduced to ensure proxy buffers
are freed in a more timely manner.

Add logic to return failure if stalling becomes
excessive.

Allow administrator to adjust IB mcm_signal_rate
via mpxyd.conf. Default is now 10 instead of 100.
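
The effect of the signal rate can be sketched as follows. This is a minimal illustration, assuming the rate is interpreted as "request a completion on every Nth posted write", which is consistent with lowering the default from 100 to 10 so that buffers are released sooner. The struct and function names are hypothetical and are not taken from mpxyd; with libibverbs the returned decision would typically translate into setting IBV_SEND_SIGNALED on the work request.

```c
#include <stdbool.h>

/* Hypothetical per-QP state; not mpxyd's actual structure. */
struct wr_signal_state {
    unsigned int rate;   /* request a completion every `rate` writes */
    unsigned int posted; /* number of writes posted so far */
};

/*
 * Count one posted write and report whether it should be signaled.
 * A rate of 0 or 1 signals every write; a smaller rate means more
 * frequent completions and therefore earlier reuse of proxy buffers.
 */
static bool wr_should_signal(struct wr_signal_state *s)
{
    s->posted++;
    if (s->rate <= 1)
        return true;
    return (s->posted % s->rate) == 0;
}
```

Under this assumption, 100 posted writes produce 10 completions at a rate of 10 but only one at a rate of 100, so a shared buffer drains roughly ten times as often with the new default.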

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: handle catastrophic IB async events, including IBV_EVENT_LID_CHANGE
Arlin Davis [Thu, 12 Sep 2013 21:03:58 +0000 (14:03 -0700)]
mpxyd: handle catastrophic IB async events, including IBV_EVENT_LID_CHANGE

cleanup mdev destroy functions, use mcm_ib_async_str for all IB events.
Destroy all mdev resources, including CM services, and abort all
open clients when receiving the following IB async events:

IBV_EVENT_PATH_MIG
IBV_EVENT_PATH_MIG_ERR
IBV_EVENT_DEVICE_FATAL
IBV_EVENT_PORT_ERR
IBV_EVENT_LID_CHANGE
IBV_EVENT_PKEY_CHANGE
IBV_EVENT_SM_CHANGE

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: add get/set prov attributes op, str print for ib async events
Arlin Davis [Thu, 12 Sep 2013 21:01:05 +0000 (14:01 -0700)]
mcm: add get/set prov attributes op, str print for ib async events

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agomcm: reduce max qp depth and msg size in proxy mode, allow override
Arlin Davis [Thu, 12 Sep 2013 16:12:55 +0000 (09:12 -0700)]
mcm: reduce max qp depth and msg size in proxy mode, allow override

DAPL_MCM_WR_MAX is used to set max qp depth on mcm provider, default=500
DAPL_MCM_MSG_MAX is used to set max msg size on mcm provider, default=8388608
DAPL_WR_MAX is used to override max qp depth on all IB providers.
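
Reading such an override with a fallback to the documented default can be sketched like this. It is a plain libc illustration, not the helper dapl itself uses; env_long, mcm_wr_max and mcm_msg_max are hypothetical names, and the precedence between DAPL_WR_MAX and DAPL_MCM_WR_MAX is deliberately left out because the commit message does not state it.

```c
#include <errno.h>
#include <stdlib.h>

/* Return the positive integer value of an environment variable,
 * or `dflt` when it is unset, empty, malformed, or not positive. */
static long env_long(const char *name, long dflt)
{
    const char *val = getenv(name);
    char *end;
    long n;

    if (val == NULL || *val == '\0')
        return dflt;
    errno = 0;
    n = strtol(val, &end, 0);
    if (errno != 0 || *end != '\0' || n <= 0)
        return dflt;
    return n;
}

/* Defaults are the values quoted in the commit message above. */
static long mcm_wr_max(void)
{
    return env_long("DAPL_MCM_WR_MAX", 500);
}

static long mcm_msg_max(void)
{
    return env_long("DAPL_MCM_MSG_MAX", 8388608L);
}
```

Falling back to the default on malformed input is a design choice of this sketch; a real provider might prefer to log a warning instead.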

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: CM_REPLY: RETRIES (7) EXHAUSTED
Arlin Davis [Wed, 11 Sep 2013 22:04:37 +0000 (15:04 -0700)]
mpxyd: CM_REPLY: RETRIES (7) EXHAUSTED

The client's RTU is not processed by mpxyd thread in corner cases.
The SCIF EP, handling the client cm thread (scif_ev_ep) operations,
was not added to select FD set so the op_thread didn't wake up in the
case where RTU's were sent on scif_ev_ep and no operations are
being sent on scif_op_ep.
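
The class of bug described here can be illustrated with a small select() sketch. Ordinary file descriptors stand in for the SCIF endpoints, and wait_and_drain is a hypothetical helper rather than mpxyd code. The point is only that a thread sleeping in select() wakes for a descriptor solely when that descriptor is in the read set.

```c
#include <sys/select.h>
#include <unistd.h>

/*
 * Wait up to tmo_ms for either descriptor to become readable and
 * consume one byte from each ready descriptor.
 * Returns a bitmask (0x1 = op_fd ready, 0x2 = ev_fd ready),
 * 0 on timeout, or -1 on select() failure.
 */
static int wait_and_drain(int op_fd, int ev_fd, long tmo_ms)
{
    fd_set rfds;
    struct timeval tv = { tmo_ms / 1000, (tmo_ms % 1000) * 1000 };
    int maxfd = op_fd > ev_fd ? op_fd : ev_fd;
    int ready = 0;
    char byte;

    FD_ZERO(&rfds);
    FD_SET(op_fd, &rfds);
    FD_SET(ev_fd, &rfds); /* omitting this line reproduces the stall */

    if (select(maxfd + 1, &rfds, NULL, NULL, &tv) < 0)
        return -1;
    if (FD_ISSET(op_fd, &rfds) && read(op_fd, &byte, 1) == 1)
        ready |= 0x1;
    if (FD_ISSET(ev_fd, &rfds) && read(ev_fd, &byte, 1) == 1)
        ready |= 0x2;
    return ready;
}
```

With only op_fd in the set, data arriving on ev_fd alone would leave the caller blocked until the timeout, which matches the retry exhaustion reported in the commit title.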

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: reduce default proxy buffer and max message size
Arlin Davis [Wed, 11 Sep 2013 18:56:09 +0000 (11:56 -0700)]
mpxyd: reduce default proxy buffer and max message size

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: set eager completion on by default
Arlin Davis [Tue, 10 Sep 2013 16:26:17 +0000 (09:26 -0700)]
mpxyd: set eager completion on by default

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agocommon: cleanup async event processing and logging
Arlin Davis [Tue, 10 Sep 2013 16:19:18 +0000 (09:19 -0700)]
common: cleanup async event processing and logging

Add formatted string print for ib verbs async events
Remove unnecessary logging and duplicate async callbacks
Modify all IB providers to use dapli_async_event_cb()

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoRelease intel-mic-ofed-dapl-2.0.36.11-1
Arlin Davis [Fri, 9 Aug 2013 18:18:04 +0000 (11:18 -0700)]
Release intel-mic-ofed-dapl-2.0.36.11-1

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agoallow DAPL_DBG_TYPE settings between device opens
Arlin Davis [Fri, 9 Aug 2013 18:14:50 +0000 (11:14 -0700)]
allow DAPL_DBG_TYPE settings between device opens

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: add warning message and enable counters for CQ/QP
Arlin Davis [Thu, 8 Aug 2013 23:42:04 +0000 (16:42 -0700)]
mpxyd: add warning message and enable counters for CQ/QP

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
10 years agompxyd: add new logging levels, bit mapped for better control
Arlin Davis [Tue, 6 Aug 2013 19:54:38 +0000 (12:54 -0700)]
mpxyd: add new logging levels, bit mapped for better control

 log_level:
 Indicates the amount of detailed data written to the log file.  Log levels
 are bit mapped as follows: 0xf for full verbose

 0x0 - errors always reported
 0x1 - warnings
 0x2 - cm operations
 0x4 - data operations
 0x8 - all other operations

default is still 0 == errors only
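
A check against such a bit-mapped level might look like the sketch below. The bit values come from the list above; the enum and function names are hypothetical and may differ from the identifiers mpxyd uses.

```c
#include <stdbool.h>

/* Bit values from the log_level description; names are illustrative. */
enum log_bits {
    LOG_ERR  = 0x0, /* errors: always reported */
    LOG_WARN = 0x1, /* warnings */
    LOG_CM   = 0x2, /* cm operations */
    LOG_DTO  = 0x4, /* data operations */
    LOG_ALL  = 0x8  /* all other operations */
};

/* Return true when a message of class msg_bits should be written
 * given the configured log_level mask. */
static bool log_enabled(unsigned int log_level, unsigned int msg_bits)
{
    if (msg_bits == LOG_ERR)
        return true; /* errors bypass the mask */
    return (log_level & msg_bits) != 0;
}
```

For example, a log_level of 0x6 would enable cm and data operations while leaving warnings and other operations off, and 0xf enables every class.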

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>