git.openfabrics.org - ~ardavis/dapl.git/log
8 years agodtestx: add dat_ib_open_query only option with -q
Arlin Davis [Fri, 16 Oct 2015 20:08:11 +0000 (13:08 -0700)]
dtestx: add dat_ib_open_query only option with -q

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agoscm: CONN_PENDING: SOCKOPT ERR Connection refused ->
Arlin Davis [Fri, 16 Oct 2015 17:21:19 +0000 (10:21 -0700)]
scm: CONN_PENDING: SOCKOPT ERR Connection refused ->

Error caused by a cm_msg size compatibility issue between the new v8
protocol and older socket cm providers (2.1.4 and older).
The ucm, cma, and mcm providers are not affected.

Modify socket data sizes for SCM request/reply to interoperate
between new v8 with smaller private data and older protocols.

Adjust SCM reply/rtu based on remote CM version and retry a failed
request with pre-v8 adjusted size in case of server side failure.
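
The size negotiation described above can be sketched as follows. This is only an illustration: the byte counts, the function names, and the callback type are hypothetical, since the log does not show the real cm_msg layout or the socket code.

```c
#include <assert.h>
#include <stddef.h>

#define CM_VER_V8        8
#define CM_HDR_SIZE      64u   /* illustrative value, not from the source */
#define CM_PDATA_V8      64u   /* illustrative: v8 carries less private data */
#define CM_PDATA_PRE_V8  256u  /* illustrative value, not from the source */

/* Bytes to put on the socket for a peer speaking the given CM version. */
size_t scm_msg_size(unsigned peer_ver)
{
    return CM_HDR_SIZE + (peer_ver >= CM_VER_V8 ? CM_PDATA_V8 : CM_PDATA_PRE_V8);
}

/* Hypothetical transport hook: returns 0 if the peer accepted len bytes. */
typedef int (*scm_send_fn)(size_t len);

/* Try the v8 size first, then retry once with the pre-v8 size.
 * Returns the protocol version that worked, or -1 if both failed. */
int scm_connect(scm_send_fn try_send)
{
    assert(try_send != NULL);
    if (try_send(scm_msg_size(CM_VER_V8)) == 0)
        return CM_VER_V8;
    if (try_send(scm_msg_size(CM_VER_V8 - 1)) == 0)
        return CM_VER_V8 - 1;
    return -1;
}

/* Stand-in for an old server that only accepts the pre-v8 message size. */
int old_server_send(size_t len)
{
    return len == CM_HDR_SIZE + CM_PDATA_PRE_V8 ? 0 : -1;
}
```

Calling scm_connect(old_server_send) falls back to the older size after the first attempt is refused, which mirrors the retry the commit message describes.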

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agoRelease 2.1.7 dapl-2.1.7-1
Arlin Davis [Wed, 30 Sep 2015 03:23:58 +0000 (20:23 -0700)]
Release 2.1.7

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agodtest: add -a -i options, all data sizes, incremental size
Arlin Davis [Tue, 29 Sep 2015 16:05:27 +0000 (09:05 -0700)]
dtest: add -a -i options, all data sizes, incremental size

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agodapl: Fix segfault while freeing qp
Bharat Potnuri [Tue, 29 Sep 2015 15:49:10 +0000 (08:49 -0700)]
dapl: Fix segfault while freeing qp

In dapls_ib_qp_free(), the pointers qp and cm_ptr->cm_id->qp refer to the same qp
structure, initialized in dapls_ib_qp_alloc(). The memory they point to is freed
twice in dapls_ib_qp_free(): first by rdma_destroy_qp() when _OPENIB_CMA is defined, and
then again by ibv_destroy_qp(), causing a segmentation fault while freeing the qp. Therefore,
assign NULL to qp to avoid freeing memory that has already been released.
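
A minimal sketch of the pattern, using stand-in types and functions rather than the real libibverbs and librdmacm calls (everything prefixed fake_ is hypothetical, as is qp_free itself):

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the verbs objects; these are not the real library types. */
struct fake_qp { int id; };
struct fake_cm_id { struct fake_qp *qp; };

int destroy_calls;  /* counts how many times a QP was released */

/* Stand-in for rdma_destroy_qp(): releases the QP owned by the cm_id. */
void fake_rdma_destroy_qp(struct fake_cm_id *id)
{
    free(id->qp);
    id->qp = NULL;
    destroy_calls++;
}

/* Stand-in for ibv_destroy_qp(). */
void fake_ibv_destroy_qp(struct fake_qp *qp)
{
    free(qp);
    destroy_calls++;
}

/* Free a QP that may be reachable through both qp and cm_id->qp. */
void qp_free(struct fake_qp *qp, struct fake_cm_id *cm_id)
{
    if (cm_id != NULL && cm_id->qp != NULL) {
        assert(cm_id->qp == qp);   /* both names refer to one object */
        fake_rdma_destroy_qp(cm_id);
        qp = NULL;                 /* already released: skip the second path */
    }
    if (qp != NULL)
        fake_ibv_destroy_qp(qp);
}
```

Without the qp = NULL assignment, the second branch would release the same allocation again.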

Fixes: 7ff4f840bf11 ("common: add CM-EP linking to support mutiple CM's and proper protection during
destruction")

Signed-off-by: Bharat Potnuri <bharat@chelsio.com>
Acked-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agompxyd: add P2P inline support for data size <= 96 bytes
Amir Hanania [Wed, 23 Sep 2015 21:43:38 +0000 (14:43 -0700)]
mpxyd: add P2P inline support for data size <= 96 bytes

Improve small message latency for proxy to proxy service
by including data with the proxy work request. Necessary
changes made to preserve order across WRs regardless
of size. Additional logging included. Improves single-byte
one-way latency by about 27% on MFO configurations.
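
As a rough illustration of carrying small payloads inside the work request, the sketch below copies data of 96 bytes or less into the request and keeps a pointer otherwise. The structure and function names are hypothetical; the real mpxyd work request is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define P2P_INLINE_MAX 96u  /* threshold named in the commit title */

/* Hypothetical proxy work request. */
struct proxy_wr {
    uint64_t    seq;                         /* ordering key shared by both paths */
    uint32_t    len;
    int         is_inline;
    uint8_t     inline_data[P2P_INLINE_MAX]; /* payload travels with the request */
    const void *src;                         /* used when the payload is not inline */
};

/* Fill in a work request, choosing the inline path for small payloads. */
void proxy_wr_build(struct proxy_wr *wr, uint64_t seq, const void *buf, uint32_t len)
{
    assert(wr != NULL && (buf != NULL || len == 0));
    memset(wr, 0, sizeof(*wr));
    wr->seq = seq;   /* same sequence space regardless of size */
    wr->len = len;
    if (len <= P2P_INLINE_MAX) {
        wr->is_inline = 1;
        if (len > 0)
            memcpy(wr->inline_data, buf, len);
    } else {
        wr->src = buf;
    }
}
```

Both paths share one sequence number so a consumer can keep inline and non-inline requests in order, which is the ordering concern the message mentions.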

Changes made to avoid forwarding 0-byte rdma write to
scif_writeto, remove CPU hand copies, and order.

Changes for numa_node == -1 such that mic0 assumes MSS
and mic1 assumes MXS modes.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agodtest: change rdma_write_ping_pong so client is always last receiver
Arlin Davis [Mon, 21 Sep 2015 22:48:15 +0000 (15:48 -0700)]
dtest: change rdma_write_ping_pong so client is always last receiver

The server always waits after the test loops for a DREQ event, so to
shut down gracefully the client should always receive the last handshake
message and issue the DREQ. Remove logging in loop.

Always init data and increase min rdma buffer size to 4KB.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agoucm: add DAPL_NETWORK_PROCESS_NUM option for total ranks
Arlin Davis [Mon, 21 Sep 2015 15:24:01 +0000 (08:24 -0700)]
ucm: add DAPL_NETWORK_PROCESS_NUM option for total ranks

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agoucm: fca create group incorrectly using IB addr instead of socket address.
Amir Hanania [Thu, 17 Sep 2015 00:31:13 +0000 (17:31 -0700)]
ucm: fca create group incorrectly using IB addr instead of socket address.

Need the socket address for socket-based create group info exchange.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agoucm: fca_comm_destroy called with NULL
Amir Hanania [Thu, 17 Sep 2015 00:27:27 +0000 (17:27 -0700)]
ucm: fca_comm_destroy called with NULL

In some cases dapli_free_collective_group is called before the comm was initialized,
and the fca_comm_destroy call in this function segfaults.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agodtest: add -W option for rdma write ping-pong, similar to ib_write_lat
Arlin Davis [Tue, 15 Sep 2015 15:45:03 +0000 (08:45 -0700)]
dtest: add -W option for rdma write ping-pong, similar to ib_write_lat

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agodocs: update release notes for collective build
Arlin Davis [Mon, 31 Aug 2015 22:14:46 +0000 (15:14 -0700)]
docs: update release notes for collective build

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agompxyd: reduce log level for rcv message flush
Amir Hanania [Mon, 24 Aug 2015 20:22:53 +0000 (13:22 -0700)]
mpxyd: reduce log level for rcv message flush

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agodapltest: dapltest with no argument not working in ppc64 arch
Carol L Soto [Mon, 24 Aug 2015 19:58:58 +0000 (12:58 -0700)]
dapltest: dapltest with no argument not working in ppc64 arch

If dapltest is run with no args then the client was getting
Warning: conn_event_wait DAT_CONNECTION_EVENT_NON_PEER_REJECTED
Reference to RH1056487- dapltest Read and Write performance
tests are not working

Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
8 years agoRelease 2.1.6 dapl-2.1.6-1
Arlin Davis [Thu, 13 Aug 2015 16:55:47 +0000 (09:55 -0700)]
Release 2.1.6

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agoucm: add cluster size environments to adjust CM timers
Arlin Davis [Thu, 13 Aug 2015 00:30:23 +0000 (17:30 -0700)]
ucm: add cluster size environments to adjust CM timers

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agompxyd: proxy_in data transfers can improperly start before RTU received
Arlin Davis [Wed, 12 Aug 2015 16:46:30 +0000 (09:46 -0700)]
mpxyd: proxy_in data transfers can improperly start before RTU received

Proxy-in data transfers must be deferred until RTU is received
and QP is in CONN state. Otherwise, the remote PI WC address/rkey
information is still uninitialized.

Check for initial CONN state before processing RR or WT data phase
and set RR to pause state until RTU and remote PI WRC information
is processed. Update pi_req_event error logging.
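
The gating rule can be sketched as a small decision function. The state and action names below are hypothetical and only model the check described above.

```c
#include <assert.h>

/* Hypothetical connection states and proxy-in actions. */
enum cm_state  { CM_REQ_SENT, CM_REP_RCVD, CM_CONNECTED, CM_DISCONNECTED };
enum pi_action { PI_PAUSE, PI_PROCESS };

/* Decide whether a proxy-in RR or WT data phase may run now.
 * remote_wrc_valid is nonzero once the peer's WRC address/rkey
 * information has been received and stored. */
enum pi_action pi_next_action(enum cm_state state, int remote_wrc_valid)
{
    assert(state >= CM_REQ_SENT && state <= CM_DISCONNECTED);
    if (state != CM_CONNECTED || !remote_wrc_valid)
        return PI_PAUSE;   /* hold the request until RTU processing completes */
    return PI_PROCESS;
}
```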

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agomcm: forward open/query for MFO devices in query only mode
Arlin Davis [Wed, 12 Aug 2015 16:19:07 +0000 (09:19 -0700)]
mcm: forward open/query for MFO devices in query only mode

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agompxyd: byte swap incorrect on WRC wr_len
Arlin Davis [Wed, 12 Aug 2015 15:51:03 +0000 (08:51 -0700)]
mpxyd: byte swap incorrect on WRC wr_len

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agodtest: remove ERR message from flush QP function
Amir Hanania [Tue, 11 Aug 2015 00:24:15 +0000 (17:24 -0700)]
dtest: remove ERR message from flush QP function

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agodapltest: Quit command with "-n port" number will core dump
David Dai [Fri, 7 Aug 2015 20:05:56 +0000 (13:05 -0700)]
dapltest: Quit command with "-n port" number will core dump

The -n option was specified with "n"; it should be "n:".

Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agoconfig: update dat.conf for MFO qib devices, 2 adapters/ports
Amir Hanania [Wed, 5 Aug 2015 22:01:49 +0000 (15:01 -0700)]
config: update dat.conf for MFO qib devices, 2 adapters/ports

ofa-v2-qib0-1m and libdaplomcm.so

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agompxyd: add MFO support on proxy side
Amir Hanania [Wed, 5 Aug 2015 21:55:30 +0000 (14:55 -0700)]
mpxyd: add MFO support on proxy side

Add checking for MFO and MXS and provide proxy-in and proxy-out
services for each mode. MXS_EP check is now MXF_EP (MFO or MXS).
Add new MIX device open, query, port query, pz operations.
Add new pz list and object management via scif_dev structure.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agomcm: add MFO proxy commands, device, and CM support
Amir Hanania [Wed, 5 Aug 2015 21:46:20 +0000 (14:46 -0700)]
mcm: add MFO proxy commands, device, and CM support

CM will support Proxy-in services on both MFO and MXS modes.
CM thread will not process ibv channels when in MFO mode.

Device open/close will export all verbs calls in MFO mode.

Add MIX (MIC to Proxy) functions for pz, device query, port query.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agomcm: add MFO support to openib_common code base
Amir Hanania [Wed, 5 Aug 2015 20:41:32 +0000 (13:41 -0700)]
mcm: add MFO support to openib_common code base

Provide full proxy support of CQ, QP, PZ, MR and device.
Use the new MXF_EP macro to switch proxy service based
on MXS (cross socket) or MFO (full offload) modes.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agomcm: add full offload (MFO) mode to provider to support qib on MIC
Amir Hanania [Wed, 5 Aug 2015 20:35:28 +0000 (13:35 -0700)]
mcm: add full offload (MFO) mode to provider to support qib on MIC

Add new MIX proxy definitions and commands for query device, query port,
pz create, and pz free.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agodtest: pre-allocated buffer too small for RMR, DTO ops timeout
Amir Hanania [Wed, 5 Aug 2015 20:16:12 +0000 (13:16 -0700)]
dtest: pre-allocated buffer too small for RMR, DTO ops timeout

The buf_len settings (-b) for small IO may cause segfault.
Increase allocation and adjust DTO operations to infinite.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
Signed-off-by: Amir Hanania <amir.hanania@intel.com>
8 years agompxyd: fix buffer initialization when no-inline support is active
Amir Hanania [Fri, 31 Jul 2015 22:35:12 +0000 (15:35 -0700)]
mpxyd: fix buffer initialization when no-inline support is active

wr_buf buffer was zeroed instead of wr_buf_rx

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agompxyd: reduce log level on qp_flush to CM level
Arlin Davis [Thu, 30 Jul 2015 15:16:17 +0000 (08:16 -0700)]
mpxyd: reduce log level on qp_flush to CM level

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agomcm: intra-node proxy missing LID setup on rejects
Arlin Davis [Thu, 30 Jul 2015 15:15:22 +0000 (08:15 -0700)]
mcm: intra-node proxy missing LID setup on rejects

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agomcm: add intra-node support via ibscif device and mcm provider
Arlin Davis [Fri, 24 Jul 2015 23:01:29 +0000 (16:01 -0700)]
mcm: add intra-node support via ibscif device and mcm provider

- New device entry ofa-v2-scif0-m
- Support for different CM and EP locality (MIC vs proxy LID)
- MSS mode for all scif device opens via proxy
- logging changes for multi-lid options

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agomcm: provide MIC address info with proxy device open
Arlin Davis [Fri, 24 Jul 2015 19:48:52 +0000 (12:48 -0700)]
mcm: provide MIC address info with proxy device open

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
8 years agomcm: add device info to non-debug log
Arlin Davis [Fri, 24 Jul 2015 19:45:11 +0000 (12:45 -0700)]
mcm: add device info to non-debug log

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agocommon: add DAPL_DTO_TYPE_EXTENSION_IMM for rdma_write_imm DTO type checking
Arlin Davis [Tue, 14 Jul 2015 22:41:35 +0000 (15:41 -0700)]
common: add DAPL_DTO_TYPE_EXTENSION_IMM for rdma_write_imm DTO type checking

Add new extended DTO type to request cookie to identify rdma write operations
with immediate data during completions.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: fix up some of the PI logging
Arlin Davis [Tue, 14 Jul 2015 22:39:52 +0000 (15:39 -0700)]
mpxyd: fix up some of the PI logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodtest: modify rdma_write_with_msg to support uni-direction streaming
Arlin Davis [Tue, 14 Jul 2015 22:30:16 +0000 (15:30 -0700)]
dtest: modify rdma_write_with_msg to support uni-direction streaming

Add a proper client->server handshake at the end of the rdma data stream
to ensure all data is delivered before disconnecting.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm,mpxyd: fix dreq processing to defer QP flush when proxy WRs still pending
Arlin Davis [Tue, 14 Jul 2015 21:58:32 +0000 (14:58 -0700)]
mcm,mpxyd: fix dreq processing to defer QP flush when proxy WRs still pending

The proxy will now defer DREQ flushing of proxy QPs if PI and PO
data engines have outstanding requests. Add mcm_qp_busy routine
for checking PI and PO data engines. When MIC calls disconnect
always send DREQ up to proxy in order to handle deferred flush
of proxy side posted rcv messages.

Change QP free to modify both local and proxy QPs and check for
outstanding rcv message before qp_destroy to avoid infinite wait
in dapls_ep_flush_cqs.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: update byte_len and comp_cnt for PO to remote HST communications
Arlin Davis [Tue, 14 Jul 2015 21:47:24 +0000 (14:47 -0700)]
mpxyd: update byte_len and comp_cnt for PO to remote HST communications

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: bug fixes for non-inline devices
Amir Hanania [Wed, 17 Jun 2015 17:12:24 +0000 (10:12 -0700)]
mcm: bug fixes for non-inline devices

mcm proxy mi_send_pi set up the registered WR structure properly for no
inline data support but incorrectly overwrote sg.addr with the
WR structure on the stack.

qp create neither checked for missing inline support nor adjusted the create call accordingly.

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: return CM_rej with CM_req_in errors
Arlin Davis [Fri, 12 Jun 2015 20:56:38 +0000 (13:56 -0700)]
mcm: return CM_rej with CM_req_in errors

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd,mcm: RDMA write with immed data not signaled on request side
Arlin Davis [Fri, 5 Jun 2015 19:14:37 +0000 (12:14 -0700)]
mpxyd,mcm: RDMA write with immed data not signaled on request side

With eager completions set, wc_flags is not set properly on the event.
With eager completions not set, the proxy CQ reference is incorrect
and the event is forwarded to the MCM receive EVD instead of the transmit EVD.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: add WC opcode and wc_flags in debug log message
Arlin Davis [Thu, 4 Jun 2015 23:53:59 +0000 (16:53 -0700)]
mcm: add WC opcode and wc_flags in debug log message

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: set options bug fix for mcm_ib_inline
Arlin Davis [Thu, 4 Jun 2015 23:52:11 +0000 (16:52 -0700)]
mpxyd: set options bug fix for mcm_ib_inline

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoUpdate release notes with latest CM times
Arlin Davis [Thu, 28 May 2015 15:22:24 +0000 (08:22 -0700)]
Update release notes with latest CM times

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoRelease 2.1.5 dapl-2.1.5-1
Arlin Davis [Tue, 26 May 2015 17:28:11 +0000 (10:28 -0700)]
Release 2.1.5

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoupdate release notes, readme files
Arlin Davis [Tue, 26 May 2015 17:06:44 +0000 (10:06 -0700)]
update release notes, readme files

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodat.conf: update comments regarding versions
Arlin Davis [Tue, 26 May 2015 16:37:40 +0000 (09:37 -0700)]
dat.conf: update comments regarding versions

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodtest: add logging of provider private data size with -v
Arlin Davis [Fri, 22 May 2015 16:53:23 +0000 (09:53 -0700)]
dtest: add logging of provider private data size with -v

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoscm: remove use of msg.resv field for process id logging
Arlin Davis [Fri, 22 May 2015 16:52:31 +0000 (09:52 -0700)]
scm: remove use of msg.resv field for process id logging

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agocma: report correct CM req private data size on query
Arlin Davis [Fri, 22 May 2015 16:51:04 +0000 (09:51 -0700)]
cma: report correct CM req private data size on query

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: memset ib_wr structure before post_send on WC and WR requests
Arlin Davis [Wed, 20 May 2015 18:56:24 +0000 (11:56 -0700)]
mpxyd: memset ib_wr structure before post_send on WC and WR requests

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: add HST side provider support for device without inline data capability
Arlin Davis [Wed, 20 May 2015 18:43:03 +0000 (11:43 -0700)]
mcm: add HST side provider support for device without inline data capability

Add registered WR buffers for HST->MXS (proxy in) mode
when inline data is not supported by device. Use registered
memory for source WR buffer instead of stack when sending
RDMA write request to peer proxy-in service.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: CM changes for UD extended port space and indexer
Arlin Davis [Mon, 18 May 2015 21:51:08 +0000 (14:51 -0700)]
ucm: CM changes for UD extended port space and indexer

Tested on 1200n 28ppn cluster, AlltoAll Intel MPI, UD mode.
Both static and dynamic modes, over 500m connections.

Change port manager to indexer and service ID manager
to bitarray indexer. Reduces footprint for service IDs
and allow direct lookup on CM messages.

New insert, remove, lookup functions for processing ID
based CM objects. Inbound requests, with the exception
of new CM requests, will no longer parse list but
use hash table lookups.

AH caching is now used to prevent unnecessarily
creating multiple AH's for same QP destination.

Add 24-bit port space support to CM processing code and
to wire protocol via DCM message reserve space.
Add version check to limit to 16-bit for backward compatibility.

Bump CM protocol version to 8 for xport and rtns fields.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: add device support for new port space hash table
Arlin Davis [Mon, 18 May 2015 21:36:28 +0000 (14:36 -0700)]
ucm: add device support for new port space hash table

Allocate port space hash table during device open
when creating CM services. Default settings are set
to 4K entry chunks and 256K total port slots.

Add environment variables for adjustments

DAPL_UCM_ENTRY_BITS 11
DAPL_UCM_ARRAY_BITS 18
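
A sketch of reading such a setting follows. It assumes DAPL_UCM_ARRAY_BITS is a power-of-two exponent, which matches the stated total (2^18 = 256K slots) but is not confirmed by the log. The helper names and the upper bound of 24 are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Read a small non-negative integer from the environment, or fall back
 * to dflt when the variable is unset, malformed, or above max. */
unsigned env_bits(const char *name, unsigned dflt, unsigned max)
{
    assert(name != NULL);
    const char *s = getenv(name);
    if (s == NULL || *s == '\0')
        return dflt;

    char *end;
    unsigned long v = strtoul(s, &end, 10);
    return (*end == '\0' && v <= max) ? (unsigned)v : dflt;
}

/* Total port slots, assuming the variable is a power-of-two exponent. */
unsigned long ucm_port_slots(void)
{
    return 1UL << env_bits("DAPL_UCM_ARRAY_BITS", 18, 24);
}
```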

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: allocate/free AH hash table for UD endpoint types
Arlin Davis [Mon, 18 May 2015 21:34:57 +0000 (14:34 -0700)]
ucm: allocate/free AH hash table for UD endpoint types

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: check for AH caching when destroying via UD extension
Arlin Davis [Mon, 18 May 2015 21:31:57 +0000 (14:31 -0700)]
ucm: check for AH caching when destroying via UD extension

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: optimizations for large scale UD communication management
Arlin Davis [Mon, 18 May 2015 21:21:07 +0000 (14:21 -0700)]
ucm: optimizations for large scale UD communication management

AH caching per QP, AH space set to 48K for LID unicast
Bump port space up to 24 bits
Reduce CM object and reduce private data to 68 bytes
Add xport space and rtns to DCM reserve fields.

New indexer macros for port space hash table management

Add hash table storage to ibtrans device objects
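
The AH caching idea can be sketched with a table indexed by destination LID. InfiniBand unicast LIDs run from 0x0001 to 0xBFFF, which gives the 48K entries mentioned above. The types and functions here are stand-ins, not the provider's real structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LID_UCAST_MAX 0xBFFFu               /* highest IB unicast LID */
#define AH_SLOTS      (LID_UCAST_MAX + 1u)  /* 48K entries, indexed by LID */

struct fake_ah { uint16_t dlid; };          /* stand-in for an address handle */

struct ah_cache {
    struct fake_ah **slot;   /* AH_SLOTS pointers, NULL until first use */
    unsigned created;        /* how many handles were actually created */
};

int ah_cache_init(struct ah_cache *c)
{
    assert(c != NULL);
    c->slot = calloc(AH_SLOTS, sizeof(*c->slot));
    c->created = 0;
    return c->slot ? 0 : -1;
}

/* Return the cached handle for dlid, creating it only on the first request. */
struct fake_ah *ah_cache_get(struct ah_cache *c, uint16_t dlid)
{
    assert(c != NULL && c->slot != NULL);
    if (dlid == 0 || dlid > LID_UCAST_MAX)
        return NULL;   /* not a unicast LID */
    if (c->slot[dlid] == NULL) {
        struct fake_ah *ah = malloc(sizeof(*ah));
        if (ah == NULL)
            return NULL;
        ah->dlid = dlid;
        c->slot[dlid] = ah;
        c->created++;
    }
    return c->slot[dlid];
}

/* Release every cached handle and the table itself. */
void ah_cache_free(struct ah_cache *c)
{
    for (unsigned i = 0; i < AH_SLOTS; i++)
        free(c->slot[i]);
    free(c->slot);
    c->slot = NULL;
}
```

Repeated requests for the same destination return the same handle, so only one is created per LID.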

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: use wr opcode instead of wc opcode to support logging on error cases
Arlin Davis [Fri, 15 May 2015 22:51:31 +0000 (15:51 -0700)]
mpxyd: use wr opcode instead of wc opcode to support logging on error cases

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: HST->MXS mode, using RDMA_WRITE_WITH_IMM, fails with dtest -w
Arlin Davis [Fri, 15 May 2015 22:47:38 +0000 (15:47 -0700)]
mcm: HST->MXS mode, using RDMA_WRITE_WITH_IMM, fails with dtest -w

Host side incorrectly sets opcode to IBV_WR_RDMA_WRITE_WITH_IMM on every segment
instead of just the last segment.
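
The intended behavior can be sketched as follows: split a transfer into segments and mark only the final one as a write with immediate data. The enum values are stand-ins for the IBV_WR_* opcodes, and the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for IBV_WR_RDMA_WRITE and IBV_WR_RDMA_WRITE_WITH_IMM. */
enum seg_op { SEG_RDMA_WRITE, SEG_RDMA_WRITE_WITH_IMM };

struct seg { size_t off; size_t len; enum seg_op op; };

/* Split a transfer of total bytes into pieces of at most seg_sz bytes.
 * Returns the number of segments written, or 0 if out is too small. */
size_t segment_write(size_t total, size_t seg_sz, struct seg *out, size_t max_out)
{
    assert(seg_sz > 0 && out != NULL);
    size_t n = 0, off = 0;

    do {
        size_t len = (total - off < seg_sz) ? total - off : seg_sz;
        if (n == max_out)
            return 0;
        out[n].off = off;
        out[n].len = len;
        off += len;
        /* Only the last segment carries the immediate data, so the
         * receiver sees a single completion for the whole transfer. */
        out[n].op = (off == total) ? SEG_RDMA_WRITE_WITH_IMM : SEG_RDMA_WRITE;
        n++;
    } while (off < total);

    return n;
}
```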

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodapl: aarch64 support for linux
Mark Salter [Wed, 13 May 2015 23:40:58 +0000 (16:40 -0700)]
dapl: aarch64 support for linux

Add atomic ops to fix builds for aarch64 Linux.

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodapltest: add scripts to dist, set default device to IPoIB
Arlin Davis [Tue, 5 May 2015 18:13:15 +0000 (11:13 -0700)]
dapltest: add scripts to dist, set default device to IPoIB

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: add wc_flags to proxy work completions
Arlin Davis [Thu, 30 Apr 2015 21:04:09 +0000 (14:04 -0700)]
mpxyd: add wc_flags to proxy work completions

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoRelease 2.1.4 dapl-2.1.4-1
Arlin Davis [Fri, 20 Mar 2015 00:05:08 +0000 (17:05 -0700)]
Release 2.1.4

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: fix typo in configuration file
Arlin Davis [Thu, 19 Mar 2015 23:54:17 +0000 (16:54 -0700)]
mpxyd: fix typo in configuration file

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agocma: RR attributes moved to common ib_cm struct
Arlin Davis [Thu, 12 Mar 2015 20:02:40 +0000 (16:02 -0400)]
cma: RR attributes moved to common ib_cm struct

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: tx thread incorrectly sleeps with negative pi_rw_cnt value
Arlin Davis [Thu, 12 Mar 2015 20:00:20 +0000 (16:00 -0400)]
mpxyd: tx thread incorrectly sleeps with negative pi_rw_cnt value

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodat.conf: add entries for True Scale qib device
Arlin Davis [Mon, 9 Mar 2015 14:31:13 +0000 (10:31 -0400)]
dat.conf: add entries for True Scale qib device

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: add support for devices without inline data support
Arlin Davis [Thu, 12 Feb 2015 20:21:37 +0000 (15:21 -0500)]
mpxyd: add support for devices without inline data support

Add function to check for inline support during device open.
If inline data is not supported, the CM service and Proxy
data mover will not use inline data option on small IO.

The PO->PI service will now allocate and register necessary
memory to send mcm_wr_rx and mcm_wc_rx operations from
registered memory locations if inline data not supported.
If inline is supported, no extra memory will be allocated
and src buffer will be built on stack as before.

Cleanup some build warnings.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: long disconnect times with many-to-one applications
Arlin Davis [Wed, 4 Feb 2015 00:27:50 +0000 (16:27 -0800)]
ucm: long disconnect times with many-to-one applications

Improve the DREQ/DREP handshake and state machine to handle
DREQ crossings and dropped DREP with new timewait state.

Change dat_ep_disconnect call to ensure non-blocking
regardless of flags or state.

Add adjustable disconnect reply timer and retry count.
DCM_DREP_TIME, DCM_DREQ_RETRY

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoopenib: add inline data support check during device open
Arlin Davis [Thu, 22 Jan 2015 23:49:25 +0000 (15:49 -0800)]
openib: add inline data support check during device open

Not all rdma devices support inline data, however without
a verbs device attribute the only way to determine
support is with a QP create with max_inline_send set.
Add a common function to verify inline data support
before setting default to 64 bytes.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agocleanup ib/cm attribute management across openib providers
Arlin Davis [Tue, 6 Jan 2015 22:01:39 +0000 (14:01 -0800)]
cleanup ib/cm attribute management across openib providers

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodapltest: fix -Werror=format-security issue with printf
Arlin Davis [Tue, 6 Jan 2015 21:43:57 +0000 (13:43 -0800)]
dapltest: fix -Werror=format-security issue with printf

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoRelease 2.1.3 dapl-2.1.3
Arlin Davis [Mon, 15 Dec 2014 20:47:44 +0000 (12:47 -0800)]
Release 2.1.3

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodapl: mpxyd service changes to support multi-thread single-core option
Arlin Davis [Mon, 15 Dec 2014 20:15:54 +0000 (12:15 -0800)]
dapl: mpxyd service changes to support multi-thread single-core option

The proxy service has been changed to reduce the number of cores required
on the host side. Provides new option, via mpxyd.conf, to use single-core
and allow system administrator to bind to specific core id for all Intel
Xeon Phi adapters in the platform.

mcm_affinity = 2 will set to single core (per Intel Xeon Phi).
mcm_affinity_base_mic will set to specific core for all adapters.

Best performance can be achieved with mcm_affinity = 2 and
mcm_affinity_base_mic == 0. This option will cause a single core
to remain busy, polling operations from clients, as long
as the device is open and being used by clients for data
transfers.

Default remains mcm_affinity = 1, multi-thread, multi-core.

See mpxyd.conf for details.

Proxy services work threads have been modified to yield
and limit work processing when data flow is pending.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodapl: add rdma_write_imm and write only option to dtest
Arlin Davis [Mon, 15 Dec 2014 20:05:33 +0000 (12:05 -0800)]
dapl: add rdma_write_imm and write only option to dtest

New write_only (-w) option with rdma_write_imm can
be used with providers that support IB extensions.
Allows more options for write bandwidth profiling
with immediate data and signaling rate options
to increase write data rates, especially on MIC
clients that use proxy services.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: add time wait override capability for CM services
Arlin Davis [Tue, 9 Dec 2014 23:35:59 +0000 (15:35 -0800)]
ucm: add time wait override capability for CM services

New environment variable DAPL_UCM_WAIT_TIME (ms) to
override the default wait_time for CM services.
Default setting is 60 seconds.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agocommon: dapl_ep_free must serialize CM object destroy
Arlin Davis [Tue, 9 Dec 2014 22:40:08 +0000 (14:40 -0800)]
common: dapl_ep_free must serialize CM object destroy

CM object could be destroyed from time_wait state from
provider in separate thread. Destruction must be serialized
with EP lock.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodtestx: allow scale up to 1000 EP's
Arlin Davis [Thu, 13 Nov 2014 18:36:33 +0000 (10:36 -0800)]
dtestx: allow scale up to 1000 EP's

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoucm: RTU not retransmitted in TIMEWAIT state
Arlin Davis [Thu, 13 Nov 2014 18:34:52 +0000 (10:34 -0800)]
ucm: RTU not retransmitted in TIMEWAIT state

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: increase max open files for service
Arlin Davis [Wed, 5 Nov 2014 18:10:55 +0000 (10:10 -0800)]
mpxyd: increase max open files for service

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: DTO completion ERR: status 12, op RDMA_WRITE running MPI alltoall test
Arlin Davis [Fri, 21 Nov 2014 22:26:40 +0000 (14:26 -0800)]
mpxyd: DTO completion ERR: status 12, op RDMA_WRITE running MPI alltoall test

Running MIC scale-up configuration with mcm provider on a MXS node
instead of shm causes DTO error due to heavy use of proxy-in buffer pools.
Hit corner case where proxy buffer management hd ptr crossed tl
ptr due to 64 byte alignment on start when hd < 64 bytes behind tl.

Add additional checking on PO and PI buffer management to handle
the case of HD passing TL on start locations. Also changed PO
processing to hold lock until hd ptr is registered with buf_wc slot
management to preserve order of memory usage across threads.
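
The corner case can be sketched with a simplified ring allocator. This is not the mpxyd buffer manager: the structure, names, and wrap policy are hypothetical. It shows why the check has to use the 64-byte aligned start rather than the raw hd value.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_ALIGN 64u

/* Hypothetical ring: hd is where the next allocation starts,
 * tl is the oldest byte still in use. hd == tl means empty. */
struct ring { size_t size, hd, tl; };

size_t align_up(size_t v)
{
    return (v + BUF_ALIGN - 1) & ~(size_t)(BUF_ALIGN - 1);
}

/* Reserve len bytes at a 64-byte aligned offset.
 * Returns the offset, or -1 if the allocation would reach tl. */
long ring_alloc(struct ring *r, size_t len)
{
    assert(r != NULL && r->size % BUF_ALIGN == 0 && len > 0);
    size_t start = align_up(r->hd);

    if (r->hd < r->tl) {
        /* Free space is [hd, tl). Test the aligned start, not hd:
         * when tl - hd < 64, rounding up alone can step past tl. */
        if (start >= r->tl || len >= r->tl - start)
            return -1;
    } else if (start >= r->size || len >= r->size - start) {
        /* End of the buffer is too small: wrap to offset 0,
         * which must still stop short of tl. */
        if (len >= r->tl)
            return -1;
        start = 0;
    }
    r->hd = start + len;
    return (long)start;
}
```

With hd at 100 and tl at 120, a check on the raw hd would accept an 8-byte request (108 < 120), yet the aligned start of 128 is already past tl. The sketch rejects it.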

Reduced the size of WC queue for PO and PI buffer management.

Profiling, via MCM_PROFILE, was added to monitor and trigger buffer
management errors.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: HST->MXS mode incorrectly signals multiple fragments per WR
Arlin Davis [Mon, 13 Oct 2014 21:10:36 +0000 (14:10 -0700)]
mcm: HST->MXS mode incorrectly signals multiple fragments per WR

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: add segmentation to HST->MXS mode for improved performance
Arlin Davis [Thu, 9 Oct 2014 22:23:24 +0000 (15:23 -0700)]
mcm: add segmentation to HST->MXS mode for improved performance

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: set global seg_sz to 128KB for proxy data service
Arlin Davis [Thu, 9 Oct 2014 22:21:02 +0000 (15:21 -0700)]
mpxyd: set global seg_sz to 128KB for proxy data service

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoopenib: add port_num to provider named attributes
Arlin Davis [Mon, 6 Oct 2014 20:54:39 +0000 (13:54 -0700)]
openib: add port_num to provider named attributes

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: provide CPU family/model attribute on both host and mic sides
Arlin Davis [Mon, 6 Oct 2014 19:50:09 +0000 (12:50 -0700)]
mcm: provide CPU family/model attribute on both host and mic sides

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodtestx: update IB extension example test with new v2.0.9 features
Arlin Davis [Tue, 30 Sep 2014 21:07:52 +0000 (14:07 -0700)]
dtestx: update IB extension example test with new v2.0.9 features

Add support for new IB extensions for CM and AH resource cleanup.
Check for v2.0.9 and call dat_ib_ud_cm_free after connection
establishment and dat_ib_ud_ah_free after all data has been
transferred on UD endpoints.

Also add socket based address exchange to eliminate the need
to include lid and qpn parameters on the client side.

Change the multiple EP mode to send from EP 0 to EP[0-3] on
server side and EP[0-3] to EP[0-3] on client side.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agodtest: add dtestsrq for SRQ example and provider testing
Amir Hanania [Thu, 25 Sep 2014 23:34:20 +0000 (16:34 -0700)]
dtest: add dtestsrq for SRQ example and provider testing

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agocommon: add srq support for openib verbs providers
Amir Hanania [Thu, 25 Sep 2014 23:32:06 +0000 (16:32 -0700)]
common: add srq support for openib verbs providers

Add necessary components and hooks to support ib_verbs shared
receive queues for both RC and UD QP's. External interfaces
were already provided per DAT 2.0 specification but internal
support was missing.

A new dtestsrq will be provided with package for testing and
example code.

Acked-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoopenib: add IB UD cm_free/ah_free extension support in UCM provider
Arlin Davis [Thu, 25 Sep 2014 23:06:33 +0000 (16:06 -0700)]
openib: add IB UD cm_free/ah_free extension support in UCM provider

Make changes to UCM provider for new CM and AH destroy extensions.
Allow consumer to schedule CM object destroy after CM connection
event has been processed. Active side will put CM object in
TIMEWAIT in case RTU is dropped, passive side can schedule
CM object destroy immediately when called. In the case where
consumer requests CM object destroy, the provider will remove
all internal references to AH since consumer will call AH
destroy directly when finished with UD sends.

All other providers, MCM, CMA, SCM will return UNSUPPORTED
if new extensions are called.

See dtestx source for code examples of new extensions.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoopenib: add new TIMEWAIT state for CM
Arlin Davis [Thu, 25 Sep 2014 23:01:33 +0000 (16:01 -0700)]
openib: add new TIMEWAIT state for CM

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoextension: add IB UD extensions to reduce provider CM and AH memory footprint
Arlin Davis [Thu, 25 Sep 2014 22:42:38 +0000 (15:42 -0700)]
extension: add IB UD extensions to reduce provider CM and AH memory footprint

dat_ib_ud_cm_free, dat_ib_ud_ah_free added to allow consumers
the option to free provider CM and AH objects, related to AH resolution,
immediately after consuming CONN events instead of waiting for
EP destroy. With existing UD service providers the CM and AH objects
are linked to EP and not destroyed until consumer calls dat_ep_free.

dat_ib_ud_cm_free() frees CM object after AH and private data are copied
and stored by consumer. Provider will destroy internal object
and memory associated with CM and AH resolution.
MAY be called after CM establishment and before EP destroyed

dat_ib_ud_ah_free() destroys UD Address Handle (AH).
MUST be called after all UD sends are complete and
before UD EP is destroyed.

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd/mcm: add provider specific attribute DAT_IB_PROXY_VERSION
Arlin Davis [Mon, 15 Sep 2014 17:30:56 +0000 (10:30 -0700)]
mpxyd/mcm: add provider specific attribute DAT_IB_PROXY_VERSION

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: log warning if running in COMPAT mode
Arlin Davis [Mon, 15 Sep 2014 17:28:40 +0000 (10:28 -0700)]
mpxyd: log warning if running in COMPAT mode

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoadd provider and proxy support for GUID across platform
Arlin Davis [Fri, 5 Sep 2014 15:07:04 +0000 (08:07 -0700)]
add provider and proxy support for GUID across platform

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agocommon: return appropriate handles with affiliated EP and EVD async events
Arlin Davis [Wed, 3 Sep 2014 22:47:51 +0000 (15:47 -0700)]
common: return appropriate handles with affiliated EP and EVD async events

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoRelease 2.1.2 dapl-2.1.2-1
Arlin Davis [Tue, 2 Sep 2014 21:54:51 +0000 (14:54 -0700)]
Release 2.1.2

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agompxyd: add global routing support for proxy connections
Arlin Davis [Tue, 2 Sep 2014 19:53:23 +0000 (12:53 -0700)]
mpxyd: add global routing support for proxy connections

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agomcm: only call mix_get_attr if running on MIC
Arlin Davis [Tue, 2 Sep 2014 19:52:06 +0000 (12:52 -0700)]
mcm: only call mix_get_attr if running on MIC

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>
9 years agoopenib: modify check for link_layer to handle unspecified
Arlin Davis [Tue, 2 Sep 2014 15:47:29 +0000 (08:47 -0700)]
openib: modify check for link_layer to handle unspecified

Signed-off-by: Arlin Davis <arlin.r.davis@intel.com>