From: Arlin Davis
Date: Fri, 10 Dec 2010 22:26:07 +0000 (-0800)
Subject: Release 2.0.31
X-Git-Tag: dapl-2.0.31-1
X-Git-Url: https://openfabrics.org/gitweb/?a=commitdiff_plain;h=refs%2Ftags%2Fdapl-2.0.31-1;p=~ardavis%2Fdapl.git

Release 2.0.31

Signed-off-by: Arlin Davis
---

diff --git a/ChangeLog b/ChangeLog
index 22b2c63..51992d3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,399 @@
+commit 2bebd3f72ecd73e2df80ca4712f0f470647f196b
+Author: Arlin Davis
+Date:   Fri Dec 10 14:19:45 2010 -0800
+
+    common: clean up build warning for unused variable event_ptr
+
+    Signed-off-by: Arlin Davis
+
+commit 07947b64b776e0c55051bd2db16829fec310db32
+Author: Arlin Davis
+Date:   Fri Dec 10 13:49:47 2010 -0800
+
+    scm, ucm: set RAI_NOROUTE flag with rdma_getaddrinfo() call to avoid blocking.
+
+    if path is not returned, print warning message and use default SL.
+
+    Signed-off-by: Arlin Davis
+
+commit 910f52dc040e56d80aa71ec2220c04d11184a9f7
+Author: Arlin Davis
+Date:   Fri Dec 10 13:47:15 2010 -0800
+
+    cma: definition for dapl_sp_remove_ep() is missing in cm.c
+
+    Signed-off-by: Arlin Davis
+
+commit 04d577b6b5232d1eb7a741caa963e6181008ebd6
+Author: Arlin Davis
+Date:   Mon Dec 6 16:06:47 2010 -0800
+
+    libdat: static provider entries created for local SR database not freed
+
+    During load (dat_sr_init) the SR database is created with all dat.conf entries
+    but are never cleaned up during unload. Add new functions dat_sr_remove_all()
+    and dat_sr_remove() calls to cleanup and deallocate SR database entries and
+    database via dat_sr_fini().
+
+    Signed-off-by: Arlin Davis
+
+commit 9acb2c022a40d9a92a7aad5663be4ad02434de2f
+Author: Arlin Davis
+Date:   Mon Dec 6 16:02:13 2010 -0800
+
+    libdat: memory leak in static registration during parsing
+
+    The platform_params char string, allocated when
+    parsing dat.conf, is not freed.
+
+    Signed-off-by: Arlin Davis
+
+commit 384c1d71872fdd422b64e7b0e1f7b025f18a7ab2
+Author: Arlin Davis
+Date:   Fri Dec 3 16:13:31 2010 -0800
+
+    common: increase default IB inline send threshold to 400
+
+    Signed-off-by: Arlin Davis
+
+commit 70b05fed377f37665d9444d1710c0e251d63bf8a
+Author: Pradeep Satyanarayana
+Date:   Fri Dec 3 15:52:55 2010 -0800
+
+    common cq: a mixup of errno and the -1 return from poll in dapls_wait_comp_channel
+
+    call should return errno and not status returned from poll.
+
+    Signed-off-by: Pradeep Satyanarayana
+
+commit f0984b3134ac6019e849dc3d449fa33c553344a8
+Author: Arlin Davis
+Date:   Fri Dec 3 14:56:21 2010 -0800
+
+    ucm: release UD cm objects after AH is exchanged to avoid duplicate request drops
+
+    When EP is in UD mode, AH resolution is handled with DAT connection semantics
+    connect and accept. Since AH info can be resolved for the same EPs you can
+    get false duplicate requests because a previous CR from is still on the
+    CM processing list. The CM object will remain on the EP free list and not
+    be freed until EP is destroyed given the possibilty of consumer accessing CR
+    private data buffer.
+
+    Signed-off-by: Arlin Davis
+
+commit 334326f1e76c07c08f032df1f69f259a088639dd
+Author: Arlin Davis
+Date:   Fri Dec 3 14:52:26 2010 -0800
+
+    ucm: decrease timeout retry count for disconnect requests
+
+    Signed-off-by: Arlin Davis
+
+commit 48c594ba32507e49e4c28be0fc7094df96b52340
+Author: Arlin Davis
+Date:   Fri Dec 3 14:24:40 2010 -0800
+
+    ucm: hold lock when sending cm_msgs to sync timer start with packet send
+
+    releasing the lock after setting start timer and before
+    ucm_send could result in incorrect timeout on CM operations
+    if thread is scheduled out when releasing lock.
+
+    Signed-off-by: Arlin Davis
+
+commit 4395b4d95e39d460ead28fc7fe76a3cfcbf16935
+Author: Arlin Davis
+Date:   Fri Dec 3 14:02:25 2010 -0800
+
+    ucm: add debugging to include process id for better scale up debug aids
+
+    use part of the resv[] area of the cm_msg to include local and
+    remote process ids. Add more debug messages to help isolate
+    problems related to many process problems.
+
+    Signed-off-by: Arlin Davis
+
+commit c269c9ab83a72a2b4ffa972697b83572410d9cea
+Author: Arlin Davis
+Date:   Fri Dec 3 10:25:46 2010 -0800
+
+    cma: disconnect can block for excessive times waiting for rdma_cm DREP timeout
+
+    rdma_cm uses the same timeout values for connect and disconnect
+    request/reply. Disconnect abrupt option allows DAT consumers to
+    specify a prompt disconnect with immediate event. If the remote
+    node goes down or is non-responsive a CM disconnect event could
+    take minutes. Add a time limit waiting for event and move EP to
+    disconnected state to prevent callback from issuing duplicate
+    disconnect event via callback. The EP to CM linking will
+    cleanup/cancel any pending events before destroying cm_id.
+
+    Signed-off-by: Arlin Davis
+
+commit 4c8275ce6d6243fab26e09bff4227db600197c30
+Author: Arlin Davis
+Date:   Tue Nov 16 14:48:10 2010 -0800
+
+    ucm: configure the recv channel FD to non-blocking
+
+    Signed-off-by: Arlin Davis
+
+commit 551ae2d65d6bf891183b0f9d6c68612120147d07
+Author: Stan Smith
+Date:   Fri Oct 29 10:11:41 2010 -0700
+
+    windows: Missing librdmacm include path for build
+
+    Signed-off-by: stan smith
+
+commit 15148cdd23f641bd7d7642a17f7155c18ba0d76c
+Author: Arlin Davis
+Date:   Thu Oct 28 11:12:33 2010 -0700
+
+    debug build: only timestamp if sending to stdout to avoid performance hit
+
+    Signed-off-by: Arlin Davis
+
+commit f80928c991c6df0b6e9b344067f7f9c004c17263
+Author: Arlin Davis
+Date:   Thu Oct 28 11:11:12 2010 -0700
+
+    common: print out errors on free build and not just debug builds
+
+    Signed-off-by: Arlin Davis
+
+commit 7479741ae44627c32af8f8265507ce589c96d8ed
+Author: Arlin Davis
+Date:   Fri Oct 22 11:58:19 2010 -0700
+
+    cma: fix debug build issue
+
+    Signed-off-by: Arlin Davis
+
+commit 4ae116163655e5042f57ec22233624065d652552
+Author: Arlin Davis
+Date:   Fri Oct 22 10:15:15 2010 -0700
+
+    scm, ucm: MPI spawn test on oversubcribed server taking excessive time to complete
+
+    Simultanious DREQ processing from user and CM thread caused some improper
+    state change on UCM. State change can incorrectly change from FREE back
+    to DISC in certain corner cases. Add checking on internal disconnect call
+    to prevent double callback events and improper state change.
+
+    For SCM, a remote DREQ will shutdown socket which will cause POLLERR
+    on the disconnected FD. This will in turn cause the cm_thread to
+    wakeup continuously unnecessarily. Fix thread thrashing by moving
+    CM object to FREE state and removing object FD from pollfd array.
+
+    Signed-off-by: Arlin Davis
+
+commit e26513e47d75f96de47020951f1813c1a09d238d
+Author: Arlin Davis
+Date:   Fri Oct 22 10:04:21 2010 -0700
+
+    common: add high resolution time stamps and thread id to sdtout debug logs
+
+    Signed-off-by: Arlin Davis
+
+commit 794e3807b019df7163b09745048980632a48fcc2
+Author: ardavis
+Date:   Fri Oct 22 10:01:12 2010 -0700
+
+    common: modify debug in dat_evd_dequeue to reduce noise, only output on non-empty
+
+    Signed-off-by: Arlin Davis
+
+commit 779f2a93f95d7e1d6b18061401cffb0ee56a2b37
+Author: sean.hefty@intel.com
+Date:   Tue Oct 19 13:54:42 2010 -0700
+
+    cma: rdma_destroy_id called twice during device open bind error
+
+    Signed-off-by: Pradeep Satyanarayana
+
+commit 9c4026b3658f4f350d2b9cf4cbbda0cdc8f26482
+Author: ardavis
+Date:   Tue Oct 19 09:52:45 2010 -0700
+
+    common: dat_evd_dequeue (poll_cq) fails with invalid parameter after EP (qp) free
+
+    Failure occured during Intel MPI spawn test on windows.
+    The QP's need to be flushed and processed via EVD's during
+    the EP (QP) destroy to avoid an error on poll_cq. IBAL
+    provider was not moving to ERR state during QP destroy.
+
+    Better flush CQ processing was added and pushed down to the provider
+    level via dapls_ib_qp_free() where it can move QP to ERR, flush CQ,
+    and then free QP after flushing. Because there is no QP_ERR_FLUSH
+    state on a QP the spin on poll_cq (until empty) after modify_qp
+    to ERR could return empty and before all WQE's are flushed. This
+    could result in a CQE being added to CQ with a invalid QP reference.
+    So, an additional check was added to flush_evds for the recv_q to
+    poll_cq until all recv's pending are complete. For transmit_q there
+    is no quarantee that the posted work is signaled and so the best
+    that can be done is poll_cq until empty.
+
+    Signed-off-by: Arlin Davis
+    Signed-off-by: Sean Hefty
+
+commit 3685811d2c481d1e55d79962710f8779553c0d88
+Author: ardavis
+Date:   Mon Oct 11 12:24:31 2010 -0700
+
+    ucm: allow configuration of CM burst (signal) threshold on posting
+
+    Add new DAPL_UCM_TX_BURST environment variable, default=50.
+    Every 50 posted send messages will signal event which
+    is 10 percent blocks of default 500 message limit.
+
+    Signed-off-by: Arlin Davis
+
+commit 167f4d8e50f96f771d1ca50042f61713800e9987
+Author: ardavis
+Date:   Mon Oct 11 12:23:50 2010 -0700
+
+    cma: fix debug build
+
+    Signed-off-by: Arlin Davis
+
+commit af828468ce884615b39afacae4e5270e38fba9a6
+Author: ardavis
+Date:   Thu Oct 7 14:29:21 2010 -0700
+
+    windows: debug version of windows does not build.
+
+    Signed-off-by: Sean Hefty
+
+commit 3da73eccadfbec9b8725c53cf37eb9802342c4d0
+Author: ardavis
+Date:   Thu Oct 7 11:14:03 2010 -0700
+
+    Allow DAPL out of band connection models to use ibacm to obtain
+    path record data. This will enable support for a wider range of
+    topologies, where the SL is required from the SA to prevent
+    deadlock.
+
+    DAPL will obtain path record data using rdma_getaddrinfo, provided
+    that IB ACM support is enabled. On failure, dapl will fall back to
+    using its default SL value. The IB ACM can be configured to cache
+    path information or always query the SA to ensure that the SL that is
+    obtained is current.
+
+    Signed-off-by: Sean Hefty
+
+commit 8be031e229dfc6afcf7b637d2ea78e43048b7223
+Author: ardavis
+Date:   Mon Sep 27 11:12:08 2010 -0700
+
+    ucm: add missing map file for UCM provider
+
+    Signed-off-by: Arlin Davis
+
+commit 78ae5ca6bc5ca16772d70e4c4e7b76982efac0fa
+Author: ardavis
+Date:   Fri Sep 24 10:47:30 2010 -0700
+
+    ibal: delay QP transition during disconnect phase
+
+    ibal provider calls ib_cm_drep in response to receiving
+    a dreq. The result is that the user's QP is transitioned
+    through the error state, which fails any outstanding send
+    operations and flushes all receives. The disconnect request
+    is then reported to the user.
+
+    Since a user can receive errors from the QP before they are
+    aware of a pending disconnect request, the application may
+    respond to the errors as, well, actual errors. Fix this by
+    delaying the QP transition until the user responds to the
+    dreq.
+
+    This fixes an error with Intel MPI running over the ibal
+    dapl provider with a 'spawn' test.
+
+    Signed-off-by: Sean Hefty
+
+commit 17d4b3d1ef11ca4de535e64dfffa20e4771bfb1d
+Author: ardavis
+Date:   Thu Sep 23 13:50:05 2010 -0700
+
+    Revert "ibal: delay QP transition during disconnect phase"
+
+    This reverts commit 4eda455d9bc80c35743b3a2f6773e6c4a500affc.
+
+commit 4eda455d9bc80c35743b3a2f6773e6c4a500affc
+Author: Arlin Davis
+Date:   Wed Sep 22 10:35:24 2010 -0700
+
+    ibal: delay QP transition during disconnect phase
+
+    The ibal provider calls ib_cm_drep in response to receiving
+    a dreq. The result is that the user's QP is transitioned
+    through the error state, which fails any outstanding send
+    operations and flushes all receives. The disconnect request
+    is then reported to the user.
+
+    Since a user can receive errors from the QP before they are
+    aware of a pending disconnect request, the application may
+    respond to the errors as, well, actual errors. Fix this by
+    delaying the QP transition until the user responds to the
+    dreq.
+
+    Signed-off-by: Sean Hefty
+    ---
+
+commit 6fefe33f9e691d4527a6e077ae1fe71bf138a41c
+Author: Arlin Davis
+Date:   Mon Sep 20 10:42:41 2010 -0700
+
+    common: restructure EVD processing to handle EP destruction phase
+
+    EVD processing in the common code will return unformated events
+    if EP context is invalid as a result of destruction. During
+    EP destruction, add changes to flush EVD and process DTO completions
+    before the EP freeing is called. Simplified the locking in the
+    EVD code to eliminate the unecessary and very confusing condition
+    checking of evd_producer_locking_needed.
+
+    new dapls_ep_flush_cqs() call created to syncronize flush and
+    event processing.
+
+    unnecessary KDAPL code removed in the EVD processing.
+
+    Signed-off-by: Arlin Davis
+    Signed-off-by: Sean Hefty
+
+commit 9a52436fab39201ccccc76cde10c0fc6f54f5585
+Author: Arlin Davis
+Date:   Mon Sep 13 16:19:44 2010 -0700
+
+    ibal: sync QP destruction and device close
+
+    Make QP destruction synchronous to ensure that no callbacks are
+    in progress for a QP after dapl has destroyed it. This fixes a
+    use after free error accessing the dapl ep structure from a qp
+    callback that results in an application crash.
+
+    Signed-off-by: Sean Hefty
+
+commit 8b9b644ad2b33f1e21a43be364feee6dd4fc13ec
+Author: Arlin Davis
+Date:   Mon Sep 13 15:06:42 2010 -0700
+
+    ucm: remove unnecessary debug warning in async callback
+
+    The switch() cases print when necessary.
+
+    signed-off-by: stan smith
+
+commit 9073d757198ea2aa43a7b97e75e64fcb1b4c40cf
+Author: Arlin Davis
+Date:   Mon Aug 9 14:28:20 2010 -0700
+
+    Release 2.0.30
+
+    Signed-off-by: Arlin Davis
+
 commit f85be199252d12d27c7b7814771a4ca83a43d0c8
 Author: Arlin Davis
 Date:   Mon Aug 9 14:25:09 2010 -0700
@@ -3860,7 +4256,7 @@ Date:   Tue Dec 4 13:19:27 2007 -0800
     Signed-off by: Arlin Davis

 commit 9bc97e65c1240224d7dc9d6ac9a48e7aed199ee6
-Merge: 11a165a... abb4356...
+Merge: 11a165a abb4356
 Author: Arlin Davis
 Date:   Tue Nov 27 13:31:32 2007 -0800

diff --git a/configure.in b/configure.in
index c8c1422..1b2af46 100644
--- a/configure.in
+++ b/configure.in
@@ -1,11 +1,11 @@
 dnl Process this file with autoconf to produce a configure script.

 AC_PREREQ(2.57)
-AC_INIT(dapl, 2.0.30, linux-rdma@vger.kernel.org)
+AC_INIT(dapl, 2.0.31, linux-rdma@vger.kernel.org)
 AC_CONFIG_SRCDIR([dat/udat/udat.c])
 AC_CONFIG_AUX_DIR(config)
 AM_CONFIG_HEADER(config.h)
-AM_INIT_AUTOMAKE(dapl, 2.0.30)
+AM_INIT_AUTOMAKE(dapl, 2.0.31)

 AM_PROG_LIBTOOL

diff --git a/dapl.spec.in b/dapl.spec.in
index 7867726..d824f24 100644
--- a/dapl.spec.in
+++ b/dapl.spec.in
@@ -144,6 +144,9 @@ fi
 %{_mandir}/man5/*.5*

 %changelog
+* Fri Dec 10 2010 Arlin Davis - 2.0.31
+- DAT/DAPL Version 2.0.31 Release 1, OFED 1.5.3
+
 * Mon Aug 9 2010 Arlin Davis - 2.0.30
 - DAT/DAPL Version 2.0.30 Release 1, OFED 1.5.2 RC4