-========================================================================
+================================================================================
- Open Fabrics Enterprise Distribution (OFED)
- MVAPICH2-1.6 in OFED 1.5.3 Release Notes
+ Open Fabrics Enterprise Distribution (OFED)
+ MVAPICH2-1.7 in OFED 1.5.4 Release Notes
- March 2011
+ October 2011
Overview
--------
-These are the release notes for MVAPICH2-1.6. MVAPICH2 is an MPI-2
-implementation over InfiniBand, iWARP and RoCE (RDMAoE) from the Ohio
-State University (http://mvapich.cse.ohio-state.edu/).
+These are the release notes for MVAPICH2-1.7. MVAPICH2 is an MPI-2
+implementation over InfiniBand, iWARP and RoCE (RDMA over Converged Ethernet)
+from the Ohio State University (http://mvapich.cse.ohio-state.edu/).
User Guide
----------
-For more information on using MVAPICH2-1.6, please visit the user guide
-at http://mvapich.cse.ohio-state.edu/support/.
+For more information on using MVAPICH2-1.7, please visit the user guide at
+http://mvapich.cse.ohio-state.edu/support/.
Software Dependencies
---------------------
-MVAPICH2 depends on the installation of the OFED Distribution stack with
-OpenSM running. The MPI module also requires an established network
-interface (either InfiniBand, IPoIB, iWARP, RoCE uDAPL, or Ethernet).
-BLCR support is needed if built with fault tolerance support. Similarly,
-HWLOC support is needed if built with Portable Hardware Locality feature
-for CPU mapping.
+MVAPICH2 depends on the installation of the OFED Distribution stack with OpenSM
+running. The MPI module also requires an established network interface (either
+InfiniBand, IPoIB, iWARP, RoCE, uDAPL, or Ethernet). BLCR support is needed if
+built with fault tolerance support. Similarly, HWLOC support is needed if built
+with the Portable Hardware Locality feature for CPU mapping.
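+
+As an illustration, a build that enables the optional BLCR-based fault
+tolerance might be configured as follows. This is a minimal sketch: the option
+names follow the MVAPICH2 1.7 user guide, the BLCR install prefix is
+hypothetical, and the exact flags should be verified with ./configure --help.
+
+    $ ./configure --with-device=ch3:mrail --with-rdma=gen2 \
+          --enable-ckpt --with-blcr=/usr/local/blcr
+    $ make && make install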
ChangeLog
---------
-* Features and Enhancements
- - Optimization and enhanced performance for clusters with nVIDIA
- GPU adapters (with and without GPUDirect technology)
- - Support for InfiniBand Quality of Service (QoS) with multiple lanes
- - Support for 3D torus topology with appropriate SL settings
- - For both CH3 and Nemesis interfaces
- - Thanks to Jim Schutt, Marcus Epperson and John Nagle from
- Sandia for the initial patch
- - Enhanced R3 rendezvous protocol
- - For both CH3 and Nemesis interfaces
- - Robust RDMA Fast Path setup to avoid memory allocation
- failures
- - For both CH3 and Nemesis interfaces
- - Multiple design enhancements for better performance of
- small and medium sized messages
- - Using LiMIC2 for efficient intra-node RMA transfer to avoid extra
- memory copies
- - Upgraded to LiMIC2 version 0.5.4
- - Support of Shared-Memory-Nemesis interface on multi-core platforms
- requiring intra-node communication only (SMP-only systems,
- laptops, etc. )
- - Enhancements to mpirun_rsh job start-up scheme on large-scale systems
- - Optimization in MPI_Finalize
- - XRC support with Hydra Process Manager
- - Updated Hydra launcher with MPICH2-1.3.3 Hydra process manager
- - Hydra is the default mpiexec process manager
- - Enhancements and optimizations for one sided Put and Get operations
- - Removing the limitation on number of concurrent windows in RMA
- operations
- - Optimized thresholds for one-sided RMA operations
- - Support for process-to-rail binding policy (bunch, scatter and
- user-defined) in multi-rail configurations (OFA-IB-CH3, OFA-iWARP-CH3,
- and OFA-RoCE-CH3 interfaces)
- - Enhancements to Multi-rail Design and features including striping
- of one-sided messages
- - Dynamic detection of multiple InfiniBand adapters and using these
- by default in multi-rail configurations (OLA-IB-CH3, OFA-iWARP-CH3 and
- OFA-RoCE-CH3 interfaces)
- - Optimized and tuned algorithms for Gather, Scatter, Reduce,
- AllReduce and AllGather collective operations
- - Enhanced support for multi-threaded applications
- - Fast Checkpoint-Restart support with aggregation scheme
- - Job Pause-Migration-Restart Framework for Pro-active Fault-Tolerance
- - Support for new standardized Fault Tolerant Backplane (FTB) Events
- for Checkpoint-Restart and Job Pause-Migration-Restart Framework
- - Enhanced designs for automatic detection of various
- architectures and adapters
- - Configuration file support (similar to the one available in MVAPICH).
- Provides a convenient method for handling all runtime variables
- through a configuration file.
- - User-friendly configuration options to enable/disable various
- checkpoint/restart and migration features
- - Enabled ROMIO's auto detection scheme for filetypes
- on Lustre file system
- - Improved error checking for system and BLCR calls in
- checkpoint-restart and migration code path
- - Enhanced OSU Micro-benchmarks suite (version 3.3)
- - Building and installation of OSU micro benchmarks during default
- MVAPICH2 installation
- - Improved configure help for MVAPICH2 features
- - Improved usability of process to CPU mapping with support of
- delimiters (',' , '-') in CPU listing
- - Thanks to Gilles Civario for the initial patch
- - Use of gfortran as the default F77 compiler
-
-* Bug fixes
- - Fix for shmat() return code check
- - Fix for issues in one-sided RMA
- - Fix for issues with inter-communicator collectives in Nemesis
- - KNEM patch for osu_bibw issue with KNEM version 0.9.2
- - Fix for osu_bibw error with Shared-memory-Nemesis interface
- - Fix for a hang in collective when thread level is set to multiple
- - Fix for intel test errors with rsend, bsend and ssend
- operations in Nemesis
- - Fix for memory free issue when it allocated by scandir
- - Fix for a hang in Finalize
- - Fix for issue with MPIU_Find_local_and_external when it is called
- from MPIDI_CH3I_comm_create
- - Fix for handling CPPFLAGS values with spaces
- - Dynamic Process Management to work with XRC support
- - Fix related to disabling CPU affinity when shared memory is
- turned off at run time
- - Resolving a hang in mpirun_rsh termination when CR is enabled
- - Fixing issue in MPI_Allreduce and Reduce when called with MPI_IN_PLACE
- - Thanks to the initial patch by Alexander Alekhin
- - Fix for threading related errors with comm_dup
- - Fix for alignment issues in RDMA Fast Path
- - Fix for extra memcpy in header caching
- - Only set FC and F77 if gfortran is executable
- - Fix in aggregate ADIO alignment
- - XRC connection management
- - Fixes in registration cache
- - Fixes for multiple memory leaks
- - Fix for issues in mpirun_rsh
- - Checks before enabling aggregation and migration
- - Fixing the build errors with --disable-cxx
- - Thanks to Bright Yang for reporting this issue
+* Features and Enhancements (since MVAPICH2 1.6)
+ - Based on MPICH2-1.4.1p1
+ - Integrated Hybrid (UD-RC/XRC) design to get best performance on
+ large-scale systems with reduced/constant memory footprint
+ - CH3 shared memory channel for standalone hosts (including laptops)
+ without any InfiniBand adapters
+ - HugePage support
+ - Improved intra-node shared memory communication performance
+ - Shared-memory-backed windows for One-Sided Communication
+ - Support for truly passive locking for intra-node RMA in shared memory and
+ LiMIC2-based windows
+ - Improved on-demand InfiniBand connection setup (CH3 and RoCE)
+ - Tuned RDMA Fast Path Buffer size to get better performance with a smaller
+ memory footprint (CH3 and Nemesis)
+ - Support for large data transfers (>2GB)
+ - Integrated with enhanced LiMIC2 (v0.5.5) to support intra-node large
+ message (>2GB) transfers
+ - Optimized Fence synchronization (with and without LiMIC2 support)
+ - Automatic intra-node communication parameter tuning based on platform
+ - Efficient connection set-up for multi-core systems
+ - Enhanced designs and tuning for collectives (bcast, reduce, barrier,
+ gather, allreduce, allgather, gatherv, allgatherv and alltoall)
+ - Support for shared-memory collectives for modern clusters with up to 64
+ cores/node
+ - MPI_THREAD_SINGLE provided by default and MPI_THREAD_MULTIPLE as an
+ option (see the sketch after this list)
+ - Fast process migration using RDMA
+ - Enabling Checkpoint/Restart support in pure SMP mode
+ - Compact shorthand for specifying blocks of processes on the same host
+ with mpirun_rsh (see the example after this list)
+ - Support for the latest stable version of HWLOC (v1.2.2)
+ - Enhanced mpirun_rsh design that avoids race conditions, supports
+ fault-tolerance functionality, and improves debug messages
+ - Enhanced debugging config options to generate core files and back-traces
+ - Automatic inter-node communication parameter tuning based on platform and
+ adapter detection (Nemesis)
+ - Integrated with latest OSU Micro-benchmarks (3.4)
+ - Improved performance for medium sized messages (QLogic PSM interface)
+ - Multi-core-aware collective support (QLogic PSM interface)
+ - Performance optimization for QDR cards
+ - Support for Chelsio T4 Adapter
+ - Support for the EKOPath compiler
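+
+As a concrete example of the mpirun_rsh shorthand mentioned in the list above
+(a sketch following the MVAPICH2 1.7 user guide; the host names and the
+executable are hypothetical), a hostfile entry of the form "host:n" stands
+for n consecutive ranks on that host:
+
+    $ cat hosts
+    node1:4
+    node2:4
+    $ mpirun_rsh -np 8 -hostfile hosts ./a.out
+
+This is equivalent to listing node1 four times followed by node2 four times.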
+
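+To illustrate the threading item above: an application requests its desired
+threading level through the standard MPI-2 interface and must check the level
+the library actually grants (plain MPI usage; nothing MVAPICH2-specific is
+assumed here):
+
+    #include <mpi.h>
+    #include <stdio.h>
+
+    int main(int argc, char **argv)
+    {
+        int provided;
+        /* Ask for full multi-threading; the library may grant a lower
+         * level, e.g. MPI_THREAD_SINGLE, the MVAPICH2 1.7 default. */
+        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
+        if (provided < MPI_THREAD_MULTIPLE)
+            printf("requested MPI_THREAD_MULTIPLE, got level %d\n",
+                   provided);
+        MPI_Finalize();
+        return 0;
+    }
+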
+* Bug fixes (since MVAPICH2 1.6)
+ - Fixes in Checkpoint/Restart and Migration support
+ - Fix Restart when using automatic checkpoint
+ - Thanks to Alexandr for reporting this
+ - Handling very large one-sided transfers using RDMA
+ - Fixes for memory leaks
+ - Graceful handling of unknown HCAs
+ - Better handling of shmem file creation errors
+ - Fix for a hang in intra-node transfer
+ - Fix for a build error with --disable-weak-symbols
+ - Thanks to Peter Willis for reporting this issue
+ - Fixes for one-sided communication with passive target synchronization
+ - Better handling of memory allocation and registration failures
+ - Fixes for compilation warnings
+ - Fix a bug that disallows '=' in mpirun_rsh arguments
+ - Handling of non-contiguous transfer in Nemesis interface
+ - Bug fix in gather collective when ranks are in cyclic order
+ - Fix for the ignore_locks bug in MPI-IO with Lustre
+ - Compiler preference lists reordered to avoid mixing GCC and Intel
+ compilers if both are found by configure
+ - Fix a bug in transferring very large messages (>2GB)
+ - Thanks to Tibor Pausz from Univ. of Frankfurt for reporting it
+ - Fix a hang with One-Sided Put operation
+ - Fix a bug in ptmalloc integration
+ - Avoid double-free crash with mpispawn
+ - Avoid crash and print an error message in mpirun_rsh when the hostfile is
+ empty
+ - Checking for error codes in PMI design
+ - Verify programs can link with LiMIC2 at runtime
+ - Fix for a compilation issue when BLCR or FTB are installed in non-system
+ paths
+ - Fix an issue with RDMA-Migration
+ - Fix a hang with RDMA CM
+ - Fix an issue in supporting RoCE with the second port available on the HCA
+ - Thanks to Jeffrey Konz from HP for reporting it
+ - Fix for a hang with passive RMA tests (QLogic PSM interface)
Main Verification Flows
-----------------------
-In order to verify the correctness of MVAPICH2-1.6, the following tests
+In order to verify the correctness of MVAPICH2-1.7, the following tests
and parameters were run.
Test                    Description
-=======================================================================
+================================================================================
Intel                   Intel's MPI functionality test suite
OSU Benchmarks          OSU's performance tests
IMB                     Intel's MPI Benchmark test
User Guide
----------
-The MVAPICH2 team provides a very detailed guide to build, install and
-use MVAPICH2 on various platforms. The latest version of the user guide
-can be obtained from http://mvapich.cse.ohio-state.edu/support/
+The MVAPICH2 team provides a very detailed guide to build, install and use
+MVAPICH2 on various platforms. The latest version of the user guide can be
+obtained from http://mvapich.cse.ohio-state.edu/support/
Mailing List
------------
-There is a public mailing list mvapich-discuss@cse.ohio-state.edu for
-mvapich users and developers to
+There is a public mailing list mvapich-discuss@cse.ohio-state.edu for mvapich
+users and developers to
- Ask for help and support from each other and get prompt responses
- Contribute patches and enhancements
-========================================================================
+================================================================================