From: Jonathan Perkins
Date: Fri, 28 Oct 2011 05:41:01 +0000 (+0200)
Subject: Updated mvapich2 release notes
X-Git-Url: https://openfabrics.org/gitweb/?a=commitdiff_plain;h=091f1fc59649ef96ac192ba9ca894063a7d9e7b9;p=~ardavis%2Fofed_docs%2F.git

Updated mvapich2 release notes

Signed-off-by: Jonathan Perkins
---

diff --git a/release_notes/mvapich2_release_notes.txt b/release_notes/mvapich2_release_notes.txt
index e1b47d0..7e1339a 100644
--- a/release_notes/mvapich2_release_notes.txt
+++ b/release_notes/mvapich2_release_notes.txt
@@ -1,146 +1,126 @@
-========================================================================
+================================================================================
 
-            Open Fabrics Enterprise Distribution (OFED)
-              MVAPICH2-1.6 in OFED 1.5.3 Release Notes
+                  Open Fabrics Enterprise Distribution (OFED)
+                    MVAPICH2-1.7 in OFED 1.5.4 Release Notes
 
-                             March 2011
+                            October 2011
 
 
 Overview
 --------
 
-These are the release notes for MVAPICH2-1.6. MVAPICH2 is an MPI-2
-implementation over InfiniBand, iWARP and RoCE (RDMAoE) from the Ohio
-State University (http://mvapich.cse.ohio-state.edu/).
+These are the release notes for MVAPICH2-1.7. MVAPICH2 is an MPI-2
+implementation over InfiniBand, iWARP and RoCE (RDMA over Converged Ethernet)
+from the Ohio State University (http://mvapich.cse.ohio-state.edu/).
 
 User Guide
 ----------
 
-For more information on using MVAPICH2-1.6, please visit the user guide
-at http://mvapich.cse.ohio-state.edu/support/.
+For more information on using MVAPICH2-1.7, please visit the user guide at
+http://mvapich.cse.ohio-state.edu/support/.
 
 Software Dependencies
 ---------------------
 
-MVAPICH2 depends on the installation of the OFED Distribution stack with
-OpenSM running. The MPI module also requires an established network
-interface (either InfiniBand, IPoIB, iWARP, RoCE uDAPL, or Ethernet).
-BLCR support is needed if built with fault tolerance support. Similarly,
-HWLOC support is needed if built with Portable Hardware Locality feature
-for CPU mapping.
+MVAPICH2 depends on the installation of the OFED Distribution stack with OpenSM
+running. The MPI module also requires an established network interface (either
+InfiniBand, IPoIB, iWARP, RoCE, uDAPL, or Ethernet). BLCR support is needed if
+built with fault tolerance support. Similarly, HWLOC support is needed if built
+with the Portable Hardware Locality feature for CPU mapping.
 
 ChangeLog
 ---------
 
-* Features and Enhancements
-  - Optimization and enhanced performance for clusters with nVIDIA
-    GPU adapters (with and without GPUDirect technology)
-  - Support for InfiniBand Quality of Service (QoS) with multiple lanes
-  - Support for 3D torus topology with appropriate SL settings
-    - For both CH3 and Nemesis interfaces
-    - Thanks to Jim Schutt, Marcus Epperson and John Nagle from
-      Sandia for the initial patch
-  - Enhanced R3 rendezvous protocol
-    - For both CH3 and Nemesis interfaces
-  - Robust RDMA Fast Path setup to avoid memory allocation
-    failures
-    - For both CH3 and Nemesis interfaces
-  - Multiple design enhancements for better performance of
-    small and medium sized messages
-  - Using LiMIC2 for efficient intra-node RMA transfer to avoid extra
-    memory copies
-  - Upgraded to LiMIC2 version 0.5.4
-  - Support of Shared-Memory-Nemesis interface on multi-core platforms
-    requiring intra-node communication only (SMP-only systems,
-    laptops, etc.)
-  - Enhancements to mpirun_rsh job start-up scheme on large-scale systems
-  - Optimization in MPI_Finalize
-  - XRC support with Hydra Process Manager
-  - Updated Hydra launcher with MPICH2-1.3.3 Hydra process manager
-    - Hydra is the default mpiexec process manager
-  - Enhancements and optimizations for one sided Put and Get operations
-  - Removing the limitation on number of concurrent windows in RMA
-    operations
-  - Optimized thresholds for one-sided RMA operations
-  - Support for process-to-rail binding policy (bunch, scatter and
-    user-defined) in multi-rail configurations (OFA-IB-CH3, OFA-iWARP-CH3,
-    and OFA-RoCE-CH3 interfaces)
-  - Enhancements to Multi-rail Design and features including striping
-    of one-sided messages
-  - Dynamic detection of multiple InfiniBand adapters and using these
-    by default in multi-rail configurations (OLA-IB-CH3, OFA-iWARP-CH3 and
-    OFA-RoCE-CH3 interfaces)
-  - Optimized and tuned algorithms for Gather, Scatter, Reduce,
-    AllReduce and AllGather collective operations
-  - Enhanced support for multi-threaded applications
-  - Fast Checkpoint-Restart support with aggregation scheme
-  - Job Pause-Migration-Restart Framework for Pro-active Fault-Tolerance
-  - Support for new standardized Fault Tolerant Backplane (FTB) Events
-    for Checkpoint-Restart and Job Pause-Migration-Restart Framework
-  - Enhanced designs for automatic detection of various
-    architectures and adapters
-  - Configuration file support (similar to the one available in MVAPICH).
-    Provides a convenient method for handling all runtime variables
-    through a configuration file.
-  - User-friendly configuration options to enable/disable various
-    checkpoint/restart and migration features
-  - Enabled ROMIO's auto detection scheme for filetypes
-    on Lustre file system
-  - Improved error checking for system and BLCR calls in
-    checkpoint-restart and migration code path
-  - Enhanced OSU Micro-benchmarks suite (version 3.3)
-  - Building and installation of OSU micro benchmarks during default
-    MVAPICH2 installation
-  - Improved configure help for MVAPICH2 features
-  - Improved usability of process to CPU mapping with support of
-    delimiters (',' , '-') in CPU listing
-    - Thanks to Gilles Civario for the initial patch
-  - Use of gfortran as the default F77 compiler
-
-* Bug fixes
-  - Fix for shmat() return code check
-  - Fix for issues in one-sided RMA
-  - Fix for issues with inter-communicator collectives in Nemesis
-  - KNEM patch for osu_bibw issue with KNEM version 0.9.2
-  - Fix for osu_bibw error with Shared-memory-Nemesis interface
-  - Fix for a hang in collective when thread level is set to multiple
-  - Fix for intel test errors with rsend, bsend and ssend
-    operations in Nemesis
-  - Fix for memory free issue when it allocated by scandir
-  - Fix for a hang in Finalize
-  - Fix for issue with MPIU_Find_local_and_external when it is called
-    from MPIDI_CH3I_comm_create
-  - Fix for handling CPPFLAGS values with spaces
-  - Dynamic Process Management to work with XRC support
-  - Fix related to disabling CPU affinity when shared memory is
-    turned off at run time
-  - Resolving a hang in mpirun_rsh termination when CR is enabled
-  - Fixing issue in MPI_Allreduce and Reduce when called with MPI_IN_PLACE
-    - Thanks to the initial patch by Alexander Alekhin
-  - Fix for threading related errors with comm_dup
-  - Fix for alignment issues in RDMA Fast Path
-  - Fix for extra memcpy in header caching
-  - Only set FC and F77 if gfortran is executable
-  - Fix in aggregate ADIO alignment
-  - XRC connection management
-  - Fixes in registration cache
-  - Fixes for multiple memory leaks
-  - Fix for issues in mpirun_rsh
-  - Checks before enabling aggregation and migration
-  - Fixing the build errors with --disable-cxx
-    - Thanks to Bright Yang for reporting this issue
+* Features and Enhancements (since MVAPICH2 1.6)
+  - Based on MPICH2-1.4.1p1
+  - Integrated Hybrid (UD-RC/XRC) design to get the best performance on
+    large-scale systems with reduced/constant memory footprint
+  - CH3 shared memory channel for standalone hosts (including laptops)
+    without any InfiniBand adapters
+  - HugePage support
+  - Improved intra-node shared memory communication performance
+  - Shared memory backed windows for One-Sided Communication
+  - Support for truly passive locking for intra-node RMA in shared memory and
+    LIMIC-based windows
+  - Improved on-demand InfiniBand connection setup (CH3 and RoCE)
+  - Tuned RDMA Fast Path Buffer size to get better performance with a smaller
+    memory footprint (CH3 and Nemesis)
+  - Support for large data transfers (>2GB)
+  - Integrated with enhanced LiMIC2 (v0.5.5) to support intra-node large
+    message (>2GB) transfers
+  - Optimized Fence synchronization (with and without LIMIC2 support)
+  - Automatic intra-node communication parameter tuning based on platform
+  - Efficient connection set-up for multi-core systems
+  - Enhanced designs and tuning for collectives (bcast, reduce, barrier,
+    gather, allreduce, allgather, gatherv, allgatherv and alltoall)
+  - Support for shared-memory collectives for modern clusters with up to 64
+    cores/node
+  - MPI_THREAD_SINGLE provided by default and MPI_THREAD_MULTIPLE as an
+    option
+  - Fast process migration using RDMA
+  - Enabling Checkpoint/Restart support in pure SMP mode
+  - Compact and shorthand way to specify blocks of processes on the same host
+    with mpirun_rsh
+  - Support for the latest stable version of HWLOC (v1.2.2)
+  - Enhanced mpirun_rsh design to avoid race conditions, with support for
+    fault-tolerance functionality and improved debug messages
+  - Enhanced debugging config options to generate core files and back-traces
+  - Automatic inter-node communication parameter tuning based on platform and
+    adapter detection (Nemesis)
+  - Integrated with the latest OSU Micro-benchmarks (3.4)
+  - Improved performance for medium-sized messages (QLogic PSM interface)
+  - Multi-core-aware collective support (QLogic PSM interface)
+  - Performance optimization for QDR cards
+  - Support for Chelsio T4 Adapter
+  - Support for Ekopath Compiler
+
+* Bug fixes (since MVAPICH2 1.6)
+  - Fixes in Checkpoint/Restart and Migration support
+    - Fix Restart when using automatic checkpoint
+    - Thanks to Alexandr for reporting this
+  - Handling very large one-sided transfers using RDMA
+  - Fixes for memory leaks
+  - Graceful handling of unknown HCAs
+  - Better handling of shmem file creation errors
+  - Fix for a hang in intra-node transfer
+  - Fix for a build error with --disable-weak-symbols
+    - Thanks to Peter Willis for reporting this issue
+  - Fixes for one-sided communication with passive target synchronization
+  - Better handling of memory allocation and registration failures
+  - Fixes for compilation warnings
+  - Fix a bug that disallows '=' in mpirun_rsh arguments
+  - Handling of non-contiguous transfer in Nemesis interface
+  - Bug fix in gather collective when ranks are in cyclic order
+  - Fix for the ignore_locks bug in MPI-IO with Lustre
+  - Compiler preference lists reordered to avoid mixing GCC and Intel
+    compilers if both are found by configure
+  - Fix a bug in transferring very large messages (>2GB)
+    - Thanks to Tibor Pausz from Univ. of Frankfurt for reporting it
+  - Fix a hang with One-Sided Put operation
+  - Fix a bug in ptmalloc integration
+  - Avoid double-free crash with mpispawn
+  - Avoid crash and print an error message in mpirun_rsh when the hostfile is
+    empty
+  - Checking for error codes in PMI design
+  - Verify that programs can link with LiMIC2 at runtime
+  - Fix for a compilation issue when BLCR or FTB is installed in
+    non-system paths
+  - Fix an issue with RDMA-Migration
+  - Fix a hang with RDMA CM
+  - Fix an issue in supporting RoCE with the second port available on the HCA
+    - Thanks to Jeffrey Konz from HP for reporting it
+  - Fix for a hang with passive RMA tests (QLogic PSM interface)
 
 Main Verification Flows
 -----------------------
 
-In order to verify the correctness of MVAPICH2-1.6, the following tests
+In order to verify the correctness of MVAPICH2-1.7, the following tests
 and parameters were run.
 
 Test                    Description
-========================================================================
+================================================================================
 Intel                   Intel's MPI functionality test suite
 OSU Benchmarks          OSU's performance tests
 IMB                     Intel's MPI Benchmark test
@@ -152,16 +132,16 @@ SPEC MPI2007            SPEC's benchmark suite for MPI
 
 User Guide
 ----------
 
-The MVAPICH2 team provides a very detailed guide to build, install and
-use MVAPICH2 on various platforms. The latest version of the user guide
-can be obtained from http://mvapich.cse.ohio-state.edu/support/
+The MVAPICH2 team provides a very detailed guide to build, install and use
+MVAPICH2 on various platforms. The latest version of the user guide can be
+obtained from http://mvapich.cse.ohio-state.edu/support/
 
 Mailing List
 ------------
 
-There is a public mailing list mvapich-discuss@cse.ohio-state.edu for
-mvapich users and developers to
+There is a public mailing list mvapich-discuss@cse.ohio-state.edu for mvapich
+users and developers to
 
   - Ask for help and support from each other and get prompt response
   - Contribute patches and enhancements
 
-========================================================================
+================================================================================
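
The notes above list the suites used to verify this release. As a quick local
sanity check of an MVAPICH2 installation, a minimal MPI program can be built
with the mpicc wrapper and launched with mpiexec or mpirun_rsh. The sketch
below is illustrative only (it is not part of the patch above nor of the test
suites it lists); it assumes MVAPICH2 is installed with its bin directory in
PATH, and the file name hello_mvapich2.c is arbitrary.

    /*
     * hello_mvapich2.c - minimal MPI sanity check (illustrative sketch).
     *
     * Build:  mpicc -o hello_mvapich2 hello_mvapich2.c
     * Run:    mpiexec -n 4 ./hello_mvapich2
     *   or:   mpirun_rsh -np 4 host1 host2 host3 host4 ./hello_mvapich2
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank, size, namelen;
        char name[MPI_MAX_PROCESSOR_NAME];

        /* Request MPI_THREAD_MULTIPLE; the release notes describe it as an
         * optional thread level, so 'provided' reports what this particular
         * build actually grants. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &namelen);

        printf("rank %d of %d on %s (thread level provided: %d)\n",
               rank, size, name, provided);

        MPI_Finalize();
        return 0;
    }

Every rank should print one line; if the job hangs or fails to start, consult
the user guide and mailing list referenced above.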