========================================================================
Open Fabrics Enterprise Distribution (OFED)
- MVAPICH2-1.4 in OFED 1.5 Release Notes
+ MVAPICH2-1.4.1 in OFED 1.5.1 Release Notes
- December 2009
+ March 2010
Overview
User Guide
----------
-For more information on using MVAPICH2-1.4, please visit the user guide
-at http://mvapich.cse.ohio-state.edu/support/.
+For more information on using MVAPICH2-1.4.1, please visit the user
+guide at http://mvapich.cse.ohio-state.edu/support/.
Software Dependencies
------------
MVAPICH2 (MPI-2 over InfiniBand and iWARP) is an MPI-2 implementation
-based on MPICH2. MVAPICH2 1.4 is available as a single integrated
-package (with MPICH2 1.0.8p1). This version of MVAPICH2-1.4 for OFED
-has the following changes from MVAPICH2-1.2p1:
-
-MVAPICH2-1.4 (10/29/09)
-
-- Enhancements since mvapich2-1.4rc2
- - Efficient runtime CPU binding
- - Add an environment variable for controlling the use of multiple
- cq's for
- iWARP interface.
- - Add environmental variables to disable registration cache for
- All-to-All
- on large systems.
- - Performance tune for pt-to-pt Intra-node communication with LiMIC2
- - Performance tune for MPI_Broadcast
-
-- Bug fixes since mvapich2-1.4rc2
- - Fix the reading error in lock_get_response by adding
- initialization to req->mrail.protocol
- - Fix mpirun_rsh scalability issue with hierarchical ssh scheme
- when launching greater than 8K processes.
- - Add mvapich_ prefix to yacc functions. This can avoid some
- namespace
- issues when linking with other libraries. Thanks to Manhui Wang
- for contributing the patch.
-
-MVAPICH2-1.4-RC2 (08/31/2009)
-
-- Added Feature: Check-point Restart with Fault-Tolerant Backplane
- Support
- (FTB_CR)
-
-- Added Feature: Multiple CQ-based design for Chelsio iWARP
-
-- Fix for hang with packetized send using RDMA Fast path
-
-- Fix for allowing to use user specified P_Key's (Thanks to Mike Heinz @
- QLogic)
-
-- Fix for allowing mpirun_rsh to accept parameters through the
- parameters file (Thanks to Mike Heinz @ QLogic)
-
-- Distribute LiMIC2-0.5.2 with MVAPICH2. Added flexibility for selecting
- and using a pre-existing installation of LiMIC2
-
-- Modify the default value of shmem_bcast_leaders to 4K
-
-- Fix for one-sided with XRC support
-
-- Fix hang with XRC
-
-- Fix to always enabling MVAPICH2_Sync_Checkpoint functionality
-
-- Increase the amount of command line that mpirun_rsh can handle (Thanks
- for the suggestion by Bill Barth @ TACC)
-
-- Fix build error on RHEL 4 systems (Reported by Nathan Baca and
- Jonathan
- Atencio)
-
-- Fix issue with PGI compilation for PSM interface
-
-- Fix for one-sided accumulate function with user-defined contiguous
- datatypes
-
-- Fix linear/hierarchical switching logic and reduce threshold for the
- enhanced mpirun_rsh framework.
-
-- Clean up intra-node connection management code for iWARP
-
-- Fix --enable-g=all issue with uDAPL interface
-
-- Fix one sided operation with on demand CM.
-
-- Fix VPATH build
-
-MVAPICH2-1.4-RC1 (06/02/2009)
-
-- MPI 2.1 standard compliant
-
-- Based on MPICH2 1.0.8p1
-
-- Dynamic Process Management (DPM) Support with mpirun_rsh and MPD
- - Available for OpenFabrics (IB) interface
-
-- Support for eXtended Reliable Connection (XRC)
- - Available for OpenFabrics (IB) interface
-
-- Kernel-level single-copy intra-node communication support based on
- LiMIC2
- - Delivers superior intra-node performance for medium and
- large messages
- - Available for all interfaces (IB, iWARP and uDAPL)
-
-- Enhancement to mpirun_rsh framework for faster job startup
- on large clusters
- - Hierarchical ssh to nodes to speedup job startup
- - Available for OpenFabrics (IB and iWARP), uDAPL interfaces
- (including Solaris) and the New QLogic-InfiniPath interface
-
-- Scalable checkpoint-restart with mpirun_rsh framework
-
-- Checkpoint-restart with intra-node shared memory (kernel-level with
- LiMIC2) support
- - Available for OpenFabrics (IB) Interface
-
-- K-nomial tree-based solution together with shared memory-based
- broadcast for scalable MPI_Bcast operation
- - Available for all interfaces (IB, iWARP and uDAPL)
-
-- Native support for QLogic InfiniPath
- - Provides support over PSM interface
-
-* Bugs fixed since MVAPICH2-1.2p1
-
- - Changed parameters for iWARP for increased scalability
-
- - Fix error with derived datatypes and Put and Accumulate operations
- Request was being marked complete before data transfer
- had actually taken place when MV_RNDV_PROTOCOL=R3 was used
-
- - Unregister stale memory registrations earlier to prevent
- malloc failures
-
- - Fix for compilation issues with --enable-g=mem and --enable-g=all
-
- - Change dapl_prepost_noop_extra value from 5 to 8 to prevent
- credit flow issues.
-
- - Re-enable RGET (RDMA Read) functionality
-
- - Fix SRQ Finalize error
- Make sure that finalize does not hang when the srq_post_cond is
- being waited on.
-
- - Fix a multi-rail one-sided error when multiple QPs are used
-
- - PMI Lookup name failure with SLURM
-
- - Port auto-detection failure when the 1st HCA did
- not have an active failure
-
- - Change default small message scheduling for multirail
- for higher performance
-
- - MPE support for shared memory collectives now available
+based on MPICH2. MVAPICH2 1.4.1 is available as a single integrated
+package (with MPICH2 1.0.8p1). This version of MVAPICH2-1.4.1 for OFED
+has the following changes from MVAPICH2-1.4:
+
+MVAPICH2-1.4.1 (03/12/10)
+
+* Enhancements since mvapich2-1.4
+ - MPMD launch capability for mpirun_rsh
+ - Portable Hardware Locality (hwloc) support
+ - Patch suggested by Dr. Bernd Kallies
+ - Multi-port support for iWARP
+ - Enhanced iWARP design for scalability to higher process count
+ - Ring based startup support for RDMAoE
+
+* Bug fixes since mvapich2-1.4
+ - Fixes for MPE and other profiling tools
+ - As suggested by Anthony Chan (chan@mcs.anl.gov)
+ - Fixes for finalization issue with dynamic process management
+ - Removed overrides to the PSM_SHAREDCONTEXT and PSM_SHAREDCONTEXTS_MAX
+ variables
+ - Suggested by Ben Truscott
+ - Fix the error check for buffer aliasing in MPI_Reduce
+ - As suggested by Dr. Rajeev Thakur
+ - Fix Totalview integration for RHEL5
+ - Update simplemake to handle build timestamp issues
+ - Fixes for --enable-g={mem, meminit}
+ - Improved logic to control the receive and send requests to handle the
+ limitation of CQ Depth on iWARP
+ - Fixing assertion failures with IMB-EXT tests
+ - VBUF size for very small iWARP clusters bumped up to 33K
+ - Replace internal mallocs with MPIU_Malloc uniformly for correct
+ tracing with --enable-g=mem
+ - Fixing multi-port for iWARP
+ - Fix memory leaks
+ - Shared-memory reduce fixes for MPI_Reduce invoked with MPI_IN_PLACE
+ - Handling RDMA_CM_EVENT_TIMEWAIT_EXIT event
+ - Fix for threaded-ctxdup mpich2 test
+ - Detecting spawn errors
+ - Patch contributed by Dr. Bernd Kallies
+ - IMB-EXT fixes reported by Yutaka from Cray Japan
+ - Fix alltoall assertion error when LiMIC2 is used
Main Verification Flows
-----------------------
-In order to verify the correctness of MVAPICH2-1.4, the following tests
-and parameters were run.
+In order to verify the correctness of MVAPICH2-1.4.1, the following
+tests and parameters were run.
Test Description
====================================================================
OSU Benchmarks OSU's performance tests
IMB Intel's MPI Benchmark test
mpich2 Test suite distributed with MPICH2
-mpitest b_eff test
-Linpack Linpack benchmark
NAS NAS Parallel Benchmarks (NPB3.2)
-NAMD NAMD application
Mailing List