From d582eb10a389b8787cf6c2327d13e3c3733dbb4b Mon Sep 17 00:00:00 2001 From: Tziporet Koren Date: Tue, 19 May 2009 17:19:23 +0300 Subject: [PATCH] Update for OMPI 1.3.2 Signed-off-by: Jeff Squyres --- MPI_README.txt | 44 +- open_mpi_release_notes.txt | 1050 ++++++++++++++++++++++++++++-------- 2 files changed, 845 insertions(+), 249 deletions(-) diff --git a/MPI_README.txt b/MPI_README.txt index ac96446..4105a33 100644 --- a/MPI_README.txt +++ b/MPI_README.txt @@ -1,7 +1,7 @@ Open Fabrics Enterprise Distribution (OFED) - MPI in OFED 1.4.0 README + MPI in OFED 1.4.1 README - December 2008 + May 2009 =============================================================================== @@ -18,7 +18,7 @@ Table of Contents =============================================================================== Three MPI stacks are included in this release of OFED: - MVAPICH 1.1.0-3143 -- Open MPI 1.2.8 +- Open MPI 1.3.2 - MVAPICH2 1.2p1 Setup, compilation and run information of MVAPICH, Open MPI and MVAPICH2 is @@ -194,7 +194,7 @@ rsh, add to the mpirun_rsh command the parameter: -rsh =============================================================================== Open MPI is a next-generation MPI implementation from the Open MPI -Project (http://www.open-mpi.org/). Version 1.2.8 of Open MPI is +Project (http://www.open-mpi.org/). Version 1.3.2 of Open MPI is included in this release, which is also available directly from the main Open MPI web site. @@ -208,8 +208,8 @@ for the compiler with which to build the Open MPI RPM. Note that more than one compiler can be selected simultaneously, if desired. Users should check the main Open MPI web site for additional -documentation and support. (Note: The FAQ file considers -InfiniBand tuning among other issues.) +documentation and support. (Note: The FAQ file considers OpenFabrics +tuning among other issues.) 3.1 Setting up for Open MPI --------------------------- @@ -338,33 +338,27 @@ options that can be tuned to obtain optimal performance of your MPI applications (see the Open MPI web site / FAQ for more information: http://www.open-mpi.org/faq/). -It is worth noting that the "mpi_leave_pinned" run-time tunable -parameter is usually *very* good for running benchmarks, but can -actually be detrimental to real-world MPI applications -- and is -therefore disabled by default. When running the benchmarks listed -below, it is advistable enable the "mpi_leave_pinned" option in order -to see maximum performance (*). + <np> - is an integer indicating how many MPI processes to run (e.g., 2) + <hostfile> - is the filename of a hostfile, as described above Example 1: Running the OSU bandwidth: - > cd /usr/mpi/gcc/openmpi-1.2.8/tests/osu_benchmarks-3.0 - > mpirun -np <np> --mca mpi_leave_pinned 1 -hostfile <hostfile> osu_bw + > cd /usr/mpi/gcc/openmpi-1.3.2/tests/osu_benchmarks-3.0 + > mpirun -np <np> -hostfile <hostfile> osu_bw Example 2: Running the Intel MPI Benchmark benchmarks: - > cd /usr/mpi/gcc/openmpi-1.2.8/tests/IMB-3.1 - > mpirun -np <np> --mca mpi_leave_pinned 1 -hostfile <hostfile> IMB-MPI1 + > cd /usr/mpi/gcc/openmpi-1.3.2/tests/IMB-3.1 + > mpirun -np <np> -hostfile <hostfile> IMB-MPI1 -Example 3: Running the Presta benchmarks: + --> Note that the version of IMB-EXT that ships in this version of + OFED contains a bug that will cause it to immediately error + out when run with Open MPI.
- > cd /usr/mpi/gcc/openmpi-1.2.8/tests/presta-1.4.0 - > mpirun -np --mca mpi_leave_pinned 1 -hostfile com -o 100 +Example 3: Running the Presta benchmarks: -(*) The "mpi_leave_pinned" option can increase bandwidth and decrease - latency for applications that repeatedly send and/or receive from - the same buffers. If your application does not repeatedly - send/receive from the same buffers, mpi_leave_pinned will likely - have little effect on your performance. + > cd /usr/mpi/gcc/openmpi-1.3.2/tests/presta-1.4.0 + > mpirun -np -hostfile com -o 100 3.5 More Open MPI Information ----------------------------- @@ -381,8 +375,6 @@ page for more information: http://www.open-mpi.org/community/help/ - - =============================================================================== 4. MVAPICH2 MPI =============================================================================== diff --git a/open_mpi_release_notes.txt b/open_mpi_release_notes.txt index b8fd016..1c9c5bd 100644 --- a/open_mpi_release_notes.txt +++ b/open_mpi_release_notes.txt @@ -1,30 +1,47 @@ Open Fabrics Enterprise Distribution (OFED) - Open MPI in OFED 1.4 Copyrights, License, and Release Notes + Open MPI in OFED 1.4.1 Copyrights, License, and Release Notes - December 2008 - + May 2009 Open MPI Copyrights ------------------- -Copyright (c) 2004-2007 The Trustees of Indiana University and Indiana +Most files in this release are marked with the copyrights of the +organizations who have edited them. The copyrights below generally +reflect members of the Open MPI core team who have contributed code to +this release. The copyrights for code used under license from other +parties are included in the corresponding files. + +Copyright (c) 2004-2008 The Trustees of Indiana University and Indiana University Research and Technology Corporation. All rights reserved. -Copyright (c) 2004-2007 The University of Tennessee and The University +Copyright (c) 2004-2009 The University of Tennessee and The University of Tennessee Research Foundation. All rights reserved. -Copyright (c) 2004-2006 High Performance Computing Center Stuttgart, +Copyright (c) 2004-2008 High Performance Computing Center Stuttgart, University of Stuttgart. All rights reserved. -Copyright (c) 2004-2006 The Regents of the University of California. +Copyright (c) 2004-2007 The Regents of the University of California. All rights reserved. -Copyright (c) 2006-2007 Los Alamos National Security, LLC. All rights +Copyright (c) 2006-2009 Los Alamos National Security, LLC. All rights reserved. -Copyright (c) 2006-2007 Cisco Systems, Inc. All rights reserved. -Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved. -Copyright (c) 2006 Sandia National Laboratories. All rights reserved. -Copyright (c) 2006-2007 Sun Microsystems, Inc. All rights reserved. +Copyright (c) 2006-2009 Cisco Systems, Inc. All rights reserved. +Copyright (c) 2006-2008 Voltaire, Inc. All rights reserved. +Copyright (c) 2006-2008 Sandia National Laboratories. All rights reserved. +Copyright (c) 2006-2009 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms. -Copyright (c) 2006-2007 The University of Houston. All rights reserved. -Copyright (c) 2006 Myricom, Inc. All rights reserved. +Copyright (c) 2006-2009 The University of Houston. All rights reserved. +Copyright (c) 2006-2008 Myricom, Inc. All rights reserved. +Copyright (c) 2007-2008 UT-Battelle, LLC. All rights reserved. +Copyright (c) 2007-2008 IBM Corporation. All rights reserved. 
+Copyright (c) 1998-2005 Forschungszentrum Juelich, Juelich Supercomputing + Centre, Federal Republic of Germany +Copyright (c) 2005-2008 ZIH, TU Dresden, Federal Republic of Germany +Copyright (c) 2007 Evergrid, Inc. All rights reserved. +Copyright (c) 2008 Institut National de Recherche en + Informatique. All rights reserved. +Copyright (c) 2007 Lawrence Livermore National Security, LLC. + All rights reserved. +Copyright (c) 2007-2009 Mellanox Technologies. All rights reserved. +Copyright (c) 2006 QLogic Corporation. All rights reserved. Additional copyrights may follow @@ -67,6 +84,12 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. =========================================================================== +When submitting questions and problems, be sure to include as much +extra information as possible. This web page details all the +information that we request in order to provide assistance: + + http://www.open-mpi.org/community/help/ + The best way to report bugs, send comments, or ask questions is to sign up on the user's and/or developer's mailing list (for user-level and developer-level questions; when in doubt, send to the user's @@ -97,17 +120,6 @@ Much, much more information is also available in the Open MPI FAQ: OFED-Specific Release Notes --------------------------- -** iWARP support - -The version of Open MPI included in OFED 1.4 does not include iWARP -support. iWARP support is included in the upcoming Open MPI v1.3 -release (which was not ready in time for the OFED 1.4 release); -please see http://www.open-mpi.org/ for updates. - -See the "Installing newer versions of Open MPI after OFED is -installed" section, below, for details about how to download and -install newer versions of Open MPI from its web site. - ** SLES 10 with Pathscale compiler support: Using the Pathscale compiler to build Open MPI on SLES10 may result in @@ -142,33 +154,265 @@ running (the exact version numbers displayed may be different; the important part is that the "openib" BTL is displayed): shell$ ompi_info | grep openib - MCA btl: openib (MCA v1.0, API v1.0.1, Component v1.2.8) + MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.2) See the rest of the documentation below for other configure command line options and installation instructions. -** OFED 1.4 bug fixes (corresponding to Open MPI v1.2.7 and v1.2.8): - -v1.2.8 -- Tweaked one memory barrier in the openib component to be more - conservative. May fix a problem observed on PPC machines. See - ticket #1532. +** Changelog summary + +Showing versions 1.2.7 - 1.3.2; see the "NEWS" file in an Open MPI +distribution for the full list. + +1.3.2 +----- + +- Fixed a potential infinite loop in the openib BTL that could occur + in senders in some frequent-communication scenarios. Thanks to Don + Wood for reporting the problem. +- Add a new checksum PML variation on ob1 (main MPI point-to-point + communication engine) to detect memory corruption in node-to-node + messages +- Add a new configuration option to add padding to the openib + header so the data is aligned +- Add a new configuration option to use an alternative checksum algo + when using the checksum PML +- Fixed a problem reported by multiple users on the mailing list that + the LSF support would fail to find the appropriate libraries at + run-time. +- Allow empty shell designations from getpwuid(). Thanks to Sergey + Koposov for the bug report. +- Ensure that mpirun exits with non-zero status when applications die + due to user signal. 
Thanks to Geoffroy Pignot for suggesting the + fix. +- Ensure that MPI_VERSION / MPI_SUBVERSION match what is returned by + MPI_GET_VERSION. Thanks to Rob Egan for reporting the error. +- Updated MPI_*KEYVAL_CREATE functions to properly handle Fortran + extra state. +- A variety of ob1 (main MPI point-to-point communication engine) bug + fixes that could have caused hangs or seg faults. +- Do not install Open MPI's signal handlers in MPI_INIT if there are + already signal handlers installed. Thanks to Kees Verstoep for + bringing the issue to our attention. +- Fix GM support to not seg fault in MPI_INIT. +- Various VampirTrace fixes. +- Various PLPA fixes. +- No longer create BTLs for invalid (TCP) devices. +- Various man page style and lint cleanups. +- Fix critical OpenFabrics-related bug noted here: + http://www.open-mpi.org/community/lists/announce/2009/03/0029.php. + Open MPI now uses a much more robust memory intercept scheme that is + quite similar to what is used by MX. The use of "-lopenmpi-malloc" + is no longer necessary, is deprecated, and is expected to disappear + in a future release. -lopenmpi-malloc will continue to work for the + duration of the Open MPI v1.3 and v1.4 series. +- Fix some OpenFabrics shutdown errors, both regarding iWARP and SRQ. +- Allow the udapl BTL to work on Solaris platforms that support + relaxed PCI ordering. +- Fix problem where the mpirun would sometimes use rsh/ssh to launch on + the localhost (instead of simply forking). +- Minor SLURM stdin fixes. +- Fix to run properly under SGE jobs. +- Scalability and latency improvements for shared memory jobs: convert + to using one message queue instead of N queues. +- Automatically size the shared-memory area (mmap file) to match + better what is needed; specifically, so that large-np jobs will start. +- Use fixed-length MPI predefined handles in order to provide ABI + compatibility between Open MPI releases. +- Fix building of the posix paffinity component to properly get the + number of processors in loosely tested environments (e.g., + FreeBSD). Thanks to Steve Kargl for reporting the issue. +- Fix --with-libnuma handling in configure. Thanks to Gus Correa for + reporting the problem. + + +1.3.1 +----- + +- Added "sync" coll component to allow users to synchronize every N + collective operations on a given communicator. +- Increased the default values of the IB and RNR timeout MCA parameters. +- Fix a compiler error noted by Mostyn Lewis with the PGI 8.0 compiler. +- Fix an error that prevented stdin from being forwarded if the + rsh launcher was in use. Thanks to Branden Moore for pointing out + the problem. +- Correct a case where the added datatype is considered as contiguous but + has gaps in the beginning. +- Fix an error that limited the number of comm_spawns that could + simultaneously be running in some environments +- Correct a corner case in OB1's GET protocol for long messages; the + error could sometimes cause MPI jobs using the openib BTL to hang. +- Fix a bunch of bugs in the IO forwarding (IOF) subsystem and add some + new options to output to files and redirect output to xterm. Thanks to + Jody Weissmann for helping test out many of the new fixes and + features. +- Fix SLURM race condition. +- Fix MPI_File_c2f(MPI_FILE_NULL) to return 0, not -1. Thanks to + Lisandro Dalcin for the bug report. +- Fix the DSO build of tm PLM. +- Various fixes for size disparity between C int's and Fortran + INTEGER's. Thanks to Christoph van Wullen for the bug report. 
+- Ensure that mpirun exits with a non-zero exit status when daemons or + processes abort or fail to launch. +- Various fixes to work around Intel (NetEffect) RNIC behavior. +- Various fixes for mpirun's --preload-files and --preload-binary + options. +- Fix the string name in MPI::ERRORS_THROW_EXCEPTIONS. +- Add ability to forward SIFTSTP and SIGCONT to MPI processes if you + set the MCA parameter orte_forward_job_control to 1. +- Allow the sm BTL to allocate larger amounts of shared memory if + desired (helpful for very large multi-core boxen). +- Fix a few places where we used PATH_MAX instead of OMPI_PATH_MAX, + leading to compile problems on some platforms. Thanks to Andrea Iob + for the bug report. +- Fix mca_btl_openib_warn_no_device_params_found MCA parameter; it + was accidentally being ignored. +- Fix some run-time issues with the sctp BTL. +- Ensure that RTLD_NEXT exists before trying to use it (e.g., it + doesn't exist on Cygwin). Thanks to Gustavo Seabra for reporting + the issue. +- Various fixes to VampirTrace, including fixing compile errors on + some platforms. +- Fixed missing MPI_Comm_accept.3 man page; fixed minor issue in + orterun.1 man page. Thanks to Dirk Eddelbuettel for identifying the + problem and submitting a patch. +- Implement the XML formatted output of stdout/stderr/stddiag. +- Fixed mpirun's -wdir switch to ensure that working directories for + multiple app contexts are properly handled. Thanks to Geoffroy + Pignot for reporting the problem. +- Improvements to the MPI C++ integer constants: + - Allow MPI::SEEK_* constants to be used as constants + - Allow other MPI C++ constants to be used as array sizes +- Fix minor problem with orte-restart's command line options. See + ticket #1761 for details. Thanks to Gregor Dschung for reporting + the problem. + +1.3 +--- + +- Extended the OS X 10.5.x (Leopard) workaround for a problem when + assembly code is compiled with -g[0-9]. Thanks to Barry Smith for + reporting the problem. See ticket #1701. +- Disabled MPI_REAL16 and MPI_COMPLEX32 support on platforms where the + bit representation of REAL*16 is different than that of the C type + of the same size (usually long double). Thanks to Julien Devriendt + for reporting the issue. See ticket #1603. +- Increased the size of MPI_MAX_PORT_NAME to 1024 from 36. See ticket #1533. +- Added "notify debugger on abort" feature. See tickets #1509 and #1510. + Thanks to Seppo Sahrakropi for the bug report. +- Upgraded Open MPI tarballs to use Autoconf 2.63, Automake 1.10.1, + Libtool 2.2.6a. +- Added missing MPI::Comm::Call_errhandler() function. Thanks to Dave + Goodell for bringing this to our attention. +- Increased MPI_SUBVERSION value in mpi.h to 1 (i.e., MPI 2.1). +- Changed behavior of MPI_GRAPH_CREATE, MPI_TOPO_CREATE, and several + other topology functions per MPI-2.1. +- Fix the type of the C++ constant MPI::IN_PLACE. +- Various enhancements to the openib BTL: + - Added btl_openib_if_[in|ex]clude MCA parameters for + including/excluding comma-delimited lists of HCAs and ports. 
+ - Added RDMA CM support, includng btl_openib_cpc_[in|ex]clude MCA + parameters + - Added NUMA support to only use "near" network adapters + - Added "Bucket SRQ" (BSRQ) support to better utilize registered + memory, including btl_openib_receive_queues MCA parameter + - Added ConnectX XRC support (and integrated with BSRQ) + - Added btl_openib_ib_max_inline_data MCA parameter + - Added iWARP support + - Revamped flow control mechansisms to be more efficient + - "mpi_leave_pinned=1" is now the default when possible, + automatically improving performance for large messages when + application buffers are re-used +- Elimiated duplicated error messages when multiple MPI processes fail + with the same error. +- Added NUMA support to the shared memory BTL. +- Add Valgrind-based memory checking for MPI-semantic checks. +- Add support for some optional Fortran datatypes (MPI_LOGICAL1, + MPI_LOGICAL2, MPI_LOGICAL4 and MPI_LOGICAL8). +- Remove the use of the STL from the C++ bindings. +- Added support for Platform/LSF job launchers. Must be Platform LSF + v7.0.2 or later. +- Updated ROMIO with the version from MPICH2 1.0.7. +- Added RDMA capable one-sided component (called rdma), which + can be used with BTL components that expose a full one-sided + interface. +- Added the optional datatype MPI_REAL2. As this is added to the "end of" + predefined datatypes in the fortran header files, there will not be + any compatibility issues. +- Added Portable Linux Processor Affinity (PLPA) for Linux. +- Addition of a finer symbols export control via the visibiliy feature + offered by some compilers. +- Added checkpoint/restart process fault tolerance support. Initially + support a LAM/MPI-like protocol. +- Removed "mvapi" BTL; all InfiniBand support now uses the OpenFabrics + driver stacks ("openib" BTL). +- Added more stringent MPI API parameter checking to help user-level + debugging. +- The ptmalloc2 memory manager component is now by default built as + a standalone library named libopenmpi-malloc. Users wanting to + use leave_pinned with ptmalloc2 will now need to link the library + into their application explicitly. All other users will use the + libc-provided allocator instead of Open MPI's ptmalloc2. This change + may be overriden with the configure option enable-ptmalloc2-internal +- The leave_pinned options will now default to using mallopt on + Linux in the cases where ptmalloc2 was not linked in. mallopt + will also only be available if munmap can be intercepted (the + default whenever Open MPI is not compiled with --without-memory- + manager. +- Open MPI will now complain and refuse to use leave_pinned if + no memory intercept / mallopt option is available. +- Add option of using Perl-based wrapper compilers instead of the + C-based wrapper compilers. The Perl-based version does not + have the features of the C-based version, but does work better + in cross-compile environments. + + +1.2.9 +----- + +- Fix a segfault when using one-sided communications on some forms of derived + datatypes. Thanks to Dorian Krause for reporting the bug. See #1715. +- Fix an alignment problem affecting one-sided communications on + some architectures (e.g., SPARC64). See #1738. +- Fix compilation on Solaris when thread support is enabled in Open MPI + (e.g., when using --with-threads). See #1736. +- Correctly take into account the MTU that an OpenFabrics device port + is using. See #1722 and + https://bugs.openfabrics.org/show_bug.cgi?id=1369. +- Fix two datatype engine bugs. See #1677. 
+ Thanks to Peter Kjellstrom for the bugreport. +- Fix the bml r2 help filename so the help message can be found. See #1623. +- Fix a compilation problem on RHEL4U3 with the PGI 32 bit compiler + caused by . See ticket #1613. +- Fix the --enable-cxx-exceptions configure option. See ticket #1607. +- Properly handle when the MX BTL cannot open an endpoint. See ticket #1621. +- Fix a double free of events on the tcp_events list. See ticket #1631. +- Fix a buffer overun in opal_free_list_grow (called by MPI_Init). + Thanks to Patrick Farrell for the bugreport and Stephan Kramer for + the bugfix. See ticket #1583. +- Fix a problem setting OPAL_PREFIX for remote sh-based shells. + See ticket #1580. + + +1.2.8 +----- + +- Tweaked one memory barrier in the openib component to be more conservative. + May fix a problem observed on PPC machines. See ticket #1532. - Fix OpenFabrics IB partition support. See ticket #1557. -- Restore v1.1 feature that sourced .profile on remote nodes if the - default shell will not do so (e.g. /bin/sh and /bin/ksh). See - ticket #1560. -- Fix segfault in MPI_Init_thread() if ompi_mpi_init() fails. See - ticket #1562. -- Adjust SLURM support to first look for $SLURM_JOB_CPUS_PER_NODE - instead of the deprecated $SLURM_TASKS_PER_NODE environment - variable. This change may be *required* when using SLURM v1.2 and - above. See ticket #1536. -- Fix the MPIR_Proctable to be in process rank order. See ticket - #1529. -- Fix a regression introduced in 1.2.6 for the IBM eHCA. See ticket - #1526. - -v1.2.7 +- Restore v1.1 feature that sourced .profile on remote nodes if the default + shell will not do so (e.g. /bin/sh and /bin/ksh). See ticket #1560. +- Fix segfault in MPI_Init_thread() if ompi_mpi_init() fails. See ticket #1562. +- Adjust SLURM support to first look for $SLURM_JOB_CPUS_PER_NODE instead of + the deprecated $SLURM_TASKS_PER_NODE environment variable. This change + may be *required* when using SLURM v1.2 and above. See ticket #1536. +- Fix the MPIR_Proctable to be in process rank order. See ticket #1529. +- Fix a regression introduced in 1.2.6 for the IBM eHCA. See ticket #1526. + + +1.2.7 +----- + - Add some Sun HCA vendor IDs. See ticket #1461. - Fixed a memory leak in MPI_Alltoallw when called from Fortran. Thanks to Dave Grote for the bugreport. See ticket #1457. 
@@ -210,72 +454,182 @@ Much, much more information is also available in the Open MPI FAQ: General Release Notes --------------------- +Detailed Open MPI v1.3 Feature List: + + o Open MPI RunTime Environment (ORTE) improvements + - General robustness improvements + - Scalable job launch (we've seen ~16K processes in less than a + minute in a highly-optimized configuration) + - New process mappers + - Support for Platform/LSF environments (v7.0.2 and later) + - More flexible processing of host lists + - new mpirun cmd line options and associated functionality + + o Fault-Tolerance Features + - Asynchronous, transparent checkpoint/restart support + - Fully coordinated checkpoint/restart coordination component + - Support for the following checkpoint/restart services: + - blcr: Berkley Lab's Checkpoint/Restart + - self: Application level callbacks + - Support for the following interconnects: + - tcp + - mx + - openib + - sm + - self + - Improved Message Logging + + o MPI_THREAD_MULTIPLE support for point-to-point messaging in the + following BTLs (note that only MPI point-to-point messaging API + functions support MPI_THREAD_MULTIPLE; other API functions likely + do not): + - tcp + - sm + - mx + - elan + - self + + o Point-to-point Messaging Layer (PML) improvements + - Memory footprint reduction + - Improved latency + - Improved algorithm for multiple communication device + ("multi-rail") support + + o Numerous Open Fabrics improvements/enhancements + - Added iWARP support (including RDMA CM) + - Memory footprint and performance improvements + - "Bucket" SRQ support for better registered memory utilization + - XRC/ConnectX support + - Message coalescing + - Improved error report mechanism with Asynchronous events + - Automatic Path Migration (APM) + - Improved processor/port binding + - Infrastructure for additional wireup strategies + - mpi_leave_pinned is now enabled by default + + o uDAPL BTL enhancements + - Multi-rail support + - Subnet checking + - Interface include/exclude capabilities + + o Processor affinity + - Linux processor affinity improvements + - Core/socket <--> process mappings + + o Collectives + - Performance improvements + - Support for hierarchical collectives (must be activated + manually; see below) + + o Miscellaneous + - MPI 2.1 compliant + - Sparse process groups and communicators + - Support for Cray Compute Node Linux (CNL) + - One-sided RDMA component (BTL-level based rather than PML-level + based) + - Aggregate MCA parameter sets + - MPI handle debugging + - Many small improvements to the MPI C++ bindings + - Valgrind support + - VampirTrace support + - Updated ROMIO to the version from MPICH2 1.0.7 + - Removed the mVAPI IB stacks + - Display most error messages only once (vs. once for each + process) + - Many other small improvements and bug fixes, too numerous to + list here + +Known issues +------------ + + o There is a segfault that sometimes occurs on one of our x86_64 test + clusters when using MPI onesided communications over Myrinet MX. + Since no one else has reported this problem we are not holding + up the 1.3 release. See ticket #1757 for the details, and any + possible workarounds. + + o XGrid support is currently broken. + https://svn.open-mpi.org/trac/ompi/ticket/1777 + + o MPI_REDUCE_SCATTER does not work with counts of 0. + https://svn.open-mpi.org/trac/ompi/ticket/1559 + + o Please also see the Open MPI bug tracker for bugs beyond this release. 
+ https://svn.open-mpi.org/trac/ompi/report + +=========================================================================== + The following abbreviated list of release notes applies to this code -base as of this writing (19 September 2007): +base as of this writing (14 April 2009): + +General notes +------------- - Open MPI includes support for a wide variety of supplemental hardware and software package. When configuring Open MPI, you may need to supply additional flags to the "configure" script in order to tell Open MPI where the header files, libraries, and any other required files are located. As such, running "configure" by itself - may include support for all the devices (etc.) that you expect, + may not include support for all the devices (etc.) that you expect, especially if their support headers / libraries are installed in non-standard locations. Network interconnects are an easy example - to discuss -- Myrinet and InfiniBand, for example, both have - supplemental headers and libraries that must be found before Open - MPI can build support for them. You must specify where these files - are with the appropriate options to configure. See the listing of - configure command-line switches, below, for more details. + to discuss -- Myrinet and OpenFabrics networks, for example, both + have supplemental headers and libraries that must be found before + Open MPI can build support for them. You must specify where these + files are with the appropriate options to configure. See the + listing of configure command-line switches, below, for more details. -- The Open MPI installation must be in your PATH on all nodes (and - potentially LD_LIBRARY_PATH, if libmpi is a shared library), unless - using the --prefix or --enable-mpirun-prefix-by-default - functionality (see below). - -- LAM/MPI-like mpirun notation of "C" and "N" is not yet supported. +- The majority of Open MPI's documentation is here in this file, the + included man pages, and on the web site FAQ + (http://www.open-mpi.org/). This will eventually be supplemented + with cohesive installation and user documentation files. -- Striping MPI messages across multiple networks is supported (and - happens automatically when multiple networks are available), but - needs performance tuning. +- Note that Open MPI documentation uses the word "component" + frequently; the word "plugin" is probably more familiar to most + users. As such, end users can probably completely substitute the + word "plugin" wherever you see "component" in our documentation. + For what it's worth, we use the word "component" for historical + reasons, mainly because it is part of our acronyms and internal API + functionc calls. - The run-time systems that are currently supported are: - rsh / ssh - - BProc versions 3 and 4 with LSF - LoadLeveler - PBS Pro, Open PBS, Torque + - Platform LSF (v7.0.2 and later) - SLURM - - XGrid + - XGrid (known to be broken in 1.3 through 1.3.2) - Cray XT-3 and XT-4 - - Sun N1 Grid Engine (N1GE) 6 and open source Grid Engine - -- The majority of Open MPI's documentation is here in this file, the - included man pages, and on the web site FAQ - (http://www.open-mpi.org/). This will eventually be supplemented - with cohesive installation and user documentation files. 
+ - Sun Grid Engine (SGE) 6.1, 6.2 and open source Grid Engine + - Microsoft Windows CCP (Microsoft Windows server 2003 and 2008) - Systems that have been tested are: - - Linux, 32 bit, with gcc - - Linux, 64 bit (x86), with gcc + - Linux (various flavors/distros), 32 bit, with gcc, and Sun Studio 12 + - Linux (various flavors/distros), 64 bit (x86), with gcc, Absoft, + Intel, Portland, Pathscale, and Sun Studio 12 compilers (*) - OS X (10.4), 32 and 64 bit (i386, PPC, PPC64, x86_64), with gcc - - Solaris 10 updates 2 and 3, SPARC and AMD, 32 and 64 bit, with Sun - Studio 10 and 11 + and Absoft compilers (*) + - Solaris 10 update 2, 3 and 4, 32 and 64 bit (SPARC, i386, x86_64), + with Sun Studio 10, 11 and 12 + + (*) Be sure to read the Compiler Notes, below. - Other systems have been lightly (but not fully tested): - - Other compilers on Linux, 32 and 64 bit - Other 64 bit platforms (e.g., Linux on PPC64) + - Microsoft Windows CCP (Microsoft Windows server 2003 and 2008); + more testing and support is expected later in the Open MPI v1.3.x + series. -- Some MCA parameters can be set in a way that renders Open MPI - inoperable (see notes about MCA parameters later in this file). In - particular, some parameters have required options that must be - included. - - If specified, the "btl" parameter must include the "self" - component, or Open MPI will not be able to deliver messages to the - same rank as the sender. For example: "mpirun --mca btl tcp,self - ..." - - If specified, the "btl_tcp_if_exclude" paramater must include the - loopback device ("lo" on many Linux platforms), or Open MPI will - not be able to route MPI messages using the TCP BTL. For example: - "mpirun --mca btl_tcp_if_exclude lo,eth1 ..." +Compiler Notes +-------------- + +- Mixing compilers from different vendors when building Open MPI + (e.g., using the C/C++ compiler from one vendor and the F77/F90 + compiler from a different vendor) has been successfully employed by + some Open MPI users (discussed on the Open MPI user's mailing list), + but such configurations are not tested and not documented. For + example, such configurations may require additional compiler / + linker flags to make Open MPI build properly. - Open MPI does not support the Sparc v8 CPU target, which is the default on Sun Solaris. The v8plus (32 bit) or v9 (64 bit) @@ -314,6 +668,12 @@ base as of this writing (19 September 2007): also automatically add "-Msignextend" when the C and C++ MPI wrapper compilers are used to compile user MPI applications. +- Using the MPI C++ bindings with the Pathscale compiler is known + to fail, possibly due to Pathscale compiler issues. + +- Using the Absoft compiler to build the MPI Fortran bindings on Suse + 9.3 is known to fail due to a Libtool compatibility issue. + - Open MPI will build bindings suitable for all common forms of Fortran 77 compiler symbol mangling on platforms that support it (e.g., Linux). On platforms that do not support weak symbols (e.g., @@ -349,43 +709,6 @@ base as of this writing (19 September 2007): You can use the ompi_info command to see the Fortran compiler that Open MPI was configured with. -- Running on nodes with different endian and/or different datatype - sizes within a single parallel job is supported in this release. - However, Open MPI does not resize data when datatypes differ in size - (for example, sending a 4 byte MPI_DOUBLE and receiving an 8 byte - MPI_DOUBLE will fail). - -- MPI_THREAD_MULTIPLE support is included, but is only lightly tested. 
- It likely does not work for thread-intensive applications. - -- Asynchronous message passing progress using threads can be turned on - with the --enable-progress-threads option to configure. - Asynchronous message passing progress is only supported for TCP, - shared memory, and Myrinet/GM. Myrinet/GM has only been lightly - tested. - -- The XGrid support is experimental - see the Open MPI FAQ and this - post on the Open MPI user's mailing list for more information: - - http://www.open-mpi.org/community/lists/users/2006/01/0539.php - -- The OpenFabrics Enterprise Distribution (OFED) software package v1.0 - will not work properly with Open MPI v1.2 (and later) due to how its - Mellanox InfiniBand plugin driver is created. The problem is fixed - OFED v1.1 (and later). - -- The use of the mvapi BTL is deprecated. All new InfiniBand work is - being done in the openib BTL (i.e., the OpenFabrics driver stack). - -- The use of fork() with the openib BTL is only partially supported, - and only on Linux kernels >= v2.6.15 with libibverbs v1.1 or later - (first released as part of OFED v1.2). More complete support will - be included in a future release of Open MPI (see the OFED 1.2 - distribution for details). - -- iWARP support is not yet included in the Open MPI OpenFabrics - support. - - The Fortran 90 MPI bindings can now be built in one of three sizes using --with-mpi-f90-size=SIZE (see description below). These sizes reflect the number of MPI functions included in the "mpi" Fortran 90 @@ -427,21 +750,129 @@ base as of this writing (19 September 2007): interface. A "large" size that includes the two choice buffer MPI functions is possible in future versions of Open MPI. -- Starting with Open MPI v1.2, there are two MPI network models - available: "ob1" and "cm". "ob1" uses the familiar BTL components - for each supported network. "cm" introduces MTL components for + +General Run-Time Support Notes +------------------------------ + +- The Open MPI installation must be in your PATH on all nodes (and + potentially LD_LIBRARY_PATH, if libmpi is a shared library), unless + using the --prefix or --enable-mpirun-prefix-by-default + functionality (see below). + +- LAM/MPI-like mpirun notation of "C" and "N" is not yet supported. + +- The XGrid support is experimental - see the Open MPI FAQ and this + post on the Open MPI user's mailing list for more information: + + http://www.open-mpi.org/community/lists/users/2006/01/0539.php + +- Open MPI's run-time behavior can be customized via MCA ("MPI + Component Architecture") parameters (see below for more information + on how to get/set MCA parameter values). Some MCA parameters can be + set in a way that renders Open MPI inoperable (see notes about MCA + parameters later in this file). In particular, some parameters have + required options that must be included. + + - If specified, the "btl" parameter must include the "self" + component, or Open MPI will not be able to deliver messages to the + same rank as the sender. For example: "mpirun --mca btl tcp,self + ..." + - If specified, the "btl_tcp_if_exclude" paramater must include the + loopback device ("lo" on many Linux platforms), or Open MPI will + not be able to route MPI messages using the TCP BTL. For example: + "mpirun --mca btl_tcp_if_exclude lo,eth1 ..." + +- Running on nodes with different endian and/or different datatype + sizes within a single parallel job is supported in this release. 
+ However, Open MPI does not resize data when datatypes differ in size + (for example, sending a 4 byte MPI_DOUBLE and receiving an 8 byte + MPI_DOUBLE will fail). + + +MPI Functionality and Features +------------------------------ + +- All MPI-2.1 functionality is supported. + +- MPI_THREAD_MULTIPLE support is included, but is only lightly tested. + It likely does not work for thread-intensive applications. Note + that *only* the MPI point-to-point communication functions for the + BTL's listed above are considered thread safe. Other support + functions (e.g., MPI attributes) have not been certified as safe + when simultaneously used by multiple threads. + + Note that Open MPI's thread support is in a fairly early stage; the + above devices are likely to *work*, but the latency is likely to be + fairly high. Specifically, efforts so far have concentrated on + *correctness*, not *performance* (yet). + +- MPI_REAL16 and MPI_COMPLEX32 are only supported on platforms where a + portable C datatype can be found that matches the Fortran type + REAL*16, both in size and bit representation. + +- Asynchronous message passing progress using threads can be turned on + with the --enable-progress-threads option to configure. + Asynchronous message passing progress is only supported with devices + that support MPI_THREAD_MULTIPLE, but is only very lightly tested + (and may not provide very much performance benefit). + + +Collectives +----------- + +- The "hierarch" coll component (i.e., an implementation of MPI + collective operations) attempts to discover network layers of + latency in order to segregate individual "local" and "global" + operations as part of the overall collective operation. In this + way, network traffic can be reduced -- or possibly even minimized + (similar to MagPIe). The current "hierarch" component only + separates MPI processes into on- and off-node groups. + + Hierarch has had sufficient correctness testing, but has not + received much performance tuning. As such, hierarch is not + activated by default -- it must be enabled manually by setting its + priority level to 100: + + mpirun --mca coll_hierarch_priority 100 ... + + We would appreciate feedback from the user community about how well + hierarch works for your applications. + + +Network Support +--------------- + +- The OpenFabrics Enterprise Distribution (OFED) software package v1.0 + will not work properly with Open MPI v1.2 (and later) due to how its + Mellanox InfiniBand plugin driver is created. The problem is fixed + OFED v1.1 (and later). + +- Older mVAPI-based InfiniBand drivers (Mellanox VAPI) are no longer + supported. Please use an older version of Open MPI (1.2 series or + earlier) if you need mVAPI support. + +- The use of fork() with the openib BTL is only partially supported, + and only on Linux kernels >= v2.6.15 with libibverbs v1.1 or later + (first released as part of OFED v1.2), per restrictions imposed by + the OFED network stack. + +- There are two MPI network models available: "ob1" and "cm". "ob1" + uses BTL ("Byte Transfer Layer") components for each supported + network. "cm" uses MTL ("Matching Tranport Layer") components for each supported network. 
- "ob1" supports a variety of networks that can be used in combination with each other (per OS constraints; e.g., there are reports that the GM and OpenFabrics kernel drivers do not operate well together): - - InfiniBand: mVAPI and the OpenFabrics stack + - OpenFabrics: InfiniBand and iWARP - Loopback (send-to-self) - Myrinet: GM and MX - Portals + - Quadrics Elan - Shared memory - TCP + - SCTP - uDAPL - "cm" supports a smaller number of networks (and they cannot be @@ -451,43 +882,46 @@ base as of this writing (19 September 2007): - InfiniPath PSM - Portals - Open MPI will, by default, choose to use "cm" if it finds a - cm-supported network at run-time. Users can force the use of ob1 if - desired by setting the "pml" MCA parameter at run-time: + Open MPI will, by default, choose to use "cm" when the InfiniPath + PSM MTL can be used. Otherwise, OB1 will be used and the + corresponding BTLs will be selected. Users can force the use of ob1 + or cm if desired by setting the "pml" MCA parameter at run-time: shell$ mpirun --mca pml ob1 ... - -- The MX support is shared between the 2 internal devices, the MTL - and the BTL. MTL stands for Message Transport Layer, while BTL - stands for Byte Transport Layer. The design of the BTL interface - in Open MPI assumes that only naive one-sided communication - capabilities are provided by the low level communication layers. - However, modern communication layers such as MX, PSM or Portals, - natively implement highly-optimized two-sided communication - semantics. To leverage these capabilities, Open MPI provides the - MTL interface to transfer messages rather than bytes. + or + shell$ mpirun --mca pml cm ... + +- Myrinet MX support is shared between the 2 internal devices, the MTL + and the BTL. The design of the BTL interface in Open MPI assumes + that only naive one-sided communication capabilities are provided by + the low level communication layers. However, modern communication + layers such as Myrinet MX, InfiniPath PSM, or Portals, natively + implement highly-optimized two-sided communication semantics. To + leverage these capabilities, Open MPI provides the "cm" PML and + corresponding MTL components to transfer messages rather than bytes. The MTL interface implements a shorter code path and lets the - low-level network library decide which protocol to use, depending - on message length, internal resources and other parameters - specific to the interconnect used. However, Open MPI cannot - currently use multiple MTL modules at once. In the case of the - MX MTL, self and shared memory communications are provided by the - MX library. Moreover, the current MX MTL does not support message - pipelining resulting in lower performances in case of non-contiguous - data-types. - In the case of the BTL, MCA parameters allow Open MPI to use our own - shared memory and self device for increased performance. + low-level network library decide which protocol to use (depending on + issues such as message length, internal resources and other + parameters specific to the underlying interconnect). However, Open + MPI cannot currently use multiple MTL modules at once. In the case + of the MX MTL, process loopback and on-node shared memory + communications are provided by the MX library. Moreover, the + current MX MTL does not support message pipelining resulting in + lower performances in case of non-contiguous data-types. + +The "ob1" PML and BTL components use Open MPI's internal on-node + shared memory and process loopback devices for high performance. 
The BTL interface allows multiple devices to be used simultaneously. - For the MX BTL it is recommended that the first segment (which is - as a threshold between the eager and the rendezvous protocol) should - always be at most 4KB, but there is no further restriction on - the size of subsequent fragments. - The MX MTL is recommended in the common case for best performance - on 10G hardware, when most of the data transfers cover contiguous - memory layouts. The MX BTL is recommended in all other cases, more - specifically when using multiple interconnects at the same time - (including TCP), transferring non contiguous data-types or when - using the DR PML. + For the MX BTL it is recommended that the first segment (which is as + a threshold between the eager and the rendezvous protocol) should + always be at most 4KB, but there is no further restriction on the + size of subsequent fragments. + + The MX MTL is recommended in the common case for best performance on + 10G hardware when most of the data transfers cover contiguous memory + layouts. The MX BTL is recommended in all other cases, such as when + using multiple interconnects at the same time (including TCP), or + transferring non contiguous data-types. =========================================================================== @@ -510,9 +944,27 @@ for a full list); a summary of the more commonly used ones follows: Open MPI will place its executables in /bin, its header files in /include, its libraries in /lib, etc. +--with-elan= + Specify the directory where the Quadrics Elan library and header + files are located. This option is generally only necessary if the + Elan headers and libraries are not in default compiler/linker + search paths. + + Elan is the support library for Quadrics-based networks. + +--with-elan-libdir= + Look in directory for the Quadrics Elan libraries. By default, Open + MPI will look in /lib and /lib64, + which covers most cases. This option is only needed for special + configurations. + --with-gm= Specify the directory where the GM libraries and header files are - located. This enables GM support in Open MPI. + located. This option is generally only necessary if the GM headers + and libraries are not in default compiler/linker search paths. + + GM is the support library for older Myrinet-based networks (GM has + been obsoleted by MX). --with-gm-libdir= Look in directory for the GM libraries. By default, Open MPI will @@ -521,27 +973,23 @@ for a full list); a summary of the more commonly used ones follows: --with-mx= Specify the directory where the MX libraries and header files are - located. This enables MX support in Open MPI. + located. This option is generally only necessary if the MX headers + and libraries are not in default compiler/linker search paths. + + MX is the support library for Myrinet-based networks. --with-mx-libdir= Look in directory for the MX libraries. By default, Open MPI will look in /lib and /lib64, which covers most cases. This option is only needed for special configurations. ---with-mvapi= - Specify the directory where the mVAPI libraries and header files are - located. This enables mVAPI support in Open MPI (although it is - deprecated). - ---with-mvapi-libdir= - Look in directory for the MVAPI libraries. By default, Open MPI will - look in /lib and /lib64, which covers - most cases. This option is only needed for special configurations. - --with-openib= Specify the directory where the OpenFabrics (previously known as - OpenIB) libraries and header files are located. 
This enables - OpenFabrics support in Open MPI. + OpenIB) libraries and header files are located. This option is + generally only necessary if the OpenFabrics headers and libraries + are not in default compiler/linker search paths. + + "OpenFabrics" refers to iWARP- and InifiniBand-based networks. --with-openib-libdir= Look in directory for the OpenFabrics libraries. By default, Open @@ -549,20 +997,60 @@ for a full list); a summary of the more commonly used ones follows: directory>/lib64, which covers most cases. This option is only needed for special configurations. +--with-portals= + Specify the directory where the Portals libraries and header files + are located. This option is generally only necessary if the Portals + headers and libraries are not in default compiler/linker search + paths. + + Portals is the support library for Cray interconnects, but is also + available on other platforms (e.g., there is a Portals library + implemented over regular TCP). + +--with-portals-config= + Configuration to use for Portals support. The following + values are possible: "utcp", "xt3", "xt3-modex" (default: utcp). + +--with-portals-libs= + Additional libraries to link with for Portals support. + --with-psm= - Specify the directory where the QLogic PSM library and header files - are located. This enables InfiniPath support in Open MPI. + Specify the directory where the QLogic InfiniPath PSM library and + header files are located. This option is generally only necessary + if the InfiniPath headers and libraries are not in default + compiler/linker search paths. + + PSM is the support library for QLogic InfiniPath network adapters. --with-psm-libdir= Look in directory for the PSM libraries. By default, Open MPI will look in /lib and /lib64, which covers most cases. This option is only needed for special configurations. +--with-sctp= + Specify the directory where the SCTP libraries and header files are + located. This option is generally only necessary if the SCTP headers + and libraries are not in default compiler/linker search paths. + + SCTP is a special network stack over ethernet networks. + +--with-sctp-libdir= + Look in directory for the SCTP libraries. By default, Open MPI will + look in /lib and /lib64, which covers + most cases. This option is only needed for special configurations. + --with-udapl= Specify the directory where the UDAPL libraries and header files are - located. This enables UDAPL support in Open MPI. Note that UDAPL - support is disabled by default on Linux; the --with-udapl flag must - be specified in order to enable it. + located. Note that UDAPL support is disabled by default on Linux; + the --with-udapl flag must be specified in order to enable it. + Specifying the directory argument is generally only necessary if the + UDAPL headers and libraries are not in default compiler/linker + search paths. + + UDAPL is the support library for high performance networks in Sun + HPC ClusterTools and on Linux OpenFabrics networks (although the + "openib" options are preferred for Linux OpenFabrics networks, not + UDAPL). --with-udapl-libdir= Look in directory for the UDAPL libraries. By default, Open MPI @@ -570,9 +1058,35 @@ for a full list); a summary of the more commonly used ones follows: which covers most cases. This option is only needed for special configurations. +--with-lsf= + Specify the directory where the LSF libraries and header files are + located. 
This option is generally only necessary if the LSF headers + and libraries are not in default compiler/linker search paths. + + LSF is a resource manager system, frequently used as a batch + scheduler in HPC systems. + +--with-lsf-libdir= + Look in directory for the LSF libraries. By default, Open MPI will + look in /lib and /lib64, which covers + most cases. This option is only needed for special configurations. + --with-tm= Specify the directory where the TM libraries and header files are - located. This enables PBS / Torque support in Open MPI. + located. This option is generally only necessary if the TM headers + and libraries are not in default compiler/linker search paths. + + TM is the support library for the Torque and PBS Pro resource + manager systems, both of which are frequently used as a batch + scheduler in HPC systems. + +--with-sge + Specify to build support for the Sun Grid Engine (SGE) resource + manager. SGE support is disabled by default; this option must be + specified to build OMPI's SGE support. + + The Sun Grid Engine (SGE) is a resource manager system, frequently + used as a batch scheduler in HPC systems. --with-mpi-param_check(=value) "value" can be one of: always, never, runtime. If --with-mpi-param @@ -601,7 +1115,8 @@ for a full list); a summary of the more commonly used ones follows: --enable-progress-threads Allows asynchronous progress in some transports. See - --with-threads; this is currently disabled by default. + --with-threads; this is currently disabled by default. See the + above note about asynchronous progress. --disable-mpi-cxx Disable building the C++ MPI bindings. Note that this does *not* @@ -654,7 +1169,7 @@ for a full list); a summary of the more commonly used ones follows: are built as dynamic shared objects (DSOs). This switch disables this default; it is really only useful when used with --enable-static. Specifically, this option does *not* imply - --disable-shared; enabling static libraries and disabling shared + --enable-static; enabling static libraries and disabling shared libraries are two independent options. --enable-static @@ -663,6 +1178,80 @@ for a full list); a summary of the more commonly used ones follows: --disable-shared; enabling static libraries and disabling shared libraries are two independent options. +--enable-sparse-groups + Enable the usage of sparse groups. This would save memory + significantly especially if you are creating large + communicators. (Disabled by default) + +--enable-peruse + Enable the PERUSE MPI data analysis interface. + +--enable-dlopen + Build all of Open MPI's components as standalone Dynamic Shared + Objects (DSO's) that are loaded at run-time. The opposite of this + option, --disable-dlopen, causes two things: + + 1. All of Open MPI's components will be built as part of Open MPI's + normal libraries (e.g., libmpi). + 2. Open MPI will not attempt to open any DSO's at run-time. + + Note that this option does *not* imply that OMPI's libraries will be + built as static objects (e.g., libmpi.a). It only specifies the + location of OMPI's components: standalone DSOs or folded into the + Open MPI libraries. You can control whenther Open MPI's libraries + are build as static or dynamic via --enable|disable-static and + --enable|disable-shared. + +--enable-heterogeneous + Enable support for running on heterogeneous clusters (e.g., machines + with different endian representations). Heterogeneous support is + disabled by default because it imposes a minor performance penalty. 
+ +--enable-ptmalloc2-internal + ***NOTE: This option no longer exists. + + This option was introduced in Open MPI v1.3 and was then removed in + Open MPI v1.3.2. Open MPI fundamentally changed how it uses + ptmalloc2 support in v1.3.2 such that the + --enable-ptmalloc2-internal flag was no longer necessary. It can + still harmlessly be supplied to Open MPI's configure script, but a + warning will appear about how it is an unrecognized option. + + In v1.3 and v1.3.1, Open MPI built the ptmalloc2 library as a + standalone library that users could choose to link in or not (by + adding -lopenmpi-malloc to their link command). Using this option + restored pre-v1.3 behavior of *always* forcing the user to use the + ptmalloc2 memory manager (because it is part of libmpi). + + Starting with v1.3.2, ptmalloc2 is always built into Open MPI, but + is only activated in certain scenarios. + +--with-wrapper-cflags= +--with-wrapper-cxxflags= +--with-wrapper-fflags= +--with-wrapper-fcflags= +--with-wrapper-ldflags= +--with-wrapper-libs= + Add the specified flags to the default flags that used are in Open + MPI's "wrapper" compilers (e.g., mpicc -- see below for more + information about Open MPI's wrapper compilers). By default, Open + MPI's wrapper compilers use the same compilers used to build Open + MPI and specify an absolute minimum set of additional flags that are + necessary to compile/link MPI applications. These configure options + give system administrators the ability to embed additional flags in + OMPI's wrapper compilers (which is a local policy decision). The + meanings of the different flags are: + + : Flags passed by the mpicc wrapper to the C compiler + : Flags passed by the mpic++ wrapper to the C++ compiler + : Flags passed by the mpif77 wrapper to the F77 compiler + : Flags passed by the mpif90 wrapper to the F90 compiler + : Flags passed by all the wrappers to the linker + : Flags passed by all the wrappers to the linker + + There are other ways to configure Open MPI's wrapper compiler + behavior; see the Open MPI FAQ for more information. + There are many other options available -- see "./configure --help". Changing the compilers that Open MPI uses to build itself uses the @@ -692,6 +1281,12 @@ For example: shell$ ./configure CC=mycc CXX=myc++ F77=myf77 F90=myf90 ... +***Note: We generally suggest using the above command line form for + setting different compilers (vs. setting environment variables and + then invoking "./configure"). The above form will save all + variables and values in the config.log file, which makes + post-mortem analysis easier when problems occur. + It is required that the compilers specified be compile and link compatible, meaning that object files created by one compiler must be able to be linked with object files from the other compilers and @@ -708,14 +1303,14 @@ clean - clean out the build tree Once Open MPI has been built and installed, it is safe to run "make clean" and/or remove the entire build tree. -VPATH builds are fully supported. +VPATH and parallel builds are fully supported. Generally speaking, the only thing that users need to do to use Open MPI is ensure that /bin is in their PATH and /lib is in their LD_LIBRARY_PATH. Users may need to ensure to set the PATH and LD_LIBRARY_PATH in their shell setup files (e.g., .bashrc, .cshrc) -so that rsh/ssh-based logins will be able to find the Open MPI -executables. +so that non-interactive rsh/ssh-based logins will be able to find the +Open MPI executables. 
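To tie the configuration, build, and run-time setup steps above together, here is a minimal end-to-end sketch. The installation prefix (/opt/openmpi-1.3.2) and the OpenFabrics location (/usr) are placeholder values chosen for illustration, not paths taken from this patch, and a Bourne-style shell is assumed for the export lines:

shell$ ./configure --prefix=/opt/openmpi-1.3.2 --with-openib=/usr
shell$ make all install
shell$ export PATH=/opt/openmpi-1.3.2/bin:$PATH
shell$ export LD_LIBRARY_PATH=/opt/openmpi-1.3.2/lib:$LD_LIBRARY_PATH
shell$ ompi_info | grep openib

If the final command prints an "openib" BTL line, OpenFabrics support was compiled in, as described in the OFED-specific release notes earlier in this file.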
=========================================================================== @@ -774,6 +1369,10 @@ are solely command-line manipulators, and have nothing to do with the actual compilation or linking of programs. The end result is an MPI executable that is properly linked to all the relevant libraries. +Customizing the behavior of the wrapper compilers is possible (e.g., +changing the compiler [not recommended] or specifying additional +compiler/linker flags); see the Open MPI FAQ for more information. + =========================================================================== Running Open MPI Applications @@ -783,9 +1382,7 @@ Open MPI supports both mpirun and mpiexec (they are exactly equivalent). For example: shell$ mpirun -np 2 hello_world_mpi - or - shell$ mpiexec -np 1 hello_world_mpi : -np 1 hello_world_mpi are equivalent. Some of mpiexec's switches (such as -host and -arch) @@ -814,16 +1411,16 @@ shell$ mpirun -hostfile my_hostfile -np 8 hello_world_mpi will launch MPI_COMM_WORLD rank 0 on node1, rank 1 on node2, ranks 2 and 3 on node3, and ranks 4 through 7 on node4. -Other starters, such as the batch scheduling environments, do not -require hostfiles (and will ignore the hostfile if it is supplied). -They will also launch as many processes as slots have been allocated -by the scheduler if no "-np" argument has been provided. For example, -running an interactive SLURM job with 8 processors: +Other starters, such as the resource manager / batch scheduling +environments, do not require hostfiles (and will ignore the hostfile +if it is supplied). They will also launch as many processes as slots +have been allocated by the scheduler if no "-np" argument has been +provided. For example, running a SLURM job with 8 processors: -shell$ srun -n 8 -A -shell$ mpirun a.out +shell$ salloc -n 8 mpirun a.out -The above command will launch 8 copies of a.out in a single +The above command will reserve 8 processors and run 1 copy of mpirun, +which will, in turn, launch 8 copies of a.out in a single MPI_COMM_WORLD on the processors that were allocated by SLURM. Note that the values of component parameters can be changed on the @@ -839,20 +1436,24 @@ are implemented through MCA components. 
Here is a list of all the component frameworks in Open MPI: --------------------------------------------------------------------------- + MPI component frameworks: ------------------------- allocator - Memory allocator bml - BTL management layer -btl - MPI point-to-point byte transfer layer, used for MPI +btl - MPI point-to-point Byte Transfer Layer, used for MPI point-to-point messages on some types of networks coll - MPI collective algorithms +crcp - Checkpoint/restart coordination protocol +dpm - MPI-2 dynamic process management io - MPI-2 I/O mpool - Memory pooling mtl - Matching transport layer, used for MPI point-to-point messages on some types of networks osc - MPI-2 one-sided communications pml - MPI point-to-point management layer +pubsub - MPI-2 publish/subscribe management rcache - Memory registration cache topo - MPI topology routines @@ -860,39 +1461,42 @@ Back-end run-time environment component frameworks: --------------------------------------------------- errmgr - RTE error manager -gpr - General purpose registry +ess - RTE environment-specfic services +filem - Remote file management +grpcomm - RTE group communications iof - I/O forwarding -ns - Name server +notifier - System/network administrator noficiation system odls - OpenRTE daemon local launch subsystem oob - Out of band messaging -pls - Process launch system +plm - Process lifecycle management ras - Resource allocation system -rds - Resource discovery system rmaps - Resource mapping system -rmgr - Resource manager rml - RTE message layer -schema - Name schemas -sds - Startup / discovery service -smr - State-of-health monitoring subsystem +routed - Routing table for the RML +snapc - Snapshot coordination Miscellaneous frameworks: ------------------------- -backtrace - Debugging call stack backtrace support -maffinity - Memory affinity -memory - Memory subsystem hooks -memcpy - Memopy copy support -memory - Memory management hooks -paffinity - Processor affinity -timer - High-resolution timers +backtrace - Debugging call stack backtrace support +carto - Cartography (host/network mapping) support +crs - Checkpoint and restart service +installdirs - Installation directory relocation services +maffinity - Memory affinity +memchecker - Run-time memory checking +memcpy - Memopy copy support +memory - Memory management hooks +paffinity - Processor affinity +timer - High-resolution timers --------------------------------------------------------------------------- Each framework typically has one or more components that are used at -run-time. For example, the btl framework is used by MPI to send bytes -across underlying networks. The tcp btl, for example, sends messages -across TCP-based networks; the gm btl sends messages across GM -Myrinet-based networks. +run-time. For example, the btl framework is used by the MPI layer to +send bytes across different types underlying networks. The tcp btl, +for example, sends messages across TCP-based networks; the openib btl +sends messages across OpenFabrics-based networks; the MX btl sends +messages across Myrinet networks. Each component typically has some tunable parameters that can be changed at run-time. Use the ompi_info command to check a component @@ -959,6 +1563,12 @@ Got more questions? Found a bug? Got a question? Want to make a suggestion? Want to contribute to Open MPI? Please let us know! +When submitting questions and problems, be sure to include as much +extra information as possible. 
This web page details all the +information that we request in order to provide assistance: + + http://www.open-mpi.org/community/help/ + User-level questions and comments should generally be sent to the user's mailing list (users@open-mpi.org). Because of spam, only subscribers are allowed to post to this list (ensure that you @@ -977,10 +1587,4 @@ the following web page to subscribe: http://www.open-mpi.org/mailman/listinfo.cgi/devel -When submitting bug reports to either list, be sure to include as much -extra information as possible. This web page details all the -information that we request in order to provide assistance: - - http://www.open-mpi.org/community/help/ - Make today an Open MPI day! -- 2.46.0