From: Hefty, Sean Date: Fri, 2 Jul 2010 22:59:13 +0000 (-0700) Subject: ibacm: release notes X-Git-Url: https://openfabrics.org/gitweb/?a=commitdiff_plain;h=2ece3ca8550a2afb7116ba883b94a80b65fa0d48;p=~ardavis%2Fofed_docs%2F.git ibacm: release notes Signed-off-by: Sean Hefty --- diff --git a/ibacm_release_notes.txt b/ibacm_release_notes.txt new file mode 100644 index 0000000..4048b39 --- /dev/null +++ b/ibacm_release_notes.txt @@ -0,0 +1,144 @@ + Open Fabrics Enterprise Distribution (OFED) + IB ACM in OFED 1.5 Release Notes + + July 2010 + + +=============================================================================== +Table of Contents +=============================================================================== +1. Overview +2. Quick Start Guide +3. Operation Details +4. Known Issues + +=============================================================================== +1. Overview +=============================================================================== +The IB ACM package implements and provides a framework for experimental name, +address, and route resolution services over InfiniBand. It is intended to +address connection setup scalability issues running MPI applications on +large clusters. The IB ACM provides information needed to establish a +connection, but does not implement the CM protocol. + +The librdmacm can invoke IB ACM services when built using the --with-ib_acm +option. The IB ACM services tie in under the rdma_resolve_addr, +rdma_resolve_route, and rdma_getaddrinfo routines. For maximum benefit, +the rdma_getaddrinfo routine should be used, however existing applications +should still see significant connection scaling benefits using the calls +available in librdmacm 1.0.11 and previous releases. + +The IB ACM is focused on being scalable and efficient. The current +implementation limits network traffic, SA interactions, and centralized +services. ACM supports multiple resolution protocols in order to handle +different fabric topologies. + +The IB ACM package is comprised of two components: the ib_acm service +and a test/configuration utility - ib_acme. Both are userspace components +and are available for Linux and Windows. Additional details are given below. + +=============================================================================== +2. Quick Start Guide +=============================================================================== + +1. Prerequisites: libibverbs and libibumad must be installed. + The IB stack should be running with IPoIB configured. + These steps assume that the user has administrative privileges. +2. Install the IB ACM package + This installs ib_acm, and ib_acme. +3. Run ib_acme -A -O + This will generate IB ACM address and options configuration files. + (acm_addr.cfg and acm_opts.cfg) +4. Run ib_acm and leave running. + ib_acm will eventually be converted to a service/daemon, but for now + is a userspace application. Because ib_acm uses the libibumad + interfaces, it should be run with administrative privileges. +5. Optionally, run ib_acme -s -d -v + This will verify that the ib_acm service is running. +5. Install librdmacm using the build option --with-ib_acm. + The librdmacm will automatically use the ib_acm service. + On failures, the librdmacm will fall back to normal resolution. + +=============================================================================== +3. Operation Details +=============================================================================== + +ib_acme: +The ib_acme program serves a dual role. It acts as a utility to test +ib_acm operation and help verify if the ib_acm service and selected +protocol is usable for a given cluster configuration. Additionally, +it automatically generates ib_acm configuration files to assist with +or eliminate manual setup. + + +acm configuration files: +The ib_acm service relies on two configuration files. + +The acm_addr.cfg file contains name and address mappings for each IB + endpoint. Although the names in the acm_addr.cfg +file can be anything, ib_acme maps the host name and IP addresses to +the IB endpoints. + +The acm_opts.cfg file provides a set of configurable options for the +ib_acm service, such as timeout, number of retries, logging level, etc. +ib_acme generates the acm_opts.cfg file using static information. A +future enhancement would adjust options based on the current system +and cluster size. + + +ib_acm: +The ib_acm service is responsible for resolving names and addresses to +InfiniBand path information and caching such data. It is currently +implemented as an executable application, but is a conceptual service +or daemon that should execute with administrative privileges. + +The ib_acm implements a client interface over TCP sockets, which is +abstracted by the librdmacm library. One or more back-end protocols are +used by the ib_acm service to satisfy user requests. Although the +ib_acm supports standard SA path record queries on the back-end, it +provides an experimental multicast resolution protocol in hope of +achieving greater scalability. The latter is not usable on all fabric +topologies, specifically ones that may not have reversible paths. +Users should use the ib_acme utility to verify that multicast protocol +is usable before running other applications. + +Conceptually, the ib_acm service implements an ARP like protocol and either +uses IB multicast records to construct path record data or queries the +SA directly, depending on the selected route protocol. By default, the +ib_acm services uses and caches SA path record queries. + +Specifically, all IB endpoints join a number of multicast groups. +Multicast groups differ based on rates, mtu, sl, etc., and are prioritized. +All participating endpoints must be able to communicate on the lowest +priority multicast group. The ib_acm assigns one or more names/addresses +to each IB endpoint using the acm_addr.cfg file. Clients provide source +and destination names or addresses as input to the service, and receive +as output path record data. + +The service maps a client's source name/address to a local IB endpoint. +If a client does not provide a source address, then the ib_acm service +will select one based on the destination and local routing tables. If the +destination name/address is not cached locally, it sends a multicast +request out on the lowest priority multicast group on the local endpoint. +The request carries a list of multicast groups that the sender can use. +The recipient of the request selects the highest priority multicast group +that it can use as well and returns that information directly to the sender. +The request data is cached by all endpoints that receive the multicast +request message. The source endpoint also caches the response and uses +the multicast group that was selected to construct or obtain path record +data, which is returned to the client. + +=============================================================================== +4. Known Issues +=============================================================================== + +The current implementation of the IB ACM has several restrictions: +- The ib_acm is limited in its handling of dynamic changes; + the ib_acm must be stopped and restarted if a cluster is reconfigured. +- Cached data does not timed out and is only updated if a new resolution + request is received from a different QPN than a cached request. +- Support for IPv6 has not been verified. +- The number of addresses that can be assigned to a single endpoint is + limited to 4. +- The number of multicast groups that an endpoint can support is limited to 2. +