From 7289bf0d72f260dbbe8d11d34e974dcaa06186e9 Mon Sep 17 00:00:00 2001 From: Stan Smith Date: Wed, 23 Jun 2010 20:58:35 +0000 Subject: [PATCH] [DOCS] updated OpenSM discussion to reflect OpenSM 3.3.6 git-svn-id: svn://openib.tc.cornell.edu/gen1@2830 ad392aa1-c5ef-ae45-8dd8-e69d62a5ef86 --- trunk/docs/Manual.htm | 1190 +++++++++++++++++++++-------------------- 1 file changed, 620 insertions(+), 570 deletions(-) diff --git a/trunk/docs/Manual.htm b/trunk/docs/Manual.htm index aa0d775c..45feec18 100644 --- a/trunk/docs/Manual.htm +++ b/trunk/docs/Manual.htm @@ -17,7 +17,7 @@ span.GramE

User's Manual

Release 2.3

-05/12/2010

+06/18/2010

Overview

The OpenFabrics Enterprise Distribution for Windows package is composed of software modules intended @@ -2425,7 +2425,8 @@ vstat - HCA Stats and Counters

<return-to-top>

 

-

Subnet Management with OpenSM version 3.3.3

+

Subnet Management with OpenSM version +3.3.6


A single running process (opensm.exe) is required to configure and thus make an Infiniband subnet useable.  For most cases, InfiniBand @@ -2459,9 +2460,11 @@ Management service.

    When opensm.exe is run as a Windows Service, the 'normal' case, %temp% is defined as %windir%\TEMP\.
    If opensm.exe is run from a command window, %TEMP% is not - defined as %windir%\TEMP\; just use %TEMP%
  + defined as %windir%\TEMP\.
 

InfiniBand Subnet Management from a command window

+ +opensm - InfiniBand subnet manager and administration (SM/SA)

SYNOPSIS

opensm @@ -2498,7 +2501,7 @@ Management service.

[-console-port <port>] [-i(gnore-guids) <equalize-ignore-guids-file>] [-w | --hop_weights_file <path to file>] -[-f <log file path> | --log_file <log file path> ] +[-O | --dimn_ports_file <path to file>] [-f <log file path> | --log_file <log file path> ] [-L | --log_limit <size in MB>] [-e(rase_log_file)] [-P(config) <partition config file> ] [-N | --no_part_enforce] @@ -2510,7 +2513,7 @@ Management service.

[--perfmgr_sweep_time_s <seconds>] [--prefix_routes_file <path>] [--consolidate_ipv6_snm_req] -[-v(erbose)] [-V] [-D <flags>] [-d(ebug) <number>] +[--log_prefix <prefix text>] [-v(erbose)] [-V] [-D <flags>] [-d(ebug) <number>] [-h(elp)] [-?]

DESCRIPTION

@@ -2521,29 +2524,32 @@ and runs on top of OFED for Windows. opensm provides an implementation of an Inf Administration. Such a software entity is required to run in order to initialize the InfiniBand hardware (at least one per InfiniBand subnet). - +

opensm also now contains an experimental version of a performance manager as well. - +

opensm defaults were designed to meet the common case usage on clusters with up to a few hundred nodes. Thus, in this default mode, opensm will scan the IB fabric, initialize it, and sweep occasionally for changes. - +

opensm attaches to a specific IB port on the local machine and configures only the fabric connected to it. (If the local machine has other IB ports, opensm will ignore the fabrics connected to those other ports). If no port is specified, it will select the first "best" available port. - -opensm can present the available ports and prompt for a port number to attach -to. By default, the run is logged to two files:%windir%\temp\osm.syslog and -%windir%\temp\osm.log. +

+opensm can present the available ports and prompt for a port number to +attach to. +

+By default, the run is logged to two files: %TEMP%\osm.syslog (aka +%windir%\temp\osm.syslog) and %windir%\temp\opensm.log. The first file will register only general major events, whereas the second will include details of reported errors. All errors reported in this second file should be treated as indicators of IB fabric health issues. (Note that when a fatal and non-recoverable error occurs, opensm will exit.) Both log files should include the message "SUBNET UP" if opensm was able to -setup the subnet correctly.

OPTIONS

- +setup the subnet correctly. Note when opensm.exe is run as a service, %TEMP% == +%windir%\temp .

OPTIONS

+

@@ -2551,7 +2557,7 @@ setup the subnet correctly.

OPTIONS

Prints OpenSM version and exits.
-F, --config <config file>
The name of the OpenSM config file. When not specified -%ProgramFiles%\OFED\OpenSM\opensm.conf will be used (if exists). + %ProgramFiles%\OFED\opensm\opensm.conf will be used (if exists).
-c, --create-config <file name>
OpenSM will dump its configuration to the specified file and exit. This is a way to generate OpenSM configuration file template. @@ -2617,8 +2623,8 @@ is host reboot, which otherwise would cause two full routing recalculations: one when the host goes down, and the other when the host comes back online.
-z, --connect_roots
-This option enforces a routing engine (currently up/down -only) to make connectivity between root switches and in +This option enforces routing engines (up/down and +fat-tree) to make connectivity between root switches and in this way to be fully IBA compliant. In many cases this can violate "pure" deadlock free algorithm, so use it carefully.
-M, --lid_matrix_file <file name>
@@ -2699,14 +2705,22 @@ and weighting factor. Any port not listed in the file defaults to a weighting factor of 1. Lines starting with # are comments. Weights affect only the output route from the port, so many useful configurations will require weights to be specified in pairs. +
-O, --dimn_ports_file <path to file>
This option provides a +mapping between hypercube dimensions and ports on a per switch basis for the DOR +routing engine. The file consists of lines containing a switch node GUID +(specified as a 64 bit hex number, with leading 0x) followed by a list of +non-zero port numbers, separated by spaces, one switch per line. The order for +the port numbers is in one to one correspondence to the dimensions. Ports not +listed on a line are assigned to the remaining dimensions, in port order. +Anything after a # is a comment.
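The file format described for -O can be read with a few lines of code. The following is a minimal sketch, not OpenSM's own parser: the function name `parse_dimn_ports` and the sample GUIDs are hypothetical, and only the rules stated above are applied (hex node GUID with leading 0x, space-separated non-zero port numbers, one switch per line, `#` starts a comment).

```python
def parse_dimn_ports(text):
    """Return {switch_node_guid: [port, ...]} from dimn_ports file text.

    Hypothetical helper for illustration; the list position of each port
    corresponds to the dimension it is mapped to.
    """
    mapping = {}
    for raw in text.splitlines():
        # Anything after '#' is a comment.
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue
        guid = int(fields[0], 16)          # accepts the leading 0x
        ports = [int(p) for p in fields[1:]]
        if any(p <= 0 for p in ports):
            raise ValueError("port numbers must be non-zero: %r" % raw)
        mapping[guid] = ports
    return mapping


sample = """
# made-up GUIDs, for illustration only
0x0002c90000000001 3 1 2   # dimension 0 -> port 3, 1 -> port 1, 2 -> port 2
0x0002c90000000002 1 2
"""
table = parse_dimn_ports(sample)
```

Ports that are not listed for a switch are left out of the result here; per the description above, OpenSM assigns them to the remaining dimensions in port order.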
-x, --honor_guid2lid
This option forces OpenSM to honor the guid2lid file, when it comes out of Standby state, if such file exists under OSM_CACHE_DIR, and is valid. By default, this is FALSE.
-f, --log_file <file name>
-This option defines the log to be the given file. By default, the log goes to -%windir%\temp\osm.log. +This option defines the log to be the given file. By default, the log goes to +%windir%\temp\opensm.log. For the log to go to standard output use -f stdout.
-L, --log_limit <size in MB>
This option defines maximal log file size in MB. When @@ -2718,19 +2732,19 @@ This option will cause deletion of the log file is accumulative.
-P, --Pconfig <partition config file>
This option defines the optional partition configuration file. -The default name is %ProgramFiles%\OFED\OpenSM\partitions.conf. +The default name is %ProgramFiles%\OFED\opensm\partitions.conf.
--prefix_routes_file <file name>
Prefix routes control how the SA responds to path record queries for off-subnet DGIDs. By default, the SA fails such queries. The PREFIX ROUTES section below describes the format of the configuration file. -The default path is %ProgramFiles%\OFED\OpenSM\prefix-routes.conf. +The default path is %ProgramFiles%\OFED\opensm\prefix-routes.conf.
-Q, --qos
This option enables QoS setup. It is disabled by default.
-Y, --qos_policy_file <file name>
This option defines the optional QoS policy file. The default -name is %ProgramFiles%\OFED\OpenSM\qos-policy.conf. See +name is %ProgramFiles%\OFED\opensm\qos-policy.conf. See QoS_management_in_OpenSM.txt in opensm doc for more information on configuring QoS policy via this file.
-N, --no_part_enforce
@@ -2756,11 +2770,15 @@ more information on running perfmgr. Specify the sweep time for the performance manager in seconds (default is 180 seconds). Only takes effect if --enable-perfmgr was specified at configure time. -
--consolidate_ipv6_snm_req - -
-Consolidate IPv6 Solicited Node Multicast group join requests into one -multicast group per MGID PKey. +
--consolidate_ipv6_snm_req
+Use shared MLID for IPv6 Solicited Node Multicast groups per MGID scope +and P_Key. +
--log_prefix <prefix text>
+This option specifies the prefix to the syslog messages from OpenSM. A suitable +prefix can be used to identify the IB subnet in syslog messages when two or more +instances of OpenSM run in a single node to manage multiple fabrics. For +example, in a dual-fabric (or dual-rail) IB cluster, the prefix for the first +fabric could be "mpi" and the other fabric could be "storage".
-v, --verbose
This option increases the log verbosity level. The -v option may be specified multiple times @@ -2778,8 +2796,8 @@ This option sets the log verbosity level. A flags field must follow the -D option. A bit set/clear in the flags enables/disables a specific log level as follows: - -
 BIT    LOG LEVEL ENABLED +

+ BIT    LOG LEVEL ENABLED
 ----   -----------------
 0x01 - ERROR (error messages)
 0x02 - INFO (basic messages, low volume) @@ -2789,7 +2807,7 @@ specific log level as follows:
 0x20 - FRAMES (dumps all SMP and GMP frames)
 0x40 - ROUTING (dump FDB routing information)
 0x80 - currently unused. - +
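Because the -D argument is a bit mask, the value to pass can be computed by OR-ing the wanted levels together. The sketch below is illustrative only: the names `LOG_LEVELS` and `log_flags` are hypothetical, and only the bits visible in the table above are included (the levels between 0x02 and 0x20 are not shown in this excerpt and are therefore left out).

```python
# Bits taken from the table above; other bits are intentionally omitted.
LOG_LEVELS = {
    "ERROR": 0x01,
    "INFO": 0x02,
    "FRAMES": 0x20,
    "ROUTING": 0x40,
}


def log_flags(*names):
    """Combine the named log levels into a single -D flags value."""
    mask = 0
    for name in names:
        mask |= LOG_LEVELS[name]
    return mask


def enabled_levels(mask):
    """List the known level names that are switched on in a flags value."""
    return [name for name, bit in LOG_LEVELS.items() if mask & bit]


default_mask = log_flags("ERROR", "INFO")      # ERROR + INFO
routing_mask = log_flags("ERROR", "ROUTING")   # e.g. for 'opensm -D 0x41'
```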

Without -D, OpenSM defaults to ERROR + INFO (0x3). Specifying -D 0 disables all messages. Specifying -D 0xFF enables all messages (see -V). @@ -2800,7 +2818,7 @@ This option specifies a debug option. These options are not normally needed. The number following -d selects the debug option to enable as follows: - +

 OPT   Description
 ---    -----------------
 -d0  - Ignore other SM nodes @@ -2811,9 +2829,8 @@ option to enable as follows: Display this usage info then exit.

-?
Display this usage info then exit. - -
+

+

ENVIRONMENT VARIABLES

@@ -2842,7 +2859,7 @@ logrotate purposes.
Examples:

    sc.exe control OpenSM 128            -# will clear the contents of %windir%\temp\osm.log
+# will clear the contents of %windir%\temp\osm.log, logrotate.
    sc.exe control OpenSM 129            # start a new heavy sweep

@@ -2853,54 +2870,54 @@ Examples:

The default name of the OpenSM partitions configuration file is %ProgramFiles%\OFED\OpenSM\partitions.conf. The default may be changed by using the --Pconfig (-P) option with OpenSM. - +

The default partition will be created by OpenSM unconditionally even when partition configuration file does not exist or cannot be accessed. - +

The default partition has P_Key value 0x7fff. OpenSM's port will always have full membership in default partition. All other end ports will have full membership if the partition configuration file is not found or cannot be accessed, or limited membership if the file exists and can be accessed but there is no rule for the Default partition. - +

Effectively, this amounts to the same as if one of the following rules below appear in the partition configuration file. - +

In the case of no rule for the Default partition: - +

Default=0x7fff : ALL=limited, SELF=full ; - +

In the case of no partition configuration file or file cannot be accessed: - +

Default=0x7fff : ALL=full ; - - +

+

File Format - +

Comments: - +

Line content followed after '#' character is comment and ignored by parser. - +

General file format: - +

<Partition Definition>:<PortGUIDs list> ; - +

Partition Definition: - +

[PartitionName][=PKey][,flag[=value]][,defmember=full|limited] - -
 PartitionName - string, will be used with logging. When omitted +

+ PartitionName - string, will be used with logging. When omitted
                 empty string will be used.
 PKey          - P_Key value for this partition. Only low 15 bits will
                 be used. When omitted will be autogenerated.
 flag          - used to indicate IPoIB capability of this partition.
 defmember=full|limited - specifies default membership for port guid
                 list. Default is limited. - +

Currently recognized flags are: - -
 ipoib       - indicates that this partition may be used for IPoIB, as +

+ ipoib       - indicates that this partition may be used for IPoIB, as
               result IPoIB capable MC group will be created.
 rate=<val>  - specifies rate for this IPoIB MC group
               (default is 3 (10 Gbps)) @@ -2911,69 +2928,69 @@ Currently recognized flags are:
 scope=<val> - specifies scope for this IPoIB MC group
               (default is 2 (link local)).  Multiple scope settings
               are permitted for a partition. - +

Note that values for rate, mtu, and scope should be specified as defined in the IBTA specification (for example, mtu=4 for 2048). - +

PortGUIDs list: - -
 PortGUID         - GUID of partition member EndPort. Hexadecimal +

+ PortGUID         - GUID of partition member EndPort. Hexadecimal
                    numbers should start from 0x, decimal numbers
                    are accepted too.
 full or limited  - indicates full or limited membership for this
                    port.  When omitted (or unrecognized) limited
                    membership is assumed. - +

There are two useful keywords for PortGUID definition: - -
 - 'ALL' means all end ports in this subnet. +

+ - 'ALL' means all end ports in this subnet.
 - 'ALL_CAS' means all Channel Adapter end ports in this subnet.
 - 'ALL_SWITCHES' means all Switch end ports in this subnet.
 - 'ALL_ROUTERS' means all Router end ports in this subnet.
 - 'SELF' means subnet manager's port. - +

Empty list means no ports in this partition. - +

Notes: - +

White space is permitted between delimiters ('=', ',',':',';'). - +

The line can be wrapped after ':' followed after Partition Definition and between. - +

PartitionName does not need to be unique, PKey does need to be unique. If PKey is repeated then those partition configurations will be merged and first PartitionName will be used (see also next note). - +

It is possible to split partition configuration in more than one definition, but then PKey should be explicitly specified (otherwise different PKey values will be generated for those definitions). - +

Examples: - -
 Default=0x7fff : ALL, SELF=full ; +

+ Default=0x7fff : ALL, SELF=full ;
 Default=0x7fff : ALL, ALL_SWITCHES=full, SELF=full ; - -
 NewPartition , ipoib : 0x123456=full, 0x3456789034=limi, 0x2134af2306 ; - -
 YetAnotherOne = 0x300 : SELF=full ; +

+ NewPartition , ipoib : 0x123456=full, 0x3456789034=limi, 0x2134af2306 ; +

+ YetAnotherOne = 0x300 : SELF=full ;
 YetAnotherOne = 0x300 : ALL=limited ; - -
 ShareIO = 0x80 , defmember=full : 0x123451, 0x123452; +

+ ShareIO = 0x80 , defmember=full : 0x123451, 0x123452;
 # 0x123453, 0x123454 will be limited
 ShareIO = 0x80 : 0x123453, 0x123454, 0x123455=full;
 # 0x123456, 0x123457 will be limited
 ShareIO = 0x80 : defmember=limited : 0x123456, 0x123457, 0x123458=full;
 ShareIO = 0x80 , defmember=full : 0x123459, 0x12345a;
 ShareIO = 0x80 , defmember=full : 0x12345b, 0x12345c=limited, 0x12345d; - - +

+

Note: - +

The following rule is equivalent to how OpenSM used to run prior to the partition manager: - -
 Default=0x7fff,ipoib:ALL=full; +

+ Default=0x7fff,ipoib:ALL=full;  
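To make the file format above concrete, here is a minimal sketch of how a single partition line could be interpreted. It is not OpenSM's parser: `parse_partition_line` is a hypothetical name, only the single-colon form on one line is handled (no wrapped lines, no merging of repeated PKeys), and autogeneration of an omitted PKey is not modeled (the sketch returns None instead).

```python
def parse_partition_line(line):
    """Parse '<Partition Definition>:<PortGUIDs list> ;' into a dict."""
    body = line.split("#", 1)[0].strip()       # '#' starts a comment
    if not body:
        return None
    body = body.rstrip(";").strip()
    definition, _, port_list = body.partition(":")

    fields = [f.strip() for f in definition.split(",")]
    name, _, pkey_text = fields[0].partition("=")
    pkey_text = pkey_text.strip()
    # Only the low 15 bits of the P_Key are used.
    pkey = (int(pkey_text, 0) & 0x7FFF) if pkey_text else None

    flags, defmember = {}, "limited"           # default membership is limited
    for field in fields[1:]:
        key, _, value = field.partition("=")
        key, value = key.strip(), value.strip()
        if key == "defmember":
            defmember = "full" if value == "full" else "limited"
        elif key:
            flags[key] = value or True

    members = {}
    for entry in port_list.split(","):
        guid, sep, membership = entry.strip().partition("=")
        guid = guid.strip()
        if not guid:
            continue
        if not sep:
            members[guid] = defmember          # omitted: use defmember
        else:                                  # unrecognized: limited
            members[guid] = "full" if membership.strip() == "full" else "limited"
    return {"name": name.strip(), "pkey": pkey, "flags": flags, "members": members}


default = parse_partition_line("Default=0x7fff : ALL, SELF=full ;")
newpart = parse_partition_line(
    "NewPartition , ipoib : 0x123456=full, 0x3456789034=limi, 0x2134af2306 ;")
shareio = parse_partition_line("ShareIO = 0x80 , defmember=full : 0x123451, 0x123452;")
```

Note how the misspelled membership `limi` in the NewPartition example falls back to limited, as the format description states.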

QOS CONFIGURATION

@@ -2983,8 +3000,8 @@ partition manager: There are a set of QoS related low-level configuration parameters. All these parameter names are prefixed by "qos_" string. Here is a full list of these parameters: - -
 qos_max_vls    - The maximum number of VLs that will be on the subnet +

+ qos_max_vls    - The maximum number of VLs that will be on the subnet
 qos_high_limit - The limit of High Priority component of VL
                  Arbitration table (IBA 7.6.9)
 qos_vlarb_low  - Low priority VL Arbitration table (IBA 7.6.9) @@ -2996,29 +3013,29 @@ list of these parameters:
 qos_sl2vl      - SL2VL Mapping table (IBA 7.6.6) template. It is
                  a list of VLs corresponding to SLs 0-15 (Note
                  that VL15 used here means drop this SL) - +

Typical default values (hard-coded in OpenSM initialization) are: - -
 qos_max_vls 15 +

+ qos_max_vls 15
 qos_high_limit 0
 qos_vlarb_low 0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
 qos_vlarb_high 0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
 qos_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7 - +
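The value syntax of these parameters is simple enough to check before placing it in the config file. The following sketch assumes, based only on the defaults shown above, that VL arbitration entries are comma-separated `VL:weight` pairs and that qos_sl2vl is a comma-separated list of 16 VLs (one per SL 0-15). The helper names are hypothetical.

```python
def parse_vlarb(value):
    """Turn '0:0,1:4,...' into a list of (vl, weight) tuples."""
    entries = []
    for pair in value.split(","):
        vl, weight = pair.split(":")
        entries.append((int(vl), int(weight)))
    return entries


def parse_sl2vl(value):
    """Turn '0,1,2,...' into a list of VLs indexed by SL."""
    vls = [int(v) for v in value.split(",")]
    if len(vls) != 16:
        raise ValueError("expected one VL for each of SL 0-15")
    return vls


vlarb_low = parse_vlarb(
    "0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4")
sl2vl = parse_sl2vl("0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7")
```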

The syntax is compatible with rest of OpenSM configuration options and values may be stored in OpenSM config file (cached options file). - +

In addition to the above, we may define separate QoS configuration parameters sets for various target types. As targets, we currently support CAs, routers, switch external ports, and switch's enhanced port 0. The names of such specialized parameters are prefixed by "qos_<type>_" string. Here is a full list of the currently supported sets: - -
 qos_ca_  - QoS configuration parameters set for CAs. +

+ qos_ca_  - QoS configuration parameters set for CAs.
 qos_rtr_ - parameters set for routers.
 qos_sw0_ - parameters set for switches' port 0.
 qos_swe_ - parameters set for switches' external ports. - +

Examples:
 qos_sw0_max_vls=2
 qos_ca_sl2vl=0,1,2,3,5,5,5,12,12,0, @@ -3061,47 +3078,45 @@ are wild-carded means that a path record query specifying any off-subnet DGID should return a path to the first available router. This configuration yields the same behavior formerly achieved by compiling opensm with -DROUTER_EXP which has been obsoleted. - - 

ROUTING

OpenSM now offers five routing engines: - +

1. Min Hop Algorithm - based on the minimum hops to each node where the path length is optimized. - +

2. UPDN Unicast routing algorithm - also based on the minimum hops to each node, but it is constrained to ranking rules. This algorithm should be chosen if the subnet is not a pure Fat Tree, and deadlock may occur due to a loop in the subnet. - +

3. Fat Tree Unicast routing algorithm - this algorithm optimizes routing for congestion-free "shift" communication pattern. It should be chosen if a subnet is a symmetrical or almost symmetrical fat-tree of various types, not just K-ary-N-Trees: non-constant K, not fully staffed, any Constant Bisectional Bandwidth (CBB) ratio. Similar to UPDN, Fat Tree routing is constrained to ranking rules. - +

4. LASH unicast routing algorithm - uses Infiniband virtual layers (SL) to provide deadlock-free shortest-path routing while also distributing the paths between layers. LASH is an alternative deadlock-free topology-agnostic routing algorithm to the non-minimal UPDN algorithm avoiding the use of a potentially congested root node. - +

5. DOR Unicast routing algorithm - based on the Min Hop algorithm, but avoids port equalization except for redundant links between the same two switches. This provides deadlock free routes for hypercubes when the fabric is cabled as a hypercube and for meshes when cabled as a mesh (see details below). - +

OpenSM also supports a file method which can load routes from a table. See 'Modular Routing Engine' for more information on this. - +

The basic routing algorithm is comprised of two stages: - +

1. MinHop matrix calculation
   How many hops are required to get from each port to each LID ?
   The algorithm to fill these tables is different if you run standard @@ -3111,7 +3126,7 @@ min hop from every destination LID through neighbor switches
   For Up/Down routing, a BFS from every target is used. The BFS tracks link direction (up or down) and avoids steps that will perform up after a down step was used. - +

2. Once MinHop matrices exist, each switch is visited and for each target LID a decision is made as to what port should be used to get to that LID.
   This step is common to standard and Up/Down routing. Each port has a @@ -3125,12 +3140,12 @@ same target port, the previous LID of the same LMC group)
   c. if none - prefer those which go through another NodeGuid
   d. fall back to the number of paths method (if all go to same node). - +
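The two stages above can be sketched as follows. This is a simplified illustration, not OpenSM's implementation: the topology is a hypothetical dictionary mapping each switch to `{port: neighbor}`, links are assumed to be symmetric, and the port choice uses only a least-loaded rule among min-hop candidates (one reading of the "number of paths method"); the LMC and NodeGuid preferences listed above are not modeled.

```python
from collections import deque


def min_hop_table(links, target):
    """Stage 1: BFS hop count from every switch to 'target'."""
    hops = {target: 0}
    queue = deque([target])
    while queue:
        switch = queue.popleft()
        for neighbor in links[switch].values():
            if neighbor not in hops:
                hops[neighbor] = hops[switch] + 1
                queue.append(neighbor)
    return hops


def pick_port(links, hops, switch, load):
    """Stage 2: choose the least-used port among those on a min-hop path.

    Must not be called for the target switch itself.
    """
    best = hops[switch] - 1
    candidates = [p for p, nb in links[switch].items() if hops[nb] == best]
    port = min(candidates, key=lambda p: (load.get((switch, p), 0), p))
    load[(switch, port)] = load.get((switch, port), 0) + 1
    return port


# Four switches in a square: A reaches D through either B or C.
links = {
    "A": {1: "B", 2: "C"},
    "B": {1: "A", 2: "D"},
    "C": {1: "A", 2: "D"},
    "D": {1: "B", 2: "C"},
}
hops_to_d = min_hop_table(links, "D")
load = {}
first = pick_port(links, hops_to_d, "A", load)
second = pick_port(links, hops_to_d, "A", load)
```

Two successive routes from A toward D end up on different ports because the first choice raises that port's counter.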

Effect of Topology Changes - +

OpenSM will preserve existing routing in any case where there is no change in the fabric switches unless the -r (--reassign_lids) option is specified. - +

-r
@@ -3140,30 +3155,30 @@ the fabric switches unless the -r (--reassign_lids) option is specified.
          may disrupt subnet traffic.
          Without -r, OpenSM attempts to preserve existing
          LID assignments resolving multiple use of same LID. - +

If a link is added or removed, OpenSM does not recalculate the routes that do not have to change. A route has to change if the port is no longer UP or no longer the MinHop. When routing changes are performed, the same algorithm for balancing the routes is invoked. - +

In the case of using the file based routing, any topology changes are currently ignored. The 'file' routing engine just loads the LFTs from the file specified, with no reaction to real topology. Obviously, this will not be able to recheck LIDs (by GUID) for disconnected nodes, and LFTs for non-existent switches will be skipped. Multicast is not affected by 'file' routing engine (this uses min hop tables). - - +

+

Min Hop Algorithm - +

The Min Hop algorithm is invoked by default if no routing algorithm is specified. It can also be invoked by specifying '-R minhop'. - +

The Min Hop algorithm is divided into two stages: computation of min-hop tables on every switch and LFT output port assignment. Link subscription is also equalized with the ability to override based on port GUID. The latter is supplied by: - +

-i <equalize-ignore-guids-file>
@@ -3173,21 +3188,21 @@ port GUID. The latter is supplied by:
          equalization algorithm. Note that only endports (CA,
          switch port 0, and router ports) and not switch external
          ports are supported. - +

LMC awareness routes based on (remote) system or switch basis. - - +

+

Purpose of UPDN Algorithm - +

The UPDN algorithm is designed to prevent deadlocks from occurring in loops of the subnet. A loop-deadlock is a situation in which it is no longer possible to send data between any two hosts connected through the loop. As such, the UPDN routing algorithm should be used if the subnet is not a pure Fat Tree, and one of its loops may experience a deadlock (due, for example, to high pressure). - +

The UPDN algorithm is based on the following main stages: - +

1. Auto-detect root nodes - based on the CA hop length from any switch in the subnet, a statistical histogram is built for each switch (hop num vs number of occurrences). If the histogram reflects a specific column (higher @@ -3195,45 +3210,45 @@ than others) for a certain node, then it is marked as a root node. Since the algorithm is statistical, it may not find any root nodes. The list of the root nodes found by this auto-detect stage is used by the ranking process stage. - +


    Note 1: The user can override the node list manually.
    Note 2: If this stage cannot find any root nodes, and the user did
            not specify a guid list file, OpenSM defaults back to the
            Min Hop routing algorithm. - +

2. Ranking process - All root switch nodes (found in stage 1) are assigned a rank of 0. Using the BFS algorithm, the rest of the switch nodes in the subnet are ranked incrementally. This ranking aids in the process of enforcing rules that ensure loop-free paths. - +

3. Min Hop Table setting - after ranking is done, a BFS algorithm is run from each (CA or switch) node in the subnet. During the BFS process, the FDB table of each switch node traversed by BFS is updated, in reference to the starting node, based on the ranking rules and guid values. - +

At the end of the process, the updated FDB tables ensure loop-free paths through the subnet. - +
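The ranking stage and the up/down rule can be illustrated with a short sketch. It assumes the root switches are already known and uses a hypothetical adjacency dictionary; it is not OpenSM's code. Links between switches of equal rank are treated as neutral here, whereas the description above says guid values are also taken into account.

```python
from collections import deque


def rank_switches(adjacency, roots):
    """Stage 2: roots get rank 0, other switches are ranked by BFS."""
    rank = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in rank:
                rank[neighbor] = rank[node] + 1
                queue.append(neighbor)
    return rank


def is_updown_legal(path, rank):
    """A path is legal if it never goes up after it has gone down."""
    gone_down = False
    for here, there in zip(path, path[1:]):
        if rank[there] > rank[here]:        # away from the roots: down
            gone_down = True
        elif rank[there] < rank[here] and gone_down:
            return False                    # up after down is forbidden
    return True


adjacency = {"S0": ["L1", "L2"], "L1": ["S0"], "L2": ["S0"]}
rank = rank_switches(adjacency, ["S0"])
```

With this toy fabric, `["L1", "S0", "L2"]` is a legal path (up, then down), while `["S0", "L1", "S0"]` is not.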

Note: Up/Down routing does not allow LID routing communication between switches that are located inside spine "switch systems". The reason is that there is no way to allow a LID route between them that does not break the Up/Down rule. One ramification of this is that you cannot run SM on switches other than the leaf switches of the fabric. - - +

+

UPDN Algorithm Usage - +

Activation through OpenSM - +

Use '-R updn' option (instead of old '-u') to activate the UPDN algorithm. Use '-a <root_guid_file>' for adding an UPDN guid file that contains the root nodes for ranking. If the `-a' option is not used, OpenSM uses its auto-detect root nodes algorithm. - +

Notes on the guid list file: - +

1. A valid guid file specifies one guid in each line. Lines with an invalid format will be discarded.
@@ -3241,17 +3256,17 @@ format will be discarded. 2. The user should specify the root switch guids. However, it is also possible to specify CA guids; OpenSM will use the guid of the switch (if it exists) that connects the CA to the subnet as a root node. - - +

+

Fat-tree Routing Algorithm - +

The fat-tree algorithm optimizes routing for "shift" communication pattern. It should be chosen if a subnet is a symmetrical or almost symmetrical fat-tree of various types. It supports not just K-ary-N-Trees: it also handles non-constant K, cases where not all leaves (CAs) are present, and any CBB ratio. As in UPDN, fat-tree also prevents credit-loop-deadlocks. - +

If the root guid file is not provided ('-a' or '--root_guid_file' options), the topology has to be pure fat-tree that complies with the following rules:
  - Tree rank should be between two and eight (inclusively) @@ -3265,24 +3280,24 @@ the topology has to be pure fat-tree that complies with the following rules:
  - Switches of the same rank should have the same number
    of ports in each DOWN-going port group.
  - All the CAs have to be at the same tree level (rank). - +

If the root guid file is provided, the topology doesn't have to be pure fat-tree, and it should only comply with the following rules:
  - Tree rank should be between two and eight (inclusively)
  - All the Compute Nodes** have to be at the same tree level (rank).
    Note that non-compute node CAs are allowed here to be at different
    tree ranks. - +

* ports that are connected to the same remote switch are referenced as 'port group'. - +

** list of compute nodes (CNs) can be specified by '-u' or '--cn_guid_file' OpenSM options. - +

Topologies that do not comply cause a fallback to min hop routing. Note that this can also occur on link failures which cause the topology to no longer be "pure" fat-tree. - +

Note that although fat-tree algorithm supports trees with non-integer CBB ratio, the routing will not be as balanced as in case of integer CBB ratio. In addition to this, although the algorithm allows leaf switches to have any @@ -3290,97 +3305,97 @@ number of CAs, the closer the tree is to be fully populated, the more effective the "shift" communication pattern will be. In general, even if the root list is provided, the closer the topology to a pure and symmetrical fat-tree, the more optimal the routing will be. - +

The algorithm also dumps compute node ordering file (opensm-ftree-ca-order.dump) in the same directory where the OpenSM log resides. This ordering file provides the CN order that may be used to create efficient communication pattern, that will match the routing tables. - +

Routing between non-CN nodes - +

The use of the cn_guid_file option allows non-CN nodes to be located on different levels in the fat tree. In such case, it is not guaranteed that the Fat Tree algorithm will route between two non-CN nodes. To solve this problem, a list of non-CN nodes can be specified by '-G' or '--io_guid_file' option. These nodes will be allowed to use switches the wrong way round a specific number of times (specified by '-H' or '--max_reverse_hops'). With the proper max_reverse_hops and io_guid_file values, you can ensure full connectivity in the Fat Tree. - +

Please note that using max_reverse_hops creates routes that use the switch in a counter-stream way. This option should never be used to connect nodes with high bandwidth traffic between them ! It should only be used to allow connectivity for HA purposes or similar. Also having routes the other way around can in theory cause credit loops. - +

Use these options with extreme care ! - +

Activation through OpenSM - +

Use '-R ftree' option to activate the fat-tree algorithm. Use '-a <root_guid_file>' to provide root nodes for ranking. If the `-a' option is not used, routing algorithm will detect roots automatically. Use '-u <root_cn_file>' to provide the list of compute nodes. If the `-u' option is not used, all the CAs are considered as compute nodes. - +

Note: LMC > 0 is not supported by fat-tree routing. If this is specified, the default routing algorithm is invoked instead. - - +

+

LASH Routing Algorithm - +

LASH is an acronym for LAyered SHortest Path Routing. It is a deterministic shortest path routing algorithm that enables topology agnostic deadlock-free routing within communication networks. - +

When computing the routing function, LASH analyzes the network topology for the shortest-path routes between all pairs of sources / destinations and groups these paths into virtual layers in such a way as to avoid deadlock. - +

Note LASH analyzes routes and ensures deadlock freedom between switch pairs. The link between HCA and switch does not need virtual layers as deadlock will not arise between switch and HCA. - +

In more detail, the algorithm works as follows: - +

1) LASH determines the shortest-path between all pairs of source / destination switches. Note, LASH ensures the same SL is used for all SRC/DST - DST/SRC pairs and there is no guarantee that the return path for a given DST/SRC will be the reverse of the route SRC/DST. - +

2) LASH then begins an SL assignment process where a route is assigned to a layer (SL) if the addition of that route does not cause deadlock within that layer. This is achieved by maintaining and analysing a channel dependency graph for each layer. Once the potential addition of a path could lead to deadlock, LASH opens a new layer and continues the process. - +

3) Once this stage has been completed, it is highly likely that the first layers processed will contain more paths than the latter ones. To better balance the use of layers, LASH moves paths from one layer to another so that the number of paths in each layer averages out. - +
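Step 2 above can be sketched as a first-fit assignment over channel dependency graphs. This is an illustration under simplifying assumptions, not the opensm implementation: routes are given as hypothetical lists of switch names, a channel is a directed link `(a, b)`, a dependency exists between two consecutive channels of a route, and the balancing of step 3 is not modeled.

```python
def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


def route_dependencies(route):
    """Dependencies between consecutive channels of one route."""
    channels = list(zip(route, route[1:]))
    return set(zip(channels, channels[1:]))


def assign_layers(routes):
    """Put each route in the first layer whose graph stays acyclic."""
    layers, assignment = [], []
    for route in routes:
        deps = route_dependencies(route)
        for index, layer in enumerate(layers):
            if not has_cycle(layer | deps):
                layer |= deps
                assignment.append(index)
                break
        else:
            layers.append(set(deps))          # open a new layer
            assignment.append(len(layers) - 1)
    return assignment


# Four routes around a ring of switches A-B-C-D close a dependency cycle.
ring_routes = [["A", "B", "C"], ["B", "C", "D"], ["C", "D", "A"], ["D", "A", "B"]]
layers_used = assign_layers(ring_routes)
```

The first three routes share a layer; the fourth would close the cycle in the dependency graph, so it is moved to a second layer.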

Note, the implementation of LASH in opensm attempts to use as few layers as possible. This number can be less than the number of actual layers available. - +

In general LASH is a very flexible algorithm. It can, for example, reduce to Dimension Order Routing in certain topologies, it is topology agnostic and fares well in the face of faults. - +

It has been shown that for both regular and irregular topologies, LASH outperforms Up/Down. The reason for this is that LASH distributes the traffic more evenly through a network, avoiding the bottleneck issues related to a root node and always routes shortest-path. - +

The algorithm was developed by Simula Research Laboratory. - - +

+

Use '-R lash -Q ' option to activate the LASH algorithm. - +

Note: QoS support has to be turned on in order that SL/VL mappings are used. - +

Note: LMC > 0 is not supported by the LASH routing. If this is specified, the default routing algorithm is invoked instead. - +

For open regular cartesian meshes the DOR algorithm is the ideal routing algorithm. For toroidal meshes on the other hand there are routing loops that can cause deadlocks. LASH can be used to @@ -3393,76 +3408,77 @@ add an additional phase that analyses the mesh to try to determine the dimension and size of a mesh. If it determines that the mesh looks like an open or closed cartesian mesh it reorders the ports in dimension order before the rest of the LASH algorithm runs. - +

DOR Routing Algorithm - -The Dimension Order Routing algorithm is based on the Min Hop -algorithm and so uses shortest paths. Instead of spreading traffic -out across different paths with the same shortest distance, it chooses -among the available shortest paths based on an ordering of dimensions. -Each port must be consistently cabled to represent a hypercube -dimension or a mesh dimension. Paths are grown from a destination -back to a source using the lowest dimension (port) of available paths -at each step. This provides the ordering necessary to avoid deadlock. -When there are multiple links between any two switches, they still -represent only one dimension and traffic is balanced across them -unless port equalization is turned off. In the case of hypercubes, -the same port must be used throughout the fabric to represent the -hypercube dimension and match on both ends of the cable. In the case -of meshes, the dimension should consistently use the same pair of -ports, one port on one end of the cable, and the other port on the -other end, continuing along the mesh dimension. - +

+The Dimension Order Routing algorithm is based on the Min Hop algorithm and so +uses shortest paths. Instead of spreading traffic
+out across different paths with the same shortest distance, it chooses among the +available shortest paths based on an ordering of dimensions.  Each port +must be consistently cabled to represent a hypercube dimension or a mesh +dimension. Alternatively, the -O option can be
+used to assign a custom mapping between the ports on a given switch, and the +associated dimension. Paths are grown from a destination back
+to a source using the lowest dimension (port) of available paths at each step. +This provides the ordering necessary to avoid deadlock. When there are multiple +links between any two switches, they still represent only one dimension and +traffic is balanced across them
+unless port equalization is turned off. In the case of hypercubes, the same port +must be used throughout the fabric to represent the hypercube dimension and +match on both ends of the cable, or the -O option used to accomplish the +alignment. In the case of meshes, the dimension should consistently use the same +pair of ports, one port on one end of the cable, and the other port on the other +end, continuing along the mesh dimension, or the -O option used as an override.

Use '-R dor' option to activate the DOR algorithm. - - +
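As an illustration only, and not the OpenSM implementation, the dimension-ordered choice described above can be sketched for an open Cartesian mesh. The coordinate tuples and the helper name dor_path are assumptions made for this example; OpenSM works on switch ports rather than coordinates. The sketch always corrects the lowest-numbered dimension that still differs, which is the fixed ordering that avoids cyclic dependencies in an open mesh.

```python
def dor_path(src, dst):
    """Return the mesh coordinates visited when routing from src to dst.

    Hypothetical helper for illustration: coordinates are tuples with one
    entry per mesh dimension. Dimension 0 is fully corrected first, then
    dimension 1, and so on, so every path crosses dimensions in the same
    fixed order.
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same number of dimensions")
    path = [tuple(src)]
    current = list(src)
    for dim in range(len(current)):
        # Move one hop at a time along this dimension until it matches.
        step = 1 if dst[dim] > current[dim] else -1
        while current[dim] != dst[dim]:
            current[dim] += step
            path.append(tuple(current))
    return path


if __name__ == "__main__":
    for hop in dor_path((0, 0), (2, 1)):
        print(hop)
```

Because every path finishes dimension 0 before entering dimension 1, two paths can never wait on each other in a cycle on an open mesh. Toroidal meshes add wrap-around links, which is why the text above recommends LASH for them.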

+

Routing References - +

To learn more about deadlock-free routing, see the article "Deadlock Free Message Routing in Multiprocessor Interconnection Networks" by William J Dally and Charles L Seitz (1985). - +

To learn more about the up/down algorithm, see the article "Effective Strategy to Compute Forwarding Tables for InfiniBand Networks" by Jose Carlos Sancho, Antonio Robles, and Jose Duato at the Universidad Politecnica de Valencia. - +

To learn more about LASH and the flexibility behind it, the requirement for layers, performance comparisons to other algorithms, see the following articles: - +

"Layered Routing in Irregular Networks", Lysne et al, IEEE Transactions on Parallel and Distributed Systems, VOL.16, No12, December 2005. - +

"Routing for the ASI Fabric Manager", Solheim et al. IEEE Communications Magazine, Vol.44, No.7, July 2006. - +

"Layered Shortest Path (LASH) Routing in Irregular System Area Networks", Skeie et al. IEEE Computer Society Communication Architecture for Clusters 2002. - - +

+

Modular Routing Engine - +

Modular routing engine structure allows for the ease of "plugging" new routing modules. - +

Currently, only unicast callbacks are supported. Multicast can be added later. - +

One existing routing module is up-down "updn", which may be activated with '-R updn' option (instead of old '-u'). - +

General usage is: $ opensm -R 'module-name' - +

There is also a trivial routing module which is able to load LFT tables from a file. - +

Main features: - -
 - this will load switch LFTs and/or LID matrices (min hops tables) +

+ - this will load switch LFTs and/or LID matrices (min hops tables)
 - this will load switch LFTs according to the path entries introduced
   in the file
 - no additional checks will be performed (such as "is port connected", @@ -3471,39 +3487,34 @@ Main features:
   LFTs correctly if endport GUIDs are represented in the file
   (in order to disable this, GUIDs may be removed from the file
    or zeroed) - +

The file format is compatible with the output of the 'ibroute' utility; for the whole fabric it can be generated with the dump_lfts.sh script. - +

To activate file based routing module, use: - -
  opensm -R file -U \path\to\lfts_file - +  opensm -R file -U /path/to/lfts_file +

If the lfts_file is not found or is in error, the default routing algorithm is utilized. - +

The ability to dump switch lid matrices (aka min hops tables) to file and later to load these is also supported. - +

The usage is similar to unicast forwarding tables loading from a lfts file (introduced by 'file' routing engine), but new lid matrix file -name should be specified by -M or --lid_matrix_file option. For example: - +name should be specified by -M or --lid_matrix_file option. For example:

  opensm -R file -M ./opensm-lid-matrix.dump - -The dump file is named 'opensm-lid-matrix.dump' and will be generated -in standard opensm dump directory (/var/log by default) when -OSM_LOG_ROUTING logging flag is set. - +

+The dump file is named 'opensm-lid-matrix.dump' and will be generated in +standard opensm dump directory (%TEMP% by default) when OSM_LOG_ROUTING logging flag is set. +

When routing engine 'file' is activated, but the lfts file is not specified or cannot be opened, the default lid matrix algorithm will be used. - +

There is also a switch forwarding tables dumper which generates a file compatible with dump_lfts.sh output. This file can be used as input for forwarding tables loading by 'file' routing engine. Both or one of options -U and -M can be specified together with '-R file'. - - 

FILES

@@ -3511,25 +3522,27 @@ Both or one of options -U and -M can be specified together with '-R file'.
default OpenSM config file. +

+

%ProgramFiles%\OFED\OpenSM\ib-node-name-map -
%ProgramFiles\OFED\OpenSM\ib-node-name-map.conf
-default node name map file. See ibnetdiscover for more information on format. - -
%ProgramFiles\OFED\OpenSM\partitions.conf +
+default node name map file. See ibnetdiscover for more information on format. +

+

%ProgramFiles%\OFED\OpenSM\partitions.conf
default partition config file - -
%ProgramFiles\OFED\OpenSM\qos-policy.conf +

+

%ProgramFiles%\OFED\OpenSM\qos-policy.conf
default QOS policy config file - -
%ProgramFiles\OFED\OpenSM\prefix-routes.conf +

+

%ProgramFiles%\OFED\OpenSM\prefix-routes.conf
default prefix routes file. - +

AUTHORS

@@ -3553,7 +3566,10 @@ default prefix routes file. <weiny2@llnl.gov>
Stan Smith
-<stan.smith@intel.com> +<stan.smith@intel.com>
+Dale Purdy
+    < purdy@sgi.com >
+

<return-to-top>

 

@@ -3564,165 +3580,199 @@ test program

osmtest cannot currently run on the same HCA port that OpenSM is using.

-

SYNOPSIS

- osmtest [-f(low) <c|a|v|s|e|f|m|q|t>] [-w(ait) <trap_wait_time>] [-d(ebug) - <number>] [-m(ax_lid) <LID in hex>] [-g(uid)[=]<GUID in hex>] [-p(ort)] [-i(nventory) - <filename>] [-s(tress)] [-M(ulticast_Mode)] [-t(imeout) <milliseconds>] [-l - | --log_file] [-v] [-vf <flags>] [-h(elp)]   -

DESCRIPTION

-

osmtest is a test program to validate InfiniBand subnet manager and - administration (SM/SA). Default is to run all flows with the exception of - the QoS flow. osmtest provides a test suite for opensm. osmtest has the - following capabilities and testing flows: It creates an inventory file of - all available Nodes, Ports, and PathRecords, including all their fields. It - verifies the existing inventory, with all the object fields, and matches it - to a pre-saved one. A Multicast Compliancy test. An Event Forwarding test. A - Service Record registration test. An RMPP stress test. A Small SA Queries - stress test. It is recommended that after installing opensm, the user should - run "osmtest -f c" to generate the inventory file, and immediately - afterwards run "osmtest -f a" to test OpenSM. Another recommendation for - osmtest usage is to create the inventory when the IB fabric is stable, and - occasionally run "osmtest -v" to verify that nothing has changed. -  

-

OPTIONS

-
-
-f, --flow
-
This option directs osmtest to run a specific flow:
-  FLOW  DESCRIPTION
-  c = create an inventory file with all nodes, ports and paths
-  a = run all validation tests (expecting an input inventory)
-  v = only validate the given inventory file
-  s = run service registration, deregistration, and lease test
-  e = run event forwarding test
-  f = flood the SA with queries according to the stress mode
-  m = multicast flow
-  q = QoS info: dump VLArb and SLtoVL tables
-  t = run trap 64/65 flow (this flow requires running of external tool) -
-  (default is all flows except QoS) -
-
-w, --wait
-
This option specifies the wait time for trap 64/65 in seconds It is - used only when running -f t - the trap 64/65 flow (default to 10 sec) -
-
-d, --debug
-
This option specifies a debug option. These options are not normally - needed. The number following -d selects the debug option to enable as - follows:
-  OPT   Description
-  ---    -----------------
-  -d0  - Ignore other SM nodes
-  -d1  - Force single threaded dispatching
-  -d2  - Force log flushing after each log message
-  -d3  - Disable multicast support -
-
-m, --max_lid
-
This option specifies the maximal LID number to be searched for - during inventory file build (default to 100) -
-
-g, --guid
-
This option specifies the local port GUID value with which OpenSM - should bind. OpenSM may be bound to 1 port at a time. If GUID given is - 0, OpenSM displays a list of possible port GUIDs and waits for user - input. Without -g, OpenSM trys to use the default port. -
-
-p, --port
-
This option displays a menu of possible local port GUID values with - which osmtest could bind -
-
-i, --inventory
-
This option specifies the name of the inventory file Normally, - osmtest expects to find an inventory file, which osmtest uses to - validate real-time information received from the SA during testing If -i - is not specified, osmtest defaults to the file 'osmtest.dat' See -c - option for related information -
-
-s, --stress
-
This option runs the specified stress test instead of the normal - test suite Stress test options are as follows:
-  OPT    Description
-  ---    -----------------
-  -s1  - Single-MAD (RMPP) response SA queries
-  -s2  - Multi-MAD (RMPP) response SA queries
-  -s3  - Multi-MAD (RMPP) Path Record SA queries
-  -s4  - Single-MAD (non RMPP) get Path Record SA queries Without -s, - stress testing is not performed -
-
-M, --Multicast_Mode
-
This option specify length of Multicast test:
-  OPT    Description
-  ---    -----------------
-  -M1  - Short Multicast Flow (default) - single mode
-  -M2  - Short Multicast Flow - multiple mode
-  -M3  - Long Multicast Flow - single mode
-  -M4  - Long Multicast Flow - multiple mode Single mode - Osmtest is - tested alone, with no other apps that interact with OpenSM MC Multiple - mode - Could be run with other apps using MC with OpenSM. Without -M, - default flow testing is performed -
-
-t, --timeout
-
This option specifies the time in milliseconds used for transaction - timeouts. Specifying -t 0 disables timeouts. Without -t, OpenSM defaults - to a timeout value of 200 milliseconds. -
-
-l, --log_file
-
This option defines the log to be the given file. By default the log - goes to stdout. -
-
-v, --verbose
-
This option increases the log verbosity level. The -v option may be - specified multiple times to further increase the verbosity level. See - the -vf option for more information about. log verbosity. -
-
-V
-
This option sets the maximum verbosity level and forces log - flushing. The -V is equivalent to '-vf 0xFF -d 2'. See the -vf option - for more information about. log verbosity. -
-
-vf
-
This option sets the log verbosity level. A flags field must follow - the -D option. A bit set/clear in the flags enables/disables a specific - log level as follows:
-  BIT    LOG LEVEL ENABLED
-  ----   -----------------
-  0x01 - ERROR (error messages)
-  0x02 - INFO (basic messages, low volume)
-  0x04 - VERBOSE (interesting stuff, moderate volume)
-  0x08 - DEBUG (diagnostic, high volume)
-  0x10 - FUNCS (function entry/exit, very high volume)
-  0x20 - FRAMES (dumps all SMP and GMP frames)
-  0x40 - ROUTING (dump FDB routing information)
-  0x80 - currently unused. Without -vf, osmtest defaults to ERROR + INFO - (0x3) Specifying -vf 0 disables all messages Specifying -vf 0xFF enables - all messages (see -V) High verbosity levels may require increasing the - transaction timeout with the -t option -
-
-h, --help
-
Display this usage info then exit.
-
-

AUTHORS

-
-
Hal Rosenstock
-
<hal.rosenstock@gmail.com> -
-
Eitan Zahavi
-
<eitan@mellanox.co.il> -
-
-
-

Examples

-

Note - osmtest will not run on the node where OpenSM is running.
- See 'osmtest -h' for all options.

-

Functionality:

-
-

osmtest -f c            +

SYNOPSIS

+ +osmtest + +[-f(low) <c|a|v|s|e|f|m|q|t>] [-w(ait) <trap_wait_time>] [-d(ebug) <number>] +[-m(ax_lid) <LID in hex>] [-g(uid)[=]<GUID in hex>] [-p(ort)] +[-i(nventory) <filename>] [-s(tress)] [-M(ulticast_Mode)] +[-t(imeout) <milliseconds>] [-l | --log_file] [-v] [-vf <flags>] +[-h(elp)] +

DESCRIPTION

+ +

+ +osmtest is a test program used to validate the correct operation of the InfiniBand subnet manager and +administration (SM/SA). +

+Default is to run all flows with the exception of the QoS flow. +

+osmtest provides a test suite for opensm. +

+osmtest has the following capabilities and testing flows: +

+It creates an inventory file of all available Nodes, Ports, and PathRecords, +including all their fields. +It verifies the existing inventory, with all the object fields, and matches it +to a pre-saved one. +A Multicast Compliancy test. +An Event Forwarding test. +A Service Record registration test. +An RMPP stress test. +A Small SA Queries stress test. +

+It is recommended that after installing opensm, the user should run +"osmtest -f c" to generate the inventory file, and +immediately afterwards run "osmtest -f a" to test OpenSM. +

+Another recommendation for osmtest usage is to create the inventory when the +IB fabric is stable, and occasionally +run "osmtest -f v" to verify that nothing has changed. +

OPTIONS

+ +

+

+ +

+
-f, --flow
+This option directs osmtest to run a specific flow: +
 FLOW  DESCRIPTION +
 c = create an inventory file with all nodes, ports and paths +
 a = run all validation tests (expecting an input inventory) +
 v = only validate the given inventory file +
 s = run service registration, deregistration, and lease test +
 e = run event forwarding test +
 f = flood the SA with queries according to the stress mode +
 m = multicast flow +
 q = QoS info: dump VLArb and SLtoVL tables +
 t = run trap 64/65 flow (this flow requires running of external tool) +
 (default is all flows except QoS) +
-w, --wait
+This option specifies the wait time for trap 64/65 in seconds. +It is used only when running -f t - the trap 64/65 flow +(default is 10 sec). +
-d, --debug
+This option specifies a debug option. +These options are not normally needed. +The number following -d selects the debug +option to enable as follows: +

+ OPT   Description +
 ---    ----------------- +
 -d0  - Ignore other SM nodes +
 -d1  - Force single threaded dispatching +
 -d2  - Force log flushing after each log message +
 -d3  - Disable multicast support +

-m, --max_lid
+This option specifies the maximal LID number to be searched +for during inventory file build (default to 100) +
-g, --guid
+This option specifies the local port GUID value +with which OpenSM should bind. OpenSM may be +bound to 1 port at a time. +If GUID given is 0, OpenSM displays a list +of possible port GUIDs and waits for user input. +Without -g, OpenSM tries to use the default port. +
-p, --port
+This option displays a menu of possible local port GUID values +with which osmtest could bind +
-i, --inventory
+This option specifies the name of the inventory file. +Normally, osmtest expects to find an inventory file, +which osmtest uses to validate real-time information +received from the SA during testing. +If -i is not specified, osmtest defaults to the file +'osmtest.dat'. +See the '-f c' flow for related information. +
-s, --stress
+This option runs the specified stress test instead +of the normal test suite +Stress test options are as follows: +

+ OPT    Description +
 ---    ----------------- +
 -s1  - Single-MAD (RMPP) response SA queries +
 -s2  - Multi-MAD (RMPP) response SA queries +
 -s3  - Multi-MAD (RMPP) Path Record SA queries +
 -s4  - Single-MAD (non RMPP) get Path Record SA queries +

+Without -s, stress testing is not performed +

-M, --Multicast_Mode
+This option specifies the length of the Multicast test: +

+ OPT    Description +
 ---    ----------------- +
 -M1  - Short Multicast Flow (default) - single mode +
 -M2  - Short Multicast Flow - multiple mode +
 -M3  - Long Multicast Flow - single mode +
 -M4  - Long Multicast Flow - multiple mode +

+Single mode - Osmtest is tested alone, with no other +apps that interact with OpenSM MC +

+Multiple mode - Could be run with other apps using MC with +OpenSM. Without -M, default flow testing is performed +

-t, --timeout
+This option specifies the time in milliseconds +used for transaction timeouts. +Specifying -t 0 disables timeouts. +Without -t, OpenSM defaults to a timeout value of +200 milliseconds. +
-l, --log_file
+This option defines the log to be the given file. +By default the log goes to stdout. +
-v, --verbose
+This option increases the log verbosity level. +The -v option may be specified multiple times +to further increase the verbosity level. +See the -vf option for more information about +log verbosity. +
-V
+This option sets the maximum verbosity level and +forces log flushing. +The -V is equivalent to '-vf 0xFF -d 2'. +See the -vf option for more information about +log verbosity. +
-vf
+This option sets the log verbosity level. +A flags field must follow the -vf option. +A bit set/clear in the flags enables/disables a +specific log level as follows: +

+ BIT    LOG LEVEL ENABLED +
 ----   ----------------- +
 0x01 - ERROR (error messages) +
 0x02 - INFO (basic messages, low volume) +
 0x04 - VERBOSE (interesting stuff, moderate volume) +
 0x08 - DEBUG (diagnostic, high volume) +
 0x10 - FUNCS (function entry/exit, very high volume) +
 0x20 - FRAMES (dumps all SMP and GMP frames) +
 0x40 - ROUTING (dump FDB routing information) +
 0x80 - currently unused. +

+Without -vf, osmtest defaults to ERROR + INFO (0x3). +Specifying -vf 0 disables all messages. +Specifying -vf 0xFF enables all messages (see -V). +High verbosity levels may require increasing +the transaction timeout with the -t option. +

-h, --help
+Display this usage info then exit. +
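The -vf flags value is a plain bitmask, so a value such as 0x3 can be read off bit by bit. A minimal sketch follows, assuming only the bit assignments listed under -vf above; the names LOG_LEVELS and decode_vf are made up for this example and are not part of osmtest.

```python
# Bit values as listed in the -vf option description; 0x80 is unused.
LOG_LEVELS = {
    0x01: "ERROR",
    0x02: "INFO",
    0x04: "VERBOSE",
    0x08: "DEBUG",
    0x10: "FUNCS",
    0x20: "FRAMES",
    0x40: "ROUTING",
}


def decode_vf(flags):
    """Return the names of the log levels enabled by a -vf flags value."""
    return [name for bit, name in sorted(LOG_LEVELS.items()) if flags & bit]


if __name__ == "__main__":
    for value in (0x0, 0x3, 0xFF):
        print(hex(value), decode_vf(value))
```

For instance, the default value 0x3 decodes to ERROR and INFO, and 0xFF enables every named level.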

+

+

AUTHORS

+ +
+
Hal Rosenstock
+<hal.rosenstock@gmail.com> + +
Eitan Zahavi
+<eitan@mellanox.co.il> + +
 
+

EXAMPLES

+

Note - osmtest will not run on the node where OpenSM is running.
See 'osmtest -h' for all options.

+

Functionality:

+
+

osmtest -f c            # creates osmtest.dat inventory file in the current directory; required by other osmtest runs.
- osmtest -f a            + osmtest -f v            + # validate the default inventory file 'osmtest.dat'.
osmtest -f a            # run all validation tests (expecting an input inventory file 'osmtest.dat' in the current folder).

-
-
+
+

Stress tests

osmtest -f  f -s1        #  @@ -4185,7 +4235,7 @@ include

DAT ENVIRONMENT:
-

DAT/DAPL 2.0 (free-build) libraries are identified in %SystemRoot% as +

DAT/DAPL 2.0 (free-build) libraries are identified in %SystemRoot%\System32 as dat2.dll and dapl2.dll.  Debug versions of the v2.0 runtime libraries are located in '%SystemDrive%\%ProgramFiles%\OFED'.

IA32 (aka, 32-bit) @@ -4195,9 +4245,9 @@ include

In order for DAT/DAPL programs to execute correctly, the runtime library files 'dat2.dll and dapl2.dll' must be present in one of the following folders: current - directory, %SystemRoot% or in the library search path.

+ directory, %SystemRoot%, %SystemRoot%\System32 or in the library search path.

The default OFED - installation places the runtime library files dat2.dll and dapl2.dll in the '%SystemRoot%' folder; + installation places the runtime library files dat2.dll and dapl2.dll in the '%SystemRoot%\System32' folder; symbol files (.pdb) are located in '%ProgramFiles%\OFED\'.

The default DAPL configuration file is defined as '%SystemDrive%\DAT\dat.conf'. This default @@ -4206,7 +4256,7 @@ include

Within the dat.conf file, the DAPL library specification can be located as the 5th whitespace separated line argument. By default the DAPL library file is installed as - '%SystemRoot%\dapl2.dll'.

+ '%SystemRoot%\System32\dapl2.dll'.

Should you choose to relocate the DAPL library file to a path where whitespace appears in the full library path specification, then the full library file specification @@ -4246,7 +4296,7 @@ include

  • File: - %windir%\dapl2.dll

  • + %windir%\System32\dapl2.dll

  • dat.conf Provider name: ibnic0v2

  • @@ -4265,7 +4315,7 @@ include

    Socket-CM Provider

    • -

      File: %windir%\dapl2-ofa-scm.dll

      +

      File: %windir%\System32\dapl2-ofa-scm.dll

    • dat.conf @@ -4289,7 +4339,7 @@ include

      RDMA-CM Provider

      • -

        File: %windir%\dapl2-ofa-cma.dll

        +

        File: %windir%\System32\dapl2-ofa-cma.dll

      • dat.conf Provider @@ -5352,21 +5402,21 @@ RDMA devices

        SYNOPSIS

        DESCRIPTION

        ibv_get_device_list() returns a NULL-terminated array of RDMA devices currently available. The argument num_devices is optional; if not NULL, -it is set to the number of devices returned in the array. +it is set to the number of devices returned in the array.

        ibv_free_device_list() frees the array of devices list returned by ibv_get_device_list().

        RETURN VALUE

        ibv_get_device_list() returns the array of available RDMA devices, or sets errno and returns NULL if the request fails. If no devices are found -then num_devices is set to 0, and non-NULL is returned. +then num_devices is set to 0, and non-NULL is returned.

        ibv_free_device_list() returns no value.

        ERRORS

        EPERM
        -
        Permission denied. +
        Permission denied.
        ENOSYS
        -
        No kernel support for RDMA. +
        No kernel support for RDMA.
        ENOMEM
        Insufficient memory to complete the operation.
        @@ -5376,7 +5426,7 @@ Client code should open all the devices it intends to use with ibv_open_device() before calling ibv_free_device_list(). Once it frees the array with ibv_free_device_list(), it will be able to use only the open devices; pointers to unopened devices will no longer be valid. -  + 

        SEE ALSO

        ibv_get_device_name, ibv_get_device_guid, @@ -5384,7 +5434,7 @@ the open devices; pointers to unopened devices will no longer be valid.

        IBV_GET_DEVICE_GUID


        NAME

        -ibv_get_device_guid - get an RDMA device's GUID +ibv_get_device_guid - get an RDMA device's GUID

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5393,7 +5443,7 @@ ibv_get_device_guid - get an RDMA device's GUID
 ibv_get_device_guid() returns the Global Unique IDentifier (GUID) of the 
         RDMA device device.

        RETURN VALUE

        ibv_get_device_guid() returns the GUID of the device in network byte -order.   +order.  
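"Network byte order" means the most significant byte of the 64-bit GUID comes first. The following sketch shows how eight such bytes map to a host integer and to the usual hexadecimal display form. It does not call the verbs library; the function names and the sample GUID bytes are assumptions chosen for illustration.

```python
import struct


def guid_from_network_bytes(raw):
    """Convert 8 bytes in network (big-endian) order to a host integer."""
    if len(raw) != 8:
        raise ValueError("a GUID is exactly 8 bytes")
    (value,) = struct.unpack("!Q", raw)  # '!' = network order, 'Q' = 64-bit
    return value


def format_guid(value):
    """Format a 64-bit GUID as a zero-padded hexadecimal string."""
    return "0x%016x" % value


if __name__ == "__main__":
    sample = bytes.fromhex("0002c90300001234")  # made-up GUID bytes
    print(format_guid(guid_from_network_bytes(sample)))
```

On a little-endian host, a C caller would apply an equivalent 64-bit byte swap before printing or comparing the value returned by ibv_get_device_guid().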

        SEE ALSO

        ibv_get_device_list, ibv_get_device_name, @@ -5420,7 +5470,7 @@ the request fails.

        SEE ALSO

        IBV_CLOSE_DEVICE


        NAME

        -ibv_open_device, ibv_close_device - open and close an RDMA device context +ibv_open_device, ibv_close_device - open and close an RDMA device context

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5430,16 +5480,16 @@ ibv_open_device, ibv_close_device - open and close an RDMA device context
         

        DESCRIPTION

        ibv_open_device() opens the device device and creates a context -for further use. +for further use.

        ibv_close_device() closes the device context context.

        RETURN VALUE

        ibv_open_device() returns a pointer to the allocated device context, or -NULL if the request fails. +NULL if the request fails.

        ibv_close_device() returns 0 on success, -1 on failure.

        NOTES

        ibv_close_device() does not release all the resources allocated using context context. To avoid resource leaks, the user should release all -associated resources before closing a context. +associated resources before closing a context.

        SEE ALSO

        ibv_get_device_list, ibv_query_device, @@ -5455,7 +5505,7 @@ associated resources before closing a context.

        NAME

        ibv_get_async_event, ibv_ack_async_event - get or acknowledge asynchronous -events   +events  

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5466,7 +5516,7 @@ events  
         

        DESCRIPTION

        ibv_get_async_event() waits for the next async event of the RDMA device context context and returns it through the pointer event, which is -an ibv_async_event struct, as defined in <infiniband/verbs.h>. +an ibv_async_event struct, as defined in <infiniband/verbs.h>.

        struct ibv_async_event {
         union {
        @@ -5539,21 +5589,21 @@ member of the structure. event_type will be one of the following events: 

ibv_ack_async_event() acknowledges the async event event.

        RETURN VALUE

        -ibv_get_async_event() returns 0 on success, and -1 on error. +ibv_get_async_event() returns 0 on success, and -1 on error.

        ibv_ack_async_event() returns no value.

        NOTES

        All async events that ibv_get_async_event() returns must be acknowledged using ibv_ack_async_event(). To avoid races, destroying an object (CQ, SRQ or QP) will wait for all affiliated events for the object to be acknowledged; this avoids an application retrieving an affiliated event after -the corresponding object has already been destroyed. +the corresponding object has already been destroyed.

        ibv_get_async_event() is a blocking function. If multiple threads call this function simultaneously, then when an async event occurs, only one thread will receive it, and it is not possible to predict which thread will receive it.

        EXAMPLES

        The following code example demonstrates one possible way to work with async -events in non-blocking mode. It performs the following steps: +events in non-blocking mode. It performs the following steps:

        1. Set the async events queue work mode to be non-blocked
        2. Poll the queue until it has an async event
        3. Get the async event and ack it

        @@ -5593,13 +5643,13 @@ ibv_ack_async_event(&async_event);

        SEE ALSO

        -ibv_open_device +ibv_open_device

         


        IBV_QUERY_DEVICE


        NAME

        -ibv_query_device - query an RDMA device's attributes   +ibv_query_device - query an RDMA device's attributes  

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5608,7 +5658,7 @@ ibv_query_device - query an RDMA device's attributes  
         

        DESCRIPTION

        ibv_query_device() returns the attributes of the device with context context. The argument device_attr is a pointer to an ibv_device_attr -struct, as defined in <infiniband/verbs.h>. +struct, as defined in <infiniband/verbs.h>.

        struct ibv_device_attr {
         char                    fw_ver[64];             /* FW version */
        @@ -5654,7 +5704,7 @@ uint8_t                 phys_port_cnt;          /* Number of physical ports */
         };

        RETURN VALUE

        ibv_query_device() returns 0 on success, or the value of errno on failure -(which indicates the failure reason).   +(which indicates the failure reason).  

        NOTES

        The maximum values returned by this function are the upper limits of supported resources by the device. However, it may not be possible to use these maximum @@ -5664,7 +5714,7 @@ permissions, and the amount of resources already in use by other users/processes.

        SEE ALSO

        ibv_open_device, ibv_query_port, ibv_query_pkey, -ibv_query_gid +ibv_query_gid

         


        IBV_QUERY_GID

        @@ -5683,13 +5733,13 @@ RETURN VALUE ibv_open_device, ibv_query_device, ibv_query_port, -ibv_query_pkey +ibv_query_pkey

         


        IBV_QUERY_PKEY


        NAME

        -ibv_query_pkey - query an InfiniBand port's P_Key table +ibv_query_pkey - query an InfiniBand port's P_Key table

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5699,7 +5749,7 @@ ibv_query_pkey - query an InfiniBand port's P_Key table
         ibv_query_pkey() returns the P_Key value (in network byte order) in entry
         index of port port_num for device context context through 
         the pointer pkey.

        RETURN VALUE

        -ibv_query_pkey() returns 0 on success, and -1 on error. +ibv_query_pkey() returns 0 on success, and -1 on error.

        SEE ALSO

        ibv_open_device, ibv_query_device, @@ -5709,7 +5759,7 @@ the pointer pkey.

        RETURN VALUE

        IBV_QUERY_PORT

        NAME

        -ibv_query_port - query an RDMA port's attributes +ibv_query_port - query an RDMA port's attributes

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5718,7 +5768,7 @@ ibv_query_port - query an RDMA port's attributes
         

        DESCRIPTION

        ibv_query_port() returns the attributes of port port_num for device context context through the pointer port_attr. The argument -port_attr is an ibv_port_attr struct, as defined in <infiniband/verbs.h>. +port_attr is an ibv_port_attr struct, as defined in <infiniband/verbs.h>.

        struct ibv_port_attr {
         enum ibv_port_state     state;          /* Logical port state */
        @@ -5743,7 +5793,7 @@ uint8_t                 phys_state;     /* Physical port state */
         };

        RETURN VALUE

        ibv_query_port() returns 0 on success, or the value of errno on failure -(which indicates the failure reason). +(which indicates the failure reason).

        SEE ALSO

        ibv_create_qp, ibv_destroy_qp, ibv_query_qp, @@ -5761,17 +5811,17 @@ SYNOPSIS int ibv_dealloc_pd(struct ibv_pd *pd);

        DESCRIPTION

        -ibv_alloc_pd() allocates a PD for the RDMA device context context. - +ibv_alloc_pd() allocates a PD for the RDMA device context context. +

        ibv_dealloc_pd() deallocates the PD pd.

        RETURN VALUE

        ibv_alloc_pd() returns a pointer to the allocated PD, or NULL if the -request fails. +request fails.

        ibv_dealloc_pd() returns 0 on success, or the value of errno on failure (which indicates the failure reason).  

        NOTES

        ibv_dealloc_pd() may fail if any other resource is still associated with -the PD being freed.   +the PD being freed.  

        SEE ALSO

        ibv_reg_mr, ibv_create_srq, ibv_create_qp, @@ -5783,7 +5833,7 @@ the PD being freed.  

        NAME

        ibv_reg_mr, ibv_dereg_mr - register or deregister a memory region (MR) -  + 

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5796,7 +5846,7 @@ ibv_reg_mr, ibv_dereg_mr - register or deregister a memory region (MR)
         protection domain pd. The MR's starting address is addr and its 
         size is length. The argument access describes the desired memory 
         protection attributes; it is either 0 or the bitwise OR of one or more of the 
        -following flags:
        +following flags:
         

        IBV_ACCESS_LOCAL_WRITE Enable Local Write Access
        @@ -5821,7 +5871,7 @@ request fails. The local key (L_Key) field lkey is used as the lkey field of struct ibv_sge when posting buffers with ibv_post_* verbs, and the the remote key (R_Key) field rkey is used by remote processes to perform Atomic and RDMA operations. The remote process places this rkey -as the rkey field of struct ibv_send_wr passed to the ibv_post_send function. +as the rkey field of struct ibv_send_wr passed to the ibv_post_send function.

        ibv_dereg_mr() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        @@ -5829,7 +5879,7 @@ as the rkey field of struct ibv_send_wr passed to the ibv_post_send function. SEE ALSO ibv_alloc_pd, ibv_post_send, ibv_post_recv, -ibv_post_srq_recv +ibv_post_srq_recv

         


        IBV_CREATE_AH

        @@ -5848,7 +5898,7 @@ SYNOPSIS

        DESCRIPTION

        ibv_create_ah() creates an address handle (AH) associated with the protection domain pd. The argument attr is an ibv_ah_attr struct, -as defined in <infiniband/verbs.h>. +as defined in <infiniband/verbs.h>.

        struct ibv_ah_attr {
         struct ibv_global_route grh;            /* Global Routing Header (GRH) attributes */
        @@ -5872,13 +5922,13 @@ uint8_t                 traffic_class;  /* Traffic class */
         

        ibv_destroy_ah() destroys the AH ah.

        RETURN VALUE

        ibv_create_ah() returns a pointer to the created AH, or NULL if the -request fails. +request fails.

        ibv_destroy_ah() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        SEE ALSO

        ibv_alloc_pd, ibv_init_ah_from_wc, -ibv_create_ah_from_wc +ibv_create_ah_from_wc

         


        IBV_CREATE_AH_FROM_WC

        @@ -5887,7 +5937,7 @@ failure (which indicates the failure reason).


        NAME

        ibv_init_ah_from_wc, ibv_create_ah_from_wc - initialize or create an address -handle (AH) from a work completion   +handle (AH) from a work completion  

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5904,19 +5954,19 @@ handle (AH) from a work completion  
         ibv_init_ah_from_wc() initializes the address handle (AH) attribute 
         structure ah_attr for the RDMA device context context using the 
         port number port_num, using attributes from the work completion wc 
        -and the Global Routing Header (GRH) structure grh.
        -
        +and the Global Routing Header (GRH) structure grh.
        +
         

        ibv_create_ah_from_wc() creates an AH associated with the protection domain pd using the port number port_num, using attributes from the work completion wc and the Global Routing Header (GRH) structure grh.

        RETURN VALUE

        -ibv_init_ah_from_wc() returns 0 on success, and -1 on error. +ibv_init_ah_from_wc() returns 0 on success, and -1 on error.

        ibv_create_ah_from_wc() returns a pointer to the created AH, or NULL if the request fails.  

        NOTES

        The filled structure ah_attr returned from ibv_init_ah_from_wc() -can be used to create a new AH using ibv_create_ah(). +can be used to create a new AH using ibv_create_ah().

        SEE ALSO

        ibv_open_device, ibv_alloc_pd, ibv_create_ah, @@ -5935,13 +5985,13 @@ completion event channel

        SYNOPSIS

        int ibv_destroy_comp_channel(struct ibv_comp_channel *channel);

        DESCRIPTION

        ibv_create_comp_channel() creates a completion event channel for the RDMA -device context context. - +device context context. +

        ibv_destroy_comp_channel() destroys the completion event channel channel.

        RETURN VALUE

        ibv_create_comp_channel() returns a pointer to the created completion -event channel, or NULL if the request fails. +event channel, or NULL if the request fails.

        ibv_destroy_comp_channel() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        @@ -5951,7 +6001,7 @@ Specification. A completion channel is essentially a file descriptor that is used to deliver completion notifications to a userspace process. When a completion event is generated for a completion queue (CQ), the event is delivered via the completion channel attached to that CQ. This may be useful to steer completion -events to different threads by using multiple completion channels. +events to different threads by using multiple completion channels.

        ibv_destroy_comp_channel() fails if any CQs are still associated with the completion event channel being destroyed.

        SEE ALSO

        @@ -5962,7 +6012,7 @@ the completion event channel being destroyed.


        NAME

        ibv_create_cq, ibv_destroy_cq - create or destroy a completion queue (CQ) -  + 

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -5979,17 +6029,17 @@ will be used to set user context pointer of the CQ structure. The argument 
         channel is optional; if not NULL, the completion channel channel will 
         be used to return completion events. The CQ will use the completion vector 
         comp_vector for signaling completion events; it must be at least zero and 
        -less than context->num_comp_vectors.
        -
        +less than context->num_comp_vectors.
        +
         

        ibv_destroy_cq() destroys the CQ cq.

        RETURN VALUE

        ibv_create_cq() returns a pointer to the CQ, or NULL if the request -fails. +fails.

        ibv_destroy_cq() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        ibv_create_cq() may create a CQ with size greater than or equal to the -requested size. Check the cqe attribute in the returned CQ for the actual size. +requested size. Check the cqe attribute in the returned CQ for the actual size.

        ibv_destroy_cq() fails if any queue pair is still associated with this CQ.

        SEE ALSO

        @@ -6000,7 +6050,7 @@ CQ.

        IBV_POLL_CQ


        NAME

        -ibv_poll_cq - poll a completion queue (CQ)   +ibv_poll_cq - poll a completion queue (CQ)  

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -6010,7 +6060,7 @@ ibv_poll_cq - poll a completion queue (CQ)  
         ibv_poll_cq() polls the CQ cq for work completions and returns the 
         first num_entries (or all available completions if the CQ contains fewer 
         than this number) in the array wc. The argument wc is a pointer to 
        -an array of ibv_wc structs, as defined in <infiniband/verbs.h>.
        +an array of ibv_wc structs, as defined in <infiniband/verbs.h>.
         

        struct ibv_wc {
         uint64_t                wr_id;          /* ID of the completed Work Request (WR) */
        @@ -6071,7 +6121,7 @@ size.

        RETURN VALUE

        requested size. The cqe member of cq will be updated to the actual size.

        SEE ALSO

        -ibv_create_cq ibv_destroy_cq +ibv_create_cq ibv_destroy_cq

         


        IBV_GET_CQ_EVENT

        @@ -6080,7 +6130,7 @@ SEE ALSO

        NAME

        ibv_get_cq_event, ibv_ack_cq_events - get and acknowledge completion queue (CQ) -events +events

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -6091,23 +6141,23 @@ events
         

        DESCRIPTION

        ibv_get_cq_event() waits for the next completion event in the completion event channel channel. It fills the arguments cq with the CQ that got -the event and cq_context with the CQ's context. +the event and cq_context with the CQ's context.

        ibv_ack_cq_events() acknowledges nevents events on the CQ cq.

        RETURN VALUE

        -ibv_get_cq_event() returns 0 on success, and -1 on error. +ibv_get_cq_event() returns 0 on success, and -1 on error.

        ibv_ack_cq_events() returns no value.  

        NOTES

        All completion events that ibv_get_cq_event() returns must be acknowledged using ibv_ack_cq_events(). To avoid races, destroying a CQ will wait for all completion events to be acknowledged; this guarantees a -one-to-one correspondence between acks and successful gets. +one-to-one correspondence between acks and successful gets.

        Calling ibv_ack_cq_events() may be relatively expensive in the datapath, since it must take a mutex. Therefore it may be better to amortize this cost by keeping a count of the number of events needing acknowledgement and acking several completion events in one call to ibv_ack_cq_events().
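The amortization described above can be sketched as a small counter that tells the caller when a batch of events is due for acknowledgement. This is a minimal sketch, not part of libibverbs: the names ack_batcher, ack_batcher_on_event and ack_batcher_flush are hypothetical, and the threshold is an assumed tuning value. In real code the caller would pass each non-zero return value to ibv_ack_cq_events() for the CQ in question.

```c
/* Hypothetical helper (not a libibverbs API): counts completion events
 * and reports when a batch should be acknowledged in a single call. */
struct ack_batcher {
    unsigned int pending;   /* events received but not yet acknowledged */
    unsigned int threshold; /* acknowledge once this many are pending */
};

/* Call once per successful ibv_get_cq_event().
 * Returns the number of events to acknowledge now, or 0 to keep waiting. */
static unsigned int ack_batcher_on_event(struct ack_batcher *b)
{
    unsigned int n = 0;

    b->pending++;
    if (b->pending >= b->threshold) {
        n = b->pending;
        b->pending = 0;
    }
    return n;
}

/* Returns whatever is still pending; call before destroying the CQ,
 * since every event returned must eventually be acknowledged. */
static unsigned int ack_batcher_flush(struct ack_batcher *b)
{
    unsigned int n = b->pending;

    b->pending = 0;
    return n;
}
```

With a threshold of 3, the first two events return 0 and the third returns 3, so the mutex inside ibv_ack_cq_events() is taken once per three events instead of once per event. The flush step matters because destroying a CQ waits for all of its events to be acknowledged.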

        EXAMPLES

        The following code example demonstrates one possible way to work with completion -events. It performs the following steps: +events. It performs the following steps:

        Stage I: Preparation
        1. Creates a CQ
        2. Requests notification upon a new (first) completion event

        @@ -6223,8 +6273,8 @@ SYNOPSIS int ibv_req_notify_cq(struct ibv_cq *cq, int solicited_only);

        DESCRIPTION

        ibv_req_notify_cq() requests a completion notification on the completion -queue (CQ) cq. - +queue (CQ) cq. +

        Upon the addition of a new CQ entry (CQE) to cq, a completion event will be added to the completion channel associated with the CQ. If the argument solicited_only is zero, a completion event is generated for any new CQE. @@ -6237,7 +6287,7 @@ successful send completion is unsolicited.

        ibv_req_notify_cq() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        The request for notification is "one shot." Only one completion event will be -generated for each call to ibv_req_notify_cq().   +generated for each call to ibv_req_notify_cq().  

        SEE ALSO

        ibv_create_comp_channel, ibv_create_cq, ibv_get_cq_event

         

        @@ -6251,7 +6301,7 @@ generated for each call to ibv_req_notify_cq().  

        NAME

        ibv_create_srq, ibv_destroy_srq - create or destroy a shared receive queue (SRQ) -  + 

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -6266,8 +6316,8 @@ ibv_create_srq, ibv_destroy_srq - create or destroy a shared receive queue (SRQ)
         int ibv_destroy_srq(struct ibv_srq *srq);

        DESCRIPTION

        ibv_create_srq() creates a shared receive queue (SRQ) associated with the -protection domain pd. - +protection domain pd. +

        ibv_create_xrc_srq() creates an XRC shared receive queue (SRQ) associated with the protection domain pd, the XRC domain xrc_domain and the CQ which will hold the XRC completion xrc_cq.

        @@ -6291,7 +6341,7 @@ max_wr and max_sge will be greater than or equal to the values requested.

        ibv_destroy_srq() destroys the SRQ srq.

        RETURN VALUE

        ibv_create_srq() returns a pointer to the created SRQ, or NULL if the -request fails. +request fails.

        ibv_destroy_srq() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        @@ -6312,7 +6362,7 @@ ibv_modify_srq - modify attributes of a shared receive queue (SRQ)

        SYNOPSIS

        DESCRIPTION

        ibv_modify_srq() modifies the attributes of SRQ srq with the attributes in srq_attr according to the mask srq_attr_mask. The -argument srq_attr is an ibv_srq_attr struct, as defined in <infiniband/verbs.h>. +argument srq_attr is an ibv_srq_attr struct, as defined in <infiniband/verbs.h>.

        struct ibv_srq_attr {
         uint32_t                max_wr;      /* maximum number of outstanding work requests (WRs) in the SRQ */
        @@ -6334,7 +6384,7 @@ following flags: 

        ibv_modify_srq() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        If any of the modify attributes is invalid, none of the attributes will be -modified. +modified.

        Not all devices support resizing SRQs. To check if a device supports it, check if the IBV_DEVICE_SRQ_RESIZE bit is set in the device capabilities flags.

        @@ -6356,7 +6406,7 @@ ibv_query_srq - get the attributes of a shared receive queue (SRQ)

        SYNOPSIS

        DESCRIPTION

        ibv_query_srq() gets the attributes of the SRQ srq and returns them through the pointer srq_attr. The argument srq_attr is an -ibv_srq_attr struct, as defined in <infiniband/verbs.h>. +ibv_srq_attr struct, as defined in <infiniband/verbs.h>.

        struct ibv_srq_attr {
         uint32_t                max_wr;         /* maximum number of outstanding work requests (WRs) in the SRQ */
        @@ -6368,7 +6418,7 @@ uint32_t                srq_limit;      /* the limit value of the SRQ */
         (which indicates the failure reason).

        NOTES

        If the value returned for srq_limit is 0, then the SRQ limit reached ("low watermark") event is not (or no longer) armed, and no asynchronous events will -be generated until the event is rearmed.   +be generated until the event is rearmed.  

        SEE ALSO

        ibv_create_srq, ibv_destroy_srq, @@ -6390,7 +6440,7 @@ This QP number should be passed to the remote node (sender). The remote node will use xrc_rcv_qpn in ibv_post_send() when sending to an XRC SRQ on this host in the same xrc domain as the XRC receive QP. This QP is created in kernel space, and persists until the last process registered for the QP calls -ibv_unreg_xrc_rcv_qp() (at which time the QP is destroyed). +ibv_unreg_xrc_rcv_qp() (at which time the QP is destroyed).

        The process which creates this QP is automatically registered for it, and should also call ibv_unreg_xrc_rcv_qp() at some point, to unregister.

        Processes which wish to receive on an XRC SRQ via this QP should call @@ -6436,7 +6486,7 @@ number xrc_qp_num which is associated with the XRC domain xrc_domainattr according to the mask attr_mask and move the QP state through the following transitions: Reset -> Init -> RTR. attr_mask should indicate all of the attributes which will be used in this -QP transition and the following masks (at least) should be set: +QP transition and the following masks (at least) should be set:

        Next state     Required attributes
         ----------     ----------------------------------------
        @@ -6532,17 +6582,17 @@ argument is either 0 or the bitwise OR of one or more of the following flags:
         

        RETURN VALUE

        ibv_modify_xrc_rcv_qp() returns 0 on success, or the value of errno on -failure (which indicates the failure reason). +failure (which indicates the failure reason).

        NOTES

        If any of the modify attributes or the modify mask are invalid, none of the -attributes will be modified (including the QP state). +attributes will be modified (including the QP state).

        Not all devices support alternate paths. To check if a device supports it, check if the IBV_DEVICE_AUTO_PATH_MIG bit is set in the device capabilities flags.

        SEE ALSO

        ibv_open_xrc_domain, ibv_create_xrc_rcv_qp, -ibv_query_xrc_rcv_qp +ibv_query_xrc_rcv_qp

         


        IBV_OPEN_XRC_DOMAIN

        @@ -6551,7 +6601,7 @@ capabilities flags.


        NAME

        ibv_open_xrc_domain, ibv_close_xrc_domain - open or close an eXtended Reliable -Connection (XRC) domain +Connection (XRC) domain

        SYNOPSIS

        #include <fcntl.h>
         #include <infiniband/verbs.h>
        @@ -6564,21 +6614,21 @@ Connection (XRC) domain
         context context or return a reference to an opened one. fd is the 
         file descriptor to be associated with the XRC domain. The argument oflag 
         describes the desired file creation attributes; it is either 0 or the bitwise OR 
        -of one or more of the following flags:
        +of one or more of the following flags:
         

        O_CREAT
        If a domain belonging to device named by context is already associated with the inode, this flag has no effect, except as noted under O_EXCL below. Otherwise, a new XRC domain is created and is associated with inode - specified by fd. - + specified by fd. +
        O_EXCL
        If O_EXCL and O_CREAT are set, open will fail if a domain associated with the inode exists. The check for the existence of the domain and creation of the domain if it does not exist is atomic with respect to - other processes executing open with fd naming the same inode. + other processes executing open with fd naming the same inode.

        If fd equals -1, no inode is associated with the domain, and the @@ -6587,12 +6637,12 @@ only valid value for oflag is O_CREAT.

        last reference, the XRC domain will be destroyed.

        RETURN VALUE

        ibv_open_xrc_domain() returns a pointer to an opened XRC domain, or NULL if the -request fails. +request fails.

        ibv_close_xrc_domain() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        Not all devices support XRC. To check if a device supports it, check if the -IBV_DEVICE_XRC bit is set in the device capabilities flags. +IBV_DEVICE_XRC
        bit is set in the device capabilities flags.

        ibv_close_xrc_domain() may fail if any QP or SRQ is still associated with the XRC domain being closed.

        SEE ALSO

        @@ -6606,7 +6656,7 @@ with the XRC domain being closed.

        IBV_QUERY_XRC_RCV_QP

        NAME

        -ibv_query_xrc_rcv_qp - get the attributes of an XRC receive queue pair (QP) +ibv_query_xrc_rcv_qp - get the attributes of an XRC receive queue pair (QP)

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -6618,7 +6668,7 @@ ibv_query_xrc_rcv_qp - get the attributes of an XRC receive queue pair (QP)
         for the XRC receive QP with the number xrc_qp_num which is associated 
         with the XRC domain xrc_domain and returns them through the pointers 
         attr and init_attr. The argument attr is an ibv_qp_attr 
        -struct, as defined in <infiniband/verbs.h>.
        +struct, as defined in <infiniband/verbs.h>.
         

        struct ibv_qp_attr {
         enum ibv_qp_state       qp_state;            /* Current QP state */
        @@ -6655,7 +6705,7 @@ For details on struct ibv_ah_attr see the description of ibv_create_ah().
         failure (which indicates the failure reason).

        NOTES

        The argument attr_mask is a hint that specifies the minimum list of attributes to retrieve. Some InfiniBand devices may return extra attributes not -requested, for example if the value can be returned cheaply. +requested, for example if the value can be returned cheaply.

        Attribute values are valid if they have been set using ibv_modify_xrc_rcv_qp(). The exact list of valid attributes depends on the QP state.

        @@ -6673,7 +6723,7 @@ sq_draining, ah_attr (if APM is enabled).


        NAME

        ibv_reg_xrc_rcv_qp, ibv_unreg_xrc_rcv_qp - register and unregister a user -process with an XRC receive queue pair (QP)   +process with an XRC receive queue pair (QP)  

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -6682,8 +6732,8 @@ process with an XRC receive queue pair (QP)  
         

        DESCRIPTION

        ibv_reg_xrc_rcv_qp() registers a user process with the XRC receive QP (created via ibv_create_xrc_rcv_qp() ) whose number is xrc_qp_num, -and which is associated with the XRC domain xrc_domain. - +and which is associated with the XRC domain xrc_domain. +

        ibv_unreg_xrc_rcv_qp() unregisters a user process from the XRC receive QP number xrc_qp_num, which is associated with the XRC domain xrc_domain. When the number of user processes registered with this XRC @@ -6695,8 +6745,8 @@ NOTES ibv_reg_xrc_rcv_qp() and ibv_unreg_xrc_rcv_qp() may fail if the number xrc_qp_num is not a number of a valid XRC receive QP (the QP is not allocated or it is the number of a non-XRC QP), or the XRC receive QP was -created with an XRC domain other than xrc_domain. - +created with an XRC domain other than xrc_domain. +

        If a process is still registered with any XRC RCV QPs belonging to some domain, ibv_close_xrc_domain() will return failure if called for that domain in that process.

        @@ -6725,7 +6775,7 @@ ibv_create_qp, ibv_destroy_qp - create or destroy a queue pair (QP)

        SYNOPSIS

        DESCRIPTION

        ibv_create_qp() creates a queue pair (QP) associated with the protection domain pd. The argument qp_init_attr is an ibv_qp_init_attr -struct, as defined in <infiniband/verbs.h>. +struct, as defined in <infiniband/verbs.h>.

        struct ibv_qp_init_attr {
         void                   *qp_context;     /* Associated context of the QP */
        @@ -6752,24 +6802,24 @@ created; the values will be greater than or equal to the values requested. 

        ibv_destroy_qp() destroys the QP qp.

        RETURN VALUE

        ibv_create_qp() returns a pointer to the created QP, or NULL if the -request fails. Check the QP number (qp_num) in the returned QP. +request fails. Check the QP number (qp_num) in the returned QP.

        ibv_destroy_qp() returns 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        ibv_create_qp() will fail if it is asked to create a QP of a type other -than IBV_QPT_RC or IBV_QPT_UD associated with an SRQ. +than IBV_QPT_RC or IBV_QPT_UD associated with an SRQ.

        The attributes max_recv_wr and max_recv_sge are ignored by ibv_create_qp() if the QP is to be associated with an SRQ.

        ibv_destroy_qp() fails if the QP is attached to a multicast group.

        SEE ALSO

        ibv_alloc_pd, ibv_modify_qp, -ibv_query_qp +ibv_query_qp

         


        IBV_MODIFY_QP


        NAME

        -ibv_modify_qp - modify the attributes of a queue pair (QP) +ibv_modify_qp - modify the attributes of a queue pair (QP)

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -6778,7 +6828,7 @@ ibv_modify_qp - modify the attributes of a queue pair (QP)
         

        DESCRIPTION

        ibv_modify_qp() modifies the attributes of QP qp with the attributes in attr according to the mask attr_mask. The argument -attr is an ibv_qp_attr struct, as defined in <infiniband/verbs.h>. +attr is an ibv_qp_attr struct, as defined in <infiniband/verbs.h>.

        struct ibv_qp_attr {
         enum ibv_qp_state       qp_state;               /* Move the QP to this state */
        @@ -6864,7 +6914,7 @@ argument is either 0 or the bitwise OR of one or more of the following flags:
         ibv_modify_qp() returns 0 on success, or the value of errno on failure 
         (which indicates the failure reason).

        NOTES

        If any of the modify attributes or the modify mask are invalid, none of the -attributes will be modified (including the QP state). +attributes will be modified (including the QP state).

        Not all devices support resizing QPs. To check if a device supports it, check if the IBV_DEVICE_RESIZE_MAX_WR bit is set in the device capabilities flags.

        @@ -6927,8 +6977,8 @@ SYNOPSIS with wr to the receive queue of the queue pair qp. It stops processing WRs from this list at the first failure (that can be detected immediately while requests are being posted), and returns this failing WR -through bad_wr. - +through bad_wr. +

        The argument wr is an ibv_recv_wr struct, as defined in <infiniband/verbs.h>.

        @@ -6946,11 +6996,11 @@ uint32_t lkey; /* Key of the local Memory Region */ };

        RETURN VALUE

        ibv_post_recv() returns 0 on success, or the value of errno on failure -(which indicates the failure reason). +(which indicates the failure reason).

        NOTES

        The buffers used by a WR can only be safely reused after the WR is fully executed and a work completion has been retrieved from the corresponding -completion queue (CQ). +completion queue (CQ).

        If the QP qp is associated with a shared receive queue, you must use the function ibv_post_srq_recv(), and not ibv_post_recv(), since the QP's own receive queue will not be used.

        @@ -6963,14 +7013,14 @@ list.

        SEE ALSO

        ibv_create_qp, ibv_post_send, ibv_post_srq_recv, -ibv_poll_cq +ibv_poll_cq

         

         


        IBV_POST_SEND


        NAME

        -ibv_post_send - post a list of work requests (WRs) to a send queue +ibv_post_send - post a list of work requests (WRs) to a send queue

        SYNOPSIS

        #include <infiniband/verbs.h>
         
        @@ -6981,8 +7031,8 @@ ibv_post_send - post a list of work requests (WRs) to a send queue
         with wr to the send queue of the queue pair qp. It stops 
         processing WRs from this list at the first failure (that can be detected 
         immediately while requests are being posted), and returns this failing WR 
        -through bad_wr.
        -
        +through bad_wr.
        +
         

        The argument wr is an ibv_send_wr struct, as defined in <infiniband/verbs.h>.

        @@ -7050,16 +7100,16 @@ following flags:

        IBV_SEND_INLINE Send data in given gather list as inline data
        in a send WQE. Valid only for Send and RDMA Write. The L_Key will not be - checked. + checked.

        RETURN VALUE

        ibv_post_send() returns 0 on success, or the value of errno on failure -(which indicates the failure reason). +(which indicates the failure reason).

        NOTES

        The user should not alter or destroy AHs associated with WRs until the request is fully executed and a work completion has been retrieved from the corresponding -completion queue (CQ) to avoid unexpected behavior. +completion queue (CQ) to avoid unexpected behavior.

        The buffers used by a WR can only be safely reused after the WR is fully executed and a work completion has been retrieved from the corresponding completion queue (CQ). However, if the IBV_SEND_INLINE flag was set, the buffer @@ -7086,8 +7136,8 @@ ibv_post_srq_recv - post a list of work requests (WRs) to a shared receive queue ibv_post_srq_recv() posts the linked list of work requests (WRs) starting with wr to the shared receive queue (SRQ) srq. It stops processing WRs from this list at the first failure (that can be detected immediately while -requests are being posted), and returns this failing WR through bad_wr. - +requests are being posted), and returns this failing WR through bad_wr. +

        The argument wr is an ibv_recv_wr struct, as defined in <infiniband/verbs.h>.

        @@ -7108,7 +7158,7 @@ uint32_t lkey; /* Key of the local Memory Region */ failure (which indicates the failure reason).

        NOTES

        The buffers used by a WR can only be safely reused after the WR is fully executed and a work completion has been retrieved from the corresponding -completion queue (CQ). +completion queue (CQ).

        If a WR is being posted to a UD QP, the Global Routing Header (GRH) of the incoming message will be placed in the first 40 bytes of the buffer(s) in the scatter list. If no GRH is present in the incoming message, then the first bytes @@ -7118,7 +7168,7 @@ list.
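The 40-byte GRH placement described above can be sketched as a helper that skips the reserved area to reach the message data. This is an illustration only: ud_payload and GRH_BYTES are hypothetical names, the receive buffer is assumed to be a single contiguous scatter entry, and byte_len is assumed to be the byte count reported by the work completion, including the 40 reserved bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes reserved for the GRH at the start of a UD receive buffer. */
#define GRH_BYTES 40

/* Hypothetical helper (not a libibverbs API): locate the message data in a
 * UD receive buffer. Returns NULL and a zero length if the reported byte
 * count is too small to contain the reserved GRH area. */
static const uint8_t *ud_payload(const uint8_t *buf, uint32_t byte_len,
                                 uint32_t *payload_len)
{
    if (byte_len < GRH_BYTES) {
        *payload_len = 0;
        return NULL;
    }
    *payload_len = byte_len - GRH_BYTES;
    return buf + GRH_BYTES;
}
```

Receive buffers posted for a UD QP therefore need to be at least 40 bytes larger than the largest expected message. A scatter list with several entries would need the offset applied across entry boundaries, which this sketch does not attempt.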

        SEE ALSO

        ibv_create_qp, ibv_post_send, ibv_post_recv, -ibv_poll_cq +ibv_poll_cq

         

         


        @@ -7134,7 +7184,7 @@ ibv_query_qp - get the attributes of a queue pair (QP)

        SYNOPSIS

        DESCRIPTION

        ibv_query_qp() gets the attributes specified in attr_mask for the QP qp and returns them through the pointers attr and init_attr. -The argument attr is an ibv_qp_attr struct, as defined in <infiniband/verbs.h>. +The argument attr is an ibv_qp_attr struct, as defined in <infiniband/verbs.h>.

        struct ibv_qp_attr {
         enum ibv_qp_state       qp_state;            /* Current QP state */
        @@ -7172,8 +7222,8 @@ For details on struct ibv_ah_attr see the description of ibv_create_ah().
         The argument attr_mask is a hint that specifies the minimum list of 
         attributes to retrieve. Some RDMA devices may return extra attributes not 
         requested, for example if the value can be returned cheaply. This has the same 
        -form as in ibv_modify_qp().
        -
        +form as in ibv_modify_qp().
        +
         

        Attribute values are valid if they have been set using ibv_modify_qp(). The exact list of valid attributes depends on the QP state.

        Multiple calls to ibv_query_qp() may yield some differences in the @@ -7199,15 +7249,15 @@ to/from a multicast group

        SYNOPSIS

        int ibv_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid);

        DESCRIPTION

        ibv_attach_mcast() attaches the QP qp to the multicast group -having MGID gid and MLID lid. - +having MGID gid and MLID lid. +

        ibv_detach_mcast() detaches the QP qp from the multicast group having MGID gid and MLID lid.

        RETURN VALUE

        ibv_attach_mcast() and ibv_detach_mcast() return 0 on success, or the value of errno on failure (which indicates the failure reason).

        NOTES

        Only QPs of Transport Service Type IBV_QPT_UD may be attached to -multicast groups. +multicast groups.

        If a QP is attached to the same multicast group multiple times, the QP will still receive a single copy of a multicast message.

        In order to receive multicast messages, a join request for the multicast @@ -7233,16 +7283,16 @@ mult_to_ibv_rate - convert multiplier of 2.5 Gbit/sec to an IB rate enumeration<

        DESCRIPTION

        ibv_rate_to_mult() converts the IB transmission rate enumeration rate to a multiple of 2.5 Gbit/sec (the base rate). For example, if rate is -IBV_RATE_5_GBPS, the value 2 will be returned (5 Gbit/sec = 2 * 2.5 Gbit/sec). +IBV_RATE_5_GBPS, the value 2 will be returned (5 Gbit/sec = 2 * 2.5 Gbit/sec).

        mult_to_ibv_rate() converts the multiplier value (of 2.5 Gbit/sec) mult to an IB transmission rate enumeration. For example, if mult is 2, the rate enumeration IBV_RATE_5_GBPS will be returned.
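The two conversions are inverse table lookups, which the following sketch illustrates. It is not the library's implementation: enum demo_rate and the demo_* functions are hypothetical stand-ins, only four rates are listed, and the enumerator values are placeholders that need not match those in <infiniband/verbs.h>. Real code should include that header and call ibv_rate_to_mult() and mult_to_ibv_rate() directly.

```c
/* Illustrative stand-in for the IB rate enumeration; values are
 * placeholders, not the ones defined in <infiniband/verbs.h>. */
enum demo_rate {
    DEMO_RATE_2_5_GBPS,
    DEMO_RATE_5_GBPS,
    DEMO_RATE_10_GBPS,
    DEMO_RATE_20_GBPS,
    DEMO_RATE_COUNT     /* number of rates; also used as "no such rate" */
};

/* Multiplier of the 2.5 Gbit/sec base rate for each enumerator above. */
static const int demo_mult[DEMO_RATE_COUNT] = { 1, 2, 4, 8 };

/* Rate enumeration -> multiple of the base rate, or -1 if out of range. */
static int demo_rate_to_mult(enum demo_rate rate)
{
    if ((int)rate < 0 || (int)rate >= DEMO_RATE_COUNT)
        return -1;
    return demo_mult[rate];
}

/* Multiplier -> rate enumeration, or DEMO_RATE_COUNT if none matches. */
static enum demo_rate demo_mult_to_rate(int mult)
{
    for (int i = 0; i < DEMO_RATE_COUNT; i++)
        if (demo_mult[i] == mult)
            return (enum demo_rate)i;
    return DEMO_RATE_COUNT;
}
```

As in the text, a 5 Gbit/sec rate maps to the multiplier 2 and back again. A multiplier with no matching rate (3, for example) yields the sentinel in this sketch; how the real functions treat unknown inputs is not specified here.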

        RETURN VALUE

        -ibv_rate_to_mult() returns the multiplier of the base rate 2.5 Gbit/sec. +ibv_rate_to_mult() returns the multiplier of the base rate 2.5 Gbit/sec.

        mult_to_ibv_rate() returns the enumeration representing the IB transmission rate.

        SEE ALSO

        -ibv_query_port +ibv_query_port

        <return-to-top>

        @@ -7622,7 +7672,7 @@ returned events must be acked before calling this function.   RDMA_RESOLVE_ADDR

        NAME

        -rdma_resolve_addr - Resolve destination and optional source addresses. +rdma_resolve_addr - Resolve destination and optional source addresses.

        SYNOPSIS

        #include <rdma/rdma_cma.h>

         int rdma_resolve_addr (struct rdma_cm_id *id, struct sockaddr *src_addr, @@ -7630,22 +7680,22 @@ rdma_cm_id *id, struct sockaddr *src_addr,ARGUMENTS

        id
        -
        RDMA identifier. +
        RDMA identifier.
        src_addr
        -
        Source address information. This parameter may be NULL. +
        Source address information. This parameter may be NULL.
        dst_addr
        -
        Destination address information. +
        Destination address information.
        timeout_ms
        -
        Time to wait for resolution to complete. +
        Time to wait for resolution to complete.

        DESCRIPTION

        Resolve destination and optional source addresses from IP addresses to an RDMA address. If successful, the specified rdma_cm_id will be bound to a local -device.   +device.  

        NOTES

        This call is used to map a given destination IP address to a usable RDMA address. The IP to RDMA address mapping is done using the local routing tables, @@ -7655,24 +7705,24 @@ given, and the rdma_cm_id has not yet been bound to a device, then the rdma_cm_id will be bound to a source address based on the local routing tables. After this call, the rdma_cm_id will be bound to an RDMA device. This call is typically made from the active side of a connection before calling -rdma_resolve_route and rdma_connect.   +rdma_resolve_route and rdma_connect.  

        INFINIBAND SPECIFIC

        This call maps the destination and, if given, source IP addresses to GIDs. In order to perform the mapping, IPoIB must be running on both the local and remote -nodes.   +nodes.  

        SEE ALSO

        rdma_create_id, rdma_resolve_route, rdma_connect, rdma_create_qp, rdma_get_cm_event, rdma_bind_addr, rdma_get_src_port, rdma_get_dst_port, rdma_get_local_addr, -rdma_get_peer_addr +rdma_get_peer_addr

         


        RDMA_GET_CM_EVENT


        NAME

        -rdma_get_cm_event - Retrieves the next pending communication event. +rdma_get_cm_event - Retrieves the next pending communication event.

        SYNOPSIS

        #include <rdma/rdma_cma.h>

         int rdma_get_cm_event (struct rdma_event_channel *channel, struct rdma_cm_event **event); @@ -7680,15 +7730,15 @@ rdma_event_channel *channel, struct rdma_cm_event **

        ARGUMENTS

        channel
        -
        Event channel to check for events. +
        Event channel to check for events.
        event
        -
        Allocated information about the next communication event. +
        Allocated information about the next communication event.

        DESCRIPTION

        Retrieves a communication event. If no events are pending, by default, the call -will block until an event is received. +will block until an event is received.

        NOTES

        The default synchronous behavior of this routine can be changed by modifying the file descriptor associated with the given channel. All events that are reported @@ -7696,36 +7746,36 @@ must be acknowledged by calling rdma_ack_cm_event. Destruction of an rdma_cm_id will block until related events have been acknowledged.

        EVENT DATA

        Communication event details are returned in the rdma_cm_event structure. This structure is allocated by the rdma_cm and released by the rdma_ack_cm_event -routine. Details of the rdma_cm_event structure are given below. +routine. Details of the rdma_cm_event structure are given below.
        id
        The rdma_cm identifier associated with the event. If the event type is RDMA_CM_EVENT_CONNECT_REQUEST, then this references a new id for that - communication. + communication.
        listen_id
        For RDMA_CM_EVENT_CONNECT_REQUEST event types, this references the - corresponding listening request identifier. + corresponding listening request identifier.
        event
        Specifies the type of communication event which occurred. See EVENT - TYPES below. + TYPES below.
        status
        Returns any asynchronous error information associated with an event. The - status is zero unless the corresponding operation failed. + status is zero unless the corresponding operation failed.
        param
        Provides additional details based on the type of event. Users should select the conn or ud subfields based on the rdma_port_space of the rdma_cm_id associated with the event. See UD EVENT DATA and CONN EVENT DATA - below. + below.

        UD EVENT DATA

        Event parameters related to unreliable datagram (UD) services: RDMA_PS_UDP and RDMA_PS_IPOIB. The UD event data is valid for RDMA_CM_EVENT_ESTABLISHED and -RDMA_CM_EVENT_MULTICAST_JOIN events, unless stated otherwise. +RDMA_CM_EVENT_MULTICAST_JOIN events, unless stated otherwise.
        private_data
        References any user-specified data associated with @@ -7733,19 +7783,19 @@ RDMA_CM_EVENT_MULTICAST_JOIN events, unless stated otherwise. referenced by this field matches that specified by the remote side when calling rdma_connect or rdma_accept. This field is NULL if the event does not include private data. The buffer referenced by this pointer is - deallocated when calling rdma_ack_cm_event. + deallocated when calling rdma_ack_cm_event.
        private_data_len
        The size of the private data buffer. Users should note that the size of the private data buffer may be larger than the amount of private data sent - by the remote side. Any additional space in the buffer will be zeroed out. + by the remote side. Any additional space in the buffer will be zeroed out.
        ah_attr
        Address information needed to send data to the remote endpoint(s). Users - should use this structure when allocating their address handle. + should use this structure when allocating their address handle.
        qp_num
        -
        QP number of the remote endpoint or multicast group. +
        QP number of the remote endpoint or multicast group.
        qkey
        QKey needed to send data to the remote endpoint(s).
        @@ -7754,115 +7804,115 @@ RDMA_CM_EVENT_MULTICAST_JOIN events, unless stated otherwise.

        CONN EVENT DATA

        Event parameters related to connected QP services: RDMA_PS_TCP. The connection related event data is valid for RDMA_CM_EVENT_CONNECT_REQUEST and -RDMA_CM_EVENT_ESTABLISHED events, unless stated otherwise. +RDMA_CM_EVENT_ESTABLISHED events, unless stated otherwise.
        private_data
        References any user-specified data associated with the event. The data referenced by this field matches that specified by the remote side when calling rdma_connect or rdma_accept. This field is NULL if the event does not include private data. The buffer referenced by this pointer is - deallocated when calling rdma_ack_cm_event. + deallocated when calling rdma_ack_cm_event.
        private_data_len
        The size of the private data buffer. Users should note that the size of the private data buffer may be larger than the amount of private data sent - by the remote side. Any additional space in the buffer will be zeroed out. + by the remote side. Any additional space in the buffer will be zeroed out.
        responder_resources
        The number of responder resources requested of the recipient. This field matches the initiator depth specified by the remote node when calling - rdma_connect and rdma_accept. + rdma_connect and rdma_accept.
        initiator_depth
        The maximum number of RDMA read/atomic operations that the recipient may have outstanding. This field matches the responder resources - specified by the remote node when calling rdma_connect and rdma_accept. + specified by the remote node when calling rdma_connect and rdma_accept.
        flow_control
        -
        Indicates if hardware level flow control is provided by the sender. +
        Indicates if hardware level flow control is provided by the sender.
        retry_count
        For RDMA_CM_EVENT_CONNECT_REQUEST events only, indicates the number of - times that the recipient should retry send operations. + times that the recipient should retry send operations.
        rnr_retry_count
        The number of times that the recipient should retry receiver not ready - (RNR) NACK errors. + (RNR) NACK errors.
        srq
        -
        Specifies if the sender is using a shared-receive queue. +
        Specifies if the sender is using a shared-receive queue.
        qp_num
        -
        Indicates the remote QP number for the connection. +
        Indicates the remote QP number for the connection.
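Because the private_data buffer is deallocated when rdma_ack_cm_event is called, and private_data_len may exceed the amount of data the remote side actually sent, an application that needs the data after the acknowledgment has to copy it first. The following is a minimal sketch of such a copy, not code from librdmacm: save_private_data is a hypothetical helper, and its src and src_len parameters merely stand in for the private_data and private_data_len fields described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption of this example, not a librdmacm call):
 * copy an event's private data into caller-owned storage so that it can
 * still be read after the event has been acknowledged.
 * Returns the number of bytes copied.
 */
size_t save_private_data(void *dst, size_t dst_len,
                         const void *src, size_t src_len)
{
    size_t n;

    /* A NULL destination is only acceptable when nothing can be copied. */
    assert(dst != NULL || dst_len == 0);

    /* private_data is NULL when the event carries no private data. */
    if (src == NULL || src_len == 0 || dst_len == 0)
        return 0;

    /* Never copy more than the destination can hold. */
    n = src_len < dst_len ? src_len : dst_len;
    memcpy(dst, src, n);
    return n;
}
```

Since any unused space in the reported buffer is zeroed, the copy may end with trailing zero bytes. If the exact message length matters, the peers would need to carry it inside the private data themselves.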

        EVENT TYPES

        -The following types of communication events may be reported. +The following types of communication events may be reported.
        RDMA_CM_EVENT_ADDR_RESOLVED
        -
        Address resolution (rdma_resolve_addr) completed successfully. +
        Address resolution (rdma_resolve_addr) completed successfully.
        RDMA_CM_EVENT_ADDR_ERROR
        -
        Address resolution (rdma_resolve_addr) failed. +
        Address resolution (rdma_resolve_addr) failed.
        RDMA_CM_EVENT_ROUTE_RESOLVED
        -
        Route resolution (rdma_resolve_route) completed successfully. +
        Route resolution (rdma_resolve_route) completed successfully.
        RDMA_CM_EVENT_ROUTE_ERROR
        -
        Route resolution (rdma_resolve_route) failed. +
        Route resolution (rdma_resolve_route) failed.
        RDMA_CM_EVENT_CONNECT_REQUEST
        Generated on the passive side to notify the user of a new connection - request. + request.
        RDMA_CM_EVENT_CONNECT_RESPONSE
        Generated on the active side to notify the user of a successful response to a connection request. It is only generated on rdma_cm_id's that do not - have a QP associated with them. + have a QP associated with them.
        RDMA_CM_EVENT_CONNECT_ERROR
        Indicates that an error has occurred trying to establish a - connection. May be generated on the active or passive side of a connection. + connection. May be generated on the active or passive side of a connection.
        RDMA_CM_EVENT_UNREACHABLE
        Generated on the active side to notify the user that the remote server - is not reachable or unable to respond to a connection request. + is not reachable or unable to respond to a connection request.
        RDMA_CM_EVENT_REJECTED
        Indicates that a connection request or response was rejected by the - remote end point. + remote end point.
        RDMA_CM_EVENT_ESTABLISHED
        Indicates that a connection has been established with the remote end - point. + point.
        RDMA_CM_EVENT_DISCONNECTED
        -
        The connection has been disconnected. +
        The connection has been disconnected.
        RDMA_CM_EVENT_DEVICE_REMOVAL
        The local RDMA device associated with the rdma_cm_id has been removed. - Upon receiving this event, the user must destroy the related rdma_cm_id. + Upon receiving this event, the user must destroy the related rdma_cm_id.
        RDMA_CM_EVENT_MULTICAST_JOIN
        The multicast join operation (rdma_join_multicast) completed - successfully. + successfully.
        RDMA_CM_EVENT_MULTICAST_ERROR
        An error either occurred joining a multicast group, or, if the group had already been joined, on an existing group. The specified multicast group is - no longer accessible and should be rejoined, if desired. + no longer accessible and should be rejoined, if desired.
        RDMA_CM_EVENT_ADDR_CHANGE
        The network device associated with this ID through address resolution changed its HW address, e.g. following a bonding failover. This event can serve as a hint for applications that want the links used for their RDMA - sessions to align with the network stack. + sessions to align with the network stack.
        RDMA_CM_EVENT_TIMEWAIT_EXIT
        The QP associated with a connection has exited its timewait state and is now ready to be re-used. After a QP has been disconnected, it is maintained in a timewait state to allow any in flight packets to exit the network. - After the timewait state has completed, the rdma_cm will report this event. + After the timewait state has completed, the rdma_cm will report this event.
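The event types above fall roughly into successful completions, failures, and state notifications. The following sketch shows one way an application might group them. It is self-contained on purpose: the ex_event enum and both functions are hypothetical stand-ins written for this example, not the definitions from rdma/rdma_cma.h, and the choice of which events count as failures is an assumption of the sketch.

```c
/* Local stand-ins for the event types listed above; NOT the real header. */
enum ex_event {
    EX_ADDR_RESOLVED, EX_ADDR_ERROR, EX_ROUTE_RESOLVED, EX_ROUTE_ERROR,
    EX_CONNECT_REQUEST, EX_CONNECT_RESPONSE, EX_CONNECT_ERROR, EX_UNREACHABLE,
    EX_REJECTED, EX_ESTABLISHED, EX_DISCONNECTED, EX_DEVICE_REMOVAL,
    EX_MULTICAST_JOIN, EX_MULTICAST_ERROR, EX_ADDR_CHANGE, EX_TIMEWAIT_EXIT
};

/* Returns 1 for events that report a failed operation, 0 otherwise. */
int ex_is_failure(enum ex_event ev)
{
    switch (ev) {
    case EX_ADDR_ERROR:
    case EX_ROUTE_ERROR:
    case EX_CONNECT_ERROR:
    case EX_UNREACHABLE:
    case EX_REJECTED:
    case EX_MULTICAST_ERROR:
        return 1;
    default:
        return 0;
    }
}

/*
 * Name of the call an active-side application would typically make next,
 * or an empty string when the text above names no specific follow-up call.
 */
const char *ex_next_call(enum ex_event ev)
{
    switch (ev) {
    case EX_ADDR_RESOLVED:  return "rdma_resolve_route";
    case EX_ROUTE_RESOLVED: return "rdma_connect";
    case EX_DEVICE_REMOVAL: return "rdma_destroy_id";
    default:                return "";
    }
}
```

In a real event loop these checks would be applied to the event field of the structure returned by rdma_get_cm_event, and every event would still be released with rdma_ack_cm_event regardless of how it is classified.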

        SEE ALSO

        @@ -7888,7 +7938,7 @@ rdma_cm_event *event);
        event
        -
        Event to be released. +
        Event to be released.
        @@ -7901,7 +7951,7 @@ should be a one-to-one correspondence between successful gets and acks. This call frees the event structure and any memory that it references.

        SEE ALSO

        -rdma_get_cm_event, rdma_destroy_id +rdma_get_cm_event, rdma_destroy_id

         

        @@ -8344,7 +8394,7 @@ address.

        SEE ALSO

        rdma_connect, rdma_accept, rdma_get_cm_event, rdma_get_src_port, rdma_get_local_addr, -rdma_get_peer_addr +rdma_get_peer_addr

         


        RDMA_GET_LOCAL_ADDR

        @@ -8440,14 +8490,14 @@ rdma_resolve_addr, rdma_create_qp, RDMA_LEAVE_MULTICAST

        NAME

        -rdma_leave_multicast - Leaves a multicast group. +rdma_leave_multicast - Leaves a multicast group.

        SYNOPSIS

        #include <rdma/rdma_cma.h>

         int rdma_leave_multicast (struct rdma_cm_id *id, struct sockaddr *addr);

        ARGUMENTS

        id
        -
        Communication identifier associated with the request. +
        Communication identifier associated with the request.
        addr
        Multicast address identifying the group to leave.
        @@ -8460,7 +8510,7 @@ multicast group may still be queued for completion processing immediately after leaving a multicast group. Destroying an rdma_cm_id will automatically leave all multicast groups.

        SEE ALSO

        rdma_join_multicast, -rdma_destroy_qp +rdma_destroy_qp

         


        RDMA_SET_OPTION

        -- 2.46.0