From 7fe7a865526544852e4b99683352e4b2ea174ee1 Mon Sep 17 00:00:00 2001 From: Tziporet Koren Date: Tue, 26 Feb 2008 10:44:49 +0200 Subject: [PATCH] Doc explaining how to sue QoS in OFED Signed-off-by: Yevgeny Kliteynik --- QoS_in_OFED.txt | 216 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 216 insertions(+) create mode 100755 QoS_in_OFED.txt diff --git a/QoS_in_OFED.txt b/QoS_in_OFED.txt new file mode 100755 index 0000000..72f9e83 --- /dev/null +++ b/QoS_in_OFED.txt @@ -0,0 +1,216 @@ + + QoS support in OFED + +============================================================================== +Table of contents +============================================================================== + +1. Overview +2. Architecture +3. Supported Policy +4. CMA functionality +5. IPoIB functionality +6. SDP functionality +7. RDS functionality +8. SRP functionality +9. iSER functionality +10. OpenSM functionality + + +============================================================================== +1. Overview +============================================================================== + +Quality of Service requirements stem from the realization of I/O consolidation +over IB network: As multiple applications and ULPs share the same fabric, +means to control their use of the network resources are becoming a must. +The basic need is to differentiate the service levels provided to different +traffic flows, such that a policy could be enforced and control each flow +utilization of the fabric resources. + +IBTA specification defined several hardware features and management interfaces +to support QoS: +* Up to 15 Virtual Lanes (VL) carry traffic in a non-blocking manner +* Arbitration between traffic of different VLs is performed by a 2 priority + levels weighted round robin arbiter. The arbiter is programmable with + a sequence of (VL, weight) pairs and maximal number of high priority credits + to be processed before low priority is served +* Packets carry class of service marking in the range 0 to 15 in their + header SL field +* Each switch can map the incoming packet by its SL to a particular output + VL based on programmable table VL=SL-to-VL-MAP(in-port, out-port, SL) +* The Subnet Administrator controls each communication flow parameters + by providing them as a response to Path Record (PR) or MultiPathRecord (MPR) + queries + +The IB QoS features provide the means to implement a DiffServ like +architecture. DiffServ architecture (IETF RFC 2474 & 2475) is widely used +today in highly dynamic fabrics. + +This document provides the detailed functional definition for the various +software elements that enable a DiffServ like architecture over the +OpenFabrics software stack. + + +============================================================================== +2. Architecture +============================================================================== + +QoS functionality is split between the SM/SA, CMA and the various ULPS. +We take the "chronology approach" to describe how the overall system works. + +2.1. The network manager (human) provides a set of rules (policy) that +define how the network is being configured and how its resources are split +to different QoS-Levels. The policy also define how to decide which QoS-Level +each application or ULP or service use. + +2.2. The SM analyzes the provided policy to see if it is realizable and +performs the necessary fabric setup. Part of this policy defines the default +QoS-Level of each partition. The SA is enhanced to match the requested Source, +Destination, QoS-Class, Service-ID, PKey against the policy, so clients +(ULPs, programs) can obtain a policy enforced QoS. The SM may also set up +partitions with appropriate IPoIB broadcast group. This broadcast group +carries its QoS attributes: SL, MTU, RATE, and Packet Lifetime. + +2.3. IPoIB is being setup. IPoIB uses the SL, MTU, RATE and Packet Lifetime +available on the multicast group which forms the broadcast group of this +partition. + +2.4. MPI which provides non IB based connection management should be +configured to run using hard coded SLs. It uses these SLs for every QP +being opened. + +2.5. ULPs that use CM interface (like SRP) have their own pre-assigned +Service-ID and use it while obtaining PathRecord/MultiPathRecord (PR/MPR) +for establishing connections. The SA receiving the PR/MPR matches it +against the policy and returns the appropriate PR/MPR including SL, MTU, +RATE and Lifetime. + +2.6. ULPs and programs (e.g. SDP) use CMA to establish RC connection provide +the CMA the target IP and port number. ULPs might also provide QoS-Class. +The CMA then creates Service-ID for the ULP and passes this ID and optional +QoS-Class in the PR/MPR request. The resulting PR/MPR is used for configuring +the connection QP. + +PathRecord and MultiPathRecord enhancement for QoS: + +As mentioned above the PathRecord and MultiPathRecord attributes are enhanced +to carry the Service-ID which is a 64bit value. A new field QoS-Class is also +provided. +A new capability bit describes the SM QoS support in the SA class port info. +This approach provides an easy migration path for existing access layer and +ULPs by not introducing new set of PR/MPR attributes. + + +============================================================================== +3. Supported Policy +============================================================================== + +The QoS policy that is specified in a separate file is divided into +4 sub sections: + +I) Port Group: a set of CAs, Routers or Switches that share the same settings. + A port group might be a partition defined by the partition manager policy, + list of GUIDs, or list of port names based on NodeDescription. + +II) Fabric Setup: Defines how the SL2VL and VLArb tables should be setup. + NOTE: In OFED 1.3 this part of the policy is ignored. SL2VL and VLArb + tables should be configured in the OpenSM options file + (opensm.opts). + +III) QoS-Levels Definition: This section defines the possible sets of + parameters for QoS that a client might be mapped to. Each set holds + SL and optionally: Max MTU, Max Rate, Packet Lifetime and Path Bits. + NOTE: Path Bits are not implemented in OFED 1.3 + +IV) Matching Rules: A list of rules that match an incoming PR/MPR request + to a QoS-Level. The rules are processed in order such as the first match + is applied. Each rule is built out of a set of match expressions which + should all match for the rule to apply. The matching expressions are + defined for the following fields: + - SRC and DST to lists of port groups + - Service-ID to a list of Service-ID values or ranges + - QoS-Class to a list of QoS-Class values or ranges + + +============================================================================== +4. CMA features +============================================================================== + +The CMA interface supports Service-ID through the notion of port space +as a prefixes to the port_num which is part of the sockaddr provided to +rdma_resolve_add(). +CMP also allows the ULP (like SDP) to propagate a request for specific +QoS-Class. CMA uses the provided QoS-Class and Service-ID in the sent PR/MPR. + + +============================================================================== +5. IPoIB +============================================================================== + +IPoIB queries the SA for its broadcast group information. +It provides the broadcast group SL, MTU, and RATE in every following +PathRecord query performed when a new UDAV is needed by IPoIB. + + +============================================================================== +6. SDP +============================================================================== + +SDP uses CMA for building its connections. +The Service-ID for SDP is 0x000000000001PPPP, where PPPP are 4 hex digits +holding the remote TCP/IP Port Number to connect to. + + +============================================================================== +7. RDS +============================================================================== + +RDS uses CMA and thus it is very close to SDP. The Service-ID for RDS is +0x000000000106PPPP, where PPPP are 4 hex digits holding the TCP/IP Port +Number that the protocol connects to. +Default port number for RDS is 0x48CA, which makes a default Service-ID +0x00000000010648CA. + + +============================================================================== +8. SRP +============================================================================== + +Current SRP implementation uses its own CM callbacks (not CMA). So SRP fills +in the Service-ID in the PR/MPR by itself and use that information in setting +up the QP. +SRP Service-ID is defined by the SRP target I/O Controller (it also complies +with IBTA Service-ID rules). The Service-ID is reported by the I/O Controller +in the ServiceEntries DMA attribute and should be used in the PR/MPR if the +SA reports its ability to handle QoS PR/MPRs. + + +============================================================================== +9. iSER +============================================================================== + +Similar to RDS, iSER also uses CMA. The Service-ID for iSER is similar to RDS +(0x000000000106PPPP), with default port number 0x035C, which makes a default +Service-ID 0x000000000106035C. + + +============================================================================== +10. OpenSM features +============================================================================== + +The QoS related functionality that is provided by OpenSM can be split into two +main parts: + +10.1. Fabric Setup +During fabric initialization the SM parses the policy and apply its settings +to the discovered fabric elements. + +10.2. PR/MPR query handling: +OpenSM enforces the provided policy on client request. +The overall flow for such requests is: first the request is matched against +the defined match rules such that the target QoS-Level definition is found. +Given the QoS-Level a path(s) search is performed with the given restrictions +imposed by that level. + +============================================================================== -- 2.41.0