From: Vladimir Sokolovsky
Date: Sun, 28 Mar 2010 07:34:07 +0000 (+0300)
Subject: Updated SRP/T documents
X-Git-Url: https://openfabrics.org/gitweb/?a=commitdiff_plain;h=66cbea9a3d11930756e0daf7370ba949053fb17b;p=compat-rdma%2Fdocs.git

Updated SRP/T documents

Signed-off-by: Vu Pham
Signed-off-by: Vladimir Sokolovsky
---

diff --git a/SRPT_README.txt b/SRPT_README.txt
index 98881d4..ac16fb7 100644
--- a/SRPT_README.txt
+++ b/SRPT_README.txt
@@ -24,12 +24,14 @@ Prerequisites
 -------------
 0. Supported distributions: RHEL 5.2/5.3/5.4, SLES 10 sp2/sp3, SLES 11
-NOTES: On distribution default kernels you can run scst_vdisk blockio mode
- to have good performance. You can also run scst_disk ie. scsi pass-thru
- mode; however, you have to compile scst with -DSTRICT_SERIALIZING
- enabled and this does not yield good performance.
- It is required to recompile the kernel to have good performance with
- scst_disk ie. scsi pass-thru mode
+NOTES: On distribution default kernels, you can run scst_vdisk blockio mode
+ to have good performance.
+
+ It is required to patch and recompile the kernel to run scst_disk
+ ie. scsi pass-thru mode
+ OR
+ You have to compile scst with -DSTRICT_SERIALIZING enabled and this
+ does not yield good performance.
 1. Download and install SCST driver (supported version 1.0.1.1)
@@ -41,7 +43,7 @@ NOTES: On distribution default kernels you can run scst_vdisk blockio mode
 $ tar zxvf scst-1.0.1.1.tar.gz
 $ cd scst-1.0.1.1
- THIS STEP IS SPECIFIC FOR SLES 10 sp2 distribution:
+ THIS STEP IS SPECIFIC FOR SLES 10 sp2/sp3 distributions:
 $ patch -p1 -i /docs/scst/scst_sles10_sp2.patch
@@ -49,7 +51,8 @@ NOTES: On distribution default kernels you can run scst_vdisk blockio mode
 $ make && make install
- NOTES: FOR SLES 11 distribution, skip this step and go directly to step (2)
+NOTES: FOR SLES 11 distribution, skip next step (step 1c) and go directly to
+ step (2)
 1c. patch scst.h header file with scst.patch
@@ -165,11 +168,11 @@ dgid=fe800000000000000002c90200226cf5,pkey=ffff,service_id=0002c90200226cf4 >
 OR
 + You can edit /etc/infiniband/openib.conf to load srp driver and srp HA daemon
-automatically ie. set SRP_LOAD=yes, and SRPHA_ENABLE=yes
+automatically ie. set SRP_LOAD=yes, SRP_DAEMON_ENABLE=yes, and SRPHA_ENABLE=yes
 + To set up and use high availability feature you need dm-multipath driver and
 multipath tool
-+ Please refer to OFED-1.x SRP's user manual for more in-details instructions
-on how-to enable/use HA feature
++ Please refer to OFED-1.5.1 SRP's user manual for more in-details instructions
+on how-to enable/use HA feature (OFED-1.5.1/docs/srp_release_notes.txt)
 Here is an example of srp target setup file
@@ -207,3 +210,14 @@ How-to unload/shutdown
 $ modprobe -r scst_vdisk scst
 3. Unload ofed
 $ /etc/rc.d/openibd stop
+
+===========================================================================
+Known Issues
+===========================================================================
+
+- With active connections/sesssions and active I/Os, unload ib_srpt driver
+ will randomly fail and got stuck.
+
+- With active connections/sessions with active I/Os, reboot system will
+ randomly get stuck.
+
diff --git a/srp_release_notes.txt b/srp_release_notes.txt
index 80cc543..62caa2c 100644
--- a/srp_release_notes.txt
+++ b/srp_release_notes.txt
@@ -1,8 +1,8 @@
 Open Fabrics Enterprise Distribution (OFED)
- SRP in OFED 1.5 Release Notes
+ SRP in OFED 1.5.1 Release Notes
- December 2009
+ March 2010
 ==============================================================================
@@ -10,7 +10,7 @@ Table of contents
 ==============================================================================
 1. Overview
- 2. Changes and Bug Fixes since OFED 1.3.1
+ 2. Changes and Bug Fixes since OFED 1.5
 3. Software Dependencies
 4. Major Features
 5. Loading SRP Initiator
@@ -34,7 +34,7 @@ target port using RDMA communication service.
 ==============================================================================
-2. Changes and Bug Fixes since OFED 1.3.1
+2. Changes and Bug Fixes since OFED 1.5
 ==============================================================================
 * Check for scsi_id in scmnd to prevent scan/rescan keep adding new scsi devices
 ie. echo "- - -" > /sys/class/scsi_host/hostXX/scan
@@ -80,6 +80,19 @@ NOTE: When loading the ib_srp module, it is possible to set the module
 b. edit /etc/modprobe.conf and add the following line:
 options ib_srp srp_sg_tablesize=32
+Module paramters:
+For the list of ib_srp module parameters
+ $ modinfo ib_srp
+
+ + srp_sg_tablesze: Max number of scatter/gather entries per I/O
+ + srp_dev_loss_tmo: Number of seconds that srp driver will not return
+ DID_NO_CONNECT status when it loss connection to target.
+ During this period, it will try to re-establish
+ the connection to target, and return DID_RESET,
+ DID_ABORT statuses for outstanding scsi command to
+ prevent DM Multipath driver to failover to next paths.
+ Default value is 60 seconds.
+
 ==============================================================================
 7. Manually Establishing an SRP Connection
 ==============================================================================
@@ -98,7 +111,6 @@ automatically.
 pkey=ffff,service_id=[service[0] value] > \
 /sys/class/infiniband_srp/srp-mthca[hca number]-[port number]/add_target
-
 Notes:
 a. Execution of the above "echo" command may take some time
 b. The SM must be running while the command executes
 c. It is possible to include additional parameters in the echo command:
@@ -110,12 +122,19 @@ automatically.
 d. See SRP Tools below for instructions on how the parameters in the
 echo command above may be obtained.
+NOTES:
+
+- Using the same *echo -n * more than one, the srp target
+ will terminate the previous connection and re-establish the new
+ connection. To have more than two connections to srp target, please use
+ different inititiator_ext values in echo command.
+
 - To list the new SCSI devices that have been added by the echo command, you
 may use either of the following two methods:
 a. Execute "fdisk -l". This command lists all devices; the new devices are
 included in this listing.
- b. Execute "dmesg" or look at /var/log/messages to find messages with the names
- of the new devices.
+ b. Execute *dmesg* or look at /var/log/messages to find messages with the
+ names of the new devices.
 ==============================================================================
@@ -143,7 +162,7 @@ ibsrpdm usage
 a. To detect all targets reachable by the SRP initiator via the default
 umad device (/dev/umad0), execute the following command:
- > ibsrpdm
+ $ ibsrpdm
 This command will output information on each SRP target detected, in
 human-readable form.
@@ -167,28 +186,31 @@ ibsrpdm usage
 b. To detect all the SRP Targets reachable by the SRP Initiator via
 another umad device, use the following command:
- > ibsrpdm -d
+ $ ibsrpdm -d
 2. Assistance in creating an SRP connection
 a. To generate output suitable for utilization in the "echo" command of
 section 5, add the "-c" option to ibsrpdm:
- >ibsrpdm -c
+ $ ibsrpdm -c
 Sample output:
 id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,
- dgid=fe800000000000000002c90200402bd5,pkey=ffff,service_id=200400a0b81146a1
+ dgid=fe800000000000000002c90200402bd5,pkey=ffff,
+ service_id=200400a0b81146a1
 b. To establish a connection with an SRP Target (Section 6) using the output
 from the "libsrpdm -c" example above, execute the following command:
- echo -n id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,
- dgid=fe800000000000000002c90200402bd5,pkey=ffff,service_id=200400a0b81146a1
- > /sys/class/infiniband_srp/srp-mthca0-1/add_target
+ $ echo -n id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,
+ dgid=fe800000000000000002c90200402bd5,pkey=ffff,
+ service_id=200400a0b81146a1
+ > /sys/class/infiniband_srp/srp-mlnx_0-1/add_target
- The SRP connection should now be up; the newly created SCSI devices should appear
- in the listing obtained from the "fdisk -l" command.
+ The SRP connection should now be up; the newly created SCSI devices should
+ appear in the listing obtained from the "fdisk -l" command.
+
 srp_daemon
 ----------
@@ -248,7 +270,7 @@ b. srp_daemon extensions to ibsrpdm
 srp_daemon -e -o. This utility will scan the fabric once, connect to
 every Target it detects, and then exit.
- NOTE: srp_daemon will follow the configuration it finds in
+NOTE: srp_daemon will follow the configuration it finds in
 /etc/srp_daemon.conf. Thus, it will ignore a target that is disallowed
 in the configuration file.
@@ -269,7 +291,8 @@ b. srp_daemon extensions to ibsrpdm
 - It is possible to configure this script to execute automatically when the
 InfiniBand driver starts by changing the value of SRP_DAEMON_ENABLE in
- /etc/infiniband/openib.conf to "yes".
+ /etc/infiniband/openib.conf to "yes" and SRP_LOAD to yes as well.
+
 Another option to to configure this script to execute automatically when the
 InfiniBand driver starts is by changing the value of SRPHA_ENABLE in
 /etc/infiniband/openib.conf to "yes". However, this option also enables
@@ -297,7 +320,8 @@ use the Target port GUID as the initiator_ext value for the relevant path.
 If you use srp_daemon with -n flag, it automatically assigns initiator_ext
 values according to this convention. For example:
- id_ext=200500A0B81146A1,ioc_guid=0002c90200402bec,dgid=fe800000000000000002c90200402bed,\
+ id_ext=200500A0B81146A1,ioc_guid=0002c90200402bec,
+ dgid=fe800000000000000002c90200402bed,
 pkey=ffff,service_id=200500a0b81146a1,initiator_ext=ed2b400002c90200
 Notes:
@@ -391,6 +415,7 @@ Initialization: (Execute after each boot of the driver)
 Automatic Activation of High Availability
 -----------------------------------------
 - Set the value of SRPHA_ENABLE in /etc/infiniband/openib.conf to "yes".
+ Also make sure SRP_LOAD=yes and SRP_DAEMON_ENABLE=yes.
 - From the next loading of the driver it will be possible to access the SRP
 LUNs on /dev/mapper/
@@ -404,33 +429,50 @@ Automatic Activation of High Availability
 12. Shutting Down SRP
 ==============================================================================
-SRP can be shutdown by using "rmmod ib_srp", or by stopping the OFED driver
+SRP can be shutdown by using "modprobe -r ib_srp", or by stopping the OFED
 ("/etc/init.d/openibd stop"), or as a by-product of a complete system shutdown.
-Prior to shutting down SRP, remove all references to it. The actions you need
-to take depend on the way SRP was loaded. There are three cases.
+Prior to shutting down SRP, it is REQUIRED to remove all references to it.
+The actions you need to take depend on the way SRP was loaded. There are
+three cases.
 a. Without High Availability
 ------------------------------------
-When working without High Availability, you should unmount the SRP
+When working without High Availability, you should unmount all SRP
 partitions that were mounted prior to shutting down SRP.
-
+For example, /dev/sdd1 is srp partition and mount to /mnt/test
+$ umount /mnt/test
+$ modprobe -r ib_srp
+
+NOTES: the umount may get stuck ~90 seconds per connection to target if the
+ target is down. This is due to the srp_dev_loss_tmo=60 seconds which
+ srp driver waits for the target coming back before returning error
+ status.
+ If you have shutdown/remove srp target and the host have 4 connections
+ to the SRP target, you should wait ~4-5 minutes for the unmount to exit.
+ Do not ctrl+c to kill umount process.
 b. After Manual Activation of High Availability
 -----------------------------------------------
 If you manually activated SRP High Availability, perform the following steps:
-1) Unmount all SRP partitions that were mounted
-2) Kill the SRP daemon instances
-3) Make sure there are no multipath instances running. If there are multiple
- instances, wait for them to end or kill them.
-4) Execute multipath -F
-
+- Unmount all SRP partitions that were mounted
+- Kill all SRP daemon instances.
+- Make sure there are no multipath instances running. If there are multiple
+ instances, wait for them to end or kill them.
+- Execute multipath -F
+
+Example:
+$ umount /mnt/test1 /mnt/test2 (wait for it to exit, do not ctrl+c)
+$ ps -ax and kill all srp_daemon processes.
+$ multipath -ll (wait for it to exit, do not ctrl+c)
+$ multipath -F
+$ modprobe -r ib_srp
 c. After Automatic Activation of High Availability
 --------------------------------------------------
 If SRP High Availability was automatically activated, SRP shutdown must be
 part of the driver shutdown ("/etc/init.d/openibd stop") which performs
-steps 2-4 of case b above. However, you still have to unmount all SRP
+steps 2-5 of case (b) above. However, you still have to unmount all SRP
 partitions that were mounted before driver shutdown.
@@ -493,6 +535,16 @@ should make sure this will not occur. One solution may be to stop "haldaemon"
 to a certain target, srp_daemon will ignore the target. If you find out that
 srp_daemon ignores a target, please check the /etc/srp_daemon.conf file.
+- Rebooting the system with unclean mounted filesystem and dead connection
+ to SRP target, the system may get stuck.
+
+- After establish the connection with srp target and rebooting the system,
+ initiator will fail to connect to target @ first manual *echo -n* command
+ (target reject with stale connection). You need to do *echo -n* one more
+ time.
+ You do not see this problem with srp_daemon mode since srp_daemon will
+ retry to connect.
+
 ==============================================================================
 14. Vendor Specific Notes
 ==============================================================================