-------------
0. Supported distributions: RHEL 5.2/5.3/5.4, SLES 10 sp2/sp3, SLES 11
-NOTES: On distribution default kernels you can run scst_vdisk blockio mode
- to have good performance. You can also run scst_disk ie. scsi pass-thru
- mode; however, you have to compile scst with -DSTRICT_SERIALIZING
- enabled and this does not yield good performance.
- It is required to recompile the kernel to have good performance with
- scst_disk ie. scsi pass-thru mode
+NOTES: On distribution default kernels, you can run scst_vdisk in blockio
+ mode to get good performance.
+
+ To run scst_disk, i.e. scsi pass-thru mode, you must either patch and
+ recompile the kernel
+ OR
+ compile scst with -DSTRICT_SERIALIZING enabled, which does not yield
+ good performance.
1. Download and install SCST driver (supported version 1.0.1.1)
$ tar zxvf scst-1.0.1.1.tar.gz
$ cd scst-1.0.1.1
- THIS STEP IS SPECIFIC FOR SLES 10 sp2 distribution:
+ THIS STEP IS SPECIFIC FOR SLES 10 sp2/sp3 distributions:
$ patch -p1 -i <path to OFED>/docs/scst/scst_sles10_sp2.patch
$ make && make install
- NOTES: FOR SLES 11 distribution, skip this step and go directly to step (2)
+NOTES: FOR SLES 11 distribution, skip next step (step 1c) and go directly to
+ step (2)
1c. patch scst.h header file with scst.patch
OR
+ You can edit /etc/infiniband/openib.conf to load srp driver and srp HA daemon
-automatically ie. set SRP_LOAD=yes, and SRPHA_ENABLE=yes
+automatically ie. set SRP_LOAD=yes, SRP_DAEMON_ENABLE=yes, and SRPHA_ENABLE=yes
+ To set up and use high availability feature you need dm-multipath driver
and multipath tool
-+ Please refer to OFED-1.x SRP's user manual for more in-details instructions
-on how-to enable/use HA feature
++ Please refer to OFED-1.5.1 SRP's user manual for more detailed instructions
+on how to enable/use the HA feature (OFED-1.5.1/docs/srp_release_notes.txt)
Here is an example of srp target setup file
$ modprobe -r scst_vdisk scst
3. Unload ofed
$ /etc/rc.d/openibd stop
+
+===========================================================================
+Known Issues
+===========================================================================
+
+- With active connections/sessions and active I/Os, unloading the ib_srpt
+ driver will randomly fail and get stuck.
+
+- With active connections/sessions and active I/Os, rebooting the system
+ will randomly get stuck.
+
Open Fabrics Enterprise Distribution (OFED)
- SRP in OFED 1.5 Release Notes
+ SRP in OFED 1.5.1 Release Notes
- December 2009
+ March 2010
==============================================================================
==============================================================================
1. Overview
- 2. Changes and Bug Fixes since OFED 1.3.1
+ 2. Changes and Bug Fixes since OFED 1.5
3. Software Dependencies
4. Major Features
5. Loading SRP Initiator
==============================================================================
-2. Changes and Bug Fixes since OFED 1.3.1
+2. Changes and Bug Fixes since OFED 1.5
==============================================================================
* Check for scsi_id in scmnd to prevent scan/rescan keep adding new scsi devices
ie. echo "- - -" > /sys/class/scsi_host/hostXX/scan
b. edit /etc/modprobe.conf and add the following line:
options ib_srp srp_sg_tablesize=32
+Module parameters:
+For the list of ib_srp module parameters
+ $ modinfo ib_srp
+
+ + srp_sg_tablesize: Max number of scatter/gather entries per I/O
+ + srp_dev_loss_tmo: Number of seconds during which the srp driver will
+ not return DID_NO_CONNECT status after it loses the
+ connection to the target. During this period, it tries
+ to re-establish the connection to the target and
+ returns DID_RESET, DID_ABORT statuses for outstanding
+ scsi commands to prevent the DM Multipath driver from
+ failing over to the next paths.
+ Default value is 60 seconds.
+
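As a hedged illustration of the two parameters above, the following sketch
composes a single options line for /etc/modprobe.conf. The helper name
ib_srp_options_line is hypothetical (it is not part of OFED), and the values
32 and 60 are simply the ones quoted in this document; confirm the parameter
names with "modinfo ib_srp" on your system before using them.

```shell
#!/bin/sh
# Hypothetical helper (not part of OFED): print an "options" line for
# /etc/modprobe.conf from a scatter/gather table size and a device-loss
# timeout in seconds.
ib_srp_options_line() {
    sg_tablesize=$1
    dev_loss_tmo=$2
    printf 'options ib_srp srp_sg_tablesize=%s srp_dev_loss_tmo=%s\n' \
        "$sg_tablesize" "$dev_loss_tmo"
}

# Print the line for review; add it to /etc/modprobe.conf by hand.
ib_srp_options_line 32 60
```

The line is only printed, so nothing on the host is changed until it is
copied into the configuration file and the driver is reloaded.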
==============================================================================
7. Manually Establishing an SRP Connection
==============================================================================
pkey=ffff,service_id=[service[0] value] > \
/sys/class/infiniband_srp/srp-mthca[hca number]-[port number]/add_target
- Notes:
a. Execution of the above "echo" command may take some time
b. The SM must be running while the command executes
c. It is possible to include additional parameters in the echo command:
d. See SRP Tools below for instructions on how the parameters in the
echo command above may be obtained.
+NOTES:
+
+- If the same *echo -n <same parameters>* is used more than once, the srp
+ target will terminate the previous connection and establish the new
+ connection. To have more than two connections to the srp target, please
+ use different initiator_ext values in the echo command.
+
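The note about initiator_ext can be sketched as follows. This is a minimal
example, not an OFED tool: the helper name srp_target_with_ext and the two
initiator_ext values are placeholders chosen for illustration, and the base
string is the sample target string used elsewhere in this document.

```shell
#!/bin/sh
# Hypothetical helper: append an initiator_ext value to a target string
# such as the one printed by "ibsrpdm -c".
srp_target_with_ext() {
    printf '%s,initiator_ext=%s\n' "$1" "$2"
}

base='id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,dgid=fe800000000000000002c90200402bd5,pkey=ffff,service_id=200400a0b81146a1'

# Two strings for the same target that differ only in initiator_ext.
# The values below are placeholders, not values required by the driver.
srp_target_with_ext "$base" 0000000000000001
srp_target_with_ext "$base" 0000000000000002
```

Each printed string would then be written to add_target with its own
*echo -n* command.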
- To list the new SCSI devices that have been added by the echo command, you
may use either of the following two methods:
a. Execute "fdisk -l". This command lists all devices; the new devices are
included in this listing.
- b. Execute "dmesg" or look at /var/log/messages to find messages with the names
- of the new devices.
+ b. Execute *dmesg* or look at /var/log/messages to find messages with the
+ names of the new devices.
==============================================================================
a. To detect all targets reachable by the SRP initiator via the default
umad device (/dev/umad0), execute the following command:
- > ibsrpdm
+ $ ibsrpdm
This command will output information on each SRP target detected, in
human-readable form.
b. To detect all the SRP Targets reachable by the SRP Initiator via
another umad device, use the following command:
- > ibsrpdm -d <umad device>
+ $ ibsrpdm -d <umad device>
2. Assistance in creating an SRP connection
a. To generate output suitable for utilization in the "echo" command of
section 5, add the "-c" option to ibsrpdm:
- >ibsrpdm -c
+ $ ibsrpdm -c
Sample output:
id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,
- dgid=fe800000000000000002c90200402bd5,pkey=ffff,service_id=200400a0b81146a1
+ dgid=fe800000000000000002c90200402bd5,pkey=ffff,
+ service_id=200400a0b81146a1
b. To establish a connection with an SRP Target (Section 6) using the output
from the "ibsrpdm -c" example above, execute the following command:
- echo -n id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,
- dgid=fe800000000000000002c90200402bd5,pkey=ffff,service_id=200400a0b81146a1
- > /sys/class/infiniband_srp/srp-mthca0-1/add_target
+ $ echo -n id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,
+ dgid=fe800000000000000002c90200402bd5,pkey=ffff,
+ service_id=200400a0b81146a1
+ > /sys/class/infiniband_srp/srp-mlnx_0-1/add_target
- The SRP connection should now be up; the newly created SCSI devices should appear
- in the listing obtained from the "fdisk -l" command.
+ The SRP connection should now be up; the newly created SCSI devices should
+ appear in the listing obtained from the "fdisk -l" command.
+
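When several targets are reported, the same step can be previewed for each
line of "ibsrpdm -c" output. The sketch below is a dry run under stated
assumptions: the helper name srp_print_add_commands is hypothetical, and the
add_target path is the one from the example above, which depends on the HCA
name and port number of your host.

```shell
#!/bin/sh
# Hypothetical helper: read target strings (one per line, as printed by
# "ibsrpdm -c") on stdin and print the matching add_target command for
# each. Nothing is written to sysfs; this is a dry run.
srp_print_add_commands() {
    add_target=$1
    while IFS= read -r target; do
        # Skip empty lines.
        [ -n "$target" ] || continue
        printf 'echo -n %s > %s\n' "$target" "$add_target"
    done
}

# Example with the sample output shown above; the add_target path is an
# assumption and must match the local HCA name and port number.
printf '%s\n' 'id_ext=200400A0B81146A1,ioc_guid=0002c90200402bd4,dgid=fe800000000000000002c90200402bd5,pkey=ffff,service_id=200400a0b81146a1' |
    srp_print_add_commands /sys/class/infiniband_srp/srp-mlnx_0-1/add_target
```

On a real host the input would come from "ibsrpdm -c" instead of printf, and
the printed commands can be reviewed before they are run.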
srp_daemon
----------
srp_daemon -e -o. This utility will scan the fabric once, connect to
every Target it detects, and then exit.
- NOTE: srp_daemon will follow the configuration it finds in
+NOTE: srp_daemon will follow the configuration it finds in
/etc/srp_daemon.conf. Thus, it will ignore a target that is disallowed in
the configuration file.
- It is possible to configure this script to execute automatically when the
InfiniBand driver starts by changing the value of SRP_DAEMON_ENABLE in
- /etc/infiniband/openib.conf to "yes".
+ /etc/infiniband/openib.conf to "yes" and setting SRP_LOAD to "yes" as well.
+
Another option to configure this script to execute automatically when the
InfiniBand driver starts is by changing the value of SRPHA_ENABLE in
/etc/infiniband/openib.conf to "yes". However, this option also enables
If you use srp_daemon with -n flag, it automatically assigns initiator_ext
values according to this convention. For example:
- id_ext=200500A0B81146A1,ioc_guid=0002c90200402bec,dgid=fe800000000000000002c90200402bed,\
+ id_ext=200500A0B81146A1,ioc_guid=0002c90200402bec,
+ dgid=fe800000000000000002c90200402bed,
pkey=ffff,service_id=200500a0b81146a1,initiator_ext=ed2b400002c90200
Notes:
Automatic Activation of High Availability
-----------------------------------------
- Set the value of SRPHA_ENABLE in /etc/infiniband/openib.conf to "yes".
+ Also make sure SRP_LOAD=yes and SRP_DAEMON_ENABLE=yes.
- From the next loading of the driver it will be possible to access the SRP
LUNs on /dev/mapper/
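A quick way to confirm the three settings named above is sketched here. The
helper name srp_ha_settings_ok is hypothetical, and the check assumes each
setting appears as an unquoted KEY=yes line at the start of a line in the
configuration file; adjust the pattern if your file is formatted differently.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if SRP_LOAD, SRP_DAEMON_ENABLE and
# SRPHA_ENABLE are all set to "yes" in the given configuration file.
srp_ha_settings_ok() {
    conf=$1
    for key in SRP_LOAD SRP_DAEMON_ENABLE SRPHA_ENABLE; do
        grep -q "^${key}=yes[[:space:]]*\$" "$conf" || {
            echo "$key is not set to yes in $conf"
            return 1
        }
    done
    echo "all SRP HA settings are enabled in $conf"
}

# Check the default location if it exists on this host.
if [ -f /etc/infiniband/openib.conf ]; then
    srp_ha_settings_ok /etc/infiniband/openib.conf
fi
```

The function stops at the first setting that is missing or not "yes", so the
message names the setting to fix.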
12. Shutting Down SRP
==============================================================================
-SRP can be shutdown by using "rmmod ib_srp", or by stopping the OFED driver
+SRP can be shut down by using "modprobe -r ib_srp", or by stopping the OFED
("/etc/init.d/openibd stop"), or as a by-product of a complete system shutdown.
-Prior to shutting down SRP, remove all references to it. The actions you need
-to take depend on the way SRP was loaded. There are three cases.
+Prior to shutting down SRP, it is REQUIRED to remove all references to it.
+The actions you need to take depend on the way SRP was loaded. There are
+three cases.
a. Without High Availability
------------------------------------
-When working without High Availability, you should unmount the SRP
+When working without High Availability, you should unmount all SRP
partitions that were mounted prior to shutting down SRP.
-
+For example, if /dev/sdd1 is an srp partition mounted on /mnt/test:
+$ umount /mnt/test
+$ modprobe -r ib_srp
+
+NOTES: The umount may get stuck for ~90 seconds per connection to the target
+ if the target is down. This is due to srp_dev_loss_tmo=60 seconds,
+ during which the srp driver waits for the target to come back before
+ returning an error status.
+ If you have shut down/removed the srp target and the host has 4
+ connections to the SRP target, you should wait ~4-5 minutes for the
+ umount to exit. Do not use ctrl+c to kill the umount process.
b. After Manual Activation of High Availability
-----------------------------------------------
If you manually activated SRP High Availability, perform the following steps:
-1) Unmount all SRP partitions that were mounted
-2) Kill the SRP daemon instances
-3) Make sure there are no multipath instances running. If there are multiple
- instances, wait for them to end or kill them.
-4) Execute multipath -F
-
+- Unmount all SRP partitions that were mounted
+- Kill all SRP daemon instances.
+- Make sure there are no multipath instances running. If there are multiple
+ instances, wait for them to end or kill them.
+- Execute multipath -F
+
+Example:
+$ umount /mnt/test1 /mnt/test2 (wait for it to exit, do not ctrl+c)
+$ ps -ax (find and kill all srp_daemon processes)
+$ multipath -ll (wait for it to exit, do not ctrl+c)
+$ multipath -F
+$ modprobe -r ib_srp
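The "find the srp_daemon processes" step can be sketched as a small filter
over ps output. This is an illustration only: the helper name
srp_daemon_pids is hypothetical, and it assumes the processes show up with
the command name srp_daemon, which may differ if the daemon is started
through a wrapper script.

```shell
#!/bin/sh
# Hypothetical helper: read "PID COMMAND" lines on stdin and print the PIDs
# whose command name is exactly srp_daemon.
srp_daemon_pids() {
    awk '$2 == "srp_daemon" { print $1 }'
}

# List candidate PIDs; review them before passing any to kill.
ps -e -o pid= -o comm= | srp_daemon_pids
```

The sketch only lists PIDs, so the kill itself remains a deliberate manual
step.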
c. After Automatic Activation of High Availability
--------------------------------------------------
If SRP High Availability was automatically activated, SRP shutdown must be
part of the driver shutdown ("/etc/init.d/openibd stop") which performs
-steps 2-4 of case b above. However, you still have to unmount all SRP
+steps 2-5 of case (b) above. However, you still have to unmount all SRP
partitions that were mounted before driver shutdown.
to a certain target, srp_daemon will ignore the target. If you find out
that srp_daemon ignores a target, please check the /etc/srp_daemon.conf file.
+- If the system is rebooted with an unclean mounted filesystem and a dead
+ connection to the SRP target, it may get stuck.
+
+- After establishing a connection with the srp target and rebooting the
+ system, the initiator will fail to connect to the target on the first
+ manual *echo -n* command (the target rejects it as a stale connection).
+ You need to run *echo -n* one more time.
+ You do not see this problem in srp_daemon mode since srp_daemon will
+ retry the connection.
+
==============================================================================
14. Vendor Specific Notes
==============================================================================