Updating SCSI targets while in a production environment.
It still amazes me to see storage administrators bring the Microsoft Windows mentality to UNIX and Linux environments. That is, after changing a configuration, “reboot the console to view all changes.” While Microsoft Windows does a fairly decent job of picking up changes made to the SCSI Subsystem, UNIX and GNU/Linux handle it somewhat differently. Rebooting the console should be the LAST thing anybody does. These operating systems are so modular that in most cases there is absolutely no need to reboot unless you have made changes to the kernel. If you are working in a production environment, downtime means lost time, which results in lost revenue. So when you update your SAN, how do you manage your storage configuration changes on your GNU/Linux 2.6 and Sun Solaris hosts?
GNU/Linux
There is a noteworthy variable in the design of the SCSI Subsystem in the Linux 2.6 kernel which some may find problematic, although I believe this is how it should be. That variable lies in the Lower Layer of the Subsystem, where the Host Bus Adapter (HBA) modules reside. While it is true that the Linux 2.6 kernel supports hotplugging, SCSI devices included, the HBA modules are designed in a way that would lead a novice storage administrator to believe otherwise.
For example, let us say I am working with Fibre Channel (FC) devices and I use a QLogic, Emulex or LSI FC HBA. I have Logical Units (LUs) mapped to the Fibre Channel Node Ports on the host’s HBA. When I insert the modules for a QLogic qla2340 (qla2300/qla2xxx), all Logical Unit Numbers (LUNs) are recognized by the SCSI Subsystem and udev immediately assigns them the appropriate node names (/dev/sda, /dev/sdb, etc.). With the FC HBA at least, if the LUN mappings later change and I add or remove devices, the HBA will not report any changes to the host and therefore the SCSI Subsystem is not updated. There are a few ways to get the HBA to report changes to the GNU/Linux host. The first is a reboot, which accomplishes the same thing as the second: reloading the module so that it reports the latest LUN mapping(s) to the SCSI Subsystem. The third, and the most appropriate, does not require any downtime. It takes a simple command. To add a device:
echo "scsi add-single-device 1 2 3 4" > /proc/scsi/scsi
And to remove it:
echo "scsi remove-single-device 1 2 3 4" > /proc/scsi/scsi
Where 1=host; 2=channel; 3=id; 4=lun.
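If you do this often, the two commands are easy to wrap. What follows is only a minimal sketch, not anything that ships with the kernel: the function names scsi_cmd and scsi_apply and the SCSI_PROC override are my own inventions for illustration, and it assumes a kernel that still exposes /proc/scsi/scsi. It checks that the four values are numeric before anything is written.

```sh
#!/bin/sh
# Hypothetical helpers; only the command strings come from the text above.

# Print the /proc/scsi/scsi command for an action (add|remove)
# and a host/channel/id/lun address.
scsi_cmd() {
    action=$1
    shift
    case $action in
        add|remove) ;;
        *) echo "usage: scsi_cmd add|remove host channel id lun" >&2; return 2 ;;
    esac
    [ $# -eq 4 ] || { echo "need host channel id lun" >&2; return 2; }
    for n in "$@"; do
        case $n in
            ''|*[!0-9]*) echo "not a number: $n" >&2; return 2 ;;
        esac
    done
    echo "scsi ${action}-single-device $1 $2 $3 $4"
}

# Write the command to the proc interface (needs root).
# SCSI_PROC is an assumed override so the write can go to a scratch file.
scsi_apply() {
    cmd=$(scsi_cmd "$@") || return
    echo "$cmd" > "${SCSI_PROC:-/proc/scsi/scsi}"
}
```

Run as root, scsi_apply add 1 2 3 4 has the same effect as the echo above. Pointing SCSI_PROC at a scratch file lets you see what would be written without touching the host.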
In the SCSI portion of the kernel source, the file drivers/scsi/scsi_proc.c contains the routine that takes these inputs, parses them, verifies that the target being added or removed is a valid one, and then performs the action. That routine is:
static ssize_t proc_scsi_write(struct file *file, const char __user *buf, size_t length, loff_t *ppos) { ... }
Sounds simple, right? It is. No reboot or re-insertion of the module is necessary. I mentioned earlier that I prefer this method over a more dynamic approach like the one seen in Microsoft Windows. Allow me to explain. When you are hosting storage for an enterprise environment, anything outside of a static configuration can produce hazardous results. A great example came up when I was working with LSI Logic to add a missing feature to their Serial Attached SCSI (SAS) HBA device drivers: the ability to disable hotplugging. To my knowledge, the patches I submitted are still in place to this day. When the storage configuration updates itself without the administrator’s knowledge, even in response to flaky symptoms (e.g. a drive drops offline and comes back online again), it can bring down an entire server.
Let us say I have 2 LUNs mapped, which udev designated as /dev/sda and /dev/sdb. I mount these devices at /mnt/mnt1 and /mnt/mnt2. Now let us say that the hotplug feature in the HBA’s device driver is enabled and, for some reason, the LUN behind /dev/sda drops offline for a few seconds. Who knows, the external storage controller may be acting up. It happens all too frequently, which is why Multipathing with Failover capabilities is a must. The mount point associated with that device (/mnt/mnt1) is still mounted and holds /dev/sda, preventing udev from removing the node. Meanwhile the SAS HBA sees a “new” drive (i.e. the drive that momentarily dropped offline) and goes through the usual process of making the device accessible to the host.
Now, wait a minute: /dev/sda is still taken and mounted at /mnt/mnt1. What happens now? A new device node is allocated, /dev/sdc, and the path to the drive changes. /mnt/mnt1 would have to be unmounted and /dev/sdc mounted there instead. But the administrator still has no knowledge of the change, at least not until he/she reviews the kernel logs and notices nothing but SCSI disk I/O errors from attempted writes to the original node.
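One way to catch this sooner than the kernel logs, sketched under a few assumptions: the helper names and the SYSFS_ROOT override are hypothetical, and it assumes a 2.6 sysfs that exposes a state file for each SCSI disk at /sys/block/<disk>/device/state. It looks up which device sits behind a mount point and prints the state the SCSI layer reports for it.

```sh
#!/bin/sh
# Hypothetical helpers for spotting a mount whose SCSI disk is unhealthy.

# Print the device mounted at a mount point, read from a mounts table
# (default /proc/mounts; the second argument is an assumed override).
mount_source() {
    awk -v mp="$1" '$2 == mp { print $1; exit }' "${2:-/proc/mounts}"
}

# Print the state sysfs reports for a disk such as "sda"
# ("running" when healthy), or "missing" if there is no state file.
scsi_state() {
    f="${SYSFS_ROOT:-/sys}/block/$1/device/state"
    if [ -r "$f" ]; then cat "$f"; else echo "missing"; fi
}

# Report the device and state behind a mount point.
# Returns non-zero unless the state is "running".
check_mount() {
    dev=$(mount_source "$1" "$2")
    [ -n "$dev" ] || { echo "$1: not mounted"; return 1; }
    # /dev/sda1 -> sda: drop the /dev/ prefix and any partition number
    disk=$(printf '%s' "${dev#/dev/}" | sed 's/[0-9]*$//')
    state=$(scsi_state "$disk")
    echo "$1: $dev is $state"
    [ "$state" = "running" ]
}
```

check_mount /mnt/mnt1 prints a line such as "/mnt/mnt1: /dev/sda is offline" and returns non-zero whenever the state is anything other than running, so it can be dropped into a cron job or a monitoring check.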
Now there are ways around this: by working with udev, you can lock devices with specific attributes to a specific device node. When hotplugging is not a feature, or it is disabled, and a device drops offline for a quick moment, no changes to the configuration are reported to the SCSI layer, and when the device comes back online it resumes its original role.
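As a rough illustration of the udev approach: rather than pinning the kernel name itself, a common variant is to have udev add a stable symlink keyed on the disk's serial and mount that instead. The sketch below just prints such a rule. The function name, the serial and the symlink name are placeholders, and it assumes a udev version that populates ID_SERIAL before this rule runs.

```sh
#!/bin/sh
# Hypothetical helper: print a udev rule that adds a stable symlink
# for the whole disk carrying the given serial.
#   $1 = serial as reported by udev (placeholder value in the example)
#   $2 = symlink name relative to /dev (placeholder value in the example)
udev_pin_rule() {
    [ $# -eq 2 ] || { echo "usage: udev_pin_rule serial symlink" >&2; return 2; }
    printf 'SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}=="%s", SYMLINK+="%s"\n' "$1" "$2"
}

# Example with made-up values:
#   udev_pin_rule 3600a0b80001234 san/lun0
```

Redirect the output into a file under /etc/udev/rules.d/ (a name like 99-san.rules is only a suggestion), then mount /dev/san/lun0 instead of /dev/sda.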
Sun Solaris
Now working with Solaris is a bit different. Let us say that you changed your SAN configuration and whatever has been mapped to your Sun box’s FC HBAs has been modified. Sometimes it is as simple as running one command. At the command prompt you would first type:
devfsadm
This will update all changes in the SCSI layer. So now when you type format, your new devices will appear. And what if they don’t? The very handy utility luxadm comes into play. First list all of your HBA ports and their status:
luxadm -e port
The command traverses the /devices path (similar to the Linux /sys path of sysfs) and produces a list of results that looks similar to this:
/devices/pci@7,600000/SUNW,qlc@2/fp@0,0:devctl CONNECTED
Now what you want to do is force a LIP (Loop Initialization Primitive, in FC terminology) on each FC port:
luxadm -e forcelip /devices/pci@7,600000/SUNW,qlc@2/fp@0,0:devctl
Type format again, and now you SHOULD see the added disk device(s).
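The whole sequence can be strung together in a short script. Treat this as a sketch: connected_ports and rescan_fc are names I made up, and the parsing assumes luxadm -e port prints one path and status per line as in the sample above. It runs devfsadm, then forces a LIP on every port reported as CONNECTED.

```sh
#!/bin/sh
# Hypothetical helpers wrapping the Solaris steps described above.

# Read `luxadm -e port` output on stdin and print only the paths
# whose status column is exactly CONNECTED.
connected_ports() {
    awk '$2 == "CONNECTED" { print $1 }'
}

# Rebuild the device nodes, then force a LIP on every connected port.
# Needs root. Unlike the manual steps, this always forces the LIP
# instead of checking format first.
rescan_fc() {
    devfsadm || return
    # plain `read` (no -r) so this also runs under the older Bourne shell
    luxadm -e port | connected_ports | while read port; do
        echo "forcing LIP on $port"
        luxadm -e forcelip "$port"
    done
}
```

Calling rescan_fc as root and then typing format should show the same result as the manual steps.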

