dd if=/dev/random of=/dev/blog

21. March 2009

device-mapper (dm): working with multipath-tools. Part 1

Filed under: Storage, SCSI, Linux — admin @ 10:33

Device-mapper (hereafter, dm) is one of the best collections of device drivers that I have ever worked with. It brings high availability, flexibility and more to the Linux 2.6 kernel. Device-mapper is a Linux 2.6 kernel infrastructure that provides a generic way to create virtual layers on top of a block device, with support for striping, mirroring, snapshots, concatenation, multipathing, etc. While many modules are built on top of device-mapper, the focus of this article is on multipath-tools. Note that I will be using the terms multipath, multipath-tools and dm-multipath interchangeably to signify the same package. Also note that dm-multipath is the name under which Red Hat repackages and redistributes multipath-tools in its Advanced Server Linux distribution.

Device-mapper multipath provides the following features (taken from HP dm-multipath reference guide):

  • Allows the multivendor Storage RAID systems and host servers equipped with multivendor Host Bus Adapters (HBAs) redundant physical connectivity along the independent Fibre Channel fabric paths available
  • Monitors each path and automatically reroutes (failover) I/O to an available functioning alternate path if an existing connection fails
  • Provides an option to perform fail-back of the LUN to the repaired paths
  • Implements failover or failback actions transparently without disrupting applications
  • Monitors each path and notifies if there is a change in the path status
  • Facilitates the load balancing among the multiple paths
  • Provides CLI with display options to configure and manage Multipath features
  • Provides all Device Mapper Multipath features support for any LUN newly added to the host
  • Provides an option to have customized names for the Device Mapper Multipath devices
  • Provides persistency to the Device Mapper Multipath devices across reboots if there is any change in the Storage Area Network
  • Provides policy based path grouping for the user to customize the I/O flow through specific set of paths

Installing multipath-tools

Installing multipath-tools is usually as simple as going to your distribution's repository, finding the package and selecting it for installation. You can always download the source and build it yourself, but most likely your distribution already has it in its repository. Again, note that multipath-tools runs on top of device-mapper, so you will need device-mapper installed in order to use multipath-tools.

Configuring multipath-tools

The two key components to manage and monitor in the multipath-tools package are the multipath.conf file and the multipathd daemon. The file holds the configuration to load, and the daemon monitors the paths. After the multipath-tools package has been installed, multipath.conf may already be present in /etc. If not, you can always search for an existing template, which in some distributions can be found in the following directories:

Red Hat –

$ cd /usr/share/doc/device-mapper-multipath-<version no.>/
$ cp multipath.conf.defaults /etc/multipath.conf

SuSE –

$ cd /usr/share/doc/packages/multipath-tools/
$ cp multipath.conf.synthetic /etc/multipath.conf
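If you would rather script the copy, here is a minimal sketch of the idea. The function names first_template and install_template are my own and not part of multipath-tools, and I am assuming you do not want to overwrite a multipath.conf that already exists.

```shell
#!/bin/sh
# Hypothetical helpers, not part of multipath-tools.

# Print the first candidate path that exists as a regular file.
first_template() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Copy a template to the target, but only when the target is missing.
# $1 is the target file; the remaining arguments are candidate templates.
install_template() {
    target=$1
    shift
    # Leave an existing configuration untouched.
    [ -e "$target" ] && return 0
    template=$(first_template "$@") || return 1
    cp "$template" "$target"
}
```

As root, it could be called with the two template locations mentioned above. The glob standing in for the version number is an assumption about how the directory is named on your system:

install_template /etc/multipath.conf /usr/share/doc/device-mapper-multipath-*/multipath.conf.defaults /usr/share/doc/packages/multipath-tools/multipath.conf.synthetic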

To edit the multipath.conf file simply open it up in a text editor:

$ vim /etc/multipath.conf

This is just an example. Your multipath.conf file may be configured differently to accommodate the features and limitations of the external data storage that you are working with.

# Blacklist all devices by default. Remove this to enable multipathing
# on the default devices.
#devnode_blacklist {
#       devnode "*"
#}
##
## This is a template multipath-tools configuration file
## Uncomment the lines relevant to your environment
##
defaults {
        multipath_tool  "/sbin/multipath -v0"
        udev_dir        /dev
        polling_interval 10
        default_selector        "round-robin 0"
#       default_path_grouping_policy    multibus
        default_path_grouping_policy    failover
        default_getuid_callout  "/sbin/scsi_id -g -u -s /block/%n"
        default_prio_callout    "/bin/true"
#       default_features        "0"
        rr_min_io               100
        path_checker            tur
        failback                3
        no_path_retry      2
}
devnode_blacklist {
        wwid 26353900f02796769
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][0-9]*"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}

In older versions of this multipath.conf file there is a known typo in the blacklist section. Make sure that you correct it:
devnode "^hd[a-z][[0-9]*]" should read devnode "^hd[a-z][0-9]*"

Please understand what you set before activating the dm multipathed disk devices. For example, default_path_grouping_policy is set to failover and not multibus. That means that no matter how many LUN paths I have to the same logical volume, only one is active at any given time. If the active path were to fail, I/O would fail over to a secondary defined path. Multibus, by contrast, simply sends I/O requests across all paths that are marked as active and have not failed. My path_checker is Test Unit Ready (TUR), a low-level SCSI command (opcode 0x00) that validates that the SCSI unit is ready to accept I/O requests. Also supported as path checkers are readsector0 and directio. Here is a guide to some of these field definitions.

Another extremely important field in the multipath.conf file is the blacklist. It tells multipath-tools to skip any device that matches one of its entries (a WWID or a devnode pattern) when scanning and grouping devices into device-mapper for multipathing.
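To get a feel for what the devnode patterns will catch before you activate anything, you can try them against device names by hand. The sketch below is only an approximation: I am assuming that grep -E's extended regular expressions behave closely enough to multipath's own matcher, and the function name is_blacklisted is mine. It uses two of the patterns from the example file, including the corrected hd pattern.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a device node name matches one of the
# devnode patterns from the example blacklist, non-zero otherwise.
# grep -E is used as a stand-in for multipath's own regex matching.
is_blacklisted() {
    printf '%s\n' "$1" | grep -E -q \
        -e '^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*' \
        -e '^hd[a-z][0-9]*'
}

# Print a short verdict for every device name given as an argument.
report_blacklist() {
    for dev in "$@"; do
        if is_blacklisted "$dev"; then
            echo "$dev: blacklisted"
        else
            echo "$dev: eligible"
        fi
    done
}
```

Running report_blacklist sda hda1 loop0 should report sda as eligible and the other two as blacklisted, which is what you want: the SCSI disks stay visible to multipath while IDE disks and loop devices are ignored.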

There is so much more to multipath.conf, and I know I am only scratching the surface, but there is a wealth of information out there to help you understand the details buried within. I must admit, though, that the coolest feature is that you can define settings for device-specific environments. If you are working with a specific model of Compaq, Mylex or even Xyratex storage device, its settings can be defined separately without interfering with any other connected storage device. Here is an example taken from the default multipath.conf file:

#       device {
#               vendor                  "COMPAQ  "
#               product                 "HSV110 (C)COMPAQ"
#               path_grouping_policy    multibus
#               getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
#               path_checker            readsector0
#               path_selector           "round-robin 0"
#               hardware_handler        "0"
#               failback                15
#               rr_weight               priorities
#               no_path_retry           queue
#       }

Obviously your definition would not be commented out.

Earlier I had mentioned the multipathd daemon. You can start or stop the daemon in the following ways:

Red Hat –

$ service multipathd start
$ service multipathd stop

SuSE -

$ /etc/init.d/boot.multipath start
$ /etc/init.d/multipathd start
$ /etc/init.d/boot.multipath stop
$ /etc/init.d/multipathd stop

Note that multipathd will not function properly until all the required modules are loaded. In my case these are dm_round_robin, dm_mirror, dm_multipath and dm_mod.
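A quick way to check this before starting the daemon is to look for the modules in /proc/modules. This is a rough sketch: the function name missing_modules is mine, the module list is simply the one from my setup, and a driver compiled directly into the kernel will not show up in /proc/modules, so it would be reported as missing even though it is available.

```shell
#!/bin/sh
# Modules needed in my setup; adjust the list for your own environment.
REQUIRED_MODULES="dm_mod dm_multipath dm_round_robin dm_mirror"

# Hypothetical helper: print every required module that is not loaded.
# $1 is optional and defaults to /proc/modules.
missing_modules() {
    modules_file=${1:-/proc/modules}
    for mod in $REQUIRED_MODULES; do
        # Each line of /proc/modules starts with the module name and a space.
        grep -q "^${mod} " "$modules_file" || echo "$mod"
    done
}
```

As root, the missing ones could then be loaded with something like: for m in $(missing_modules); do modprobe "$m"; done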

Scanning the SCSI bus for multipath devices

To have the utility scan or update the nodes on the SCSI bus/channel(s), type the following command:

$ multipath -v2
create: 32000000bb55555cd
[size=27 GB][features="0"][hwhandler="0"]
\_ round-robin 0
\_ 30:0:0:0 sda 8:0   [ready]
\_ round-robin 0
\_ 31:0:0:0 sde 8:64  [ready]
create: 32001000bb55555cd
[size=27 GB][features="0"][hwhandler="0"]
\_ round-robin 0
\_ 30:0:0:1 sdb 8:16  [ready]
\_ round-robin 0
\_ 31:0:0:1 sdf 8:80  [ready]
create: 32002000bb55555cd
[size=92 GB][features="0"][hwhandler="0"]
\_ round-robin 0
\_ 30:0:0:2 sdc 8:32  [ready]
\_ round-robin 0
\_ 31:0:0:2 sdg 8:96  [ready]
create: 32003000bb55555cd
[size=92 GB][features="0"][hwhandler="0"]
\_ round-robin 0
\_ 30:0:0:3 sdd 8:48  [ready]
\_ round-robin 0
\_ 31:0:0:3 sdh 8:112 [ready]

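Everything gets grouped according to WWID into a dm device. Multiple LUN mappings of the same logical drive (LD) will be given an alias (read below for these aliases).
To kill this mapping table you can simply run the command below. Be aware that it attempts to remove every device-mapper device on the system, not only the multipath maps: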

$ dmsetup remove_all

Alternatively, remove each entry individually:

$ dmsetup remove /dev/mapper/32003000bb55555cd
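If other device-mapper devices live on the same host, it can be safer to review what will be removed first. The sketch below is a dry run under a couple of assumptions: that your dmsetup supports "dmsetup ls --target multipath" to list only multipath maps, and that each output line starts with the map name. The function name removal_commands is mine. Nothing is executed; the function only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: read `dmsetup ls --target multipath` output on stdin
# and print one `dmsetup remove` command per map, without running anything.
removal_commands() {
    # Skip blank lines and the message dmsetup prints when nothing is mapped;
    # the first field of every other line is taken to be the map name.
    awk 'NF > 0 && $0 != "No devices found" { print "dmsetup remove " $1 }'
}
```

Typical use would be "dmsetup ls --target multipath | removal_commands" to review the list, and then, once you are happy with it, piping that output to sh as root.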

Once the mapping table is created, you should be able to look in the following three locations to find a list of all dm devices, named either as a dm-x device or by WWID (World Wide Identifier).

$ ls /dev/dm-
dm-0  dm-1  dm-2  dm-3
$ ls /dev/mapper/
32000000bb55555cd  32001000bb55555cd  32002000bb55555cd  32003000bb55555cd  control
$ ls /dev/mpath/
32000000bb55555cd  32001000bb55555cd  32002000bb55555cd 32003000bb55555cd

Pick one of the paths and format the device as you normally would with any other raw Linux device to be mounted.

$ mke2fs -F /dev/dm-0 ......

Mount the device and verify that it is mounted by typing df at the command line. Now, to see all your active paths and monitor them during the test procedure, you can type either multipath -ll or multipath -l at the command line.

$ multipath -ll
32002000bb55555cd
[size=92 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
\_ 30:0:0:2 sdc 8:32  [active][ready]
\_ round-robin 0 [enabled]
\_ 31:0:0:2 sdg 8:96  [active][ready]
32001000bb55555cd
[size=27 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
\_ 30:0:0:1 sdb 8:16  [active][ready]
\_ round-robin 0 [enabled]
\_ 31:0:0:1 sdf 8:80  [active][ready]
32000000bb55555cd
[size=27 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
\_ 30:0:0:0 sda 8:0   [active][ready]
\_ round-robin 0 [enabled]
\_ 31:0:0:0 sde 8:64  [active][ready]
32003000bb55555cd
[size=92 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
\_ 30:0:0:3 sdd 8:48  [active][ready]
\_ round-robin 0 [enabled]
\_ 31:0:0:3 sdh 8:112 [active][ready]

The results show that each logical volume is reached through two LUN paths. One path group is set to active while the other is enabled and ready, waiting for the active path to fail.
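When there are many LUNs, the listing gets long, and it can help to flatten it to one line per path. Here is a small sketch that does this with awk. It assumes the output layout shown above (newer multipath-tools releases may print it differently), and the function name summarize_paths is mine.

```shell
#!/bin/sh
# Hypothetical helper: condense `multipath -ll` output (in the layout shown
# above) to one line per path: map name, sd device, H:C:T:L and path state.
summarize_paths() {
    awk '
        # A path line looks like: \_ 30:0:0:2 sdc 8:32  [active][ready]
        $1 == "\\_" && $2 ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/ {
            print map, $3, $2, $5
            next
        }
        # Path-group lines and [size=...] attribute lines carry no path data.
        $1 == "\\_" || $1 ~ /^\[/ { next }
        # Any other non-empty line is taken to be a map name (WWID or alias).
        NF > 0 { map = $1 }
    '
}
```

Used as "multipath -ll | summarize_paths", the first map in the listing above would come out as two lines, one for sdc and one for sdg, each prefixed with 32002000bb55555cd.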

Note that a lot of individuals make the mistake of formatting and mounting the sd devices. This must not be done when using device-mapper, because I/O sent to a single sd device bypasses multipath. sdc and sdg together are presented as dm device dm-0, or WWID 32002000bb55555cd. The dm devices are virtual devices labeled to represent multiple LUN mappings to the same logical drive (LD). So you must use the dm labels as opposed to the sd ones.
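One way to guard against this mistake is to ask sysfs whether an sd device has already been claimed by a dm device before you touch it. The sketch below assumes your kernel exposes a holders directory under /sys/block/<device>/, which not every 2.6 release does. The function name dm_holders is mine, and the second argument exists only so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# Hypothetical helper: list the dm devices that hold a given sd device.
# $1: kernel name of the path device, e.g. sdc
# $2: sysfs root, defaults to /sys
dm_holders() {
    holders_dir="${2:-/sys}/block/$1/holders"
    # Fail if the device or its holders directory does not exist.
    [ -d "$holders_dir" ] || return 1
    for entry in "$holders_dir"/dm-*; do
        # An unmatched glob stays literal, so check that the entry exists.
        [ -e "$entry" ] && basename "$entry"
    done
    return 0
}
```

If "dm_holders sdc" prints a dm-x name, that is the device to format and mount, not sdc itself. Empty output means no dm device has claimed that path.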

Stay tuned for Part 2. Whenever that is going to be.
