UNIX Consulting and Expertise
Golden Apple Enterprises Ltd. » Posts for tag 'SAN'

Solaris 9 can’t import its SVM metasets when booting

I came across this particular issue for a client, and it turned out to be a harsh gotcha in Solaris 9.

Quick recap: an SVM metaset is a group of disks (usually from a SAN) with its own meta state databases. Metasets grew out of Sun Cluster as a way to share storage between cluster nodes using SVM, and have since become a really handy way of managing SAN volumes.

Anyway, Solaris 9 4/04 introduced the ability to have ‘autotake’ metasets. Basically, one host was the master, and it could automatically import and manage the metaset on boot. This was great, because it finally swept aside the last baggage of Sun Cluster, and meant you could have your metasets referenced in /etc/vfstab and mount them at boot – just like real disks.

And there was much rejoicing across the land.

In this particular case, there was a host running Solaris 9 (for client software reasons) which had many terabytes of SAN LUNs mounted as metasets. I say had because when it rebooted, the machine said it couldn’t autotake the disk set because it wasn’t the owner, before dropping to single user mode complaining it couldn’t check any of the filesystems.

Odd. A quick check from single user mode, and yes indeed – the metaset was configured for autotake, but the host wasn’t the owner. Comment the (many) filesystems out of /etc/vfstab, continue the boot, and check again once at run level 3. Hang on – now the host is the metaset owner.
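
Commenting out a long list of metaset filesystems by hand is tedious, so here is a minimal sketch of how it could be scripted. It assumes the block devices follow the usual /dev/md/<setname>/dsk/ naming, and the set name "sanset" in the usage line is a made-up example. It reads vfstab text on stdin and writes the edited copy to stdout, so nothing is changed in place.

```shell
#!/bin/sh
# Sketch only: comment out vfstab entries that mount from a named disk set.
# Needs an awk that supports -v; on Solaris set AWK=nawk (assumption: the
# default /usr/bin/awk there is the old awk without -v).
AWK=${AWK:-awk}

comment_out_metaset() {
    # $1 = disk set name (hypothetical example: sanset)
    "$AWK" -v name="$1" '
        # Leave existing comments and blank lines untouched.
        /^#/ || NF == 0 { print; next }
        # Field 1 is the device to mount; match /dev/md/<name>/dsk/ at its start.
        index($1, "/dev/md/" name "/dsk/") == 1 { print "#" $0; next }
        { print }
    '
}
```

Something like comment_out_metaset sanset < /etc/vfstab > /tmp/vfstab.new would produce a copy to review with diff before it goes anywhere near the real file.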

Whisky Tango Foxtrot, over. A quick Google threw up far too many suggestions to hack the startup scripts so that the SVM daemons start before the filesystem mounts. Not a great idea.

A very quick dig through Sunsolve turned up Sun BugID 6276747 – “Auto-take fails to work with fabric disks”.
It turns out that this is an issue with the Solaris 9 SAN Foundation Suite and how the kernel initialises SAN fabric LUNs, as opposed to FC-AL LUNs.

The fix is to add the following line to /etc/system:

set fcp:ssfcp_enable_auto_configuration = 1

One quick reboot later, and behold! The metasets are imported and mounted correctly, with no further problems. This appears to be purely an issue in Solaris 9, so apart from old client apps I’m hoping we can leave this one behind.
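
If you look after several Solaris 9 hosts, it may be worth checking which ones already carry the setting. The following is a rough sketch, not an official tool: the function name is my own, and it assumes a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris). It ignores lines starting with "*", which is the comment marker in /etc/system.

```shell
#!/bin/sh
# Sketch only: succeed (exit 0) if the given file has an active
# "set fcp:ssfcp_enable_auto_configuration = 1" line, fail otherwise.
has_fcp_autoconf() {
    file=${1:-/etc/system}
    awk '
        $1 ~ /^\*/ { next }      # "*" starts a comment in /etc/system
        $1 == "set" {
            # Glue the remaining fields together so spacing around "=" is ignored.
            setting = ""
            for (i = 2; i <= NF; i++) setting = setting $i
            if (setting == "fcp:ssfcp_enable_auto_configuration=1") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

Used as has_fcp_autoconf && echo "setting present", it checks /etc/system by default; pass another path to check a copy.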

Finding the WWN in Solaris followup – making it easier

In the previous post I listed the ‘long way round’ to find out the WWN from active HBA links in Solaris. The commands I listed before will work on all recent releases of Solaris. If you’re able to migrate to Solaris 10, you can make things easier for yourself.

cfgadm will take a verbose flag, which will print out a listing that includes the full device path. This will definitely work on Solaris 9 and 10 – I’m afraid I don’t have a Solaris 8 box to test on, though.

bash-3.00# cfgadm -lv 
Ap_Id                          Receptacle   Occupant     Condition  Information
When         Type         Busy     Phys_Id
c0                             connected    configured   unknown
unavailable  scsi-bus     n        /devices/pci@7c0/pci@0/pci@1/pci@0/ide@8:scsi
c1                             connected    configured   unknown
unavailable  scsi-bus     n        /devices/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2:scsi
c2                             connected    configured   unknown
unavailable  fc-private   n        /devices/pci@780/pci@0/pci@8/SUNW,qlc@0/fp@0,0:fc
c3                             connected    unconfigured unknown
unavailable  fc           n        /devices/pci@780/pci@0/pci@8/SUNW,qlc@0,1/fp@0,0:fc
c4                             connected    configured   unknown
unavailable  fc-private   n        /devices/pci@7c0/pci@0/pci@9/SUNW,qlc@0/fp@0,0:fc
c5                             connected    unconfigured unknown
unavailable  fc           n        /devices/pci@7c0/pci@0/pci@9/SUNW,qlc@0,1/fp@0,0:fc
usb0/1                         empty        unconfigured ok
unavailable  unknown      n        /devices/pci@7c0/pci@0/pci@1/pci@0/usb@5:1
usb0/2                         empty        unconfigured ok
unavailable  unknown      n        /devices/pci@7c0/pci@0/pci@1/pci@0/usb@5:2
usb1/1.1                       empty        unconfigured ok
unavailable  unknown      n        /devices/pci@7c0/pci@0/pci@1/pci@0/usb@6/hub@1:1.1
usb1/1.2                       empty        unconfigured ok
unavailable  unknown      n        /devices/pci@7c0/pci@0/pci@1/pci@0/usb@6/hub@1:1.2
usb1/1.3                       empty        unconfigured ok
unavailable  unknown      n        /devices/pci@7c0/pci@0/pci@1/pci@0/usb@6/hub@1:1.3
usb1/1.4                       empty        unconfigured ok
unavailable  unknown      n        /devices/pci@7c0/pci@0/pci@1/pci@0/usb@6/hub@1:1.4
usb1/2                         empty        unconfigured ok
unavailable  unknown      n        /devices/pci@7c0/pci@0/pci@1/pci@0/usb@6:2
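
The two-line records in that listing are awkward to read by eye, so here is a small sketch that pairs each Ap_Id with its physical path and keeps only the fibre channel ports. It assumes the layout shown above, where the second line of each record ends in a /devices path and FC ports end in ":fc". The function name is my own.

```shell
#!/bin/sh
# Sketch only: read "cfgadm -lv" output on stdin and print
# "<Ap_Id> <Phys_Id>" for each fibre channel attachment point.
fc_paths() {
    awk '
        $NF ~ /^\/devices\// {          # second line of a record
            if ($NF ~ /:fc$/) print ap, $NF
            next
        }
        { ap = $1 }                     # first line of a record (or a header line)
    '
}
```

Run as cfgadm -lv | fc_paths, this would list c2 to c5 from the output above with their device paths.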

If you have Solaris 10 8/07 or later, then you’ll find that the dump_map option to luxadm will take the short notation for an HBA that cfgadm uses.

bash-3.00# luxadm -e dump_map /dev/cfg/c2
Pos AL_PA ID Hard_Addr Port WWN         Node WWN         Type
0     1   7d    0      210000e08b86f840 200000e08b86f840 0x1f (Unknown Type,Host Bus Adapter)
1     ad  23    ad     50060e8014118960 50060e8014118960 0x0  (Disk device)

Again, this all works only if the HBA has a live link – it needs some cable plugged in, and you need to have something listening at the other end. I’ll be exploring how to find the WWN of your HBAs – even if they’re not plugged in – soon, using some other features of Solaris.

Silly SAN tricks – finding the WWN of an HBA from Solaris

When connecting a Solaris machine to a SAN, you’ll usually need to know the WWN of the host bus adapter (HBA). WWNs are a bit like MAC addresses for Ethernet cards – they are unique, and they’re used to manage who is connected to what, and what they can see.

The quickest and easiest way to check the WWN is when we have an active HBA. We can use the cfgadm command under Solaris to check our adapter states:

root@avalon>cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 CD-ROM       connected    configured   unknown
c1                             fc-private   connected    configured   unknown
c1::210000008783fd1c           disk         connected    configured   unknown
c1::2100000087844ad8           disk         connected    configured   unknown
c2                             fc-private   connected    configured   unknown
c2::50060e8014118920           disk         connected    configured   unknown
c3                             fc           connected    unconfigured unknown
c4                             fc-private   connected    configured   unknown
c4::50060e8014118930           disk         connected    configured   unknown
c5                             fc           connected    unconfigured unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb0/3                         unknown      empty        unconfigured ok
usb0/4                         unknown      empty        unconfigured ok
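
The cN::WWN entries in that listing are the port WWNs of the devices at the far end of each link, not of the HBA itself. As a rough sketch (the function name is my own, and it assumes the Ap_Id format shown above), they can be pulled out per controller like this:

```shell
#!/bin/sh
# Sketch only: read "cfgadm -al" output on stdin and print
# "<controller> <remote port WWN>" for each cN::<16 hex digit> entry.
remote_wwns() {
    awk '
        $1 ~ /^c[0-9]+::[0-9a-f]+$/ {
            split($1, part, "::")
            # A WWN is 64 bits, so expect exactly 16 hex digits.
            if (length(part[2]) == 16) print part[1], part[2]
        }
    '
}
```

Piping cfgadm -al through remote_wwns skips entries such as c0::dsk/c0t0d0, which are not WWNs.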

So both our controllers, c2 and c4, have active loops. Now we can use luxadm to query the driver and print out the device paths for each port on each HBA:

root@avalon>luxadm qlgc
 Found Path to 5 FC100/P, ISP2200, ISP23xx Devices
 Opening Device: /devices/pci@8,700000/SUNW,qlc@2/fp@0,0:devctl
  Detected FCode Version:       ISP2312 Host Adapter Driver: 1.14.09 03/08/04
 Opening Device: /devices/pci@8,700000/SUNW,qlc@2,1/fp@0,0:devctl
  Detected FCode Version:       ISP2312 Host Adapter Driver: 1.14.09 03/08/04
 Opening Device: /devices/pci@8,700000/SUNW,qlc@3/fp@0,0:devctl
  Detected FCode Version:       ISP2312 Host Adapter Driver: 1.14.09 03/08/04
 Opening Device: /devices/pci@8,700000/SUNW,qlc@3,1/fp@0,0:devctl
  Detected FCode Version:       ISP2312 Host Adapter Driver: 1.14.09 03/08/04
 Opening Device: /devices/pci@9,600000/SUNW,qlc@2/fp@0,0:devctl
  Detected FCode Version:       ISP2200 FC-AL Host Adapter Driver: 1.15 04/03/22
  Complete

This particular machine I’m playing on is a Sun V490, which uses internal FC-AL disks – so the last controller port we can see (the ISP2200) is the internal controller for the internal root disks. Why is it listed last? Due to the way the V490 initialises itself, the internal controller is tested and configured after all the PCI slots.

Also, if you look at the device path, you can see it’s coming from a different PCI bus – pci@9 as opposed to pci@8.

Finally, the FCode and driver version are different, which shows us it’s a slightly different chipset from the other HBAs.

REMEMBER: numbering starts from the top (the first device) down. So:

/devices/pci@8,700000/SUNW,qlc@2/fp@0,0:devctl is c2

/devices/pci@8,700000/SUNW,qlc@2,1/fp@0,0:devctl is c3

/devices/pci@8,700000/SUNW,qlc@3/fp@0,0:devctl is c4

/devices/pci@8,700000/SUNW,qlc@3,1/fp@0,0:devctl is c5

/devices/pci@9,600000/SUNW,qlc@2/fp@0,0:devctl is c1, our internal HBA

We can now use the dump_map option of luxadm to print out the device map, as seen from each port.
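
Rather than copying each path by hand, the device paths can be lifted straight out of the luxadm qlgc output. This is only a sketch: the function name is my own, and it assumes every port is reported on a line of the form "Opening Device: <path>", as in the listing above.

```shell
#!/bin/sh
# Sketch only: read "luxadm qlgc" output on stdin and print one
# device path per line, in the order the ports were reported.
qlgc_paths() {
    awk '/Opening Device:/ { print $3 }'
}
```

A loop such as luxadm qlgc | qlgc_paths | while read path; do luxadm -e dump_map "$path"; done would then dump the map for every port in turn. Ports without a live link will most likely just report an error.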

For c2, for example, we would do:

root@avalon>luxadm -e dump_map  /devices/pci@8,700000/SUNW,qlc@2/fp@0,0:devctl
Pos AL_PA ID Hard_Addr Port WWN         Node WWN         Type
0     1   7d    0      210000e08b1ea9ef 200000e08b1ea9ef 0x1f (Unknown Type,Host Bus Adapter)
1     b1  21    b1     50060e8014118920 50060e8014118920 0x0  (Disk device)

And there is our listing of WWNs. The 50060e8014118920 WWN belongs to our SAN device at the other end (note the type of ‘0x0 Disk device’), and the first WWN of 210000e08b1ea9ef is for our HBA.
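
To pick out just the HBA's own WWNs from a device map, a short filter is enough. This sketch assumes the column layout shown above, with the Port WWN in the fifth column and the Node WWN in the sixth, and the text "Host Bus Adapter" in the Type column. The function name is my own.

```shell
#!/bin/sh
# Sketch only: read "luxadm -e dump_map" output on stdin and print
# the port and node WWN of each entry flagged as a Host Bus Adapter.
hba_wwns() {
    awk '/Host Bus Adapter/ { print "port=" $5, "node=" $6 }'
}
```

Fed the c2 map above, it prints port=210000e08b1ea9ef node=200000e08b1ea9ef, which is the pair your SAN administrator will ask for when zoning the host.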

Note that this only works for cards which have an active connection to a SAN fabric. If we haven’t plugged them in yet, we need to use some lower level Solaris tools, which I’ll be covering in another post.

Extracting EMC Symmetrix Data with Orca

One of the problems with using big disk arrays is the difficulty in getting meaningful reporting out of them. All the vendors’ tools are closed source, and in many cases the vendor’s expertise is missing or seriously lacking when it comes to plotting performance trends.

“Just add more cache” is the same tired refrain vendors always give. No. I’m not going to recommend to clients that they spend a huge sum of money buying more SAN cache until I can prove the SAN actually needs it.

In March 2004 I wrote an article for SysAdmin Magazine showing how to use the symcli command line tools in conjunction with Orca to plot some nice historic performance graphs, showing the host’s view of performance of the Symmetrix array.

You can find the original article, complete with diagrams and code, on SysAdmin Magazine’s website at http://www.samag.com/documents/s=9364/sam0403f/0403f.htm
