UNIX Consulting and Expertise

Solaris SVM metasets

The Solaris disk management/virtualisation tools have gone through many name changes – ODS, SDS, SVM – but the basic tools have remained the same. With the introduction of Sun Cluster, Sun needed to come up with a way to share storage between cluster nodes. Obviously this functionality needed to be added to SVM, and they came up with the idea of metasets.

Normally, metadevices are local to a host. When encapsulating disk slices into metadevices, you first have to dedicate a disk slice to store the metadatabase (with the metadb command). You can then create further replicas of the metadb and scatter them across slices, so that one corrupted replica doesn’t cost you all of your metadevice information.
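On a standalone host that first step looks something like this (the device slices here are placeholders – use whichever small slices you’ve set aside for the purpose):

bash-3.00# metadb -a -f -c 3 c1t0d0s7
bash-3.00# metadb -a -c 3 c1t1d0s7

The -f flag is only needed when creating the very first replicas, -c sets how many copies go on each slice, and metadb -i will list them all afterwards.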

A metaset is, in a nutshell, a collection of metadevices with its own metadb state databases.

This works well in a cluster – the metaset’s metadevices, along with their metadbs, can be moved around so they live on whichever node needs to mount their filesystems.

It also makes things easy for us when mounting LUNs from a SAN. If we encapsulate the LUNs within metadevices, in their own metaset, then if our host dies we can just re-import everything on another host. Think of it as very basic – but very quick and easy – disaster recovery.

Think of metasets as being very similar to disk groups in Veritas Volume Manager.
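If it helps the analogy, the rough VxVM equivalent of moving a metaset between hosts is deporting and importing a disk group – something like this (the disk group name is made up for illustration):

bash-3.00# vxdg deport appdg
bash-3.00# vxdg import appdg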

Creating the metaset is a simple process. First of all we define our metaset and add our host to it.

bash-3.00# metaset -s test -a -h avalon

Syntax is pretty straightforward:

  • -s is used to specify which metaset we’re using
  • -a is the add flag. Guess what -d does?
  • -h specifies the hostname which owns this metaset

All metasets are owned by at least one host (it’s how they track who can access them). If you’re in a cluster environment, multiple hosts will own the metaset, allowing the cluster software to move the metadevices between nodes.
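In a cluster you’d simply name the extra node when adding hosts – for example (the second hostname here is invented for illustration):

bash-3.00# metaset -s test -a -h camelot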

For a single hosted metaset, however, we just need to add one host, and we need to make sure that it will automatically take ownership and import the metaset on boot.

All we have to do to make this happen is enable the autotake flag on the metaset:

bash-3.00# metaset -s test -A enable

And that completes the setup of the metaset. We then just select which LUNs we’re interested in, and add them to the metaset:

bash-3.00# metaset -s test -a c7t60060E80141189000001118900001A10d0 \
c7t60060E80141189000001118900001A17d0 \
c7t60060E80141189000001118900001A18d0 \
c7t60060E80141189000001118900001A19d0

Note that when we add devices (disks or LUNs) to a metaset, we only need to specify the device name – not a slice, and not s2 (the Solaris convention of referencing an entire disk via a single reserved slice).

Normally, when you create a metadevice, you are encapsulating a slice that already exists on disk. This means the data stays intact. This is not the case when importing a disk into a metaset.

The act of importing a disk re-partitions it. All existing partitions are deleted, with a tiny slice on s7 being created to store the metadb replica, and the rest given over to s0. Note that s2 – the usual way of addressing a disk in Solaris – is also removed.

Here’s what the partitions look like on our root disk:

bash-3.00# prtvtoc /dev/dsk/c1t0d0s2
* /dev/dsk/c1t0d0s2 partition map
*
* Dimensions:
*     512 bytes/sector
*     107 sectors/track
*      27 tracks/cylinder
*    2889 sectors/cylinder
*   24622 cylinders
*   24620 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00    8389656  23071554  31461209
       1      3    01          0   8389656   8389655
       2      5    00          0  71127180  71127179
       5      7    00   31461210  20974140  52435349
       6      0    00   52435350  18625383  71060732
       7      0    00   71060733     66447  71127179

And here’s what they look like on a LUN that’s part of the metaset:

bash-3.00# prtvtoc /dev/dsk/c7t60060E80141189000001118900001A10d0s0
* /dev/dsk/c7t60060E80141189000001118900001A10d0s0 partition map
*
* Dimensions:
*     512 bytes/sector
*     512 sectors/track
*      15 tracks/cylinder
*    7680 sectors/cylinder
*   13653 cylinders
*   13651 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00      15360 104824320 104839679
       7      4    01          0     15360     15359
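That little s7 slice holds this drive’s copy of the set’s metadb replica. If you want to see the replicas themselves, metadb takes the same -s flag as the other commands, so a quick check would be something like:

bash-3.00# metadb -s test -i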

We can query the metaset and have a look at its contents, to check everything is OK:

bash-3.00# metaset -s test

Set name = test, Set number = 5

Host                Owner
  avalon            Yes (auto)

Drive                                             Dbase
/dev/dsk/c7t60060E80141189000001118900001A10d0   Yes
/dev/dsk/c7t60060E80141189000001118900001A17d0   Yes
/dev/dsk/c7t60060E80141189000001118900001A18d0   Yes
/dev/dsk/c7t60060E80141189000001118900001A19d0   Yes

Once we’ve populated our metaset, we create metadevices as normal. The only extras when using the metainit command are that we need to specify which metaset we’re using, and that we’ll always be using s0.

Let’s create a single metadevice striped across all 4 LUNs in our metaset:

bash-3.00# metainit -s test d100 1 4 /dev/dsk/c7t60060E80141189000001118900001A10d0s0 \
/dev/dsk/c7t60060E80141189000001118900001A17d0s0 \
/dev/dsk/c7t60060E80141189000001118900001A18d0s0 \
/dev/dsk/c7t60060E80141189000001118900001A19d0s0

metainit works the same way it always has – we need to specify the full path to the slice we’re using – but with the additional -s flag to tell metainit which metaset we want to add the metadevice to.

We can use the summary flag to metastat (sorry, Solaris 10 only) to show us the summary of what we’ve just configured:

bash-3.00# metastat -c -s test
test/d100     s  199GB /dev/dsk/c7t60060E80141189000001118900001A10d0s0 \
/dev/dsk/c7t60060E80141189000001118900001A17d0s0 \
/dev/dsk/c7t60060E80141189000001118900001A18d0s0 \
/dev/dsk/c7t60060E80141189000001118900001A19d0s0

metasets are an easy way to group together storage and filesystems in Solaris, especially where the storage is external to your host, and you’d like the flexibility of importing it to another host in the future – for example, as part of some DR work if the host fails.
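As a rough sketch of that DR move (assuming the standby host was added to the set beforehand, and with a made-up mount point): the standby host forcibly takes the set, and the metadevices appear under /dev/md/<setname>/dsk ready to mount:

bash-3.00# metaset -s test -t -f
bash-3.00# mount /dev/md/test/dsk/d100 /export/data

The -t flag takes ownership of the set, and -f forces the take through when the original owner has died and can’t release it cleanly.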

Solaris 9 can’t import its SVM metasets when booting

I came across this particular issue for a client, and it turned out to be a harsh gotcha in Solaris 9.

Quick recap: SVM metasets are a group of disks (usually from a SAN) that have their own meta state databases. They grew out of Sun Cluster as a way to share storage between cluster nodes, using SVM, and have since become a really handy way of managing SAN volumes.

Anyway, Solaris 9 4/04 introduced the ability to have ‘autotake’ metasets. Basically, one host was the master, and it could automatically import and manage the metaset on boot. This was great, because it finally swept aside the last baggage of Sun Cluster, and meant you could have your metasets referenced in /etc/vfstab and mount them at boot – just like real disks.
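A vfstab entry for a metaset metadevice looks like any other UFS entry, except the device paths include the set name – something like this (the set name and mount point are just examples):

/dev/md/test/dsk/d100   /dev/md/test/rdsk/d100   /export/data   ufs   2   yes   -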

And there was much rejoicing across the land.

In this particular case, there was a host running Solaris 9 (for client software reasons) which had many terabytes of SAN LUNs mounted as metasets. I say had because when it rebooted, the machine said it couldn’t autotake the disk set because it wasn’t the owner, before dropping to single user mode complaining it couldn’t check any of the filesystems.

Odd. A quick check from single user mode, and yes indeed – the metaset was configured for autotake, but the host wasn’t the owner. Comment the (many) filesystems out of /etc/vfstab, continue the boot, and check again once at run level 3. Hang on – now the host is the metaset owner.

Whisky Tango Foxtrot, over. A quick Google threw up far too many suggestions to hack the startup scripts so that the SVM daemons start before the filesystem mounts. Not a great idea.

A very quick dig through Sunsolve turned up Sun BugID 6276747 – “Auto-take fails to work with fabric disks”. It turns out that this is an issue with the Solaris 9 SAN Foundation Suite, and how the kernel initialises SAN fabric LUNs, as opposed to FC-AL LUNs.

Adding the following line to /etc/system:

set fcp:ssfcp_enable_auto_configuration = 1

One quick reboot later, and behold! The metasets are imported and mounted correctly, with no further problems. This appears to be purely a Solaris 9 issue, so apart from old client apps I’m hoping we can leave this one behind.

Growing live Solaris filesystems with metadevices

Solaris Volume Manager has also been variously known as Disksuite or ODS (but is not to be confused with the rebadged and bundled Veritas Volume Manager that Sun also shipped under a similar name!) and comes with lots of neat features. One of the best is a simple method for expanding metadevices and filesystems – on a live system!

For my example today, I’m going to use a Sun T2000 which has 4 zones, all of which are running from a metadevice composed of several 10GB LUNs from a SAN. The object of this exercise is to add another 10GB to the zone filesystem, so the developers can fire up another zone.

The fact that we’re using a metaset from a SAN doesn’t matter – the key thing here is that, by encapsulating a disk (or LUN, or partition) and using a metadevice, we can quickly and easily expand live filesystems.

First of all, let’s have a look at the metaset on the host, which has been imaginatively called ‘zones’:

bash-3.00# metaset -s zones
Set name = zones, Set number = 2
Host                Owner
  bumpkin         Yes (auto)
Drive                                            Dbase
/dev/dsk/c6t60060E80141189000001118900001918d0   Yes  
/dev/dsk/c6t60060E80141189000001118900001919d0   Yes  
/dev/dsk/c6t60060E80141189000001118900002133d0   Yes  

We can use the metastat command to print a summary of the metadevices within the metaset:

bash-3.00# metastat -s zones -c
zones/d100       s   29GB /dev/dsk/c6t60060E80141189000001118900001918d0s0 /dev/dsk/c6t60060E80141189000001118900001919d0s0 /dev/dsk/c6t60060E80141189000001118900002133d0s0

In this case, we’ve got just one metadevice, named d100, which is composed of three 10GB LUNs from the SAN.

So, our first task is to add the newly available LUN to the metaset:

bash-3.00# metaset -s zones -a c6t60060E80141189000001118900001731d0

We can check that it’s really been added with the metaset command again:

bash-3.00# metaset -s zones
Set name = zones, Set number = 2
Host                Owner
  bumpkin         Yes (auto)
Drive                                            Dbase
/dev/dsk/c6t60060E80141189000001118900001918d0   Yes  
/dev/dsk/c6t60060E80141189000001118900001919d0   Yes  
/dev/dsk/c6t60060E80141189000001118900002133d0   Yes  
/dev/dsk/c6t60060E80141189000001118900001731d0   Yes  

Rock on! We’ve now got our four 10GB LUNs added to the metaset. Now we need to attach the new LUN to our existing 30GB metadevice, d100:

bash-3.00# metattach zones/d100 /dev/dsk/c6t60060E80141189000001118900001731d0s0
zones/d100: component is attached

Note that we don’t need to bring down the three running zones – we can do all of this live, with the system at the multi-user-server milestone.

If we query the metadevice now we can see that it’s grown, from a stated 29GB to 39GB, and that our new LUN is part of the metadevice:

bash-3.00# metastat -s zones -c
zones/d100       s   39GB /dev/dsk/c6t60060E80141189000001118900001918d0s0 /dev/dsk/c6t60060E80141189000001118900001919d0s0 /dev/dsk/c6t60060E80141189000001118900002133d0s0 /dev/dsk/c6t60060E80141189000001118900001731d0s0

Now all we need to do is grow the filesystem, using all of that extra 10GB:

bash-3.00# growfs -M /export/zones /dev/md/zones/rdsk/d100 
/dev/md/zones/rdsk/d100:        83742720 sectors in 13630 cylinders of 48 tracks, 128 sectors
        40890.0MB in 852 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
................
super-block backups for last 10 cylinder groups at:
 82773280, 82871712, 82970144, 83068576, 83167008, 83265440, 83363872,
 83462304, 83560736, 83659168

Here’s the output of df before we hacked about:

bash-3.00# df -k /export/zones
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/zones/dsk/d100
                     30928078 14451085 16270806    48%    /export/zones

And here’s the output after we’ve expanded the filesystem:

bash-3.00# df -k /export/zones
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/zones/dsk/d100
                     41237442 14462125 26569130    36%    /export/zones

So, a quick and simple way to grow a filesystem under Solaris, using metadevices and with no downtime.

Also, a brief note to Sun product managers: choose a name for your products, and stick with that name for more than a year. Thanks!
