Wednesday, November 30, 2011

Expanding a filesystem with Linux LVM

In today's world of ever-expanding storage requirements, it's not uncommon for the capacity of a hard drive to be eclipsed before it physically wears out. When this happens, it can be a pain to back everything up, replace the drive, and then restore your stuff onto the new, larger drive. Not only is there margin for error, but copying that much data can take a really long time... and that assumes you can even buy a single hard drive large enough to store everything. Fortunately, Linux can solve this problem by pooling multiple physical drives into a single logical volume using the Logical Volume Manager (LVM). I recently had to expand an LVM volume, and I ran into some speed bumps that I'd like to help others avoid.

Thanks to my digital photography hobby (now business), I actually ran into the capacity problem years ago, and had already set up LVM to stripe together a pair of drives.  Before I get going, a little background is in order.

There are several ways to combine multiple drives together, each offering varying degrees of speed, capacity, and fault tolerance.  Many of these methods fall under the category of "RAID" (Redundant Array of Inexpensive Disks).  I won't bore you with the details of each RAID level; if you want them, read here.  LVM by itself is capable of either mirroring (RAID 1), striping (RAID 0), or concatenation (JBOD).  To get RAID 5 or 6 into the mix, you must use mdadm (possibly in conjunction with LVM).


Concatenation (sometimes referred to as JBOD, for "Just a Bunch Of Disks") means that multiple drives are lined up end to end.  When data reaches the end of one drive, it starts using the next drive.  This is the simplest to set up and manage, but it offers no speed advantage, and is less reliable than a single disk because losing any one of those drives could clobber your filesystem.

Striping (RAID 0) runs two identically-sized drives in parallel.  A small chunk of data (32KB, in my setup) is read off the first drive, then the next chunk is read off the second drive, then the third chunk is read off the first drive again, etc.  Any time you're reading more than 32KB (as is the case with photos & videos), you get the speed advantage of reading from two drives simultaneously.  You can stripe together as many drives as you want to improve your access speed.  However, striping has the same reliability concerns as concatenation, and every drive you add increases the probability of a drive failure taking out your filesystem.
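
To make the alternation concrete, here's a tiny shell sketch that works out which drive a given byte offset would land on.  It assumes a plain round-robin layout with two drives and my 32KB chunk size; the function name is made up for illustration, and real LVM thinks in terms of extents rather than raw byte offsets.

```shell
#!/bin/sh
# Illustration only: map a byte offset within a striped volume to a drive.
# Assumes simple round-robin striping; stripe_drive is a hypothetical helper.
stripe_drive() (
    offset=$1          # byte offset into the striped volume
    chunk_kb=${2:-32}  # stripe (chunk) size in KB, 32 by default
    drives=${3:-2}     # number of drives in the stripe, 2 by default
    chunk=$(( offset / (chunk_kb * 1024) ))   # index of the chunk holding this offset
    echo $(( chunk % drives ))                # 0 = first drive, 1 = second, ...
)

stripe_drive 0        # first chunk, prints 0
stripe_drive 40000    # second chunk (bytes 32768-65535), prints 1
stripe_drive 70000    # third chunk wraps back to the first drive, prints 0
```

A read that spans more than one chunk touches both drives, which is where the speed advantage comes from.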

Mirroring (RAID 1) also runs drives in parallel, but instead of alternating between them, each chunk of data is stored on both drives simultaneously.  You can get similar improvements in read speed as with striping, but because you've got two copies of your data, you can lose one drive without causing any harm.  The disadvantage is that you have to buy twice as much disk space as you actually get to use.

Because I've already got multiple backups of all of my data, and because nobody but me is inconvenienced if my computer is down for a day for drive replacement, I chose to take the cheap route and set up my /home filesystem with striping.  At the time, I had one 1TB drive and one 1.5TB drive at my disposal.  Striping pairs identically-sized partitions, so I carved out a 1TB slice of the larger drive and striped it with the smaller drive to create a single 2TB volume, with 500GB left over for future use.

This served me well until I started to fill up that 2TB partition.  By the time that happened, I had another 2TB drive at my disposal (an old off-site backup drive that was no longer large enough).  I wanted to merge that into the /home filesystem stripe along with the original two.  I had a few options for doing this:

1.  Carve off a 1TB partition from the new drive and merge it into the existing stripe.  LVM can't do this on the fly, so I'd have to blow away the 2-disk stripe, rebuild it with 3 drives, and restore the data from backup.  This would leave me with the remaining 1TB from the new drive unused (as well as the original 500GB from the other drive), since striping across two partitions on the same drive will really kill performance.  I could always concatenate that onto the end of the stripe if needed, or save it for later, since 3TB is plenty for me at the moment.  This option renders the filesystem unusable during the process.

2.  Leave the old 2TB stripe alone and simply concatenate the new 2TB drive onto the end of it.  This is simple (once you find the up-to-date instructions, detailed below), but any data stored on the new drive will be much slower than data stored on the old stripe.  Access speed makes a difference in my everyday life, so this wasn't ideal.  This option can be completed on a live filesystem with no interruption to the users.

3.  Remember that 500GB partition left over from the original stripe?  I could carve off a matching 500GB from the new 2TB drive, stripe it with the old partition, and then concatenate those onto the end of the original stripe.  This would increase my available disk space by 1TB and would preserve the speed advantage of a striped filesystem.  This would leave the remaining 1.5TB unused on the new drive, but that's OK.  I don't need that space yet, and I don't want to commit to using it now, because I might need it elsewhere later.  It's very easy to add space to an LVM filesystem at any time, but it's much harder to take it back out once it's been added.  This option can also be completed on a live filesystem with no interruption to the users.

I chose to go with door #3.  However, since LVM has been around for over ten years and has continually increased its feature set, there are lots of instructions out on the web that are now way out of date.  Finding accurate instructions was far more difficult than it should have been, so I'll detail each of the three methods below.

I've been using Unix for over 20 years now (much of that as a professional sysadmin), so I'm quite comfortable with a command prompt.  Nevertheless, I'm all for using a GUI when it does the job more quickly with less opportunity for error.  Such is the case with basic LVM.  I opted to use the "system-config-lvm" GUI program (originally from the Red Hat distribution, but now running on my Ubuntu 11.04 box).  System-config-lvm isn't flexible enough to perform option #2 in its entirety, but you can use it to do some of the prep work.

With any of these options, it's important to note that when striping multiple partitions, the space used from each partition will match that of the smallest of the partitions.  For example, if I tried to stripe together my 1TB, 1.5TB, and 2TB drives in their entirety, I'd only get 3x1TB = 3TB of usable space, with the leftover 1.5TB forever inaccessible.  You're best off partitioning the larger drives so that all of the striped partitions are roughly the same size.
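
The arithmetic is simple enough to check in a couple of lines of shell.  This is just a sketch of the rule above (usable space = number of partitions x the smallest partition); the function name is mine, and sizes are whole GB.

```shell
#!/bin/sh
# Illustration only: how much of a set of partitions a stripe can actually use.
# stripe_usable_gb is a hypothetical helper; arguments are partition sizes in GB.
stripe_usable_gb() (
    smallest=$1
    total=0
    for size in "$@"; do
        [ "$size" -lt "$smallest" ] && smallest=$size   # track the smallest partition
        total=$(( total + size ))                       # raw capacity of all partitions
    done
    usable=$(( smallest * $# ))   # each partition contributes only the smallest size
    echo "usable=${usable}GB wasted=$(( total - usable ))GB"
)

stripe_usable_gb 1000 1500 2000   # prints: usable=3000GB wasted=1500GB
```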

Of course, always remember to make a backup of your system before you begin.  Even if you intend to use an option that allows you to expand the filesystem in place, one little slip of the finger could modify the wrong drive and really ruin your day.

Option 1

Create a 1TB partition on the new drive.  I use fdisk for this because it's old and familiar, but you could use cfdisk or Ubuntu's "Administration / Disk Utility" GUI program.  Partitions near the beginning of the disk have faster access times than those at the end.  Set the partition type to "Linux LVM" (8e).
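
If you'd rather script this step than walk through fdisk's prompts, parted can do roughly the same thing non-interactively.  The sketch below only prints the commands rather than running them.  It is not what I actually ran: /dev/sdX and the 1000GB end point are placeholders, the helper name is made up, and "mklabel" wipes whatever partition table is already on the disk, so triple-check the device name before running anything like it for real.

```shell
#!/bin/sh
# Sketch only: prints parted commands instead of executing them (they need root).
# plan_partition is a hypothetical helper; /dev/sdX is a placeholder device.
plan_partition() (
    disk=$1   # whole-disk device, e.g. /dev/sdX
    end=$2    # where the partition should end, e.g. 1000GB
    echo "parted -s $disk mklabel msdos"             # new, empty MBR partition table (destructive)
    echo "parted -s $disk mkpart primary 1MiB $end"  # partition 1, starting near the front of the disk
    echo "parted -s $disk set 1 lvm on"              # flag it as Linux LVM (type 8e on MBR)
)

plan_partition /dev/sdX 1000GB
```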

In system-config-lvm, initialize the new partition (under "Uninitialized Entities").  Next, add the new, unallocated 1TB volume to the existing volume group ("HomeVG" in my case).
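
For anyone without the GUI handy, these two steps correspond to pvcreate and vgextend on the command line.  Here's a minimal sketch that prints the commands for review instead of running them; the helper name and /dev/sdX1 are placeholders, and "HomeVG" is just my volume group's name.

```shell
#!/bin/sh
# Sketch only: prints the LVM commands rather than executing them (they need root).
# plan_add_pv is a hypothetical helper; device names are placeholders.
plan_add_pv() (
    vg=$1; shift                  # volume group name, then one or more partitions
    for part in "$@"; do
        echo "pvcreate $part"     # initialize the partition as an LVM physical volume
    done
    echo "vgextend $vg $*"        # add the new physical volume(s) to the volume group
    echo "vgs $vg"                # show the group's new size and free space
)

plan_add_pv HomeVG /dev/sdX1
```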

This is as far as you can go with the old filesystem mounted.  You'll have to unmount the soon-to-be-enlarged filesystem now so that you can blow it away and recreate it with the third drive.  If this filesystem is your /home partition, you'll have to log out and log in as root or boot from a live CD before you continue.

CHECK ONE LAST TIME THAT YOU'VE GOT A VALID BACKUP OF THIS FILESYSTEM, BECAUSE THERE'LL BE NO RECOVERY ONCE YOU CONTINUE WITH THIS PROCESS.

Back in system-config-lvm, select the old logical volume ("HomeLV" in my case) and remove both of the physical partitions.  Now create a new logical volume by selecting all three physical partitions (aka drives).  The stripe size you select has a lot to do with the data you're accessing; I got good performance using a 32KB stripe.

Once you're done, create a new filesystem on the new logical volume.  I'm fond of ext4.  When that's created, you're ready to mount it and restore your data from backup.
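
I did this part in the GUI, but a rough command-line equivalent would look something like the sketch below.  It only prints the commands, since every one of them is destructive.  The helper name is made up, and it assumes the volume group contains nothing but the three 1TB partitions, so that "100%FREE" means all of them.

```shell
#!/bin/sh
# Sketch only: prints the commands rather than executing them.
# Running them for real DESTROYS the existing logical volume and its data.
# plan_rebuild_stripe is a hypothetical helper.
plan_rebuild_stripe() (
    vg=$1; lv=$2
    stripes=$3      # number of physical volumes to stripe across
    stripe_kb=$4    # stripe size in KB
    echo "umount /dev/$vg/$lv"                                        # filesystem must not be in use
    echo "lvremove $vg/$lv"                                           # delete the old 2-disk stripe
    echo "lvcreate -i $stripes -I $stripe_kb -l 100%FREE -n $lv $vg"  # new stripe using all free space
    echo "mkfs.ext4 /dev/$vg/$lv"                                     # fresh ext4 filesystem
)

plan_rebuild_stripe HomeVG HomeLV 3 32
```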

Option 2

Create one partition on the new 2TB drive, as described under Option 1.  We'll do the rest of this process from the command line, even though the early parts of it could be done with system-config-lvm.

First, add the new partition (let's say sdX1) to the volume group ("HomeVG" in my case).

# vgextend HomeVG /dev/sdX1

Since the logical volume ("HomeLV") being extended was previously set up as a stripe, LVM by default would like any further additions to also be striped.  Overriding this takes the "-i1" option (a single stripe, which amounts to concatenation), along with "-l +100%FREE" (that's a lowercase L, not a one) to claim all of the free space in the volume group (from this page):

# lvextend -i1 -l +100%FREE HomeVG/HomeLV
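
One thing worth spelling out: lvextend on its own enlarges only the logical volume, so the ext4 filesystem sitting on top of it still has to be grown to fill the new space.  Here's a sketch of the follow-up steps, printed rather than executed.  The helper name is made up, and it assumes an ext4 filesystem mounted at /home; ext4 can be grown while mounted.  (Alternatively, adding "-r" to the lvextend command asks LVM to resize the filesystem in the same step.)

```shell
#!/bin/sh
# Sketch only: prints the follow-up commands rather than executing them (they need root).
# plan_finish_extend is a hypothetical helper; assumes an ext4 filesystem.
plan_finish_extend() (
    vg=$1; lv=$2; mountpoint=$3
    echo "lvs --segments $vg/$lv"    # should list the old striped segment plus a new linear one
    echo "resize2fs /dev/$vg/$lv"    # grow ext4 to fill the enlarged logical volume
    echo "df -h $mountpoint"         # confirm the extra space shows up
)

plan_finish_extend HomeVG HomeLV /home
```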

Option 3

Create a 500GB partition on the new 2TB disk to match the leftover 500GB partition on the old 1.5TB disk, as described under Option 1.  In system-config-lvm, initialize both 500GB partitions (under "Uninitialized Entities") and then add the new, unallocated volumes to the existing volume group ("HomeVG" in my case).

Next, select the logical volume in question ("HomeLV" in my case) and then click the "Edit Properties" button.  Since the LV being extended was already set up as a stripe, any further additions must also be striped (at least when using this GUI; the command line imposes no such restriction).  The system already knows that there is a certain amount of disk space available on two separate drives.  Clicking the "Use Remaining" button will use as much space as can be striped across both drives, which should be all of it since you created two identically-sized partitions.

When you click "OK," the program will sit there for an eternity (perhaps 30 minutes in my case) while it expands the filesystem (ext4, in my case).  You can monitor its progress by periodically running "df -m /home" (or wherever your filesystem is mounted) from a shell and watching the "1M-blocks" column increase.  When the GUI finally returns, your job is done.
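
I used the GUI for this, so treat the following as an untested sketch of what the command-line version might look like.  It assumes both 500GB partitions are already in the volume group and are the only free space in it, that the stripe size should match my existing 32KB, and that the filesystem is ext4 mounted at /home; the helper name is made up.  As before, it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands rather than executing them (they need root).
# plan_striped_extend is a hypothetical helper.
plan_striped_extend() (
    vg=$1; lv=$2; mountpoint=$3
    # -i 2: stripe the new space across two physical volumes
    # -I 32: 32KB stripe size; -l +100%FREE: use all free space; -r: grow the filesystem too
    echo "lvextend -i 2 -I 32 -l +100%FREE -r $vg/$lv"
    # Run this one in a second shell to watch the filesystem grow
    echo "watch -n 60 df -m $mountpoint"
)

plan_striped_extend HomeVG HomeLV /home
```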

Hopefully, all this doesn't sound too hard.  The process itself is really pretty simple once you know what you're doing.  I spent far too much time tracking down up-to-date documentation on the process, in part because I wasn't sure how to best attack the problem.  The end result is that my /home filesystem is now only 64% full, and I've still got 1.5TB in my hip pocket for whatever need arises down the road.

I'm not an expert in LVM, but I am pretty fluent in the basic concepts, so feel free to post any questions or correct my mistakes in the comments section below.  I'm always happy to hear from my readers!
