Integrating FreeBSD, ZFS, and Periodic; snapshots and scrubs

April 15th, 2009

Update: the scripts and process outlined here have been used as the foundation for zfs-periodic in the FreeBSD ports collection; you can check it out here.

ZFS on FreeBSD is powerful, especially when coupled with periodic to take hourly, daily, weekly, and monthly snapshots. In this post I’ll provide the scripts and config you need, and walk you step-by-step through setting up ZFS snapshots and scrubs with periodic on FreeBSD.

Periodic’s main advantage over the more traditional and obvious method of running a script from a cron job is integration with the notification emails and the standard configuration mechanism. That may not sound like much, but it means that you, a year down the road (or someone else who comes to the system), only have to look in the obvious place to figure out what’s going on or make changes.

This is going to be a long post, but there’s a decent amount of code & config to walk through. The files being discussed have been tar’d up and the latest version of them can be downloaded from here.

Configuration – the stuff you might actually want to muck with

We’ll start with the configuration (/etc/periodic.conf) as it’s the most relevant portion, or at least the most likely to be edited. Out of the box FreeBSD supports daily, weekly, and monthly periodic tasks; we’re going to add an hourly one so that we can take hourly snapshots. The first section of config sets who the output from the hourly run should go to, and whether it should be sent if everything succeeded, if something failed, or if something is mis-configured. Hourly emails seem a bit much, so we’ve disabled them when everything goes well. We do want messages about errors, and I’ve left badconfig at the same value as the other time-frames.

# Hourly options
hourly_output="root"					# user or /file
hourly_show_success="NO"				# scripts returning 0
hourly_show_info="YES"					# scripts returning 1
hourly_show_badconfig="NO"				# scripts returning 2

The next section configures hourly snapshots. In this case we’re enabling them for the pool tank and keeping the 6 most recent around. There are defaults for both pools and keep, which we’ll see later, so the only required value here is enable. To specify more than one pool, add them space separated to the config string, e.g. “tank boat plane”.

# 000.zfs-snapshot
hourly_zfs_snapshot_enable="YES"
hourly_zfs_snapshot_pools="tank"
hourly_zfs_snapshot_keep=6
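
For instance, a setup covering the three pools from that example would look something like this (boat and plane are only placeholder pool names, not pools from my system). Note that the keep count applies to each pool separately.

```sh
# 000.zfs-snapshot
hourly_zfs_snapshot_enable="YES"
hourly_zfs_snapshot_pools="tank boat plane"		# space separated list
hourly_zfs_snapshot_keep=6				# kept per pool
```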

The daily section is almost identical, but we instead keep the last 7 days. We’re also enabling a daily zfs status script that is in the default setup, but disabled.

# Daily options

# 000.zfs-snapshot
daily_zfs_snapshot_enable="YES"
daily_zfs_snapshot_pools="tank"
daily_zfs_snapshot_keep=7

# 404.status-zfs
daily_status_zfs_enable="YES"

Weekly and monthly have the same configuration options; below we’re keeping the last 5 weeks and the last 2 months.

# Weekly options

# 000.zfs-snapshot
weekly_zfs_snapshot_enable="YES"
weekly_zfs_snapshot_pools="tank"
weekly_zfs_snapshot_keep=5

# Monthly options

# 000.zfs-snapshot
monthly_zfs_snapshot_enable="YES"
monthly_zfs_snapshot_pools="tank"
monthly_zfs_snapshot_keep=2

A final section configures the monthly scrubbing. Similarly to the snapshot config sections, there’s an enable line and a pools line. Here we’ve enabled the monthly scrub on the tank pool.

# 998.zfs-scrub
monthly_zfs_scrub_enable="YES"
monthly_zfs_scrub_pools="tank"
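
The scrub script itself (998.zfs-scrub) isn’t listed in this post. Purely as a rough sketch, assuming it follows the same pattern as the snapshot scripts further down, it could look something like the following. The function name scrub_pools and the ZPOOL override are made up for this example; only zpool scrub and the two monthly_zfs_scrub_* settings come from the config above.

```sh
#!/bin/sh
# Sketch of a monthly scrub script; NOT the actual 998.zfs-scrub.

# If there is a global system configuration file, suck it in.
if [ -r /etc/defaults/periodic.conf ]; then
    . /etc/defaults/periodic.conf
    source_periodic_confs
fi

# ZPOOL is a hypothetical override so the script can be dry-run with
# ZPOOL=echo; it defaults to the real zpool command.
ZPOOL=${ZPOOL:-zpool}

# start a scrub on each pool in the space separated list $1
scrub_pools()
{
    for pool in $1; do
        printf '\tstarting scrub of %s\n' "$pool"
        $ZPOOL scrub "$pool"
    done
}

case "$monthly_zfs_scrub_enable" in
    [Yy][Ee][Ss])
        echo ""
        echo "Doing zfs scrubs:"
        scrub_pools "${monthly_zfs_scrub_pools:-tank}"
        ;;
    *)
        ;;
esac
```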

periodic hourly – adding hourly script support to periodic

The next thing we need to do is add hourly support to periodic. While that might sound complicated it’s actually very straightforward. We start by creating a directory to house the hourly files.

# mkdir /etc/periodic/hourly

And then add the following line to /etc/crontab just before the line for ‘periodic daily’

1	*	*	*	*	root	periodic hourly

That’s it; you now have a place to put scripts that will be run every hour at :01.
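
A quick way to double-check the edit is to list the periodic entries in the crontab. This is only a convenience sketch: the function name list_periodic_jobs is made up, and it just greps the file; it doesn’t validate the schedule.

```sh
#!/bin/sh
# list_periodic_jobs [FILE]
# print the lines of a crontab (default /etc/crontab) that run periodic,
# skipping commented-out lines
list_periodic_jobs()
{
    grep -v '^[[:space:]]*#' "${1:-/etc/crontab}" | grep 'periodic '
}
```

Calling list_periodic_jobs with no arguments should show the new hourly line alongside the stock daily, weekly, and monthly ones. You can also run periodic hourly by hand as root to see what an hourly run reports.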

hourly/daily/weekly/monthly scripts – the glue between periodic and zfs-snapshot

Now we’ll get to the scripts that make all of this configuration do something. It’s unlikely that you’ll ever have to muck with any of these, but in case you’re curious I’ll go ahead and walk through them. We’ll start with the hourly snapshot script (/etc/periodic/hourly/000.zfs-snapshot).

 1  #!/bin/sh
 2
 3  # If there is a global system configuration file, suck it in.
 4  #
 5  if [ -r /etc/defaults/periodic.conf ]
 6  then
 7      . /etc/defaults/periodic.conf
 8      source_periodic_confs
 9  fi
10
11  pools=$hourly_zfs_snapshot_pools
12  if [ -z "$pools" ]; then
13      pools='tank'
14  fi
15
16  keep=$hourly_zfs_snapshot_keep
17  if [ -z "$keep" ]; then
18      keep=6
19  fi
20
21  case "$hourly_zfs_snapshot_enable" in
22      [Yy][Ee][Ss])
23          . /etc/periodic/zfs-snapshot
24          do_snapshots "$pools" $keep 'hourly'
25          ;;
26      *)
27          ;;
28  esac

Lines 3-9 are boilerplate periodic script stuff. Lines 11-19 look for the values we configured earlier and use defaults if they’re not specified. Lines 21, 22, and 25-28 are shell case statement plumbing borrowed from one of the other periodic scripts; it mainly just makes sure that hourly_zfs_snapshot_enable is set to YES, ignoring case. Line 23 pulls in (think #include) some common zfs snapshotting code that we’ll get to next, and finally line 24 calls the snapshotting function with the configured pools, the keep count, and the type, hourly. The daily, weekly, and monthly scripts are identical with hourly replaced by the appropriate value throughout.
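
If the default handling and the case statement look unfamiliar, here is the same idea boiled down. It’s an illustration rather than part of the real scripts: is_enabled is a made-up name, and ${var:-default} is just a shorter spelling of the -z checks in the script.

```sh
#!/bin/sh
# is_enabled VALUE
# succeeds when VALUE is "yes" in any mix of upper and lower case
is_enabled()
{
    case "$1" in
        [Yy][Ee][Ss]) return 0 ;;
        *)            return 1 ;;
    esac
}

# use the configured value, or fall back to a default when unset or empty
keep=${hourly_zfs_snapshot_keep:-6}
pools=${hourly_zfs_snapshot_pools:-tank}

if is_enabled "$hourly_zfs_snapshot_enable"; then
    echo "hourly snapshots enabled for: $pools (keeping $keep)"
else
    echo "hourly snapshots disabled"
fi
```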

zfs-snapshot – the workhorse

There’s too much here to walk through in detail, so I’ll let you read through the code. I’ve tried to do a decent job of in-line commenting. If you have questions or want clarification feel free to ask…

#!/bin/sh

# checks to see if there's a scrub in progress
scrub_in_progress()
{
  pool=$1

  if zpool status $pool | grep "scrub in progress" > /dev/null; then
    return 0
  else
    return 1
  fi
}

# take the appropriately named snapshot
create_snapshot()
{
    pool=$1

    case "$type" in
        hourly)
        now=`date +"$type-%Y-%m-%d-%H"`
        ;;
        daily)
        now=`date +"$type-%Y-%m-%d"`
        ;;
        weekly)
        now=`date +"$type-%Y-%U"`
        ;;
        monthly)
        now=`date +"$type-%Y-%m"`
        ;;
        *)
        echo "unknown snapshot type: $type"
        exit 1
    esac

    # create the now snapshot
    snapshot="$pool@$now"
    # look for a snapshot with this name (-t snapshot lists snapshots even
    # where a plain zfs list hides them)
    if zfs list -H -t snapshot -o name | sort | grep "^$snapshot$" > /dev/null; then
        echo "	snapshot, $snapshot, already exists"
    else
        echo "	taking snapshot, $snapshot"
        zfs snapshot -r $snapshot
    fi
}

# delete the named snapshot
delete_snapshot()
{
    snapshot=$1
    echo "	destroying old snapshot, $snapshot"
    zfs destroy -r $snapshot
}

# take a type snapshot of pool, keeping keep old ones
do_pool()
{
    pool=$1
    keep=$2
    type=$3

    # create the regex matching the type of snapshots we're currently working
    # on
    case "$type" in
        hourly)
        # hourly-2009-01-01-00
        regex="$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]$"
        ;;
        daily)
        # daily-2009-01-01
        regex="$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$"
        ;;
        weekly)
        # weekly-2009-01
        regex="$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]$"
        ;;
        monthly)
        # monthly-2009-01
        regex="$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]$"
        ;;
        *)
        echo "unknown snapshot type: $type"
        exit 1
    esac

    create_snapshot $pool $type

    # get a list of all of the snapshots of this type sorted alpha, which
    # effectively is increasing date/time
    # (using sort as zfs's sort seems to have bugs)
    # the regex is quoted so the shell can't expand its [0-9] brackets
    snapshots=`zfs list -H -t snapshot -o name | sort | grep "^$regex"`
    # count them
    count=`echo $snapshots | wc -w`
    # only prune when there are more snapshots than we want to keep
    if [ $count -gt $keep ]; then
        # how many items should we delete
        delete=`expr $count - $keep`
        count=0
        # walk through the snapshots, oldest first, destroying them until
        # only the newest $keep are left
        for snapshot in $snapshots; do
            if [ $count -ge $delete ]; then
                break
            fi
            delete_snapshot $snapshot
            count=`expr $count + 1`
        done
    fi
}

# take snapshots of type, for pools, keeping keep old ones,
do_snapshots()
{
    pools=$1
    keep=$2
    type=$3

    echo ""
    echo "Doing zfs $type snapshots:"
    for pool in $pools; do
        if scrub_in_progress $pool; then
          echo "	skipping snapshot of $pool, scrub in progress"
        else
          do_pool $pool $keep $type
        fi
    done
}
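
The pruning step in do_pool is easier to see on a plain list. The sketch below is a stand-alone illustration, not part of zfs-snapshot: it reads snapshot names on stdin and prints the ones that would be destroyed, relying on the same fact the script does, that the names sort oldest-first. The function name expired_snapshots and the sample names are made up.

```sh
#!/bin/sh
# expired_snapshots KEEP
# read snapshot names on stdin, print all but the newest KEEP of them
expired_snapshots()
{
    keep=$1
    # sort oldest first and load the names into the positional parameters
    set -- `sort`
    delete=$(( $# - keep ))
    for snapshot in "$@"; do
        if [ "$delete" -le 0 ]; then
            break
        fi
        echo "$snapshot"
        delete=$(( delete - 1 ))
    done
}

# four hourly snapshots, keep the newest two
printf '%s\n' \
    tank@hourly-2009-04-15-03 \
    tank@hourly-2009-04-15-01 \
    tank@hourly-2009-04-15-04 \
    tank@hourly-2009-04-15-02 | expired_snapshots 2
```

With four snapshots and a keep of 2 this prints the two oldest names, the -01 and -02 ones; with fewer snapshots than keep it prints nothing.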

ZFS snapshots reset/restart zfs scrub on FreeBSD 7.0

April 12th, 2009

I ran into an interesting problem this morning that took a little digging and thinking to figure out. When you start a zpool scrub on FreeBSD and then take a zfs snapshot, the scrub restarts from the beginning (at least as of this writing). I couldn’t find any references to this on FreeBSD, but did find mentions of ZFS scrubs resetting on OS X. The same thread claimed the issue was addressed in OpenSolaris versions newer than the one that had been ported to OS X, and the same is probably true for FreeBSD. I started a scrub last night before I went to bed, and when I woke up this morning it showed a lower % done than the last time I’d looked; that’s not good. I checked in on it every few minutes for a while and noticed that it was progressing pretty fast, but then started over after a while. It didn’t immediately occur to me that this happened at the top of the hour, when my snapshot script had just run, but I got there a few minutes later.

At any rate, once I knew what was going on it made perfect sense and was pretty easily addressed. As part of addressing it I’ve generalized my snapshot setup and converted it to be run by periodic. A post with that is coming as soon as I live with it for a bit and make sure the kinks are worked out. It has actually turned out really cool: hourly, daily, weekly, and monthly snapshots, monthly scrubs (during which snapshots are suspended), and it’s all configurable…

ZFS snapshots — poor man’s backup — solution to the ‘rm *’ whoops problem

April 4th, 2009

Daily/Hourly Snapshots Script

First off, snapshots != backups, but they’re still really useful: a solution to the “whoops, I didn’t mean to do that” problem. Similar to the benefits of having a delayed slave with MySQL, if you accidentally do something to mess up your world/data you can go back in time a little bit and “undo” it. ZFS gives you great tools for taking file system snapshots and makes recovering from problems possible. I use the following simple script I coded up on my newly built FreeBSD 7.0 full ZFS system.

#!/bin/sh

ZFS=zfs
POOL=tank

# i run this in a cron tab with output redirected to a
# log file, this gives me a ran at time
date

# delete the last hourly snapshot, the old one that
# should go away, in this case 2 hours ago
LASTHOUR=`date -v-2H +"%Y-%m-%d-%H"`
echo "deleting hourly snapshot $LASTHOUR of $POOL"
$ZFS destroy -r $POOL@$LASTHOUR
# create the new snapshot, this hour
HOURLY=`date +"%Y-%m-%d-%H"`
echo "taking hourly snapshot $HOURLY of $POOL"
$ZFS snapshot -r $POOL@$HOURLY

# at 12:00 utc, ~ 4am local time, i run this server in utc
if [ `date +"%H"` = '12' ]; then
  # same as LASTHOUR, but 1 week ago
  LASTWEEK=`date -v-1w +"%Y-%m-%d"`
  echo "deleting daily snapshot $LASTWEEK of $POOL"
  $ZFS destroy -r $POOL@$LASTWEEK
  # todays
  DAILY=`date +"%Y-%m-%d"`
  echo "taking daily snapshot $DAILY of $POOL"
  $ZFS snapshot -r $POOL@$DAILY
fi

It takes hourly snapshots, keeping around the last 2, and daily snapshots, keeping around the last 7. That’s good enough to suit my purposes, but the script could easily be extended to take more frequent snapshots or to do longer term snapshotting: weekly, monthly, …

Disk Usage

One awesome thing is that if you have data that is only ever added to, never taken away, snapshots won’t take up any extra space. If you change data, both the old and the new versions of the changed blocks have to be stored, so those changes can cost up to double the space. Deleting won’t free up the space so long as a snapshot still references it. With my use-case I see about 3% overhead for snapshots on my frequently used pieces and nearly 0% everywhere else.

Viewing Current Usage

Useful commands for taking a peek at snapshots and space usage:

# zfs list -t filesystem
# zfs list

The first will show only filesystems, no snapshots etc.; it’s similar in purpose to df. The second will show all filesystem and snapshot usage and tells you how much space each is taking up. (On newer ZFS versions a plain zfs list leaves snapshots out by default; zfs list -t all brings them back.) If you have evenly spaced snapshots over time it can give you an idea of how much churn you’re seeing.

An example of the output from zfs list for one of my filesystems and its snapshots:

tank/media                54.7G  2.80T  47.9G  /media/media
tank/media@2009-03-29         0      -  46.3G  -
tank/media@2009-03-30         0      -  46.3G  -
tank/media@2009-03-31       43K      -  47.5G  -
tank/media@2009-04-01     1.20G      -  49.3G  -
tank/media@2009-04-02      350M      -  46.5G  -
tank/media@2009-04-03       52K      -  47.4G  -
tank/media@2009-04-04       48K      -  47.1G  -
tank/media@2009-04-05-04      0      -  47.9G  -
tank/media@2009-04-05-05      0      -  47.9G  -

Reading through the above, the overall usage is at 54.7G. Nothing has changed in the past 2 hours. Tiny little bits of data have changed in the last two days, and the two days before that saw decent size chunks changing, at 350M and 1.2G. This is a pretty common pattern for this filesystem for me. The snapshots give me a place to go when I accidentally delete a file or make an unintended change.
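
As a rough check on how much space the snapshots are holding on to: a filesystem’s USED column includes its snapshots, while REFER is what the live filesystem currently points at. So for a filesystem with no child filesystems or reservations (an assumption here), the difference is roughly the space kept alive only by snapshots. A small sketch plugging in the numbers from the tank/media line; the function name is made up:

```sh
#!/bin/sh
# snapshot_overhead USED REFER
# both values in the same unit (gigabytes here); prints the space held only
# by snapshots and its share of the total
snapshot_overhead()
{
    awk -v used="$1" -v refer="$2" 'BEGIN {
        held = used - refer
        printf "held by snapshots: %.1fG (%.1f%% of used)\n", held, held / used * 100
    }'
}

# numbers from the tank/media line in the listing
snapshot_overhead 54.7 47.9
```

This comes out larger than the sum of the per-snapshot USED values because blocks shared by several snapshots aren’t charged to any single one of them.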

Getting At Your Snapshot Data

Oh, one more thing… You can get at the snapshots using the .zfs directory. So if the tank/media filesystem was mounted at /media/ you’d find yesterday’s snapshot at /media/.zfs/snapshot/2009-04-03.
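
Restoring a file is then just a copy out of the snapshot directory, which lives under .zfs/snapshot (singular) at the top of the filesystem. A small sketch; the function name and the example path are invented for illustration:

```sh
#!/bin/sh
# restore_from_snapshot MOUNTPOINT SNAPSHOT RELATIVE_PATH
# copy one file from a snapshot back into the live filesystem
restore_from_snapshot()
{
    src="$1/.zfs/snapshot/$2/$3"
    dst="$1/$3"
    if [ ! -e "$src" ]; then
        echo "no such file in snapshot: $src" >&2
        return 1
    fi
    # -p keeps the original timestamps and permissions
    cp -p "$src" "$dst"
}

# e.g. bring back a file as it was in the 2009-04-03 snapshot:
#   restore_from_snapshot /media 2009-04-03 music/some-file.mp3
```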

FreeBSD ZFS Root, running a full system in ZFS

March 23rd, 2009

I’ll start with a little bit of back story, if you’re just interested in the tech skip ahead…

FreeBSD and Me

FreeBSD and I have a long history together, in fact just about as long as I’ve been playing with computers. It was ~1996; I had just graduated from high school and managed to find the account sign up page for the University of Kentucky. I signed up for all of the accounts it would give me, having no idea what they were since I didn’t start school for another 4 months or so. One of those accounts turned out to be email, pop.uky.edu. The server provided telnet access, so I logged on and was greeted with a motd including FreeBSD. Up until that point I’d only been exposed to Windows. I knew that UNIX systems existed, but didn’t really have access to one. I could tell this was a UNIX system and the word Free piqued my interest. It turned out that it was a dual-P2 with a few hundred megabytes of RAM. I was pretty amazed: a pretty modest system hosting 40k users’ email accounts. A few minutes later I was downloading several (probably 8-12) floppy images and on my way to having a UNIX of my own. I haven’t liked Windows since.

I recently purchased a new system primarily intended as a network storage server. It’s a fairly modest, or at least cheap, system in all respects except hard drive space: I have 3 identical Seagate drives weighing in at 1.5 TB each. I knew I wanted to give ZFS a try, so I started looking at OpenSolaris. I never really liked the Solaris env and quickly found Nexenta, which seemed to address that complaint. Problem turned out to be that Nexenta is only available in 32-bit flavors, wtf is up with that, and therefore didn’t cope with 4G of RAM or 1.5TB HDs (I can’t remember which). In reading around about ZFS I had run across something talking about ZFS support in FreeBSD. In reality I should have started there, but I wanted to mess around with DTrace as well; little did I know that it has also been ported to FreeBSD. So an hour or so later I had a working FreeBSD system, fully running on ZFS.
</backstory>

Requirements

  • A system with a spare 3G or more in its own partition (though you’ll want way more for it to be interesting.)
  • FreeBSD bootonly iso, burnt on to a CDROM (or another installation method)
  • A couple hours to play around with it all

I initially worked off of the process described here; what follows is pretty similar and changes only where I’ve optimized out a step or two or just done things according to my situation, ymmv. As always please read my disclaimer thoroughly.

My Setup

  • 3 1.5TB hard drives in fully dedicated disk mode (the only part that’s really relevant)
  • New Egg shared wish list for the full details.
  • Old PCI 3-com Ethernet card as the NIC on the motherboard didn’t seem to get along with FreeBSD.

FreeBSD Install

I won’t walk through every detail of my process, but some of the big ones include

  1. Use entire disk for all 3 disks, installing bootloader on all 3
  2. Create a 1G partition on each disk and allocate the rest to a 2nd partition. The important thing here, since I’m using raidz, is that the involved partitions are identical in size.
    • ad4s1a: 1G UFS, / (during install anyway)
    • ad4s1d: rest of drive, initially set to mount at /zfs0, but immediately removed the mount point (takes a long time to format the file system and we don’t need it to be)
    • ad6s1a: 1G UFS, /r2, will eventually be used as a redundant/backup root
    • ad6s1d: same as ad4s1d
    • ad8s1b: 1G swap
    • ad8s1d: same as ad4s1d
  3. Otherwise install whatever you like, I always start with a minimal install and add things as I need/want them.

Post Install – The Real Stuff

After removing the install CD and rebooting, select single user mode:

mount -w /

Create the raid pool, allocating whatever partitions you have to use, my command looked like:

zpool create tank raidz /dev/ad4s1d /dev/ad6s1d /dev/ad8s1d

We don’t want to mount the pool as a whole, so unset its mountpoint

zfs set mountpoint=none tank

Now we’ll create our root filesystem in the tank pool. We’re specifying a temporary mount point that we can use for now; later we’ll disable automatic mounting for this filesystem, as it will already be mounted as root at boot.

zfs create -o mountpoint=/tank tank/root

This is just an example of how you would create other filesystems inside of the tank pool. It’s safe to create home and go ahead and mount it in its final location so long as you didn’t create users during the initial install process. If you did, then you need to do something similar to the root copy we’ll be doing in a second to copy the data from the old partition to the new.

zfs create -o mountpoint=/home tank/home

Check out what you’ve created with

df -h
zfs list

Now enable ZFS on boot

echo 'zfs_enable="YES"' >> /etc/rc.conf

And then copy all of the current, non-ZFS, root data over to the ZFS root partition.

find -x / | cpio -pmd /tank 

Now remove the boot directory we just copied over

rm -rf /tank/boot

And then create a point at which the UFS boot directory will be mounted so that we can update it later if need be.

mkdir /tank/bootdir
cd /tank
ln -s bootdir/boot boot

Now we need to tell the UFS bootloader to enable ZFS and to boot from our root ZFS filesystem. We also need to do some ZFS tuning based on information found in the wiki at ZFSTuningGuide, check it out for more info on what’s being set there. The specific values depend mainly on the amount of RAM you have and what (else) you’ll be using the box for.

echo '
vm.kmem_size="1024M"
vm.kmem_size_max="1024M"
#vfs.zfs.arc_max="100M"
zfs_load="YES"
vfs.root.mountfrom="zfs:tank/root"
' >> /boot/loader.conf

Edit the ZFS root’s fstab, /tank/etc/fstab, so that the UFS bootdir is mounted where we pointed the symlink up above; the device portion of this may vary with your disk layout.

/dev/ad4s1a /bootdir	ufs	rw	1	1
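
At this point, while the new root is still mounted at /tank, it’s worth a quick look that the pieces are in place before going any further. The check below is only a sketch with a made-up function name, preflight; it greps for the settings added in the steps above and takes the two file locations as arguments.

```sh
#!/bin/sh
# preflight LOADER_CONF FSTAB
# report any of the settings from the steps above that are missing
preflight()
{
    rc=0
    for setting in 'zfs_load="YES"' 'vfs.root.mountfrom="zfs:'; do
        if ! grep -q "^$setting" "$1"; then
            echo "missing from $1: $setting"
            rc=1
        fi
    done
    if ! grep -q '[[:space:]]/bootdir[[:space:]]' "$2"; then
        echo "missing from $2: /bootdir mount"
        rc=1
    fi
    return $rc
}

# typical use, while the new root is still mounted at /tank:
#   preflight /boot/loader.conf /tank/etc/fstab && echo "looks ok"
```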

And finally change the root filesystem’s mountpoint so that ZFS won’t try to mount it automatically.

cd /
zfs set mountpoint=legacy tank/root

That’s pretty much it, just reboot and check it out.

The last step is to mirror the UFS boot partition onto a 2nd drive in case the first dies. Copy from inside /bootdir so that the files land at the top of /r2 rather than under /r2/bootdir:

cd /bootdir
find -x . | cpio -pmd /r2

Separate filesystems can be very handy for bookkeeping and concise snapshotting. Any time you want to add a new one, you just do something like the following

zfs create -o mountpoint=/media/music tank/music