Integrating FreeBSD, ZFS, and Periodic; snapshots and scrubs

April 15th, 2009

Update: the scripts and process outlined here have been used as the foundation for zfs-periodic in the FreeBSD ports collection; you can check it out here.

ZFS on FreeBSD is powerful, especially when coupled with periodic taking hourly, daily, weekly, and monthly snapshots. In this post I’ll provide the scripts and config you need, and walk you step-by-step through setting up zfs snapshots and scrubs with periodic on FreeBSD.

Periodic’s main advantage over the more traditional and obvious method of running a script from a cron job is integration with the notification emails and the standard configuration mechanism. That may not sound like much, but it means that you, a year down the road (or anyone else who inherits the system), only have to look in the obvious place to figure out what’s going on or to make changes.

This is going to be a long post, as there’s a decent amount of code & config to walk through. The files being discussed have been tar’d up and the latest version of them can be downloaded from here.

Configuration – the stuff you might actually want to muck with

We’ll start with the configuration (/etc/periodic.conf) as it’s the most relevant portion, or at least the most likely to be edited. Out of the box FreeBSD supports daily, weekly, and monthly periodic tasks; we’re going to add an hourly one so that we can take hourly snapshots. The first section of config sets who the output of the hourly scripts should go to, and whether it should be sent when everything succeeded, when a script has something to report (a return code of 1), or when something is misconfigured. Hourly emails seem a bit much, so we’ve disabled them for runs where everything goes well. We do want messages about problems, and I’ve left badconfig at the same value as in the other time-frames.

# Hourly options
hourly_output="root"					# user or /file
hourly_show_success="NO"				# scripts returning 0
hourly_show_info="YES"					# scripts returning 1
hourly_show_badconfig="NO"				# scripts returning 2

The next section configures hourly snapshots. In this case we’re enabling them for the pool tank and keeping the 6 most recent around. There are defaults, which we’ll see later, for both pools and keep, so the only required value here is enable. To specify more than one pool, add them space-separated to the config string, e.g. “tank boat plane”.

# 000.zfs-snapshot
hourly_zfs_snapshot_enable="YES"
hourly_zfs_snapshot_pools="tank"
hourly_zfs_snapshot_keep=6
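
As a rough illustration of the multi-pool form, here is a sketch of such a fragment. The pool names boat and plane and the keep count of 12 are made-up values, not settings from a real system. Because periodic.conf is sourced as shell, a script can walk the list with an ordinary for loop, which is what the second half shows.

```shell
# /etc/periodic.conf fragment (illustrative values only)
hourly_zfs_snapshot_enable="YES"
hourly_zfs_snapshot_pools="tank boat plane"
hourly_zfs_snapshot_keep=12

# how a script sees the setting: the unquoted expansion is split on
# whitespace, giving one loop iteration per pool
for pool in $hourly_zfs_snapshot_pools; do
    echo "would snapshot $pool, keeping $hourly_zfs_snapshot_keep"
done
```

The loop only prints what it would do; nothing here touches zfs.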

The daily section is almost identical, but we instead keep the last 7 days. We’re also enabling a daily zfs status script that is in the default setup, but disabled.

# Daily options

# 000.zfs-snapshot
daily_zfs_snapshot_enable="YES"
daily_zfs_snapshot_pools="tank"
daily_zfs_snapshot_keep=7

# 404.status-zfs
daily_status_zfs_enable="YES"

Weekly and monthly have the same configuration options. Below we’re keeping the last 5 weeks and the last 2 months.

# Weekly options

# 000.zfs-snapshot
weekly_zfs_snapshot_enable="YES"
weekly_zfs_snapshot_pools="tank"
weekly_zfs_snapshot_keep=5

# Monthly options

# 000.zfs-snapshot
monthly_zfs_snapshot_enable="YES"
monthly_zfs_snapshot_pools="tank"
monthly_zfs_snapshot_keep=2

A final section configures the monthly scrubbing. As with the snapshot config sections, there’s an enable line and a pools line. Here we’ve enabled the monthly scrub on the tank pool.

# 998.zfs-scrub
monthly_zfs_scrub_enable="YES"
monthly_zfs_scrub_pools="tank"
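
The scrub script that reads these two settings isn’t reproduced in this post. Purely as a sketch of what such a script could look like, and not the author’s actual 998.zfs-scrub, the following follows the same shape as the snapshot scripts shown later. The function name do_scrubs and the fallback pool name tank are assumptions made for illustration; zpool scrub is the standard command for starting a scrub.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/periodic/monthly/998.zfs-scrub.
# This is an illustration only, not the script shipped with the post.

# If there is a global system configuration file, suck it in.
if [ -r /etc/defaults/periodic.conf ]; then
    . /etc/defaults/periodic.conf
    source_periodic_confs
fi

# start a scrub on each pool in the space-separated list given as $1
do_scrubs()
{
    echo ""
    echo "Doing zfs scrubs:"
    for pool in $1; do
        echo "  starting scrub of $pool"
        zpool scrub "$pool"
    done
}

# only act when monthly_zfs_scrub_enable is YES (any case);
# fall back to the pool 'tank' when no pools are configured (assumed default)
case "${monthly_zfs_scrub_enable:-NO}" in
    [Yy][Ee][Ss])
        do_scrubs "${monthly_zfs_scrub_pools:-tank}"
        ;;
    *)
        ;;
esac
```

With the enable variable unset the script does nothing, which matches how the other periodic scripts behave when a feature is switched off.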

periodic hourly – adding hourly script support to periodic

The next thing we need to do is add hourly support to periodic. While that might sound complicated it’s actually very straightforward. We start by creating a directory to house the hourly files.

# mkdir /etc/periodic/hourly

Then add the following line to /etc/crontab, just before the line for ‘periodic daily’:

1	*	*	*	*	root	periodic hourly

That’s it. You now have a place to put scripts that will be run every hour at :01.
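
If you’d rather script that edit than make it by hand, something like the following sketch would do. add_hourly_line is a hypothetical helper name, and the sketch assumes the crontab contains a line mentioning ‘periodic daily’, as the stock FreeBSD one does. It prints the modified file to standard output instead of editing it in place, so you can review the result first.

```shell
#!/bin/sh
# Sketch only: add_hourly_line is a hypothetical helper, not a system tool.

# print the crontab file named by $1 with the hourly entry inserted just
# before the first 'periodic daily' line; if an hourly entry is already
# present the file is printed unchanged
add_hourly_line()
{
    crontab_file=$1
    if grep -q 'periodic hourly' "$crontab_file"; then
        cat "$crontab_file"
        return 0
    fi
    awk '
        /periodic daily/ && !inserted {
            print "1\t*\t*\t*\t*\troot\tperiodic hourly"
            inserted = 1
        }
        { print }
    ' "$crontab_file"
}
```

A cautious way to use it is to write the output to a scratch file, for example `add_hourly_line /etc/crontab > /tmp/crontab.new`, compare the two with diff, and only then copy the new file into place.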

hourly/daily/weekly/monthly scripts – hooking the snapshots into periodic

Now we’ll get to the scripts that make all of this configuration do something. It’s unlikely that you’ll ever have to muck with any of these, but in case you’re curious I’ll go ahead and walk through them. We’ll start with the hourly snapshot script (/etc/periodic/hourly/000.zfs-snapshot).

 1  #!/bin/sh
 2
 3  # If there is a global system configuration file, suck it in.
 4  #
 5  if [ -r /etc/defaults/periodic.conf ]
 6  then
 7      . /etc/defaults/periodic.conf
 8      source_periodic_confs
 9  fi
10
11  pools=$hourly_zfs_snapshot_pools
12  if [ -z "$pools" ]; then
13      pools='tank'
14  fi
15
16  keep=$hourly_zfs_snapshot_keep
17  if [ -z "$keep" ]; then
18      keep=6
19  fi
20
21  case "$hourly_zfs_snapshot_enable" in
22      [Yy][Ee][Ss])
23          . /etc/periodic/zfs-snapshot
24          do_snapshots "$pools" $keep 'hourly'
25          ;;
26      *)
27          ;;
28  esac

Lines 3-9 are boilerplate periodic script setup. Lines 11-19 pick up the values we configured earlier and fall back to defaults if they’re not specified. Lines 21-28 are a shell case statement borrowed from one of the other periodic scripts; it just makes sure that hourly_zfs_snapshot_enable is set to YES, ignoring case. Line 23 pulls in (think #include) some common zfs snapshotting code that we’ll get to next, and finally line 24 calls the snapshotting function with the configured pools, the keep count, and the snapshot type, hourly. The daily, weekly, and monthly scripts are identical, with hourly replaced by the appropriate value throughout.
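
For the curious, the defaulting and the enable check can be tried on their own without touching zfs. The sketch below is an alternative spelling, not the script above: it uses the shell’s ${var:-default} expansion in place of the if blocks, and wraps the case pattern in a helper named is_enabled, which is a hypothetical name chosen for this example.

```shell
#!/bin/sh
# Sketch only: is_enabled is a hypothetical helper, not part of periodic.

# succeed when $1 is YES in any mix of upper and lower case
is_enabled()
{
    case "$1" in
        [Yy][Ee][Ss]) return 0 ;;
        *)            return 1 ;;
    esac
}

# ${var:-default} substitutes the default when var is unset or empty
pools=${hourly_zfs_snapshot_pools:-tank}
keep=${hourly_zfs_snapshot_keep:-6}

if is_enabled "${hourly_zfs_snapshot_enable:-NO}"; then
    echo "would snapshot '$pools', keeping $keep"
else
    echo "hourly snapshots disabled"
fi
```

Run with none of the variables set, this reports that hourly snapshots are disabled, mirroring the empty default branch of the case statement in the real script.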

zfs-snapshot – the workhorse

There’s too much here to walk through in detail, so I’ll let you read through the code. I’ve tried to do a decent job of in-line commenting. If you have questions or want clarification, feel free to ask…

#!/bin/sh

# checks to see if there's a scrub in progress
scrub_in_progress()
{
  pool=$1

  if zpool status $pool | grep "scrub in progress" > /dev/null; then
    return 0
  else
    return 1
  fi
}

# take the appropriately named snapshot
create_snapshot()
{
    pool=$1
    type=$2

    case "$type" in
        hourly)
        now=`date +"$type-%Y-%m-%d-%H"`
        ;;
        daily)
        now=`date +"$type-%Y-%m-%d"`
        ;;
        weekly)
        now=`date +"$type-%Y-%U"`
        ;;
        monthly)
        now=`date +"$type-%Y-%m"`
        ;;
        *)
        echo "unknown snapshot type: $type"
        exit 1
    esac

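    # build the name of the snapshot for the current period
    snapshot="$pool@$now"
    # look for a snapshot with exactly this name (-t snapshot is needed
    # because zfs list may leave snapshots out of its default output)
    if zfs list -H -o name -t snapshot | grep -Fxq "$snapshot"; then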
        echo "	snapshot, $snapshot, already exists"
    else
        echo "	taking snapshot, $snapshot"
        zfs snapshot -r $snapshot
    fi
}

# delete the named snapshot
delete_snapshot()
{
    snapshot=$1
    echo "	destroying old snapshot, $snapshot"
    zfs destroy -r $snapshot
}

# take a type snapshot of pool, keeping keep old ones
do_pool()
{
    pool=$1
    keep=$2
    type=$3

    # create the regex matching the type of snapshots we're currently working
    # on, anchored at both ends so that neither a longer pool name nor a
    # longer snapshot name can match by accident
    case "$type" in
        hourly)
        # hourly-2009-01-01-00
        regex="^$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]$"
        ;;
        daily)
        # daily-2009-01-01
        regex="^$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$"
        ;;
        weekly)
        # weekly-2009-01
        regex="^$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]$"
        ;;
        monthly)
        # monthly-2009-01
        regex="^$pool@$type-[0-9][0-9][0-9][0-9]-[0-9][0-9]$"
        ;;
        *)
        echo "unknown snapshot type: $type"
        exit 1
    esac

    create_snapshot $pool $type

    # get a list of all of the snapshots of this type sorted alphabetically,
    # which is effectively increasing date/time
    # (using sort as zfs's sort seems to have bugs)
    snapshots=`zfs list -H -o name -t snapshot | sort | grep "$regex"`
    # count them
    count=`echo $snapshots | wc -w`
    if [ $count -gt $keep ]; then
        # how many snapshots should we delete
        delete=`expr $count - $keep`
        count=0
        # walk through the snapshots, oldest first, deleting them until only
        # the newest $keep remain
        for snapshot in $snapshots; do
            if [ $count -ge $delete ]; then
                break
            fi
            delete_snapshot $snapshot
            count=`expr $count + 1`
        done
    fi
}

# take snapshots of type, for pools, keeping keep old ones,
do_snapshots()
{
    pools=$1
    keep=$2
    type=$3

    echo ""
    echo "Doing zfs $type snapshots:"
    for pool in $pools; do
        if scrub_in_progress $pool; then
          echo "	skipping snapshot of $pool, scrub in progress"
        else
          do_pool $pool $keep $type
        fi
    done
}
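
To see which snapshots the retention logic would pick without destroying anything, the selection step can be run as a dry run on plain snapshot names. This is a sketch under stated assumptions, not part of the scripts above: snapshots_to_delete is a hypothetical helper that reads names on standard input and prints the ones that would be removed, and the sample names are invented in the daily format.

```shell
#!/bin/sh
# Dry run of the retention logic: no zfs commands are executed.
# snapshots_to_delete is a hypothetical helper used only for illustration.

# read snapshot names from stdin, one per line, and print the oldest ones
# that would have to go in order to leave only the newest $1
snapshots_to_delete()
{
    keep=$1
    # sorting by name is sorting by age, because the date fields in the
    # names run from most to least significant
    set -- `sort`
    delete=$(($# - keep))
    while [ "$delete" -gt 0 ]; do
        echo "$1"
        shift
        delete=$((delete - 1))
    done
}

# four sample daily snapshots, deliberately out of order, keeping 2
printf '%s\n' \
    tank@daily-2009-04-13 \
    tank@daily-2009-04-15 \
    tank@daily-2009-04-12 \
    tank@daily-2009-04-14 |
    snapshots_to_delete 2
```

With the sample input this prints tank@daily-2009-04-12 and tank@daily-2009-04-13, the two oldest names, and prints nothing when there are no more snapshots than the keep count.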
