Daily/Hourly Snapshots Script
First off, snapshots != backups, but they’re still really useful: a solution to the “whoops, I didn’t mean to do that” problem. Similar to the benefit of having a delayed slave with MySQL, if you accidentally do something to mess up your world/data you can go back in time a little bit and “undo” it. ZFS gives you great tools for taking file system snapshots, which makes recovering from these mistakes straightforward. I use the following simple script, which I wrote for my newly built FreeBSD 7.0 full-ZFS system.
#!/bin/sh
ZFS=zfs
POOL=tank
# i run this in a cron tab with output redirected to a
# log file, this gives me a ran at time
date
# delete the last hourly snapshot, the old one that
# should go away, in this case 2 hours ago
LASTHOUR=`date -v-2H +"%Y-%m-%d-%H"`
echo "deleting hourly snapshot $LASTHOUR of $POOL"
$ZFS destroy -r $POOL@$LASTHOUR
# create the new snapshot, this hour
HOURLY=`date +"%Y-%m-%d-%H"`
echo "taking hourly snapshot $HOURLY of $POOL"
$ZFS snapshot -r $POOL@$HOURLY
# at 12:00 utc, ~ 4am local time, i run this server in utc
if [ "`date +"%H"`" = '12' ]; then
    # same as LASTHOUR, but 1 week ago
    LASTWEEK=`date -v-1w +"%Y-%m-%d"`
    echo "deleting daily snapshot $LASTWEEK of $POOL"
    $ZFS destroy -r $POOL@$LASTWEEK
    # today's
    DAILY=`date +"%Y-%m-%d"`
    echo "taking daily snapshot $DAILY of $POOL"
    $ZFS snapshot -r $POOL@$DAILY
fi
It takes hourly snapshots, keeping the last 2, and daily snapshots, keeping the last 7. That’s good enough for my purposes, but the script could easily be extended to take more frequent snapshots or to do longer-term snapshotting: weekly, monthly, …
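As a rough idea of what a longer-term tier could look like, here is a sketch of a weekly snapshot that keeps the last 4. None of this is in my script: the `weekly-` prefix (so the names don't collide with the dailies), the `DRYRUN` switch, and the `run` and `past` helpers are all assumptions made for illustration. `past` tries the BSD `date -v` form used above and falls back to GNU `date -d` so the sketch can also be tried on a non-FreeBSD box.

```sh
#!/bin/sh
# Sketch only: a weekly tier on top of the hourly/daily script.
# DRYRUN, run(), past() and the "weekly-" prefix are illustrative assumptions.
ZFS=zfs
POOL=tank
DRYRUN=${DRYRUN:-1}     # 1 = only print the commands, anything else = run them

# Run a command, or just print it when DRYRUN=1.
run() {
    if [ "$DRYRUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

# Print a past timestamp: BSD date (-v) first, GNU date (-d) as a fallback.
# usage: past <bsd adjustment> <gnu phrase> <format>
past() {
    date -v"$1" +"$3" 2>/dev/null || date -d "$2" +"$3"
}

# Drop the weekly snapshot from 4 weeks ago, then take this week's.
weekly() {
    OLD=weekly-$(past -4w '4 weeks ago' '%Y-%m-%d')
    NEW=weekly-$(date +"%Y-%m-%d")
    echo "deleting weekly snapshot $OLD of $POOL"
    run $ZFS destroy -r "$POOL@$OLD"
    echo "taking weekly snapshot $NEW of $POOL"
    run $ZFS snapshot -r "$POOL@$NEW"
}

# Same 12:00 utc slot as the dailies, Sundays only (date +%u prints 7).
if [ "$(date +%H)" = 12 ] && [ "$(date +%u)" = 7 ]; then
    weekly
fi
```

With `DRYRUN` left at its default it only prints what it would do, which is a cheap way to check the snapshot names before letting it loose on a real pool.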
Disk Usage
One awesome thing is that if your data is only ever added to, never changed or taken away, snapshots won’t take up any extra space. If you change data, both the old and the new copy of the changed blocks have to be stored, so a snapshot costs roughly as much space as has been modified or deleted since it was taken. Deleting files won’t free up their space so long as a snapshot that references them lives. With my use case I see about 3% overhead for snapshots on my frequently used pieces and nearly 0% everywhere else.
Viewing Current Usage
Useful commands for taking a peek at snapshots and space usage:
# zfs list -t filesystem
# zfs list
The first will show only filesystems, no snapshots etc. It’s similar in purpose to df. The second will show all filesystem and snapshot usage on my system, telling you how much space each is taking up. (Newer ZFS releases hide snapshots from a plain zfs list by default; zfs list -t all brings them back.) If you have evenly spaced snapshots over time, it can give you an idea of how much churn you’re seeing.
An example of the output from zfs list for one of my filesystems and its snapshots:
NAME                       USED  AVAIL  REFER  MOUNTPOINT
tank/media                54.7G  2.80T  47.9G  /media/media
tank/media@2009-03-29         0      -  46.3G  -
tank/media@2009-03-30         0      -  46.3G  -
tank/media@2009-03-31       43K      -  47.5G  -
tank/media@2009-04-01     1.20G      -  49.3G  -
tank/media@2009-04-02      350M      -  46.5G  -
tank/media@2009-04-03       52K      -  47.4G  -
tank/media@2009-04-04       48K      -  47.1G  -
tank/media@2009-04-05-04      0      -  47.9G  -
tank/media@2009-04-05-05      0      -  47.9G  -
Reading through the above, the overall usage is at 54.7G. Nothing has changed in the past 2 hours. Tiny little bits of data have changed in the last two days, and the two days before that saw decent-sized chunks changing, at 350M and 1.20G. This is a pretty common pattern for this filesystem for me. The snapshots give me a place to go when I accidentally delete a file or make an unintended change.
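To put numbers on that, here is a small sketch that adds up the listing. The second column of zfs list output is USED and the fourth is REFER. A snapshot's USED only counts space held by that one snapshot alone, while the filesystem's USED includes everything its snapshots are holding on to, so USED minus REFER on the filesystem row is the total snapshot overhead (assuming, as is the case here, that tank/media has no child filesystems). The `listing` and `summarize` names are made up for this example, and the rows are pasted in by hand rather than read from a live zfs list.

```sh
#!/bin/sh
# Rows copied from the zfs list output above (name, USED, AVAIL, REFER, mount).
listing() {
cat <<'EOF'
tank/media 54.7G 2.80T 47.9G /media/media
tank/media@2009-03-29 0 - 46.3G -
tank/media@2009-03-30 0 - 46.3G -
tank/media@2009-03-31 43K - 47.5G -
tank/media@2009-04-01 1.20G - 49.3G -
tank/media@2009-04-02 350M - 46.5G -
tank/media@2009-04-03 52K - 47.4G -
tank/media@2009-04-04 48K - 47.1G -
tank/media@2009-04-05-04 0 - 47.9G -
tank/media@2009-04-05-05 0 - 47.9G -
EOF
}

# Convert sizes such as 43K, 350M or 1.20G to bytes, then report
#   - the sum of the snapshots' own USED values
#   - the filesystem's USED minus its REFER
summarize() {
    awk '
    function bytes(s,    n, u) {
        n = s + 0                     # leading number, e.g. 1.20 from 1.20G
        u = substr(s, length(s))      # last character, the unit if there is one
        if (u == "K") return n * 1024
        if (u == "M") return n * 1024 * 1024
        if (u == "G") return n * 1024 * 1024 * 1024
        if (u == "T") return n * 1024 * 1024 * 1024 * 1024
        return n
    }
    $1 ~ /@/ { unique += bytes($2); next }     # snapshot rows
    { used = bytes($2); refer = bytes($4) }    # the filesystem row
    END {
        gib = 1024 * 1024 * 1024
        printf "unique to single snapshots: %.2fG\n", unique / gib
        printf "total snapshot overhead: %.2fG\n", (used - refer) / gib
    }'
}

listing | summarize
```

For the listing above this prints 1.54G for the first line and 6.80G for the second. The gap between the two is data that more than one snapshot still references, which doesn't show up in any single snapshot's USED.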
Getting At Your Snapshot Data
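Oh, one more thing… You can get at the snapshots using the .zfs directory at the root of each filesystem. It’s hidden by default, so it won’t show up in ls, but you can still cd into it. So if the tank/media filesystem was mounted at /media/ you’d find the April 3rd snapshot at /media/.zfs/snapshot/2009-04-03.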
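Snapshots are read-only, so “undoing” a mistake is just a matter of copying the old file back out of the snapshot directory (`.zfs/snapshot` under the mountpoint). A minimal sketch of that follows. `restore_file` is a hypothetical helper of my own, not a ZFS command, and the `.before-restore` suffix is an arbitrary choice.

```sh
#!/bin/sh
# restore_file is a hypothetical helper, not part of ZFS.
# usage: restore_file <mountpoint> <snapshot name> <path relative to mountpoint>
restore_file() {
    src="$1/.zfs/snapshot/$2/$3"
    dst="$1/$3"
    if [ ! -e "$src" ]; then
        echo "not in snapshot: $src" >&2
        return 1
    fi
    # Keep the current version, if there is one, rather than overwriting it.
    if [ -e "$dst" ]; then
        mv "$dst" "$dst.before-restore" || return 1
    fi
    # -R handles directories too, -p keeps timestamps and permissions.
    cp -Rp "$src" "$dst"
}
```

Something like `restore_file /media 2009-04-03 some/dir/file` would then bring back the April 3rd version of that file (the path is made up) and leave the current one next to it as `file.before-restore`.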