I have been unfortunate enough to learn the hard way that a good backup strategy is an essential, not a nice-to-have. If you think this is melodramatic, imagine for a second that a disk died - right now - that it chose this moment to give up. Where are all the photos? Not the low quality copy uploaded to Facebook, but the high resolution image that came off the camera? Where are all the important e-mails and documents? Without a working, tested backup system in place they would all be lost.
So backups are great - but why incremental backup? Imagine that a document, picture or other file that isn't accessed very often becomes corrupt. With a latest-copy backup system like rsync, or just copying the files to a server or external disk, there would be a backup of the corrupt file, which is of little use. With an incremental backup system it's possible to recover a pre-corruption version from the backup. It's also quicker than copying all the files over to another disk all the time, because it only copies the changed files and, if it's not a naive implementation, only the parts of each file that have changed.
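To make the idea concrete, here is a minimal shell sketch of a file-level incremental snapshot. It is only an illustration, not how rdiff-backup or any other tool is actually implemented. The function name snapshot and its three directory arguments are made up for this example, it only handles a flat directory of regular files, and it assumes both snapshots live on the same filesystem so that hard links work.

```shell
# snapshot SRC PREV NEW (hypothetical helper, for illustration only):
# copy the files in SRC into NEW, reusing unchanged files from PREV.
# Handles a flat directory of regular files only, to keep the sketch short.
snapshot() {
    src=$1 prev=$2 new=$3
    mkdir -p "$new"
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if [ -f "$prev/$name" ] && cmp -s "$f" "$prev/$name"; then
            # Unchanged since the previous snapshot: hard-link to the old copy,
            # so no new data is stored.
            ln "$prev/$name" "$new/$name"
        else
            # New or changed: store a fresh copy.
            cp "$f" "$new/$name"
        fi
    done
}
```

Each run leaves a complete-looking directory per snapshot, but only changed files take up extra space, and an older snapshot still holds the older version of a file that was later changed or corrupted.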
On my Linux box I was using rdiff-backup to keep incremental backups of all my stuff. OSX has Time Machine which does exactly the same thing, but with a cool interface that allows you to browse your backups and restore files using a familiar graphical environment. Unfortunately by default it does not allow backups to network volumes that aren't provided by Apple. Whilst this is mildly annoying it isn't difficult to change.
I currently back up to a server at home running Debian that provides an Apple Filing Protocol (AFP) service, amongst others, to our Macs, but it's not natively supported by OSX for Time Machine backups! I have to credit the following web pages, which were incredibly useful for getting Time Machine working for us:
- A Mac OS X Hints article describing how to create a sparse bundle and enable OSX to use unsupported network volumes for backups: http://www.macosxhints.com/article.php?story=20080420211034137
- A post by Matthias Kretschmann describing how to set up Netatalk (for AFP) and Avahi (for Bonjour) on an Ubuntu box: http://www.kremalicious.com/2008/06/ubuntu-as-mac-file-server-and-time-machine-volume/
Neither of the links worked on its own for me. I used a combination of them and the man pages shipped with OSX to get Time Machine to work, backing up to the server from our Macs. I used Matthias' article to set up the server with AFP and Bonjour. After this I did the following on the Macs:
Firstly, just selecting the network volume to back up to in Time Machine and asking it to create a backup didn't work for me. I had to create a backup sparse bundle first and copy it to the network drive where I wanted to keep the backup. To create the sparse bundle I used the command:
$ hdiutil create -library SPUD -size 1m -fs 'Journaled HFS+' \
    -type SPARSEBUNDLE -volname NAME_AABBCCDDEEFF.sparsebundle \
    NAME_AABBCCDDEEFF.sparsebundle
I replaced the text NAME with the name of the Mac that will use the backup and the text AABBCCDDEEFF with the MAC (not Apple!) address of that Mac's ethernet port. See the Mac OS X Hints article for more information.
This command is subtly different from the one in the Mac OS X Hints article. The command in the article didn't work for me. I copied the sparse bundle to the network volume I wanted to keep the backup on. I haven't tried putting it anywhere different, but I assume it must be in the root directory of the network volume.
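If you'd rather not assemble the file name by hand, something like the following sketch would do it. The function name bundle_name is my own invention, and it simply strips the separators from whatever MAC address it is given without changing the case, so check the result against the naming described in the Mac OS X Hints article.

```shell
# bundle_name NAME MAC (hypothetical helper): print the sparse bundle file
# name for a computer name and an Ethernet MAC address. Colons and dashes
# are stripped from the address; its case is left as given.
bundle_name() {
    mac=$(printf '%s' "$2" | tr -d ':-')
    printf '%s_%s.sparsebundle\n' "$1" "$mac"
}

# Example with a made-up name and address:
bundle_name mymac 00:1B:63:AA:BB:CC
# prints: mymac_001B63AABBCC.sparsebundle
```

On the Mac itself the address could be read with something like ifconfig en0, assuming en0 is the built-in ethernet port on your machine.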
In order for Time Machine to use this volume you must enable OSX to use unsupported network volumes, which takes another command:
# defaults write com.apple.systempreferences TMShowUnsupportedNetworkVolumes 1
I made the mistake of not mounting the backup volume in Finder before configuring Time Machine. Unless the backup volume with the sparse bundle is mounted in Finder, Time Machine won't allow you to select it as a backup location. I then configured Time Machine to use the backup volume. The first backup seemed to take forever, and it didn't appear to be using the full performance of the network. Subsequent backups seem to flow much quicker over the network.
I excluded a few directories that I don't want backed up, either because I don't care about them or I only ever use them as a scratch pad and don't intend for them to take up space on my backup volume. Notably Trash, Desktop and Music are never backed up - I either have copies elsewhere or don't care about their contents.
That's incremental backup working away in the background, keeping an up-to-date copy of all my data without me having to do anything. It offers the same functionality as rdiff-backup, but is much easier to set up and automatically makes a backup whenever it can connect to the backup volume.