My current project is to enable network backups of my Mac and my wife’s PC over the internet, so that we have an off-site backup of last resort, should our house burn down, fall over, and then sink into the swamp.
Anyone who’s used a Mac for a long time knows that transferring Mac-native files over the internet is fraught with peril. You risk losing type and creator codes and resource forks, as well as a number of other forms of metadata introduced with MacOS X. So my first step was to determine how I could safely encode my files so that they could make the trip to a foreign server (which would either by a Linux-type box on my web host, or an Amazon S3 account) and then back again, with the file intact for recovery.
A few months ago, the Plasticsfuture blog ran an excellent article comparing the capabilities of darn near every backup and restore program on the Mac. The results were disheartening: Only SuperDuper! could precisely back up and restore a file (although, in my own testing, I found that ChronoSync was updated and can now handle the job).
However, my needs are somewhat different. A network backup requires that I only update changed files, and furthermore, I won’t be accessing them on a filesystem. So SuperDuper’s use of a disk image to perform network backups is straight out. So, too, is ChronoSync, since it can only back up to a network filesystem.
No, I need something that can encode individual files, ideally via a script or some other automated method. Furthermore, I’m mostly just concerned about archiving documents (I don’t want to pay for online storage of my whole hard drive!), so if permissions, ACLs (which I don’t use), or BSD flags are munged, that’s all right. This is truly a last-resort backup, so if I can get resource forks and extended attributes to back up, I’m pretty happy.
Now, according to Apple, a number of command-line utilities have been updated to deal correctly with extended attributes and resource forks – these include tar, cpio, ditto, and zip. There are also a handful of third-party archivers/compressors out there which also claim Mac compatibility such as the x7z (a Mac version of the 7-zip compression algorithm) and StuffIt compression programs, the xar archiver, and Interarchy’s “backup” format. (Note: Interarchy’s backup page is currently 404’ing, but suffice to say it’s an open format that attempts to encode files in such a way as to store all of Apple’s fancy metadata.)
So I figured I’d do what any red-blooded geek would do, and test all these programs to see if they did what they claimed to do. My test was pretty simple: I took a text file, assigned a Finder label and some comments, and added a resource fork and some custom extended attributes. If all these things made it through intact, I’d consider the tool good enough for my purposes.
The results were interesting. First off, every program I tested successfully maintained the resource fork, which is really the most important part of the file to keep around. Additionally, every program except for tar managed to keep the Finder label. As for extended attributes, only tar, the Interarchy backup format and cpio successfully kept the custom extended attributes. This is especially baffling given that the resource fork in MacOS X 10.4 is nothing more than an extended attribute!
The most troubling thing for me was that none of the programs I tested managed to maintain the “Spotlight Comments” I’d added. I frequently use these comments as a way to tag my files for Spotlight searching, so their loss is somewhat problematic. It turns out that these spotlight comments are stored in the invisible .DS_Store file in the same folder as the file I was backing up. So provided I restored (and backed up) the whole folder, that wouldn’t be a problem. Still, it would be nice to see it handle all that.
Update: I hadn’t originally tested it, since Apple hadn’t listed it on their OS X pages as a utility that had been updated to work with resource forks, but some online discussions led me to believe that the pax archiver had also been updated. Indeed, it has, and it successfully maintains resource forks, extended attributes, and Finder labels, just like cpio. It does, however, seem to have bugs when used on systems with ACLs enabled, and like everything else, it loses Finder comments. I have updated the discussion, below, accordingly.
So this leaves me with three solid options: The Interarchy Backup format, pax, and cpio archives. The latter two archive formats are fully compatible with other unix-like systems and can even be expanded using the graphical BOMArchiveHelper on the Mac, so they may be the best choice. Both archivers permit me to compress files in the archive, which is a nice bonus for network archiving. Of the two, pax has some nice advantages, including a larger file size limit on archives and some interesting command-line tricks including the ability to write out to different archive types. Cpio, on the other hand, seems to work on systems with ACLs enabled.
However, I suspect (although I haven’t tested this) that the Interarchy Backup format captures more Mac-specific metadata, since it was designed with exactly that goal in mind, and Apple’s programs are, well, less than entirely consistent in supporting these filesystem features. I do wish Apple could at least provide tools that were consistent and reliable.
It is worthwhile to note that none of the graphical Mac compression utilities managed to maintain extended attributes, not even Apple’s customer zip archiver (which is the same tool used when you archive files in the Finder) nor StuffIt, which has long been a Mac-friendly standby compression program. This leaves Mac users with essentially no easy options for compressing files prior to emailing them or posting them to an FTP site other than disk images (which aren’t cross-platform).
I’m disappointed that the xar archiver, which was designed precisely to handle metadata schemes on different systems, didn’t perform better. It is still a work in progress, so I left some bug reports, and I remain hopeful they will properly support extended attributes and comments in future releases. Since xar can also handle compression as well as encryption, it would be a fantastic solution for off-site backups.
Soon I’ll post how my backup system develops and works over the long-term. If you have any questions or comments, please feel free to sound off, below!