Let me say up front: if you're a Linux newbie, this post is not for you. I'm going to talk about the problems I've had automating my computer backups, and some of the esoteric scripts I've written for this task, and if you're not comfortable with shell scripts and the Linux command line, none of this will do you any good. If you're somewhere between "power user" and "system administrator", you might pick up a few ideas. If you're a new user and you need to back up your PC right now, skip down to number 3. It's the only option that doesn't require command line or scripting skills.
Statement of the problem: I want to back up my PC, regularly, to optical media (recordable CD or DVD). I also want to back up my wife's PC, remotely, through our home network connection. I'd like this to be as automatic as possible. And I'd like to be able to do "incremental" backups (backing up only the files that have changed).
Really, is this too much to ask?
Linux gurus: Yes, I know that this problem has been solved, with a program called Amanda. Some day I will use Amanda. I bought, and read, a whole damned book -- Unix Backup and Recovery, which I hasten to add is out of print -- just to learn how to set up Amanda. And when I have a quiet week of time, I will do just that. Amanda is for those with system administrator skills (more on this in Part 2). If you're an open-source guru and you want to do something useful, write an interface that will help a newbie configure and use Amanda.
This is the short form of this article, with program listings omitted. For the full version with listings, click here.
1. "cdbackup": smbmount script
When I started this project, in 2002, I had just started using Red Hat Linux and my wife was still using Windows 9x, which added to the challenge. I had configured Samba on my machine to access her system over the network. I learned about smbmount, which would allow a Samba shared disk drive to be mounted in my local filesystem. And I learned about mkisofs and cdrecord, to create and write CD-R images. Since all of these required long and complicated command lines, I wrote a simple shell script to automate the process.
Every week, I would issue the following commands (as root):
cdbackup brad
cdbackup d
cdbackup wendy
I should mention that "d" is a Windows partition on my hard drive, which contained my pre-Linux -- but still current -- user files.
Now this was a fairly simple script, but it only worked because each of the three directories I was backing up was smaller than the CD-R limit of 700 MB. And it was rather wasteful of CD-Rs, since I always had to burn three. (Another problem, which I had yet to learn, was that mkisofs would not preserve the permissions of my Linux files.) As my directories grew beyond 700 MB, I needed a better solution.
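Since the listing is omitted from this short form, here is a minimal sketch of what a script of this kind might look like. It is not the original cdbackup. The share name //wendypc/c, the mount points /mnt/smb and /mnt/d, the image path, the burner address 0,0,0, and the DRY_RUN switch are all assumptions made up for illustration. The -R option asks mkisofs for Rock Ridge extensions, which record Unix ownership and modes; plain ISO-9660 without them does not.

```shell
#!/bin/sh
# Hypothetical sketch of a "cdbackup"-style script; NOT the original listing.
# Assumed names: share //wendypc/c, mount points, image path, burner address.
DRY_RUN=${DRY_RUN:-1}              # 1 = only print the commands
MOUNTPOINT=${MOUNTPOINT:-/mnt/smb} # where the Samba share gets mounted
IMAGE=${IMAGE:-/tmp/cdbackup.iso}  # temporary CD image
CDR_DEV=${CDR_DEV:-0,0,0}          # cdrecord device address

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# cdbackup brad|d|wendy: build an ISO image of one directory and burn it.
cdbackup() {
    case $1 in
        brad)  src=/home/brad ;;
        d)     src=/mnt/d ;;
        wendy) src=$MOUNTPOINT
               run smbmount //wendypc/c "$MOUNTPOINT" -o guest || return 1 ;;
        *)     echo "usage: cdbackup brad|d|wendy" >&2
               return 2 ;;
    esac
    # -R adds Rock Ridge (Unix modes/owners), -J adds Joliet (Windows names).
    run mkisofs -R -J -o "$IMAGE" "$src" &&
        run cdrecord -v dev="$CDR_DEV" "$IMAGE"
    status=$?
    if [ "$1" = wendy ]; then
        run smbumount "$MOUNTPOINT"
    fi
    return $status
}
```

With DRY_RUN left at 1, "cdbackup wendy" only prints the four commands it would run, which is a cheap way to check the command lines before wasting a CD-R.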
2. "tarbackup": smbtar multivolume script
I knew (from somewhere) that the standard Unix/Linux "tar" archiver was able to split an archive across multiple volumes. (In my old MS-DOS days, I could do this with pkzip.) And I found this useful web page, "Automating backups with tar", which explained how to use tar for incremental backups. So I started hacking the script from that web page to automatically write to multiple CD-Rs. I discovered the hard way that tar doesn't work well with smbmount volumes (it gets the modification dates confused), but there is another program called smbtar that works over Samba.
tarbackup daily
tarbackup full
Since my PC had both a CD writer and a ZIP drive, and CD-Rs were still fairly expensive, I planned to do a daily incremental backup to the ZIP disk and a weekly backup to CD-R. As it happens, I rarely did the daily backup, trusting to the weekly "full" backup.
This script -- actually, three scripts -- was quite a bit more complicated, since it had to handle writing a CD-R and prompting for a new CD-R when a volume was full. But it worked, and was the most automatic of all the solutions I've tried.
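To make the "full" versus "daily" idea concrete, here is a small sketch of incremental backups with GNU tar's --listed-incremental option. It is an illustration under assumptions, not the tarbackup script: the function names and the snapshot-file argument are invented for the example, and it requires GNU tar. GNU tar also has --multi-volume and --tape-length for splitting an archive across media, but those prompt for each new volume, so they are left out here.

```shell
#!/bin/sh
# Sketch only: GNU tar incremental backups driven by a snapshot file.
# Function names and argument order are made up for this example.

# full_backup SRC_DIR ARCHIVE SNAPSHOT
# Removing the snapshot first forces a level-0 (full) dump.
full_backup() {
    rm -f "$3"
    tar --create --file="$2" --listed-incremental="$3" -C "$1" .
}

# incremental_backup SRC_DIR ARCHIVE SNAPSHOT
# With an existing snapshot, tar archives only files modified since
# the run that wrote it, then updates the snapshot.
incremental_backup() {
    if [ ! -f "$3" ]; then
        echo "no snapshot at $3; run full_backup first" >&2
        return 1
    fi
    tar --create --file="$2" --listed-incremental="$3" -C "$1" .
}
```

A weekly run would call full_backup and a daily run incremental_backup with the same snapshot file. Restoring means extracting the full archive first and then each incremental in order.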
3. Xandros CD Writer
Around 2004 I switched distributions, and installed Xandros Linux version 2 on both my wife's computer and my own. (Her Windows PC was becoming more unreliable, and Xandros was the first Linux distro that I thought she would be comfortable with.) Unfortunately, Xandros broke my backup script -- for some reason which I have never discovered, the "cdrecord" program in the Xandros distribution didn't work from the command line, even though I could use the Xandros File Manager to burn CD-Rs with no problem.
(A note to users of other distros: as of version 2, Xandros does not include a CD-burning program like K3B or X-CD-Roast. Instead, they have extended the Konqueror file manager to include CD burning functions. The result looks remarkably like the CD burning program I used back in my Windows days, and is quite friendly to new users. But almost any Linux CD burning software should be able to write a directory to a "data" CD.)
This began a period of backup drudgery. Every week I would launch the File Manager, create a new CD project, and drag the directories I wanted to back up over to the project. (It seemed that, if I "saved" a CD project, it saved the complete file list -- so the next week it would back up only the same files, and not notice any new files. Thus every week I had to start from scratch.) I would add my home directory, and then have to remove some of the subdirectories to get below 700 MB. And then I'd have to start a new CD for those excluded subdirectories. I didn't even try to do a network backup; I installed a CD writer on my wife's PC, and once a week I'd sneak in when she wasn't using her PC to do a quick backup.
It was entirely manual, and rather slow. I began to avoid doing backups.
One advantage -- or so I thought -- was that the CDs were readable on any computer without special software. But when my wife's hard drive crashed, and I had to restore from a backup set, I learned that the standard ISO-9660 CD-ROM format does not support Linux file permissions! Group, owner, and permissions were lost from her home directory. And while most of the files use the default ownership and 644 permissions, a few do not...and things stop working. It took a while to straighten out that mess.
4. tar over ssh, and DVD script
My backups needed to be easier to make, and easier to restore. Fortunately, two events happened at about this time: I installed a DVD writer, and a friend tipped me to using tar over ssh.
The DVD writer let me replace my weekly stack of six CD-Rs with a single DVD-R -- easier to use, and less expensive per megabyte. Xandros 2 had a working command-line DVD burning program, growisofs, so I could write a script once again, to create tar archives (which preserve Linux file permissions) and write them to CD or DVD.
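The point about permissions is easy to check for yourself. The sketch below archives a directory and restores it elsewhere; archive_dir and restore_dir are names invented for this example. The -p option on extraction tells tar to restore the recorded modes rather than applying the current umask. Restoring file ownership as well generally requires running the extraction as root.

```shell
#!/bin/sh
# Sketch: round-trip a directory through tar, keeping file modes.
# archive_dir and restore_dir are hypothetical helper names.

# archive_dir SRC_DIR ARCHIVE
archive_dir() {
    tar -cf "$2" -C "$1" .
}

# restore_dir ARCHIVE DEST_DIR
# -p restores the recorded permissions instead of applying the umask.
restore_dir() {
    mkdir -p "$2" && tar -xpf "$1" -C "$2"
}
```

A file that was mode 600 before archiving comes back as mode 600, which is exactly what a plain ISO-9660 copy of the same directory does not give you.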
Since my wife was now using Linux, I didn't need to use Samba or smbtar (which loses Linux file permissions). But a friend told me the trick of logging into her computer over our home network with ssh. (The ssh server is installed, but not activated, by default in Xandros; activating it was trivial.) A single command line can archive her home directory and send it to a .tar file on my machine:
ssh -l wendy Wendysnew "tar cf - /home/wendy/" | cat > /tmp/wendy.tar
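The same idea can be wrapped in a small function that also checks the result. This is a sketch, not the backupfetch script: fetch_home and the ".part" temporary name are assumptions for illustration. It redirects straight to the file (the redirect happens on the local machine, so the "cat" stage is not strictly needed) and keeps the archive only if tar can read it back.

```shell
#!/bin/sh
# Sketch: pull a remote home directory as a tar archive over ssh.
# fetch_home and the ".part" suffix are made up for this example.

# fetch_home USER HOST OUTFILE
fetch_home() {
    user=$1
    host=$2
    out=$3
    # tar runs on the remote machine and writes the archive to stdout;
    # ssh carries that stream back, and the redirect stores it locally.
    if ssh -l "$user" "$host" "tar cf - /home/$user/" > "$out.part" &&
        tar -tf "$out.part" > /dev/null
    then
        mv "$out.part" "$out"      # transfer finished and archive is readable
    else
        rm -f "$out.part"          # do not leave a truncated archive behind
        return 1
    fi
}
```

For example, "fetch_home wendy Wendysnew /tmp/wendy.tar" does the same job as the one-liner above, but a dropped connection no longer leaves a half-written wendy.tar that looks like a good backup.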
When I installed the DVD writer, I repartitioned my hard drive so that /tmp was big enough for two 4.7 GB DVD images. This let me write one script to create all the tar archives, and two more scripts to write them to DVD-R.
backupfetch
backupburn (files)
backupadd (files)
One limitation of growisofs under Xandros 2 was that it could write only 2 GB at a time. So to fill the DVD, I wrote "burn" to write to an empty DVD (-Z option) and "add" to write to a partially-full DVD (-M option). Also, it seemed to be limited to writing a maximum of 4 GB to a DVD. (Both of these limitations seem to have gone away since I upgraded to Xandros 4, which uses a 2.6 Linux kernel.)
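The difference between the two scripts comes down to one growisofs option. Here is a minimal sketch of the pair, not the original listings: the device path /dev/dvd and the -R -J options passed through to mkisofs are assumptions for the example.

```shell
#!/bin/sh
# Sketch of a burn/add pair around growisofs; not the original scripts.
# DVD_DEV is an assumed device path; override it for your drive.
DVD_DEV=${DVD_DEV:-/dev/dvd}

# backupburn FILE...: start a new session on an empty disc (-Z).
backupburn() {
    [ $# -gt 0 ] || { echo "usage: backupburn file..." >&2; return 2; }
    growisofs -Z "$DVD_DEV" -R -J "$@"
}

# backupadd FILE...: append a session to a partially written disc (-M).
backupadd() {
    [ $# -gt 0 ] || { echo "usage: backupadd file..." >&2; return 2; }
    growisofs -M "$DVD_DEV" -R -J "$@"
}
```

A weekly run would then be one "backupburn" with the first batch of archives, followed by one or more "backupadd" calls until the disc is full.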
Now this was a lot better, because I had to type only a few commands once a week. Since the backups were all tar archives, I didn't lose any file permissions. There were only two drawbacks. First, I had to break the backup into several smaller-than-2-GB archives, and manually select which archives went onto which DVD (so as to make the optimum use of the available 4 GB). Second, each week, the "fetch" operation pulled a few GB over the 10 Mbps Ethernet connection from my wife's PC...and that took the better part of an hour.
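The manual sorting of archives onto discs is a small bin-packing problem, and a simple "first-fit decreasing" pass usually gets close to the best arrangement. The following is a sketch of that technique, not something from my scripts: pack_archives is a hypothetical helper, and the capacity is whatever byte count you pass in (for instance 4000000000 as a round stand-in for the 4 GB limit). It assumes file names contain no tabs or newlines.

```shell
#!/bin/sh
# Sketch: assign archives to discs with first-fit decreasing.
# pack_archives is a made-up helper name.

# pack_archives CAPACITY_BYTES FILE...
# Prints one "disc-number<TAB>file" line per archive that fits.
pack_archives() {
    capacity=$1
    shift
    for f in "$@"; do
        # size in bytes, a tab, then the file name
        printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done | sort -rn | awk -F '\t' -v cap="$capacity" '
        {
            size = $1
            if (size > cap) {
                # Larger than a whole disc: report it and move on.
                printf "too big for one disc: %s\n", $2 > "/dev/stderr"
                next
            }
            placed = 0
            # First fit: use the first disc with enough room left.
            for (d = 1; d <= ndiscs; d++)
                if (used[d] + size <= cap) { placed = d; break }
            if (!placed)
                placed = ++ndiscs      # nothing fits, start a new disc
            used[placed] += size
            printf "%d\t%s\n", placed, $2
        }'
}
```

Because the archives are sorted largest first, the big ones claim discs early and the small ones fill the gaps. The output can be fed to whatever burns the discs, one disc number at a time.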
5. rsync
Second problem first: I knew of the program 'rsync', which will keep two directories on two different machines synchronized. The problem, I thought, was that I would have to install an rsync server program on one of the machines; so I kept procrastinating. Then someone at our Linux Users Group told me that rsync would work over an ssh connection, and I already had an ssh server running on my wife's PC (it's not activated by default in Xandros, but it's easy to activate from the control panel).
By creating a home directory for my wife on my PC, and a home directory for me on her PC, I can now back up both of our machines to each other with a script of just three lines (which must be run by the root user):
ssh-add /root/.ssh/private.key
rsync -az --delete /home/brad/ brad@Wendysnew:/home/brad
rsync -az --delete wendy@Wendysnew:/home/wendy/ /home/wendy
(The --delete option is a bit scary, since it will remove from the destination machine any files it doesn't find on the source machine. If you type the source and destination backwards, you can delete your home directory! Before I ran this the first time, I backed up both machines to a DVD.)
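One way to take some of the fear out of --delete is to rehearse the command first. The sketch below is an illustration, not part of my three-line script: preview_sync and guarded_sync are invented names. The first uses rsync's --dry-run and --itemize-changes options to list what would be transferred or deleted without touching anything; the second refuses to run when a local source directory is missing or empty, which is the situation in which --delete would wipe the destination.

```shell
#!/bin/sh
# Sketch: two small safety wrappers around "rsync -az --delete".
# preview_sync and guarded_sync are hypothetical helper names.

# preview_sync SRC DEST: show what would change, but change nothing.
preview_sync() {
    rsync -az --delete --dry-run --itemize-changes "$1" "$2"
}

# guarded_sync SRC DEST: refuse an empty or missing local source.
guarded_sync() {
    src=$1
    dest=$2
    case $src in
        *:*) ;;   # remote source (user@host:path): cannot be checked here
        *)  if [ ! -d "$src" ] || [ -z "$(ls -A "$src")" ]; then
                echo "refusing: $src is missing or empty" >&2
                return 1
            fi ;;
    esac
    rsync -az --delete "$src" "$dest"
}
```

Running "preview_sync /home/brad/ brad@Wendysnew:/home/brad" before the real thing shows at a glance whether the source and destination are the right way round.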
One big advantage of rsync is that it copies only the changed and new files. So, even on our 10 Mbps Ethernet, the rsync in both directions takes only about five minutes. This means that it's fast enough to do daily.
Which brings me to the second advantage: I now have a two-tier backup system. Every day I back up our machines to each other. (If you have a single machine, you can back up to a second hard drive.) Once a week, I copy those directories to DVD. So if one machine fails, I have a fresh backup; and if by some catastrophe both machines fail, the most we stand to lose is one week's work. And by backing up to external media (DVDs), I can store some backups off-site, so even if the house burns down, I can recover much of our data.
I'm still using my 'backupfetch' script to create tar archives (though now I just create them from my own PC), and 'backupburn' and 'backupadd' to write DVDs. I'm now hoping to automate that procedure. In the next installment, I'll talk about the 'scdbackup' program that I'm experimenting with, and also a bit more about Amanda, Bacula, and other backup utilities.