Cryptshot: Automated, Encrypted Backups with rsnapshot

Earlier this year I switched from Duplicity to rsnapshot for my local backups. Duplicity uses a full + incremental backup schema: the first time a backup is executed, all files are copied to the backup medium. Successive backups copy only the deltas of changed objects. Over time this results in a chain of deltas that need to be replayed when restoring from a backup. If a single delta is somehow corrupted, the whole chain is broken. To minimize the chances of this happening, the common practice is to complete a new full backup every so often – I usually do a full backup every 3 or 4 weeks. Completing a full backup takes time when you’re backing up hundreds of gigabytes, even over USB 3.0. It also takes up disk space. I keep two full backups around when using Duplicity, which means I’m using a little over twice as much space on the backup medium as what I’m backing up.

The backup schema that rsnapshot uses is different. The first time it runs, it completes a full backup. Each time after that, it completes what could be considered a “full” backup, but unchanged files are not copied over. Instead, rsnapshot simply hard links to the previously copied file. If you modify very large files regularly, this model may be inefficient, but for me – and I think for most users – it’s great. Backups are speedy, disk space usage on the backup medium isn’t too much more than the data being backed up, and I have multiple full backups that I can restore from.
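
The hard-link idea can be sketched in a few lines of Python. This is only an illustration of the technique, not how rsnapshot is implemented (rsnapshot delegates the work to rsync). The function name `snapshot` and its arguments are made up for the example.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, target):
    """Create a full-looking snapshot of `source` in `target`.

    Files identical to their copy in `previous` are hard linked, so they
    take no extra space. New or modified files are copied. Pass
    previous=None for the very first snapshot.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.join(dest_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)        # unchanged: link to the earlier copy
            else:
                shutil.copy2(src, new)   # new or modified: store a fresh copy
```

Every snapshot directory looks like a complete copy of the source, but an unchanged file exists on disk only once. This also shows why a large file that changes often is expensive: each snapshot gets its own full copy.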

The great strength of Duplicity – and the great weakness of rsnapshot – is encryption. Duplicity uses GnuPG to encrypt backups, which makes it one of the few solutions appropriate for remote backups. In contrast, rsnapshot does no encryption. That makes it completely inappropriate for remote backups, but the shortcoming can be worked around when backing up locally.

My local backups are done to an external, USB hard drive. Encrypting the drive is simple with LUKS and dm-crypt. For example, to encrypt /dev/sdb:

$ cryptsetup --cipher aes-xts-plain --key-size 512 --verify-passphrase luksFormat /dev/sdb

The device can then be opened, formatted, and mounted.

$ cryptsetup luksOpen /dev/sdb backup_drive
$ mkfs.ext4 -L backup /dev/mapper/backup_drive
$ mount /dev/mapper/backup_drive /mnt/backup/

At this point, the drive will be encrypted with a passphrase. To make it easier to mount programmatically, I also add a key file full of some random data generated from /dev/urandom.

$ dd if=/dev/urandom of=/root/supersecretkey bs=1024 count=8
$ chmod 0400 /root/supersecretkey
$ cryptsetup luksAddKey /dev/sdb /root/supersecretkey

There are still a few considerations to address before backups to this encrypted drive can be completed automatically with no user interaction. Since the target is a USB drive and the source is a laptop, there’s a good chance that the drive won’t be plugged in when the scheduler kicks off the backup program. If it is plugged in, the drive needs to be decrypted before calling rsnapshot to do its thing. I wrote a wrapper script called cryptshot to address these issues.

Cryptshot is configured with the UUID of the target drive and the key file used to decrypt the drive. When it is executed, the first thing it does is look to see if the UUID exists. If it does, that means the drive is plugged in and accessible. The script then decrypts the drive with the specified key file and mounts it. Finally, rsnapshot is called to execute the backup as usual. Any argument passed to cryptshot is passed along to rsnapshot. What that means is that cryptshot becomes a drop-in replacement for encrypted, rsnapshot backups. Where I previously called rsnapshot daily, I now call cryptshot daily. Everything after that point just works, with no interaction needed from me.
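
The flow described above can be sketched as follows. This is a Python illustration of the steps, not the cryptshot script itself. The UUID is a placeholder, the mapper name and mount point are borrowed from the earlier commands, and the lookup assumes the usual Linux /dev/disk/by-uuid directory.

```python
import os
import subprocess
import sys

# Placeholder settings for illustration; substitute your own values.
UUID = "00000000-0000-0000-0000-000000000000"
KEY_FILE = "/root/supersecretkey"
MAPPER_NAME = "backup_drive"
MOUNT_POINT = "/mnt/backup/"


def device_path(uuid, by_uuid_dir="/dev/disk/by-uuid"):
    """Return the device node for `uuid`, or None if the drive is absent."""
    path = os.path.join(by_uuid_dir, uuid)
    return path if os.path.exists(path) else None


def build_commands(device, rsnapshot_args):
    """Return the commands to run, in order, as argument lists."""
    return [
        # Decrypt the drive with the key file instead of a passphrase.
        ["cryptsetup", "luksOpen", "--key-file", KEY_FILE, device, MAPPER_NAME],
        # Mount the decrypted volume.
        ["mount", "/dev/mapper/" + MAPPER_NAME, MOUNT_POINT],
        # Hand every argument straight through to rsnapshot.
        ["rsnapshot"] + list(rsnapshot_args),
    ]


def main(argv):
    """Run the backup if the drive is present; return an exit status."""
    device = device_path(UUID)
    if device is None:
        print("Backup drive not found; skipping.", file=sys.stderr)
        return 1
    for command in build_commands(device, argv):
        subprocess.run(command, check=True)  # stop at the first failure
    return 0
```

Calling `main(["daily"])` as root would then stand in for `rsnapshot daily`. A real wrapper would probably also unmount and close the drive once the backup finishes, which this sketch leaves out.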

If you’re interested in cryptshot, you can download it directly from GitHub. The script could easily be modified to execute a backup program other than rsnapshot. You can clone my entire backups repository if you’re also interested in the other scripts I’ve written to manage different aspects of backing up data.

I spent the equinox in the Glacier Peak Wilderness.

A few days. Me and a pack and some mountains.

Goodbye summer, hello fall.

Ruck

Currently reading: Journey to the Centre of the Earth by Richard and Nicholas Crane.

If you’re at all interested in bikes, lightweight backpacking, or a combination thereof, you must read this book.

In 1986, Dick and Nick rode lightweight, steel race bikes from the Bay of Bengal across Bangladesh, up and over the Himalaya, across the Tibetan Plateau, and through the Gobi desert to the point of the earth furthest from the sea. They were sawing their toothbrushes in half and cutting extraneous buckles off of their panniers before “bikepacking” (or “ultralight backpacking”) was a thing. The appendix includes a complete gear list and relevant discussion.

A snowy night in Tibet

The book is currently out of print, but used copies can be found. A PDF version is available here.

I've long been an 8-speed man on my bikes.

More gears seem unnecessary, but the market has other ideas. I wanted to upgrade my brifters. There were practically no options, so last week I made the jump to 9-speed. Now I’m running an 11-32 9-speed cassette and a 30/42/52 triple chainring.

9 Speed

Tarsnapper: Managing Tarsnap Backups

Tarsnap bills itself as “online backups for the truly paranoid”. I began using the service last January. It fast became my preferred way to back up to the cloud. It stores data on Amazon S3 and costs $0.30 per GB per month for storage and $0.30 per GB for bandwidth. Those prices are higher than just using Amazon S3 directly, but Tarsnap implements some impressive data de-duplication and compression that results in the service costing very little. For example, I currently have 67 different archives stored in Tarsnap from my laptop. They total 46GB in size. De-duplicated, that comes out to 1.9GB. After compression, I only pay to store 1.4GB. Peanuts.
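
To put numbers on "peanuts", here is the storage arithmetic using only the figures quoted above. Bandwidth charges are left out, and billing the full 46GB is a hypothetical comparison, not something Tarsnap does.

```python
from decimal import Decimal

STORAGE_PRICE = Decimal("0.30")  # dollars per GB per month, as quoted above


def monthly_storage_cost(stored_gb):
    """Monthly storage cost in dollars for `stored_gb` gigabytes."""
    return Decimal(stored_gb) * STORAGE_PRICE


# Hypothetical: what the archives would cost if billed at their full size.
undeduplicated = monthly_storage_cost("46")

# What is actually billed after de-duplication and compression.
actual = monthly_storage_cost("1.4")
```

That works out to $13.80 a month for the raw 46GB against $0.42 a month for the 1.4GB actually stored.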

Of course, the primary requirement for any online backup service is encryption. Tarsnap delivers. And, most importantly, the Tarsnap client is open-source, so the claims of encryption can actually be verified by the user. The majority of for-profit, online backup services out there fail on this critical point.

So Tarsnap is amazing and you should use it. The client follows the Unix philosophy: “do one thing and do it well”. It’s basically like tar. It can create archives, read the contents of an archive, extract archives, and delete archives. For someone coming from an application like Duplicity, the disadvantage to the Tarsnap client is that it doesn’t include any way to automatically manage backups. You can’t tell Tarsnap how many copies of a backup you wish to keep, or how long backups should be allowed to age before deletion.

Thanks to the de-duplication and compression, there’s little economic incentive to delete old backups. Keeping them likely won’t cost you that much extra. But I like to keep things clean and minimal. If I haven’t used an online backup in 4 weeks, I generally consider it stale and have no further use for it.

To manage my Tarsnap backups, I wrote a Python script called Tarsnapper. The primary intent was to create a script that would automatically delete old archives. It does this by accepting a maximum age from the user. Whenever Tarsnapper runs, it gets a list of all Tarsnap archives. The timestamp is parsed out from the list and any archive older than the maximum allowed age is deleted. This is seamless, and means I never need to manually intervene to clean my archives.
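
The age check could look roughly like the sketch below. It is not Tarsnapper's actual code. It assumes each listing line has the form name, a tab, then a `YYYY-MM-DD HH:MM:SS` creation time, which is my understanding of what `tarsnap --list-archives -v` prints. Treat that format, and the function names, as assumptions.

```python
from datetime import datetime, timedelta


def parse_listing(lines):
    """Yield (name, created) pairs from `name<TAB>timestamp` lines."""
    for line in lines:
        name, _, stamp = line.rstrip("\n").rpartition("\t")
        if name:  # skip blank lines and lines without a timestamp column
            yield name, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")


def stale_archives(lines, max_age, now):
    """Return the names of archives created more than `max_age` before `now`."""
    cutoff = now - max_age
    return [name for name, created in parse_listing(lines) if created < cutoff]
```

With a maximum age of `timedelta(weeks=4)`, each returned name could then be passed to `tarsnap -d -f NAME` for deletion.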

Tarsnapper also provides some help for creating Tarsnap archives. It allows the user to define any number of named archives and the directories that those archives should contain. On my laptop I have four different directories that I back up with Tarsnap, three of them in one archive and the last in another archive. Tarsnapper knows about this, so whenever I want to back up to Tarsnap I just call a single command.

Tarsnapper can also automatically add a suffix to the end of each archive name. This makes it easier to know which archive is which when you are looking at a list. By default, the suffix is the current date and time.

Configuring Tarsnapper can be done either directly by changing the variables at the top of the script, or by creating a configuration file named tarsnapper.conf in your home directory. The config file on my laptop looks like this:

[Settings]
tarsnap: /usr/bin/tarsnap

[Archives]
nous-cloud: /home/pigmonkey/work /home/pigmonkey/documents /home/pigmonkey/vault/
nous-config: /home/pigmonkey/.config
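
A file in this format can be read with Python's standard configparser module. The sketch below turns it into one `tarsnap -c -f` command per named archive, with a date and time suffix appended to each name. It is an illustration, not Tarsnapper's own code. The function name, the underscore separator, and the suffix format are assumptions.

```python
import configparser
from datetime import datetime

EXAMPLE = """\
[Settings]
tarsnap: /usr/bin/tarsnap

[Archives]
nous-cloud: /home/pigmonkey/work /home/pigmonkey/documents /home/pigmonkey/vault/
nous-config: /home/pigmonkey/.config
"""


def create_commands(config_text, now, suffix_format="%Y-%m-%d_%H-%M-%S"):
    """Return one tarsnap create command (as an argument list) per archive."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    tarsnap = parser.get("Settings", "tarsnap")
    suffix = now.strftime(suffix_format)  # hypothetical suffix format
    commands = []
    for name, directories in parser.items("Archives"):
        # Directories are separated by whitespace, so paths containing
        # spaces are not supported by this sketch.
        commands.append(
            [tarsnap, "-c", "-f", name + "_" + suffix] + directories.split()
        )
    return commands
```

Calling `create_commands(EXAMPLE, datetime.now())` yields two commands, one for nous-cloud with its three directories and one for nous-config.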

There is also support for command-line arguments to specify the location of the configuration file to use, to delete old archives and exit without creating new archives, and to execute only a single named archive rather than all of those that you may have defined.

$ tarsnapper.py --help
usage: tarsnapper.py [-h] [-c CONFIG] [-a ARCHIVE] [-r]

A Python script to manage Tarsnap archives.

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        Specify the configuration file to use.
  -a ARCHIVE, --archive ARCHIVE
                        Specify a named archive to execute.
  -r, --remove          Remove archives old archives and exit.
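
An interface like this takes only a few lines with argparse. The sketch below is a reconstruction from the help text above, not the script's actual source. The function name `build_parser` is made up.

```python
import argparse


def build_parser():
    """Build a parser that accepts the three options in the help text."""
    parser = argparse.ArgumentParser(
        prog="tarsnapper.py",
        description="A Python script to manage Tarsnap archives.",
    )
    parser.add_argument("-c", "--config",
                        help="Specify the configuration file to use.")
    parser.add_argument("-a", "--archive",
                        help="Specify a named archive to execute.")
    parser.add_argument("-r", "--remove", action="store_true",
                        help="Remove old archives and exit.")
    return parser
```

For example, `build_parser().parse_args(["-a", "nous-config"])` selects a single named archive and leaves `remove` set to False.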

It makes using a great service very simple. My backups can all be executed simply by a single call to Tarsnapper. Stale archives are deleted, saving me precious picodollars. I use this system on my laptop, as well as multiple servers. If you’re interested in it, Tarsnapper can be downloaded directly from GitHub. You can clone my entire backups repository if you’re also interested in the other scripts I’ve written to manage different aspects of backing up data.

It's better when you break things completely.

When things are only partly broken your inbox gets flooded with error messages…

Mark Two

Hi there. It’s been a while.

I took a year off from blogging. That wasn’t intentional. I just didn’t have anything to say for a while. Then I did have something to say, but I was tired of how the website looked. If the design doesn’t excite me I tend not to want to blog. (Call me vain, but I want my words to look good.) And redesigning the website – well, that requires an entirely different set of motivations to tackle. It took me some time to get that motivation, and then before I knew it we were here: 10 days short of a year.

During the development process I referred to this design as “mark two”, as it was the second idea I tried out.

The website still runs on Django. The blog is still powered by Vellum, my personal blog application. I’ve been hacking on it for over a year now (even when this website was inactive) and it is much improved since the last time I mentioned it. In the past six months I’ve seen the light of CSS preprocessors. All of the styling for this design is written in SASS and uses the excellent Compass framework. The responsive layout is built with Susy.

If you’re interested in these technical details, you will also be interested to know that the entire website is now open-source. You can find it on GitHub. Fork it, hack it, or borrow some of my CSS for your website.

The other big news is that I have begun to categorize blog posts. Yeah, it’s 2012 and I’m a little late to the party on that one. You may recall that I only began to tag posts in 2008. As it stands right now, all posts are just placed in the great big ameba of a category called “General”. Eventually, they will all have more meaningful categories – I hope. But it will be a while.

Things ought to be more active around here for the foreseeable future.

Sanyo Eneloop Rechargeable Batteries

I go through batteries at a fairly high rate. Electronic devices for the wilderness, such as my headlamp and GPS, see regular use. At home, things like my wireless mouse need power. The biggest drain is my lights – particularly in the winter, when they are used to light my regular commute.

This last spring I decided to invest in a set of rechargeable batteries. Although some of my devices run on CR123 batteries, most use AA or AAAs. To start with, I was concerned only with being able to recharge the AA and AAA batteries. Years ago I had a set of rechargeable batteries, but I think the technology was not very developed back then. They seemed to drain quickly and not hold many charges. Today, the market is different. Some brief research showed that there were many options out there, with positive reviews for most of them.

What most reviews seemed to suggest was that the majority of the offerings were all of equal quality, with most differences unlikely to be noticed outside of a laboratory. The most popular, though, seemed to be the Sanyo Eneloop and Maha Powerex batteries. I found some claims that, between the two, the Eneloops held a charge longer while on the shelf.

I decided to try the Eneloop batteries, and purchased a package that included a charger, 8 AAs and 4 AAAs. The charger can charge up to four batteries at once, either AAA or AA, but it must be done in pairs. It cannot charge one battery at a time, or three. This has turned out to be an occasional inconvenience. I have some devices that use three batteries, and some that need just one. To charge the batteries for those devices I always have to give the charger an extra battery.

Sanyo Eneloop Charger

The charger takes around five hours to bring a dead battery up to a full charge. I have read that the Maha Powerex MH-C9000 charger can charge the batteries in a shorter period of time. It also gives the user more control over the charge, which has the potential of increasing the life of the batteries.

The batteries themselves I have been very happy with. I don’t have the knowledge to provide any objective information on their chemistry or electronics. Suffice it to say that they work. They seem to last longer in the same devices than their non-rechargeable counterparts did. I have not noticed any degradation in those batteries that I have recharged. That is not surprising. Sanyo claims the Eneloop batteries can be recharged 1,500 times – a number I have not come anywhere near to approaching.

Electronics Powered by Eneloops

Since the initial purchase, I have bought two more packs of AA and AAA Eneloops. All of my electronics now run on rechargeable batteries, save for those few that require CR123 batteries. The batteries themselves are an expensive investment, but they have paid off. Now that I have a good number of both AA and AAA sizes, and am happy with the Eneloop brand, I would like to purchase a more specialized charger, such as the aforementioned C9000.

If you use any non-rechargeable AA or AAA batteries in your electronics, I recommend giving Eneloops a try. The financial savings alone are enough of a benefit to justify their use.