You are currently viewing all posts tagged with linux.

Archiving Bookmarks

I signed-up for Pinboard in 2014. It provides everything I need from a bookmarking service, which is mostly, you know, bookmarking. I pay for the archival account, meaning that Pinboard downloads a copy of everything I bookmark and provides me with full-text search. I find this useful and well worth the $25 yearly fee, but Pinboard’s archive is only part of the solution. I also need an offline copy of my bookmarks.

Pinboard provides an API that makes it easy to acquire a list of bookmarks. I have a small shell script which pulls down a JSON-formatted list of my bookmarks and adds the file to git-annex. This is controlled via a systemd service and timer, which wraps the script in backitup to ensure daily dumps. The systemd timer itself is controlled by nmtrust, so that it only runs when I am connected to a trusted network.

This provides data portability, ensuring that I could import my tagged URLs to another bookmarking service if I ever found something better than Pinboard (unlikely, competing with Pinboard is futile). But I also want a locally archived copy of the pages themselves, which Pinboard does not offer through the API. I carry very much about being able to work offline. The usefulness of a computer is directly propertional to the amount of data that is accessible without a network connection.

To address this I use bookmark-archiver, a Python script which reads URLs from a variety of input files, including Pinboard’s JSON dumps. It archives each URL via wget, generates a screenshot and PDF via headless Chromium, and submits the URL to the Internet Archive (with WARC hopefully on the way). It will then generate an HTML index page, allowing the archives to be easily browsed. When I want to browse the archive, I simply change into the directory and use python -m http.server to serve the bookmarks at localhost:8000. Once downloaded locally, the archives are of course backed up, via the usual suspects like borg and cryptshot.

The archiver is configured via environment variables. I configure my preferences and point the program at the Pinboard JSON dump in my annex via a shell script (creatively also named bookmark-archiver). This wrapper script is called by the previous script which dumps the JSON from Pinboard.

The result of all of this is that every day I get a fresh dump of all my bookmarks, each URL is archived locally in multiple formats, and the archive enters into my normal backup queue. Link rot may defeat the Supreme Court, but between this and my automated repository tracking I have a pretty good system for backing up useful pieces of other people’s data.

PGP Key Renewal

Last year I demonstrated setting up the USB Armory for PGP key management. The two management operations I perform on the Armory are key signing and key renewal. I set my keys to expire each year, so that each year I need to confirm that I am not dead, still control the keys, and still consider them trustworthy.

After booting up the Armory, I first verify that NTP is disabled and set the current UTC date and time. Time is critical for any cryptography operations, and the Armory has no battery to maintain a clock.

$ timedatectl set-ntp false
$ timedatectl set-time "yyyy-mm-dd hh:mm:ss"

My keys are stored on an encrypted microSD card, which I mount and decrypt.

$ mkdir /mnt/sdcard
$ cryptsetup luksOpen /dev/sda sdcrypt
$ mount /dev/mapper/sdcrypt /mnt/sdcard

Next I’ll export a few environment variables to make things less redundant later on.

$ export YEAR=$(date +%Y)
$ export PREVYEAR=$(($YEAR-1))
$ export GNUPGHOME="/mnt/sdcard/gpg/$YEAR-renewal/.gnupg"
$ export KEYID="0x70B220FF8D2ACF29"

I perform each renewal in a directory specific to the current year, but the GNUPGHOME directory I set for this year’s renewal doesn’t exist yet. Better create it.

$ mkdir -p $GNUPGHOME
$ chmod 700 $GNUPGHOME

I keep a copy of my gpg.conf on the microSD card. That needs to be copied in to the new directory, and I’ll need to tell GnuPG what pinentry program to use.

$ cp /mnt/sdcard/gpg/gpg.conf $GNUPGHOME
$ echo "pinentry-program /usr/bin/pinentry-curses" > $GNUPGHOME/gpg-agent.conf

After renewing the master key and subkey the previous year, I exported them. I’ll now import those backups from the previous year.

$ gpg --import /mnt/sdcard/gpg/$PREVYEAR-renewal/backup/peter\@havenaut.net.master.gpg-key
$ gpg --import /mnt/sdcard/gpg/$PREVYEAR-renewal/backup/peter\@havenaut.net.subkeys.gpg-key

When performing the actual renewal, I’ll set the expiration to 13 months. This needs to be done for the master key, the signing subkey, the encryption subkey, and the authentication subkey.

$ gpg --edit-key $KEYID
trust
5
expire
13m
y
key 1
key 2
key 3
expire
y
13m
y
save

That’s the renewal. I’ll list the keys to make sure they look as expected.

$ gpg --list-keys

Before moving the subkeys to my Yubikey, I back everything up. This will be what I import the following year.

$ mkdir /mnt/sdcard/gpg/$YEAR-renewal/backup
$ gpg --armor --export-secret-keys $KEYID > /mnt/sdcard/gpg/$YEAR-renewal/backup/peter\@havenaut.net.master.gpg-key
$ gpg --armor --export-secret-subkeys $KEYID > /mnt/sdcard/gpg/$YEAR-renewal/backup/peter\@havenaut.net.subkeys.gpg-key

Now I can insert my Yubikey, struggle to remember the admin PIN I set on it, and move over the subkeys.

$ gpg --edit-key $KEYID
toggle
key 1 # signature
keytocard
1
key 1
key 2 # encryption
keytocard
2
key 2
key 3 # authentication
keytocard
3
save

When I list the secret keys, I expect them to all be stubs (showing as ssb>).

$ gpg --list-secret-keys

Of course, for this to be useful I need to export my renewed public key and copy it to some place where it can be brought to a networked machine for dissemination.

$ gpg --armor --export $KEYID > /mnt/sdcard/gpg/$YEAR-renewal/peter\@havenaut.net.public.gpg-key
$ mkdir /mnt/usb
$ mount /dev/sdb1 /mnt/usb
$ cp /mnt/sdcard/gpg/$YEAR-renewal/peter\@havenaut.net.public.gpg-key /mnt/usb/

That’s it. Clean up, shutdown, and lock the Armory up until next year.

$ umount /mnt/usb
$ umount /mnt/sdcard
$ cryptsetup luksClose sdcrypt
$ systemctl poweroff

LUKS Header Backup

I’d neglected backup LUKS headers until Gwern’s data loss postmortem last year. After reading his post I dumped the headers of the drives I had accessible, but I never got around to performing the task on my less frequently accessed drives. Last month I had trouble mounting one of those drives. It turned out I was simply using the wrong passphrase, but the experience prompted me to make sure I had completed the header backup procedure for all drives.

I dump the header to memory using the procedure from the Arch wiki. This is probably unnecessary, but only takes a few extra steps. The header is stored in my password store, which is obsessively backed up.

$ sudo mkdir /mnt/tmp
$ sudo mount ramfs /mnt/tmp -t ramfs
$ sudo cryptsetup luksHeaderBackup /dev/sdc --header-backup-file /mnt/tmp/dump
$ sudo chown pigmonkey:pigmonkey /mnt/tmp/dump
$ pass insert -m crypt/luksheader/themisto < /mnt/tmp/dump
$ sudo umount /mnt/tmp
$ sudo rmdir /mnt/tmp

An Inbox for Taskwarrior

My experience with all task managements systems – whether software or otherwise – is that the more you put into them, the more useful they become. Not only adding as many tasks as possible, however small they may be, but also enriching the tasks with as much metadata as possible. When I began using taskwarrior, one of the problems I encountered was how to address this effectively.

Throughout the day I’ll be working on something when I receive an unrelated request. I want to log those requests so that I remember them and eventually complete them, but I don’t want to break from whatever I’m currently doing and take the time to mark these tasks up with the full metadata they eventually need. Context switching is expensive.

To address this, I’ve introduced the idea of a task inbox. I have an alias to add a task to taskwarrior with a due date of tomorrow and a tag of inbox.

alias ti='task add due:tomorrow tag:inbox'

This allows me to very quickly add a task without needing to think about it.

$ ti do something important

Each morning I run task ls. The tasks which were previously added to my inbox are at the top, overdue with a high priority. At this point I’ll modify each of them, removing the inbox tag, setting a real due date, and assigning them to a project. If they are more complex, I may also add annotations or notes, or build the task out with dependencies. If the task is simple – something that may only take a minute or two – I’ll just complete it immediately and mark it as done without bothering to remove the inbox tag.

This alias lowers the barrier of entry enough that I am likely to log even the smallest of tasks, while the inbox concept provides a framework for me to later make the tasks rich in a way that allows me to take advantage of the power that taskwarrior provides.

Borg Assimilation

For years the core of my backup strategy has been rsnapshot via cryptshot to various external drives for local backups, and Tarsnap for remote backups.

Tarsnap, however, can be slow. It tends to take somewhere between 15 to 20 minutes to create my dozen or so archives, even if little has changed since the last run. My impression is that this is simply due to the number of archives I have stored and the number of files I ask it to archive. Once it has decided what to do, the time spent transferring data is negligible. I run Tarsnap hourly. Twenty minutes out of every hour seems like a lot of time spent Tarsnapping.

I’ve eyed Borg for a while (and before that, Attic), but avoided using it due to the rapid development of its earlier days. While activity is nice, too many changes too close together do not create a reassuring image of a backup project. Borg seems to have stabilized now and has a large enough user base that I feel comfortable with it. About a month ago, I began using it to backup my laptop to rsync.net.

Initially I played with borgmatic to perform and maintain the backups. Unfortunately it seems to have issues with signal handling, which caused me to end up with annoying lock files left over from interrupted backups. Borg itself has good documentation and is easy to use, and I think it is useful to build familiarity with the program itself instead of only interacting with it through something else. So I did away with borgmatic and wrote a small bash script to handle my use case.

Creating the backups is simple enough. Borg disables compression by default, but after a little experimentation I found that LZ4 seemed to be a decent compromise between compression and performance.

Pruning backups is equally easy. I knew I wanted to match roughly what I had with Tarsnap: hourly backups for a day or so, daily backups for a week or so, then a month or two of weekly backups, and finally a year or so of monthly backups.

My only hesitation was in how to maintain the health of the backups. Borg provides the convenient borg check command, which is able to verify the consistency of both a repository and the archives themselves. Unsurprisingly, this is a slow process. I didn’t want to run it with my hourly backups. Daily, or perhaps even weekly, seemed more reasonable, but I did want to make sure that both checks were completed successfully with some frequency. Luckily this is just the problem that I wrote backitup to solve.

Because the consistency checks take a while and consume some resources, I thought it would also be a good idea to avoid performing them when I’m running on battery. Giving backitup the ability to detect if the machine is on battery or AC power was a simple hack. The script now features the -a switch to specify that the program should only be executed when on AC power.

My completed Borg wrapper is thus:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#!/bin/sh
export BORG_PASSPHRASE='supers3cr3t'
export BORG_REPO='borg-rsync:borg/nous'
export BORG_REMOTE_PATH='borg1'

# Create backups
echo "Creating backups..."
borg create --verbose --stats --compression=lz4             \
    --exclude ~/projects/foo/bar/baz                        \
    --exclude ~/projects/xyz/bigfatbinaries                 \
    ::'{hostname}-{user}-{utcnow:%Y-%m-%dT%H:%M:%S}'        \
    ~/documents                                             \
    ~/projects                                              \
    ~/mail                                                  \
    # ...etc

# Prune backups
echo "Pruning backups..."
borg prune --verbose --list --prefix '{hostname}-{user}-'    \
    --keep-within=2d                                         \
    --keep-daily=14                                          \
    --keep-weekly=8                                          \
    --keep-monthly=12                                        \

# Check backups
echo "Checking repository..."
backitup -a                                             \
    -p 172800                                           \
    -l ~/.borg_check-repo.lastrun                       \
    -b "borg check --verbose --repository-only"         \
echo "Checking archives..."
backitup -a                                             \
    -p 259200                                           \
    -l ~/.borg_check-arch.lastrun                       \
    -b "borg check --verbose --archives-only --last 24" \

This is executed by a systemd service.

[Unit]
Description=Borg Backup

[Service]
Type=oneshot
ExecStart=/home/pigmonkey/bin/borgwrapper.sh

[Install]
WantedBy=multi-user.target

The service is called hourly by a systemd timer.

[Unit]
Description=Borg Backup Timer

[Timer]
Unit=borg.service
OnCalendar=hourly
Persistent=True

[Install]
WantedBy=timers.target

I don’t enable the timer directly, but add it to /usr/local/etc/trusted_units so that nmtrust activates it when I’m connected to trusted networks.

$ echo "borg.timer,user:pigmonkey" >> /usr/local/etc/trusted_units

I’ve been running this for about a month now and have been pleased with the results. It averages about 30 seconds to create the backups every hour, and another 30 seconds or so to prune the old ones. As with Tarsnap, deduplication is great.

------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               19.87 GB             18.41 GB             10.21 MB
All archives:              836.02 GB            773.35 GB             19.32 GB
                       Unique chunks         Total chunks
Chunk index:                  371527             14704634
------------------------------------------------------------------------------

The most recent repository consistency check took about 30 minutes, but only runs every 172800 seconds, or once every other day. The most recent archive consistency check took about 40 minutes, but only runs every 259200 seconds, or once per 3 days. I’m not sure that those schedules are the best option for the consistency checks. I may tweak their frequencies, but because I know they will only be executed when I am on a trusted network and AC power, I’m less concerned about the length of time.

With Borg running hourly, I’ve reduced Tarsnap to run only once per day. Time will tell if Borg will slow as the number of stored archives increase, but for now running Borg hourly and Tarsnap daily seems like a great setup. Tarsnap and Borg both target the same files (with a few exceptions). Tarsnap runs in the AWS us-east-1 region. I’ve always kept my rsync.net account in their Zurich datacenter. This provides the kind of redundancy that lets me rest easy.

Contrary to what you might expect given the number of blog posts on the subject, I actually spend close to no time worrying about data loss in my day to day life, thanks to stuff like this. An ounce of prevention, and all that. (Maybe a few kilograms of prevention in my case.)

Terminal Weather

I do most of my computing in the terminal. Minimizing switches to graphical applications helps to improve my efficiency. While the web browser does tend to be superior for consuming and interacting with detailed weather forecasts, I like using wttr.in for answering simple questions like “Do I need a jacket?” or “Is it going to rain tomorrow?”

Of course, weather forecasts are location department. I don’t want to have to think about where I am every time I want to use wttr. To feed it my current location, I use jq to parse the zip code from the output of ip-api.com.

curl wttr.in/"${1:-$(curl http://ip-api.com/json | jq 'if (.zip | length) != 0 then .zip else .city end')}"

I keep this in a shell script so that I have a simple command that gives me current weather for wherever I happen to be – as long as I’m not connected to a VPN.

$ wttr
Weather report: 94107

     \   /     Sunny
      .-.      62-64 °F
   ― (   ) ―   → 19 mph
      `-’      12 mi
     /   \     0.0 in
...

Automated Repository Tracking

I have confidence in my backup strategies for my own data, but until recently I had not considered backing up other people’s data.

Recently, the author of a repository that I tracked on GitHub deleted his account and disappeared from the information super highway. I had a local copy of the repository, but I had not pulled it for a month. A number of recent changes were lost to me. This inspired me to setup the system I now use to automatically update local copies of any code repositories that are useful or interesting to me.

I clone the repositories into ~/library/src and use myrepos to interact with them. I use myrepos for work and personal repositories as well, so to keep this stuff segregated I setup a separate config file and a shell alias to refer to it.

alias lmr='mr --config $HOME/library/src/myrepos.conf --directory=$HOME/library/src'

Now when I want to add a new repository, I clone it normally and register it with myrepos.

$ cd ~/library/src
$ git clone https://github.com/warner/magic-wormhole
$ cd magic-wormhole && lmr register

The ~/library/src/myrepos.conf file has a default section which states that no repository should be updated more than once every 24 hours.

[DEFAULT]
skip = [ "$1" = update ] && ! hours_since "$1" 24

Now I can ask myrepos to update all of my tracked repositories. If it sees that it has already updated a repository within 24 hours, myrepos will skip the repository.

$ lmr update

To automate this I create a systemd service.

[Unit]
Description=Update library repositories

[Service]
Type=oneshot
ExecStart=/usr/bin/mr --config %h/library/src/myrepos.conf -j5 update

[Install]
WantedBy=multi-user.target

And a systemd timer to run the service every hour.

[Unit]
Description=Update library repositories timer

[Timer]
Unit=library-repos.service
OnCalendar=hourly
Persistent=True

[Install]
WantedBy=timers.target

I don’t enable this timer directly, but instead add it to my trusted_units file so that nmtrust will enable it only when I am on a trusted network.

$ echo "library-repos.timer,user:pigmonkey" >> /usr/local/etc/trusted_units

If I’m curious to see what has been recently active, I can ls -ltr ~/library/src. I find this more useful than GitHub stars or similar bookmarking.

I currently track 120 repositories. This is only 3.3 GB, which means I can incorporate it into my normal backup strategies without being concerned about the extra space.

The internet can be fickle, but it will be difficult for me to loose a repository again.

The USB Armory for PGP Key Management

I use a Yubikey Neo for day-to-day PGP operations. For managing the secret key itself, such as during renewal or key signing, I use a USB Armory with host adapter. In host mode, the Armory provides a trusted, open source platform that is compact and easily secured, making it ideal for key management.

Setting up the Armory is fairly straightforward. The Arch Linux ARM project provides prebuilt images. From my laptop, I follow their instructions to prepare the micro SD card, where /dev/sdX is the SD card.

$ dd if=/dev/zero of=/dev/sdX bs=1M count=8
$ fdisk /dev/sdX
# `o` to clear any partitions
# `n`, `p`, `1`, `2048`, `enter` to create a new primary partition in the first position with a first sector of 2048 and the default last sector
# `w` to write
$ mkfs.ext4 /dev/sdX1
$ mkdir /mnt/sdcard
$ mount /dev/sdX1 /mnt/sdcard

And then extract the image, doing whatever verification is necessary after downloading.

$ wget http://os.archlinuxarm.org/os/ArchLinuxARM-usbarmory-latest.tar.gz
$ bsdtar -xpf ArchLinuxARM-usbarmory-latest.tar.gz -C /mnt/sdcard
$ sync

Followed by installing the bootloader.

$ sudo dd if=/mnt/sdcard/boot/u-boot.imx of=/dev/sdX bs=512 seek=2 conv=fsync
$ sync

The bootloader must be tweaked to enable host mode.

$ sed -i '/#setenv otg_host/s/^#//' /mnt/sdcard/boot/boot.txt
$ cd /mnt/sdcard/boot
$ ./mkscr

For display I use a Plugable USB 2.0 UGA-165 adapter. To setup DisplayLink one must configure the correct modules.

$ sed -i '/blacklist drm_kms_helper/s/^/#/g' /mnt/sdcard/etc/modprobe.d/no-drm.conf
$ echo "blacklist udlfb" >> /mnt/sdcard/etc/modprobe.d/no-drm.conf
$ echo udl > /mnt/sdcard/etc/modules-load.d/udl.conf

Finally, I copy over pass and ctmg so that I have them available on the Armory and unmount the SD card.

$ cp /usr/bin/pass /mnt/sdcard/bin/
$ cp /usr/bin/ctmg /mnt/sdcard/bin/
$ umount /mnt/sdcard

The SD card can then be inserted into the Armory. At no time during this process – or at any point in the future – is the Armory connected to a network. It is entirely air-gapped. As long as the image was not compromised and the Armory is stored securely, the platform should remain trusted.

Note that because the Armory is never on a network, and it has no internal battery, it does not keep time. Upon first boot, NTP should be disabled and the time and date set.

$ timedatectl set-ntp false
$ timedatectl set-time "yyyy-mm-dd hh:mm:ss" # UTC

On subsequent boots, the time and date should be set with timedatectl set-time before performing any cryptographic operations.