You are currently viewing all posts tagged with linux.

2015-08-08

Jailing the Browser

The web browser is one of our computers’ primary means of interaction with the unwashed mashes. Combined with the unfortunately large attack surface of modern browsers, this makes a sandbox which does not depend on the browser itself an attractive idea.

Firejail is a simple, lightweight sandbox that uses linux namespaces to prevent programs from accessing things they do not need.

Firejail ships with default profiles for Firefox and Chromium. These profiles drop capabilities, filter syscalls, and prevent access to common directories like /sbin, ~/.gnupg and ~/.ssh. This is a good start, but I see little reason to give the browser access to much of anything in my home directory.

The --private flag instructs Firejail to mount a new user home directory in a temporary filesystem. The directory is empty and all changes are discarded when the sandbox is closed – think of it as a more effective private browsing or incognito mode that also resets your browser to factory defaults.

$ firejail --private firefox

A more useful option for normal browsing is to specify a directory that Firejail should use as the user home. This allows you to keep a consistent browser profile and downloads directory, but still prevents the browser from accessing anything else in the normal user home.

$ mkdir ~/firefox
$ mv ~/.mozilla ~/firefox/
$ firejail --private=firefox firefox

This is the method I default to for my browsing. I’ve created my own Firejail profile for Firefox at ~/.config/firejail/firefix.profile which implements this.

include /etc/firejail/disable-mgmt.inc
caps.drop all
seccomp
netfilter
noroot

# Use ~/firefox as user home
private firefox

The only inconvenience I’ve discovered with this is that linking my Vimperator configuration files into the directory from my dotfiles repository creates a dangling link from the perspective of anything running within the jail. Since it cannot access my real home directory, it cannot see the link target in the ~/.dotfiles directory. I have to copy the configuration files into ~/firefox and then manually keep them in sync. I modify these files infrequently enough that for me this is worth the trade-off.

The temporary filesystem provided by --private is still useful when accessing websites that are especially sensitive (such as a financial institution) or especially shady. In my normal browser profiles, I have a number of extensions installed that block ads, disable scripts, etc. If these extensions completely break a website, and I don’t want to take the time to figure out which of the dozens of things I’m blocking are required for the website to function, I’ll just spin up a sandboxed browser with the --private flag, comfortable in the knowledge that whatever dirty scripts the site is running are limited in their ability to harm me.

I perform something like 90% of my web browsing in Firefox, but use Chromium for various tasks throughout the day. Both run in Firejail sandboxes, helping to keep me safe when surfing the information superhighway. Other programs, like torrent applications and PDF readers, also make good candidates for running within Firejail.

2013-05-29

backups, crypto, linux, annex

Optical Backups of Photo Archives

I store my photos in git-annex. A full copy of the annex exists on my laptop and on an external drive. Encrypted copies of all of my photos are stored on Amazon S3 (which I pay for) and box.com (which provides 50GB for free) via git-annex special remotes. The photos are backed-up to an external drive daily with the rest of my laptop hard drive via backitup.sh and cryptshot. My entire laptop hard drive is also mirrored monthly to an external drive stored off-site.

(The majority of my photos are also on Flickr, but I don’t consider that a backup or even reliable storage.)

All of this is what I consider to be the bare minimum for any redundant data storage. Photos have special value, above the value that I assign to most other data. This value only increases with age. As such they require an additional backup method, but due to the size of my collection I want to avoid backup methods that involve paying for more online storage, such as Tarsnap.

I choose optical discs as the medium for my photo backups. This has the advantage of being read-only, which makes it more difficult for accidental deletions or corruption to propagate through the backup system. DVD-Rs have a capacity of 4.7 GBs and a cost of around $0.25 per disc. Their life expectancy varies, but 10-years seem to be a reasonable low estimate.

Preparation

I keep all of my photos in year-based directories. At the beginning of every year, the previous year’s directory is burned to a DVD.

Certain years contain few enough photos that the entire year can fit on a single DVD. More recent years have enough photos of a high enough resolution that they require multiple DVDs.

Encrypt

Leaving unencrypted data around is bad form. The archive (or each of the files resulting from splitting the large archive) is next encrypted and signed with GnuPG.

$ gpg -eo 2012.tar.bz.gpg 2012.tar.bz
$ gpg -bo 2012.tar.bz.gpg.sig 2012.tar.bz.gpg

Imaging

The encrypted archive and the detached signature of the encrypted archive are what will be burned to the disc. (Or, in the case of a large archive, the encrypted splits of the full archive and the associated signatures will be burned to one disc per split/signature combonation.) Rather than burning them directly, an image is created first.

$ mkisofs -V "Photos: 2012 1/1" -r -o 2012.iso 2012.tar.bz.gpg 2012.tar.bz.gpg.sig

If the year has a split archive requiring multiple discs, I modify the sequence number in the volume label. For example, a year requiring 3 discs will have the label Photos: 2012 1/3.

Parity

When I began this project I knew that I wanted some sort of parity information for each disc so that I could potentially recover data from slightly damaged media. My initial idea was to use parchive via par2cmdline. Further research led me to dvdisaster which, despite being a GUI-only program, seemed more appropriate for this use case.

Both dvdisaster and parchive use the same Reed–Solomon error correction codes. Dvdidaster is aimed at optical media and has the ability to place the error correction data on the disc by augmenting the disc image, as well as storing the data separately. It can also scan media for errors and assist in judging when the media is in danger of becoming defective. This makes it an attractive option for long-term storage.

I use dvdisaster with the RS02 error correction method, which augments the image before burning. Depending on the size of the original image, this will result in the disc having anywhere from 20% to 200% redundancy.

Verify

After the image has been augmented, I mount it and verify the signature of the encrypted file on the disc against the local copy of the signature. I’ve never had the signatures not match, but performing this step makes me feel better.

$ sudo mount -o loop 2012.iso /mnt/disc
$ gpg --verify 2012.tar.bz.gpg.sig /mnt/disc/2012.tar.bz.gpg
$ sudo umount /mnt/disc

Burn

The final step is to burn the augmented image. I always burn discs at low speeds to diminish the chance of errors during the process.

$ cdrecord -v speed=4 dev=/dev/sr0 2012.iso

Similar to the optical backups of my password database, I burn two copies of each disc. One copy is stored off-site. This provides a reasonably level of assurance against any loss of my photos.

2013-04-10

micro, linux

I occasionally find myself editing video for work.

In the past I’ve used OpenShot, but I was hit with some bugs that made me search for alternatives. Now I use Kdenlive, which I’ve found to be very capable. OpenShot currently has a Kickstarter campaign on to fund further development. Even though I’m no longer a user, I happily backed the project.

2013-04-04

shell, backups, code, crypto, linux

Password Management with Vim and GnuPG

The first password manager I ever used was a simple text file encrypted with GnuPG. When I needed a password I would decrypt the file, read it in Vim, and copy the required entry to the system clipboard. This system didn’t last. At the time I wasn’t using GnuPG for much else, and this was in the very beginning of my Vim days, when the program seemed cumbersome and daunting. I shortly moved to other, purpose-built password managers.

After some experimentation I landed on KeePassX, which I used for a number of years. Some time ago I decided that I wanted to move to a command-line solution. KeePassX and a web browser were the only graphical applications that I was using with any regularity. I could see no need for a password manager to have a graphical interface, and the GUI’s dependency on a mouse decreased my productivity. After a cursory look at the available choices I landed right back where I started all those years ago: Vim and GnuPG.

These days Vim is my most used program outside of a web browser and I use GnuPG daily for handling the majority of my encryption needs. My greater familiarity with both of these tools is one of the reasons I’ve been successful with the system this time around. I believe the other reason is my more systematic approach.

Structure

The power of this system comes from its simplicity: passwords are stored in plain text files that have been encrypted with GnuPG. Every platform out there has some implementation of the PGP protocol, so the files can easily be decrypted anywhere. After they’ve been decrypted, there’s no fancy file formats to deal with. It’s all just text, which can be manipulated with a plethora of powerful tools. I favor reading the text in Vim, but any text editor will do the job.

All passwords are stored within a directory called ~/pw. Within this directory are multiple files. Each of these files can be thought of as a separate password database. I store bank information in financial.gpg. Login information for various shopping websites are in ecommerce.gpg. My email credentials are in email.gpg. All of these entries could very well be stored in a single file, but breaking it out into multiple files allows me some measure of access control.

Access

I regularly use two computers: my laptop at home and a desktop machine at work. I trust my laptop. It has my GnuPG key on it and it should have access to all password database files. I do not place complete trust in my machine at work. I don’t trust it enough to give it access to my GnuPG key, and as such I have a different GnuPG key on that machine that I use for encryption at work.

Having passwords segregated into multiple database files allows me to encrypt the different files to different keys. Every file is encrypted to my primary GnuPG key, but only some are encrypted with my work key. Login credentials needed for work are encrypted to the work key. I have no need to login to my bank accounts at work, and it wouldn’t be prudent to do so on a machine that I do not fully trust, so the financial.gpg file is not encrypted to my work key. If someone compromises my work computer, they still will be no closer to accessing my banking credentials.

Git

The ~/pw directory is a git repository. This gives me version control on all of my passwords. If I accidentally delete an entry I can always get it back. It also provides syncing and redundant storage without depending on a third-party like Dropbox.

Keys

An advantage of using a directory full of encrypted files as my password manager is that I’m not limited to only storing usernames and passwords. Any file can be added to the repository. I keep keys for backups, SSH keys, and SSL keys (all of which have been encrypted with my GnuPG key) in the directory. This gives me one location for all of my authentication credentials, which simplifies the locating and backing up of these important files.

Markup

Each file is structured with Vim folds and indentation. There are various ways for Vim to fold text. I use markers, sticking with the default {{{/}}} characters. A typical password entry will look like this:

Amazon{{{
    user:   foo@bar.com
    pass:   supers3cr3t
    url:    https://amazon.com
}}}

Each file is full of entries like this. Certain entries are grouped together within other folds for organization. Certain entries may have comments so that I have a record of the false personally identifiable information the service requested when I registered.

Super Ecommerce{{{
    user:   foobar
    pass:   g0d
    Comments{{{
        birthday:   1/1/1911
        first car:  delorean
    }}}
}}}

Following a consistent structure like this makes the file easier to navigate and allows for the possibility of the file being parsed by a script. The fold markers come into play with my Vim configuration.

Vim

I use Vim with the vim-gnupg plugin. This makes editing of encrypted files seamless. When opening existing files, the contents are decrypted. When opening new files, the plugin asks which recipients the file should be encrypted to. When a file is open, leaking the clear text is avoided by disabling viminfo, swapfile, and undofile. I run gpg-agent so that my passphrase is remembered for a short period of time after I use it. This makes it easy and secure to work with (and create) the encrypted files with Vim. I define a few extra options in my vimrc to facilitate working with passwords.

""""""""""""""""""""
" GnuPG Extensions "
""""""""""""""""""""

" Tell the GnuPG plugin to armor new files.
let g:GPGPreferArmor=1

" Tell the GnuPG plugin to sign new files.
let g:GPGPreferSign=1

augroup GnuPGExtra
" Set extra file options.
    autocmd BufReadCmd,FileReadCmd *.\(gpg\|asc\|pgp\) call SetGPGOptions()
" Automatically close unmodified files after inactivity.
    autocmd CursorHold *.\(gpg\|asc\|pgp\) quit
augroup END

function SetGPGOptions()
" Set updatetime to 1 minute.
    set updatetime=60000
" Fold at markers.
    set foldmethod=marker
" Automatically close all folds.
    set foldclose=all
" Only open folds with insert commands.
    set foldopen=insert
endfunction

The first two options simply tell vim-gnupg to always ASCII-armor and sign new files. These have nothing particular to do with password management, but are good practices for all encrypted files.

The first autocmd calls a function which holds the options that I wanted applied to my password files. I have these options apply to all encrypted files, although they’re intended primarily for use when Vim is acting as my password manager.

Folding

The primary shortcoming with using an encrypted text file as a password database is the lack of protection against shoulder-surfing. After the file has been decrypted and opened, anyone standing behind you can look over your shoulder and view all the entries. This is solved with folds and is what most of these extra options address.

I set foldmethod to marker so that Vim knows to look for all the {{{/}}} characters and use them to build the folds. Then I set foldclose to all. This closes all folds unless the cursor is in them. This way only one fold can be open at a time – or, to put it another way, only one password entry is ever visible at once.

The final fold option instructs Vim when it is allowed to open folds. Folds can always be opened manually, but by default Vim will also open them for many other cases: if you navigate to a fold, jump to a mark within a fold or search for a pattern within a fold, they will open. By setting foldopen to insert I instruct Vim that the only time it should automatically open a fold is if my cursor is in a fold and I change to insert mode. The effect of this is that when I open a file, all folds are closed by default. I can navigate through the file, search and jump through matches, all without opening any of the folds and inadvertently exposing the passwords on my screen. The fold will open if I change to insert mode within it, but it is difficult to do that by mistake.

I have my spacebar setup to toggle folds within Vim. After I have navigated to the desired entry, I can simply whack the spacebar to open it and copy the credential that I need to the system clipboard. At that point I can whack the spacebar again to close the fold, or I can quit Vim. Or I can simply wait.

Locking

The other special option I set is updatetime. Vim uses this option to determine when it should write swap files for crash recovery. Since vim-gnupg disables swap files for decrypted files, this has no effect. I use it for something else.

In the second autocmd I tell Vim to close itself on CursorHold. CursorHold is triggered whenever no key has been pressed for the time specified by updatetime. So the effect of this is that my password files are automatically closed after 1 minute of inactivity. This is similar to KeePassX’s behaviour of “locking the workspace” after a set period of inactivity.

Clipboard

To easily copy a credential to the system clipboard from Vim I have two shortcuts mapped.

" Yank WORD to system clipboard in normal mode
nmap <leader>y "+yE

" Yank selection to system clipboard in visual mode
vmap <leader>y "+y

Vim can access the system clipboard using both the * and + registers. I opt to use + because X treats it as a selection rather than a cut-buffer. As the Vim documentation explains:

Selections are “owned” by an application, and disappear when that application (e.g., Vim) exits, thus losing the data, whereas cut-buffers, are stored within the X-server itself and remain until written over or the X-server exits (e.g., upon logging out).

The result is that I can copy a username or password by placing the cursor on its first character and hitting <leader>y. I can paste the credential wherever it is needed. After I close Vim, or after Vim closes itself after 1 minute of inactivity, the credential is removed from the clipboard. This replicates KeePassX’s behaviour of clearing the clipboard so many seconds after a username or password has been copied.

Generation

Passwords should be long and unique. To satisfy this any password manager needs some sort of password generator. Vim provides this with its ability to call and read external commands I can tell Vim to call the standard-issue pwgen program to generate a secure 24-character password utilizing special characters and insert the output at the cursor, like this:

:r!pwgen -sy 24 1

Backups

The ~/pw directory is backed up in the same way as most other things on my hard drive: to Tarsnap via Tarsnapper, to an external drive via rsnapshot and cryptshot, rsync to a mirror drive. The issue with these standard backups is that they’re all encrypted and the keys to decrypt them are stored in the password manager. If I loose ~/pw I’ll have plenty of backups around, but none that I can actually access. I address this problem with regular backups to optical media.

At the beginning of every month I burn the password directory to two CDs. One copy is stored at home and the other at an off-site location. I began these optical media backups in December, so I currently have two sets consisting of five discs each. Any one of these discs will provide me with the keys I need to access a backup made with one of the more frequent methods.

Of course, all the files being burned to these discs are still encrypted with my GnuPG key. If I loose that key or passphrase I will have no way to decrypt any of these files. Protecting one’s GnuPG key is another problem entirely. I’ve taken steps that make me feel confident in my ability to always be able to recover a copy of my key, but none that I’m comfortable discussing publicly.

Shell

I’ve defined a shell function, pw(), that operates exactly like the function I use for notes on Unix.

# Set the password database directory.
PASSDIR=~/pw

# Create or edit password databases.
pw() {
    cd "$PASSDIR"
    if [ ! -z "$1" ]; then
        $EDITOR $(buildfile "$1")
        cd "$OLDPWD"
    fi
}

This allows me to easily open any password file from wherever I am in the filesystem without specifying the full path. These two commands are equivalent, but the one utilizing pw() requires fewer keystrokes:

$ vim ~/pw/financial.gpg
$ pw financial

The function changes to the password directory before opening the file so that while I’m in Vim I can drop down to a shell with :sh and already be in the proper directory to manipulate the files. After I close Vim the function returns me to the previous working directory.

This still required a few more keystrokes than I like, so I configured my shell to perform autocompletion in the directory. If financial.gpg is the only file in the directory beginning with an “f”, typing pw f<tab> is all that is required to open the file.

Simplicity

This setup provides simplicity, power, and portability. It uses the same tools that I already employ in my daily life, and does not require the use of the mouse or any graphical windows. I’ve been happily utilizing it for about 6 months now.

Initially I had thought I would supplement the setup with a script that would search the databases for a desired entry, using some combination of grep, awk and cut, and then copy it to my clipboard via xsel. As it turns out, I haven’t felt the desire to do this. Simply opening the file in Vim, searching for the desired entry, opening the fold and copying the credential to the system clipboard is quick enough. The whole process, absent of typing in my passphrase, takes me only a couple of seconds.

Resources

I’m certainly not the first to come up with the idea of managing password with Vim. These resources were particularly useful to me when I was researching the possibilities:

File encryption and password management by Conner McDaniel
Keep passwords in encrypted file on the Vim Wiki
Password Safe with Vim and OpenSSL by Noah

If you’re interesting in other ideas for password management, password-store and KeePassC are both neat projects that I follow.

2013 June 30: larsks has hacked together a Python script to convert KeepassX XML exports to the plain-text markup format that I use.

This article was modified 2013-06-30.

2013-01-05

micro, linux

Freedom for users, not for software.

Mako has published a short essay in which he argues in favor of a conceptual shift, referring to “free users” rather than “free software”. It draws attention to the larger societal impact of proprietary software, and points out that software that is free-as-in-freedom can still create dependent and vulnerable users when used as a 3rd party service in the cloud.

2012-12-22

shell, code, linux

Notes on Unix

As a long-time user of Unix-like systems, I prefer to do as much work in the command-line as possible. I store data in plain text whenever appropriate. I edit in vim and take advantage of the pipeline to manipulate the data with powerful tools like awk and grep.

Notes are one such instance where plain text makes sense. All of my notes – which are scratch-pads for ideas, reference material, logs, and whatnot – are kept as individual text files in the directory ~/documents/notes. The entire ~/documents directory is synced between my laptop and my work computer, and of course it is backed up with tarsnap.

When I want to read, edit or create a note, my habit is simply to open the file in vim.

1	`$ vim ~/documents/notes/todo.txt`

When I want to view a list of my notes, I can just ls the directory. I pass along the -t and -r flags. The first flag sorts the files by modification date, newest first. The second flag reverses the order. The result is that the most recently modified files end up at the bottom of the list, nearest the prompt. This allows me to quickly see which notes I have recently created or changed. These notes are generally active – they’re the ones I’m currently doing something with, so they’re the ones I want to see. Using ls to see which files have been most recently modified is incredibly useful, and a behaviour that I use often enough to have created an alias for it.

1	`$ lt ~/documents/notes`

Recently I was inspired by some extremely simple shell functions on the One Thing Well blog to make working with my notes even easier.

The first function, n(), takes the name of the note as an argument – minus the file extension – and opens it.

function n {
nano ~/Dropbox/Notes/$1.txt 
}

I liked the idea. It would allow me to open a note from anywhere in the filesystem without specifying the full path. After changing the editor and the path, I could open the same note as before with far fewer keystrokes.

1	`$ n todo`

I needed to make a few changes to the function to increase its flexibility.

First, the extension. Most of my notes have .txt extensions. Some have a .gpg extension.¹ Some have no extension. I didn’t want to force the .txt extension in the function.

If I specified a file extension, that extension should be used. If I failed to specify the extension, I wanted the function to open the file as I specified it only if it existed. Otherwise, I wanted it to look for that file with a .gpg extension and open that if it was found. As a last resort, I wanted it to open the file with a .txt extension regardless of whether it existed or not. I implemented this behaviour in a separate function, buildfile(), so that I could take advantage of it wherever I wanted.

# Take a text file and build it with an extension, if needed.
# Prefer gpg extension over txt.
buildfile() {
    # If an extension was given, use it.
    if [[ "$1" == *.* ]]; then
        echo "$1"

    # If no extension was given...
    else
        # ... try the file without any extension.
        if [ -e "$1" ]; then
            echo "$1"
        # ... try the file with a gpg extension.
        elif [ -e "$1".gpg ]; then
            echo "$1".gpg
        # ... use a txt extension.
        else
            echo "$1".txt
        fi
    fi
}

I then rewrote the original note function to take advantage of this.

# Set the note directory.
NOTEDIR=~/documents/notes

# Create or edit notes.
n() {
    # If no note was given, list the notes.
    if [ -z "$1" ]; then
        lt "$NOTEDIR"
    # If a note was given, open it.
    else
        $EDITOR $(buildfile "$NOTEDIR"/"$1")
    fi
}

Now I can edit the note ~/documents/notes/todo.txt by simply passing along the filename without an extension.

1	`$ n todo`

If I want to specify the extension, that will work too.

1	`$ n todo.txt`

If I have a note called ~/documents/notes/world-domination.gpg, I can edit that without specifying the extension.

1	`$ n world-domination`

If I have a note called ~/documents/notes/readme, I can edit that and the function will respect the lack of an extension.

1	`$ n readme`

Finally, if I want to create a note, I can just specify the name of the note and a file with that name will be created with a .txt extension.

The other change I made was to make the function print a reverse-chronologically sorted list of my notes if it was called with no arguments. This allows me to view my notes by typing a single character.

$ n

The original blog post also included a function ns() for searching notes by their title.

function ns {
ls -c ~/Dropbox/Notes | grep $1
}

I thought this was a good idea, but I considered the behaviour to be finding a note rather than searching a note. I renamed the function to reflect its behaviour, took advantage of my ls alias, and made the search case-insensitive. I also modified it so that if no argument was given, it would simply print an ordered list of the notes, just like n().

# Find a note by title.
nf() {
    if [ -z "$1" ]; then
        lt "$NOTEDIR"
    else
        lt "$NOTEDIR" | grep -i $1
    fi
}

I thought it would be nice to also have a quick way to search within notes. That was accomplished with ns(), the simplest function of the trio.

# Search within notes.
ns() { cd $NOTEDIR; grep -rin $1; cd "$OLDPWD"; }

I chose to change into the note directory before searching so that the results do not have the full path prefixed to the filename. Thanks to the first function I don’t need to specify the full path of the note to open it, and this makes the output easier to read.

$ ns "take over"
world-domination.txt:1:This is my plan to take over the world.
$ n world-domination

To finish it up, I added completion to zsh so that I can tab-complete filenames when using n.

# Set autocompletion for notes.
compctl -W $NOTEDIR -f n

If you’re interested in seeing some of the other way that I personalize my working environment, all of my dotfiles are on GitHub. The note functions are in shellrc, which is my shell-agnostic configuration file that I source from both zshrc and bashrc².

Notes

↵ I use vim with the gnupg.vim plugin for seamless editing of PGP-encrypted files.
↵ I prefer zsh, but I do still find myself in bash on some machines. I find it prudent to maintain configurations for both shells. There's a lot of cross-over between them and I like to stick to the DRY principle.

2012-10-20

linux

Strategies for Working Offline

Earlier this year I went through a period where I had only intermittent internet access. Constant, high-speed internet access is so common these days that I had forgotten what it was like to work on a computer that was offline more often than it was online. It provided an opportunity to reevaluate some aspects of how I work with data.

E-Mail

For a period of a couple years I accessed my mail through the Gmail web interface exclusively. About a year ago, I moved back to using a local mail client. That turned out to be a lucky move for this experience. I found that I had three requirements for my mail:

I needed to be able to read mail when offline (all of it, attachments included),
compose mail when offline,
and queue messages up to be automatically sent out the next time the mail client found that it had internet access.

Those requirements are fairly basic. Most mail clients will handle them without issue. I’ve always preferred to connect to my mail server via IMAP rather than POP3. Most mail clients offer to cache messages retrieved over IMAP. They do this for performance reasons rather than to provide the ability to read mail offline, but the result is the same. For mail clients like Mutt that don’t have built-in caching, a tool like OfflineImap is great.

Wikipedia

Wikipedia is too valuable a resource to not have offline access to. There are a number of options for getting a local copy. I found Kiwix to be a simple and effective solution. It downloads a compressed copy of the Wikipedia database and provides a web-browser-like interface to read it. The English Wikipedia is just shy of 10 gigabytes (other languages are of course available). That includes all articles, but no pictures, history or talk pages. Obviously this is something you want to download before you’re depending on coffee shops and libraries for intermittent internet access. After Kiwix has downloaded the database, it needs to be indexed for proper searching. Indexing is a resource-intensive process that will take a long time, but it’s worth it. When it’s done, you’ll have a not-insignificant chunk of our species’ combined knowledge sitting on your hard-drive. (It’s the next best thing to the Guide, really.)

Arch Wiki

For folks like myself who run Arch Linux, the Arch Wiki is an indispensable resource. For people who use other distributions, it’s less important, but still holds value. I think it’s the single best repository of Linux information out there.

For us Arch users, getting a local copy is simple. The arch-wiki-docs package provides a flat HTML copy of the wiki. Even better is the arch-wiki-lite package which provides a console interface for searching and viewing the wiki.

Users of other distributions could, at a minimum, extract the contents of the arch-wiki-docs package and grep through it.

Tunneling

Open wireless networks are dirty places. I’m never comfortable using them without tunneling my traffic. An anonymizing proxy like Tor is overkill for a situation like this. A full-fledged VPN is the best option, but I’ve found sshuttle to make an excellent poor-man’s VPN. It builds a tunnel over SSH, while addressing some of the shortcomings of vanilla SSH port forwarding. All traffic is forced through the tunnel, including DNS queries. If you have a VPS, a shared hosting account, or simply a machine sitting at home, sshuttle makes it dead simple to protect your traffic when on unfamiliar networks.

YouTube

YouTube is a great source of both education and entertainment. If you are only going to be online for a couple hours a week, you probably don’t want to waste those few hours streaming videos. There’s a number of browser plugins that allow you to download YouTube videos, but my favorite solution is a Python program called youtube-dl. (It’s unfortunately named, as it also supports downloading videos from other sites like Vimeo and blip.tv.) You pass it the URL of the YouTube video page and it grabs the highest-quality version available. It has a number of powerful options, but for me the killer feature is the ability to download whole playlists. Say you want to grab every episode of the great web-series Sync. Just pass the URL of the playlist to youtube-dl.

$ youtube-dl -t https://www.youtube.com/playlist?list=PL168F329FADED6741

That’s it. It goes out and grabs every video. (The -t flag tells it to use the video’s title in the file name.) If you come back a few weeks later and think there might have been a couple new videos added to the playlist, you can just run the same command again but with the -c flag, which tells it to resume. It will see that it already downloaded part of the playlist and will only get videos that it doesn’t yet have.

Even now that I’m back to having constant internet access, I still find myself using youtube-dl on a regular basis. If I find a video that I want to watch at a later time, I download it. That way I don’t have to worry about buffering, or the video disappearing due to DMCA take-down requests.

Backups

I keep backitup.sh in my network profile so that my online backups attempt to execute whenever I get online. If you’re only online once or twice a week, you probably have more important things to do than remembering to manually trigger your online backups. It’s nice to have that automated.

This article was modified 2012-12-22.

2012-10-03

shell, backups, code, linux

Back It Up: A Solution for Laptop Backups

A laptop presents some problems for reliably backing up data. Unlike a server, the laptop may not always be turned on. When it is on, it may not be connected to the backup medium. If you’re doing online backups, the laptop may be offline. If you’re backing up to an external drive, the drive may not be plugged in. To address these issues I wrote a shell script called backitup.sh.

The Problem

Let’s say you want to backup a laptop to an external USB drive once per day with cryptshot.

You could add a cron entry to call cryptshot.sh at a certain time every day. What if the laptop isn’t turned on? What if the drive isn’t connected? In either case the backup will not be completed. The machine will then wait a full 24 hours before even attempting the backup again. This could easily result in weeks passing without a successful backup.

If you’re using anacron, or one of its derivatives, things get slightly better. Instead of specifying a time to call cryptshot.sh, you set the cron interval to @daily. If the machine is turned off at whatever time anacron is setup to execute @daily scripts, all of the commands will simply be executed the next time the machine boots. But that still doesn’t solve the problem of the drive not being plugged in.

The Solution

backitup.sh attempts to perform a backup if a certain amount of time has passed. It monitors for a report of successful completion of the backup. Once configured, you no longer call the backup program directly. Instead, you call backitup.sh. It then decides whether or not to actually execute the backup.

How it works

The script is configured with the backup program that should be executed, the period for which you want to complete backups, and the location of a file that holds the timestamp of the last successful backup. It can be configured either by modifying the variables at the top of the script, or by passing in command-line arguments.

$ backitup.sh -h
Usage: backitup.sh [OPTION...]
Note that any command line arguments overwrite variables defined in the source.

Options:
    -p      the period for which backups should attempt to be executed
            (integer seconds or 'DAILY', 'WEEKLY' or 'MONTHLY')
    -b      the backup command to execute; note that this should be quoted if it contains a space
    -l      the location of the file that holds the timestamp of the last successful backup.
    -n      the command to be executed if the above file does not exist

When the script executes, it reads the timestamp contained in the last-run file. This is then compared to the user-specified period. If the difference between the timestamp and the current time is greater than the period, backitup.sh calls the backup program. If the difference between the stored timestamp and the current time is less than the requested period, the script simply exits without running the backup program.

After the backup program completes, the script looks at the returned exit code. If the exit code is 0, the backup was completed successfully, and the timestamp in the last-run file is replaced with the current time. If the backup program returns a non-zero exit code, no changes are made to the last-run file. In this case, the result is that the next time backitup.sh is called it will once again attempt to execute the backup program.

The period can either be specified in seconds or with the strings DAILY, WEEKLY or MONTHLY. The behaviour of DAILY differs from 86400 (24-hours in seconds). With the latter configuration, the backup program will only attempt to execute once per 24-hour period. If DAILY is specified, the backup may be completed successfully at, for example, 23:30 one day and again at 00:15 the following day.

Use

You still want to backup a laptop to an external USB drive once per day with cryptshot. Rather than calling cryptshot.sh, you call backitup.sh.

Tell the script that you wish to complete daily backups, and then use cron to call the script more frequently than the desired backup period. For my local backups, I call backitup.sh every hour.

@hourly backitup.sh -l ~/.cryptshot-daily -b "cryptshot.sh daily"

The default period of backitup.sh is DAILY, so in this case I don’t have to provide a period of my own. But I also do weekly and monthly backups, so I need two more entries to execute cryptshot with those periods.

@hourly backitup.sh -l ~/.cryptshot-monthly -b "cryptshot.sh monthly" -p MONTHLY
@hourly backitup.sh -l ~/.cryptshot-weekly -b "cryptshot.sh weekly" -p WEEKLY

All three of these entries are executed hourly, which means that at the top of every hour, my laptop attempts to back itself up. As long as the USB drive is plugged in during one of those hours, the backup will complete. If cryptshot is executed, but fails, another attempt will be made the next hour. Daily backups will only be successfully completed, at most, once per day; weekly backups, once per week; and monthly backups, once per month. This setup works well for me, but if you want a higher assurance that your daily backups will be completed every day you could change the cron interval to */5 * * * *, which will result in cron executing backitup.sh every 5 minutes.

What if you want to perform daily online backups with Tarsnapper?

@hourly backitup.sh -l ~/.tarsnapper-lastrun -b tarsnapper.py

At the top of every hour your laptop will attempt to run Tarsnap via Tarsnapper. If the laptop is offline, it will try again the following hour. If Tarsnap begins but you go offline before it can complete, the backup will be resumed the following hour.

The script can of course be called with something other than cron. Put it in your ~/.profile and have you backups attempt to execute every time you login. Add it to your network manager and have your online backups attempt to execute every time you get online. If you’re using something like udev, have your local backups attempt to execute every time your USB drive is plugged in.

The Special Case

The final configuration option of backitup.sh represents a special case. If the script runs and it can’t find the specified file, the default behaviour is to assume that this is the first time it has ever run: it creates the file and executes the backup. That is what most users will want, but this behaviour can be changed.

When I first wrote backitup.sh it was to help manage backups of my Dropbox folder. Dropbox doesn’t provide support client-side encryption, which means users need to handle encryption themselves. The most common way to do this is to create an encfs file-system or two and place those within the Dropbox directory. That’s the way I use Dropbox.

I wanted to backup all the data stored in Dropbox with Tarsnap. Unlike Dropbox, Tarsnap does do client-side encryption, so when I backup my Dropbox folder, I don’t want to actually backup the encrypted contents of the folder – I want to backup the decrypted contents. That allows me to take better advantage of Tarsnap’s deduplication and it makes restoring backups much simpler. Rather than comparing inodes and restoring a file using an encrypted filename like 6,8xHZgiIGN0vbDTBGw6w3lf/1nvj1,SSuiYY0qoYh-of5YX8 I can just restore documents/todo.txt.

If my encfs filesystem mount point is ~/documents, I can configure Tarsnapper to create an archive of that directory, but if for some reason the filesystem is not mounted when Tarsnapper is called, I would be making a backup of an empty directory. That’s a waste of time. The solution is to tell backitup.sh to put the last-run file inside the encfs filesystem. If it can’t find the file, that means that the filesystem isn’t mounted. If that’s the case, I tell it to call the script I use to automatically mount the encfs filesystem (which, the way I have it setup, requires no interaction from me).

@hourly backitup.sh -l ~/documents/.lastrun -b tarsnapper.py -n ~/bin/encfs_automount.sh

Problem Solved

backitup.sh solves all of my backup scheduling problems. I only call backup programs directly if I want to make an on-demand backup. All of my automated backups go through backitup.sh. If you’re interested in the script, you can download it directly from GitHub. You can clone my entire backups repository if you’re also interested in the other scripts I’ve written to manage different aspects of backing up data.

Hey yo but wait, back it up, hup, easy back it up

This article was modified 2012-12-22.