pig-monkey.com - annexhttps://pig-monkey.com/2020-08-05T15:17:03-07:00Organizing Ledger2020-08-03T00:00:00-07:002020-08-05T15:17:03-07:00Pig Monkeytag:pig-monkey.com,2020-08-03:/2020/08/organizing-ledger/<p><a href="https://www.ledger-cli.org/">Ledger</a> is a <a href="https://en.wikipedia.org/wiki/Double-entry_bookkeeping">double-entry accounting system</a> that stores data in plain text. I began using it in 2012. Almost every dollar that has passed through my world since then is tracked by Ledger.<sup class="footnote-ref" id="fnref:cash"><a rel="footnote" href="#fn:cash" title="see footnote">1</a></sup></p> <p>Ledger is not the only <a href="https://plaintextaccounting.org/">plain text accounting system</a> out there. It has inspired others, such as <a href="https://hledger.org/">hledger</a> and <a href="http://furius.ca/beancount/">beancount</a>. I began with Ledger for lack of a compelling argument in favor of the alternatives. After close to a decade of use, my only regret is that I didn&rsquo;t start using it earlier.</p> <p>My Ledger repository is stored at <code>~/library/ledger</code>. This repository contains a <code>data</code> directory, which includes yearly Ledger journal files such as <code>data/2019.ldg</code> and <code>data/2020.ldg</code>. Ledger files don&rsquo;t necessarily need to be split at all, but I like having one file per year. 
In January, after I clear the last transaction from the previous year, I know the year is locked and the file never gets touched again (unless I go back in to rejigger my account structure).</p> <p>The root of the directory has a <code>.ledger</code> file which includes all of these data files, plus a special journal file with periodic transactions that I sometimes use for budgeting. My <code>~/.ledgerrc</code> file tells Ledger to use the <code>.ledger</code> file as the primary journal, which has the effect of including all the yearly files.</p> <div class="highlight"><pre><span></span><code>$ cat ~/.ledgerrc
--file ~/library/ledger/.ledger
--date-format<span class="o">=</span>%Y-%m-%d
$ cat ~/library/ledger/.ledger
include data/periodic.ldg
include data/2012.ldg
include data/2013.ldg
include data/2014.ldg
include data/2015.ldg
include data/2016.ldg
include data/2017.ldg
include data/2018.ldg
include data/2019.ldg
include data/2020.ldg
</code></pre></div> <p>Ledger&rsquo;s include format does support globbing (ie <code>include data/*.ldg</code>) but the ordering of the transactions can get weird, so I prefer to be explicit.</p> <p>The repository also contains receipts in the <code>receipts</code> directory, invoices in the <code>invoices</code> directory, scans of checks (remember those?) in the <code>checks</code> directory, and CSV dumps from banks in the <code>dump</code> directory.</p> <div class="highlight"><pre><span></span><code>$ tree -d ~/library/ledger
/home/pigmonkey/library/ledger
├── checks
├── data
├── dump
├── invoices
└── receipts

<span class="m">5</span> directories
</code></pre></div> <p>The repository is managed using a mix of vanilla git and <a href="https://git-annex.branchable.com/">git-annex</a>.<sup class="footnote-ref" id="fnref:annex"><a rel="footnote" href="#fn:annex" title="see footnote">2</a></sup> It is important to me that the Ledger journal files in the <code>data</code> directory are stored directly in git. 
I want the ability to diff changes before committing them, and to be able to pull the history of those files. Every other file I want stored in git-annex. I don&rsquo;t care about the history of files like PDF receipts. They never change. In fact, I want to make them read-only so I can&rsquo;t accidentally change them. I want encrypted versions of them distributed to my numerous <a href="/2016/08/rclone/">special remotes</a> for safekeeping, and someday I may even want to drop old receipts or invoices from my local store so that they don&rsquo;t take up disk space until I actually need to read them. That sounds like asking a lot, but git-annex magically solves all the problems with its <a href="https://git-annex.branchable.com/tips/largefiles/"><code>largefiles</code> configuration option</a>.</p> <div class="highlight"><pre><span></span><code>$ cat ~/library/ledger/.gitattributes
*.ldg annex.largefiles<span class="o">=</span>nothing
</code></pre></div> <p>This tells git-annex that any file ending with <code>*.ldg</code> should not be treated as a &ldquo;large file&rdquo;, which means it should be added directly to git. Any other file should be added to git-annex and <a href="https://git-annex.branchable.com/git-annex-lock/">locked</a>, making it read-only. Having this configured means that I can just blindly <code>git annex add .</code> or <code>git add .</code> within the repository and git-annex will always do the right thing.</p> <p>I don&rsquo;t run the <a href="https://git-annex.branchable.com/assistant/">git-annex assistant</a> in this repository because I don&rsquo;t want any automatic commits. 
As with a traditional git repository, I only commit changes to Ledger&rsquo;s journal files after reviewing the diffs, and I want those commits to have meaningful messages.</p> <div id="footnotes"> <h2>Notes</h2> <ol> <li id="fn:cash"><a rev="footnote" href="#fnref:cash" class="footnote-return" title="return to article">&crarr;</a> I do not always track miscellaneous cash transactions less than $20. If a thing costs more than that, it is worth tracking, regardless of what it is or how it was purchased. If it costs less than that, and it isn't part of a meaningful expense account, I'll probably let laziness win out. If I buy an $8 sandwich for lunch with cash, it'll get logged, because I care about tracking dining expenses. If I buy a $1 pencil eraser, I probably won't log it, because it isn't part of an account worth considering.</li> <li id="fn:annex"><a rev="footnote" href="#fnref:annex" class="footnote-return" title="return to article">&crarr;</a> I bet you <a href="/tag/annex/">saw that coming</a>.</li> </ol> </div>Whenever I buy a new piece of equipment, I store its manual as a PDF.2020-02-26T00:00:00-08:002020-02-26T20:25:29-08:00Pig Monkeytag:pig-monkey.com,2020-02-26:/2020/02/manual-storage/<p>If an internet search doesn&rsquo;t come up with a copy of the manual, I&rsquo;ll scan the dead tree version and <a href="/2019/07/ocrmypdf/">OCR it</a>. The document is then stored in an <a href="https://git-annex.branchable.com/">annex</a> at <code>~/documents/manuals/</code>. 
I rarely reference the product manual after initial setup, but when I need it, it&rsquo;s extremely valuable to have it available &ndash; immediately and offline &ndash; as a PDF with a searchable text layer.</p> <p>Some products don&rsquo;t have manuals, but do have specification sheets. I store these in the same location. Sometimes I&rsquo;ll just save the product page from the manufacturer&rsquo;s website as a PDF. This allows me to easily look up the dimensions of a thing I bought 14 years ago, despite the product being long discontinued by the manufacturer, or the manufacturer no longer existing.</p>Music Organization with Beets2019-07-22T00:00:00-07:002019-07-22T17:49:10-07:00Pig Monkeytag:pig-monkey.com,2019-07-22:/2019/07/beets/<p>I organize my music with <a href="http://beets.io/">Beets</a>.</p>
<p>Beets <a href="https://beets.readthedocs.io/en/stable/reference/cli.html#import">imports</a> music into my library, warns me if I&rsquo;m <a href="https://beets.readthedocs.io/en/stable/plugins/missing.html">missing</a> tracks, identifies tracks based on their <a href="https://beets.readthedocs.io/en/stable/plugins/chroma.html">acoustic fingerprint</a>, <a href="https://beets.readthedocs.io/en/stable/plugins/scrub.html">scrubs</a> extraneous metadata, fetches and stores <a href="https://beets.readthedocs.io/en/stable/plugins/fetchart.html">album art</a>, cleans <a href="https://beets.readthedocs.io/en/stable/plugins/lastgenre.html">genres</a>, fetches <a href="https://beets.readthedocs.io/en/stable/plugins/lyrics.html">lyrics</a>, and &ndash; most importantly &ndash; <a href="https://beets.readthedocs.io/en/stable/plugins/mbsync.html">fetches metadata</a> from <a href="https://musicbrainz.org/">MusicBrainz</a>. 
After some basic <a href="https://github.com/pigmonkey/dotfiles/blob/master/config/beets/config.yaml">configuration</a>, all of this happens automatically when I import new files into my library.</p> <p>After the files have been imported, beets makes it easy to query my library based on any of the clean, consistent, high-quality, crowd-sourced metadata.</p> <div class="highlight"><pre><span></span><code>$ beet stats genre:ambient
Tracks: <span class="m">649</span>
Total time: <span class="m">2</span>.7 days
Approximate total size: <span class="m">22</span>.4 GiB
Artists: <span class="m">76</span>
Albums: <span class="m">53</span>
Album artists: <span class="m">34</span>
$ beet ls -a <span class="s1">&#39;added:2019-07-01..&#39;</span>
Deathcount <span class="k">in</span> Silicon Valley - Acheron
Dlareme - Compass
The Higher Intelligence Agency <span class="p">&amp;</span> Biosphere - Polar Sequences
JK/47 - Tokyo Empires
Matt Morton - Apollo <span class="m">11</span> Soundtrack
$ beet ls -ap albumartist:joplin
/home/pigmonkey/library/audio/music/Janis Joplin/Full Tilt Boogie
/home/pigmonkey/library/audio/music/Janis Joplin/I Got Dem Ol<span class="err">&#39;</span> Kozmic Blues Again Mama!
</code></pre></div> <p>As regular readers will have surmised, the files themselves are stored in <a href="https://git-annex.branchable.com/">git-annex</a>.</p>Optical Backups of Financial Archives2019-06-29T00:00:00-07:002019-06-29T14:49:50-07:00Pig Monkeytag:pig-monkey.com,2019-06-29:/2019/06/optical-financal-backups/<p>Every year I burn an optical archive of my financial documents, similar to how (and why) I <a href="/2013/05/optical-photo-backups/">create optical backups of photos</a>. I schedule this financial archive for the spring, after the previous year&rsquo;s taxes have been submitted and accepted. <a href="https://taskwarrior.org/">Taskwarrior</a> solves the problem of remembering to complete the archive.</p> <div class="highlight"><pre><span></span><code>$ task add project:finance due:2019-04-30 recur:yearly wait:due-4weeks <span class="s2">&quot;burn optical financial archive with parity&quot;</span>
</code></pre></div> <p>The archive includes two <a href="https://git-annex.branchable.com/">git-annex</a> repositories.</p> <p>The first is my <a href="https://www.ledger-cli.org/">ledger</a> repository. Ledger is the double-entry accounting system I began using in 2012 to record the movement of every penny that crosses one of my bank accounts (small cash transactions, less than about $20, are usually-but-not-always exempt from being recorded). In addition to the plain-text ledger files, this repository also holds PDF or JPG images of receipts.</p> <p>The second repository holds my tax information. Each tax year gets a <a href="https://git.zx2c4.com/ctmg/about/">ctmg</a> container which contains any documents used to complete my tax returns, the returns themselves, and any notifications of those returns being accepted.</p> <p>The yearly optical archive that I create holds the entirety of these two repositories &ndash; not just the information from the previous year &ndash; so really each disc only needs to have a shelf life of 12 months. 
Keeping the older discs around just provides redundancy for prior years.</p> <h2>Creating the Archive</h2> <p>The process of creating the archive is very similar to the process I outlined six years ago for the photo archives.</p> <p>The two repositories, combined, are about 2GB (most of that is the directory of receipts from the ledger repository). I burn these to a 25GB BD-R disc, so file size is not a concern. I&rsquo;ll <code>tar</code> them, but skip any compression, which would just add extra complexity for no gain.</p> <div class="highlight"><pre><span></span><code>$ mkdir ~/tmp/archive
$ <span class="nb">cd</span> ~/library
$ tar cvf ~/tmp/archive/ledger.tar ledger
$ tar cvf ~/tmp/archive/tax.tar tax
</code></pre></div> <p>The ledger archive will get signed and encrypted with my PGP key. The contents of the tax repository are already encrypted, so I&rsquo;ll skip encryption and just sign the archive. I like using detached signatures for this.</p> <div class="highlight"><pre><span></span><code>$ <span class="nb">cd</span> ~/tmp/archive
$ gpg -e -r peter@havenaut.net -o ledger.tar.gpg ledger.tar
$ gpg -bo ledger.tar.gpg.sig ledger.tar.gpg
$ gpg -bo tax.tar.sig tax.tar
$ rm ledger.tar
</code></pre></div> <p>Previously, when creating optical photo archives, I used <a href="https://web.archive.org/web/20160427222800/http://dvdisaster.net/en/index.html">DVDisaster</a> to create the disc image with parity. DVDisaster no longer exists. The code can still be found, and the program still works, but nobody is developing it and it doesn&rsquo;t even have an official web presence. This makes me uncomfortable for a tool that is part of my long-term archiving plans. As a result, I&rsquo;ve moved back to using <a href="https://parchive.github.io/">Parchive</a> for parity. 
Parchive also does not have much in the way of active development around it, but it <a href="https://github.com/Parchive/par2cmdline/commits/master">is still maintained</a>, has been around for a long period of time, is still used by a wide community, and will probably continue to exist as long as people share files on less-than-perfectly-reliable mediums.</p> <p>As previously mentioned, I&rsquo;m not worried about the storage space for these files, so I tell <code>par2create</code> to create PAR2 files with 30% redundancy. I suppose I could go even higher, but 30% seems like a good number. By default this process will be allowed to use 16MB of memory, which is cute, but RAM is cheap and I usually have enough to spare so I&rsquo;ll give it permission to use up to 8GB.</p> <div class="highlight"><pre><span></span><code>$ par2create -r30 -m8000 recovery.par2 * </code></pre></div> <p>Next I&rsquo;ll use <a href="http://md5deep.sourceforge.net/">hashdeep</a> to generate message digests for all the files in the archive.</p> <div class="highlight"><pre><span></span><code>$ hashdeep * &gt; hashes </code></pre></div> <p>At this point all the file processing is completed. I&rsquo;ll put a blank disc in my burner (a <a href="https://pioneerelectronics.com/PUSA/Computer/Computer+Drives/BDR-XD05B">Pioneer BDR-XD05B</a>) and burn the directory using <a href="http://fy.chalmers.se/~appro/linux/DVD+RW/">growisofs</a>.</p> <div class="highlight"><pre><span></span><code>$ growisofs -Z /dev/sr0 -V <span class="s2">&quot;Finances 2019&quot;</span> -r * </code></pre></div> <h2>Verification</h2> <p>The final step is to verify the disc. I have a few options on this front. 
These are the same steps I&rsquo;d take years down the road if I actually needed to recover data from the archive.</p> <p>I can use the previous hashes to find any files that do not match, which is a quick way to identify bit rot.</p> <div class="highlight"><pre><span></span><code>$ hashdeep -x -k hashes *.<span class="o">{</span>gpg,tar,sig,par2<span class="o">}</span>
</code></pre></div> <p>I can check the integrity of the PGP signatures.</p> <div class="highlight"><pre><span></span><code>$ gpg --verify ledger.tar.gpg<span class="o">{</span>.sig,<span class="o">}</span>
$ gpg --verify tax.tar<span class="o">{</span>.sig,<span class="o">}</span>
</code></pre></div> <p>I can use the PAR2 files to verify the original data files.</p> <div class="highlight"><pre><span></span><code>$ par2 verify recovery.par2
</code></pre></div>Archiving Bookmarks2018-11-23T00:00:00-08:002018-12-01T19:30:07-08:00Pig Monkeytag:pig-monkey.com,2018-11-23:/2018/11/archiving-bookmarks/<p>I signed up for <a href="https://pinboard.in/">Pinboard</a> in 2014. It provides everything I need from a bookmarking service, which is mostly, you know, bookmarking. I pay for the <a href="https://pinboard.in/upgrade/">archival account</a>, meaning that Pinboard downloads a copy of everything I bookmark and provides me with full-text search. I find this useful and well worth the $25 yearly fee, but Pinboard&rsquo;s archive is only part of the solution. I also need an offline copy of my bookmarks.</p> <p>Pinboard provides an <a href="https://pinboard.in/api/">API</a> that makes it easy to acquire a list of bookmarks. 
I have a <a href="https://github.com/pigmonkey/systools/blob/master/pinboard-backup.sh">small shell script</a> which pulls down a JSON-formatted list of my bookmarks and adds the file to <a href="https://git-annex.branchable.com/">git-annex</a>. This is controlled via a systemd <a href="https://github.com/pigmonkey/dotfiles/blob/master/config/systemd/user/pinboard-backup.service">service</a> and <a href="https://github.com/pigmonkey/dotfiles/blob/master/config/systemd/user/pinboard-backup.timer">timer</a>, which wraps the script in <a href="https://github.com/pigmonkey/backitup/">backitup</a> to ensure daily dumps. The systemd timer itself is controlled by <a href="https://github.com/pigmonkey/nmtrust">nmtrust</a>, so that it only runs when I am connected to a trusted network.</p> <p>This provides data portability, ensuring that I could import my tagged URLs to another bookmarking service if I ever found something better than Pinboard (unlikely, <a href="https://blog.pinboard.in/2017/06/pinboard_acquires_delicious/">competing with Pinboard is futile</a>). But I also want a locally archived copy of the pages themselves, which Pinboard does not offer through the API. I care very much about being able to <a href="/2012/10/working-offline/">work offline</a>. The usefulness of a computer is directly proportional to the amount of data that is accessible without a network connection.</p> <p>To address this I use <a href="https://github.com/pirate/bookmark-archiver">bookmark-archiver</a>, a Python script which reads URLs from a variety of input files, including Pinboard&rsquo;s JSON dumps. It archives each URL via wget, generates a screenshot and PDF via headless Chromium, and submits the URL to the Internet Archive (<a href="https://github.com/pirate/bookmark-archiver/issues/6">with WARC hopefully on the way</a>). It will then generate an HTML index page, allowing the archives to be easily browsed. 
When I want to browse the archive, I simply change into the directory and use <code>python -m http.server</code> to serve the bookmarks at <code>localhost:8000</code>. Once downloaded locally, the archives are of course backed up, via the usual suspects like <a href="/2017/07/borg/">borg</a> and <a href="https://github.com/pigmonkey/cryptshot">cryptshot</a>.</p> <p>The archiver is configured via environment variables. I configure my preferences and point the program at the Pinboard JSON dump in my annex via <a href="https://github.com/pigmonkey/systools/blob/master/bookmark-archiver">a shell script</a> (creatively also named <code>bookmark-archiver</code>). This wrapper script <a href="https://github.com/pigmonkey/systools/blob/master/pinboard-backup.sh#L14">is called by the previous script</a> which dumps the JSON from Pinboard.</p> <p>The result of all of this is that every day I get a fresh dump of all my bookmarks, each URL is archived locally in multiple formats, and the archive enters into my normal backup queue. <a href="https://www.gwern.net/Archiving-URLs#link-rot">Link rot</a> may <a href="https://www.theatlantic.com/technology/archive/2013/09/49-of-the-links-cited-in-supreme-court-decisions-are-broken/279901/">defeat the Supreme Court</a>, but between this and my <a href="/2017/06/repos/">automated repository tracking</a> I have a pretty good system for backing up useful pieces of other people&rsquo;s data.</p>On E-Books2018-11-17T00:00:00-08:002018-11-17T16:07:18-08:00Pig Monkeytag:pig-monkey.com,2018-11-17:/2018/11/ebooks/<p>The <a href="https://en.wikipedia.org/wiki/Amazon_Kindle#Kindle_Paperwhite_(2nd_generation)">Kindle Paperwhite</a> has been my primary medium for consuming books since the beginning of 2014. <a href="https://en.wikipedia.org/wiki/E_Ink">E Ink</a> is a great display technology that I wish was more widespread, but beyond the fact that the Kindle (and I assume other e-readers) makes for a pleasant reading experience, the real value in electronic books is storage.</p>
<p>At its peak my physical collection was somewhere north of 200 books. As <a href="/2010/08/on-books/">I mentioned years ago</a>, I took inspiration from Gary Snyder&rsquo;s character in The Dharma Bums and stored my books in milk crates, which stack like a bookcase for normal use and kept the collection pre-boxed for moving. But that many books still take up space, and are still annoying to move. And in some regards they are fragile &ndash; redundant data storage is expensive in meatspace.</p> <p>My digital library currently sits at 572 books and 13 gigabytes (the size skyrocketed after I began to archive a few comics). I could not justify that many physical books in my life. I still have a collection of dead trees, but I&rsquo;m down to 3 milk crates. I store my digital library in <a href="https://git-annex.branchable.com/">git-annex</a>, allowing me to <a href="/2016/08/rclone/">redundantly replicate</a> my collection across the globe, as well as keep copies in <a href="/2016/08/storage/">cold storage</a>. I also burn yearly <a href="/2013/05/optical-photo-backups/">optical backups</a> of the library to <a href="https://en.wikipedia.org/wiki/M-DISC">M-DISC</a>. 
The library is managed with <a href="https://calibre-ebook.com/">Calibre</a>.</p> <p>When I first bought the Kindle it required internet access to associate with my Amazon account. Ever since then, it has been in airplane mode. I spun up a temporary wireless network for the setup that I then deleted after the process was complete, ensuring that even if Amazon&rsquo;s airplane mode was untrustworthy, the device would not be able to phone home. The advantages of giving the Kindle internet access seem minute, and are far outweighed by the disadvantage of having to trust Amazon.</p> <p>If I purchase a book from Amazon, I select the &ldquo;Download &amp; Transfer via USB&rdquo; option. This results in a crippled <a href="https://en.wikipedia.org/wiki/Kindle_File_Format">AZW</a> file. I am under the radical delusion that I should own what I purchase, so I import that file into Calibre using the <a href="https://github.com/apprenticeharper/DeDRM_tools">DeDRM_tools</a> plugin. This strips any DRM, making the book ready to be consumed and archived. Books are transferred between my computer and the Kindle via USB, which Calibre makes simple.</p> <p>When I acquire books through other channels, my preferred format is always <a href="https://en.wikipedia.org/wiki/EPUB">EPUB</a>: an open format that is simply a zip archive of HTML files. Calibre&rsquo;s built-in conversion tools are quite good, giving me confidence that any e-book format I import into the library will be readable at any point in the future, but my preference is to store data in formats that are open, accessible, and understandable. The closer one gets to well-formatted plain text, the closer one gets to god.</p> <p>While the Kindle excels at the linear reading of novels, I&rsquo;ve also come to appreciate digital copies of reference books and technical manuals. 
Often the first reading of these types of books involves lots of flipping back and forth, which is easier in the dead tree variant, but after that first reading the searchability of the digital copy is far more useful for reference. The physical size of these types of books also makes them even more difficult to carry and store than other books, all but guaranteeing you won&rsquo;t have access to them when you need to reference them. Digital books solve that problem.</p> <p>I&rsquo;m confident in my ability to securely store digital data. Whenever I import a book into my library, I know that I now have permanent access to that knowledge for the rest of my life, regardless of environmental disaster, the whims of publishing houses, or the size of my living quarters.</p>Cold Storage2016-08-26T00:00:00-07:002016-08-27T18:19:34-07:00Pig Monkeytag:pig-monkey.com,2016-08-26:/2016/08/storage/<p>This past spring I mentioned my <a href="/2016/03/backup/">cold storage setup</a>: a number of encrypted 2.5&rdquo; drives in external enclosures, stored inside a <a href="http://www.pelican.com/us/en/product/watertight-protector-hard-cases/small-case/standard/1200/">Pelican 1200</a> case, secured with <a href="https://securitysnobs.com/Abloy-Protec2-PL-321-Padlock.html">Abloy Protec2 321</a> locks. Offline, secure, and infrequently accessed storage is an important component of any strategy for resilient data. 
The ease with which this can be managed with <a href="https://git-annex.branchable.com/">git-annex</a> only increases <a href="/tag/annex/">my infatuation with the software</a>.</p> <p><a href="https://www.flickr.com/photos/pigmonkey/29168947362/in/dateposted/" title="Data Data Data Data Data"><img src="https://c3.staticflickr.com/9/8405/29168947362_2c7ecc9a97_c.jpg" width="800" height="450" alt="Data Data Data Data Data"></a></p> <p>I&rsquo;ve been happy with the <a href="https://www.amazon.com/gp/product/B00MPWYLHO/">Seagate ST2000LM003</a> drives for this application. Unfortunately the enclosures I first purchased did not work out so well. I had two die within a few weeks. They&rsquo;ve been replaced with the <a href="https://www.amazon.com/gp/product/B00YT6TOJO/">SIG JU-SA0Q12-S1</a>. These claim to be compatible with drives up to 8TB (someday I&rsquo;ll be able to buy 8TB 2.5&rdquo; drives) and support USB 3.1. They&rsquo;re also a bit thinner than the previous enclosures, so I can easily fit five in my box. The Seagate drives offer about 1.7 terabytes of usable space, giving this setup a total capacity of 8.5 terabytes.</p> <p>Setting up git-annex to support this type of cold storage is fairly straightforward, but does necessitate some familiarity with how the program works. Personally, I prefer to do all my setup manually. I&rsquo;m happy to let the <a href="http://git-annex.branchable.com/assistant/">assistant</a> watch my repositories and manage them after the setup, and I&rsquo;ll occasionally fire up the <a href="https://git-annex.branchable.com/design/assistant/webapp/">web app</a> to see what the assistant daemon is doing, but I like the control and understanding provided by a manual setup. The power and flexibility of git-annex is deceptive. 
Using it solely through the simplified interface of the web app greatly limits what can be accomplished with it.</p> <h2>Encryption</h2> <p>Before even getting into git-annex, the drive should be encrypted with <a href="https://en.wikipedia.org/wiki/Linux_Unified_Key_Setup">LUKS</a>/<a href="https://en.wikipedia.org/wiki/Dm-crypt">dm-crypt</a>. The need for this could be avoided by using something like <a href="https://git-annex.branchable.com/special_remotes/gcrypt/">gcrypt</a>, but LUKS/dm-crypt is an ingrained habit and part of my workflow for all external drives. Assuming the drive is <code>/dev/sdc</code>, pass <code>cryptsetup</code> some sane defaults:</p> <div class="highlight"><pre><span></span><code>$ sudo cryptsetup --cipher aes-xts-plain64 --key-size <span class="m">512</span> --hash sha512 luksFormat /dev/sdc
</code></pre></div> <p>With the drive encrypted, it can then be opened and formatted. I&rsquo;ll give the drive a human-friendly label of <code>themisto</code>.</p> <div class="highlight"><pre><span></span><code>$ sudo cryptsetup luksOpen /dev/sdc themisto_crypt
$ sudo mkfs.ext4 -L themisto /dev/mapper/themisto_crypt
</code></pre></div> <p>At this point the drive is ready. I close it and then mount it with <a href="https://github.com/coldfix/udiskie">udiskie</a> to make sure everything is working. How the drive is mounted doesn&rsquo;t matter, but I like udiskie because it can <a href="https://github.com/pigmonkey/dotfiles/blob/master/config/udiskie/config.yml#L5">integrate with my password manager</a> to get the drive passphrase.</p> <div class="highlight"><pre><span></span><code>$ sudo cryptsetup luksClose /dev/mapper/themisto_crypt
$ udiskie-mount -r /dev/sdc
</code></pre></div> <h2>Git-Annex</h2> <p>With the encryption handled, the drive should now be mounted at <code>/media/themisto</code>. For the first few steps, we&rsquo;ll basically follow the <a href="https://git-annex.branchable.com/walkthrough/">git-annex walkthrough</a>. 
Let&rsquo;s assume that we are setting up this drive to be a repository of the annex <code>~/video</code>. The first step is to go to the drive, clone the repository, and initialize the annex. When initializing the annex I prefix the name of the remote with <code>satellite :</code>. My cold storage drives are all named after satellites, and doing this allows me to easily identify them when looking at a list of remotes.</p> <div class="highlight"><pre><span></span><code>$ <span class="nb">cd</span> /media/themisto
$ git clone ~/video
$ <span class="nb">cd</span> video
$ git annex init <span class="s2">&quot;satellite : themisto&quot;</span>
</code></pre></div> <h3>Disk Reserve</h3> <p>Whenever dealing with a repository that is bigger (or may become bigger) than the drive it is being stored on, it is important to set a disk reserve. This tells git-annex to always keep some free space around. I generally like to set this to 1 GB, which is way larger than it needs to be.</p> <div class="highlight"><pre><span></span><code>$ git config annex.diskreserve <span class="s2">&quot;1 gb&quot;</span> </code></pre></div> <h3>Adding Remotes</h3> <p>I&rsquo;ll then tell this new repository where the original repository is located. In this case I&rsquo;ll refer to the original using the name of my computer, <code>nous</code>.</p> <div class="highlight"><pre><span></span><code>$ git remote add nous ~/video </code></pre></div> <p>If other remotes already exist, now is a good time to add them. These could be <a href="https://git-annex.branchable.com/special_remotes/">special remotes</a> or normal ones. 
For this example, let&rsquo;s say that we have already completed this whole process for another cold storage drive called <code>sinope</code>, and that we have an <a href="https://git-annex.branchable.com/special_remotes/S3/">s3</a> remote creatively named <code>s3</code>.</p> <div class="highlight"><pre><span></span><code>$ git remote add sinope /media/sinope/video
$ <span class="nb">export</span> <span class="nv">AWS_ACCESS_KEY_ID</span><span class="o">=</span><span class="s2">&quot;...&quot;</span>
$ <span class="nb">export</span> <span class="nv">AWS_SECRET_ACCESS_KEY</span><span class="o">=</span><span class="s2">&quot;...&quot;</span>
$ git annex enableremote s3
</code></pre></div> <h3>Trust</h3> <p><a href="https://git-annex.branchable.com/trust/">Trust</a> is a critical component of how git-annex works. Any new annex will default to being semi-trusted, which means that when running operations within the annex on the main computer &ndash; say, dropping a file &ndash; git-annex will want to confirm that <code>themisto</code> has the files that it is supposed to have. In the case of <code>themisto</code> being a USB drive that is rarely connected, this is not very useful. I tell git-annex to trust my cold storage drives, which means that if git-annex has a record of a certain file being on the drive, it will be satisfied with that. This increases the risk of data loss, but for this application I feel it is appropriate.</p> <div class="highlight"><pre><span></span><code>$ git annex trust . </code></pre></div> <h3>Preferred Content</h3> <p>The final step that needs to be taken on the new repository is to tell it what files it should want. This is done using <a href="https://git-annex.branchable.com/preferred_content/">preferred content</a>. The <a href="https://git-annex.branchable.com/preferred_content/standard_groups/">standard groups</a> that git-annex ships with cover most of the bases. 
Of interest for this application is the <code>archive</code> group, which wants all content except that which has already found its way to another archive. This is the behaviour I want, but I will duplicate it into a custom group called <code>satellite</code>. This keeps my cold storage drives as standalone things that do not influence any other remotes where I may want to use the default <code>archive</code>.</p> <div class="highlight"><pre><span></span><code>$ git annex groupwanted satellite <span class="s2">&quot;(not copies=satellite:1) or approxlackingcopies=1&quot;</span>
$ git annex group . satellite
$ git annex wanted . groupwanted
</code></pre></div> <p>For other repositories, I may want to store the data on multiple cold storage drives. In that case I would create a <code>redundantsatellite</code> group that wants all content which is not already present in two other members of the group.</p> <div class="highlight"><pre><span></span><code>$ git annex groupwanted redundantsatellite <span class="s2">&quot;(not copies=redundantsatellite:2) or approxlackingcopies=1&quot;</span>
$ git annex group . redundantsatellite
$ git annex wanted . groupwanted
</code></pre></div> <h3>Syncing</h3> <p>With everything set up, the new repository is ready to sync and start ingesting content from the remotes it knows about!</p> <div class="highlight"><pre><span></span><code>$ git annex sync --content </code></pre></div> <p>However, the original repository also needs to know about the new remote.</p> <div class="highlight"><pre><span></span><code>$ <span class="nb">cd</span> ~/video
$ git remote add themisto /media/themisto/video
$ git annex sync
</code></pre></div> <p>The same is the case for any other previously existing repository, such as <code>sinope</code>.</p>Redundant File Storage2016-08-19T00:00:00-07:002016-08-19T20:27:23-07:00Pig Monkeytag:pig-monkey.com,2016-08-19:/2016/08/rclone/<p>As I&rsquo;ve <a href="/tag/annex/">mentioned previously</a>, I store just about everything that matters in <a href="https://git-annex.branchable.com/">git-annex</a> (the only exception is code, which is stored directly in regular git). One of git-annex&rsquo;s many killer features is <a href="https://git-annex.branchable.com/special_remotes/">special remotes</a>. They make tenable this whole &ldquo;cloud storage&rdquo; thing that we do now.</p> <p>A special remote allows me to store my files with a large number of service providers. It makes this easy to do by abstracting away the particulars of the provider, allowing me to interact with all of them in the same way. 
It makes this safe to do by providing <a href="https://git-annex.branchable.com/encryption/">encryption</a>. These factors encourage redundancy, reducing my reliance on any one provider.</p> <p>Recently I began playing with <a href="http://rclone.org/">rclone</a>. Rclone is a program that supports file syncing for a handful of cloud storage providers. That&rsquo;s semi-interesting by itself but, more significantly, there is <a href="https://github.com/DanielDent/git-annex-remote-rclone">a git-annex special remote wrapper</a>. That means any of the providers supported by rclone can be used as a special remote. I looked through all of rclone&rsquo;s supported providers and decided there were a few that I had no reason not to use.</p> <h2>Hubic</h2> <p><a href="https://hubic.com/en/">Hubic</a> is a storage provider from <a href="https://www.ovh.com/us/">OVH</a> with a data center in France. Their <a href="https://hubic.com/en/offers/">pricing</a> is attractive. I&rsquo;d happily pay €50 per year for 10TB of storage. Unfortunately they limit connections to 10 Mbit/s. In my experience they ended up being even slower than this. 
Slow enough that I don&rsquo;t want to give them money, but there&rsquo;s still no reason not to take advantage of their free 25 GB plan.</p> <p>After signing up, I <a href="http://rclone.org/hubic/">set up a new remote in rclone</a>.</p> <div class="highlight"><pre><span></span><code><span class="err">$</span><span class="w"> </span><span class="n">rclone</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="n">n</span><span class="p">)</span><span class="w"> </span><span class="k">New</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="k">Set</span><span class="w"> </span><span class="n">configuration</span><span class="w"> </span><span class="n">password</span><span class="w"></span> <span class="n">q</span><span class="p">)</span><span class="w"> </span><span class="n">Quit</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="n">n</span><span class="o">/</span><span class="n">s</span><span class="o">/</span><span class="n">q</span><span class="o">&gt;</span><span class="w"> </span><span class="n">n</span><span class="w"></span> <span class="n">name</span><span class="o">&gt;</span><span class="w"> </span><span class="n">hubic</span><span class="o">-</span><span class="n">annex</span><span class="w"></span> <span class="n">Type</span><span class="w"> </span><span class="k">of</span><span class="w"> </span><span class="n">storage</span><span class="w"> </span><span class="k">to</span><span class="w"> </span><span class="n">configure</span><span class="p">.</span><span class="w"></span> <span class="nf">Choose</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">number</span><span class="w"> </span><span class="k">from</span><span class="w"> </span><span class="n">below</span><span class="p">,</span><span class="w"> 
</span><span class="ow">or</span><span class="w"> </span><span class="n">type</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">own</span><span class="w"> </span><span class="k">value</span><span class="w"></span> <span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Amazon</span><span class="w"> </span><span class="n">Drive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;amazon cloud drive&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Amazon</span><span class="w"> </span><span class="n">S3</span><span class="w"> </span><span class="p">(</span><span class="n">also</span><span class="w"> </span><span class="n">Dreamhost</span><span class="p">,</span><span class="w"> </span><span class="n">Ceph</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;s3&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">3</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Backblaze</span><span class="w"> </span><span class="n">B2</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;b2&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">4</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Dropbox</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;dropbox&quot;</span><span class="w"></span> <span class="w"> </span><span 
class="mi">5</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Cloud</span><span class="w"> </span><span class="n">Storage</span><span class="w"> </span><span class="p">(</span><span class="n">this</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Drive</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;google cloud storage&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">6</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Drive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;drive&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">7</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Hubic</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;hubic&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="k">Local</span><span class="w"> </span><span class="k">Disk</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;local&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">9</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Microsoft</span><span class="w"> </span><span class="n">OneDrive</span><span class="w"></span> <span class="w"> 
</span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;onedrive&quot;</span><span class="w"></span> <span class="mi">10</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Openstack</span><span class="w"> </span><span class="n">Swift</span><span class="w"> </span><span class="p">(</span><span class="n">Rackspace</span><span class="w"> </span><span class="n">Cloud</span><span class="w"> </span><span class="n">Files</span><span class="p">,</span><span class="w"> </span><span class="n">Memset</span><span class="w"> </span><span class="n">Memstore</span><span class="p">,</span><span class="w"> </span><span class="n">OVH</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;swift&quot;</span><span class="w"></span> <span class="mi">11</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Yandex</span><span class="w"> </span><span class="k">Disk</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;yandex&quot;</span><span class="w"></span> <span class="n">Storage</span><span class="o">&gt;</span><span class="w"> </span><span class="mi">7</span><span class="w"></span> <span class="n">Hubic</span><span class="w"> </span><span class="n">Client</span><span class="w"> </span><span class="n">Id</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">leave</span><span class="w"> </span><span class="n">blank</span><span class="w"> </span><span class="n">normally</span><span class="p">.</span><span class="w"></span> <span class="n">client_id</span><span class="o">&gt;</span><span class="w"> </span> <span class="n">Hubic</span><span class="w"> </span><span class="n">Client</span><span class="w"> </span><span class="n">Secret</span><span class="w"> </span><span 
class="o">-</span><span class="w"> </span><span class="n">leave</span><span class="w"> </span><span class="n">blank</span><span class="w"> </span><span class="n">normally</span><span class="p">.</span><span class="w"></span> <span class="n">client_secret</span><span class="o">&gt;</span><span class="w"> </span> <span class="n">Remote</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="k">Use</span><span class="w"> </span><span class="n">auto</span><span class="w"> </span><span class="n">config</span><span class="vm">?</span><span class="w"></span> <span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">Say</span><span class="w"> </span><span class="n">Y</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="n">sure</span><span class="w"></span> <span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">Say</span><span class="w"> </span><span class="n">N</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="k">are</span><span class="w"> </span><span class="n">working</span><span class="w"> </span><span class="k">on</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">remote</span><span class="w"> </span><span class="ow">or</span><span class="w"> </span><span class="n">headless</span><span class="w"> </span><span class="n">machine</span><span class="w"></span> <span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="n">Yes</span><span class="w"></span> <span class="n">n</span><span class="p">)</span><span class="w"> </span><span class="k">No</span><span class="w"></span> <span class="n">y</span><span class="o">/</span><span class="n">n</span><span class="o">&gt;</span><span class="w"> </span><span class="n">y</span><span 
class="w"></span> <span class="k">If</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">browser</span><span class="w"> </span><span class="n">doesn</span><span class="err">&#39;</span><span class="n">t</span><span class="w"> </span><span class="k">open</span><span class="w"> </span><span class="n">automatically</span><span class="w"> </span><span class="k">go</span><span class="w"> </span><span class="k">to</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">link</span><span class="p">:</span><span class="w"> </span><span class="nl">http</span><span class="p">:</span><span class="o">//</span><span class="mf">127.0.0.1</span><span class="err">:</span><span class="mi">53682</span><span class="o">/</span><span class="n">auth</span><span class="w"></span> <span class="nf">Log</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="ow">and</span><span class="w"> </span><span class="n">authorize</span><span class="w"> </span><span class="n">rclone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">access</span><span class="w"></span> <span class="n">Waiting</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">code</span><span class="p">...</span><span class="w"></span> <span class="n">Got</span><span class="w"> </span><span class="n">code</span><span class="w"></span> <span class="o">--------------------</span><span class="w"></span> <span class="o">[</span><span class="n">remote</span><span class="o">]</span><span class="w"></span> <span class="n">client_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="n">client_secret</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="n">token</span><span class="w"> </span><span 
class="o">=</span><span class="w"> </span><span class="err">{</span><span class="ss">&quot;access_token&quot;</span><span class="err">:</span><span class="ss">&quot;XXXXXX&quot;</span><span class="err">}</span><span class="w"></span> <span class="o">--------------------</span><span class="w"></span> <span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="n">Yes</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">OK</span><span class="w"></span> <span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="n">Edit</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">d</span><span class="p">)</span><span class="w"> </span><span class="k">Delete</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">y</span><span class="o">/</span><span class="n">e</span><span class="o">/</span><span class="n">d</span><span class="o">&gt;</span><span class="w"> </span><span class="n">y</span><span class="w"></span> </code></pre></div> <p>With that setup, I went into my <code>~/documents</code> annex and added the remote.</p> <div class="highlight"><pre><span></span><code>$ git annex initremote hubic <span class="nv">type</span><span class="o">=</span>external <span class="nv">externaltype</span><span class="o">=</span>rclone <span class="nv">target</span><span class="o">=</span>hubic-annex <span class="nv">prefix</span><span class="o">=</span>annex-documents <span class="nv">chunk</span><span class="o">=</span>50MiB <span class="nv">encryption</span><span class="o">=</span>shared <span class="nv">rclone_layout</span><span class="o">=</span>lower <span class="nv">mac</span><span class="o">=</span>HMACSHA512 </code></pre></div> <p>I want git-annex to 
automatically send everything to Hubic, so I took advantage of <a href="https://git-annex.branchable.com/preferred_content/standard_groups/">standard groups</a> and put the repository in the <code>backup</code> group.</p> <div class="highlight"><pre><span></span><code>$ git annex wanted hubic standard
$ git annex group hubic backup
</code></pre></div> <p>Given Hubic&rsquo;s slow speed, I don&rsquo;t really want to download files from it unless I need to. This can be configured in git-annex by setting the cost of the remote. Local repositories default to 100 and remote repositories default to 200. I gave the Hubic remote a high cost so that it will only be used if no other remotes are available.</p> <div class="highlight"><pre><span></span><code>$ git config remote.hubic.annex-cost <span class="m">500</span> </code></pre></div> <p>If you would like to try Hubic, I have a <a href="https://hubic.com/home/new/?referral=FATDIA">referral code</a> which gives us both an extra 5GB for free.</p> <h2>Backblaze B2</h2> <p><a href="https://www.backblaze.com/b2/cloud-storage.html">B2</a> is the cloud storage offering from backup company <a href="https://www.backblaze.com/">Backblaze</a>. I don&rsquo;t know anything about them, but at $0.005 per GB per month I like their <a href="https://www.backblaze.com/b2/cloud-storage-providers.html">pricing</a>. A quick search of reviews shows that the main complaint about the service is that they offer no geographic redundancy, which is entirely irrelevant to me since I build my own redundancy with my half-dozen or so remotes per repository.</p> <p>Signing up with Backblaze took a bit longer. They wanted a phone number for two-factor authentication, I wanted to give them a credit card so that I could use more than the 10GB they offer for free, and I had to generate an application key to use with rclone. 
After that, the <a href="http://rclone.org/b2/">rclone setup</a> was simple.</p> <div class="highlight"><pre><span></span><code><span class="err">$</span><span class="w"> </span><span class="n">rclone</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="n">n</span><span class="p">)</span><span class="w"> </span><span class="k">New</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="k">Set</span><span class="w"> </span><span class="n">configuration</span><span class="w"> </span><span class="n">password</span><span class="w"></span> <span class="n">q</span><span class="p">)</span><span class="w"> </span><span class="n">Quit</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="n">n</span><span class="o">/</span><span class="n">s</span><span class="o">/</span><span class="n">q</span><span class="o">&gt;</span><span class="w"> </span><span class="n">n</span><span class="w"></span> <span class="n">name</span><span class="o">&gt;</span><span class="w"> </span><span class="n">b2</span><span class="o">-</span><span class="n">annex</span><span class="w"></span> <span class="n">Type</span><span class="w"> </span><span class="k">of</span><span class="w"> </span><span class="n">storage</span><span class="w"> </span><span class="k">to</span><span class="w"> </span><span class="n">configure</span><span class="p">.</span><span class="w"></span> <span class="nf">Choose</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">number</span><span class="w"> </span><span class="k">from</span><span class="w"> </span><span class="n">below</span><span class="p">,</span><span class="w"> </span><span class="ow">or</span><span class="w"> </span><span class="n">type</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span 
class="n">your</span><span class="w"> </span><span class="n">own</span><span class="w"> </span><span class="k">value</span><span class="w"></span> <span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Amazon</span><span class="w"> </span><span class="n">Drive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;amazon cloud drive&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Amazon</span><span class="w"> </span><span class="n">S3</span><span class="w"> </span><span class="p">(</span><span class="n">also</span><span class="w"> </span><span class="n">Dreamhost</span><span class="p">,</span><span class="w"> </span><span class="n">Ceph</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;s3&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">3</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Backblaze</span><span class="w"> </span><span class="n">B2</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;b2&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">4</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Dropbox</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;dropbox&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">5</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Cloud</span><span 
class="w"> </span><span class="n">Storage</span><span class="w"> </span><span class="p">(</span><span class="n">this</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Drive</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;google cloud storage&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">6</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Drive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;drive&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">7</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Hubic</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;hubic&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="k">Local</span><span class="w"> </span><span class="k">Disk</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;local&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">9</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Microsoft</span><span class="w"> </span><span class="n">OneDrive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;onedrive&quot;</span><span class="w"></span> <span class="mi">10</span><span class="w"> </span><span 
class="o">/</span><span class="w"> </span><span class="n">Openstack</span><span class="w"> </span><span class="n">Swift</span><span class="w"> </span><span class="p">(</span><span class="n">Rackspace</span><span class="w"> </span><span class="n">Cloud</span><span class="w"> </span><span class="n">Files</span><span class="p">,</span><span class="w"> </span><span class="n">Memset</span><span class="w"> </span><span class="n">Memstore</span><span class="p">,</span><span class="w"> </span><span class="n">OVH</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;swift&quot;</span><span class="w"></span> <span class="mi">11</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Yandex</span><span class="w"> </span><span class="k">Disk</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;yandex&quot;</span><span class="w"></span> <span class="n">Storage</span><span class="o">&gt;</span><span class="w"> </span><span class="mi">3</span><span class="w"></span> <span class="n">Account</span><span class="w"> </span><span class="n">ID</span><span class="w"></span> <span class="n">account</span><span class="o">&gt;</span><span class="w"> </span><span class="mi">123456789</span><span class="n">abc</span><span class="w"></span> <span class="n">Application</span><span class="w"> </span><span class="k">Key</span><span class="w"></span> <span class="k">key</span><span class="o">&gt;</span><span class="w"> </span><span class="mi">0123456789</span><span class="n">abcdef0123456789abcdef0123456789</span><span class="w"></span> <span class="n">Endpoint</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">service</span><span class="w"> </span><span class="o">-</span><span class="w"> 
</span><span class="n">leave</span><span class="w"> </span><span class="n">blank</span><span class="w"> </span><span class="n">normally</span><span class="p">.</span><span class="w"></span> <span class="n">endpoint</span><span class="o">&gt;</span><span class="w"> </span> <span class="n">Remote</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="o">--------------------</span><span class="w"></span> <span class="o">[</span><span class="n">remote</span><span class="o">]</span><span class="w"></span> <span class="n">account</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">123456789</span><span class="n">abc</span><span class="w"></span> <span class="k">key</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0123456789</span><span class="n">abcdef0123456789abcdef0123456789</span><span class="w"></span> <span class="n">endpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="o">--------------------</span><span class="w"></span> <span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="n">Yes</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">OK</span><span class="w"></span> <span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="n">Edit</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">d</span><span class="p">)</span><span class="w"> </span><span class="k">Delete</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">y</span><span class="o">/</span><span class="n">e</span><span class="o">/</span><span class="n">d</span><span class="o">&gt;</span><span 
class="w"> </span><span class="n">y</span><span class="w"></span> </code></pre></div> <p>With that, it was back to <code>~/documents</code> to initialize the remote and send it all the things.</p> <div class="highlight"><pre><span></span><code>$ git annex initremote b2 <span class="nv">type</span><span class="o">=</span>external <span class="nv">externaltype</span><span class="o">=</span>rclone <span class="nv">target</span><span class="o">=</span>b2-annex <span class="nv">prefix</span><span class="o">=</span>annex-documents <span class="nv">chunk</span><span class="o">=</span>50MiB <span class="nv">encryption</span><span class="o">=</span>shared <span class="nv">rclone_layout</span><span class="o">=</span>lower <span class="nv">mac</span><span class="o">=</span>HMACSHA512
$ git annex wanted b2 standard
$ git annex group b2 backup
</code></pre></div> <p>While I did not measure the speed with B2, it feels as fast as my <a href="https://aws.amazon.com/s3/">S3</a> or <a href="http://www.rsync.net/products/git-annex-pricing.html">rsync.net</a> remotes, so I didn&rsquo;t bother setting the cost.</p> <h2>Google Drive</h2> <p>While I do not regularly use Google services for personal things, I do have a Google account for Android stuff. 
Google Drive offers <a href="https://support.google.com/drive/answer/2375123?hl=en">15 GB of storage for free</a> and <a href="http://rclone.org/drive/">rclone supports it</a>, so why not take advantage?</p> <div class="highlight"><pre><span></span><code><span class="err">$</span><span class="w"> </span><span class="n">rclone</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="n">n</span><span class="p">)</span><span class="w"> </span><span class="k">New</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="k">Set</span><span class="w"> </span><span class="n">configuration</span><span class="w"> </span><span class="n">password</span><span class="w"></span> <span class="n">q</span><span class="p">)</span><span class="w"> </span><span class="n">Quit</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="n">n</span><span class="o">/</span><span class="n">s</span><span class="o">/</span><span class="n">q</span><span class="o">&gt;</span><span class="w"> </span><span class="n">n</span><span class="w"></span> <span class="n">name</span><span class="o">&gt;</span><span class="w"> </span><span class="n">gdrive</span><span class="o">-</span><span class="n">annex</span><span class="w"></span> <span class="n">Type</span><span class="w"> </span><span class="k">of</span><span class="w"> </span><span class="n">storage</span><span class="w"> </span><span class="k">to</span><span class="w"> </span><span class="n">configure</span><span class="p">.</span><span class="w"></span> <span class="nf">Choose</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">number</span><span class="w"> </span><span class="k">from</span><span class="w"> </span><span class="n">below</span><span class="p">,</span><span class="w"> </span><span class="ow">or</span><span 
class="w"> </span><span class="n">type</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="n">your</span><span class="w"> </span><span class="n">own</span><span class="w"> </span><span class="k">value</span><span class="w"></span> <span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Amazon</span><span class="w"> </span><span class="n">Drive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;amazon cloud drive&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Amazon</span><span class="w"> </span><span class="n">S3</span><span class="w"> </span><span class="p">(</span><span class="n">also</span><span class="w"> </span><span class="n">Dreamhost</span><span class="p">,</span><span class="w"> </span><span class="n">Ceph</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;s3&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">3</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Backblaze</span><span class="w"> </span><span class="n">B2</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;b2&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">4</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Dropbox</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;dropbox&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">5</span><span class="w"> </span><span 
class="o">/</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Cloud</span><span class="w"> </span><span class="n">Storage</span><span class="w"> </span><span class="p">(</span><span class="n">this</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Drive</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;google cloud storage&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">6</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Google</span><span class="w"> </span><span class="n">Drive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;drive&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">7</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Hubic</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;hubic&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="k">Local</span><span class="w"> </span><span class="k">Disk</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;local&quot;</span><span class="w"></span> <span class="w"> </span><span class="mi">9</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Microsoft</span><span class="w"> </span><span class="n">OneDrive</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> 
</span><span class="ss">&quot;onedrive&quot;</span><span class="w"></span> <span class="mi">10</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Openstack</span><span class="w"> </span><span class="n">Swift</span><span class="w"> </span><span class="p">(</span><span class="n">Rackspace</span><span class="w"> </span><span class="n">Cloud</span><span class="w"> </span><span class="n">Files</span><span class="p">,</span><span class="w"> </span><span class="n">Memset</span><span class="w"> </span><span class="n">Memstore</span><span class="p">,</span><span class="w"> </span><span class="n">OVH</span><span class="p">)</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;swift&quot;</span><span class="w"></span> <span class="mi">11</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Yandex</span><span class="w"> </span><span class="k">Disk</span><span class="w"></span> <span class="w"> </span><span class="err">\</span><span class="w"> </span><span class="ss">&quot;yandex&quot;</span><span class="w"></span> <span class="n">Storage</span><span class="o">&gt;</span><span class="w"> </span><span class="mi">6</span><span class="w"></span> <span class="n">Google</span><span class="w"> </span><span class="n">Application</span><span class="w"> </span><span class="n">Client</span><span class="w"> </span><span class="n">Id</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">leave</span><span class="w"> </span><span class="n">blank</span><span class="w"> </span><span class="n">normally</span><span class="p">.</span><span class="w"></span> <span class="n">client_id</span><span class="o">&gt;</span><span class="w"> </span> <span class="n">Google</span><span class="w"> </span><span class="n">Application</span><span class="w"> </span><span class="n">Client</span><span class="w"> 
</span><span class="n">Secret</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">leave</span><span class="w"> </span><span class="n">blank</span><span class="w"> </span><span class="n">normally</span><span class="p">.</span><span class="w"></span> <span class="n">client_secret</span><span class="o">&gt;</span><span class="w"> </span> <span class="n">Remote</span><span class="w"> </span><span class="n">config</span><span class="w"></span> <span class="k">Use</span><span class="w"> </span><span class="n">auto</span><span class="w"> </span><span class="n">config</span><span class="vm">?</span><span class="w"></span> <span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">Say</span><span class="w"> </span><span class="n">Y</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="ow">not</span><span class="w"> </span><span class="n">sure</span><span class="w"></span> <span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">Say</span><span class="w"> </span><span class="n">N</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="k">are</span><span class="w"> </span><span class="n">working</span><span class="w"> </span><span class="k">on</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">remote</span><span class="w"> </span><span class="ow">or</span><span class="w"> </span><span class="n">headless</span><span class="w"> </span><span class="n">machine</span><span class="w"> </span><span class="ow">or</span><span class="w"> </span><span class="n">Y</span><span class="w"> </span><span class="n">didn</span><span class="s1">&#39;t work</span> <span class="s1">y) Yes</span> <span class="s1">n) No</span> <span class="s1">y/n&gt; y</span> <span class="s1">If your browser doesn&#39;</span><span 
class="n">t</span><span class="w"> </span><span class="k">open</span><span class="w"> </span><span class="n">automatically</span><span class="w"> </span><span class="k">go</span><span class="w"> </span><span class="k">to</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">following</span><span class="w"> </span><span class="nl">link</span><span class="p">:</span><span class="w"> </span><span class="nl">http</span><span class="p">:</span><span class="o">//</span><span class="mf">127.0.0.1</span><span class="err">:</span><span class="mi">53682</span><span class="o">/</span><span class="n">auth</span><span class="w"></span> <span class="nf">Log</span><span class="w"> </span><span class="ow">in</span><span class="w"> </span><span class="ow">and</span><span class="w"> </span><span class="n">authorize</span><span class="w"> </span><span class="n">rclone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">access</span><span class="w"></span> <span class="n">Waiting</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">code</span><span class="p">...</span><span class="w"></span> <span class="n">Got</span><span class="w"> </span><span class="n">code</span><span class="w"></span> <span class="o">--------------------</span><span class="w"></span> <span class="o">[</span><span class="n">remote</span><span class="o">]</span><span class="w"></span> <span class="n">client_id</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="n">client_secret</span><span class="w"> </span><span class="o">=</span><span class="w"> </span> <span class="n">token</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="err">{</span><span class="ss">&quot;AccessToken&quot;</span><span class="err">:</span><span 
class="ss">&quot;xxxx.x.xxxxx_xxxxxxxxxxx_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&quot;</span><span class="p">,</span><span class="ss">&quot;RefreshToken&quot;</span><span class="err">:</span><span class="ss">&quot;1/xxxxxxxxxxxxxxxx_xxxxxxxxxxxxxxxxxxxxxxxxxx&quot;</span><span class="p">,</span><span class="ss">&quot;Expiry&quot;</span><span class="err">:</span><span class="ss">&quot;2014-03-16T13:57:58.955387075Z&quot;</span><span class="p">,</span><span class="ss">&quot;Extra&quot;</span><span class="err">:</span><span class="k">null</span><span class="err">}</span><span class="w"></span> <span class="o">--------------------</span><span class="w"></span> <span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="n">Yes</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="k">is</span><span class="w"> </span><span class="n">OK</span><span class="w"></span> <span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="n">Edit</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">d</span><span class="p">)</span><span class="w"> </span><span class="k">Delete</span><span class="w"> </span><span class="n">this</span><span class="w"> </span><span class="n">remote</span><span class="w"></span> <span class="n">y</span><span class="o">/</span><span class="n">e</span><span class="o">/</span><span class="n">d</span><span class="o">&gt;</span><span class="w"> </span><span class="n">y</span><span class="w"></span> </code></pre></div> <p>And again, to <code>~/documents</code>.</p> <div class="highlight"><pre><span></span><code>$ git annex initremote gdrive <span class="nv">type</span><span class="o">=</span>external <span class="nv">externaltype</span><span class="o">=</span>rclone <span class="nv">target</span><span class="o">=</span>gdrive-annex <span class="nv">prefix</span><span 
class="o">=</span>annex-documents <span class="nv">chunk</span><span class="o">=</span>50MiB <span class="nv">encryption</span><span class="o">=</span>shared <span class="nv">rclone_layout</span><span class="o">=</span>lower <span class="nv">mac</span><span class="o">=</span>HMACSHA512
$ git annex wanted gdrive standard
$ git annex group gdrive backup
</code></pre></div> <p>Rinse and repeat the process for other annexes. Revel in having simple, secure, and redundant storage.</p>I celebrated World Backup Day by increasing the resiliency of data in my life.2016-03-31T00:00:00-07:002016-08-19T20:03:18-07:00Pig Monkeytag:pig-monkey.com,2016-03-31:/2016/03/backup/<p>Four <a href="https://wiki.archlinux.org/index.php/Dm-crypt">encrypted</a> 2TB hard drives, stored in a <a href="http://www.pelican.com/us/en/product/watertight-protector-hard-cases/small-case/standard/1200/">Pelican 1200</a>, with <a href="https://securitysnobs.com/Abloy-Protec2-PL-321-Padlock.html">Abloy Protec2 PL 321</a> padlocks as tamper-evident seals. Having everything that matters stored in <a href="https://git-annex.branchable.com/">git-annex</a> makes projects like this simple: just clone the repositories, define the <a href="https://git-annex.branchable.com/preferred_content/">preferred content expressions</a>, and watch the magic happen.</p> <p><a href="https://www.flickr.com/photos/pigmonkey/25889491200/in/dateposted/" title="Cold Storage"><img src="https://farm2.staticflickr.com/1624/25889491200_7b962ddfd0_c.jpg" width="800" height="450" alt="Cold Storage"></a></p>Antisocial Activity Tracking2015-08-04T00:00:00-07:002019-11-04T19:03:06-08:00Pig Monkeytag:pig-monkey.com,2015-08-04:/2015/08/gpx/<p>A GPS track provides a useful log of physical activities. Beyond simply recording a route, the series of coordinate and time mappings allows statistics like distance, speed, elevation, and time to be calculated. 
I recently decided that I wanted to start recording this information, but I was not interested …</p><p>A GPS track provides a useful log of physical activities. Beyond simply recording a route, the series of coordinate and time mappings allows statistics like distance, speed, elevation, and time to be calculated. I recently decided that I wanted to start recording this information, but I was not interested in any of the plethora of social, cloud-based services that are hip these days. A simple <a href="https://en.wikipedia.org/wiki/GPS_eXchange_Format">GPX track</a> gives me all the information I care about, and I don&rsquo;t have a strong desire to share my tracks with a third-party provider or a social network.</p> <h2>Recording Tracks</h2> <p>The discovery of <a href="http://code.mendhak.com/gpslogger/">GPSLogger</a> is what made me excited to start this project. A simple but powerful Android application, GPSLogger will log to a number of different formats and, when a track is complete, automatically distribute it. This can be done by uploading the file to a storage provider, emailing it, or posting it to a custom URL. It always logs in metric units but optionally displays in Imperial.</p> <p>What makes GPSLogger really stand out are its performance features. It allows very fine-grained control over GPS use, which allows tracks to be recorded for extended periods of time (such as days) with a negligible impact on battery usage.</p> <p>For activities like running, shorter hikes and bicycle rides I tend to err on the side of accuracy. I set GPSLogger to log a coordinate every 10 seconds, with a minimum distance of 5 meters between points and a minimum accuracy of 10 meters. It will try to get a fix for 120 seconds before timing out, and attempt to meet the accuracy requirement for 60 seconds before giving up.</p> <p>For a longer day-hike, the time between points could be increased to something in the neighborhood of 60 seconds. 
For a multi-day backpacking trip, a setting of 10 minutes or more would still provide great enough accuracy to make for a useful record of the route. I&rsquo;ve found that being able to control these settings really opens up a lot of tracking possibilities that I would otherwise not consider for fear of battery drain.</p> <p><a data-flickr-embed="true" href="https://www.flickr.com/photos/pigmonkey/20116407608/in/dateposted/" title="GPSLogger"><img src="https://farm1.staticflickr.com/549/20116407608_bafd5c9a3a_c.jpg" width="800" height="534" alt="GPSLogger"></a></p> <h2>Storing Tracks</h2> <p>After a track has been recorded, I transfer it to my computer and store it with <a href="https://git-annex.branchable.com/">git-annex</a>.</p> <p>Everything in my home directory that is not a temporary file is stored either in git or git-annex. By keeping my tracks in an annex rather than directly in git, I can take advantage of git-annex&rsquo;s powerful <a href="https://git-annex.branchable.com/metadata/">metadata</a> support. GPSLogger automatically names tracks with a time stamp, but the annex for my tracks is also configured to <a href="https://git-annex.branchable.com/tips/automatically_adding_metadata/">automatically set the year and month when adding files</a>.</p> <div class="highlight"><pre><span></span><code>$ <span class="nb">cd</span> ~/tracks
$ git config annex.genmetadata <span class="nb">true</span>
</code></pre></div> <p>After moving a track into the annex, I&rsquo;ll tag it with a custom <code>activity</code> field, with values like <code>run</code>, <code>hike</code>, or <code>bike</code>.</p> <div class="highlight"><pre><span></span><code>$ git annex metadata --set <span class="nv">activity</span><span class="o">=</span>bike <span class="m">20150725110839</span>.gpx
</code></pre></div> <p>I also find it useful to tag tracks with a gross location value so that I can get an idea of where they were recorded without loading them on a map. 
Counties tend to work well for this.</p> <div class="highlight"><pre><span></span><code>$ git annex metadata --set <span class="nv">county</span><span class="o">=</span>sanfrancisco <span class="m">20150725110839</span>.gpx
</code></pre></div> <p>Of course, a track may span multiple counties. This is easily handled by git-annex.</p> <div class="highlight"><pre><span></span><code>$ git annex metadata --set <span class="nv">county</span><span class="o">+=</span>marin <span class="m">20150725110839</span>.gpx
</code></pre></div> <p>One could also use fields to store location values such as National Park, National Forest or Wilderness Area.</p> <h3>Metadata Views</h3> <p>The reason for storing metadata is the ability to use <a href="https://git-annex.branchable.com/tips/metadata_driven_views/">metadata driven views</a>. This allows me to alter the directory structure of the annex based on the metadata. For instance, I can tell git-annex to show me all tracks grouped by year followed by activity.</p> <div class="highlight"><pre><span></span><code>$ git annex view <span class="s2">&quot;year=*&quot;</span> <span class="s2">&quot;activity=*&quot;</span>
$ tree -d
.
└── <span class="m">2015</span>
    ├── bike
    ├── hike
    └── run
</code></pre></div> <p>Or, I could ask to see all the runs I went on this July.</p> <div class="highlight"><pre><span></span><code>$ git annex view <span class="nv">year</span><span class="o">=</span><span class="m">2015</span> <span class="nv">month</span><span class="o">=</span><span class="m">07</span> <span class="nv">activity</span><span class="o">=</span>run
</code></pre></div> <p>I&rsquo;ve found this to be a super powerful tool. It gives me the simplicity and flexibility of storing the tracks as plain-text on the filesystem, with some of the querying possibilities of a database. 
Its usefulness is only limited by the metadata stored.</p> <h2>Viewing Tracks</h2> <p>For simple statistics, I&rsquo;ll use the <code>gpxinfo</code> command provided by <a href="https://github.com/tkrajina/gpxpy">gpxpy</a>. This gives me the basics of time, distance and speed, which is generally all I care about for something like a weekly run.</p> <div class="highlight"><pre><span></span><code>$ gpxinfo <span class="m">20150725110839</span>.gpx
File: <span class="m">20150725110839</span>.gpx
    Length 2D: <span class="m">6</span>.081km
    Length 3D: <span class="m">6</span>.123km
    Moving time: <span class="m">00</span>:35:05
    Stopped time: n/a
    Max speed: <span class="m">3</span>.54m/s <span class="o">=</span> <span class="m">12</span>.74km/h
    Total uphill: <span class="m">96</span>.50m
    Total downhill: <span class="m">130</span>.50m
    Started: <span class="m">2015</span>-07-25 <span class="m">18</span>:08:45
    Ended: <span class="m">2015</span>-07-25 <span class="m">18</span>:43:50
    Points: <span class="m">188</span>
    Avg distance between points: <span class="m">32</span>.35m
    Track <span class="c1">#0, Segment #0</span>
        Length 2D: <span class="m">6</span>.081km
        Length 3D: <span class="m">6</span>.123km
        Moving time: <span class="m">00</span>:35:05
        Stopped time: n/a
        Max speed: <span class="m">3</span>.54m/s <span class="o">=</span> <span class="m">12</span>.74km/h
        Total uphill: <span class="m">96</span>.50m
        Total downhill: <span class="m">130</span>.50m
        Started: <span class="m">2015</span>-07-25 <span class="m">18</span>:08:45
        Ended: <span class="m">2015</span>-07-25 <span class="m">18</span>:43:50
        Points: <span class="m">188</span>
        Avg distance between points: <span class="m">32</span>.35m
</code></pre></div> <p>For a more detailed inspection of the tracks, I opt for <a href="http://sourceforge.net/projects/viking/">Viking</a>. 
This allows me to load the tracks and view the route on an OpenStreetMap map (or any number of other map layers, such as USGS quads or Bing aerial photography). It includes all the detailed statistics you could care about extracting from a GPX track, including pretty charts of elevation, distance, time and speed.</p> <p>If I want to view the track on my phone before I&rsquo;ve transferred it to my computer, I&rsquo;ll load it in either <a href="http://backcountrynavigator.com/">BackCountry Navigator</a> or <a href="http://osmand.net/">OsmAnd</a>, depending on what kind of map layers I am interested in seeing. For simply viewing the statistics of a track on the phone, I go with <a href="https://play.google.com/store/apps/details?id=com.mendhak.gpsvisualizer">GPS Visualizer</a> (by the same author as GPSLogger).</p>Optical Backups of Photo Archives2013-05-29T00:00:00-07:002013-05-29T00:00:00-07:00Pig Monkeytag:pig-monkey.com,2013-05-29:/2013/05/optical-photo-backups/<p>I store my photos in <a href="http://git-annex.branchable.com/">git-annex</a>. A full copy of the annex exists on my laptop and on an external drive. Encrypted copies of all of my photos are stored on <a href="https://aws.amazon.com/s3/">Amazon S3</a> (which I pay for) and <a href="https://www.box.com/">box.com</a> (which provides 50GB for free) via git-annex <a href="http://git-annex.branchable.com/special_remotes/">special remotes</a>. The …</p><p>I store my photos in <a href="http://git-annex.branchable.com/">git-annex</a>. A full copy of the annex exists on my laptop and on an external drive. Encrypted copies of all of my photos are stored on <a href="https://aws.amazon.com/s3/">Amazon S3</a> (which I pay for) and <a href="https://www.box.com/">box.com</a> (which provides 50GB for free) via git-annex <a href="http://git-annex.branchable.com/special_remotes/">special remotes</a>. 
The photos are backed up to an external drive daily with the rest of my laptop hard drive via <a href="/2012/10/back-it-up/">backitup.sh</a> and <a href="/2012/09/cryptshot-automated-encrypted-backups-rsnapshot/">cryptshot</a>. My entire laptop hard drive is also mirrored monthly to an external drive stored off-site.</p> <p>(The majority of my photos are also <a href="http://www.flickr.com/photos/pigmonkey/">on Flickr</a>, but I don&rsquo;t consider that a backup or even reliable storage.)</p> <p>All of this is what I consider to be the bare minimum for any redundant data storage. Photos have special value, above the value that I assign to most other data. This value only increases with age. As such they require an additional backup method, but due to the size of my collection I want to avoid backup methods that involve paying for more online storage, such as <a href="/2012/09/tarsnapper-managing-tarsnap-backups/">Tarsnap</a>.</p> <p>I choose optical discs as the medium for my photo backups. This has the advantage of being read-only, which makes it more difficult for accidental deletions or corruption to propagate through the backup system. DVD-Rs have a capacity of 4.7 GB and a cost of around $0.25 per disc. Their life expectancy varies, but 10 years seems to be a reasonable low estimate.</p> <h2>Preparation</h2> <p>I keep all of my photos in year-based directories. At the beginning of every year, the previous year&rsquo;s directory is burned to a DVD.</p> <p>Certain years contain few enough photos that the entire year can fit on a single DVD. More recent years have enough photos of a high enough resolution that they require multiple DVDs.</p> <h3>Archive</h3> <p>My first step is to build a compressed archive of each year. 
I choose <a href="http://www.gnu.org/software/tar/">tar</a> and <a href="http://en.wikipedia.org/wiki/Bzip2">bzip2</a> compression for this because they&rsquo;re simple and reliable.</p> <div class="highlight"><pre><span></span><code>$ <span class="nb">cd</span> ~/pictures
$ tar cjhf ~/tmp/pictures/2012.tar.bz <span class="m">2012</span>
</code></pre></div> <p>If the archive is larger than 3.7 GB, it needs to be split into multiple files. The resulting files will be burned to different discs. The capacity of a DVD is 4.7 GB, but I place the upper file limit at 3.7 GB so that the DVD has a minimum of 20% of its capacity available. This will be filled with parity information later on for redundancy.</p> <div class="highlight"><pre><span></span><code>$ split -d -b 3700M <span class="m">2012</span>.tar.bz <span class="m">2012</span>.tar.bz.
</code></pre></div> <h3>Encrypt</h3> <p>Leaving unencrypted data around is <a href="http://www.youtube.com/watch?v=OwHrlM4oVSI">bad form</a>. 
The archive (or each of the files resulting from splitting the large archive) is next encrypted and signed with <a href="http://www.gnupg.org/">GnuPG</a>.</p> <div class="highlight"><pre><span></span><code>$ gpg -eo <span class="m">2012</span>.tar.bz.gpg <span class="m">2012</span>.tar.bz
$ gpg -bo <span class="m">2012</span>.tar.bz.gpg.sig <span class="m">2012</span>.tar.bz.gpg
</code></pre></div> <h2>Imaging</h2> <p>The encrypted archive and the detached signature of the encrypted archive are what will be burned to the disc. (Or, in the case of a large archive, the encrypted splits of the full archive and the associated signatures will be burned to one disc per split/signature combination.) Rather than burning them directly, an image is created first.</p> <div class="highlight"><pre><span></span><code>$ mkisofs -V <span class="s2">&quot;Photos: 2012 1/1&quot;</span> -r -o <span class="m">2012</span>.iso <span class="m">2012</span>.tar.bz.gpg <span class="m">2012</span>.tar.bz.gpg.sig
</code></pre></div> <p>If the year has a split archive requiring multiple discs, I modify the sequence number in the volume label. For example, a year requiring 3 discs will have the label <code>Photos: 2012 1/3</code>.</p> <h3>Parity</h3> <p>When I began this project I knew that I wanted some sort of parity information for each disc so that I could potentially recover data from slightly damaged media. My initial idea was to use <a href="http://en.wikipedia.org/wiki/Parchive">parchive</a> via <a href="https://github.com/BlackIkeEagle/par2cmdline">par2cmdline</a>. 
Further research led me to <a href="http://dvdisaster.net/en/index.html">dvdisaster</a> which, despite being a GUI-only program, seemed more appropriate for this use case.</p> <p>Both dvdisaster and parchive use the same <a href="http://en.wikipedia.org/wiki/Reed–Solomon_error_correction">Reed–Solomon error correction codes</a>. Dvdisaster is aimed at optical media and has the ability to place the error correction data on the disc by <a href="http://dvdisaster.net/en/howtos30.html">augmenting the disc image</a>, as well as <a href="http://dvdisaster.net/en/howtos20.html">storing the data separately</a>. It can also <a href="http://dvdisaster.net/en/howtos10.html">scan media for errors</a> and assist in judging when the media is in danger of becoming defective. This makes it an attractive option for long-term storage.</p> <p>I use dvdisaster with the <a href="http://dvdisaster.net/en/howtos32.html">RS02</a> error correction method, which augments the image before burning. Depending on the size of the original image, this will result in the disc having anywhere from 20% to 200% redundancy.</p> <h3>Verify</h3> <p>After the image has been augmented, I mount it and verify the signature of the encrypted file on the disc against the local copy of the signature. I&rsquo;ve never had the signatures not match, but performing this step makes me feel better.</p> <div class="highlight"><pre><span></span><code>$ sudo mount -o loop <span class="m">2012</span>.iso /mnt/disc
$ gpg --verify <span class="m">2012</span>.tar.bz.gpg.sig /mnt/disc/2012.tar.bz.gpg
$ sudo umount /mnt/disc
</code></pre></div> <h3>Burn</h3> <p>The final step is to burn the augmented image. 
I always burn discs at low speeds to diminish the chance of errors during the process.</p> <div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code>$ cdrecord -v <span class="nv">speed</span><span class="o">=</span><span class="m">4</span> <span class="nv">dev</span><span class="o">=</span>/dev/sr0 <span class="m">2012</span>.iso
</code></pre></div></td></tr></table></div> <p>Similar to the optical backups of my <a href="/2013/04/password-management-vim-gnupg/">password database</a>, I burn two copies of each disc. One copy is stored off-site. This provides a reasonable level of assurance against any loss of my photos.</p>Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.2012-11-12T00:00:00-08:002012-11-18T00:00:00-08:00Pig Monkeytag:pig-monkey.com,2012-11-12:/2012/11/never-underestimate-bandwidth-station-wagon-full-tapes-hurtling-down-highway/<p>Terence Eden <a href="http://shkspr.mobi/blog/2012/11/smuggling-usb-sticks/">points out</a> that censorship becomes more difficult as flash memory devices become smaller and gain greater capacity. 
Case in point: Director <a href="https://en.wikipedia.org/wiki/Jafar_Panahi">Jafar Panahi</a> smuggled <a href="http://uk.imdb.com/title/tt1667905/">This Is Not a Film</a> out of Iran <a href="http://manila-bulletin.net/blog/2011/11/05/film-smuggled-in-usb-up-for-screening-at-13th-cinemanila/">on a flash-drive hidden in a cake</a>. For me, the practicality of the <a href="https://en.wikipedia.org/wiki/Sneakernet">sneakernet</a> became revitalized after I began using <a href="http://git-annex.branchable.com/">git-annex</a> earlier this year.</p>