pig-monkey.com


Redswitch

Redshift is a program that adjusts the color temperature of the screen based on time and location. It can automatically fetch one’s location via GeoClue. I’ve used it for years. It works most of the time. But, more often than I’d like, it fails to fetch my location from GeoClue. When this happens, I find GeoClue impossible to debug. Redshift does not cache location information, so when it fails to fetch my location the result is an eye-meltingly bright screen at night. To address this, I wrote a small shell script to avoid GeoClue entirely.

Redswitch fetches the current location via the Mozilla Location Service (using GeoClue’s API key, which may go away). The result is stored and compared against the previous location to determine if the device has moved. If a change in location is detected, Redshift is killed and relaunched with the new location (this will result in a noticeable flash, but there seems to be no alternative since Redshift cannot reload its settings while running). If Redshift is not running, it is launched. If no change in location is detected and Redshift is already running, nothing happens. Because the location information is stored, this can safely be used to launch Redshift when the machine is offline (or when the Mozilla Location Service API is down or rate-limited).
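The decision logic described above can be sketched roughly as follows. This is a minimal illustration rather than the actual Redswitch script: the cache path, the function name decide_action, and its two arguments are all assumptions made for the example, and the network lookup itself is left out.

```shell
#!/bin/sh
# Hypothetical cache path; the real script may store its state elsewhere.
CACHE="${REDSWITCH_CACHE:-${XDG_CACHE_HOME:-$HOME/.cache}/redswitch-location}"

# decide_action NEW_LOCATION RUNNING   (hypothetical helper)
#   NEW_LOCATION: "LAT:LON", or empty if the lookup failed
#   RUNNING:      "yes" if Redshift is already running, otherwise "no"
# Prints "start", "restart" or "nothing", then updates the cache.
decide_action() {
    new="$1"
    running="$2"
    old=""
    if [ -f "$CACHE" ]; then
        old=$(cat "$CACHE")
    fi

    # Offline or rate-limited: fall back to the last known location.
    if [ -z "$new" ]; then
        new="$old"
    fi

    if [ "$running" != "yes" ]; then
        echo start
    elif [ "$new" != "$old" ]; then
        echo restart
    else
        echo nothing
    fi

    # Store the location for the next run.
    if [ -n "$new" ]; then
        printf '%s\n' "$new" > "$CACHE"
    fi
}
```

A caller could use something like pgrep -x redshift to work out the RUNNING argument, and pass the stored LAT:LON value to redshift -l when starting or restarting it.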

My laptop does not experience frequent, drastic changes in location. I find that having the script automatically execute once upon login is adequate for my needs. If you’re jetting around the world, you could periodically execute the script via cron or a systemd timer.

This solves all my problems with Redshift. I can go back to forgetting about its existence, which is my goal for software of this sort.

Browser Extensions

I try to keep the number of browser extensions I use to a minimum. The following are what I find necessary in Firefox.

ClearURLs

ClearURLs removes extra cruft from URLs. I don’t really have a problem with things like UTM parameters. Such things seem reasonable to me. But, more broadly, digital advertising has proved itself hostile to my interests, so I choose to be hostile right back.

Cookie AutoDelete

Cookie AutoDelete deletes cookies after a tab is closed or the domain changes. I whitelist cookies for some of the services I run, like my RSS reader, but every other cookie gets deleted 10 seconds after I leave the site. The extension can also manage other data stores, like IndexedDB and Local Storage.

Feed Preview

Feed Preview adds an icon to the address bar when a page includes an RSS or Atom feed in its header. This used to be built in to Firefox, but for some inexplicable reason they removed it some years ago now. Removing the icon broke one of the core ways that I use a web browser. As the name suggests, the extension can also render a preview of the feed. I don’t use it for that. I just want my icon back.

Firefox Multi-Account Containers

Firefox Multi-Account Containers is a Mozilla-provided extension that creates different containers and assigns domains to them. In modern web browser parlance, a container means isolated storage. So a cookie in container A is not visible within container B, and vice versa.

Temporary Containers

Temporary Containers is the real workhorse of my containment strategy. It generates a new, temporary container for every domain. It automatically deletes the containers it generates 5 minutes after the last tab in that container is closed. This effectively isolates all domains from one another.

History Cleaner

History Cleaner deletes browser history that is older than 200 days. History is useful, but if I haven’t visited a URL in more than 200 days, I probably no longer care about it. Having all that cruft automatically cleaned out makes it easier to find what I’m looking for in the remaining history, and speeds up autocomplete in the address bar.

Redirector

Redirector lets you create pattern-based URL redirects. I use it to redirect Reddit URLs to Teddit, Twitter URLs to Nitter, and Wikipedia mobile URLs to the normal Wikipedia site.
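To illustrate what a pattern-based redirect does, here is a small sketch that applies the same kind of rewrite with sed. It is not Redirector's configuration format. The function name redirect is made up for the example, and teddit.example and nitter.example are placeholder hosts standing in for whichever instances one prefers.

```shell
#!/bin/sh
# redirect URL: print the rewritten URL, or the original if no rule matches.
# The rules and target hosts are illustrative placeholders only.
redirect() {
    printf '%s\n' "$1" | sed -E \
        -e 's#^https?://([a-z-]+)\.m\.wikipedia\.org/#https://\1.wikipedia.org/#' \
        -e 's#^https?://(www\.|old\.)?reddit\.com/#https://teddit.example/#' \
        -e 's#^https?://(www\.|mobile\.)?twitter\.com/#https://nitter.example/#'
}
```

Each rule captures the part of the URL worth keeping and substitutes the rest, which is the same idea as an include pattern paired with a redirect target.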

Stylus

Stylus allows custom CSS to be applied to websites. I use it to make websites less eye-burningly-bright. Dark Reader is another solution to this problem, but I found it to be somewhat resource intensive. Stylus lets me darken websites with no performance penalty.

Tree Style Tab

Tree Style Tab moves tabs from the default horizontal bar across the top of the browser chrome to a vertical sidebar, and allows the tabs to be placed into a nested tree-like hierarchy. In a recent-ish version of Firefox, Mozilla uglified the default horizontal tab bar. This was what finally pushed me into adopting tree style tabs. It took me a couple weeks to get used to it, but now I’m a convert. I wouldn’t want to use a browser without it. Unfortunately, the extension does seem to have a performance penalty. Not so much during normal use, but it definitely increases the time required to launch the browser. To me, it is worth it.

uBlock Origin

uBlock Origin blocks advertisements, malware, and other waste. This extension should need no introduction. The modern web is unusable without it. Until recently I used this in combination with uMatrix. I removed uMatrix when it was abandoned by the author, but was pleasantly surprised to find that current versions of uBlock Origin satisfy my needs in this department on their own.

User-Agent Switcher

User-Agent Switcher allows the user-agent string to be changed. It seems odd that the user would need an extension to change the user-agent string in their user agent, but here we are. I mostly use this for testing things.

Vim Vixen

Vim Vixen allows the browser to be controlled using vim-like keys. Back in those halcyon days before Mozilla broke their extension system, I switched between two extensions called Vimperator and Pentadactyl to accomplish this. Those were both complete extensions that were able to improve every interaction point with the browser. Vim Vixen is an inferior experience, but seems to be the best current solution. It’s mostly alright.

Wallabagger

Wallabagger lets me save articles to my Wallabag instance with a single click.

Web Archives

Web Archives allows web pages to be looked up in various archives. I just use it for quick access to the Internet Archive’s Wayback Machine.

This post was tagged with toolchain.

Organizing Ledger

Ledger is a double-entry accounting system that stores data in plain text. I began using it in 2012. Almost every dollar that has passed through my world since then is tracked by Ledger.[1]

Ledger is not the only plain text accounting system out there. It has inspired others, such as hledger and beancount. I began with Ledger for lack of a compelling argument in favor of the alternatives. After close to a decade of use, my only regret is that I didn’t start using it earlier.

My Ledger repository is stored at ~/library/ledger. This repository contains a data directory, which includes yearly Ledger journal files such as data/2019.ldg and data/2020.ldg. Ledger files don’t necessarily need to be split at all, but I like having one file per year. In January, after I clear the last transaction from the previous year, I know the year is locked and the file never gets touched again (unless I go back in to rejigger my account structure).

The root of the directory has a .ledger file which includes all of these data files, plus a special journal file with periodic transactions that I sometimes use for budgeting. My ~/.ledgerrc file tells Ledger to use the .ledger file as the primary journal, which has the effect of including all the yearly files.

$ cat ~/.ledgerrc
--file ~/library/ledger/.ledger
--date-format=%Y-%m-%d

$ cat ~/library/ledger/.ledger
include data/periodic.ldg
include data/2012.ldg
include data/2013.ldg
include data/2014.ldg
include data/2015.ldg
include data/2016.ldg
include data/2017.ldg
include data/2018.ldg
include data/2019.ldg
include data/2020.ldg

Ledger’s include format does support globbing (i.e. include data/*.ldg), but the ordering of the transactions can get weird, so I prefer to be explicit.
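Being explicit does not have to mean typing the list by hand. The following is a sketch of one way to generate it, assuming the layout described above (a data directory containing periodic.ldg and one file per four-digit year). The function name gen_includes is hypothetical.

```shell
#!/bin/sh
# gen_includes DIR: print the contents of a .ledger file for repository DIR.
gen_includes() {
    (
        cd "$1" || exit 1
        echo "include data/periodic.ldg"
        # Pathname expansion sorts its matches, so the years come out in order.
        for f in data/[0-9][0-9][0-9][0-9].ldg; do
            if [ -e "$f" ]; then
                echo "include $f"
            fi
        done
    )
}
```

Running gen_includes ~/library/ledger and redirecting the output to the .ledger file would refresh the list each January while keeping the include order fixed and visible.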

The repository also contains receipts in the receipts directory, invoices in the invoices directory, scans of checks (remember those?) in the checks directory, and CSV dumps from banks in the dump directory.

$ tree -d ~/library/ledger
/home/pigmonkey/library/ledger
├── checks
├── data
├── dump
├── invoices
└── receipts

5 directories

The repository is managed using a mix of vanilla git and git-annex.[2] It is important to me that the Ledger journal files in the data directory are stored directly in git. I want the ability to diff changes before committing them, and to be able to pull the history of those files. Every other file I want stored in git-annex. I don’t care about the history of files like PDF receipts. They never change. In fact, I want to make them read-only so I can’t accidentally change them. I want encrypted versions of them distributed to my numerous special remotes for safekeeping, and someday I may even want to drop old receipts or invoices from my local store so that they don’t take up disk space until I actually need to read them. That sounds like asking a lot, but git-annex magically solves all the problems with its largefiles configuration option.

$ cat ~/library/ledger/.gitattributes
*.ldg annex.largefiles=nothing

This tells git-annex that any file matching *.ldg should not be treated as a “large file”, which means it should be added directly to git. Any other file should be added to git-annex and locked, making it read-only. Having this configured means that I can just blindly git annex add . or git add . within the repository and git-annex will always do the right thing.
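One way to check how the attribute resolves for a given path is git check-attr, which is part of git itself and does not require git-annex. The helper below is a sketch, and the name largefiles_rule is made up for the example.

```shell
#!/bin/sh
# largefiles_rule FILE: print the annex.largefiles value git resolves for FILE.
# Must be run inside the repository. FILE does not need to exist.
largefiles_rule() {
    # git prints "<path>: annex.largefiles: <value>"; keep only the value.
    out=$(git check-attr annex.largefiles -- "$1") || return 1
    printf '%s\n' "${out##*: }"
}
```

With the .gitattributes file shown above, a journal file such as data/2020.ldg should report nothing, while a receipt should report unspecified, meaning no rule matched and the decision is left to git-annex's own defaults.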

I don’t run the git-annex assistant in this repository because I don’t want any automatic commits. As in a traditional git repository, I only commit changes to Ledger’s journal files after reviewing the diffs, and I want those commits to have meaningful messages.

Notes

  1. I do not always track miscellaneous cash transactions less than $20. If a thing costs more than that, it is worth tracking, regardless of what it is or how it was purchased. If it costs less than that, and it isn't part of a meaningful expense account, I'll probably let laziness win out. If I buy an $8 sandwich for lunch with cash, it'll get logged, because I care about tracking dining expenses. If I buy a $1 pencil eraser, I probably won't log it, because it isn't part of an account worth considering.
  2. I bet you saw that coming.

Searching Books

ripgrep-all is a small wrapper around ripgrep that adds support for additional file formats.

I discovered it while looking for a program that would allow me to search my e-book library without needing to open individual books and search their contents via Calibre. ripgrep-all accomplishes this by using Pandoc to convert files to plain text and then running ripgrep on the output. One of the numerous formats supported by Pandoc is EPUB, which is the format I use to store books.

Running Pandoc on every book in my library to extract its text can take some time, but ripgrep-all caches the extracted text so that subsequent runs are similar in speed to simply searching plain text – which is blazing fast thanks to ripgrep’s speed. It takes around two seconds to search 1,706 books.

$ time(rga -li 'pandemic' ~/library/books/ | wc -l)
33

real    0m1.225s
user    0m2.458s
sys     0m1.759s

Monitoring Legible News

I was sent a link to Legible News last November by someone who had read my post on the now-defunct Breaking News. Legible News is a website that simply scrapes headlines from Wikipedia’s Current Events once per day and presents them in a legible format. This seems like a simple thing, but is far beyond the capabilities of most news organizations today.

Legible News provides no update notification mechanism. I addressed this by plugging it into my urlwatch system. Initially this presented two problems: the email notification included the HTML markup, which I didn’t care about, and it included both the old and new content of every changed line – effectively sending me the news from today and yesterday.

The first problem was easily solved by using the html2text filter provided by urlwatch. This strips out all markup, which is what I thought I wanted. I ran this for a bit before deciding that I did want the output to contain links. What I really wanted was some sort of html2markdown filter.

I also realized I did not just want to be sent new lines, but every line anytime there was a change. If the news yesterday included a section titled “Armed conflicts and attacks”, and the news today included a section with the same title, I wanted that in my output despite it not having changed.

I solved both of these problems using the diff_tool argument of urlwatch. This allows the user to pass in a special tool to replace the default use of diff to generate the notification output. The tool will be called with two arguments: the filename of the previously downloaded version of the URL and the filename of the current version. I wrote a simple script called html2markdown.sh which ignores the first argument and simply passes the second argument to Pandoc for formatting.

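#!/bin/sh
# urlwatch calls the diff_tool with two arguments: the old file ($1) and the
# new file ($2). Ignore the old one and render the new one as Markdown.

pandoc --from html \
    --to markdown_strict \
    --reference-links \
    --reference-location=block \
    "$2"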

This script is used as the diff_tool in the urlwatch job definition.

kind: url
name: Legible News
url: https://legiblenews.com/
diff_tool: /home/pigmonkey/bin/html2markdown.sh

The result is the latest version of Legible News, nicely converted to Markdown, delivered to my inbox every day. The output would be even better if Legible News used semantic markup – specifically heading elements – but it is perfectly serviceable as is.

After I built this I discovered that somebody had created an RSS feed for Legible News using a service called Feed43.

I use FeedIron to repair neutered RSS feeds.

FeedIron is a plugin for my feed reader, Tiny Tiny RSS. It takes broken, partial feeds and extracts the full article content, allowing me to read the article in my feed reader the way god intended. The plugin can be configured to extract content using a number of filters. I find that using the xpath filter to specify an element on the page like div[@class='entry-content'] corrects most neutered feeds.

This post was tagged with micro, toolchain.

I use urlwatch to monitor the global information super highway.

urlwatch is a simple program that monitors a list of URLs and sends an alert when it detects a change. It can be configured to only look for changes within certain HTML elements, or to grep for certain strings. I configure it to send me the changes via email. As with RSS-Bridge, this tool is part of my strategy to liberate content from toxic silos and Make the Internet Great Again™.

This post was tagged with micro, toolchain.

Personal Information Management

pimutils is a collection of software for personal information management. The core piece is vdirsyncer, which synchronizes calendars and contacts between the local filesystem and CalDAV and CardDAV servers. Calendars may then be interacted with via khal, and contacts via khard. There’s not much to say about these three programs, other than they all just work. Having offline access to my calendars and contacts is critical, as is the ability to synchronize that data across machines.

Khard integrates easily with mutt to provide autocomplete when composing emails. I find its interface for creating, editing and reading contacts to be intuitive. It can also output a calendar of birthdays, which can then be imported into khal.

Khal’s interface for adding new calendar events is much simpler and quicker than all the mousing required by GUI calendar programs.

$ khal new 2019-11-16 21:30 5h Alessandro Cortini at Public Works :: 161 Erie St

There are times when a more complex user interface makes calendaring tasks easier. For this, khal offers the interactive command, which provides a TUI for creating, editing and reading events.

Khal can also import iCalendar files, which is a simple way of getting existing events into my world.

$ khal import invite.ics

Vdirsyncer has maintenance problems that may call its future into question, but the whole point of modular tools that operate on open data formats is that they are replaceable.

I have a simple, often-used script that calls khal calendar and task list (the latter being a Taskwarrior command), answering the question: what am I supposed to be doing right now?
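A script along those lines might look something like this. It is a guess at the shape rather than the script itself, and the function name now is invented for the example. The only commands it relies on are the two named above.

```shell
#!/bin/sh
# now: print the upcoming calendar followed by the pending task list.
now() {
    khal calendar
    echo
    task list
}
```

Calling now at the end of the file, or from a shell alias, prints the calendar first and the task list below it.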