pig-monkey.com


Monitoring Legible News

I was sent a link to Legible News last November by someone who had read my post on the now-defunct Breaking News. Legible News is a website that simply scrapes headlines from Wikipedia’s Current Events once per day and presents them in a legible format. This seems like a simple thing, but is far beyond the capabilities of most news organizations today.

Legible News provides no update notification mechanism. I addressed this by plugging it into my urlwatch system. Initially this presented two problems: the email notification included the HTML markup, which I didn’t care about, and it included both the old and new content of every changed line – effectively sending me the news from today and yesterday.

The first problem was easily solved by using the html2text filter provided by urlwatch. This strips out all markup, which is what I thought I wanted. I ran this for a bit before deciding that I did want the output to contain links. What I really wanted was some sort of html2markdown filter.

I also realized that I did not want to be sent only the new lines. I wanted every line, any time there was a change. If the news yesterday included a section titled “Armed conflicts and attacks”, and the news today included a section with the same title, I wanted that in my output despite it not having changed.

I solved both of these problems using the diff_tool argument of urlwatch. This allows the user to pass in a special tool to replace the default use of diff to generate the notification output. The tool will be called with two arguments: the filename of the previously downloaded version of the URL and the filename of the current version. I wrote a simple script called html2markdown.sh which ignores the first argument and simply passes the second argument to Pandoc for formatting.

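#!/bin/sh
# urlwatch calls this with two arguments: the old file ($1) and the new file ($2).
# The old version is ignored; only the current version is converted to Markdown.

pandoc --from html \
    --to markdown_strict \
    --reference-links \
    --reference-location=block \
    "$2"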

This script is used as the diff_tool in the urlwatch job definition.

kind: url
name: Legible News
url: https://legiblenews.com/
diff_tool: /home/pigmonkey/bin/html2markdown.sh

The result is the latest version of Legible News, nicely converted to Markdown, delivered to my inbox every day. The output would be even better if Legible News used semantic markup – specifically heading elements – but it is perfectly serviceable as is.
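To see the calling convention in isolation, here is a minimal sketch that needs neither urlwatch nor Pandoc. The function name show_new_only and the temporary files are hypothetical stand-ins: the function plays the role of the diff_tool, and cat stands in for the Pandoc call.

```shell
#!/bin/sh
# Hypothetical stand-in for a diff_tool script. urlwatch passes the previous
# snapshot as $1 and the current snapshot as $2.
show_new_only() {
    # $1 (the old version) is deliberately ignored.
    cat "$2"
}

# Two temporary files play the part of yesterday's and today's downloads.
old=$(mktemp)
new=$(mktemp)
printf 'yesterday\n' > "$old"
printf 'today\n' > "$new"

# Only the content of the second file is printed.
show_new_only "$old" "$new"

rm -f "$old" "$new"
```

Because the first argument is never read, the output is the whole current page rather than a diff, which is the behaviour I wanted.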

After I built this I discovered that somebody had created an RSS feed for Legible News using a service called Feed43.

I use FeedIron to repair neutered RSS feeds.

FeedIron is a plugin for my feed reader, Tiny Tiny RSS. It takes broken, partial feeds and extracts the full article content, allowing me to read the article in my feed reader the way god intended. The plugin can be configured to extract content using a number of filters. I find that using the xpath filter to specify an element on the page like div[@class='entry-content'] corrects most neutered feeds.

This post was tagged with micro, toolchain.

I use urlwatch to monitor the global information super highway.

urlwatch is a simple program that monitors a list of URLs and sends an alert when it detects a change. It can be configured to only look for changes within certain HTML elements, or to grep for certain strings. I configure it to send me the changes via email. As with RSS-Bridge, this tool is part of my strategy to liberate content from toxic silos and Make the Internet Great Again™.
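The basic idea of change monitoring can be sketched in a few lines of shell. This is a conceptual illustration only, not how urlwatch is implemented. The function name check_changed and the file names are hypothetical, and local files stand in for downloaded pages.

```shell
#!/bin/sh
# Conceptual sketch only; urlwatch has its own storage, filters and reporters.

# check_changed SNAPSHOT CURRENT
# Prints "changed" and refreshes SNAPSHOT when CURRENT differs from it
# (or when no snapshot exists yet). Prints "unchanged" otherwise.
check_changed() {
    snapshot=$1
    current=$2
    if [ -f "$snapshot" ] && cmp -s "$snapshot" "$current"; then
        echo "unchanged"
    else
        cp "$current" "$snapshot"
        echo "changed"
    fi
}

# Demonstration with local files standing in for downloaded pages.
workdir=$(mktemp -d)
printf 'headline one\n' > "$workdir/page"
check_changed "$workdir/snapshot" "$workdir/page"   # no snapshot yet
check_changed "$workdir/snapshot" "$workdir/page"   # same content as before
printf 'headline two\n' > "$workdir/page"
check_changed "$workdir/snapshot" "$workdir/page"   # content differs
rm -rf "$workdir"
```

The demonstration prints changed, unchanged, changed. A real monitor would fetch the page first and send a notification instead of printing a word.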

This post was tagged with micro, toolchain.

Personal Information Management

pimutils is a collection of software for personal information management. The core piece is vdirsyncer, which synchronizes calendars and contacts between the local filesystem and CalDAV and CardDAV servers. Calendars may then be interacted with via khal, and contacts via khard. There’s not much to say about these three programs, other than they all just work. Having offline access to my calendars and contacts is critical, as is the ability to synchronize that data across machines.

Khard integrates easily with mutt to provide autocomplete when composing emails. I find its interface for creating, editing and reading contacts to be intuitive. It can also output a calendar of birthdays, which can then be imported into khal.

Khal’s interface for adding new calendar events is much simpler and quicker than all the mousing required by GUI calendar programs.

$ khal new 2019-11-16 21:30 5h Alessandro Cortini at Public Works :: 161 Erie St

There are times when a more complex user interface makes calendaring tasks easier. For this, khal offers the interactive command, which provides a TUI for creating, editing and reading events.

Khal can also import iCalendar files, which is a simple way of getting existing events into my world.

$ khal import invite.ics

Vdirsyncer has maintenance problems that may call its future into question, but the whole point of modular tools that operate on open data formats is that they are replaceable.

I have a simple, often-used script that calls khal calendar and task list (task being the Taskwarrior command), answering the question: what am I supposed to be doing right now?
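The script itself is not shown here, so the following is only a guess at its shape. The function name whatnow is hypothetical. The two commands, khal calendar and task list, are the ones named above.

```shell
#!/bin/sh
# Hypothetical reconstruction; the real script may differ.

# Print upcoming calendar events, a blank line, then pending tasks.
whatnow() {
    khal calendar
    printf '\n'
    task list
}

# Run only when both programs are available on this machine.
if command -v khal >/dev/null 2>&1 && command -v task >/dev/null 2>&1; then
    whatnow
fi
```

Keeping the two calls in one function makes it easy to bind the whole thing to a single short alias.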

Terminal Calculations

Qalculate! is a well-known GTK-based GUI calculator. For years I ignored it because I failed to realize that it included a terminal interface, qalc. Since learning about qalc last year it has become my go-to calculator. It supports all the same features as the GUI, including RPN and unit conversions. I primarily use GNU Units for unit wrangling, but being able to perform unit conversions within my calculator is sometimes useful.

$ qalc
> 1EUR to USD
It has been 20 day(s) since the exchange rates last were updated
Do you wish to update the exchange rates now? y

  1 * euro = approx. $1.1137000

> 32oC to oF

  32 * celsius = 89.6 oF

The RPN mode is not quite as intuitive as a purpose-built RPN calculator like Orpie, but it is adequate for my uses. My most frequent use of RPN mode is totaling a long list of numbers without bothering with all those tedious + symbols.

> rpn on
> stack
The RPN stack is empty
> 85

  85 = 85

> 42

  42 = 42

> 198

  198 = 198

> 5

  5 = 5

> 659

  659 = 659

> stack

  1:    659
  2:    5
  3:    198
  4:    42
  5:    85

> total

  total([659, 5, 198, 42, 85]) = 989

> stack

  1:    989

Also provided are some basic statistics functions that can help save time.

> mean(2,12,5,3,1)
  mean([2, 12, 5, 3, 1]) = 4.6

And of course there are the variables and constants you would expect.

> (12+3*8)/2
  (12 + (3 * 8)) / 2 = 18
> ans*pi
  ans * pi = 56.548668

I reach for qalc more frequently than alternative calculators like bc, insect, or the Python shell.

I use Blokada to reduce the amount of advertisements on my telephone.

Blokada registers itself as a VPN service on the phone so that it can intercept all network traffic. It then downloads filter lists to route the domains of known advertisers, trackers, etc. to a black hole, just as I do on my real computer with hostsctl. For me it has had no noticeable impact on battery life. I have found it especially useful when travelling internationally and purchasing cellular plans with small data caps. The only disadvantage I have found is that Blokada must be disabled when I want to connect to a real VPN via WireGuard or OpenVPN.
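The black hole idea is easiest to see in hosts-file form. The sketch below is not how Blokada or hostsctl work internally. It only illustrates the general technique of turning a list of unwanted domains into hosts entries that resolve to an unroutable address. The function name to_hosts and the example domains are made up.

```shell
#!/bin/sh
# Illustrative sketch; the domains below are placeholders, not a real filter list.

# to_hosts: read one domain per line on stdin and emit hosts-file entries
# that point each domain at 0.0.0.0.
to_hosts() {
    while IFS= read -r domain; do
        # Skip blank lines and comment lines.
        case $domain in
            ''|'#'*) continue ;;
        esac
        printf '0.0.0.0 %s\n' "$domain"
    done
}

printf 'ads.example.com\ntracker.example.net\n' | to_hosts
```

Any lookup for a listed domain then goes nowhere, so the advertisement or tracker is never fetched.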

Blokada must be installed via F-Droid (or directly through the APK) because Google frowns upon blocking advertisements (but at least Google allows you to install software on your telephone outside of their walled garden, unlike their competitor).