Back in 2003, Amazon.com had a feature called “The Page You Made”. This page automatically collected every item that you clicked on during a session on the site and added it to a list which it would present to you. It's rather similar to the current feature, Your browsing history, except that the latter doesn't seem to expire items. The Page You Made also showed recommendations.

We want to make it easy for you to find what you're looking for at Amazon.com. The Page You Made and Your Recent History are meant to help you keep track of some of the items you've recently viewed and help you find related items that might be of interest. As you browse through the store, we will bring to your attention items similar to those you are looking at. Since your browsing habits change frequently, Your Recent History changes as well. Your sessions expire after a few days and are not stored on the Amazon.com site. This way we can offer you the most relevant purchase suggestions for your recent shopping sessions on the Page You Made. We also give you the ability to alter Your Recent History, by removing recently viewed products or clearing all items. To add pages to Your Recent History, just visit new items that interest you.

There is no punchline or upshot to this; just recording the existence of such a thing. 20 years later, traces of it on the internet are nearly entirely gone.

Posted 2022-12-27

In the sales I purchased a large Western Digital external HDD. I don't really trust hard disks anymore, but all other options are uneconomical or equally untrustworthy, so it's all I have for now. At least it's guaranteed. Anyway, I ran into some trouble when backing up. I had to use gdisk rather than parted to create the GPT partition, for reasons I can't really fathom, but I'll probably stick with gdisk until further notice.

After repartitioning the new drive, the next task was to consolidate two generations' worth of backup data onto it. I usually stick with rsync -aPv to mirror file trees. I copied all the data from both generations into subdirectories on the same disk. However, the data also contains several complete Linux filesystems with archived copies of /sys, /proc, /run, and other data that I don't really care about.

rsync -aPv expands to rsync -rlptgoDPv. We want every option except -D, which is an abbreviation for --devices and --specials. We also want to be able to do all operations as a regular user. Although some files are sensitive, the backups live in a privileged space, so perhaps we don't care too much about security within the space itself. In this case we can do chmod -R o+rX /tree. This uses X for "special execute", which makes directories world-executable but leaves the execute bits of regular files alone unless a file is already executable by someone. It will also make everything world-readable, which obviously comes with heavy caveats.
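
To make the behaviour of X concrete, here is a small sketch in Python that mimics what o+rX does to a single mode value. The function name is made up for illustration, and it only models the permission bits, not chmod itself.

```python
import stat


def apply_o_plus_rX(mode: int, is_dir: bool) -> int:
    """Return the permission bits that `chmod o+rX` would leave behind."""
    # o+r: world-read is always added.
    new_mode = mode | stat.S_IROTH
    # X: world-execute is added only for directories, or for files that
    # already have an execute bit set for user, group or other.
    already_executable = mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    if is_dir or already_executable:
        new_mode |= stat.S_IXOTH
    return new_mode


print(oct(apply_o_plus_rX(0o750, is_dir=True)))   # directory: 0o755
print(oct(apply_o_plus_rX(0o640, is_dir=False)))  # plain file: 0o644
print(oct(apply_o_plus_rX(0o750, is_dir=False)))  # executable file: 0o755
```

The third case is the one that is easy to forget: a script that its owner can already execute becomes world-executable too.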

We can add the -u or --update option to the rsync command. With it, rsync skips any file whose copy on the target is newer, so an identically named file in the tree is overwritten only by a newer version from the source. Obviously this does have the potential to lose data, but it may be a reasonable trade-off to make the filesystem more manageable; YMMV. As we're not using --delete, the target tree will essentially be an accretion of files; files that get moved will potentially create duplicates. We consider this an OK trade-off relative to the dangers of using --delete.

You can use the --log-file=foo.log option to store all progress to a log file which you can examine afterward. You'll want to vet the transfer reasonably carefully to make sure everything completed and you're not deleting potentially valuable things.
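
Putting the options together, the invocation could be assembled roughly as below. This is only a sketch: the source and target paths are placeholders I made up, and the helper name is hypothetical.

```python
import shlex


def build_rsync_command(source: str, target: str, log_file: str) -> list[str]:
    """Assemble the rsync call described above as an argument list."""
    return [
        "rsync",
        "-rlptgo",                 # what -a expands to, minus -D
        "-P",                      # keep partial files and show progress
        "-v",                      # verbose
        "-u",                      # skip files that are newer on the target
        f"--log-file={log_file}",  # record the transfer for later vetting
        source,
        target,
    ]


# Placeholder paths, not the ones actually used for the backup.
cmd = build_rsync_command("/mnt/old-backup/", "/mnt/new-disk/gen1/", "foo.log")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a string avoids quoting problems if a path contains spaces.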

Posted 2022-12-26

Just finished watching Borgen: Power and Glory and wanted to give a few thoughts on it.

Initially the most surprising thing about Borgen: Power and Glory is how similar it is to the 2010s series. Who's still here? The ubiquitous Søren Malling is back as a slightly rounder Torben Friis, whose story arc was the standout of series 3. Of course Nyborg is back, and looking much the same; Katrine Fønsmark returns with a more harried visage. A new actor plays Magnus, well-cast; Laura shows up as well, sadly only for a couple of brief cameos. Søren Ravn, also one of the highlights of series 3, is back as well. The themes tread familiar ground of power and negotiation in the political and personal sphere.

Of course, the show had to be updated for the 2020s. One notable absence is the open sexism in the newsroom. In the original series, TV1 newsroom cads would frequently assess female anchors candidly on their looks, unopposed by Friis. Not so here: Katrine Fønsmark now heads up TV1. One may find this a pristinely 'woke' appointment on the surface, but the wrinkle lies in her constant conflicts with her staff. Fønsmark, who was an intransigent rebel under Torben Friis's direction, becomes in effect "the establishment", makes numerous concessions to pragmatism, and is in turn forced to question herself by young staffers. Mie's request for maternity leave echoes Katrine's similar request to Friis in Borgen, when she asks for permission to date Kasper. Fønsmark mishandles this, and the situation explodes in her face through social media. The tweets and messages sent by the characters appear on the screen to advance the plot, unlike the original series, where these exchanges happened mainly by phone. This is only partially successful: it contributes to a pacing problem with the show.

Fønsmark feuds with the West Asian-looking star anchor, Narciza Aydin, who continually breaks agreements with interviewees in order to focus on pressing human rights issues. Fønsmark falls into the role that Friis formerly had with respect to herself, acting as a constraining force on a maverick reporter. However, Narciza is presented rather unsympathetically (and don't you think her name is rather on-the-nose?). Fønsmark, on the other hand, seems to handle the situation rather badly, bulldozing through the office and nakedly asserting her authority. One wonders if the writers are pointing out that this, too, has changed, that these tactics no longer work in a modern office environment; whether, put bluntly, these entitled millennials are not respecting the chain-o'-command. Regardless, the viewer feels sorry for Fønsmark, for whom this conflict with Narciza precipitates a small-scale mental breakdown.

The figure of Asger Holm Kierkegaard is an interesting one. He is something of a nebbish, sharply dressed but lacking street-smarts, and somewhat lacking in a certain masculine strength -- witness his fear of flying and perpetual motion sickness. One wonders if he's intended to replace Kasper Juul. Juul was an archetype of toxic masculinity: promiscuous, troubled, emotionally inarticulate, and supremely square-jawed. Asger is few of these things, though arguably he is promiscuous, entering into an affair with the Greenlandic ambassador's wife. I enjoyed this plotline for its workaday nature: the affair was not glamourized, but rather presented as something ignoble, slightly sordid but still rather touching. The plotline with Tanja I largely did not follow, although Malik was a good character; shame that he had to die to progress the storyline.

Nyborg's storyline is good, though the international politics are sometimes inscrutable. The intense coalition politics of Borgen seems to be largely absent here, which makes sense plot-wise given Nyborg's status as foreign minister rather than PM, but I slightly rued its absence nonetheless. Nyborg's moments with Magnus are great fanservice for viewers of the original series. Their arguments and battles seem wholly believable. Nyborg's descent into cynical political manoeuvring, aided and abetted by Laugesen, also seems realistic, troubling as it is. Laugesen is well used here as the Dark Lord of spin. Magnus always had a troubled relationship with his mother, so the foreshadowing of the original series plays well here.

The show has been reworked to have an overarching plot for the whole series, rather than the "monster of the week" story structure of the original series. This mode is certainly de rigueur for a modern prestige TV series. However, the show suffers from questionable pacing. The first 30 minutes of every hour-long episode struggle to keep the viewer's attention as they set up the drama of the final half. There are facts, figures, and dialogue flying around faster than the eye can follow; blink and you could miss a key plot point. This was the biggest issue with the show for me. It would probably benefit from a second watch.

Posted 2022-11-04

Also known, frustratingly, as "Spot It".

These last two links are problems that are not identical to the Dobble-generation problem, but are related to it.

Posted 2022-09-18

The basic Puppet template that I use to deploy is this:

# Deploys a WSGI application behind an Apache virtual host.
class main::my_app($server_name, $owner, $group) {
    $web_root = '/srv/http/my-app'
    $backend_root = '/usr/local/lib/my-app'
    $wsgi_path = "${backend_root}/my-app.wsgi"

    # Code directory, owned by the deploying user so it can be synced to.
    file { $backend_root:
        ensure => directory,
        owner  => $owner,
        group  => $group,
        mode   => '0755',
    }

    apache::vhost { $server_name:
        port                        => '80',
        docroot                     => $web_root,
        wsgi_application_group      => '%{GLOBAL}',
        wsgi_daemon_process         => 'my-app',
        wsgi_process_group          => 'my-app',
        wsgi_daemon_process_options => {
            'home'        => $backend_root,
            'python-path' => $backend_root,
        },
        # Entry point: map the site root to the .wsgi script.
        wsgi_script_aliases         => { '/' => $wsgi_path },
    }
}

Here, wsgi_script_aliases is the really key point, as this is the canonical way to specify the entry point for a WSGI application. The other settings here are mostly form-filling.

  • Permissions are set user-wise to allow syncing code easily from a dev machine.
  • The value of wsgi_daemon_process needs to be unique across the whole server.
  • Name the .wsgi file with the basename of the project.
  • The Flask entry point should be 'app.py' as is customary.
  • This then allows the .wsgi file to import the app directly.
  • You can touch the script file to force a code reload. This will reload all the code, equivalent to a full process kill and restart.

The contents of my-app.wsgi can just be something like this, in the case of a Flask application:

from app import app

application = app

mod_wsgi expects the name application to contain a WSGI app.
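
For comparison, this is roughly what that name has to hold when no framework is involved. It is a minimal sketch of a plain WSGI callable, not part of my actual deployment; the response text is made up.

```python
def application(environ, start_response):
    """A bare WSGI app: mod_wsgi calls this once per request."""
    body = b"Hello from my-app\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Status line and headers go through the callback supplied by the server.
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings forming the body.
    return [body]
```

A Flask app object is itself a callable with this same signature, which is why assigning it to application is all the .wsgi file needs to do.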

Posted 2022-09-03

At $WORKPLACE I've been observing with interest the different species of interactions that can happen during group chat. Let's divide them into three different areas.

$WORKPLACE_ALPHA -- An agency. This was fun: we set up bots that created a lot of noise in the channel. Remember this was pre-Covid, so the chat service in question, HipChat, was a side channel, not a hard requirement for communication. Everything was conducted in a single main channel, and all real-time work collaboration went through it. There was no threading. As the team was small, we were all able to keep track of things fairly well, and we would sometimes go into DMs if things got complicated.

$WORKPLACE_BETA -- This was a much more stoical workplace Slack, with zero real "banter" and no arguments. I felt quite apprehensive about posting in this Slack. There were only a few channels and little activity; few problems required real-time collaboration to resolve, and if that did become necessary, it was done in DMs. Email was still heavily used here.

$WORKPLACE_GAMMA -- They use Slack as a real "replacement for email", in the way it's supposed to be used. This means a lot of channels, a LOT of threading, heavy reliance on emoji reactions, and while it does feel casual, there's also not much off-topic chat because of the emphasis on signal.

There is still heavy disagreement on the relative merits of coarse-grained vs fine-grained channels. I tend to favour coarse-grained channels, but on the other hand, when you have automations and other non-natural inputs affecting channels, channels can end up functioning more like a specialized Twitter stream or similar, in which case that channel does need to be fine-grained. It's clear that there's a positive correlation between large and coarse-grained channels and off-topic banter. (I don't mean to thereby claim that fine-grained is preferable, or the converse.)

Posted 2022-07-20

I had to download a bunch of files from a Cloudflare-protected site. Said site is annoying to scrape, because it relies heavily on JS to generate unguessable links. Selenium is a possibility, but I found that it would be defeated by these Cloudflare CAPTCHAs. Cloudscraper is not feasible either, because links are not present in the downloaded HTML directly.

It's an acceptable solution to click the links manually in this case, but then another problem emerges. Firefox's download manager is rather substandard. This isn't an issue in most cases, but in this particular case it was a limiting factor, as downloads would fail and not be retried. Firefox also doesn't limit in-progress downloads, meaning they would saturate the connection and fail.

One possibility is to use aria2c in daemon mode, with a Firefox extension to add downloads to it. Aria2c will then queue the downloads and run them in order, with optional retry. The extension embeds something called AriaNg, which provides a nice web interface on top of the aria2c daemon, so it's actually quite friendly. The only tricky part is that you may need to start aria2c yourself. You can do that using the config below, which I took from the Arch wiki.

aria2c must be invoked with the --conf-path option to use this.

continue
daemon=true
dir=/mnt/disk/mydownloads
file-allocation=falloc
log-level=warn
max-connection-per-server=4
max-concurrent-downloads=3
max-overall-download-limit=0
min-split-size=5M
enable-http-pipelining=true

enable-rpc=true
rpc-listen-all=true
rpc-secret=xyzzy

retry-wait=5
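
With enable-rpc and rpc-secret set as above, downloads can also be queued without the extension by talking to the daemon's JSON-RPC interface. The sketch below only builds the request body; the URL being queued is a placeholder and the helper name is my own invention.

```python
import json


def add_uri_request(uri: str, secret: str, request_id: str = "1") -> dict:
    """Build a JSON-RPC request asking aria2 to queue one download."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "aria2.addUri",
        # The first parameter carries the rpc-secret as "token:<secret>",
        # the second is the list of URIs for this one download.
        "params": [f"token:{secret}", [uri]],
    }


payload = add_uri_request("https://example.org/file.bin", "xyzzy")
print(json.dumps(payload))
# POST this body to http://localhost:6800/jsonrpc, assuming aria2's
# default RPC port of 6800 has not been changed.
```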

Posted 2022-07-08

Here I attempt to detail my network setup, which I revised at the end of 2021.

DHCP configuration -- This is done using isc-dhcp-server on Debian. As my internal network is fairly small, I use the range 192.168.0.x for it. Dynamic DHCP assignments are restricted to the .16 - .127 range. I create host stanzas to store reservations, e.g.:

host sprinkhaan {
  hardware ethernet DC:53:60:F3:A7:FF;
  fixed-address 192.168.0.3;
}

I use option domain-search "phys.solasistim.net" to set up the default search domain for all DHCP clients.

DNS configuration -- This is done using BIND. All hosts receive 192.168.0.1 as their DNS server. This forwards to my ISP's DNS servers (Zen, who have been great so far). Using BIND I am able to create "split-horizon" DNS so that e.g. my Puppet server resolves to an internal IP when requested internally. The main zone file defines all hosts in phys.solasistim.net. (The original idea here was to draw a sharp distinction between physical and virtual hosts, but I've become less certain on this. Still, it's good to use a separate domain from the real solasistim.net.) The downside of this setup is that some duplication of the assignments is needed, specifically the fixed-address setup in dhcpd.conf needs to be essentially replicated in solasistim.zone. This doesn't actually matter much in practice because I don't add hosts very often.
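
If the duplication ever did become a nuisance, one option would be to generate both fragments from a single table. The following is a hypothetical sketch, not something I actually run; the helper names and the exact zone record layout are assumptions.

```python
# Single source of truth: host name -> (MAC address, fixed IP).
HOSTS = {
    "sprinkhaan": ("DC:53:60:F3:A7:FF", "192.168.0.3"),
}


def dhcp_stanza(name: str, mac: str, ip: str) -> str:
    """Render a host reservation for dhcpd.conf."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"}}"
    )


def zone_record(name: str, ip: str) -> str:
    """Render the matching A record for the zone file."""
    return f"{name}\tIN\tA\t{ip}"


for name, (mac, ip) in HOSTS.items():
    print(dhcp_stanza(name, mac, ip))
    print(zone_record(name, ip))
```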

PPP interface -- This is done using pppd and a Draytek Vigor 130. The provider file that I use for pppd is fairly standard. I based it on a useful post from a person called Ruben. The only real difference between the two configurations is that I do have to provide my real account password in chap-secrets.

Firewall configuration -- I use nftables. There's only one really notable thing that I do in nftables, which is a bit complicated, and that's clamp my TCP maximum segment size to my path MTU.

table inet filter {
    # ...
    chain forward {
        # ...
        tcp flags syn tcp option maxseg size set rt mtu
        # ...
    }
}

You can find more information on this at the nftables wiki. This problem manifests itself in strange ways with some web sites simply timing out when you attempt to connect to them, while most sites work fine. For instance lbc.co.uk and atlassian.net had this strict MTU requirement as of late 2021. There may also be other ways to address this. I know that setting MTU on clients also resolves the issue, but not all devices seem to respect the MTU setting when it's sent in DHCP. It may also be that setting mtu in /etc/network/interfaces will make a difference -- I have never tried this.
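
To see why the clamp matters, it helps to work out the segment size that actually fits through the link. The sketch below assumes an MTU of 1492, which is typical for PPPoE; that figure is an assumption for illustration, not a value measured on my line.

```python
IPV4_HEADER = 20  # bytes, without options
IPV6_HEADER = 40  # bytes, fixed header
TCP_HEADER = 20   # bytes, without options


def clamped_mss(path_mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload that fits in one packet of size path_mtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - ip_header - TCP_HEADER


print(clamped_mss(1500))             # plain Ethernet, IPv4: 1460
print(clamped_mss(1492))             # assumed PPPoE MTU, IPv4: 1452
print(clamped_mss(1492, ipv6=True))  # assumed PPPoE MTU, IPv6: 1432
```

A LAN client on plain Ethernet advertises 1460, which is 8 bytes too large for a 1492-byte link; rewriting the option in the SYN is what keeps full-sized segments from being dropped along the way.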

Interface configuration -- In /etc/network/interfaces, we create the regular ppp0 interface that is used for WAN access.

auto ppp0
iface ppp0 inet ppp
    provider zen

We create a bridge between two interfaces. One has a Wifi AP and one has a hardware switch.

auto br0
iface br0 inet static
    address 192.168.0.1
    bridge_ports eno2 eno3

The wifi AP and switch just use their standard Draytek firmware for configuration, which seems to work fine so far.

A few notes/updates: The Vigor 130 has a CLI interface that is accessible via telnet (I believe). This allows some more advanced operations to be performed. There were also some concerns about whether the Zen connection should be Annex A or Annex B. I think we eventually came to the conclusion that it should be Annex A. During the setup of the line, there were frequent connection drops, where the pppd log would read "Modem hangup". These eventually seem to have just gone away without further intervention on my part. I can only assume this was the "line training" phase that is much discussed on UK broadband forums.

Posted 2022-05-11

There's a genre of blog posts that frequently get linked on Hacker News, which are basically attempts to explain why problems that seem conceptually simple are very complex when attempted within an organization.

I'm referring to these as Chesterton's fence posts. There's an analogy to the eponymous principle: those not directly involved in the day to day materiality of commercial software production cannot see the looming network of past failures that has led to the current abundance of caution. Of course, one could still make an argument that some of these temporal drags on task completion are due to organizational dysfunction rather than prudence.

How many Microsoft employees does it take to change a lightbulb?

I could do that in a weekend!

The unexpected complications of minor features

Simple software things that are actually very complicated

Reality has a surprising amount of detail

A similar, well-known example on HN is that of Dropbox, famously dismissed as something trivially accomplishable via rsync -- a claim that's not wrong, but nonetheless misses the point.

Posted 2022-04-21

This is an extract from an unpublished paper. I first heard of this concept on the excellent Talking Politics podcast (RIP).


The Copernican principle (or Copernicus method) is a concept developed by American astrophysicist J. Richard Gott. The principle was first hypothesized by Gott in 1969. It is a probability-based method that allows estimating the potential lifespan of observable objects without any further information, other than the dates of their observation.

The method is based on dividing the hypothetical lifespan of a thing into four quarters. Assuming there is nothing special about the moment of observation, there is a 50% chance that it falls within the middle two quarters. This assumption yields a lower bound and an upper bound on the future lifespan of the thing.

If I am observing the object at the beginning of that middle period, one quarter of its lifespan has passed and three quarters are left to go. That means that the future is three times as long as the past. However, if one assumes one is at the end of the middle period, the future is only one-third as long as the past.

The canonical example used by Gott is the Berlin Wall. Here are the facts we know in this application:

  • We visited the wall in 1969.
  • The wall is eight years old at that time.

Thus the lower bound for the time-left is given by 8 × ⅓, that is, two and two thirds. This would mean that the earliest date for the "end" of the object, i.e. the collapse of the Wall, is two-thirds of the way through 1971.

The upper bound, on the other hand, is given by 8 × 3, 24 years. This means that the upper bound for the fall of the Wall is 1993.

The Copernicus method would claim that there is a 50% chance that the lifespan falls into these bounds.

As one can see, this produces a fairly wide interval: just over 21 years in the case of an 8-year-old object. A 45-year-old object gives an interval of 120 years. In fact, for an object of age x the interval size is given by 8x/3, or equivalently, x multiplied by two and two thirds. So these bounds are very wide, and remember that the probability of the true lifespan falling within them at all is only 0.5.
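
The arithmetic above is simple enough to check mechanically. This is a small sketch with function names of my own choosing; it reproduces the 50% bounds only and says nothing about other confidence levels.

```python
def copernican_bounds(age: float) -> tuple[float, float]:
    """50% bounds on remaining lifespan: a third of the age, and three times it."""
    return age / 3, age * 3


def prediction_window(observed_year: float, age: float) -> tuple[float, float]:
    """Earliest and latest predicted end dates for something observed at a given age."""
    lower, upper = copernican_bounds(age)
    return observed_year + lower, observed_year + upper


# Berlin Wall: observed in 1969, aged 8.
earliest, latest = prediction_window(1969, 8)
print(f"{earliest:.1f} to {latest:.1f}")  # 1971.7 to 1993.0

# Width of the interval for an 8-year-old and a 45-year-old object.
for age in (8, 45):
    lower, upper = copernican_bounds(age)
    print(f"age {age}: width {upper - lower:.1f} years")  # 21.3, then 120.0
```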

A similar concept is the 'Lindy effect' which was first named by Albert Goldman. Goldman's concept was rather divorced from the one that is current now, though: that concept finds its ancestor in Benoit Mandelbrot. The point can be crudely summarized in the following manner: the future predicted lifespan of a thing varies proportionally to its past lifespan.

It's unclear whether the Copernican method applies to repeated 'observations'. For instance, if Gott were to revisit the Berlin Wall in, say, 1975, would the same calculation apply? Do we gain any more information by having that six-year gap, beyond the fact that the wall is now 14 years old?

One can use this information to create heuristics for how long a piece of knowledge is likely to remain valuable. For instance, SQL was created in 1974, making it 48 years old as of this writing. The lower bound for the end of SQL's lifespan is thus 2038, making it likely a worthwhile investment. However, SQL is an extreme outlier in this scenario. As a matter of simple mental arithmetic, if one decides to focus on current media from 1992, such media has lasted thirty years: one can see the lower bound of its remaining lifespan under the Copernican principle would be ten years.

Of course, this only applies in the scenario where we lack other information. If we assume that certain allegations cast a shadow over Woody Allen's career, it may not be prudent to assume that simply because of the empirical fact of his film presenting itself to consciousness that the Copernican principle applies. The lifespan may be artificially shortened, or behave nonlinearly.

Posted 2022-04-10
Schubert's Winterreise
Posted 2022-02-14
Buster to Bullseye
Posted 2021-09-01
React Pain Points
Posted 2021-04-08
Neo4j rearrangeable list
Posted 2020-10-14
OOB redirect_uri values
Posted 2020-10-07
Quick HTTP server
Posted 2020-10-07
The Lowest UUIDv4
Posted 2020-09-24
Ad-hoc Patreon audio scraping
Posted 2020-05-17
SSH key setup
Posted 2019-09-11
Stretch to Buster
Posted 2019-08-05
Subprocess Pipe Comparison
Posted 2019-07-02
The X3 Wiki Archive
Posted 2019-06-16
Fabric 2 cheat sheet
Posted 2019-03-05
Using comboboxes in Qt5
Posted 2019-02-27
System Puppet, CentOS 7 Client
Posted 2019-02-25
X3 savegames
Posted 2019-02-02
Shadow Tween technique in Vue
Posted 2019-01-06
Width list transition in Vue
Posted 2018-12-18
Emoji Representations
Posted 2018-09-14
Thoughts on Cheesesteak & More
Posted 2018-08-29
Vue + GraphQL + PostgreSQL
Posted 2018-07-20
Neo4j Cypher query to NetworkX
Posted 2018-05-09
FP & the 'Context Problem'
Posted 2018-02-27
Cloake Vegetable Biryani
Posted 2018-02-25
FFXII Builds
Posted 2018-02-02
Custom deployments solution
Posted 2017-12-09
SCons and Google Mock
Posted 2017-11-30
Sunday Lamb Aloo
Posted 2017-11-19
centos 6 debian lxc host
Posted 2017-11-03
Srichacha Noodle Soup
Posted 2017-10-17
Kaeng Kari
Posted 2017-10-13
Ayam Bakar (Sri Owen)
Posted 2017-10-12
Pangek Ikan (Sri Owen)
Posted 2017-10-12
Chicken Tikka Balti Masala
Posted 2017-10-06
Clojure Log Configuration
Posted 2017-09-28
Clojure Idioms: strict-get
Posted 2017-09-28
About
Posted 2017-09-18
Philly Cheesesteak
Posted 2017-09-14
Welcome
Posted 2017-09-13
Srichacha Kaeng Pa
Posted 2017-08-31
Malaidar Aloo
Posted 2017-08-10
BBQ Balti Chicken
Posted 2017-07-19
Sabzi Korma
Posted 2017-07-18
Vegetable Tikka Masala
Posted 2017-07-02
Soto Ayam
Posted 2017-06-08
Bombay Aloo w/Bunjarra
Posted 2017-06-03
Chicken Dopiaza
Posted 2017-06-01
LJ Bunjarra
Posted 2017-05-31
Glasgow Lamb Shoulder Tikka
Posted 2017-05-24
Tofu Char Kway Teow
Posted 2017-05-12
King Prawn Balti
Posted 2017-04-24
Ad-hoc Quorn Rogan Josh
Posted 2017-04-15
Glasgow Vindaloo
Posted 2017-03-28
Rempeyek
Posted 2017-03-26
Toombs Saag Balti
Posted 2017-02-25
Glasgow Bombay Rogan Josh
Posted 2017-02-21
Glasgow Chicken Balti
Posted 2017-02-16
Quorn Balti & Cloake Naan
Posted 2017-02-03
Two Spice Marinades
Posted 2017-01-18

This blog is powered by coffee and ikiwiki.