Monday, November 1, 2021


After reading about a storm in a teacup with the which(1) utility in Debian, I decided to play code golf with myself: write a couple of minimalistic which(1) implementations in different languages. Before starting, I thought a shell script would be the cleanest solution, but that prediction turned out to be wrong.

The spec:

  1. The util should stop immediately after the first non-existing executable, e.g.

     $ ./my-which ls BOGUS cat
    ./my-which: BOGUS not found in PATH
  2. It should report an error to stderr & return with the code > 0 in case of the error.

The programs below are sorted by terseness.

GNU Make

The Make manual contains a neat example of a pathsearch function that abuses the built-in wildcard function in a macro. We can use it with a 'match-anything' target:

#!/usr/bin/make -f
f = $(firstword $(wildcard $(addsuffix /$1,$(subst :, ,$(PATH)))))
%:;@echo $(or $(call f,$@),$(error $@ not found in PATH))

It works like this:

$ ./ ls BOGUS cat
/usr/bin/ls
*** BOGUS not found in PATH. Stop.
$ echo $?

That's it. 2 lines + a shebang. If you're unfamiliar with the Make language, I advise you to try it.


Ruby

A slightly bigger example that still fits in several lines:

#!/usr/bin/env ruby
def f e; (ENV['PATH'] || '').split(?:).map{|d| d+'/'+e}.filter{|p| File.executable?(p)}[0]; end
ARGV.each {|e| puts(f(e) || abort("#{$0}: #{e} not found in PATH")) }

We cheated here a little: there's no check whether a file is a directory. Nothing stops you from adding one, but that increases the length of such a toy program by 18 bytes!


Shell

I thought it would be shorter:


#!/bin/sh
IFS=:
f() {
    for e in $PATH; do
        [ -x "$e/$1" ] && { echo "$e/$1"; return; }
    done
    return 1
}

for d in "$@"; do
    f "$d" || { echo "$0: $d not found in PATH" 1>&2; exit 1; }
done

If you decide to use f() in your scripts, a cut-&-paste won't do: you'll need to save & restore the value of the IFS variable & mark e as local.

node: callbacks

Async IO doesn't always make life easier.
A philosopher

#!/usr/bin/env node
let fs = require('fs')

let f = (ok, error) => {
    let dirs = (process.env.PATH || '').split(':')
    return function dive(e) {
        let dir = dirs.shift() || error(e); if (!dir) return
        let file = dir+'/'+e
        fs.access(file, fs.constants.X_OK, err => err ? dive(e) : ok(file))
    }
}

let args = process.argv.slice(2)
let main = exe => exe && f( e => (console.log(e), main(args.shift())), e => {
    console.error(`${process.argv[1]}: ${e} not found in PATH`)
    process.exitCode = 1
})(exe)

main(args.shift())


Again, no checks whether a file is a directory.

We could've avoided callbacks, of course–node has fs.accessSync(), but it throws an exception. Also, just to make this slightly more challenging, I decided to avoid process.exit().

node: FP runs amok

sassa_nf didn't like the example above, mainly because of Array.prototype.shift(), & provided an enhanced version:

#!/usr/bin/env node
const fs = require('fs')
const dirs = (process.env.PATH || '').split(':')

const f = (e, cont) => dirs.map(d => d + '/' + e)
.reduce((p, d) => g => p(f => f ? g(f):
fs.access(d, fs.constants.X_OK, err => g(!err && d))),
f => f())(f => f ? (console.log(f), cont()):
(console.error(`${process.argv[1]}: ${e} not found in PATH`), process.exitCode = 1))

process.argv.slice(2).reduce((p, c) => g => p(_ => f(c, g)), f => f())(_ => _)

To understand how it works, you'll need to reformat the arrow function expressions. Nevertheless, I think it serves an artistic purpose as is.

Node, async/await

Certainly, callbacks were an unfortunate chain of events. Thankfully, we've had promises for a long time now.

#!/usr/bin/env node

let {access} = require('fs/promises')

let afilter = async (arr, predicate) => {
    return (await Promise.allSettled(arr.map(predicate)))
        .filter( v => v.status === 'fulfilled').map( v => v.value)
}

let f = e => afilter((process.env.PATH || '').split(':'), async p => {
    await access(p+'/'+e, 1)    // 1 is fs.constants.X_OK
    return p+'/'+e
})

async function main() {
    let args = process.argv.slice(2).map( async p => {
        return {exe: p, location: await f(p)}
    })

    for await (let r of args) {
        if (!r.location.length) {
            console.error(`${process.argv[1]}: ${r.exe} not found in PATH`)
            process.exitCode = 1
            break
        }
        console.log(r.location[0])
    }
}

main()


This was tested with node v17.0.1.

I leave it up to you to judge which one of the node variants is more idiotic.


C

It was impossible to leave it out. It's the longest one, but I consider all the node examples much worse.

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <err.h>
#include <string.h>
#include <limits.h>
#include <stdbool.h>

bool is_exe(const char *name) {
  struct stat s;
  if (stat(name, &s)) return false;
  return (s.st_mode & S_IFMT) == S_IFREG && (s.st_mode & S_IXUSR);
}

bool exe(const char *dir, const char *e, void *result) {
  char *r = (char*)result;
  snprintf(r, PATH_MAX, "%s/%s", dir, e);
  if (!is_exe(r)) {
    r[0] = '\0';
    return false;
  }
  return true;
}

void f(const char *e, bool (*callback)(const char *, const char *, void *), void *result) {
  char *path = strdup(getenv("PATH") ? getenv("PATH") : "");
  char *PATH = path;
  char *dir, *saveptr;
  while ( (dir = strtok_r((char*)PATH, ":", &saveptr))) {
    if (callback(dir, e, result)) break;
    PATH = NULL; /* subsequent strtok_r() calls continue from saveptr */
  }
  free(path);
}

int main(int argc, char **argv) {
  for (int idx = 1; idx < argc; idx++) {
    char e[PATH_MAX] = ""; /* stays empty if PATH has no entries */
    f(argv[idx], exe, e);
    strlen(e) ? (void)printf("%s\n", e) : errx(1, "%s not found in PATH", argv[idx]);
  }
}

Coincidentally, this version is the most correct one: it won't confuse a directory with an executable.

Thursday, September 9, 2021

Basic Latin, Diacritical Marks & IMDB

This is a story of not placing trust in public libraries.

The IMDB website has an auto-complete input element. While its mechanism isn't documented anywhere, you can easily explore it with curl:

$ alias labels='json d | json -a l'
$ imdb=

$ curl -s $imdb/a/ameli.json | labels
Amelia Warner (I)
Austin Amelio
Amelia Clarkson
Amelia Rose Blaire
Amelia Heinle
Amelia Bullmore
Amelia Eve

The endpoint understands acute accents & strokes:

$ curl -s $imdb/b/boże+ciało.json | labels
Corpus Christi
Corpus Christi
Olecia Obarianyk
Alecia Orsini Lebeda
Zwartboek: The Special
The Cult: Edie (Ciao Baby)
Anne-Marie: Ciao Adios
The C.I.A.: Oblivion

(Corpus Christi is the translation of Boże Ciało.)

The funny part starts when you try to enter the same string (boże ciało) in the input field on the IMDB website:

Where's the movie? Turns out, the actual query that a page makes looks like

boe_ciao? Apparently, it tried to convert the string to the Basic Latin set, replacing spaces with an underscore along the way. It's not terribly hard to spot a little problem here.

This is the actual function that does the conversion:

var ae = /[àÀáÁâÂãÃäÄåÅæÆçÇèÈéÉêÊëËìÍíÍîÎïÏðÐñÑòÒóÓôÔõÕöÖøØùÙúÚûÛüÜýÝÿþÞß]/
, oe = /[àÀáÁâÂãÃäÄåÅæÆ]/g
, ie = /[èÈéÉêÊëË]/g
, le = /[ìÍíÍîÎïÏ]/g
, se = /[òÒóÓôÔõÕöÖøØ]/g
, ce = /[ùÙúÚûÛüÜ]/g
, ue = /[ýÝÿ]/g
, de = /[çÇ]/g
, me = /[ðÐ]/g
, pe = /[ñÑ]/g
, fe = /[þÞ]/g
, be = /[ß]/g;

function ve(e) {
    if (e) {
        var t = e.toLowerCase();
        return t.length > 20 && (t = t.substr(0, 20)),
        t = t.replace(/^\s*/, "").replace(/[ ]+/g, "_"),
        ae.test(t) && (t = t.replace(oe, "a").replace(ie, "e")
                       .replace(le, "i").replace(se, "o")
                       .replace(ce, "u").replace(ue, "y")
                       .replace(de, "c").replace(me, "d")
                       .replace(pe, "n").replace(fe, "t").replace(be, "ss")),
        t = t.replace(/[\W]/g, "")
    }
    return ""
}

(It took me some pains to extract it from the god-awful obfuscated mess that IMDB returns to browsers.)

It's not only the Polish folks whose alphabet gets mangled. The Turks are out of luck too:

ve('Ruşen Eşref Ünaydın')     // => ruen_eref_unaydn

I'd say the function above sometimes gets its job rather wrong:

ve('ąśćńżółıźćę')             // => o

deburr() from lodash has been publicly available since February 5, 2015 &, unlike the forlorn IMDB attempt, works fine:

deburr('Boże Ciało')          // => Boze Cialo
deburr('Ruşen Eşref Ünaydın') // => Rusen Esref Unaydin
deburr('ąśćńżółıźćę') // => ascnzolizce

Why not use it?
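If pulling in lodash is somehow out of the question, the same idea fits in a few lines. What follows is only a sketch & not how lodash implements deburr(): it assumes Unicode NFD normalization to split letters from their combining marks, plus a small hand-made table (NO_DECOMPOSITION, a name invented here) for letters like ł or ı that have no decomposition. The table is far from complete.

```javascript
// Letters that Unicode doesn't decompose into "base letter + combining mark".
// This table is an illustration only; a real one would be much longer.
const NO_DECOMPOSITION = {
    'ł': 'l', 'Ł': 'L', 'ı': 'i', 'ø': 'o', 'Ø': 'O', 'đ': 'd', 'Đ': 'D', 'ß': 'ss'
}

function toBasicLatin(str) {
    return str.normalize('NFD')                 // ż => z + U+0307, etc.
        .replace(/[\u0300-\u036f]/g, '')        // drop the combining marks
        .replace(/[łŁıøØđĐß]/g, ch => NO_DECOMPOSITION[ch])
}

console.log(toBasicLatin('Boże Ciało'))
console.log(toBasicLatin('Ruşen Eşref Ünaydın'))
```

The point is the order of operations: transliterate first, strip non-word characters afterwards. ve() above does it the other way around for every letter missing from its regexps.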

Tuesday, May 25, 2021

Missing charsets in String to FontSet conversion

After upgrading to Fedora 34 I started to get a strange warning when running vintage X11 apps:

$ xclock
Warning: Missing charsets in String to FontSet conversion

With gv(1) it was much worse–multi-line errors, all related to misconfigured fonts. Some errors I was able to fix via

# dnf reinstall xorg-x11-fonts\*

Why exactly the rpm post-install scripts miscarried during the distro upgrade remains unknown. Still, the main warning about charsets persisted.

Most classic X11 apps (gv included) are written with the (now ancient) libXt library. By grepping through the libXt code, I found a function that emits the warning in question. It calls XCreateFontSet(3) & dutifully reports the error, but fails to describe which of the charsets weren't found for a particular font.

A simple patch to libXt:

--- libXt-1.2.0/src/   2021-05-22 00:18:36.359273335 +0300
+++ libXt-1.2.0/src/Converters.c 2021-05-22 00:21:08.550340341 +0300
@@ -973,6 +973,10 @@
"Missing charsets in String to FontSet conversion",
+ fprintf(stderr, "XFontSet fonts: %s\n", fromVal->addr);
+ for (int i = 0; i < missing_charset_count; i++) {
+ fprintf(stderr, " missing charset: %s\n", missing_charset_list[i]);
+ }
if (f != NULL) {
@@ -1006,6 +1009,10 @@
"Missing charsets in String to FontSet conversion",
+ fprintf(stderr, "XFontSet fonts: %s\n", value.addr);
+ for (int i = 0; i < missing_charset_count; i++) {
+ fprintf(stderr, " missing charset: %s\n", missing_charset_list[i]);
+ }
if (f != NULL)
@@ -1030,6 +1036,10 @@
"Missing charsets in String to FontSet conversion",
+ fprintf(stderr, "XFontSet fonts: %s\n", "-*-*-*-R-*-*-*-120-*-*-*-*,*");
+ for (int i = 0; i < missing_charset_count; i++) {
+ fprintf(stderr, " missing charset: %s\n", missing_charset_list[i]);
+ }
if (f != NULL)

gave me some clue:

$ gv
Warning: Missing charsets in String to FontSet conversion
... missing charset: KSC5601.1987-0

What is KSC5601.1987-0? Looks Korean. Why can't XCreateFontSet(3) suddenly find it? I didn't uninstall any fonts during the distro upgrade.

Turns out, the only bitmap font that provided KSC5601.1987-0 charset, daewoo-misc, was removed from xorg-x11-fonts-misc package due to licensing concerns. This is very rude.

It forced me to make a custom rpm package for daewoo-misc fonts. The spec file is here. Notice that I didn't bother to provide a fontconfig configuration (hence the installed font is invisible to Xft), for all I cared was to silence the annoying gv warning.

Saturday, January 30, 2021

Fixing “30 seconds of code”

In the past, the JS portion of 30 seconds of code was a single, big README in a github repo. You can still browse an old revision, of course. It was near perfect for a cursory inspection or a quick search.

In full conformance with the "all that's bright must fade" adage, the README was scrapped for an alternative version that looks like this:

Why, why did they do that?

Thankfully, they put each code "snippet" into a separate .md file (there are 511 of them), which means we can concatenate them into 1 gargantuan file & create a TOC. I thought about the absolute minimum amount of code one would need for that & came up with this:

$ cat Makefile
$(if $(i),,$(error i= param is missing))
out := _out

$(out)/%.html: $(i)/%.md
	@mkdir -p $(dir $@)
	echo '<h2 id="$(title)">$(title)</h2>' > $@
	pandoc $< -t html --no-highlight >> $@

title = $(notdir $(basename $@))

$(out)/30-seconds-of-code.html: template.html $(patsubst $(i)/%.md, $(out)/%.html, $(sort $(wildcard $(i)/*.md)))
	cat $^ > $@
	echo '</main>' >> $@


(i should be a path to a repo directory with .md files, e.g. make -j4 i=~/Downloads/30-seconds-of-code/snippets)

This converts each .md file to its .html counterpart & prepends template.html to the result:

What's in the template file?

  1. a TOC generator that runs once after DOM is ready;
  2. a handler for the <input> element that filters the TOC according to user's input;
  3. CSS for a 2-column layout.

There is nothing interesting about #3, hence I'm skipping it.

Items 1-2 could be accomplished using 3 trivial functions (look Ma, no React!):

$ sed -n '/script/,$p' template.html
<script>
document.addEventListener('DOMContentLoaded', main)

function main() {
    let list = mk_list()
    document.querySelector('#toc input').oninput = evt => {
        render(list, evt.target.value)
    }
    render(list)
}

function render(list, filter) {
    document.querySelector('#toc__list').innerHTML = list(filter).map( v => {
        return `<li><a href="#${v}">${v}</a></li>`
    }).join`\n`
}

function mk_list() {
    let h2s = [...document.querySelectorAll('h2')].map( v => v.innerText)
    return query => {
        return query ? h2s.filter( v => v.toLowerCase().indexOf(query.toLowerCase()) !== -1) : h2s
    }
}
</script>

<nav id="toc"><div><input type="search"><ul id="toc__list"></ul></div></nav>
<main id="doc">

This is all fine & dandy, but 30 seconds of code has many more interesting repos, like snippets of css or reactjs code. They share the lamentable fate of the js one–once kept in a single readme, they have lately converged on a single, badly searchable website that displays 1 recipe per user’s query.

The difference between the css/react snippets & the plain js ones is the need for a preview: if you see a tasty recipe for a “Donut spinner”, you’d like to see how the donut spins before copying the example into your editor.

In such cases, people oft resort to pasting code into one of “Online IDE”s & embedding the result into their tutorial. CodePen, for example, has an even more convenient feature: you create a form (with a POST request) that holds a field with a json-formatted string which contains html/css/js assets. That way you can easily make a button “check this out on codepen”. The downside is that a user leaves your page to play with the code.
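A rough sketch of such a form generator follows. The endpoint URL & the field name "data" are what I recall from CodePen's prefill docs, so treat both as assumptions to verify; codepenForm & escAttr are names made up for this example.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
const escAttr = s => s.replace(/&/g, '&amp;').replace(/"/g, '&quot;')
    .replace(/</g, '&lt;').replace(/>/g, '&gt;')

// snippet: {title, html, css, js}; all of the fields are optional.
// The default endpoint is an assumption, check it against CodePen's docs.
function codepenForm(snippet, endpoint = 'https://codepen.io/pen/define') {
    let data = JSON.stringify({
        title: snippet.title, html: snippet.html, css: snippet.css, js: snippet.js
    })
    return `<form action="${endpoint}" method="POST" target="_blank">` +
        `<input type="hidden" name="data" value="${escAttr(data)}">` +
        `<input type="submit" value="check this out on codepen">` +
        `</form>`
}

console.log(codepenForm({
    title: 'Donut spinner',
    html: '<div class="donut"></div>',
    css: '.donut { border-radius: 50% }'
}))
```

Escaping matters here: the JSON is full of double quotes, & it sits inside an attribute value.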

Another way to show previews alongside the docs is to create an iframe & inject all assets from a snippet into it–in this implementation you don’t rely on 3rd parties & the docs stay fully usable in off-line scenarios (nobody actually needs that, but it sounds useful to have as an option).
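The iframe variant can be sketched as a function that glues the assets of 1 snippet into a standalone document. This is an illustration under my own assumptions (previewDoc is a made-up name, & a snippet is taken to be a plain {html, css, js} object), not the code from any repo mentioned here.

```javascript
// Build a self-contained HTML document out of a snippet's assets.
function previewDoc({ html = '', css = '', js = '' }) {
    // a literal "</script" inside the js would terminate the element early
    let safe_js = js.replace(/<\/script/gi, '<\\/script')
    return `<!doctype html>
<style>${css}</style>
${html}
<script>${safe_js}</script>`
}

// In a browser the result goes straight into an iframe:
//
//   let iframe = document.createElement('iframe')
//   iframe.srcdoc = previewDoc(snippet)
//   document.querySelector('#doc').appendChild(iframe)

console.log(previewDoc({
    html: '<div class="donut"></div>',
    css: '.donut { border-radius: 50% }'
}))
```

Using srcdoc keeps everything on the page: no network requests & no 3rd party to rely on.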

This requires greatly expanding the examples above: either we need 3 separate templates: one for js snippets, some other for css recipes & a disheartening one for reactjs chunks; or we force a single template to act differently depending on the payload content.

For the latter approach, see this repo.

Wednesday, January 6, 2021

Twitter stats using gnuplot, json & make

Twitter lets you download a subset of a user's activities as a zip archive. Unfortunately, there are no useful visualisations of the provided data, except for a simple list of tweets with date filtering.

For example, here is what I expected to find, though there were no signs of it:

  1. a graph of activities over time;
  2. a list of:
    1. the most popular tweets;
    2. users, to whom I reply the most.

Inside the archive there is data/tweet.js file that contains an array (assigned to a global variable) of "tweet" objects:

window.YTD.tweet.part0 = [ {
  "tweet" : {
    "retweeted" : false,
    "source" : "<a href=\"\" rel=\"nofollow\">Twitter Web Client</a>",
    "favorite_count" : "2",
    "id" : "12345",
    "created_at" : "Sat Jun 23 16:52:42 +0000 2012",
    "full_text" : "hello",
    "lang" : "en",
    ...
  }
}, ...]

The array is already json-formatted, hence it's trivial to convert it to a proper json for filtering with json(1) tool.
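The same conversion in plain JavaScript is a one-liner: drop everything up to & including the first "=", then parse the rest. A sketch, where parseTweetJs is a hypothetical helper & the sample string is made up:

```javascript
// tweet.js is "window.YTD.tweet.part0 = <json array>"; everything after
// the first "=" is expected to be plain JSON.
function parseTweetJs(text) {
    return JSON.parse(text.slice(text.indexOf('=') + 1))
}

let sample = 'window.YTD.tweet.part0 = [ {"tweet": {"id": "12345", "lang": "en"}} ]'
console.log(parseTweetJs(sample)[0].tweet.lang)
```

It relies on the assignment being the first "=" in the file; any later "=" lives inside a json string & is left alone.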

Say we want a list of the top 5 languages in which tweets were written. A small makefile:

$ cat
lang: tweets.json
json -a tweet.lang < $< | $(aggregate) | $(sort)
tweets.json: $(i)
unzip -qc $< data/tweet.js | sed 1d | cat <(echo [{) - > $@

aggregate = awk '{r[$$0] += 1} END {for (k in r) print k, r[k]}'
sort = sort -k2 -n | column -t
SHELL := bash -o pipefail

yields:

$ make -f | tail -5
cs 16
und 286
ru 333
en 460
uk 1075

( is the archive that Twitter permits us to download.)

To draw activity bars, the same technique is applied: we extract a date from each tweet object & aggregate the results by day:

2020-12-31 5
2021-01-03 10
2021-01-04 5
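The extract-&-aggregate step can also be sketched in plain JavaScript. activityByDay is a name invented for this illustration, & it deliberately uses UTC getters so that the result doesn't depend on the local timezone; it returns "date count" lines of the same shape as the list above.

```javascript
// tweets: an array of {tweet: {created_at: "Sat Jun 23 16:52:42 +0000 2012"}}
// Returns sorted "YYYY-MM-DD count" strings, 1 per day with any activity.
function activityByDay(tweets) {
    let pad = n => String(n).padStart(2, '0')
    let counts = new Map()
    for (let { tweet } of tweets) {
        let d = new Date(tweet.created_at)
        let day = [d.getUTCFullYear(), pad(d.getUTCMonth() + 1), pad(d.getUTCDate())].join('-')
        counts.set(day, (counts.get(day) || 0) + 1)
    }
    return [...counts]
        .sort(([a], [b]) => a < b ? -1 : a > b ? 1 : 0)
        .map(([day, n]) => `${day} ${n}`)
}

console.log(activityByDay([
    { tweet: { created_at: 'Sat Jun 23 16:52:42 +0000 2012' } },
    { tweet: { created_at: 'Sat Jun 23 01:00:00 +0000 2012' } },
]).join('\n'))
```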

This can be fed to gnuplot:

$ make -f activity.svg

This makefile has an embedded gnuplot script:

$ cat

%.svg: dates.txt
cat <(echo "$$plotscript") $< | gnuplot - > $@

dates.txt: tweets.json
json -e 'd = new Date(this.tweet.created_at); p = s => ("0"+s).slice(-2); = [d.getFullYear(), p(d.getMonth()+1), p(d.getDate())].join`-`' -a < $< | $(aggregate) > $@

export define plotscript =
set term svg background "white"
set grid

set xdata time
set timefmt "%Y-%m-%d"
set format x "%Y-%m"

set xtics rotate by 60 right

set style fill solid
set boxwidth 1

plot "-" using 1:2 with boxes title ""
endef

Listing the users to whom one replies the most is quite simple:

$ cat
users: tweets.json
json -e 'this.users = v => v.screen_name).join`\n`' -a users < $< | $(aggregate) | $(sort)


I'm not much of a tweeter:

$ make -f | tail -5
<redacted> 41
<redacted> 49
<redacted> 60
<redacted> 210
<redacted> 656

Printing the most popular tweets is more cumbersome. We need to:

  1. calculate the rating of each tweet (by such a complex formula as favorite_count + retweet_count);
  2. sort all the tweet objects;
  3. slice N tweet objects.
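
The 3 steps above, sketched in plain JavaScript rather than as a Make recipe. topTweets is a hypothetical helper; it assumes the counters arrive as strings, as in the tweet.js sample.

```javascript
// rating = favorite_count + retweet_count; both are strings in the archive
let rating = t => Number(t.favorite_count || 0) + Number(t.retweet_count || 0)

// tweets: an array of {tweet: {...}}; returns the n best-rated tweet objects
function topTweets(tweets, n) {
    return tweets.map( v => v.tweet)
        .sort((a, b) => rating(b) - rating(a))  // the highest rating first
        .slice(0, n)
}

console.log(topTweets([
    { tweet: { id: '1', favorite_count: '2', retweet_count: '0' } },
    { tweet: { id: '2', favorite_count: '10', retweet_count: '5' } },
], 1))
```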

A Make recipe for it is a little too long to show here, but you can grab a makefile that contains the recipe + all the recipes shown above.

Friday, December 11, 2020

Making high-resolution screenshots of Emacs frames

Emacs 27.1 can utilise the Cairo drawing backend to take screenshots of itself via the x-export-frames function. Unfortunately, the bare-bones function is all we have here–there's no UI to it. Moreover, it doesn't support bitmap fonts, which means if you still use, say, Terminus, you get garbage in the output.

I wanted to share a screenshot of an Emacs frame on twitter. Twitter doesn't accept SVGs, for net income of $1.47bn isn't enough to support such a complex thing. The best way to obtain an arbitrary high-resolution png is to get it from a vector image. I found that postscript->png gives the best results & requires only ghostscript installed.

(defun my--screenshot-png(out)
  "Save a screenshot of the current frame as a png file. Requires ghostscript."
  (let ((ps (concat out ".tmp")))
    (my--screenshot ps 'postscript)
    (call-process "gs" nil (get-buffer-create "*Shell Command Output*") nil
                  "-sDEVICE=png16m" "-dBATCH" "-dNOPAUSE"
                  "-r300" "-dTextAlphaBits=4" "-dGraphicsAlphaBits=4"
                  (concat "-sOutputFile=" out) ps)
    (delete-file ps)))

We use 300 dpi here to render a png. The my--screenshot function below temporarily changes the frame font to Inconsolata:

(defun my--screenshot(out format)
  (let ((fontdef (face-attribute 'default :font)))
    (set-frame-font "Inconsolata 10")
    (with-temp-file out
      (insert (x-export-frames nil format)))
    (set-frame-font fontdef)))

The last bit left is to provide a prompt for a user where to save the screenshot:

(defun my-ss()
  "Save a screenshot of the current frame in a file"
  (interactive)
  (let* ((out (expand-file-name (read-file-name "Output file name: ")))
         (ext (file-name-extension out)))
    (cond
     ((equal "png" ext)
      (my--screenshot-png out))
     ((equal "ps" ext)
      (my--screenshot out 'postscript))
     (t
      (my--screenshot out (intern ext))))))


M-x my-ss<RET>
Output file name: ~/Downloads/1.png<RET>

The physical image size here is 3133x3642.

How I read newsgroups in mutt

It took me a long time, but I've finally removed inn+newsstar from my machine. I don't use patches that add NNTP support to mutt any more. Yet, I still cannot put gmane away, for it's much more convenient to read mailing lists as newsgroups.

What if I just fetch the last N posts from newsgroup foo.a & save them in an mbox file for viewing in mutt later on? Then I can do the same for newsgroups foo.b, foo.c, & so on.

How do I fetch? Turns out, there is a nice CLI NNTP client already, called sinntp. The following command downloads fresh articles from comp.lang.c into a conspicuously named mbox file comp.lang.c:

$ sinntp pull --server comp.lang.c

If you run it again, it won't re-download the same articles, for it saves the reported high water mark in ~/.local/share/sinntp/ file.

This only solves a problem for 1 news server & 1 newsgroup. I read multitudes of them; should I write a simple shell script then? If you follow this blog, you may have noticed I try not to employ shell scripts but write makefiles instead.

$ cat ~/.config/nntp2mbox/

This states that I want to grab articles from comp.lang.c newsgroup from news server.

$ cat ~/.config/nntp2mbox/

In this example, the server name is & 1 of the newsgroups is commented out.

There's no more configuration, everything else is done by nntp2mbox makefile:

#!/usr/bin/make -f

# the number of articles to pull
limit := 500
g :=

conf := $(or $(XDG_CONFIG_HOME),~/.config)/nntp2mbox
servers := $(wildcard $(conf)/*.conf)
self := $(lastword $(MAKEFILE_LIST))

all: $(servers:%.conf=%.server)

# read a list of newsgroups & run Make for each newsgroup
%.server: %.conf
	awk '!/^#/ {print $$1 ".newsgroup"}' $< | grep $(call se,$(g)) | xargs -r $(make) -Bk server=$(notdir $(basename $<))

%.newsgroup:
	sinntp pull --server $(server) --limit $(limit) $*

make = $(MAKE) --no-print-directory -f $(self)
se = '$(subst ','\'',$1)'

The following command downloads fresh articles from all the newsgroups (from all the news servers above) to the current directory:

$ nntp2mbox
awk '!/^#/ {print $1 ".newsgroup"}' /home/alex/.config/nntp2mbox/localhost.conf | grep '' | xargs -r /usr/bin/make --no-print-directory -f /home/alex/bin/nntp2mbox -Bk server=localhost
awk '!/^#/ {print $1 ".newsgroup"}' /home/alex/.config/nntp2mbox/ | grep '' | xargs -r /usr/bin/make --no-print-directory -f /home/alex/bin/nntp2mbox -Bk
sinntp pull --server --limit 500 comp.lang.c
awk '!/^#/ {print $1 ".newsgroup"}' /home/alex/.config/nntp2mbox/ | grep '' | xargs -r /usr/bin/make --no-print-directory -f /home/alex/bin/nntp2mbox -Bk
sinntp pull --server --limit 500 gmane.comp.gnu.make.devel
sinntp pull --server --limit 500 gmane.comp.gnu.make.general
sinntp pull --server --limit 500 gmane.comp.window-managers.fvwm

(Yes, it invokes Make recursively, which is a big no-no in many Make circles.)

$ ls
comp.lang.c gmane.comp.gnu.make.general
gmane.comp.gnu.make.devel gmane.comp.window-managers.fvwm

It even supports filtering by a newsgroup name:

$ nntp2mbox g=fvwm

I don't actually read comp.lang.c. If there's anything sane left in comp.* hierarchy, please let me know.

Thursday, December 10, 2020

Reading the Emacs User Survey 2020 Results

More than a month ago some guy made a survey of emacs users. A couple of days ago, he released the results along with the raw data.

After importing Emacs-User-Survey-2020-clean.csv into sqlite (7,344 rows), the first thing I checked was if someone had mentioned any of my emacs packages &, I kid you not, I got 9 hits for wordnut! Yipee!

Then I started filtering by the "For how many years have you been using Emacs?" column. The number of matched old-timers was staggering (I expected to find next to none):

  • >= 20 years: 1,497 rows
  • >= 15: 2,058
  • >= 10: 2,975

Here's a tiny portion of interesting/hilarious entries:

while learning
42 Finnish Ispell usenet No cursor keys on ADM video terminal in 1978

God knows what an ADM terminal is & where he got one in Finland.

while learning
41 When I got booted into TECO, I was like WTF is this?? Did my modem disconnect? brain transplant
Be written in Common Lisp
41 I often got stuck inside multiple ^R recursive edits but once I understood it was because of mini buffer exits I was ok. Nowadays it's less of an issue.
40 Get rid of "kill" from nomenclature, commands, etc. Why "kill-emacs" instead of "exit-emacs"? (I remapped that one decades ago; maybe it's changed in the district but my command still works). I dislike violence and wish kill buffers were no so named. This may seem minor, but if you don't think that language matters, perhaps you haven't heard the rants of the USA's current president.
40 Stop changing behaviors of next-line, search, etc.
39 Emacs' TECO macros were impenetrable. Nothing comes to mind. I drank the Kool-Aid a loong time ago.
35 stop styling my text with weird font lock crap. gets increasingly hard to turn off
like to see vector drawing and variable width fonts in core
35 not really… the only difficulty is it wasn't on all the machines I used. I first used it on a VAX 11/780 having access to a bootleg hacked version crafted to run on the VAX under VMS. The guys that created that version were cad guys in DEC (AI CAD group) and I was lucky to have it… rather than being stuck with EDT.
But then I started using an Apollo and lost it. So I tried to write my own Emacs. from scratch. I was sort of successful but who has that much extra time… then came the Sun and unix and the HP 9000 and then came Linux (finally) and I had Emacs most everywhere.
35 The learning curve is steep, but quick. I got nuthin.
34 my init file quickly became scrambled up because I pasted in code that I didn't understand nor know how to organize why did I have to figure out how to correctly compile emacs 27 on the latest ubuntu? Why wasn't a package immediately available on all OSs when 27.1 was released?

You'd think that after 34 years of using Emacs, one would be able to discern Emacs maintainers from maintainers of the emacs package in their Linux distro, but no.

while learning
30 I liked it in the old days when, when you ran a repeating macro you would see all the changes zipping through on the screen.
29 Coming from Glosling Emacs, I didn't understand why ^T was so broken in GNU Emacs. In all honesty, when there's a new version of emacs I mostly spend time figuring out how to turn new abominations off or put them back to how they should be.
It would be a big improvement for me if Emacs stuck to text and didn't try to do things with images, tables etc. When I paste anything into Emacs, it should either turn into text or fail.
And more speed is always welcome.
27 This is 27 years ago and at that time everybody considered it cool to know emacs. So, no. Not really… I am professor in a computer science department and teach students. I provide for colleagues and friends a heavily adapted version of emacs that (I believe) is more user-friendly. Nevertheless, it is sad to see that students are not even interested anymore to learn emacs. So, I think the most pressing need is to have a simplified user interface that adheres to the usual standards (similar to what ergoemacs is trying to achieve). All basic functionality should be on menus, keyboard shortcuts should only be an add-on for power users.
26 Everything was hard. Copy paste. Saving files. The UI sort of sucks
25 Terrible, useless documentation
25 I don't understand elisp Responsiveness of the interface: Emacs sometimes feel slow.
23 More than 23 years ago, it was mandatory at my university. Key bindings and the like felt very alien.

What some people think about poor Richard Stallman:

  • RMS should resign so that politics stops guiding Emacs development. It is a tragedy that a great editor continues to be crippled because technical decisions are made for outdated ideological reasons. I would love to contribute but Emacs development is extremely hostile to any non-purist views.
  • Stop letting RMS block good ideas.
  • There appears to be a split between the core developers and the "package" developers. I am confused by the role that RMS still plays in Emacs stewardship, and puzzled that he is not familiar with org-mode.
  • RMS's computing habits are so completely beyond what's normal that he has no idea what modern users want in an editor. If you want emacs to be popular you have to ACTUALLY LISTEN TO FEEDBACK FROM NEW USERS instead of a bunch of greybeards going "oh well emacs is fine for me".
  • I've considered dropping emacs altogether a few times because of RMS's behavior. The one thing I would like emacs to do is to stop having any affiliation with him.
  • Ignore RMS's opinions going forward.
  • … from reading the exchanges on the mailing list and especially RMS' opposition to anything "newfangled" has discouraged me from even trying to contribute to the core.

Comments about the survey itself (I feel sorry for the guy who organized it):

  • I've refused to answer surveys that require proprietary JavaScript before. It's unacceptable for a community survey to demand cooperation with a corporation. I wouldn't've answered this survey were mailing this not an option. Of course, I'd issues sending this response, as I learned what server lay underneath the EMACSSURVEY.ORG domain MX records. It would be better if the SMTP servers were run differently, even by another business than that.
  • The last question on this page [What is the default keybinding to find a file?] stinks to high heaven. I don't know the answer, because my fingers do. But the real reason the last question stinks is that some doofus decided that answers placed there can only be some short number of characters, so I have to put my "I donoknow but mynfingers do" answer here instead of in that questions answer field. So I put a "nonsensical yet accurate" answer there.
  • Death to vi!
  • You are not enough experienced and whole survey is a joke with already set purposes, which we will find later.
    Would you be experienced you would know HTML, no Javascript is required. Would you be experienced, you would know who to hire, and not just linking to third party servers, thus exposing free software users to proprietary Javascript.
    Finally you are exposing their information to third party server which cannot be trusted.
    It is easy to edit few HTML elements and it would be to accept it over CGI and store in the database. I have rewritten the basic Perl form.cgi so many times for myself before 15+ years, and later wrote it for myself in Common Lisp, and I just wait for few free time to rewrite it in Emacs Lisp. All what you need is emacs CGI package and Emacs to prepare HTML.
    But I guess you are not getting what I am speaking about.
  • RMS did nothing wrong.
  • Is JotForm Free Software?
  • I disagree in the way the survey has been released without the emacs mantainers.
  • I had to disable no-script, so I'm angry.

On a serious note, if you'd like to read what newbies really think of Emacs, filter "For how many years have you been using Emacs?" by 0, although it'll take a great deal of time (533 rows to examine).

Tuesday, October 27, 2020


What do you do when you need to add a formula to an epub? Most epub readers don't support MathML yet, hence you resort to making SVGs via mathjax-node-cli. Then you test the epub in several über-popular readers to discover that only Kindle & Google Play Books render such SVGs correctly; the rest either lose all the characters in equations (KOReader) or just draw sad little boxes in place of the images (Moon+ Reader).

How do you produce PNGs then? In the past, mathjax-node had an option of a png export, but it has been deprecated.

There's a way to do it w/ pdflatex: (1) generate a pdf w/ the help of texlive-standalone package, (2) convert the pdf to a png.

This doesn't sound complicated & it's not complicated, but there are no helpful wrappers available, and if you want to integrate the tex→png process into your build pipeline, prepare to deal w/ the usual aux/log rubbish that any TeX program leaves around.

Here's a makefile that does the conversion:

#!/usr/bin/make -f

$(if $(f),,$(error "Usage: tex2png f='E=mc^2' output.png"))
dpi := 600
devnull := $(if $(findstring s, $(MAKEFLAGS)),> /dev/null)

%.pdf:
	pdflatex -jobname "$(basename $@)" -interaction=batchmode '\nofiles\documentclass[border=0.2pt]{standalone}\usepackage{amsmath}\usepackage{varwidth}\begin{document}\begin{varwidth}{\linewidth}\[ '$(call se,$(f))' \]\end{varwidth}\end{document}' $(devnull)
	@rm -f "$(basename $@).log"

%.png: %.pdf
gs -sDEVICE=pngalpha -dQUIET -dBATCH -dNOPAUSE -r$(dpi) -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -sOutputFile="$@" "$<"

se = '$(subst ','\'',$1)'

It automatically removes all intermediate files; iff you mistype a formula it saves a .log file to peruse.

For example, render Parkinson's coefficient of inefficiency (published in ~1957):

$ ./tex2png -s output.png f='x = \frac{m^{o}(a-d)}{y + p\sqrt{b}}'

(x = the number of members effectively present at the moment when the efficient working of a committee has become manifestly impossible; m = the average number of members actually present; o = the number of members influenced by outside pressure groups; a = the average age of the members; d = the distance in cm between the two members who are seated farthest from each other; y = the number of years since the cabinet or committee was first formed; p = the patience of the chairman, as measured on the Peabody scale; b = the average blood pressure of the three oldest members, taken shortly before the time of meeting.)

Thursday, August 27, 2020

Steganography with zip archives

The elegance of CVE-2020-1464 comes from the internal structure of the Zip file format. While many other archive formats, like Microsoft Cab, put an index of the compressed files at the beginning of an archive, zip archivers place it at the end of a file.

The reason is historical: apparently, in 1989 disk drives were so slow that adding a new blob to an existing file & appending a new index to it was cheaper than copying chunks of the original archive to a new file.

The CVE reminded me of an old joke of hiding a .zip in a .jpg. When you append a .zip to an image file, the recipient of the jpeg doesn't necessarily notice any junk in the image, but if you know about such a 'hidden' part, any ordinary unzip tool is able to extract it.

This got me thinking: can we hide a file inside of a .zip? BlackHat Europe 2010 had a talk about steganography in popular archive formats. In one of the described tricks, carefully inserting a blob before a zip index makes it invisible to all common unpackers.

To verify this claim, I wrote a couple of small Ruby scripts that inject & extract a 'hidden' blob. The approach works: Windows Explorer, 7-Zip, WinRAR, bsdtar(1), unzip(1) didn't see anything unusual. Even in the extreme cases like:

$ du -h

$ bsdtar ftv
-rw-r--r-- 0 1000 100 1 Aug 25 21:58 q

that certainly may look unusual to an innocent user–a 4 gigabyte archive that unpacks into an exactly 1 byte file! The opposite of a zip bomb.

A Zip index is formally termed central directory. It consists of 2 main parts: ① central directory headers (CHDs) & ② end of central directory (EOCD) record. A CHD contains metadata about a particular file, EOCD–metadata about the index itself 1:

require 'bindata'

class Eocd < BinData::Record
  endian :little

  uint32 :signature, asserted_value: 0x06054b50
  uint16 :disk
  uint16 :disk_cd_start
  uint16 :cd_entries_disk
  uint16 :cd_entries_total
  uint32 :cd_size
  uint32 :cd_offset_start
  uint16 :comment_len
  string :comment, :read_length => :comment_len,
         onlyif: -> { comment_len.nonzero? }
end

The thing of interest here is cd_offset_start (officially called offset of start of central directory 2), a 4-byte value that gives the position of the first CHD, counted in bytes from the start of the archive.

Therefore, after inserting a blob, we need to update cd_offset_start, otherwise the zip file becomes broken.
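That bookkeeping can be sketched w/ nothing but the Ruby stdlib. This is not the scripts mentioned above, only an illustration: the names eocd_position, cd_offset_start & inject are made up for the example, and it assumes a single-disk, non-Zip64 archive whose comment doesn't happen to contain the EOCD signature.

```ruby
# Hypothetical helpers (stdlib only). Assumes a single-disk, non-Zip64
# archive whose comment doesn't contain the EOCD signature.
EOCD_SIGNATURE  = [0x06054b50].pack('V') # the bytes "PK\x05\x06"
EOCD_MIN_SIZE   = 22                     # EOCD w/o a comment
CD_OFFSET_FIELD = 16                     # position of cd_offset_start inside EOCD

# Byte position of the EOCD record, searched for from the end of the data.
def eocd_position(zip)
  pos = zip.b.rindex(EOCD_SIGNATURE)
  raise ArgumentError, 'EOCD record not found' if pos.nil? || zip.bytesize - pos < EOCD_MIN_SIZE
  pos
end

# Current value of cd_offset_start (a little-endian uint32).
def cd_offset_start(zip)
  zip.b.byteslice(eocd_position(zip) + CD_OFFSET_FIELD, 4).unpack1('V')
end

# Returns a copy of zip with blob placed right before the central directory
# & cd_offset_start increased by the size of the blob.
def inject(zip, blob)
  zip = zip.b
  blob = blob.b
  offset = cd_offset_start(zip)
  eocd = eocd_position(zip)
  out = zip.byteslice(0, offset) + blob + zip.byteslice(offset, zip.bytesize - offset)
  # the EOCD record has moved forward by blob.bytesize bytes
  out[eocd + blob.bytesize + CD_OFFSET_FIELD, 4] = [offset + blob.bytesize].pack('V')
  out
end
```

Offsets of local file headers stored in CHDs are counted from the start of the archive, & the blob goes after all file data, so in this sketch cd_offset_start is the only field that has to change.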

Just because a user has no clue about the hidden blob whatsoever doesn't mean specialized tools won't notice it. Say, we have an archive w/ 2 text files:

$ bsdtar ft
The Celebrated Jumping Frog of Calaveras County.txt
What You Want.txt

We inject a .png image to it:

$ zipography-inject blob1.png >

Whilst bsdtar is still none the wiser:

$ bsdtar ft
The Celebrated Jumping Frog of Calaveras County.txt
What You Want.txt

Hachoir correctly recognises it as an unparsed block:

  1. This is a DSL from BinData package that provides a declarative way to read/write structured binary data in Ruby.↩︎

  2. Field names in PKWARE's spec are quite verbose.↩︎

Wednesday, July 22, 2020

How to build Ruby in Windows natively without WSL, MSYS2 or Cygwin

Every Ruby release tarball contains a file win32/README.win32. If you decide to distribute Ruby alongside your Windows app, you can either struggle with the instructions from that file or use MSYS2 (== the modern RubyInstaller). In the past there was the Ruby-mswin32 project with an uninspiring motto The forever war against Windows ;-( [sic], but it has died of neglect.

When you ask anyone knowledgeable about compiling Ruby under Windows, they oft (always?) say it's unbearably difficult to get right. On hearing that, I, of course, knew I was destined to repeat the endeavour.

If you're going to do that blindly by installing VS2019 (the 'Community' edition, which is supposedly free) & by following the steps in win32/README.win32, you'll most probably come through, but end up with a crippled Ruby variant that has no openssl support whatsoever & hence you cannot run the gem command. Smashing.

After wasting time on that I searched for a binary version of openssl suitable for the VS, recompiled Ruby to make sure rubygems was working & decided that the process was indeed getting mighty wearisome.

Turns out, it can be simplified.

At the time of writing, there's exactly 1 post on the interwebs about this topic by some Japanese guy on a Japanese knowledge community platform in Japanese. Instead of building/finding dependencies manually (we need at least 3 of them: openssl, readline & zlib) we can employ vcpkg for that job.


  1. Install VS2019.

  2. Clone the vcpkg repo (say, to D:\opt\s\vcpkg).

  3. Open x64 Native Tools Command Prompt for VS 2019.

  4. Run bootstrap-vcpkg.bat inside the cloned vcpkg repo directory.

  5. Download & compile the dependencies:

     > vcpkg --triplet x64-windows install libxml2 libxslt openssl readline zlib
  6. Set 3 env variables:

     > set PATH=%PATH%;D:\opt\s\vcpkg\installed\x64-windows\bin
     > set INCLUDE=%INCLUDE%;D:\opt\s\vcpkg\installed\x64-windows\include
     > set LIB=%LIB%;D:\opt\s\vcpkg\installed\x64-windows\lib
  7. cd to the unpacked Ruby src directory & type:

     > win32\configure.bat --prefix=d:\opt\s\ruby

    then nmake & nmake install.

If you did everything correctly, even irb should work:

> irb -rfiddle -rfiddle/import
irb(main):001:1* module User32
irb(main):002:1* extend Fiddle::Importer
irb(main):003:1* dlload 'user32'
irb(main):004:1* extern 'int MessageBoxA(int, char*, char*, int)'
irb(main):005:0> end
=> #<Fiddle::Function:0x000000000676fbc8 ...>
irb(main):006:0> User32::MessageBoxA 0, RUBY_DESCRIPTION, "", 0
=> 1

By today's standards, the resulting full Ruby installation is pleasantly small:

$ du -shc vcpkg/installed/x64-windows/bin ruby/{bin,lib} --exclude '*.pdb'
7.0M    vcpkg/installed/x64-windows/bin
2.5M    ruby/bin
35M     ruby/lib
45M     total

Saturday, April 18, 2020

Custom GIMP UI font size in Windows

You'll need to create a new theme from an existing one. Here's an example for GIMP installed with scoop:

$ scoop list | grep gimp
gimp 2.10.18 [extras]
$ cd /cygdrive/c/Users/alex/scoop/apps/gimp/current/share/gimp/2.0/themes
$ diff -ur System MySystem
diff -ur System/gtkrc MySystem/gtkrc
--- System/gtkrc 2019-06-14 00:15:20.000000000 +0300
+++ MySystem/gtkrc 2020-04-18 17:55:00.549251800 +0300
@@ -38,7 +38,7 @@

# Uncommenting this line allows to set a different font for GIMP.
-# font_name = "sans 10"
+ font_name = "segoe ui 12"

GtkPaned::handle-size = 6
GimpDockWindow::default-height = 300
@@ -104,3 +104,9 @@

widget "*GimpDisplayShell.*" style "gimp-display-style"
+style "my-menu-font"
+{
+ font_name="segoe ui 12"
+}
+widget_class "*Menu*" style "my-menu-font"

Then start GIMP & open Edit → Preferences → Interface → Theme.



Wednesday, March 4, 2020

Open a url selected from anywhere on your desktop

I'm sure something like this exists in gnome/kde, but there is nothing for fvwm.

The idea is quite simple: you select a url in a text editor, a terminal emulator or whatever, press a kbd shortcut & your default browser opens a new tab w/ it.

$ cat xprimary-xdg-open
# open up to 5 urls from the 1st 10K of the current selection via xdg-open

alert() { xmessage -center -button ok -default ok -timeout 2 "$*"; }

idx=0; for line in `xsel | head -c 10240 | tr '[:space:]' '\n' | egrep -a "$uri" | head -5 | sort -u`; do
    xdg-open "$line" || alert Failed to xdg-open \`"$line"\` &
    idx=$((idx+1))
done

[ $idx -eq 0 ] && alert The PRIMARY selection contains no urls!

(It requires xsel, xorg-x11-apps & xdg-utils Fedora packages.)

Why so long a script if something like "xsel | xargs xdg-open" should suffice?

  • it reports if a selection was empty
  • it reports an error if xdg-open was unable to open a browser
  • you can open multiple urls concurrently
  • there is some protection against junk in the selection


$ head -c $((1024*1024)) < /dev/urandom | xsel
$ ./xprimary-xdg-open

& you get a gui error message:

instead of "omg, what was that" on the stderr. xmessage draws not exactly the prettiest dialog boxes, but who cares.

The last "important" question here is what shortcut to choose? I decided upon Win-Shift-C, for in the past it opened the nonsensical "charms" menu in Windows (does nothing in 1909) & wretched Linux desktops, of course, have no charms.

For ~/.fvwm/.fvwm2rc:

Key c A S4 Exec exec xprimary-xdg-open

Sunday, February 9, 2020

Building an rpm in the current directory without any build environment

Say you have a proper .spec file, e.g. from a Fedora rpm repo. The repo contains the latest patches for the package but you won't get them via dnf this week b/c of the ~slow testing process. Or you just want to apply a small fix to a package.

How to do this quickly w/o spending hours on a correct "build environment", "infrastructure", w/o re-reading Maximum RPM book & the wearisome packaging guidelines?

Say we want to amend the flite Fedora package. At the time of writing, its .spec file describes not only a prehistoric version of the program but a configuration w/ the most robotic voices possible. Let's at least fix the latter.

Clone its repo and apply the following patch to it:

diff --git a/flite.spec b/flite.spec
index dc6dd28..ccf8251 100644
--- a/flite.spec
+++ b/flite.spec
@@ -3,2 +3,3 @@ Version: 1.3
Release: 35%{?dist}
+Epoch: 1000
Summary: Small, fast speech synthesis engine (text-to-speech)
@@ -52,3 +53,3 @@ cp -p %{SOURCE1} .
autoreconf -vif
-%configure --enable-shared --with-audio=alsa
+%configure --enable-shared --with-audio=alsa --with-vox=cmu_us_kal16
# This package fails parallel make (thus cannot be built using "_smp_flags")

To build an rpm, cd to the repo directory & run:

$ rpmbuild --load ~/lib/macros.spec -bb flite.spec
$ find -name \*rpm

That's it. It even automatically fetches a source tarball.

The only missing part is a mysterious ~/lib/macros.spec:

%_topdir                %(pwd)/_out
%_sourcedir %(pwd)
%debug_package %nil
%_disable_source_fetch %nil

That file is your whole "rpm build environment".

Wednesday, January 15, 2020

Automatic static assets dependency discovery

In the past when I wanted to grab static files from node_modules directory I'd script it in a makefile like this:

vendor.src := foo/bar.js foo/bar.css
vendor.dest := $(addprefix $(out)/vendor/, $(vendor.src))
$(out)/vendor/%: node_modules/%; $(copy)

define copy =
@mkdir -p $(dir $@)
cp $< $@
endef

Then in some index.html I'd have:

<script src="vendor/foo/bar.js"></script>

& so on. This worked fine, but required a manual sync b/w the html file & its dependencies.

Then it dawned upon me (slowpoke.webp) that the value of vendor.src variable could be discovered automatically; all we need here is a decent html parser that is usable from the command line.

We can use nokogiri or a cli for cheerio library (disclaimer).

vendor.src := $(shell adieu -pe '$$("link,script").map((_,e) => $$(e).attr("href") || $$(e).attr("src")).get().filter(v => /node_modules/.test(v)).join`\n`' src/index.html)
vendor.dest := $(addprefix $(out)/, $(vendor.src))
$(out)/node_modules/%: node_modules/%; $(copy)

vendor.src doesn't look very pretty any longer but now I can edit my index.html w/o worrying about the makefile.

Wednesday, April 10, 2019


Starting with version 73, Chrome has switched the required package format for extensions to crx3. Why a new file format?

Date: Fri, 20 Jan 2017 17:10:24 -0800
From: Joshua Pawlicki <>
Subject: Intent to Implement CRX₃
Message-ID: <>

[...] Chrome extensions are currently packaged for installation/update
as signed zip files called CRX₂ files, using SHA1withRSA for the
signature algorithm. Many of the RSA keys used to sign the files are
insufficiently secure (too short). The CRX₂ format does not allow for
algorithm rotation, key rotation, or multiple proofs. The goal is to
address these issues and leave the door open to future improvements. [...]

Chrome even lets you create a .crx file from the command line (e.g., in Linux: google-chrome --pack-extension=ext-dir --pack-extension-key=file.pem).

What if you'd like to create .crx files on a server that doesn't have Chrome installed?

A brief intro to Crx3

The crx2 file format was very simple: you made an sha1 of a zip file, signed it with an RSA private key & prepended the public key & the signature to the zip archive.

Crx3 prepends a protobuf that can contain an unlimited number of public_key+signature tuples (also called proofs). If you create a .crx file by yourself for a Linux version of Chrome, only 1 proof is required. Extensions from the Chrome Web Store incorporate multiple proofs.

Switching to protobufs also means you need a proper protobuf parser to be able to read the new file format.

crx3 file format diagram

To see this in action, let's create a noop extension that consists only of a manifest.json file.

$ cat manifest.json
{
  "manifest_version": 2,
  "name": "foo",
  "version": "1.2.3"
}
$ zip manifest.json
adding: manifest.json (deflated 25%)

Create an RSA private key in the PEM format:

$ openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out private.pem

Then do npm -g i crx3-utils (a standard disclaimer) & make a .crx:

$ crx3-new private.pem < > foo.crx
$ file foo.crx
foo.crx: Google Chrome extension, version 3

You can now drop it into chrome://extensions/ page & Chrome should happily accept it.

To see what's inside the foo.crx, run:

$ crx3-info < foo.crx
id jnedgebbcnmoemphjanchkhfkjjhmael
header 593
payload 231
sha256_with_rsa 1 main_idx=0
sha256_with_ecdsa 0

payload is the size of the original zip archive.
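The 2 sizes can be double-checked by hand. A minimal sketch, assuming the usual crx3 layout (a 4-byte magic Cr24, a little-endian uint32 version, a little-endian uint32 length of the protobuf header, the header itself, then the zip archive) & assuming that header in the output above is that length; crx3_sizes is a made-up name, not a part of crx3-utils:

```ruby
# Hypothetical helper; assumes the layout
#   Cr24 | version | header length | protobuf header | zip archive
# where version & header length are little-endian uint32 values.
def crx3_sizes(crx)
  crx = crx.b
  magic, version, header_size = crx.unpack('a4VV')
  raise ArgumentError, 'not a crx3 file' unless magic == 'Cr24' && version == 3
  # 12 = 4 bytes of magic + 4 bytes of version + 4 bytes of header length
  { header: header_size, payload: crx.bytesize - 12 - header_size }
end

# Usage: p crx3_sizes(File.binread('foo.crx'))
```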

id is the extension id that was calculated during the crx file creation. Here, the public key, from which the calculation was done, is in sha256_with_rsa list (along with a signature, both in a tuple under the index 0). If we add another tuple (proof) to the crx file, this time using a different private key, the id won't change, for it's permanently saved in SignedData protobuf structure (see the diagram above). This is very different from crx2, where you had to extract a public key first & then calculate the id from it.
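For reference, the calculation itself is short. A sketch under the commonly documented rule for Chrome extension ids (the first 128 bits of the SHA-256 of the DER-encoded public key, written in hex w/ the digits 0-9a-f replaced by the letters a-p); crx_id is a made-up name & this is not necessarily how crx3-utils does it:

```ruby
require 'digest'

# Hypothetical helper: the first 128 bits of the SHA-256 of a DER-encoded
# public key, w/ each hex digit (0-9a-f) mapped to a letter (a-p).
def crx_id(public_key_der)
  Digest::SHA256.hexdigest(public_key_der)[0, 32].tr('0-9a-f', 'a-p')
end

# Usage w/ the key generated above (requires 'openssl'):
#   key = OpenSSL::PKey::RSA.new(File.read('private.pem'))
#   puts crx_id(key.public_key.to_der)
```

The result is always 32 letters long, the same shape as the id line printed by crx3-info.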

In crx2, the signature covered only the sha1 of a zip file. In crx3, each proof signs the following sequence of data:

  1. Magic number
  2. SignedData instance length
  3. SignedData (contains the id)
  4. Zip archive

Web Store

If we download whatever extension from the Web Store (say, Google Dictionary), it'll hold 3 proofs inside:

$ curl -sL '' > google-dictionary.crx

$ crx3-info < google-dictionary.crx
id mgijmajocgfcbeboacabfgobmjgjcoja
header 1061
payload 44018
sha256_with_rsa 2 main_idx=1
sha256_with_ecdsa 1

sha256_with_rsa has an additional public_key+signature tuple, the public key from which is shared between all the extensions from the Web Store. The original proof (from the 'developer' key) is shifted to the end of sha256_with_rsa list.

sha256_with_ecdsa is another tuple to which only Google has the private key.