Friday, February 26, 2016

Run Debian Chromium on Fedora

Just a quickie. If you have a bunch of 32-bit Fedora VMs, then starting in March there won't be any new Chrome releases for them. Instead of ditching all those precious VMs, I thought of using the pre-compiled Chromium provided by Debian.

It actually works if you're willing to put up w/ the regular rigmarole of (a) finding out "what's the current version of Chromium?" & (b) doing proper deb → rpm conversions. Here is a makefile that automates all that.
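The makefile itself is not reproduced here, so the following is only a rough sketch of step (a), under the assumption that the version is looked up in a Debian `Packages` index (plain-text stanzas separated by blank lines). The helper name `find_package` and the sample stanza are made up for illustration; the version and path in it are placeholders, not real data.

```ruby
# Parse a Debian "Packages" index (RFC 822-style stanzas separated by blank
# lines) and return the Version and Filename fields of one package.
# Hypothetical helper; not taken from the makefile mentioned above.
def find_package(index_text, name)
  index_text.split(/\n{2,}/).each do |stanza|
    fields = {}
    stanza.each_line do |line|
      # Continuation lines start with whitespace; this sketch skips them.
      next if line.start_with?(' ', "\t")
      key, value = line.chomp.split(': ', 2)
      fields[key] = value if value
    end
    if fields['Package'] == name
      return { version: fields['Version'], filename: fields['Filename'] }
    end
  end
  nil # package not present in the index
end

# Made-up sample index; the version and paths are placeholders.
SAMPLE = <<~EOS
  Package: chromium
  Version: 1.2.3-1
  Architecture: i386
  Filename: pool/main/c/chromium/chromium_1.2.3-1_i386.deb

  Package: other
  Version: 0.1
  Filename: pool/main/o/other/other_0.1_i386.deb
EOS

p find_package(SAMPLE, 'chromium')
```

The returned filename is what a later step would download & feed to the deb → rpm conversion.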

Monday, February 15, 2016

Pandoc MathJax Self-contained

If you've ever used MathJax, you've probably noticed that for everything it does it injects script tags w/ various modules, loads fonts on demand, etc. This is why pandoc, for example, is unable to produce a truly stand-alone .html file w/ MathJax, where all formulas are either pre-rendered or rendered on the fly w/o any external requests.

At first I tried to monkey-patch MathJax.Ajax.Require() for dependency discovery & generated 1 big file w/ all the modules required for the PreviewHTML output format, like:

<script>
<% nm = ENV['MATHJAX_SRC'] || "node_modules/mathjax" -%>
<%= File.read File.join nm, "MathJax.js" %>
<%= File.read File.join nm, "jax/input/TeX/config.js" %>
...
</script>

It worked, served its purpose, but was a rough piece of horseplay.

What I really wanted was something like `pandoc file.md -t html5 -o - | mathjax-embed` that would dump pre-rendered html suitable for offline use.

Then I remembered that we can always render the html (w/ the mathjax script tag) in phantomjs and save the modified DOM. The process should be quite simple: load the html, inject a piece of JS w/ the mathjax config, inject a script tag w/ src=mathjax-entry-point, wait until it finishes transforming the DOM, print.

Here is a small phantomjs-script that does that: https://github.com/gromnitsky/mathjax-embed.
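To make the two "inject" steps of that process concrete, here is a minimal sketch in Ruby that does the same thing to an html string before it reaches a browser. It is not how mathjax-embed works internally (that script runs inside phantomjs); the helper name `inject_mathjax` and the sample config are assumptions made for illustration.

```ruby
# Insert a MathJax config block and the MathJax entry-point script tag
# right before </head>. Hypothetical helper; both values come from the caller.
# If the document has no </head>, the html is returned unchanged.
def inject_mathjax(html, config_js, entry_point_src)
  snippet = <<~HTML
    <script type="text/x-mathjax-config">
    #{config_js}
    </script>
    <script src="#{entry_point_src}"></script>
  HTML
  # Block form of sub, so backslashes in config_js are inserted literally.
  html.sub(%r{</head>}i) { "#{snippet}#{$&}" }
end

page = '<html><head><title>t</title></head><body>$$x^2$$</body></html>'
# Assumed config: TeX input, SVG output.
config = 'MathJax.Hub.Config({ jax: ["input/TeX", "output/SVG"] });'
puts inject_mathjax(page, config, 'node_modules/mathjax/MathJax.js')
```

The remaining steps (waiting for MathJax to finish & printing the modified DOM) need a real browser engine, which is where phantomjs comes in.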

Here is a rendered example (no JS required & no external resources).

1 caveat: it doesn't embed fonts, thus CommonHTML & HTML-CSS mathjax output formats won't look good. But it works fine for SVG & PreviewHTML ones.

Monday, February 8, 2016

Ruby mail & Base64 Content Transfer Encoding

If you need to parse emails that for some reason still use prehistoric charsets (like koi8-u), the mail gem fails to decode the bodies of such messages properly.

$ cat message.koi8u.mbox
From alice@example.com Mon Feb  8 22:26:51 2016
From: alice@example.com
To: bob@example.net
Subject: Kings
Date: Mon, 08 Feb 2016 20:26:51 +0000
MIME-Version: 1.0
Message-Id: <1@example.com>
Content-Transfer-Encoding: base64
Content-Type: text/plain; charset=koi8-u

7sHE18/SpiDX1sUg083F0svMzywgpiwg1NjNz8Ag0M/XydTJyiwK5NKmzcGkLCDT1c3VpCC2pNLV
08HMyc0uCvcgy8XE0s/Xycgg0MHMwdTByCwgzc/XIM7F08HNz9fJ1MnKLArkwdfJxCDQz8jPxNbB
pCCmLCDPIMPB0iDOxdPJ1MnKLArzwc0g08/CpiDHz9fP0snU2DogIvEuLi4g7ckg0M/XxczJzSEK
$ irb
2.1.3 :001 > require 'mail'
true
2.1.3 :002 > m = Mail.read 'message.koi8u.mbox'
[...]
2.1.3 :003 > m.body.decoded
"\xEE\xC1\xC4\xD7\xCF\xD2\xA6 [...]\n"
2.1.3 :004 > m.body.decoded.encoding
#<Encoding:ASCII-8BIT>

I.e., the result is total garbage: raw koi8-u bytes tagged as binary (ASCII-8BIT) instead of readable text.

But since we can obtain the charset name from the Mail::Message#charset method, we can just manually convert the string to UTF-8:

2.1.3 :005 > m.body.decoded.force_encoding(m.charset).encode 'utf-8'
"Надворі вже смеркло, і, тьмою повитий,\n
Дрімає, сумує Ієрусалим.\n
В кедрових палатах, мов несамовитий,\n
Давид походжає і, о цар неситий,\n
Сам собі говорить: \"Я... Ми повелим!\n"
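The same conversion can be tried w/o the mail gem at all. The sketch below uses only core Ruby (it needs Ruby 2.4+ for `String#unpack1`) and feeds it the base64 body of the message above; the helper name `decode_base64_body` is made up for illustration & the charset is passed in by hand instead of being read from the headers.

```ruby
# Decode a base64 body and transcode it from the declared charset to UTF-8.
# Hypothetical helper; mirrors force_encoding(...).encode from the irb session.
def decode_base64_body(base64_text, charset)
  # 'm' is the lenient base64 directive: it skips newlines and spaces.
  raw = base64_text.unpack1('m') # raw bytes, encoding is ASCII-8BIT
  # Relabel the bytes w/ their real charset, then convert them to UTF-8.
  raw.force_encoding(charset).encode('UTF-8')
end

# The base64 body from message.koi8u.mbox.
BODY = <<~EOS
  7sHE18/SpiDX1sUg083F0svMzywgpiwg1NjNz8Ag0M/XydTJyiwK5NKmzcGkLCDT1c3VpCC2pNLV
  08HMyc0uCvcgy8XE0s/Xycgg0MHMwdTByCwgzc/XIM7F08HNz9fJ1MnKLArkwdfJxCDQz8jPxNbB
  pCCmLCDPIMPB0iDOxdPJ1MnKLArzwc0g08/CpiDHz9fP0snU2DogIvEuLi4g7ckg0M/XxczJzSEK
EOS

text = decode_base64_body(BODY, 'koi8-u')
puts text
puts text.encoding
```

Note that force_encoding only relabels the bytes; the actual transcoding is done by encode, which raises an exception if the bytes are not valid in the declared charset.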