Saturday, January 30, 2021

Fixing “30 seconds of code”

In the past, the JS portion of 30 seconds of code was a single, big README in a github repo. You can still browse an old revision, of course. It was near perfect for a cursory inspection or a quick search.

In full conformance with all that's bright must fade adage, the README was scraped away for an alternative version that looks like this:

Why, why did they do that?

Thankfully, they put each code "snippet" into a separate .md file (there are 511 of them), which means we can concatenate them in 1 gargantuan file & create a TOC. I thought about an absolute minimum amount of code one would need for that & came up with this:

$ cat Makefile
$(if $(i),,$(error i= param is missing))
out := _out

$(out)/%.html: $(i)/%.md
@mkdir -p $(dir $@)
echo '<h2 id="$(title)">$(title)</h2>' > $@
pandoc $< -t html --no-highlight >> $@

title = $(notdir $(basename $@))

$(out)/30-seconds-of-code.html: template.html $(patsubst $(i)/%.md, $(out)/%.html, $(sort $(wildcard $(i)/*.md)))
cat $^ > $@
echo '</main>' >> $@

.DELETE_ON_ERROR:

(i should be a path to a repo directory with .md files, e.g. make -j4 i=~/Downloads/30-seconds-of-code/snippets)

This converts each .md file to its .html counterpart & prepends template.html to the result:

What's in the template file?

  1. a TOC generator that runs once after DOM is ready;
  2. a handler for the <input> element that filters the TOC according to user's input;
  3. CSS for a 2-column layout.

There is nothing interesting about #3, hence I'm skipping it.

Items 1-2 could be accomplished using 3 trivial functions (look Ma, no React!):

$ sed -n '/script/,$p' template.html
<script>
document.addEventListener('DOMContentLoaded', main)

function main() {
let list = mk_list()
document.querySelector('#toc input').oninput = evt => {
render(list, evt.target.value)
}
render(list)
}

function render(list, filter) {
document.querySelector('#toc__list').innerHTML = list(filter).map( v => {
return `<li><a href="#${v}">${v}</a></li>`
}).join`\n`
}

function mk_list() {
let h2s = [...document.querySelectorAll('h2')].map( v => v.innerText)
return query => {
return query ? h2s.filter( v => v.toLowerCase().indexOf(query.toLowerCase()) !== -1) : h2s
}
}
</script>

<nav id="toc"><div><input type="search"><ul id="toc__list"></ul></div></nav>
<main id="doc">

This is all fine & dandy, but 30 seconds of code has many more interesting repos, like snippets of css or reactjs code. They share the same lamentable fate with the js one–once being in a single readme, they have converged lately on a single, badly-searchable website, that displays 1 recipe per user’s query.

The difference between the css/react snippets & the plain js ones is in a necessity of a preview: if you see a tasty recipe for a “Donut spinner”, you’d like to see how the donut spins, before copying the example into your editor.

In such cases, people oft resort to pasting code into one of “Online IDE”s & embedding the result into their tutorial. CodePen, for example, has even more convenient feature: you create a form (with a POST request) that holds a field with a json-formatted string which contains html/css/js assets. That way you can easily make a button “check this out on codepen”. The downside is that a user leaves your page to play with the code.

Another way to show previews alongside the docs is to create an iframe & inject all assets from a snipped into it–in this implementation you don’t rely on 3rd parties & the docs stay fully usable in off-line scenarios (nobody actually needs that, but it sounds useful to have as an option).

This requires greatly expanding the examples above: either we need 3 separate templates: one for js snippets, some other for css recipes & a disheartening one for reactjs chunks; or we force a single template act differently depending on a payload content.

For the latter approach, see this repo.

Wednesday, January 6, 2021

Twitter stats using gnuplot, json & make

Twitter allows to download a subset of user's activites as a zip archive. Unfortunately, there's no useful visualisations of the provided data, except for a simple list of tweets with a date filtering.

For example, what I expected to find but there were no signs of it:

  1. a graph of activities over time;
  2. a list of:
    1. the most popular tweets;
    2. users, to whow I reply the most.

Inside the archive there is data/tweet.js file that contains an array (assigned to a global variable) of "tweet" objects:

window.YTD.tweet.part0 = [ {
"tweet" : {
"retweeted" : false,
"source" : "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
"favorite_count" : "2",
"id" : "12345",
"created_at" : "Sat Jun 23 16:52:42 +0000 2012",
"full_text" : "hello",
"lang" : "en",
...
}
}, ...]

The array is already json-formatted, hence it's trivial to convert it to a proper json for filtering with json(1) tool.

Say we want a list of top 5 languages in thich tweets were written. A small makefile:

$ cat lang.mk
lang: tweets.json
json -a tweet.lang < $< | $(aggregate) | $(sort)
tweets.json: $(i)
unzip -qc $< data/tweet.js | sed 1d | cat <(echo [{) - > $@

aggregate = awk '{r[$$0] += 1} END {for (k in r) print k, r[k]}'
sort = sort -k2 -n | column -t
SHELL := bash -o pipefail

yields to:

$ make -f lang.mk i=1.zip | tail -5
cs 16
und 286
ru 333
en 460
uk 1075

(1.zip is the archive that Twitter permits us to download.)

To draw activity bars, the same technique is applied: we extract a date from each tweet object & aggregate results by a day:

2020-12-31 5
2021-01-03 10
2021-01-04 5

This can be fed to gnuplot:

$ make -f plot.mk i=1.zip activity.svg

This makefile has an embedded gnuplot script:

$ cat plot.mk
include lang.mk

%.svg: dates.txt
cat <(echo "$$plotscript") $< | gnuplot - > $@

dates.txt: tweets.json
json -e 'd = new Date(this.tweet.created_at); p = s => ("0"+s).slice(-2); this.tweet.date = [d.getFullYear(), p(d.getMonth()+1), p(d.getDate())].join`-`' -a tweet.date < $< | $(aggregate) > $@

export define plotscript =
set term svg background "white"
set grid

set xdata time
set timefmt "%Y-%m-%d"
set format x "%Y-%m"

set xtics rotate by 60 right

set style fill solid
set boxwidth 1

plot "-" using 1:2 with boxes title ""
endef

To list users, to whom one replies the most, is quite simple:

$ cat users.mk
users: tweets.json
json -e 'this.users = this.tweet.entities.user_mentions.map( v => v.screen_name).join`\n`' -a users < $< | $(aggregate) | $(sort)

include lang.mk

I'm not much of a tweeter:

$ make -f users.mk i=1.zip | tail -5
<redacted> 41
<redacted> 49
<redacted> 60
<redacted> 210
<redacted> 656

Printing the most popular tweets is more cumbersome. We need to:

  1. calculate the rating of each tweet (by a such a complex foumula as favorite_count + retweet_count);
  2. sort all the tweet objects;
  3. slice N tweet objects.

A Make recipe for it is a little too long to show here, but you can grab a makefile that contains the recipe + all the recipes shown above.