Saturday, February 21, 2015

A minimalistic node version manager

If you pay attention to the nodejs world & suddenly find yourself using 3 versions of node simultaneously, you may start thinking about a version manager.

There are some existing ones, like nvm & n. They are nice, but both are written in bash & may require a periodic update after a new node/iojs release.

What I want from the 'manager' is that it doesn't integrate itself w/ a shell & doesn't require constant updating.

A 'non-updating' feature results in a drastic code simplification: if a version manager (VM) doesn't know how to install a new node version at all, then you don't need to update its code (hopefully ever).

A non-bash requirement dates back to rvm, which has been redefining cd for us since 2009. It doesn't mean, of course, that a VM written in bash would necessarily modify built-in shell commands, but observing rvm struggle w/ bash has discouraged me from sh-like solutions.

The VM should be fast, so writing it in Ruby (unfortunately) is not an option, due to the small (but noticeable) startup overhead that any ruby CLI util has. Ideally, it should also have no dependencies.

This leaves us w/ several options. We can use mruby or plain C or, wait, there is Golang! In the past its selling point was a 'system' language feeling.

Well. I can tell that it's not as poignant as Ruby for sure, but it's hyper fast & quite consistent. It took me roughly a day to feel more or less comfortable w/ it, which is incomparable w/ garbage like C++. Frankly, I was surprised myself that it went so smoothly.

Back to the YA version manager for node. It's called nodever, it uses a 'subshell' approach via installing system-wide wrappers & it's a tiny Go program.

Saturday, February 7, 2015

node.js 0.12, stdin & spawnSync

If you have in your code a quick hack like this:

stdin = fs.readFileSync('/dev/stdin').toString()

& it works fine & nothing bad really happens, you may start wondering one day why everyone considers it a temporary solution.

Node's readFileSync() uses stat(2) to get the size of the file it tries to read. By definition, you can't know the size of stdin ahead of time. As one dude put it on SO:

Imagine stdin is like a water tap. What you are asking is the same as "How much water is there in a tap?".

By using stat(2), readFileSync() will read up to whatever length value the kernel lies/guesses about /dev/stdin.
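
To see what number the kernel reports, you can ask fstat(2) yourself. This is only a sketch: reportedSize is a hypothetical helper name (not a node API), & the temporary file is there just to have a regular file to compare stdin against.

```javascript
var fs = require('fs')
var os = require('os')
var path = require('path')

// Hypothetical helper: the size fstat(2) reports for an open descriptor.
function reportedSize(fd) {
        return fs.fstatSync(fd).size
}

// For a regular file the number is exact: 5 bytes were written.
var tmp = path.join(os.tmpdir(), 'reported-size-' + process.pid)
fs.writeFileSync(tmp, 'hello')
var fd = fs.openSync(tmp, 'r')
console.log('regular file:', reportedSize(fd))
fs.closeSync(fd)
fs.unlinkSync(tmp)

// For stdin (fd 0) the answer depends on what is attached to it: a pipe,
// a tty or a redirected file. fstat may also fail if stdin is closed.
try {
        console.log('stdin:', reportedSize(0))
} catch (err) {
        console.log('stdin: fstat failed with', err.code)
}
```

Try running it w/ a pipe & then w/ a redirected file as stdin; the second number is the only thing a stat-based reader has to go on.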

Another issue comes w/ testing. If you have a CL utility & want to write an acceptance test for it using the 'new' node 0.12 child_process.spawnSync() API, expect funny errors.

Suppose we have a node version of cat that's written in a dumb 'synchronous' way. Call it cat-1.js:

#!/usr/bin/env node

var rfs = require('fs').readFileSync

if (process.argv.length == 2) {
        process.stdout.write(rfs('/dev/stdin'))
} else {
        process.argv.slice(2).forEach(function(file) {
                process.stdout.write(rfs(file))
        })
}

Now we write a simple test for it:

var assert = require('assert')
var spawnSync = require('child_process').spawnSync

var r = spawnSync('./cat-1.js', { input: 'hello' })
assert.equal('hello', r.stdout.toString())

& run:

$ node test-cat-1-1.js

assert.js:86
  throw new assert.AssertionError({
        ^
AssertionError: 'hello' == ''
    at Object.<anonymous> (/home/alex/lib/writing/gromnitsky.blogspot.com/posts/2015-02-07.1423330840/test-cat-1-1.js:5:8)

What just happened? (I've cut irrelevant trace lines.) Why is the captured stdout empty? Let's change the test to:

var assert = require('assert')
var spawnSync = require('child_process').spawnSync

var r = spawnSync('./cat-1.js', { input: 'hello' })
console.error(r.stderr.toString())

then run:

$ node test-cat-1-2.js
fs.js:502
  return binding.open(pathModule._makeLong(path), stringToFlags(flags), mode);
                 ^
Error: ENXIO, no such device or address '/dev/stdin'
    at Error (native)
    at Object.fs.openSync (fs.js:502:18)
    at fs.readFileSync (fs.js:354:15)
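
An empty stdout is easier to diagnose when the test itself looks at the exit status & stderr of the child. Below is a minimal sketch of such a check. run is a hypothetical helper name, & the child here is a stand-in (node itself copying stdin to stdout), not cat-1.js.

```javascript
var spawnSync = require('child_process').spawnSync

// Hypothetical helper: run a command w/ the given stdin & return its stdout,
// throwing an error that carries the child's stderr if anything went wrong.
function run(cmd, args, input) {
        var r = spawnSync(cmd, args, { input: input })
        if (r.error) throw r.error      // e.g. the command wasn't found
        if (r.status !== 0) {
                throw new Error(cmd + ' exited w/ ' + r.status + ':\n' +
                                r.stderr.toString())
        }
        return r.stdout.toString()
}

// A stand-in child: node itself, copying its stdin to stdout.
var out = run(process.execPath,
              ['-e', 'process.stdin.pipe(process.stdout)'], 'hello')
console.log(out)
```

W/ a check like this, a crashing child fails the test w/ its own error message instead of a puzzling comparison against an empty string.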

At this point unless you want to dive into libuv internals, that quick hack of explicitly reading /dev/stdin should be changed to something else.

In the past, node maintainers disdained the sync stdin read & called it an antipattern. The recommended way was to use the streams API, where you employ process.stdin as a readable stream. Still, what if we really want a sync read?

The easiest way is to make a wrapper around readFileSync() that checks the filename argument & invokes the real readFileSync() when it's not equal to /dev/stdin. For example, let's create a simple module, readFileSync:
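
For comparison, the streams way looks roughly like this. It's a sketch, not anybody's official recipe: readAll is a hypothetical helper name, it assumes no encoding was set on the stream (so every chunk is a Buffer), & a PassThrough stream plays the role of stdin to keep the example self-contained.

```javascript
var PassThrough = require('stream').PassThrough

// Hypothetical helper: collect everything from a readable stream into one
// Buffer & hand it to callback(err, buffer). Assumes the stream emits
// Buffers, i.e. setEncoding() was not called on it.
function readAll(stream, callback) {
        var chunks = []
        stream.on('data', function(chunk) {
                chunks.push(chunk)
        })
        stream.on('error', callback)
        stream.on('end', function() {
                callback(null, Buffer.concat(chunks))
        })
}

// A stand-in for stdin.
var demo = new PassThrough()
readAll(demo, function(err, data) {
        if (err) throw err
        console.log(data.toString())
})
demo.end('hello')
```

In a real utility you would pass process.stdin instead of the PassThrough stream. The price is that everything that needs the data has to live inside the callback.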

var fs = require('fs')

module.exports = function(file, opt) {
        if ( !(file && file.trim() === '/dev/stdin'))
                return fs.readFileSync(file, opt)

        var BUFSIZ = 65536
        var chunks = []
        while (1) {
                try {
                        var buf = new Buffer(BUFSIZ)
                        var nbytes = fs.readSync(process.stdin.fd, buf, 0, BUFSIZ, null)
                } catch (err) {
                        if (err.code === 'EAGAIN') {
                                // node is funny
                                throw new Error("interactive mode isn't supported, use pipes")
                        }
                        if (err.code === 'EOF') break
                        throw err
                }

                if (nbytes === 0) break
                chunks.push(buf.slice(0, nbytes))
        }

        return Buffer.concat(chunks)
}

It's far from ideal, but at least it doesn't use stat(2) for determining stdin size.

We modify our cat version to use this module & call it cat-2.js:

#!/usr/bin/env node

var rfs = require('./readFileSync')

if (process.argv.length == 2) {
        process.stdout.write(rfs('/dev/stdin'))
} else {
        process.argv.slice(2).forEach(function(file) {
                process.stdout.write(rfs(file))
        })
}

& modify the original version of the acceptance test to use it too:

var assert = require('assert')
var spawnSync = require('child_process').spawnSync

var r = spawnSync('./cat-2.js', { input: 'hello' })
assert.equal('hello', r.stdout.toString())

& run:

$ node test-cat-2-1.js

Yay, it doesn't throw an error & apparently works!

To be sure, generate a big file, like 128MB:

$ head -c $((128*1024*1024)) < /dev/urandom > 128M

then run:

$ cat 128M | ./cat-2.js > 1
$ cmp 128M 1
$ echo $?
0

The last command should print 0 if everything was fine & no bytes were lost.