Thursday, October 24, 2013

Multi-Lingual Interface With Jekyll

Imagine you have a site in N languages, for example, English & Ukrainian. The articles differ between languages & cannot be auto-translated, but we can ensure that the GUI of each part of the site is localized: all menus, buttons, tooltips, etc. can be edited or localized without modifying the app source code.

Let's start with the example.

$ jekyll --version
jekyll 1.2.1

$ jekyll new blog && cd blog && mkdir _plugins

$ for i in en uk
  do
  (mkdir $i \
  && cd $i && ln -s ../css ../_l* ../_plugins . \
  && cp -r ../*yml ../_posts ../*html .); \
  done

$ rm -rf _config.yml _posts index.html

For each language we copied _posts, _config.yml & index.html, because they differ from site to site, and we symlinked the css, _layouts & _plugins directories, because they are the same for every site.
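If another language is needed later, the same copy-vs-symlink split can be wrapped in a small function. This is only a sketch: the add_locale name and its arguments are made up for this example, and it assumes the directory layout created by the commands above.

```shell
# Hypothetical helper (not part of Jekyll or jekyll-msgcat):
# add_locale SITE_ROOT NEW_LOCALE EXISTING_LOCALE
add_locale() {
    root=$1 new=$2 from=$3
    (
        cd "$root" || exit 1
        mkdir "$new" || exit 1
        # shared between all sites: symlink
        ln -s ../css ../_layouts ../_plugins "$new"/ &&
        # different for each site: copy from an existing locale as a start
        cp -r "$from"/_config.yml "$from"/_posts "$from"/index.html "$new"/
    )
}
```

For example, add_locale . de en run from the top directory would create de/ next to en/ & uk/. The copied _config.yml & posts still have to be edited by hand for the new language.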

The site's structure looks like this:

_layouts/
|__ ..
|__ default.html
|__ post.html
_plugins/
|__ ..
css/
|__ ..
|__ main.css
|__ syntax.css
en/
|__ ..
|__ _layouts@ -> ../_layouts/
|__ _plugins@ -> ../_plugins/
|__ _posts/
|__ css@ -> ../css/
|__ _config.yml
|__ index.html
uk/
|__ ..
|__ _layouts@ -> ../_layouts/
|__ _plugins@ -> ../_plugins/
|__ _posts/
|__ css@ -> ../css/
|__ _config.yml
|__ index.html
.gitignore

Now, install jekyll-msgcat:

$ gem install jekyll-msgcat

Create _plugins/req.rb file & add to it 1 line:

require 'jekyll/msgcat'

Add to uk/_config.yml:

msgcat:
  locale: uk
  # may be 'domain' or 'nearby'
  deploy: nearby

Then open _layouts/default.html, find the line

<a class="extra" href="/">home</a>

& replace it with:

<a class="extra" href="/">{{ 'Return to Home' | mc }}</a>

(Quotes are essential.)

As you see, we are using some unknown Liquid filter 'mc'. If you check the uk site

$ (cd uk; jekyll serve)

& go to http://127.0.0.1:4000/, nothing will have changed: everything is in English as before. To automatically substitute the 'Return to Home' string with something else we need to create a message catalog.

In our case, the message catalog is just a .yaml file:

$ cat uk/_msgcat.yaml
uk:
  'Return to Home': На головну сторiнку

What is handy about this is that if a string is missing from the message catalog, or if there is no _msgcat.yaml file at all, the default English string is used. Kill jekyll's server & start it again to test.
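Because missing entries silently fall back to English, untranslated strings are easy to overlook. A rough way to find them is sketched below. The missing_translations function is an invention of this example, not something jekyll-msgcat provides; it assumes that strings are written exactly as {{ 'text' | mc }} in the layouts & that catalog keys are single-quoted as in the catalog above.

```shell
# Hypothetical helper: print every string passed to the mc filter in
# LAYOUTS_DIR that has no key in the CATALOG file.
missing_translations() {
    layouts_dir=$1
    catalog=$2
    grep -rhoE "\{\{ *'[^']*' *\| *mc *\}\}" "$layouts_dir" |
        sed -e "s/^[^']*'//" -e "s/'[^']*\$//" |
        sort -u |
        while IFS= read -r msg; do
            # a catalog entry is assumed to look like:  'msg': translation
            grep -qF "'$msg':" "$catalog" 2>/dev/null || printf '%s\n' "$msg"
        done
}
```

Running missing_translations _layouts uk/_msgcat.yaml from the top directory lists the strings that would still be shown in English on the uk site.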

Links to Localized Versions

Another problem you may have is how to generate a link from the current page to the same page in another language.

If you choose to host each site on a separate subdomain, e.g. en.example.com & uk.example.com, set the value of the msgcat.deploy key in the site's _config.yml to domain. If you like a scheme without subdomains & prefer example.com/blog/en & example.com/blog/uk, set the key's value to nearby.

Make sure you have url & baseurl in _config.yml. In Liquid templates use the cur_page_in_another_locale filter. For example, in _layouts/default.html:

{{ 'en' | cur_page_in_another_locale }}
{{ 'uk' | cur_page_in_another_locale }}

will generate in the en site (msgcat.deploy == domain):

<a href='#' class='btn btn-primary btn-xs disabled'>en</a>
<a href='http://uk.example.com/index.html' class='btn btn-primary btn-xs '>uk</a>

or for msgcat.deploy == nearby:

<a href='#' class='btn btn-primary btn-xs disabled'>en</a>
<a href='/blog/uk/index.html' class='btn btn-primary btn-xs '>uk</a>

If you don't like injected names of Bootstrap's CSS classes, use the filter with an empty parameter:

{{ 'en' | cur_page_in_another_locale: "" }}
{{ 'uk' | cur_page_in_another_locale: "" }}

Or provide your own class name(s) instead of the empty string.
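To make the difference between the 2 deployment schemes explicit, here is a toy function that builds the same hrefs as in the examples above. It is only an illustration of the URL shapes: the locale_url name & its parameters are assumptions of this sketch, and the real filter takes its data from url & baseurl in _config.yml, not from arguments like these.

```shell
# Illustration only: locale_url DEPLOY LOCALE DOMAIN PREFIX PAGE
locale_url() {
    deploy=$1 locale=$2 domain=$3 prefix=$4 page=$5
    case $deploy in
        domain) printf 'http://%s.%s/%s\n' "$locale" "$domain" "$page" ;;
        nearby) printf '%s/%s/%s\n' "$prefix" "$locale" "$page" ;;
        *) return 1 ;;
    esac
}

locale_url domain uk example.com /blog index.html
locale_url nearby uk example.com /blog index.html
```

The 1st call prints http://uk.example.com/index.html & the 2nd one prints /blog/uk/index.html.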

Friday, March 1, 2013

Creating Emacs Multi-file Packages

(This text assumes your familiarity with the difference between simple vs. multi-file packages in Emacs, how to create them, etc.)

After writing NAME-pkg.el, creating a tar file & successfully installing the package from your local test archive, you may notice a small problem: the package meta information (its version, name, etc.) appears in 2 or 3 places. Take, for example, a version number:

  • it's sitting somewhere in the code as a variable value;
  • it exists in NAME-pkg.el;
  • it's stored in Makefile because your target must be aware of the output file name (which must contain the version number).

Some even prefer to include it in README.

In other package systems like npm, this is a non-issue, because their package.json file that contains all the meta can be a first-class citizen in the libraries that npm delivers. It's trivial to parse it & there are nice CLI tools like jsontool that can be used in Makefiles to extract any data from package.json.

Of course we can 'parse' our NAME-pkg.el file too. This snippet will read foobar-pkg.el file and return the version string from it:

(nth 2 (package-read-from-string
      (with-temp-buffer
        (insert-file-contents
         "foobar-pkg.el")
        (buffer-string))))

But it won't solve the problem with the Makefile. For instance, you'll need to write a custom CLI util just to grab the package's name & version from NAME-pkg.el.

meta.json

Instead we'll take another path & store all information about our package in a .json file. JSON can be easily parsed in elisp & with jsontool's help we can extract all data within Makefile.

meta.json may look like this:

{
    "name" : "foobar",
    "version" : "0.0.1",
    "docstring" : "Free variables and bound variables",
    "reqs" : {
        "emacs" : "24.3"
    },
    "repo" : {
        "type": "git",
        "url" : "git://example.com/foobar.git"
    },
    "homepage" : "http://example.com",
    "files" : [
        "*.el",
        "README",
        "meta.json"
    ]
}

If you're not familiar with jsontool, install it via npm install -g jsontool & play:

$ json name < meta.json
foobar
$ json files < meta.json | json -a
*.el
README
meta.json
$ json -a -d- name version < meta.json
foobar-0.0.1

It's very handy.

Getting Meta Into Elisp

That .json file can be parsed once while our package is loading into Emacs. We can wrap that in a library, for example, foo-metadata.el:

(require 'json)

(defvar foo-meta (json-read-file
                  (expand-file-name "meta.json"
                                    (file-name-directory load-file-name))))

(defconst foo-meta-version (cdr (assoc 'version foo-meta)))
(defconst foo-meta-name (cdr (assoc 'name foo-meta)))

(provide 'foo-metadata)

Then you just write (require 'foo-metadata) in your code.

Package Generation

Consider the minimal multi-file structure of some Foobar project:

foobar/
|__ ..
|__ bin/
|   |__ ..
|   |__ foo-make-pkg
|__ Makefile
|__ fb-bar.el
|__ fb-foo.el
|__ fb-foobar.el
|__ meta.json

Notice that the file foobar-pkg.el is missing. Instead we have a strange bin/foo-make-pkg utility that generates it. If we write it carefully enough, we can reuse it in another Emacs project:

:; exec emacs -Q --script "$0" -- "$@" # -*- mode: emacs-lisp; lexical-binding: t -*-

(setq
 debug-on-error t                     ; show stack trace
 argv (cdr argv))                     ; remove '--' from CL arguments

(require 'json)

(when (not (= 2 (length argv)))
  (message "Usage: %s meta.json some-pkg.el" (file-name-base load-file-name))
  (kill-emacs 1))

(setq data (json-read-file (car argv)))

(setq reqs (cdr (assoc 'reqs data)))
(when reqs
  (let (rlist)
    (dolist (idx reqs)
      (push (list (car idx) (cdr idx)) rlist))
    (setq reqs `(quote ,rlist))
    ))

(with-temp-file
    (nth 1 argv)
  (insert (prin1-to-string
           (list 'define-package
                 (cdr (assoc 'name data))
                 (cdr (assoc 'version data))
                 (cdr (assoc 'docstring data))
                 reqs))))

Test it by running:

$ bin/foo-make-pkg meta.json foobar-pkg.el && cat !#:2
(define-package "foobar" "0.0.1" \
    "Free variables and bound variables" (quote ((emacs "24.3"))))

To bring it all together we need 2 targets in the Makefile: foobar-pkg.el, which generates that file, & a phony target package, which creates an ELPA-compatible tar.

.PHONY: clean package

JSON := json
TAR := tar
METADATA := meta.json
PKG_NAME := $(shell $(JSON) -a -d- name version < $(METADATA))

foobar-pkg.el: meta.json
    bin/foo-make-pkg $< $@

package: foobar-pkg.el
    $(TAR) --transform='s,^,$(PKG_NAME)/,S' -cf $(PKG_NAME).tar \
        `$(JSON) files < $(METADATA) | $(JSON) -a`

clean:
    rm foobar-pkg.el $(PKG_NAME).tar
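The --transform option used in the package target is specific to GNU tar. Where it is not available, roughly the same archive layout can be produced by staging the files in a directory named after the package. The make_pkg_tar function below is a hypothetical sketch of that idea, not part of the project; it assumes mktemp -d is available & that all listed files sit in the current directory.

```shell
# Hypothetical helper: make_pkg_tar NAME-VERSION FILE...
# Creates NAME-VERSION.tar where every file lives under NAME-VERSION/.
make_pkg_tar() {
    pkg_name=$1; shift
    stage=$(mktemp -d) || return 1
    mkdir "$stage/$pkg_name" &&
        cp "$@" "$stage/$pkg_name/" &&
        tar -cf "$pkg_name.tar" -C "$stage" "$pkg_name"
    status=$?
    rm -rf "$stage"
    return $status
}
```

For example, make_pkg_tar foobar-0.0.1 *.el README meta.json, followed by tar -tf foobar-0.0.1.tar to check that every entry starts with foobar-0.0.1/.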

Recall that with meta.json we have 1 definitive source of all project metadata, so when you need to update the version number or the project dependencies or the contents of the tar or whatever--you edit only 1 file.

There is, of course, another route--even without any file generation. For example, you can gently parse foobar-pkg.el in elisp & have a utility that produces JSON from the static foobar-pkg.el, which then goes to jsontool's input.

Thursday, February 28, 2013

Emacs, ERT & Structuring Unit Tests

The ERT framework, which everyone is using these days in Emacs, provides very little guidance on how to organize & structure unit tests.

Running tests in the Emacs you are working in is quite idiotic. Not only can you easily pollute the editor's global namespace with a typo, but unit tests run this way cannot be reliable at all, because it's possible to create unwanted dependencies on data structures that weren't properly destroyed in previous test invocations.

Emacs batch mode

The only right way to execute tests is to use Emacs batch mode. The idea is: your Makefile contains a test target which goes to the test directory, which contains several test_*.el files. Each test_*.el file can be run independently & has a test selector (a regexp) that you may optionally provide as a command line parameter.

For example, consider some Foobar project:

foobar/
|__ ..
|__ test/
|   |__ ..
|   |__ test_bar.el
|   |__ test_foo.el
|   |__ test_utils.el
|__ Makefile
|__ foo-bar.el
|__ foo-foo.el
|__ foo-foobar.el
|__ foo-utils.el

To make this work, each test_* file must know where to find the foo-*.el libraries & how to run its tests. Ideally it should not depend on the current directory from which the user actually runs it.

test_utils.el script then looks like:

:; exec emacs -Q --script "$0" -- "$@"

(setq tdd-lib-dir (concat (file-name-directory load-file-name) "/.."))
(push tdd-lib-dir load-path)
(push (file-name-directory load-file-name) load-path)

(setq argv (cdr argv))

(require 'foo-utils)

(ert-deftest ignorance-is-strength()
  (should (equal (foo-utils-agenda) "war is peace")))

(ert-run-tests-batch-and-exit (car argv))

That is quite a header before the ert-deftest definition.

The 1st line is a way to tell your kernel & bash to run emacs with the current file as an argument. The -Q option forces Emacs not to read your ~/.emacs file, not to process X resources, etc. This helps (a) to start Emacs as quickly as possible & (b) to force your code not to depend on your local customizations.

The next 3 lines modify the load-path list, which Emacs uses to search for files when you 'require' or 'load' something. We add to that list the parent directory, where our *.el files are. Note that load-file-name contains an absolute path to the current test_utils.el file.

The next line removes the '--' cell from the argv list, so that (car argv) will give you the 1st command line parameter passed to the script.

The (require 'foo-utils) line loads the ../foo-utils.el file (if you have provided 'foo-utils in it, of course).

The next 2 lines are a usual ERT test definition with 1 assertion in this example.

The last line is an ERT command that runs your unit tests. Notice its argument--it allows you to optionally run the script as:

$ ./test_utils.el regexp

to filter out unmatched ert-deftest definitions.

Makefile

You can add to it 2 useful targets: test & compile. The latter transforms .el files to .elc & sometimes produces useful info about unused variables, etc:

.PHONY: test compile clean

ELC := $(patsubst %.el,%.elc,$(wildcard *.el))

%.elc: %.el
    emacs -Q -batch -L `pwd` -f batch-byte-compile $<

test:
    @for idx in test/test_*; do \
        printf '* %s\n' $$idx ; \
        ./$$idx ; \
        [ $$? -ne 0 ] && exit 1 ; \
    done; :

compile: $(ELC)

clean:
    rm $(ELC)
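The test target above runs every test file but has no way to pass a selector. If you want that too, the loop can live in a small shell function instead. This is a sketch under the same conventions (executable test/test_* scripts that take an optional regexp); the run_tests name is made up for this example.

```shell
# Hypothetical helper: run_tests TEST_DIR [SELECTOR]
# Runs every executable TEST_DIR/test_* file & stops at the 1st failure.
run_tests() {
    test_dir=$1
    selector=${2-}
    for t in "$test_dir"/test_*; do
        [ -x "$t" ] || continue
        printf '* %s\n' "$t"
        if [ -n "$selector" ]; then
            "$t" "$selector" || return 1
        else
            "$t" || return 1
        fi
    done
    return 0
}
```

For example, run_tests test runs everything, while run_tests test ignorance passes the regexp ignorance to each script.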

Hints

Try to make every test non-interactive. For example, if your command asks the user for confirmation via (y-or-n-p), Emacs, even in batch mode, stops and waits for input from the terminal. If you need to answer "yes", just monkey patch the function:

(setq tdd-y-or-n nil) ;; by default say "no"
(defun y-or-n-p (prompt)
  tdd-y-or-n)

and then write an assert as:

(let ((tdd-y-or-n t))
  (should (freedom-is-slavery)))

You can monkey patch any elisp function except those which are compiled in (e.g. come from .c files & are 'primitive' in Emacs terminology).

Unfortunately, the famous (message) function is built-in & cannot be monkey patched. If you use it heavily in the code, your non-interactive tests will fill stderr with garbage that will distract you. It's better to use a global (to your project namespace) flag & a wrapper for (message):

(defconst foo-meta-name "foobar")
(defvar foo-verbose 1)

(defun foo-warn (level str &rest args)
  "Print a message via (message) according to LEVEL."
  (when (<= level foo-verbose)
    (if (/= 0 level) (setq str (concat foo-meta-name ": " str)))
    ;; "%s" keeps a literal % in the formatted text from being re-interpreted
    (message "%s" (apply 'format str args))))

Then use (foo-warn 1 "hi, mom") in the code instead of (message). In .el libraries the foo-verbose variable can be equal to 1, but in your tests set it to -1 to prevent printing to stderr.

Friday, January 25, 2013

ssh command quoting hell

When you type

$ ssh user@host 'cat /tmp/foo.txt'

The cat /tmp/foo.txt part of that string is evaluated twice: 1) by your current shell, as a single-quoted string, 2) by a shell on the remote host.
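The double evaluation can be observed without any remote host. ssh joins its command arguments with spaces into 1 string & the remote side hands that string to the user's shell via -c, so a local sh -c is a reasonable stand-in. The remote_sim function below is just a name for this sketch, and it assumes the remote shell is sh-compatible.

```shell
# Stand-in for `ssh user@host ...`: join the arguments with spaces
# & let a 2nd shell parse the result again.
remote_sim() {
    sh -c "$*"
}

remote_sim echo 'a   b'     # the local quotes are gone before the 2nd shell
remote_sim "echo 'a   b'"   # the inner quotes survive for the 2nd shell
```

The 1st call prints a b because the 2nd shell splits the words again; the 2nd call prints the string with its 3 spaces intact.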

Let's assume you want to write a script that backs up some directory from a remote machine. A naive version:

$ cat mybackup.sh
#!/bin/sh

[ -z "$1" -o -z "$2" ] && exit 1

tcd=$1
tdir=$2
ssh user@host "tar cvf - -C $tcd $tdir | gzip" > foo.tar.gz

If you run it like this:

$ ./mybackup.sh /home joe

and everything goes OK, you'll get foo.tar.gz, which will contain the files from joe's home directory. But what if the $1 or $2 arguments contain spaces and/or quotes? I'll tell you:

$ ./mybackup.sh /home/joe 'tmp/foo "bar'
bash: -c: line 0: unexpected EOF while looking for matching `"'
bash: -c: line 1: syntax error: unexpected end of file

This is a bash error from a remote host because it tries to run

tar cvf - -C /home/joe tmp/foo "bar | gzip

and "bar contains an unmatched quote. Obviously this is not the command you had in mind.

How can we fix that? Another naive approach would be to single-quote some variables in the script:

ssh user@host "tar cvf - -C '$tcd' '$tdir' | gzip" > foo.tar.gz

This works for our example but fails if the directory is named tmp/foo 'bar (with a single quote instead of a double) rather than tmp/foo "bar.

To make it work regardless of such subtleties, we need to transform the $1 and $2 script arguments into quoted strings. The transformed strings are then safe to use as substrings of the command to be executed on the remote host.

One nuance: the transformation must be done not by the rules of /bin/sh or your current local shell, but by the rules of the user's shell on the remote host. (See the do_child() function in session.c of the openssh source: it extracts the user's shell name from the user database on the remote machine & constructs arguments for execve(2) as "/path/to/shell_name", "shell_name", "-c", "foo", "bar".)

If the remote shell is an sh-derived one, the transformation function can look like:

sq() {
    printf '%s\n' "$*" | sed -e "s/'/'\\\\''/g" -e 1s/^/\'/ -e \$s/\$/\'/
}

(Taken from http://unix.stackexchange.com/a/4774.)
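A quick local check of sq, before trusting it with ssh: quote a nasty string, let a 2nd shell parse it, and compare the result with the original. The roundtrip helper is made up for this sketch, and sh -c again stands in for an sh-compatible remote shell.

```shell
sq() {
    printf '%s\n' "$*" | sed -e "s/'/'\\\\''/g" -e 1s/^/\'/ -e \$s/\$/\'/
}

# Hypothetical helper: pass STRING through sq & a 2nd shell, print what
# that shell sees as the argument.
roundtrip() {
    sh -c "printf '%s' $(sq "$1")"
}

s="tmp/foo 'bar \"baz"
if [ "$(roundtrip "$s")" = "$s" ]; then echo same; else echo differ; fi
```

If the quoting is right this prints same, for strings with single quotes, double quotes, spaces or $ signs alike.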

Then, a final version of the 'backup' script would be:

#!/bin/sh

sq() {
    printf '%s\n' "$*" | sed -e "s/'/'\\\\''/g" -e 1s/^/\'/ -e \$s/\$/\'/
}

[ -z "$1" -o -z "$2" ] && exit 1

tcd=$1
tdir=$2
out=`basename "$tdir"`.tar.gz

cmd="tar cvf - -C $(sq "$tcd") $(sq "$tdir") | gzip"
echo "$cmd"
ssh user@host "$cmd" > "$out"

Hint: when in doubt, run (openssh) ssh with -v option and search for 'debug1: Sending command' string in the output.