Thursday, January 18, 2018

There is no price for good advice

From The Design and Evolution of C++ by Bjarne Stroustrup:

'In 1982 when I first planned Cfront, I wanted to use a recursive descent parser because I had experience writing and maintaining such a beast, because I liked such parsers' ability to produce good error messages, and because I liked the idea of having the full power of a general-purpose programming language available when decisions had to be made in the parser.

However, being a conscientious young computer scientist I asked the experts. Al Aho and Steve Johnson were in the Computer Science Research Center and they, primarily Steve, convinced me that writing a parser by hand was most old-fashioned, would be an inefficient use of my time, would almost certainly result in a hard-to-understand and hard-to-maintain parser, and would be prone to unsystematic and therefore unreliable error recovery. The right way was to use an LALR(1) parser generator, so I used Al and Steve's YACC.

For most projects, it would have been the right choice. For almost every project writing an experimental language from scratch, it would have been the right choice. For most people, it would have been the right choice. In retrospect, for me and C++ it was a bad mistake.

C++ was not a new experimental language, it was an almost compatible superset of C - and at the time nobody had been able to write an LALR(1) grammar for C. The LALR(1) grammar used by ANSI C was constructed by Tom Pennello about a year and a half later - far too late to benefit me and C++. Even Steve Johnson's PCC, which was the preeminent C compiler at the time, cheated at details that were to prove troublesome to C++ parser writers. For example, PCC didn't handle redundant parentheses correctly so that int(x); wasn't accepted as a declaration of x.

Worse, it seems that some people have a natural affinity to some parser strategies and others work much better with other strategies. My bias towards topdown parsing has shown itself many times over the years in the form of constructs that are hard to fit into a YACC grammar. To this day [1993], Cfront has a YACC parser supplemented by much lexical trickery relying on recursive descent techniques. On the other hand, it is possible to write an efficient and reasonably nice recursive descent parser for C++. Several modern C++ compilers use recursive descent.'