Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
ocamlyacc: named attributes
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2007-01-14 (07:07)
From: skaller <skaller@u...>
Subject: Re: [Caml-list] ocamlyacc: named attributes
On Sun, 2007-01-14 at 01:40 +0100, Till Varoquaux wrote:
> Hmm,
> Probably a dumb question, but: why not use Menhir? It's supposedly
> strictly more powerfull than ocamlyacc and has this feature.

Two reasons. The first is political: Menhir isn't part of the
standard Ocaml distribution. This means clients of my system
will have to download and install and additional package
before they can build mine. That includes on Windows.
If a binary wasn't available they'd have to build from source.

This may not be a big deal if it is pure Ocaml, which from
memory it is, right? However then I have to include the sources
in my own package, which introduces a repository synchronisation
problem .. and unless Menhir has a liberal licence .. also a 
licencing problem.

In addition there is a reliability and responsibility issue:
the Ocaml team is paid to maintain Ocaml.

My system is itself an unreliable programming language,
so I don't need any additional complicating factors:
clients already need to install Ocaml, Python, and C++
before they can build it.

The second reason is technical: Menhir generates Ocaml code
which executes the parser, whereas Ocamlyacc is table driven.
This may or may not matter: its hard to say. 

I do some really nasty tricks with Ocamlyacc/lex combo to 
embed a recursive descent parser in my language, allowing the 
end user to define their own statements and expressions.
In effect this makes the system dynamically open recursive.

Now actually, Menhir may handle this *better* than Ocamlyacc,
since at least some of the trickery is required to work around
the very poor 'yacc' idea that you have to declare %tokens in
your grammar specification, which then generates the token
type .. together with the failure of Ocamlyacc to make any
provision for adding specification to the mli file.

These restrictions make Ocamlyacc hard to use properly in
more advanced cases, and worse, the interface of parsers
is plainly incorrect, since it accepts a lexer and lex buffer
and undesirable coupling of parsers with lex buffers .. when
in principle they're unrelated. This is convenient for
toy parsers to find the location of a syntax error, but 
inconvenient for production parsers.

As I recall Menhir actually solves some of these problems.

So IF Menhir were in the standard distribution I might 
consider it. Note however I do not want LR(1) parsing:
my grammar has an independent specification of LALR(1)
and strictly unambiguous (with no %prec notations).
[Of course the user defined statements break this]

In the distant future Felix may be bootstrapped,
so the 'upgrade' to Menhir may involves some work which
is lost: Felix itself already has a far better parser technology
than either Ocamlyacc or Menhir: it uses Elkhound which is a 
GLR parser, and parsers can be embedded directly in the language
without the need for external yacc/lex files (in particular,
the constructions can be nested anywhere, so that they can even
access enclosing function scope).

Elkhound can also generate Ocaml parsers, so if I were to upgrade
it would have to be considered an option, particularly as the
whole Elkhound sources are already packaged with Felix.

Sorry for the long winded reply .. but there is an underlying
theme: the 'Open Source' movement is severely compromised by 
build and licence issues: neither of these things are about
technical merits. 

John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: