From: Diego Olivier Fernandez Pons <Diego.FERNANDEZ_PONS@e...>
Subject: Re: [Caml-list] Ocamlyacc vs stream parser

Michal Moskal wrote:
> Oh, so by proof-by-book you're right :-) But in practice LALR(1)
> seems more useful for parsing, at least parsing programming
> languages.

It is easier to be right when one uses up-to-date books. Excellent
as it is, the Dragon book is almost 20 years old, which is a long
time in computer science.

Anyway, that is not the main point.

The question is: are LR parsers really more powerful, more
difficult to implement, to hack, or to understand than LL parsers?
And why?

There are three fundamental parts in a parser (a rough OCaml sketch
follows the list):
- the control automaton
- the disambiguation technique
- the non-determinism handling system
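
As an illustration (the names below are mine, not from any existing
library), the separation could be sketched in OCaml with three
module types:

    (* The control automaton: states and a (possibly
       non-deterministic) transition relation. *)
    module type CONTROL = sig
      type state
      type symbol
      val start : state
      val step : state -> symbol -> state list
    end

    (* The disambiguation technique: filter competing moves,
       for instance with a bounded lookahead. *)
    module type DISAMBIGUATION = sig
      type move
      val choose : lookahead:char list -> move list -> move list
    end

    (* The non-determinism handler: explore whatever choices
       remain (depth first, breadth first, ...). *)
    module type SEARCH = sig
      val explore : ('a -> 'a list) -> 'a -> 'a option
    end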

LL parsers are traditionally compiled to push-down automata,
whereas LR parsers usually use LR(0) automata. In the first case
the stack is explicit and transitions have the form
(q, E, t, E', q'), which means that q -> q' is possible when (E, t)
matches the current stack prefix and input symbol; when the
transition is taken, E is popped and E' is pushed. In the second
case the stack is implicit and contains the list of the previously
visited states: when a transition (q, t, q') is taken, q is pushed
on the stack, and when a reduction A -> abc takes place, 3 symbols
are popped from the stack.
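
To make the two shapes concrete, here is a minimal OCaml sketch of
both transition forms (the types are of my own devising):

    type symbol = Terminal of char | NonTerminal of char

    (* LL-style push-down automaton: the stack is explicit.  The
       transition (q, E, t, E', q') pops the prefix E off the stack
       and pushes E' when the current input symbol is t. *)
    type 'q pda_transition = {
      src  : 'q;
      pop  : symbol list;    (* E  : required stack prefix *)
      look : char;           (* t  : current input symbol  *)
      push : symbol list;    (* E' : replacement prefix    *)
      dst  : 'q;
    }

    (* LR(0)-style automaton: the stack is implicit and holds the
       previously visited states.  A shift pushes the current
       state; a reduction A -> abc pops 3 states, then follows a
       goto edge on A. *)
    type 'q lr_action =
      | Shift of 'q
      | Reduce of char * int   (* left-hand side, length of the
                                  right-hand side *)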

This may seem like a big difference, but the two stack automata are
equivalent. You can transform one into the other, and thus write an
LL parser with an LR(0) control automaton, or conversely a
PDA-driven LR parser. In fact, the Dragon book (chapter 4.4 of the
French edition at least) shows an LL parser controlled by an LR(0)
automaton. Wim Pijls wrote a paper in 1993 titled 'Unifying LL and
LR parsing' in which he shows that 'traversing the LR(0) automaton
in a special way is equivalent to LL(1) parsing'. (A more recent
paper is 'LR, LC and LL parsing, some new points of view'. All
these papers are available on CiteSeer.)

LR and LL therefore have the same (theoretical) power: the ability
to parse algebraic (i.e. context-free) grammars.

Of course... but what about conflicts?

When you clearly separate the control automaton from the
disambiguation mechanism, you notice that all the usual algorithms
(LL(k), LR(k), LALR(k), SLR(k)) use the same principle:
approximating algebraic languages by rational ones.

Let's take an example of an LL conflict.

Sentence to parse: 'abaa'

S -> aX | aY | cZ
X -> b...
Y -> c...
Z -> d...

Which one should we choose, X or Y?

For an LR algorithm, a shift/reduce conflict:

X -> a
Y -> aa

and a reduce/reduce conflict:

X -> a
Y -> a

Every non-terminal X in a grammar generates a context-free language
L(X), which can be approximated (as an upper bound) by a rational
language R(X). The idea is that checking whether a sentence belongs
to a rational language is linear, while it is cubic in the
algebraic case.

(LL example)

L(X) is included in bE*
L(Y) is included in cE*

When the input is ab..., the only way it can be derived by the
grammar is through X (S -> aX -> ab...).
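
In code, this approximation is just a one-symbol peek. Here is a
hypothetical recursive-descent fragment for the grammar above; the
'...' parts of the grammar are elided, so X, Y and Z are stubs that
only check their first symbol:

    exception Syntax_error

    (* Stubs for X -> b..., Y -> c..., Z -> d... *)
    let parse_x = function 'b' :: rest -> rest | _ -> raise Syntax_error
    let parse_y = function 'c' :: rest -> rest | _ -> raise Syntax_error
    let parse_z = function 'd' :: rest -> rest | _ -> raise Syntax_error

    (* S -> aX | aY | cZ : after reading 'a', one symbol of
       lookahead (the rational approximations bE* and cE*) decides
       between X and Y without backtracking. *)
    let parse_s = function
      | 'a' :: ('b' :: _ as rest) -> parse_x rest
      | 'a' :: ('c' :: _ as rest) -> parse_y rest
      | 'c' :: rest -> parse_z rest
      | _ -> raise Syntax_error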

Both LL(k) and LR(k) use a k-prefix rational approximation. That is
why their tables occupy so much space (the cartesian product of the
control automaton and a k-trie). LALR and SLR follow the same idea
but merge some states according to given rules.

There is therefore no reason for LR, LALR and SLR to be
conceptually more difficult than LL.

Notice that TAG (Tree Adjoining Grammars) and HPSG (Head-driven
Phrase Structure Grammar) try to compute more discriminating
approximations using the properties of some particular terminals
(called 'foot' in the former, 'head' in the latter).

The difference in power can be explained by a few simple arguments:
- LL(1) should be compared to LR(0), not to LR(1)
- top-down algorithms prune, bottom-up algorithms memorize; since
the approximation is essentially a pruning technique, LR takes more
advantage of this orthogonality. The same power could be achieved
for LL parsers if memorization were added to them (LL + CPS, for
instance).

Finally, the non-determinism handling question... There are some
conflicts you simply cannot solve (otherwise you would have proven
rational = algebraic). There are two techniques: breadth-first
search (GLR) and depth-first search (backtracking). Both require
memorization to avoid computing the same thing several times.
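
As a sketch of the depth-first side (all names are mine): a
backtracking parser can return the list of every possible remaining
input; the memorization mentioned above would be a table indexed by
(rule, input position) so that the same suffix is never parsed
twice.

    (* A parser returns the possible remaining inputs: [] means
       failure, several elements mean local ambiguity. *)
    type parser_ = char list -> char list list

    let token (t : char) : parser_ = function
      | x :: rest when x = t -> [ rest ]
      | _ -> []

    (* Choice: alternatives are explored in order, which amounts
       to depth-first search with backtracking. *)
    let ( <|> ) (p : parser_) (q : parser_) : parser_ =
      fun input -> p input @ q input

    (* Sequencing: feed every outcome of [p] to [q]. *)
    let ( >>> ) (p : parser_) (q : parser_) : parser_ =
      fun input -> List.concat (List.map q (p input))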

Of course, in 1986 Masaru Tomita named his parsing technique (which
uses ideas of Bernard Lang from 1974, themselves taken from Jay
Earley, 1970) Generalized LR. But it could just as well have been
called Generalized LL, since graph traversals are generic
algorithms.

Brian Rogoff wrote:
> I happen to think that recursive descent is the best way to write
> parsers, but note that recursive descent parsers are capable of
> parsing non-LL(1) grammars, even without the fairly obvious hacks.

Michal Moskal wrote:
> But a keyword parser built with the camlp4 libraries can be
> modified at runtime

The ease of implementation is another classical discussion. The
parsing algorithm (ascending or descending) is orthogonal both to
its implementation by recursive functions or by tables, and to the
static or dynamic data structures used.

Most people think 'huge tables' as soon as they hear LR(1). But
this is only the case if you make the two following design choices:
- collapsing the control automaton and the disambiguation automata
into one
- representing graphs by tables

You can of course implement an LR(0) automaton with recursive
functions (recursive ascent), using function calls and exceptions
(or any equivalent technique...). An LR(0) automaton generally has
a quite moderate number of states.
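
For instance, here is a recursive-ascent sketch for the toy grammar
S -> a S | b (my own example, not ocamlyacc's machinery). Each
LR(0) state is a function; a reduction A -> alpha returns
(|alpha|, remaining input), the count is decremented as the call
stack unwinds, and acceptance is signalled by an exception:

    exception Syntax_error
    exception Accept of char list

    let rec state0 input =                (* S' -> .S, S -> .aS | .b *)
      let r = match input with
        | 'a' :: tl -> state1 tl
        | 'b' :: tl -> state2 tl
        | _ -> raise Syntax_error
      in handle0 r

    and handle0 (n, rest) =
      if n = 1 then handle0 (state3 rest) (* reduction lands here: goto on S *)
      else (n - 1, rest)                  (* keep unwinding the call stack *)

    and state1 input =                    (* S -> a.S, S -> .aS | .b *)
      let r = match input with
        | 'a' :: tl -> state1 tl
        | 'b' :: tl -> state2 tl
        | _ -> raise Syntax_error
      in handle1 r

    and handle1 (n, rest) =
      if n = 1 then handle1 (state4 rest) (* goto on S from state 1 *)
      else (n - 1, rest)

    and state2 rest = (1, rest)           (* S -> b.  : reduce, pop 1 *)
    and state4 rest = (2, rest)           (* S -> aS. : reduce, pop 2 *)
    and state3 rest = raise (Accept rest) (* S' -> S. : accept *)

    (* [accepts "aab"] is true; [accepts "ba"] is false. *)
    let accepts s =
      let chars = List.init (String.length s) (String.get s) in
      try ignore (state0 chars); false
      with Accept [] -> true | Accept _ | Syntax_error -> false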

You can represent all your graphs (stacks or disambiguation
automata) with purely functional data structures, leading to a
parser that can be modified at runtime.
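
A sketch under the same caveat (this is not camlp4's actual
machinery): with the transition graph in a persistent map, adding a
transition at runtime builds a new parser value and invalidates
nothing.

    module Graph = Map.Make (struct
      type t = int * char                (* (state, input symbol) *)
      let compare = compare
    end)

    type automaton = int Graph.t         (* persistent transition graph *)

    (* The old automaton remains valid after the update. *)
    let add_transition st sym dst (a : automaton) : automaton =
      Graph.add (st, sym) dst a

    let step (a : automaton) st sym : int option =
      Graph.find_opt (st, sym) a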

Moreover, correctly separating the three parts of your parser
allows you to use a different representation/optimisation technique
for each element: non-deterministic bit automata for disambiguation
automata of fewer than 31 states, matrix representations for dense
graphs, etc., instead of one monolithic 'table compression'
technique. And all these algorithms can be hidden behind a library,
out of sight of students, casual users and everyday programmers...
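
The 31-state bound is no accident: a set of at most 31 states fits
in one native OCaml int (31 bits on 32-bit platforms). A sketch of
one non-deterministic step under that representation:

    (* A set of NFA states as a bit mask: bit i set means state i
       is active.  delta.(Char.code c).(i) is the successor set of
       state i on input character c. *)
    type state_set = int

    let step (delta : state_set array array) (active : state_set)
        (c : char) : state_set =
      let row = delta.(Char.code c) in
      let next = ref 0 in
      for i = 0 to 30 do
        if active land (1 lsl i) <> 0 then next := !next lor row.(i)
      done;
      !next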

Last point: as Luc Maranget said, exhaustiveness can be computed on
the PDA or on the LR(0) automaton. Since it has nothing to do with
conflict resolution, there is no need to work on the disambiguation
automata.

        Diego Olivier
