Version française
Home     About     Download     Resources     Contact us    
Browse thread
Ocamlyacc question
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: skaller <skaller@u...>
Subject: Ocamlyacc question
Suppose I have a typical grammar with a non-terminal 'expr' which
parses an expression. I would like to parse the longest prefix
of an input token stream leaving the first 'bad' token in the
stream, so I could also use recursive descent.

Specifically, my problem is to take my existing 'whole
language' parser and split it up so I can do somthing like:

1. if next is one of a set of (dynamically determined)
keywords, execute the parser for the corresponding statement,
otherwise apply the ocamlyacc default parsing rule for one
statement.

2. Statement parsers can call a function to parse expressions.
The expression parser also has to be extensible, using some
hybrid of the ocamlyacc grammar and a table driven precedence
parser. Some expression constructs cannot be done with a simple
precedence parser (if/then/elif/else, let/in, match  .. etc).

The problem, as usual, is to make the Ocamlyacc grammar
obey the open/closed principle (open for extension, closed
for compilation) which I guess has to be done with recursion,
and I just can't see how to do this so that 'expr' will process

	a + b )

and return AST for a+b, leaving ) in the input. Can an
'error' production be used to achieve this?

An LALR pushdown machine seems quite powerful: already
a FSA based pushdown machine (called a Recursive Transition
Network or RTN) can parse any context free language,
and be driven by a meta grammar (a grammar enhanced by
Kleene closure operator *, such as EBNF). I believe 
Python actually uses such a system. It is appealing because
it can be very efficient (compared to an RD parser), but is
still extensible (though not as much as pure RD). It also
allows opening up 'more parts of the language' to extension,
such as user defined operators and keyword introduced
constructions, whilst still enforcing some structure.

My pragmatic problem is to modify my existing parser
without too much impact (I prefer not to rewrite the whole
thing for another tool).

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net