Browse thread
Ocamlyacc question
- skaller
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
| Date: | -- (:) |
| From: | skaller <skaller@u...> |
| Subject: | Ocamlyacc question |
Suppose I have a typical grammar with a non-terminal 'expr' which parses an expression. I would like to parse the longest prefix of an input token stream leaving the first 'bad' token in the stream, so I could also use recursive descent. Specifically, my problem is to take my existing 'whole language' parser and split it up so I can do somthing like: 1. if next is one of a set of (dynamically determined) keywords, execute the parser for the corresponding statement, otherwise apply the ocamlyacc default parsing rule for one statement. 2. Statement parsers can call a function to parse expressions. The expression parser also has to be extensible, using some hybrid of the ocamlyacc grammar and a table driven precedence parser. Some expression constructs cannot be done with a simple precedence parser (if/then/elif/else, let/in, match .. etc). The problem, as usual, is to make the Ocamlyacc grammar obey the open/closed principle (open for extension, closed for compilation) which I guess has to be done with recursion, and I just can't see how to do this so that 'expr' will process a + b ) and return AST for a+b, leaving ) in the input. Can an 'error' production be used to achieve this? An LALR pushdown machine seems quite powerful: already a FSA based pushdown machine (called a Recursive Transition Network or RTN) can parse any context free language, and be driven by a meta grammar (a grammar enhanced by Kleene closure operator *, such as EBNF). I believe Python actually uses such a system. It is appealing because it can be very efficient (compared to an RD parser), but is still extensible (though not as much as pure RD). It also allows opening up 'more parts of the language' to extension, such as user defined operators and keyword introduced constructions, whilst still enforcing some structure. My pragmatic problem is to modify my existing parser without too much impact (I prefer not to rewrite the whole thing for another tool). -- John Skaller <skaller at users dot sf dot net> Felix, successor to C++: http://felix.sf.net