Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
"nested" parsers
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 1999-12-03 (18:29)
From: skaller <skaller@m...>
Subject: Re: "nested" parsers
Hendrik Tews wrote:
> Georges Mariano writes:
>    Date: Thu, 02 Dec 1999 13:25:04 +0000
>    Subject: "nested" parsers
>    It appears that we can divide the language L in a few "sub"-languages.
>    Let's say that  L3 <: L2 <: L1 = L   ('<:' included in )
>    And we would like to have one entry point for each language
>    in our parser.
> ocamlyacc generates a parsing function for echa start symbol that
> you declare in the grammar file. Therefore the easiest way (if
> possible) is that you rewrite your grammar, such that for each of
> the languages Ln you have one metasymbol, which generates this
> language. Then you include several start directives in the header
> of the grammar file.

	But it isn't clear how to invoke these sublanguages
recursively, from with an action associated with a reduce
operation for the very sublanguage which we wanted a separate
parser for.

	For example, the Python language can conveniently
be divided into two distinct languages: a statement language
and an expression language. Certainly, I can have a non-terminal
for statements, and one for expressions, and make both entry
points --- and in my parser for Python I do just that.

	But that isn't what I want to do. 
What I actually want is a recursive transition machine corresponding
to a meta-grammar (RHS of productions can be regular expressions;
for example, BNF]

	This can be organised by a finite state automaton in which
a non-terminal transition pushes down the whole automaton,
and starts a new one corresponding to the RHS of the production
defining the non-terminal labelling the arc being followed.

	Chosing the right arc is the hard bit. If the first sets
of all the arcs out of a node are known, the number of trial
parses can be reduced -- to one, if the first sets are disjoint.

John Skaller,
10/1 Toxteth Rd Glebe NSW 2037 Australia
voice: 61-2-9660-0850