ocamllex and python-style indentation
From: Martin Jambon <martin.jambon@e...>
Subject: Re: [Caml-list] ocamllex and python-style indentation
Andrej Bauer wrote:
> My parsing powers are not sufficient to easily come up with
> lexer/parser for a simple language that uses python-style indentation
> and newline rules. Does anyone have such a thing lying around, written
> in ocamllex/yacc or menhir? I would appreciate a peek to see how
> you've dealt with it.
>
> For example, suppose we want just a very simple fragment of Python
> involving True, False, conditional statements, variables, and
> assignments, such as:
>
> if True:
>     x = 3
>     y = (2 +
>       4 + 5)
> else:
>     x = 5
>     if False:
>         x = 8
>         z = 2
>
> How would I go about writing a lexer/parser for such a thing in ocaml?
I would use a first pass that converts the input lines into this imaginary
structure:
{
  if True:
  ;
  {
    x = 3
    ;
    y = (2 +
    ;
    {
      4 + 5)
    }
  }
  ;
  else:
  ;
  {
    x = 5
    ;
    if False:
    ;
    {
      x = 8
      ;
      z = 2
    }
  }
}
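
A minimal OCaml sketch of this first pass, assuming indentation is made of spaces only and blank lines carry no meaning. The names `indent_of`, `is_blank` and `bracket` are hypothetical, and the markers are plain strings here rather than real tokens:

```ocaml
(* Number of leading spaces; tabs are not handled in this sketch. *)
let indent_of (s : string) : int =
  let n = String.length s in
  let rec go i = if i < n && s.[i] = ' ' then go (i + 1) else i in
  go 0

let is_blank (s : string) : bool = String.trim s = ""

(* Returns the markers "{", "}" and ";" interleaved with the trimmed lines. *)
let bracket (lines : string list) : string list =
  let lines = List.filter (fun s -> not (is_blank s)) lines in
  (* [stack] holds the indentation of each open block, innermost first.
     [acc] is the output in reverse order. *)
  let rec close_to n stack acc =
    match stack with
    | top :: rest when top > n -> close_to n rest ("}" :: acc)
    | _ -> (stack, acc)
  in
  let step (stack, acc) line =
    let n = indent_of line in
    let text = String.trim line in
    match stack with
    | [] -> ([n], text :: "{" :: acc)               (* first line opens the outer block *)
    | top :: _ when n > top ->
      (n :: stack, text :: "{" :: ";" :: acc)       (* deeper: separator, then new block *)
    | _ ->
      let stack, acc = close_to n stack acc in      (* shallower: close blocks *)
      (match stack with
       | top :: _ when top = n -> (stack, text :: ";" :: acc)
       | _ -> failwith "inconsistent dedent")
  in
  let stack, acc = List.fold_left step ([], []) lines in
  (* Close every block that is still open at end of input. *)
  let acc = List.fold_left (fun acc _ -> "}" :: acc) acc stack in
  List.rev acc
```

Applied to the example program, this should yield the same sequence of markers and lines as the structure above, including the nested block for the continuation line `4 + 5)`.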
You could create a generic tool that parses a file into this:
    type t = Line of loc * string | Block of loc * t list
but as suggested by Yoann, the next step should probably be to flatten this
into a stream by introducing artificial tokens:
    type gen_token =
        Open of loc          (* fake "{" *)
      | Close of loc         (* fake "}" *)
      | Separator of loc     (* fake ";" *)
      | Line of loc * string
then parse each Line into a list of tokens and flatten the result into one
single token stream:
    type token =
        OPEN_BLOCK of loc    (* fake "{" *)
      | CLOSE_BLOCK of loc   (* fake "}" *)
      | SEPARATOR of loc     (* fake ";" *)
      | ...                  (* your language-specific tokens here *)
The token stream could then be processed by ocamlyacc/menhir.
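
The two flattening steps could be sketched as follows. This is only an illustration under several assumptions: `loc` is taken to be a line number, the tree type is wrapped in a hypothetical `Tree` module so that its `Line` constructor does not clash with the one in `gen_token`, `WORD` and `EOF` stand in for the language-specific tokens, and `words` is a placeholder for a real ocamllex lexer run on each line. The last function relies on the fact that ocamlyacc and menhir parsers take a lexer of type `Lexing.lexbuf -> token`.

```ocaml
(* Hypothetical location type: a line number is enough for this sketch. *)
type loc = int

module Tree = struct
  type t = Line of loc * string | Block of loc * t list
end

type gen_token =
  | Open of loc
  | Close of loc
  | Separator of loc
  | Line of loc * string

type token =
  | OPEN_BLOCK of loc
  | CLOSE_BLOCK of loc
  | SEPARATOR of loc
  | WORD of loc * string  (* placeholder for language-specific tokens *)
  | EOF

let loc_of : Tree.t -> loc = function
  | Tree.Line (loc, _) | Tree.Block (loc, _) -> loc

(* Tree -> generic tokens, with a Separator between siblings.
   Close reuses the location of the block's opening. *)
let rec flatten (node : Tree.t) : gen_token list =
  match node with
  | Tree.Line (loc, s) -> [Line (loc, s)]
  | Tree.Block (loc, items) ->
    let rec join = function
      | [] -> []
      | [x] -> flatten x
      | x :: (y :: _ as rest) ->
        flatten x @ (Separator (loc_of y) :: join rest)
    in
    (Open loc :: join items) @ [Close loc]

(* Stand-in for a real lexer: split a line on spaces. *)
let words (loc : loc) (s : string) : token list =
  String.split_on_char ' ' s
  |> List.filter (fun w -> w <> "")
  |> List.map (fun w -> WORD (loc, w))

(* Generic tokens -> final tokens, expanding each Line into its tokens. *)
let tokenize (gs : gen_token list) : token list =
  List.concat
    (List.map
       (function
         | Open loc -> [OPEN_BLOCK loc]
         | Close loc -> [CLOSE_BLOCK loc]
         | Separator loc -> [SEPARATOR loc]
         | Line (loc, s) -> words loc s)
       gs)

(* Adapter for a generated parser: serves tokens from a list and ignores
   the lexbuf, returning EOF once the list is exhausted. *)
let lexer_of_list (tokens : token list) : Lexing.lexbuf -> token =
  let remaining = ref tokens in
  fun _lexbuf ->
    match !remaining with
    | [] -> EOF
    | t :: rest -> remaining := rest; t
```

With a generated parser, the call would look something like `Parser.main (lexer_of_list tokens) (Lexing.from_string "")`, where `Parser.main` is a hypothetical entry point and the lexbuf is a dummy. Error locations then have to come from the `loc` carried by each token rather than from the lexbuf.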
That's the approach I would follow if I had to solve this problem again.
Martin
--
http://mjambon.com/