ocamllex and python-style indentation
| From: | Martin Jambon <martin.jambon@e...> |
| Subject: | Re: [Caml-list] Re: ocamllex and python-style indentation |
Sylvain Le Gall wrote:
> Hello,
>
> On 01-07-2009, Andreas Rossberg <rossberg@mpi-sws.org> wrote:
>> Mike Lin wrote:
>>> OK, now I'm curious :) how does your lexer match balanced parentheses,
>>> or in this case comments?
>>>
>> Easily, with a bit of side effects (I think that's roughly how all ML
>> compilers do it):
>>
>> ------------------------------------------------
>> let error l s = (* ... *)
>> let commentDepth = ref 0
>> let start = ref 0
>> let loc length = let pos = !start in (pos, pos+length)
>>
>> rule lex =
>> parse eof { EOF }
>> (* | ... *)
>> | "(*" { commentDepth := 1; start := pos lexbuf;
>> lexNestComment lexbuf }
>>
>> and lexNestComment =
>> parse eof { error (loc 2) "unterminated comment" }
>> | "(*" { incr commentDepth;
>> lexNestComment lexbuf }
>> | "*)" { decr commentDepth;
>> if !commentDepth > 0
>> then lexNestComment lexbuf
>> else lex lexbuf }
>> | _ { lexNestComment lexbuf }
>> ------------------------------------------------
>>
>> If you also want to treat strings in comments specially (like OCaml),
>> then you need to do a bit more work, but it's basically the same idea.
>>
>
> May I recommend writing this in a simpler way:
>
> -------------------------------------------------------------------------
> rule lex =
> parse eof { () }
> | "(*" { start := pos lexbuf; lexNestComment lexbuf; lex lexbuf }
>
> and lexNestComment =
> parse eof { error (loc 2) "unterminated comment" }
> | "(*" { lexNestComment lexbuf; lexNestComment lexbuf }
> | "*)" { () }
> | _ { lexNestComment lexbuf }
> -------------------------------------------------------------------------
>
> I think it works the same way, except that it uses fewer global
> variables.
You can even get rid of global variables completely:
  rule lex x = parse
      eof  { () }
    | "(*" { x.start <- pos lexbuf; lexNestComment x lexbuf; lex x lexbuf }

  and lexNestComment x = parse
      eof  { error (loc x 2) "unterminated comment" }
    (* skip the nested comment, then resume the enclosing one *)
    | "(*" { lexNestComment x lexbuf; lexNestComment x lexbuf }
    | "*)" { () }
    | _    { lexNestComment x lexbuf }
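
For anyone who wants to try the idea without running ocamllex, here is a rough plain-OCaml sketch of the same recursion over a string. It is only an illustration under some assumptions: x is taken to be a record with a mutable start field, and the names state, Unterminated_comment, at, skip_comment and strip_comments are made up for this example rather than taken from any library. The nested case calls the comment skipper twice, once for the inner comment and once to resume the enclosing one.

```ocaml
(* Hypothetical state record, passed as an argument instead of a global. *)
type state = { mutable start : int }

(* Raised with the (start, stop) offsets of the unclosed opening delimiter. *)
exception Unterminated_comment of int * int

(* Location of a token of the given length beginning at x.start. *)
let loc x length = (x.start, x.start + length)

(* Does the two-character delimiter d occur in s at offset i? *)
let at s i d = i + 1 < String.length s && s.[i] = d.[0] && s.[i + 1] = d.[1]

(* Skip a comment body starting at offset i, just after an opener.
   Returns the offset just past the matching closer. *)
let rec skip_comment x s i =
  if i >= String.length s then
    raise (Unterminated_comment (fst (loc x 2), snd (loc x 2)))
  else if at s i "(*" then
    (* nested comment: skip it, then resume the enclosing one *)
    skip_comment x s (skip_comment x s (i + 2))
  else if at s i "*)" then i + 2
  else skip_comment x s (i + 1)

(* Copy s into a buffer, dropping comments, nested ones included. *)
let strip_comments s =
  let x = { start = 0 } in
  let buf = Buffer.create (String.length s) in
  let rec lex i =
    if i >= String.length s then Buffer.contents buf
    else if at s i "(*" then begin
      x.start <- i;  (* remember where the outermost comment began *)
      lex (skip_comment x s (i + 2))
    end else begin
      Buffer.add_char buf s.[i];
      lex (i + 1)
    end
  in
  lex 0
```

Since the state is created inside strip_comments, two scans never share a start position, which is the point of dropping the globals.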
Martin
--
http://mjambon.com/