Re: [Caml-list] Matching start of input in lexer created with ocamllex
Date: -- (:)
From: David Allsopp <dra-news@m...>
Subject: Re: [Caml-list] Matching start of input in lexer created with ocamllex
> Hi,
>
> I'd like to match the beginning of input (or beginning of line) in my
> lexer.  Is there an easy way to do that?
>
> I have a lexer that looks something like this (simplified):
>
> rule initial = parse
>   | '!' [' ' '\t']* "for" { FOR (current_loc ()) }
>   | ident as id { IDENT (id, current_loc ()) }
>   | '!' { BANG (current_loc ()) }
>
> The !for token should only be matched at the beginning of a
> line/input.  However, in the above lexer, there's nothing that
> prevents !for from being matched in the middle of an input string.
> This causes a problem: An input string containing !forbidXyz will be
> lexed FOR, IDENT "bidXyz".  I'd like to lex it as BANG, IDENT
> "forbidXyz".

ocamllex has no notion of beginning of line (its regular expressions have
no anchor like the ^ of lex/flex). Three possible solutions:

1. You can simulate it with a bool ref parameter to your lexer that gets set
to true by each rule to indicate that you're no longer at the beginning of a
line (and reset to false by whichever rule consumes a newline) - the "!for"
rule then raises Failure if it matches when this ref is true. Slightly
tedious for code maintenance, since every rule has to update the ref.
2. Use two lexers - one with the "!for" rule and one without - and call
one lexer from the other (not very nice, because ocamllex doesn't support
code reuse between lexers, so you'll be duplicating a lot of code).
3. Pre-process the input to your lexer so that each line begins with a
special character that cannot appear in your text (e.g. one of the control
characters in 0..31 other than tab and newline). The "!for" rule can then
require that character as a prefix.
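
For what it's worth, option 3 might look roughly like the sketch below. The
names bol_marker and mark_line_starts are made up for illustration, and the
choice of '\001' assumes that character never occurs in your real input. The
lexer rules shown in the comment are likewise only a guess at how your
"initial" rule would be adapted.

```ocaml
(* Hypothetical marker character: assumed never to appear in real input. *)
let bol_marker = '\001'

(* Return a copy of [s] with [bol_marker] inserted at the very start of
   the input and immediately after every newline. *)
let mark_line_starts (s : string) : string =
  let buf = Buffer.create (String.length s + 16) in
  Buffer.add_char buf bol_marker;
  String.iter
    (fun c ->
      Buffer.add_char buf c;
      if c = '\n' then Buffer.add_char buf bol_marker)
    s;
  Buffer.contents buf

(* The lexer would then (as an assumption) be changed along these lines:

     rule initial = parse
       | '\001' '!' [' ' '\t']* "for" { FOR (current_loc ()) }
       | '\001'                       { initial lexbuf }   (* skip marker *)
       | ident as id                  { IDENT (id, current_loc ()) }
       | '!'                          { BANG (current_loc ()) }
*)
```

You would then build the lexbuf with something like
Lexing.from_string (mark_line_starts input). Note that the inserted markers
shift character offsets, so any location tracking would need adjusting, and
input ending in a newline gets a trailing marker that the "skip marker" rule
has to absorb.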

HTH,


David