Matching start of input in lexer created with ocamllex
From: skaller <skaller@u...>
Subject: Re: [Caml-list] Matching start of input in lexer created with ocamllex
On Thu, 2007-04-05 at 17:37 +0300, Janne Hellsten wrote:
> Hi,
> 
> I'd like to match the beginning of input (or beginning of line) in my
> lexer.  Is there an easy way to do that?
> 
> I have a lexer that looks something like this (simplified):
> 
> rule initial = parse
>   | '!' [' ' '\t']* "for" { FOR (current_loc ()) }
>   | ident as id { IDENT (id, current_loc ()) }
>   | '!' { BANG (current_loc ()) }
> 
> The !for token should only be matched at the beginning of a
> line/input.  However, in the above lexer, there's nothing that
> prevents !for from being matched in the middle of an input string.
> This causes a problem: An input string containing !forbidXyz will be
> lexed FOR, IDENT "bidXyz".  I'd like to lex it as BANG, IDENT
> "forbidXyz".

I do something like this:

let table = ["for", FOR; "while", WHILE]
..
| space_not_newline + { WHITE }
| newline { NEWLINE }
| ident as id { try List.assoc id table with Not_found -> IDENT id }
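The keyword lookup in that last rule can be tried on its own, outside ocamllex. This is only a minimal sketch: the token type and the `classify` helper are hypothetical names made up for illustration, and the tokens carry no locations here.

```ocaml
(* Hypothetical token type; a real lexer's tokens may carry locations. *)
type token = FOR | WHILE | IDENT of string

(* Keyword table, as in the message above. *)
let table = ["for", FOR; "while", WHILE]

(* Classify a complete identifier lexeme: a keyword if it appears in the
   table, otherwise an ordinary identifier. *)
let classify id =
  try List.assoc id table with Not_found -> IDENT id
```

Because ocamllex takes the longest match, the ident rule consumes "forbidXyz" as a single lexeme, so the lookup sees the whole word and falls through to IDENT "forbidXyz" instead of splitting off "for".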

An alternative to emitting the WHITE and NEWLINE tokens is a
tail-recursive call to the lexer:

| space + { initial lexbuf }

which just skips over the spaces.
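
The same skipping idea can be sketched as a hand-written scanner in plain OCaml, without ocamllex. Everything here is an assumption for illustration: the token type, the `next_token` function, and the character classes are hypothetical and not taken from any real lexer.

```ocaml
(* Hypothetical token type for this sketch. *)
type token = BANG | IDENT of string | EOF

let is_space c = c = ' ' || c = '\t'

let is_ident_char c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'

(* Return the next token in [s] starting at [pos], together with the
   position just after that token. *)
let rec next_token s pos =
  if pos >= String.length s then (EOF, pos)
  else if is_space s.[pos] then
    (* Tail call: whitespace produces no token, so just scan again. *)
    next_token s (pos + 1)
  else if s.[pos] = '!' then (BANG, pos + 1)
  else if is_ident_char s.[pos] then begin
    (* Advance [stop] to the end of the identifier (longest match). *)
    let stop = ref pos in
    while !stop < String.length s && is_ident_char s.[!stop] do
      incr stop
    done;
    (IDENT (String.sub s pos (!stop - pos)), !stop)
  end
  else failwith (Printf.sprintf "unexpected character %C at %d" s.[pos] pos)
```

Since the recursive call is in tail position, a long run of spaces does not grow the stack; the caller never sees a whitespace token at all.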


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net