Parameterised lexer
Date: 2008-09-16 (09:38)
From: David Allsopp <dra-news@m...>
Subject: RE: [Caml-list] Parameterised lexer
Definitely not possible (directly) with ocamllex: what you're suggesting
would involve recompiling the automaton on each call, which isn't how
ocamllex works (the automaton is generated once, when the .mll file is
compiled). I don't know about ulex.

But: do you know enough about the kind of expressions that param could be to
use one regexp that would cover them all (e.g. ['x' 'y' 'z'] for the example
below)? You could then have a lexer action of the form:

rule token param = parse
  reg-exp-for-params {if Str.string_match param (Lexing.lexeme lexbuf) 0
                      then () (* Code *)
                      else failwith "lexing: empty token"}
| rest-of-the-lexer
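
To make the idea concrete without needing ocamllex or Str, here is a minimal
sketch in plain OCaml. It is only an illustration under stated assumptions: a
hand-written scanner stands in for the generated one, the hypothetical
predicate is_param plays the role of Str.string_match param, and a lexeme that
fails the check falls back to being an ordinary word instead of raising. The
names token, tokens, render and skip are invented for this example.

```ocaml
(* Sketch only: a fixed, broad rule (alpha+) is matched first, and the
   runtime parameter [is_param] is consulted in the "action". *)

type token = Param of string | Word of string | Space | Eof

let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_space c = c = ' ' || c = '\t' || c = '\n'

(* Index just past the longest run of characters satisfying [p] from [pos]. *)
let rec skip p s pos =
  if pos < String.length s && p s.[pos] then skip p s (pos + 1) else pos

(* [token is_param s pos] returns the next token and the position after it. *)
let token is_param s pos =
  if pos >= String.length s then (Eof, pos)
  else if is_alpha s.[pos] then begin
    let stop = skip is_alpha s pos in
    let lexeme = String.sub s pos (stop - pos) in
    (* The broad rule matched; now apply the runtime check. *)
    ((if is_param lexeme then Param lexeme else Word lexeme), stop)
  end
  else if is_space s.[pos] then (Space, skip is_space s pos)
  else failwith "lexing: empty token"

(* Scan the whole string into a token list, ending with [Eof]. *)
let tokens is_param s =
  let rec loop pos acc =
    match token is_param s pos with
    | (Eof, _) -> List.rev (Eof :: acc)
    | (t, next) -> loop next (t :: acc)
  in
  loop 0 []

(* Mirror the printing done in the original example. *)
let render toks =
  String.concat ""
    (List.map
       (function Param _ -> "*" | Word w -> w | Space -> " " | Eof -> "EOF")
       toks)
```

For instance, render (tokens (fun l -> l = "x") "ab x y") treats only the
lone "x" as the parameter; passing a different predicate changes which
lexemes count as parameters without touching the scanner itself.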


-----Original Message-----
From: caml-list-bounces@yquem.inria.fr
[mailto:caml-list-bounces@yquem.inria.fr] On Behalf Of Dario Teixeira
Sent: 14 September 2008 21:53
To: caml-list@yquem.inria.fr
Subject: [Caml-list] Parameterised lexer


Is it possible to write an ocamllex/ulex scanner where a regexp is a parameter
to the lexer function?  I'm looking for something like what the (invalid) ulex
code below demonstrates ("param" is the parameter):

let regexp alpha = ['a'-'z' 'A'-'Z']
let regexp whitespace = [' ' '\t' '\n']
let regexp param1 = 'x'
let regexp param2 = 'y'
let regexp param3 = 'z'

let rec token param = lexer
        | param         ->      Printf.printf "*";
                                token param lexbuf
        | alpha+        ->      Printf.printf "%s" (Ulexing.utf8_lexeme lexbuf);
                                token param lexbuf
        | whitespace+   ->      Printf.printf " ";
                                token param lexbuf
        | eof           ->      Printf.printf "EOF\n"

let main () =
        let lexbuf = Ulexing.from_utf8_channel stdin
        in token param1 lexbuf

let _ = Printexc.print main ()

Thanks in advance for your help!
Kind regards,
Dario Teixeira

