Date: 2001-09-10 (15:31)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] lexer disambiguation?

> since the lexer looks like an ordinary ocaml function (more or less), does
> the disambiguation boil down to:
>
> 1. the longest series of bytes that matches a single rule
> 2. match the first rule in the function that matches #1

I'm not sure which lexer you're talking about.

Lexers generated by ocamllex do indeed implement the behavior you describe: longest match, then the first rule if several rules match the same maximal-length substring. (But they sure don't look like ordinary OCaml functions: they just call an underlying table-driven DFA engine that does all the hard work!)

Lexers written using stream parsers behave like all stream parsers: they select the first pattern that matches the beginning of the stream, then "commit" to this pattern, matching the remainder of the pattern without backtracking. This "commit" behavior is different from regular pattern-matching on (say) lists, which backtracks as necessary.

The OCaml lexer (used by the compilers and the toplevel), as well as the generic lexer in module Genlex, also implement the longest-match rule, so that for instance abcd is one identifier, not four identifiers a, b, c, and d.

I hope this answers your question.

- Xavier Leroy
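
For illustration, the "longest match, then first rule" policy can be sketched in plain OCaml. This is only a hand-written model of the disambiguation rule, not what ocamllex generates (its lexers drive a table-based DFA); the names rule, keyword, ident, rules and next_token are hypothetical and chosen for this example.

```ocaml
(* Sketch of "longest match, then first rule" disambiguation.
   All names here are hypothetical; ocamllex itself uses a DFA engine. *)

(* A rule pairs a token name with a matcher that returns the length of the
   longest prefix of [s] starting at [pos] that it accepts, if any. *)
type rule = { name : string; matcher : string -> int -> int option }

(* Accepts exactly the literal [kw] at position [pos]. *)
let keyword kw s pos =
  let n = String.length kw in
  if pos + n <= String.length s && String.sub s pos n = kw then Some n
  else None

(* Accepts a maximal non-empty run of lowercase ASCII letters. *)
let ident s pos =
  let len = String.length s in
  let rec go i =
    if i < len && s.[i] >= 'a' && s.[i] <= 'z' then go (i + 1) else i
  in
  let stop = go pos in
  if stop > pos then Some (stop - pos) else None

(* Rule order only matters when two rules match the same length. *)
let rules = [
  { name = "IF"; matcher = keyword "if" };
  { name = "IDENT"; matcher = ident };
]

(* Keep the longest match; the strict [>] means an earlier rule wins ties. *)
let next_token s pos =
  List.fold_left
    (fun best r ->
      match r.matcher s pos, best with
      | Some n, Some (_, m) when n > m -> Some (r.name, n)
      | Some n, None -> Some (r.name, n)
      | _ -> best)
    None rules
```

With these two rules, next_token "if" 0 selects IF (both rules match two characters, and IF comes first), while next_token "iffy" 0 selects IDENT because its four-character match is longer.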