Date: 2009-11-11 (11:04)
From: Martin Jambon <martin.jambon@e...>
Subject: Re: [Caml-list] Re: The lexer hack

Dario Teixeira wrote:
> Hi,
>
>> Interesting. Have you confirmed that this works? I am slightly
>> worried by the fact that an LR parser reads one token ahead,
>> i.e. one token past BEGIN_VERB might already have been read
>> before the enter_verb semantic action is executed. If that is
>> so, then this token would be read while the lexer is still in
>> the wrong mode.
>
> Yes, I was just thinking about that as well... :-)
> I think I can pile another hack on top of the dummy action:
> dummy tokens to take care of the readahead issue. Though
> this has the potential to get comically silly pretty quickly!
>
> I'll report later...

If the lexer to use can be determined by only one token (BEGIN_VERB), I
think you can change the state in the lexer like this:

    rule token state = parse
        "" { match !state with
               `Normal -> normal_token state lexbuf
             | `Verbatim -> verbatim_token state lexbuf }

    and normal_token state = parse
        ...
      | "\\begin{verbatim}" { state := `Verbatim; BEGIN_VERB }

    and verbatim_token state = parse
        ... { RAW (...) }
      | "\\end{verbatim}" { state := `Normal; END_VERB }

An even simpler option, if possible in your case, is to use a single token
for the whole verbatim section:

    rule token = parse
        ...
      | "\\begin{verbatim}" { finish_verbatim lexbuf }

    and finish_verbatim = shortest
        (_* as s) "\\end{verbatim}" { RAW s }


Martin

-- 
http://mjambon.com/
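
For illustration only, the first suggestion can be mimicked without ocamllex by a small hand-written tokenizer in plain OCaml. This is a rough sketch, not the code from the thread: the `token` and `mode` types, the helpers `starts_with_at`, `find_tag`, `make_lexer` and `tokens`, and the treatment of normal text as whitespace-separated `WORD`s are all assumptions made up for the example. What it shows is that the mode flips inside the lexer at the moment BEGIN_VERB is produced, so it does not matter whether the parser has already asked for the next token.

```ocaml
(* Hypothetical token type; constructor names mirror those in the message,
   WORD and EOF are added for the sake of the example. *)
type token = BEGIN_VERB | END_VERB | RAW of string | WORD of string | EOF

type mode = Normal | Verbatim

let begin_tag = "\\begin{verbatim}"
let end_tag = "\\end{verbatim}"

(* True if [tag] occurs in [s] starting exactly at position [pos]. *)
let starts_with_at s pos tag =
  let n = String.length tag in
  pos + n <= String.length s && String.sub s pos n = tag

(* Index of the first occurrence of [tag] at or after [pos],
   or the length of [s] if there is none. *)
let rec find_tag s pos tag =
  if pos >= String.length s then String.length s
  else if starts_with_at s pos tag then pos
  else find_tag s (pos + 1) tag

(* Returns a function that yields one token per call. The mode is
   private to the lexer and is updated before the tag token is returned. *)
let make_lexer (input : string) : unit -> token =
  let pos = ref 0 and mode = ref Normal in
  let len = String.length input in
  let rec next () =
    match !mode with
    | Verbatim ->
      if starts_with_at input !pos end_tag then begin
        pos := !pos + String.length end_tag;
        mode := Normal;              (* switch back before returning *)
        END_VERB
      end else if !pos >= len then EOF
      else begin
        (* everything up to the end tag is one raw chunk *)
        let stop = find_tag input !pos end_tag in
        let raw = String.sub input !pos (stop - !pos) in
        pos := stop;
        RAW raw
      end
    | Normal ->
      if !pos >= len then EOF
      else if starts_with_at input !pos begin_tag then begin
        pos := !pos + String.length begin_tag;
        mode := Verbatim;            (* switch before returning *)
        BEGIN_VERB
      end else if input.[!pos] = ' ' || input.[!pos] = '\n' then begin
        incr pos;                    (* skip whitespace in normal mode *)
        next ()
      end else begin
        (* a word runs until whitespace, a begin tag, or end of input *)
        let start = !pos in
        while !pos < len
              && input.[!pos] <> ' ' && input.[!pos] <> '\n'
              && not (starts_with_at input !pos begin_tag) do
          incr pos
        done;
        WORD (String.sub input start (!pos - start))
      end
  in
  next

(* Collect all tokens of [input], including the final EOF. *)
let tokens input =
  let next = make_lexer input in
  let rec loop acc =
    match next () with
    | EOF -> List.rev (EOF :: acc)
    | t -> loop (t :: acc)
  in
  loop []
```

Calling `tokens "a \\begin{verbatim}x  \\y\\end{verbatim} b"` keeps the two spaces and the backslash inside the verbatim section in a single `RAW` token, while the text outside is split into words as usual.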