From: Martin Jambon <martin.jambon@e...>
Subject: Re: [Caml-list] Re: The lexer hack

Dario Teixeira wrote:
> Hi,
>
>> Interesting. Have you confirmed that this works? I am slightly
>> worried by the fact that an LR parser reads one token ahead,
>> i.e. one token past BEGIN_VERB might already have been read
>> before the enter_verb semantic action is executed. If that is
>> so, then this token would be read while the lexer is still in
>> the wrong mode.
>
> Yes, I was just thinking about that as well... :-)
> I think I can pile another hack on top of the dummy action:
> dummy tokens to take care of the readahead issue. Though
> this has the potential to get comically silly pretty quickly!
>
> I'll report later...
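
If a single token (BEGIN_VERB) is enough to determine which lexer to use, I
think you can switch the state inside the lexer itself. The mode then changes
as soon as BEGIN_VERB is lexed, whether or not the parser reads one token
ahead. Something like this:
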
    rule token state = parse
        "" { match !state with
                 `Normal -> normal_token state lexbuf
               | `Verbatim -> verbatim_token state lexbuf
           }

    and normal_token state = parse
        ...
      | "\\begin{verbatim}" { state := `Verbatim; BEGIN_VERB }

    and verbatim_token state = parse
        ... { RAW (...) }
      | "\\end{verbatim}" { state := `Normal; END_VERB }

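For illustration only, the same idea can be sketched in plain OCaml without ocamllex. This is a hand-written stand-in, not the generated lexer: the `mode` type, the `TEXT` token, and the helpers `make_lexer` and `tokens` are names assumed for the example. It shows that the mode flips inside the lexer when the delimiter is produced, so the next token requested is already lexed in the right mode.

```ocaml
(* Hypothetical token type; in a real project it would come from the parser. *)
type token = BEGIN_VERB | END_VERB | RAW of string | TEXT of string | EOF

type mode = Normal | Verbatim

let begin_tag = "\\begin{verbatim}"
let end_tag = "\\end{verbatim}"

(* True if [tag] occurs in [s] exactly at position [pos]. *)
let starts_with_at s pos tag =
  let n = String.length tag in
  pos + n <= String.length s && String.sub s pos n = tag

(* Index of the first occurrence of [tag] at or after [pos],
   or the length of [s] if there is none. *)
let rec find_tag s pos tag =
  if pos >= String.length s then String.length s
  else if starts_with_at s pos tag then pos
  else find_tag s (pos + 1) tag

(* Build a token function that owns its mode, like [token state] above. *)
let make_lexer s =
  let mode = ref Normal and pos = ref 0 in
  fun () ->
    if !pos >= String.length s then EOF
    else
      (* The delimiter that ends the current mode. *)
      let tag = match !mode with Normal -> begin_tag | Verbatim -> end_tag in
      if starts_with_at s !pos tag then begin
        pos := !pos + String.length tag;
        (* Switch mode here, before any further token is requested. *)
        match !mode with
        | Normal -> mode := Verbatim; BEGIN_VERB
        | Verbatim -> mode := Normal; END_VERB
      end else begin
        let stop = find_tag s !pos tag in
        let chunk = String.sub s !pos (stop - !pos) in
        pos := stop;
        match !mode with Normal -> TEXT chunk | Verbatim -> RAW chunk
      end

(* Drain the lexer into a list, ending with EOF. *)
let tokens s =
  let next = make_lexer s in
  let rec go acc =
    match next () with
    | EOF -> List.rev (EOF :: acc)
    | t -> go (t :: acc)
  in
  go []
```

Calling `tokens` on a string containing one verbatim section should yield the text before it, BEGIN_VERB, a single RAW chunk, END_VERB, and the text after it. A parser that has already asked for the token after BEGIN_VERB still receives RAW, because the switch did not wait for a semantic action.
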
An even simpler option, if it works in your case, is to return the whole
verbatim section as a single token:

    rule token = parse
        ...
      | "\\begin{verbatim}" { finish_verbatim lexbuf }

    and finish_verbatim = shortest
        (_* as s) "\\end{verbatim}" { RAW s }

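To make the role of `shortest` concrete, here is a rough plain-OCaml equivalent of that rule, written by hand rather than generated by ocamllex. The function signature, the `option` result for an unterminated section, and the returned position are assumptions made for the sketch. The point is that scanning stops at the first end tag, so a later verbatim section in the same input is not swallowed.

```ocaml
(* Hypothetical token carrying the whole verbatim body. *)
type token = RAW of string

let end_tag = "\\end{verbatim}"

(* Scan from [pos] (just after "\\begin{verbatim}") to the FIRST end tag,
   mirroring the shortest-match rule. Returns the token and the position
   just past the end tag, or None if no end tag is found. *)
let finish_verbatim s pos =
  let n = String.length end_tag in
  let rec go i =
    if i + n > String.length s then None
    else if String.sub s i n = end_tag then
      Some (RAW (String.sub s pos (i - pos)), i + n)
    else go (i + 1)
  in
  go pos
```

With input containing two end tags, the body returned is only the text up to the first one. A longest-match rule would instead run to the last end tag.
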
Martin
--
http://mjambon.com/