English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    

Ce site est rarement mis à jour. Pour les informations les plus récentes, rendez-vous sur le nouveau site OCaml à l'adresse ocaml.org.

Browse thread
The lexer hack
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2009-11-10 (14:42)
From: Dario Teixeira <darioteixeira@y...>
Subject: The lexer hack

I'm creating a parser for a LaTeX-ish language that features verbatim blocks.
To handle them I want to switch lexers on-the-fly, depending on the parsing
context.  Therefore, I need the state from the (Menhir generated) parser
to influence the lexing process (I believe this is called the "lexer hack"
in compiler lore).

Presently I am doing this by placing a module between the lexer and the
parser, listening in on the flow of tokens, and using a crude state machine
to figure out the parsing context.  This solution is however error-prone
and a bit wasteful, since I'm reimplementing by hand stuff that should be
the sole competence of the parser generator.

Anyway, since I'm sure this problem pops up often, does someone have any
alternative suggestions?  I would preferably keep Menhir, but I'll switch
if some other generator offers a better approach(*).

Thanks + best regards,
Dario Teixeira

(*) I've looked into Dypgen, and its partial actions may offer a way out.
    Does someone have any experience with those and with real-world usage
    of Dypgen in general?  (In other words, is it stable enough for
    production use?)