Hacking the lexer in the new camlp4
| Date: | -- (:) |
| From: | skaller <skaller@u...> |
| Subject: | Re: [Caml-list] Hacking the lexer in the new camlp4 |
On Thu, 2007-03-29 at 17:29 +0200, Nicolas Pouillard wrote:
> On 3/29/07, Harrison, John R <john.r.harrison@intel.com> wrote:
> > | > In the current camlp4, the only way I found to do this was basically
> > | > to copy the existing lexer and edit it. Although it works, it's ugly
> > | > and invariably means that I've had to change something with almost
> > | > every new version of camlp4. Does the new camlp4 offer a nicer way of
> > | > changing the lexer?
> > | >
> > | How did you that in the previous one without copy/paste the old lexer?

> Ok, so the new one is build with ocamllex, so it's not really extensible.

This doesn't follow entirely. There are at least two ways
to extend ocamllex lexers.

1. Recursively process a given lexeme.

2. Dispatch the error case with enough information to start
another lexer.

Method 1 can either tokenise the given lexeme, or simply
use it as a trigger to start another lexer. Method 2 is
just a special case of method 1.

All you really need is to pass the lexer a class with an
overridable method for each lexical class the lexer recognizes,
which accepts the state data, and returns a list of tokens.

Ocamllex currently ensures that the state of the buffer is
ready to process the next character after the lexeme just
decoded, even if it had to overshoot to get there.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
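
The handler-class idea above can be sketched roughly as follows. This is
only an illustration under stated assumptions: it uses a small hand-written
scanner in place of ocamllex-generated code, and every name in it (token,
state, handlers, quoting_handlers, tokenize) is hypothetical and not part of
camlp4 or ocamllex. Each handler is called with the state already positioned
just past the lexeme, and returns a list of tokens. The extension overrides
only the error case and uses it as a trigger to run a different scanner.

```ocaml
(* Hypothetical token type for this sketch; not the camlp4 token type. *)
type token =
  | IDENT of string
  | INT of int
  | QUOTED of string   (* produced only by the extended handlers *)
  | SYM of char

(* Scanner state passed to every handler: the input text and the index
   of the next unread character. *)
type state = { src : string; mutable pos : int }

let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'
let is_digit c = c >= '0' && c <= '9'

(* Advance while [p] holds and return the consumed text. *)
let take_while st p =
  let start = st.pos in
  while st.pos < String.length st.src && p st.src.[st.pos] do
    st.pos <- st.pos + 1
  done;
  String.sub st.src start (st.pos - start)

(* One overridable method per lexical class. Each receives the state and
   the lexeme just matched, and returns a list of tokens. *)
class handlers = object
  method ident (_ : state) (s : string) : token list = [IDENT s]
  method number (_ : state) (s : string) : token list = [INT (int_of_string s)]
  (* The error case: a character the base scanner has no rule for. *)
  method other (_ : state) (c : char) : token list = [SYM c]
end

(* The base scanner. It recognises lexemes and delegates to [h]. *)
let tokenize (h : #handlers) (src : string) : token list =
  let st = { src; pos = 0 } in
  let rec loop acc =
    if st.pos >= String.length src then List.rev acc
    else begin
      let c = src.[st.pos] in
      if c = ' ' then begin
        st.pos <- st.pos + 1;
        loop acc
      end else if is_alpha c then begin
        let s = take_while st (fun c -> is_alpha c || is_digit c) in
        loop (List.rev_append (h#ident st s) acc)
      end else if is_digit c then begin
        let s = take_while st is_digit in
        loop (List.rev_append (h#number st s) acc)
      end else begin
        st.pos <- st.pos + 1;
        loop (List.rev_append (h#other st c) acc)
      end
    end
  in
  loop []

(* Extension: override only the error case. A backquote starts a second
   scanner that reads up to the closing backquote and leaves the state
   positioned just past it. Everything else falls back to the base. *)
class quoting_handlers = object
  inherit handlers as super
  method! other st c =
    if c = '`' then begin
      let body = take_while st (fun c -> c <> '`') in
      if st.pos < String.length st.src then (st.pos <- st.pos + 1);
      [QUOTED body]
    end else super#other st c
end

(* Same scanner, two behaviours. *)
let base = tokenize (new handlers) "x `a+b` 42"
(* [IDENT "x"; SYM '`'; IDENT "a"; SYM '+'; IDENT "b"; SYM '`'; INT 42] *)
let extended = tokenize (new quoting_handlers) "x `a+b` 42"
(* [IDENT "x"; QUOTED "a+b"; INT 42] *)
```

The base scanner is never copied or edited here; the extension lives
entirely in the subclass. Whether the camlp4 lexer could expose such hooks
is a separate question that this sketch does not answer.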