Hacking the lexer in the new camlp4
Date: -- (:)
From: skaller <skaller@u...>
Subject: Re: [Caml-list] Hacking the lexer in the new camlp4
On Thu, 2007-03-29 at 17:29 +0200, Nicolas Pouillard wrote:
> On 3/29/07, Harrison, John R <john.r.harrison@intel.com> wrote:
> > | > In the current camlp4, the only way I found to do this was basically
> > | > to copy the existing lexer and edit it. Although it works, it's ugly
> > | > and invariably means that I've had to change something with almost
> > | > every new version of camlp4. Does the new camlp4 offer a nicer way of
> > | > changing the lexer?
> > |
> > | How did you do that in the previous one without copying and pasting the old lexer?

> Ok, so the new one is built with ocamllex, so it's not really extensible.

This doesn't follow entirely. There are at least two ways to 
extend ocamllex lexers.

1. Recursively process a given lexeme. 

2. Dispatch the error case with enough information to start
another lexer.

Method 1 can either tokenise the given lexeme, or simply
use it as a trigger to start another lexer. Method 2 is
just a special case of method 1.
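
To make method 2 concrete, here is a minimal sketch. It assumes a hand-written base lexer standing in for ocamllex-generated code, so that it runs without the generator. The names `lex`, `extend` and `quote_ext`, and the token type, are all hypothetical and are not part of camlp4 or ocamllex. In the error case the base lexer hands the input and the current offset to `extend`, which may run a second lexer and report where the base lexer should resume.

```ocaml
(* Hypothetical token type, for illustration only. *)
type token = Ident of string | Int of int | Quote of string | Unknown of char

let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Scan forward from [i] while [p] holds; return the index just past the run. *)
let rec span p s i =
  if i < String.length s && p s.[i] then span p s (i + 1) else i

(* Base lexer (stand-in for generated code). In the error case it calls
   [extend s i] with the input and the offset of the offending character.
   [extend] may return tokens plus the offset at which to resume. *)
let rec lex ~extend s i =
  if i >= String.length s then []
  else
    let c = s.[i] in
    if c = ' ' then lex ~extend s (i + 1)
    else if is_alpha c then
      let j = span is_alpha s i in
      Ident (String.sub s i (j - i)) :: lex ~extend s j
    else if is_digit c then
      let j = span is_digit s i in
      Int (int_of_string (String.sub s i (j - i))) :: lex ~extend s j
    else
      match extend s i with
      | Some (toks, j) -> toks @ lex ~extend s j
      | None -> Unknown c :: lex ~extend s (i + 1)

(* Extension: a backquote starts a second lexer that reads up to the
   closing backquote and yields a single Quote token. *)
let quote_ext s i =
  if s.[i] = '`' then
    match String.index_from_opt s (i + 1) '`' with
    | Some j -> Some ([Quote (String.sub s (i + 1) (j - i - 1))], j + 1)
    | None -> None
  else None
```

With this sketch, `lex ~extend:quote_ext "x `a b` 42" 0` hands the backquoted part to the extension, while `lex ~extend:(fun _ _ -> None)` keeps the base behaviour and reports the character as `Unknown`.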

All you really need is to pass the lexer an object whose class
has an overridable method for each lexical class the lexer
recognizes. Each method accepts the state data and returns
a list of tokens.
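
A rough sketch of that idea, again with a hand-written lexer standing in for the ocamllex-generated one. The class `handlers`, the record `state`, the token type and the function `lex` are assumptions made up for illustration; nothing here is the camlp4 API.

```ocaml
(* Hypothetical token type, for illustration only. *)
type token = Ident of string | Keyword of string | Int of int | Sym of char

(* State handed to every handler; here only the offset where the lexeme starts. *)
type state = { pos : int }

(* One overridable method per lexical class; each returns a list of tokens. *)
class handlers = object
  method ident (_ : state) (s : string) : token list = [Ident s]
  method number (_ : state) (s : string) : token list = [Int (int_of_string s)]
  method symbol (_ : state) (c : char) : token list = [Sym c]
end

let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Scan forward from [i] while [p] holds; return the index just past the run. *)
let rec span p s i =
  if i < String.length s && p s.[i] then span p s (i + 1) else i

(* Stand-in for the generated lexer: it only classifies lexemes and
   delegates token construction to the object [h]. *)
let lex (h : #handlers) s =
  let n = String.length s in
  let rec go i =
    if i >= n then []
    else
      let c = s.[i] in
      if c = ' ' then go (i + 1)
      else if is_alpha c then
        let j = span is_alpha s i in
        h#ident { pos = i } (String.sub s i (j - i)) @ go j
      else if is_digit c then
        let j = span is_digit s i in
        h#number { pos = i } (String.sub s i (j - i)) @ go j
      else h#symbol { pos = i } c @ go (i + 1)
  in
  go 0

(* Extension by inheritance: selected identifiers become keywords;
   everything else falls through to the inherited behaviour. *)
class with_keywords = object
  inherit handlers as super
  method! ident st s =
    if s = "let" || s = "in" then [Keyword s] else super#ident st s
end
```

Here `lex (new handlers) "let x"` treats `let` as an ordinary identifier, while `lex (new with_keywords) "let x"` returns it as a keyword, without touching the lexer itself. An overriding method could equally start another lexer on the lexeme and return several tokens.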

Ocamllex currently ensures that the state of the buffer
is ready to process the next character after the lexeme
just decoded, even if it had to overshoot to get
there.


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net