[Caml-list] On ocamlyacc and ocamllex
Date: 2001-09-23 (17:44)
From: Christian Lindig <lindig@e...>
Subject: Re: [Caml-list] On ocamlyacc and ocamllex
On Sun, Sep 23, 2001 at 07:27:36PM +0300, Vesa Karvonen wrote:
> From: "Christian Lindig" <>
> > The particular problem can be solved outside of Lex and Yacc: in the
> > Quick C-- compiler we have a mutable data type that
> > records the connection between character positions and
> > (file,line,column) triples.
> This is basically the same technique that I have been using. The problem is
> that the map has to be global, because the only context passed to the lexer
> actions is the lexbuf. 

You can pass the map to the lexer so that it does not have to be global:

    rule token = parse
        eof         { fun map -> P.EOF          }
      | ws+         { fun map -> token lexbuf map }
      | tab         { fun map -> tab lexbuf map; token lexbuf map }
      | nl          { fun map -> nl lexbuf map ; token lexbuf map }
      | nl '#'      { fun map -> line lexbuf map 0; token lexbuf map }

The lexer built from the above specification takes a lexbuf and a map as
arguments.
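
To make the idea concrete, here is a minimal sketch in plain OCaml of what such a map might look like. It is not the actual Quick C-- module: the names create, newline, location and scan are hypothetical, and scan is only a hand-written stand-in for the generated lexer, which would be called as token lexbuf map instead.

```ocaml
(* Hypothetical source map, not the Quick C-- implementation. *)
type map = {
  file : string;
  (* (offset of line start, line number), most recent line first *)
  mutable starts : (int * int) list;
}

(* A fresh map for [file]; line 1 starts at offset 0. *)
let create file = { file; starts = [ (0, 1) ] }

(* Record that a new line begins at character offset [pos]. *)
let newline map pos =
  match map.starts with
  | (_, n) :: _ -> map.starts <- (pos, n + 1) :: map.starts
  | [] -> map.starts <- [ (pos, 1) ]

(* Translate a character offset into a (file, line, column) triple.
   Lines and columns are counted from 1. *)
let location map pos =
  let rec find = function
    | (start, n) :: _ when start <= pos -> (map.file, n, pos - start + 1)
    | _ :: rest -> find rest
    | [] -> (map.file, 1, pos + 1)
  in
  find map.starts

(* Stand-in for the generated lexer: it receives the map as an
   argument and records every line start it sees in the input [s]. *)
let scan s map =
  String.iteri (fun i c -> if c = '\n' then newline map (i + 1)) s
```

Because the map is an ordinary argument, each input file can get its own map, and later phases only need to carry character offsets around until an error message asks for a (file,line,column) triple.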

> Furthermore, the records need to be manually removed (in order to save
> memory) after a file has been processed completely and the recorded
> connections for the file are no longer needed. 

I assume that in a functional programming style without a global mutable
value the garbage collector will remove the map once I cannot access it
any longer.
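
A small sketch of what I mean, under the assumption that the map is created per file and not stored anywhere global. The function count_lines and the use of a Hashtbl are hypothetical and chosen only for illustration:

```ocaml
(* Hypothetical helper: the map lives only for the duration of one call. *)
let count_lines (input : string) : int =
  (* offset of line start -> line number; local, not a global table *)
  let map : (int, int) Hashtbl.t = Hashtbl.create 16 in
  Hashtbl.replace map 0 1;
  let line = ref 1 in
  String.iteri
    (fun i c ->
      if c = '\n' then begin
        incr line;
        Hashtbl.replace map (i + 1) !line
      end)
    input;
  (* After this function returns nothing refers to [map] any more,
     so the garbage collector is free to reclaim it; no manual
     removal of records is needed. *)
  Hashtbl.length map
```

With a global table the entries stay reachable and have to be removed by hand; here reachability alone decides the lifetime of the map.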

> The basic idea was to put the token type definition into a separate
> module.  Instead of two source files, you would have three source
> files:
>     lexer.mll parser.mly

> In parser.mly there would be code that would tell ocamlyacc to look at
> for the token type.

Now you would have to keep the token type and the grammar up to date
manually.  The parser generator also needs more information than just
the token types: precedences, associativity, and return types are tied
to a token. Where do you keep them? I still think that generating the
token type from the grammar is the easiest way.

-- Christian

Christian Lindig          Harvard University - DEAS   33 Oxford St, MD 242, Cambridge MA 02138
phone: +1 (617) 496-7157