another question on lexer function
Date: -- (:)
From: Pierre Weis <Pierre.Weis@i...>
Subject: Re: another question on lexer function
> is it forbidden  to use the ";;" token  in a make_lexer function ?

You cannot use an arbitrary sequence of characters as a declared
keyword: each sequence you declare must be recognized as an identifier
by the ``next_token'' function inside make_lexer (or be a single
non-alphanumeric character). The next_token function recognizes two
kinds of identifiers: regular ones (roughly speaking, sequences of
alphanumeric characters), and special ones or symbols (sequences of
certain non-alphanumeric characters, such as ++). Once an identifier
is recognized, it is compared with the list of declared keywords: if
it is found in the list, a Kwd token is returned; otherwise an Ident
token is returned.

The character ; is not among those that can form a special identifier,
so the sequence ;; is recognized as two single characters by
next_token (same treatment as for parens, brackets or commas). The
rule for single non-alphanumeric characters is that they must have
been declared as keywords; otherwise it is a lexical error. Since ;
has not been declared as a keyword of the lexer, an error occurs.
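
To make the rule concrete, here is a minimal sketch in OCaml of the
classification described above. It is a simplified model written for
this explanation, not the source of the genlex module: the names
tokenize, is_alnum, is_symbol and Lexical_error are hypothetical, the
set of symbol characters is an assumption, and numbers, strings,
character literals and comments are left out entirely.

```ocaml
(* Simplified, hypothetical model of the behaviour described above.
   It is NOT the genlex source: numbers, strings and comments are ignored. *)
type token = Kwd of string | Ident of string

exception Lexical_error of string

(* Characters that may appear in a regular identifier (assumption). *)
let is_alnum c =
  match c with
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' | '\'' -> true
  | _ -> false

(* Characters that may be chained into a special identifier such as ++
   (assumed set). Note that ';' is deliberately absent. *)
let is_symbol c = String.contains "!%&$#+-/:<=>?@\\~^|*" c

let tokenize keywords s =
  let n = String.length s in
  (* An identifier becomes Kwd if declared, Ident otherwise. *)
  let classify id = if List.mem id keywords then Kwd id else Ident id in
  (* Index just past the longest run of characters satisfying p from i. *)
  let rec run p i = if i < n && p s.[i] then run p (i + 1) else i in
  let rec loop i acc =
    if i >= n then List.rev acc
    else
      let c = s.[i] in
      if c = ' ' || c = '\t' || c = '\n' then loop (i + 1) acc
      else if is_alnum c then
        let j = run is_alnum i in
        loop j (classify (String.sub s i (j - i)) :: acc)
      else if is_symbol c then
        let j = run is_symbol i in
        loop j (classify (String.sub s i (j - i)) :: acc)
      else
        (* Any other character stands alone and must be a declared keyword. *)
        let id = String.make 1 c in
        if List.mem id keywords then loop (i + 1) (Kwd id :: acc)
        else raise (Lexical_error id)
  in
  loop 0 []
```

With this model, tokenize [";;"] "a ;; b" raises Lexical_error ";"
even though ";;" was declared, because the scanner never builds the
two-character string: it stops at the first ; and looks for it alone
in the keyword list. Declaring ";" instead gives two Kwd ";" tokens
in a row.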

Now, if you want to deal with a token ``;;'', you may:
 1) Declare ";" as a keyword, then interpret two successive `;' tokens as a
    ";;" in your grammar rule.
 2) Adapt the genlex module to your specific needs. I strongly
    recommend this solution, in particular if you want to understand
    stream parsing, or if you will use the lexer in an intensive way.
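
As an illustration of solution 1, here is a small sketch in OCaml. It
assumes the tokens have already been collected into a list, which is
a simplification: in a stream parser the same idea would be written
as a grammar rule matching two consecutive Kwd ";" tokens. The
function name merge_semis is hypothetical, and the token type is
redeclared here only to keep the example self-contained.

```ocaml
(* Same shape as the two lexer tokens discussed above; redeclared so the
   example stands alone. *)
type token = Kwd of string | Ident of string

(* Rewrite each pair of adjacent Kwd ";" tokens into a single Kwd ";;".
   A lone ";" is left untouched. *)
let rec merge_semis = function
  | Kwd ";" :: Kwd ";" :: rest -> Kwd ";;" :: merge_semis rest
  | tok :: rest -> tok :: merge_semis rest
  | [] -> []
```

One consequence of this approach: since the token sequence carries no
whitespace, the input "; ;" is treated exactly like ";;". If that
distinction matters to you, solution 2 is the one to choose.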

Best regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis