| Date: | -- (:) |
| From: | Xavier Leroy <Xavier.Leroy@i...> |
| Subject: | Re: GenLex stream parsers too eager? |
> It appears that the Genlex derived parsers always eagerly tokenize
> negative integer and float constants. This causes incorrect behavior
> in closely spaced code (no-spaces):
>
> a-2*c --> parses as "a", "-2", "*", "c" instead of "a", "-", "2", "*", "c"
>

Right. This is a classic compiler problem: one can either tokenize negative integer literals in the lexer (-?[0-9]+), which causes the weird behavior above for expressions without spaces, or have the lexer tokenize only positive integer literals ([0-9]+) and add a special case in the parser to recognize "-" followed by an integer literal. Genlex is very simple-minded and follows the former approach. The Caml compilers follow the latter.

(The latter approach has its own problems. For instance, in Caml, it parses "f -1" as "f minus 1", not as "f applied to the integer -1", as many users expect.)

> Any suggestions? (Perhaps I should be using OCAMLLEX and OCAMLYACC instead?)

You'll have to write your own lexer, indeed. You can either use ocamllex to generate it, or start with the source code of the Genlex module and customize it to your needs.

Best regards,

- Xavier Leroy
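
For illustration only, here is a minimal hand-written sketch of the second approach described above: the lexer recognizes only unsigned integer literals ([0-9]+), and a separate parser-side pass folds "-" followed by an integer into a negative literal where an operand is expected. It is not taken from Genlex or from the Caml compilers. The names token, tokenize and fold_unary_minus are hypothetical, and the rule used for "an operand is expected" (start of input, or right after an operator other than a closing parenthesis) is a simplifying assumption.

```ocaml
(* Hypothetical token type for this sketch; it is not Genlex.token. *)
type token = Ident of string | Int of int | Op of char

let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'

(* Lexer: integer literals are unsigned ([0-9]+), so '-' is always
   emitted as its own operator token. Spaces are skipped; any other
   character becomes a one-character operator. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  (* First index >= i at which predicate p no longer holds. *)
  let rec span p i = if i < n && p s.[i] then span p (i + 1) else i in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      let c = s.[i] in
      if c = ' ' then go (i + 1) acc
      else if is_digit c then
        let j = span is_digit i in
        go j (Int (int_of_string (String.sub s i (j - i))) :: acc)
      else if is_alpha c then
        let j = span is_alpha i in
        go j (Ident (String.sub s i (j - i)) :: acc)
      else go (i + 1) (Op c :: acc)
  in
  go 0 []

(* Parser-side special case: fold '-' followed by an integer literal
   into a negative literal, but only where an operand is expected
   (assumed here: at the start of input or right after an operator
   other than ')'). *)
let fold_unary_minus (toks : token list) : token list =
  let rec go prev_is_operand = function
    | Op '-' :: Int n :: rest when not prev_is_operand ->
        Int (-n) :: go true rest
    | (Op ')' as t) :: rest -> t :: go true rest
    | (Op _ as t) :: rest -> t :: go false rest
    | t :: rest -> t :: go true rest
    | [] -> []
  in
  go false toks
```

With this sketch, fold_unary_minus (tokenize "a-2*c") keeps "-" as a separate operator, while "a*-2" yields a negative literal. As noted above for Caml, "f -1" still reads as "f minus 1", because the "-" follows an identifier.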