Case-insensitive lexing
| Date: | -- (:) |
| From: | Martin Jambon <martin.jambon@e...> |
| Subject: | Re: [Caml-list] Case-insensitive lexing |
On Fri, 23 Feb 2007, Joel Reymont wrote:
>
> On Feb 23, 2007, at 3:50 PM, Christian Lindig wrote:
>
> > I assume you define identifier to a sequence of upper and lower
> > characters. What is wrong with this solution and what would you
> > have liked to see instead?
>
> I was hoping not having to write this (from a Lex file):
>
> A [Aa]
> B [Bb]
> ..
> Y [Yy]
> Z [Zz]
>
> {A}{B}{O}{V}{E} { return(ABOVE); }
> {A}{G}{O} { return(AGO); }
> {A}{L}{E}{R}{T} { return(ALERT); }
>
> Or even uglier
>
> {N}{U}{M}{E}{R}{I}{C}{A}{R}{R}{A}{Y}{R}{E}{F} ('ARRAY-NUM-REF, yytext)
> {T}{R}{U}{E}{F}{A}{L}{S}{E}{A}{R}{R}{A}{Y} ('ARRAY-NUM, yytext)
> {T}{R}{U}{E}{F}{A}{L}{S}{E}{A}{R}{R}{A}{Y}{R}{E}{F} ('ARRAY-NUM-REF, yytext)
The ocamllex version would look like:

let a = ['A' 'a']
let b = ['B' 'b']
...

rule token = parse
    a b o v e { ABOVE }
  | a g o     { AGO }
  | a l e r t { ALERT }
Note that micmatch, which is not ocamllex but supports the same core
syntax, has a tilde operator that makes a regexp case-insensitive
according to Latin-1. It doesn't work with every language or encoding,
though.
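
A different technique from the per-letter definitions above, and one often used with ocamllex, is to match any identifier with a single rule and look up its lowercased spelling in a keyword table. The following is only a minimal sketch under assumptions: the `token` type, the `IDENT` constructor and the `keyword_or_ident` helper are hypothetical names, not taken from the thread, and the folding is ASCII-only, so it carries the same encoding caveat.

```ocaml
(* Hypothetical token type: ABOVE, AGO and ALERT come from the thread,
   IDENT is an assumed catch-all for non-keywords. *)
type token = ABOVE | AGO | ALERT | IDENT of string

(* Keyword table keyed by the lowercase spelling. *)
let keywords : (string, token) Hashtbl.t = Hashtbl.create 16

let () =
  List.iter
    (fun (k, t) -> Hashtbl.add keywords k t)
    [ ("above", ABOVE); ("ago", AGO); ("alert", ALERT) ]

(* Classify a lexeme already matched by the lexer.
   String.lowercase_ascii folds ASCII letters only. *)
let keyword_or_ident (lexeme : string) : token =
  match Hashtbl.find_opt keywords (String.lowercase_ascii lexeme) with
  | Some tok -> tok
  | None -> IDENT lexeme
```

In an ocamllex file this helper would sit in the header section and be called from a single rule such as `['A'-'Z' 'a'-'z']+ as id { keyword_or_ident id }`, so adding a keyword means adding one table entry rather than one spelled-out regexp.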
Martin
--
Martin Jambon
http://martin.jambon.free.fr