Case-insensitive lexing
Date: -- (:)
From: Martin Jambon <martin.jambon@e...>
Subject: Re: [Caml-list] Case-insensitive lexing
On Fri, 23 Feb 2007, Joel Reymont wrote:

>
> On Feb 23, 2007, at 3:50 PM, Christian Lindig wrote:
>
> > I assume you define identifier to a sequence of upper and lower
> > characters. What is wrong with this solution and what would you
> > have liked to see instead?
>
> I was hoping not to have to write this (from a Lex file):
>
> A       [Aa]
> B       [Bb]
> ..
> Y       [Yy]
> Z       [Zz]
>
> {A}{B}{O}{V}{E}                 { return(ABOVE); }
> {A}{G}{O}                       { return(AGO); }
> {A}{L}{E}{R}{T}                 { return(ALERT); }
>
> Or even uglier
>
> {N}{U}{M}{E}{R}{I}{C}{A}{R}{R}{A}{Y}{R}{E}{F}         ('ARRAY-NUM-REF, yytext)
> {T}{R}{U}{E}{F}{A}{L}{S}{E}{A}{R}{R}{A}{Y}            ('ARRAY-NUM, yytext)
> {T}{R}{U}{E}{F}{A}{L}{S}{E}{A}{R}{R}{A}{Y}{R}{E}{F}   ('ARRAY-NUM-REF, yytext)

The ocamllex version would look like:

let a = ['A' 'a']
let b = ['B' 'b']
...

  a b o v e                 { ABOVE }
| a g o                     { AGO }
| a l e r t                 { ALERT }
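
A different technique from the per-letter rules above, and a common one in ocamllex lexers, is to match a generic identifier and look up its lowercased form in a keyword table. The following is only a sketch in plain OCaml, assuming the keywords are identifier-shaped and ASCII-only; the token type and the name token_of_lexeme are hypothetical and not taken from the original grammar.

```ocaml
(* Hypothetical token type; in a real project it would normally come
   from the parser. *)
type token = ABOVE | AGO | ALERT | IDENT of string

(* Keywords are stored in lowercase only. *)
let keywords : (string, token) Hashtbl.t = Hashtbl.create 16

let () =
  List.iter
    (fun (name, tok) -> Hashtbl.replace keywords name tok)
    [ ("above", ABOVE); ("ago", AGO); ("alert", ALERT) ]

(* Classify one identifier-shaped lexeme, ignoring ASCII case.
   Anything that is not a known keyword is returned as IDENT with
   its original spelling. *)
let token_of_lexeme (lexeme : string) : token =
  match Hashtbl.find_opt keywords (String.lowercase_ascii lexeme) with
  | Some tok -> tok
  | None -> IDENT lexeme
```

In an ocamllex rule this could be called from a single case such as
['A'-'Z' 'a'-'z']+ as id { token_of_lexeme id }, so the letter-by-letter definitions are no longer needed. The trade-off is that keywords are recognized after matching rather than by the regexp itself.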


Note that micmatch, which is not ocamllex but supports the same core
regexp syntax, has a tilde operator that makes a regexp case-insensitive
according to Latin-1. It does not work with every language or encoding,
though.
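
To illustrate why a Latin-1 notion of case does not carry over to other encodings, here is a small sketch of byte-wise Latin-1 lowercasing. It is not micmatch's implementation; the functions lowercase_latin1 and fold_latin1 are hypothetical helpers written only for this example.

```ocaml
(* Lowercase one byte under Latin-1 rules: A-Z and 0xC0-0xDE (except
   the multiplication sign 0xD7) map to the byte 0x20 higher. *)
let lowercase_latin1 (c : char) : char =
  let n = Char.code c in
  if (n >= 0x41 && n <= 0x5A) || (n >= 0xC0 && n <= 0xDE && n <> 0xD7)
  then Char.chr (n + 0x20)
  else c

(* Apply the Latin-1 rule to every byte of a string. *)
let fold_latin1 (s : string) : string = String.map lowercase_latin1 s

(* The same letter, E with acute accent, in two encodings. *)
let upper_latin1 = "\xC9"       (* uppercase, Latin-1: one byte *)
let lower_latin1 = "\xE9"       (* lowercase, Latin-1 *)
let upper_utf8 = "\xC3\x89"     (* uppercase, UTF-8: two bytes *)
let lower_utf8 = "\xC3\xA9"     (* lowercase, UTF-8 *)

(* Correct on Latin-1 input. *)
let latin1_ok = fold_latin1 upper_latin1 = lower_latin1

(* On UTF-8 input the lead byte 0xC3 is itself treated as a Latin-1
   uppercase letter, so the result is not the UTF-8 lowercase form. *)
let utf8_ok = fold_latin1 upper_utf8 = lower_utf8
```

Here latin1_ok is true and utf8_ok is false: the byte-wise rule changes the UTF-8 lead byte and leaves the continuation byte alone, so case-insensitive matching of accented letters only behaves as intended when the input really is Latin-1.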


Martin

--
Martin Jambon
http://martin.jambon.free.fr