About the O'Reilly book on the web
From: Till Varoquaux <till.varoquaux@g...>
Subject: Re: [Caml-list] About the O'Reilly book on the web
hello!
On 11/28/06, Philippe Wang <lists@philippewang.info> wrote:
> Hello,
...
> # "[0-9]+\.[0-9]+\.[0-9]+";;
> - : string = "[0-9]+\\.[0-9]+\\.[0-9]+"
>
> You can probably avoid warnings by backslashing your backslashes...
>
> Still I believe the OCaml Team should find another way to express
> regular expressions, because if \. and \\. both mean \\. then it is a
> very bad idea...
Although I do agree, the problem seems a little more complicated:
we are used to a more or less standard regexp syntax where special
chars can be escaped with a backslash (\). This obviously clashes with
the escaping of characters in strings if we pass strings to the
function defining the regular expressions.
I would recommend treating all warnings as errors:
-warn-error A
to avoid such conflicts.
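
To make the clash concrete, here is a minimal sketch using only the
standard library. The names `escaped` and `quoted` are made up for
illustration. The second form uses quoted string literals, which only
exist in OCaml 4.02 and later (well after this thread), so treat it as
an aside rather than something available at the time of writing.

```ocaml
(* The text the regexp engine should receive: [0-9]+\.[0-9]+\.[0-9]+ *)

(* Each backslash is doubled so that the string lexer passes one through. *)
let escaped = "[0-9]+\\.[0-9]+\\.[0-9]+"

(* Quoted string literal: no escape processing at all (OCaml 4.02+). *)
let quoted = {|[0-9]+\.[0-9]+\.[0-9]+|}

let () =
  (* Both literals denote the same 22-character string. *)
  assert (escaped = quoted);
  assert (String.length escaped = 22);
  print_endline escaped
```

Either spelling hands the same characters to whatever function builds
the regular expression; only the source-level notation differs.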

As far as I'm concerned, the deeper problem is that regular
expressions are neither syntax-checked nor type-checked when they are
specified through strings. Some languages integrate them as
first-class values, thus allowing these verifications. Another
solution would be to build them using an OCaml recursive sum type.
Although this would solve the syntax problem, it would make regexps very
tedious to write. A library offering both options can be found at:
http://www.lri.fr/~marche/regexp/
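
As a rough illustration of the sum-type idea (this is not the API of
the library linked above; the type `regexp` and the functions `plus`,
`matches` and `full_match` are hypothetical names for this sketch), a
regexp can be an ordinary OCaml value, so a malformed one is rejected
by the compiler instead of failing at run time:

```ocaml
(* A regular expression as a recursive sum type. *)
type regexp =
  | Char of char              (* one literal character *)
  | Range of char * char      (* any character between the two bounds *)
  | Seq of regexp * regexp    (* first regexp, then second *)
  | Alt of regexp * regexp    (* either regexp *)
  | Star of regexp            (* zero or more repetitions *)

(* One or more repetitions, derived from Seq and Star. *)
let plus r = Seq (r, Star r)

(* Backtracking matcher: try to match r in s from index i and call the
   continuation k with the index reached. *)
let rec matches r s i k =
  match r with
  | Char c -> i < String.length s && s.[i] = c && k (i + 1)
  | Range (lo, hi) ->
    i < String.length s && lo <= s.[i] && s.[i] <= hi && k (i + 1)
  | Seq (a, b) -> matches a s i (fun j -> matches b s j k)
  | Alt (a, b) -> matches a s i k || matches b s i k
  | Star a ->
    (* The test j > i stops looping on bodies that match the empty string. *)
    k i || matches a s i (fun j -> j > i && matches (Star a) s j k)

(* True when r matches the whole of s. *)
let full_match r s = matches r s 0 (fun j -> j = String.length s)

(* The version-number pattern from the quoted message, built by hand. *)
let digits = plus (Range ('0', '9'))
let version =
  Seq (digits, Seq (Char '.', Seq (digits, Seq (Char '.', digits))))

let () = assert (full_match version "3.09.3")
```

No backslash appears anywhere, but the definition of `version` also
shows how tedious this notation gets compared with a one-line string.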

Ideally one would want to precompile regular expressions from strings
to actual constructed types using a preprocessor (e.g. camlp4). It
seems Francois Pottier was one of the first to try such an approach:
[http://caml.inria.fr/pub/ml-archives/caml-list/2001/07/30b327c7c4b0fa5ace86dbf258e2c5d1.en.html]
I'm pretty sure this has been done in other libraries (regexp-pp, for
instance). Actual type-checking might prove a little harder to get
working.
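
The translation step such a preprocessor would perform can be sketched
as a plain function from strings to a constructed type. This is not
camlp4 code and it runs at run time rather than compile time; the type
`regexp`, the function `parse`, and the tiny syntax it accepts (literal
characters, `\c` escapes, `[a-b]` ranges, postfix `*` and `+`) are all
assumptions made for the sake of the example.

```ocaml
type regexp =
  | Empty                     (* matches the empty string *)
  | Char of char
  | Range of char * char
  | Seq of regexp * regexp
  | Star of regexp

(* Translate a string into a regexp value, raising Failure on bad syntax. *)
let parse (src : string) : regexp =
  let n = String.length src in
  (* Read one atom starting at index i; return it with the next index. *)
  let atom i =
    match src.[i] with
    | '\\' ->
      if i + 1 >= n then failwith "dangling backslash";
      (Char src.[i + 1], i + 2)
    | '[' ->
      if i + 4 < n && src.[i + 2] = '-' && src.[i + 4] = ']'
      then (Range (src.[i + 1], src.[i + 3]), i + 5)
      else failwith "malformed character range"
    | '*' | '+' -> failwith "nothing to repeat"
    | c -> (Char c, i + 1)
  in
  (* Read atoms until the end, applying an optional postfix operator. *)
  let rec items i =
    if i >= n then Empty
    else begin
      let a, j = atom i in
      let a, j =
        if j < n && src.[j] = '*' then (Star a, j + 1)
        else if j < n && src.[j] = '+' then (Seq (a, Star a), j + 1)
        else (a, j)
      in
      Seq (a, items j)
    end
  in
  items 0

(* Syntax errors show up once, where the regexp is built. *)
let version = parse "[0-9]+\\.[0-9]+\\.[0-9]+"
```

A real preprocessor would run this kind of translation while compiling,
so that a malformed pattern becomes a compile-time error.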
Cheers,
Till
P.S.: I confirm: I did receive your mail ;-)