About the O'Reilly book on the web
Date: 2006-11-28 (23:07)
From: Philippe Wang <lists@p...>
Subject: Re: [Caml-list] About the O'Reilly book on the web

> Although I do agree the problem seems a little more complicated:
> we are used to a more or less standard regexp syntax where special
> chars can be escaped by \, this obviously clashes with escaping
> characters in strings if we pass strings to the function defining the
> regular expressions....
> I would recommend treating all warnings as errors:
> -warn-error A
> to avoid such conflicts.

I am not used to using regexps with Caml... (although I sometimes write
scripts in OCaml instead of Bash, Perl, or even PHP...)
But systematically writing a double backslash for a single regexp
backslash, and a quadruple backslash for a backslashed backslash... Well,
I wouldn't do it!
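
To make the backslash counting concrete, here is a minimal sketch using only the standard library. The names group_open and literal_backslash are just for illustration. The OCaml lexer consumes one level of backslashes before any regexp engine sees the string, so a regexp syntax that writes groups as \( ... \) (as the Str library does) needs two backslashes in the source, and a regexp matching one literal backslash needs four.

```ocaml
(* What the regexp engine receives is the string AFTER OCaml's own
   escape processing. *)

(* Source text has two backslashes; the string holds the two
   characters \ and ( which the regexp syntax reads as "open group". *)
let group_open = "\\("

(* Source text has four backslashes; the string holds two, which the
   regexp syntax reads as one escaped (literal) backslash. *)
let literal_backslash = "\\\\"

let () =
  Printf.printf "%s has length %d\n" group_open (String.length group_open);
  Printf.printf "%s has length %d\n" literal_backslash
    (String.length literal_backslash)
```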

> As far as I'm concerned I find the problem to be more complicated:
> regular expressions are not syntactically checked nor are they
> type-checked when specified through strings. Some languages integrate
> them as first-class values, thus allowing these verifications.

Last semester, some friends and I wrote a Caml compiler (one that keeps
type information at runtime), and one idea (which we did not
implement for lack of time) was first-class regexps, a bit
like in Perl, mixed with a match-with-like syntax... (If
only we could have as much time as we want or need...)

It could be something "usable", like:

matchr (* the matchr keyword is an example *)
   "some string"
| "PLOP 42 - \([0-9]\)" -> Int (int_of_string $1)
| "PLOP 43 - \(.*\)" -> $1

Then you can close it with an "end" keyword, as in Coq, or just keep
the OCaml syntax... (implicit closing, whatever)

Well, of course, one drawback is that $ is a character allowed in infix
operators (but whatever, it's not so important)
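
As a rough sketch of what such a matchr could expand to, here is the example above written by hand against the Str library that ships with OCaml (it has to be linked, e.g. with ocamlfind ocamlopt -package str -linkpkg). Several things here are assumptions and not part of the proposal: the function name classify, the value type that wraps both branches so that they share one type, and the None result when no case matches. Each $1 simply becomes a call to Str.matched_group 1.

```ocaml
(* Hypothetical result type: the two branches of the matchr example
   return an integer and a string, so they are wrapped in one type. *)
type value = Int of int | Text of string

(* The backslashes are doubled because these are ordinary OCaml strings. *)
let re_42 = Str.regexp "PLOP 42 - \\([0-9]\\)"
let re_43 = Str.regexp "PLOP 43 - \\(.*\\)"

(* Try each regexp in order, as a match-with would try its cases.
   Str.string_match anchors the match at position 0 but does not
   require the regexp to consume the whole string. *)
let classify (s : string) : value option =
  if Str.string_match re_42 s 0 then
    Some (Int (int_of_string (Str.matched_group 1 s)))
  else if Str.string_match re_43 s 0 then
    Some (Text (Str.matched_group 1 s))
  else
    None
```

The hand-written version shows what the syntax would save: the regexps lose their doubled backslashes, and the group accesses no longer need the subject string repeated.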

Then the question is still "What do we want OCaml to be?"
Do we want to make regexps easy to use with OCaml ? And are we ready to 
make OCaml bigger ("just") for that ?

> Another
> solution would be to build them using an OCaml recursive sum type.
> Although this would solve the syntax problem it would make regexp very
> tedious to write. A library offering both options can be found at:

It looks really ... "not funny" I would say!
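
To show why, here is a minimal sketch of the sum-type approach. It is not the API of the library mentioned above: the type re, its constructors, and the helpers literal and matches are all made up for illustration, and Group only marks a group without capturing anything.

```ocaml
(* Hypothetical regexp AST; constructor names are illustrative only. *)
type re =
  | Empty                  (* matches the empty string *)
  | Char of char
  | Any                    (* .     *)
  | Range of char * char   (* [a-z] *)
  | Seq of re * re
  | Alt of re * re
  | Star of re
  | Group of re            (* \( ... \), no capture in this sketch *)

(* Turn a literal string into a chain of Char nodes. *)
let literal (s : string) : re =
  let n = String.length s in
  let rec go i = if i >= n then Empty else Seq (Char s.[i], go (i + 1)) in
  go 0

(* "PLOP 42 - \([0-9]\)" written as a value: checked by the compiler,
   but much longer than the string form. *)
let plop42 = Seq (literal "PLOP 42 - ", Group (Range ('0', '9')))

(* Small backtracking matcher in continuation-passing style:
   k receives the position reached after matching the current node. *)
let rec m re s i k =
  let n = String.length s in
  match re with
  | Empty -> k i
  | Char c -> i < n && s.[i] = c && k (i + 1)
  | Any -> i < n && k (i + 1)
  | Range (lo, hi) -> i < n && lo <= s.[i] && s.[i] <= hi && k (i + 1)
  | Seq (a, b) -> m a s i (fun j -> m b s j k)
  | Alt (a, b) -> m a s i k || m b s i k
  (* j > i guards against looping on an empty iteration *)
  | Star a -> m a s i (fun j -> j > i && m (Star a) s j k) || k i
  | Group a -> m a s i k

(* Whole-string match. *)
let matches (re : re) (s : string) : bool =
  m re s 0 (fun j -> j = String.length s)
```

With these definitions, matches plop42 "PLOP 42 - 7" holds and matches plop42 "PLOP 42 - x" does not. A malformed regexp cannot even be written, but every pattern becomes a nest of constructors.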

> Ideally one would want to precompile regular expression from strings
> to actual constructed types using a preprocessor (e.g. camlp4). It
> seems François Pottier was one of the first to try such an approach:
> [] 
> I'm pretty sure this has been done in other libraries (regexp-pp  for
> instance). Actual type-checking might prove a little harder to get
> working.

Actually I don't really see the typing problem...
If everything returned through $n has type string, then there is no
issue... We can also easily detect that $4 does not exist for a regexp
such as "\(PLOP[4-2]\) 42.*" :-p
(or decide that $4 would be an empty string, but that would seem a bit dirty)
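
A minimal sketch of that check, assuming the \( ... \) group syntax used in this thread and assuming that a backslash always escapes the next character (even inside [...], which a real regexp parser would handle more carefully). The names count_groups and check_ref are hypothetical.

```ocaml
(* Count the \( group openers in a regexp string. An escaped backslash
   is skipped as a pair, so it is never mistaken for a group opener. *)
let count_groups (re : string) : int =
  let n = String.length re in
  let rec go i acc =
    if i >= n then acc
    else if re.[i] = '\\' && i + 1 < n then
      go (i + 2) (if re.[i + 1] = '(' then acc + 1 else acc)
    else go (i + 1) acc
  in
  go 0 0

(* Reject a $k reference when the regexp has fewer than k groups. *)
let check_ref (re : string) (k : int) : (unit, string) result =
  let g = count_groups re in
  if k >= 1 && k <= g then Ok ()
  else Error (Printf.sprintf "$%d used, but the regexp has %d group(s)" k g)
```

For the regexp above, count_groups finds one group, so check_ref accepts $1 and rejects $4. A preprocessor could run the same count at compile time and report the bad reference as an error.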

Anyways, it would probably be more "a good thing" than "a bad thing"...


> P.S.: I confirm: I did receive your mail ;-)...

PS: I still haven't understood how this works...
  Because the mail that got through is the one that was sent with the
wrong address o_O
  Maybe I'll understand one day soon...
(so I am trying again with the "wrong address" since that seems to work better