[Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
From: Markus Mottl <markus@m...>
Subject: Re: [Caml-list] standard regex package
On Fri, 24 Aug 2001, Xavier Leroy wrote:
> The last time this topic came up on this list, I said that we aren't
> opposed to put PCRE in the OCaml distribution (provided Markus agrees
> with that, of course).

No objection on my side. That's why I have LGPLed it.

> BUT: in the name of backward compatibility, we must have an
> Str-compatible interface to this library (same functions and same
> regexp syntax as in Str), in addition to the native PCRE interface.

As mentioned in our last discussion on this topic, backward
compatibility would require writing a stateful interface around
PCRE, conversion functions for regular expressions, and compatible
implementations of the other functions. Is this really necessary? Why
not just keep the old Str module and deprecate its use? Of course, if
the strange behaviour of Str with respect to large regexps is severe,
somebody would have to do it, if debugging Richard Stallman's code is
not an option... ;)
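
To make the two pieces of work concrete, here is a rough sketch, not a
proposal for the actual implementation. It assumes a stateless engine
that returns match bounds; `stateless_match` below is a hypothetical
stand-in that only does a literal prefix test, not real PCRE matching.
The wrapper keeps the last match in hidden global state, Str-style, and
`str_to_pcre` is an equally hypothetical converter that only handles
grouping, alternation and braces (character classes are ignored).

```ocaml
(* Hypothetical stand-in for a stateless engine: a literal prefix test,
   NOT real PCRE matching. It returns the bounds instead of storing them. *)
let stateless_match (pat : string) (s : string) (pos : int) : (int * int) option =
  let n = String.length pat in
  if pos >= 0 && pos + n <= String.length s && String.sub s pos n = pat
  then Some (pos, pos + n)
  else None

(* Str-style layer: bounds of the last successful match are global state. *)
let last_match : (int * int) option ref = ref None

let string_match pat s pos =
  last_match := stateless_match pat s pos;  (* this sketch clears it on failure *)
  !last_match <> None

(* Raising Not_found when there is no match is this sketch's own choice. *)
let bounds () = match !last_match with Some b -> b | None -> raise Not_found
let match_beginning () = fst (bounds ())
let match_end () = snd (bounds ())
let matched_string s = let (b, e) = bounds () in String.sub s b (e - b)

(* Partial syntax conversion: Str writes grouping and alternation as
   \( \) \| and treats bare ( ) | { } as literals; PCRE is the reverse. *)
let str_to_pcre (re : string) : string =
  let n = String.length re in
  let buf = Buffer.create n in
  let rec go i =
    if i < n then begin
      match re.[i] with
      | '\\' when i + 1 < n ->
          (match re.[i + 1] with
           | '(' | ')' | '|' as c -> Buffer.add_char buf c
           | c -> Buffer.add_char buf '\\'; Buffer.add_char buf c);
          go (i + 2)
      | '(' | ')' | '|' | '{' | '}' as c ->
          Buffer.add_char buf '\\'; Buffer.add_char buf c; go (i + 1)
      | c -> Buffer.add_char buf c; go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf
```

With this, `string_match "foo" "xfoobar" 1` followed by
`matched_string "xfoobar"` yields "foo", and
`str_to_pcre "\\(a\\|b\\)+"` yields "(a|b)+".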

> I think it can be done, but the replies I got to this request were of
> the form "I don't have time to do this".

Ahem, well, as far as I am concerned, this is unfortunately the case
right now. I really need to get on with my actual project (a machine
learning system).

What about the many new heroes on this list? This would be a good exercise!

> Also: the PCRE interface is quite heavyweight, with a zillion options
> whose purpose is not always clear to me.  This can be a bit frightening
> and will need a lot of carefully worded documentation to explain that
> most of these options are useless 99% of the time :-)  This is not a
> criticism towards Markus' work, more like a criticism towards Perl's
> and PCRE's "creeping featuritism" syndrome.

I agree. I made it rather heavyweight so that hardly anybody could
argue that Perl or PCRE supports a feature they need that this library
lacks, thus easing the switch to OCaml. I was also practicing writing
C interfaces at the time, so I thought I'd implement all PCRE functions
for practice.

I would certainly not object to making the library more lightweight:
probably many functions could be removed without hesitation
(e.g. information on patterns). It may also be worthwhile to reconsider
the way labels and optional arguments are used, though the latter only
look evil in the interface and are extremely convenient to use once one
understands the scheme behind them, which is the same throughout the
library.
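
As an illustration of the trade-off only, here is a minimal sketch with
hypothetical names and signatures; it is not the actual interface of the
library. `find` does a plain substring search rather than regexp
matching. The point is just that every function takes the same optional
arguments (`?pos`, `?ignore_case`) and the same labelled `~pat`, so the
signatures look busy but calls stay short.

```ocaml
(* Hypothetical example: literal substring search standing in for a
   regexp function, with one uniform scheme of optional arguments. *)
let find ?(pos = 0) ?(ignore_case = false) ~(pat : string) (s : string)
    : int option =
  let norm x = if ignore_case then String.lowercase_ascii x else x in
  let pat = norm pat and s = norm s in
  let n = String.length pat and len = String.length s in
  let rec scan i =
    if i + n > len then None
    else if String.sub s i n = pat then Some i
    else scan (i + 1)
  in
  if pos < 0 then None else scan pos

(* A second function reuses the scheme by passing the options through. *)
let contains ?pos ?ignore_case ~pat s =
  find ?pos ?ignore_case ~pat s <> None
```

The common case needs no options at all, e.g. `find ~pat:"caml" "OCaml"`,
while `find ~ignore_case:true ~pat:"caml" "OCaml"` adds one only where
it is wanted.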

Markus Mottl

Markus Mottl                                   
Austrian Research Institute
for Artificial Intelligence        