[Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
Date: 2001-08-23 (16:07)
From: Neale Pickett <neale-caml@w...>
Subject: Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
Frank Atanassow writes:

> Ocaml does not purport to have no side-effects. It has plenty of
> side-effects.  You must be thinking of Haskell or Miranda.

That's probably half of my problem, then :-)

> I'm pretty sure there is no such optimization, but I'm not sure what
> you're talking about here. Anyway, if an optimization affected the
> behavior of a program, it would not be an optimization but rather a
> compiler bug.

Having slept on it, I think what I was experiencing comes from two
things: the Str library is apparently non-reentrant, and my approach to
using the regexp parts of Str ran afoul of that.  What I ran into was, I
think, a bug in either the Str library or its documentation.

Originally, I was trying to do something like this:

# let string_lines =
    let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
    let rec f = function
      | [] -> []
      | s :: rest -> if (Str.string_match sep s 0) then
          (Str.matched_group 1 s) :: (f rest)
        else
          f rest
    in
    f
  in
  string_lines ["  hello"; "  dromedaries"];;
Uncaught exception: Invalid_argument "String.sub".

(Apologies if this is inelegant, I'm just starting out.)

Alain Frisch points out:

> This is wrong; with the current OCaml implementation, the right
> operand of (::) is called first; so (Str.matched_group 1 s) is called
> after subsequent calls to Str.string_match, which is obviously
> incorrect.

I contest that this is obvious.  s is a different string each time f is
called, and so even though I do call Str.string_match multiple times,
it's with a different s.  The manual for the Str library says only that I
must pass in the same s as was given to string_match, which implies that
s is somehow keyed to its matches.  It sounds as though I shouldn't do
the following:

  Str.string_match sep s 0;
  Str.string_match sep s' 0;
  print_string (Str.matched_group 1 s);

If this is the case, why does Str.matched_group even bother requiring
the original string?
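
The situation above can be reproduced in isolation.  This is only a
sketch, and it assumes what the thread suggests: that Str keeps just the
positions of the most recent match in global state, and that
matched_group uses its string argument only to cut out the substring at
those positions.  The helper name group_after_second_match is made up
for illustration.

```ocaml
(* Requires the Str library to be linked (for example, the "str"
   package with ocamlfind). *)
let sep = Str.regexp "^[ \t\n]*\\(.+\\)"

(* Hypothetical helper: match s, then s', then ask for group 1 of s.
   Returns Some substring, or None if Str raises Invalid_argument
   because the recorded positions do not fit inside s. *)
let group_after_second_match s s' =
  ignore (Str.string_match sep s 0);
  ignore (Str.string_match sep s' 0);
  try Some (Str.matched_group 1 s)
  with Invalid_argument _ -> None
```

Under that assumption, when s' is longer than s the recorded end
position falls outside s and the call fails, which would explain the
exception.  When s' is shorter, the call succeeds but returns a slice of
s taken at the positions that belong to s'.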

I may be missing some crucial aspect of OCaml, and if so, I apologize
for this exercise in my own ignorance.  With my current understanding
of the language, though, it looks as though to use the regexp parts of
Str, I need to understand the underlying implementation of the library,
or at least know not to call string_match as above.  If the former, I
would consider this a bug; if the latter, it should just be added to the
documentation.  Either way, it's confusing.
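
For what it's worth, one way to avoid depending on the evaluation order
of (::) seems to be binding the matched group with let before making the
recursive call, so that Str.matched_group runs while the match state
still refers to s.  A minimal sketch, assuming the Str library is
linked; the name string_lines_fixed is made up for illustration.

```ocaml
(* Requires the Str library to be linked (for example, the "str"
   package with ocamlfind). *)
let string_lines_fixed lines =
  let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
  let rec f = function
    | [] -> []
    | s :: rest ->
      if Str.string_match sep s 0 then begin
        (* Extract the group now, before any later string_match call
           can overwrite the library's record of the last match. *)
        let g = Str.matched_group 1 s in
        g :: f rest
      end else
        f rest
  in
  f lines
```

The let forces the extraction to happen before f rest is evaluated, no
matter which operand of (::) the compiler chooses to evaluate first.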

> If I understand you correctly (but I don't think I do):

> # Str.split (Str.regexp "[ \t\n]+") "  abc def  ghi j";;
> - : string list = ["abc"; "def"; "ghi"; "j"]

This is, in fact, exactly what I was trying to do.  I wanted to code it
as a recursive function to show a friend the difference between
functional and procedural programming, got caught up in the exception,
and forgot what my original intent was.  Thank you!