Re: [Caml-list] Pattern matching and strings
Date: 2002-10-04 (09:07)
From: Pierre Weis <pierre.weis@i...>
Subject: Re: [Caml-list] Pattern matching and strings (and a mini-bug in Scanf)
> I meant what I wrote. The %s conversion stops reading at the 
> first whitespace character. However, ocaml does not like 
> the  "%[^]" which, in my opinion, is to be considered a 
> mini-bug. "%[^]" should be interpreted as "the set of all 
> characters except none", which is "the set of all 
> characters", which can also be expressed, more verbosely, as 
> "%[\000-\255]". By the same standards, "%[]" is rejected, 
> when it should be interpreted as "the set containing no 
> characters", or more verbosely "%[^\000-\255]"
> Alex

This is not a mini-bug; it is a carefully crafted feature and a
thoroughly considered design decision :(

"%[]" is not allowed because we need to allow the matching of the
closing bracket. Hence, if ']' immediately follows the opening bracket
(or the ^ character, in the case of a negated range), it is considered
a plain ']' (ASCII code 93) to be matched.

So "%[]]" means matching ']' (well, more precisely zero or more ']');
more generally, "%[]range]" means matching ']' or range; "%[^]range]"
means matching any character different from ']' and not belonging to
range.
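
The bracket rules above can be checked with a small sketch. It assumes a
current OCaml standard library Scanf module; the input strings and the
variable names are made up for illustration.

```ocaml
(* "%[]]": the ']' right after '[' is a set member, so this reads a
   (possibly empty) run of ']' characters; "%s" then reads the rest. *)
let run, rest1 = Scanf.sscanf "]]]abc" "%[]]%s" (fun a b -> (a, b))

(* "%[]a-c]": matches ']' or any character in the range a-c. *)
let mixed, rest2 = Scanf.sscanf "a]b]cxyz" "%[]a-c]%s" (fun a b -> (a, b))

(* "%[^]]": any character except ']'. The extra ']' after the set is a
   plain character that must match the ']' found in the input. *)
let before, after = Scanf.sscanf "abc]def" "%[^]]]%s" (fun a b -> (a, b))

let () =
  Printf.printf "%S %S\n%S %S\n%S %S\n" run rest1 mixed rest2 before after
```

Here run is "]]]", mixed is "a]b]c", and before is "abc", with the
remaining input ("abc", "xyz", "def") read by the trailing "%s".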

Admittedly, we could have made a different decision, such as a special
escape for ']' in a range, which would have made your interpretation
of "%[]" possible. However, I considered that matching an empty range
of characters (or, conversely, a full range of characters) was far
less useful and frequent than matching a ']'. Hence the more complex
expressions "%[^\000-\255]" and "%[\000-\255]" for these seldom-used
ranges.
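
As a rough illustration of the two verbose ranges, compared with the
"%s" conversion discussed in the quoted message (a sketch assuming a
current OCaml Scanf module; inputs and names are illustrative only):

```ocaml
(* "%s" stops at the first whitespace character. *)
let word = Scanf.sscanf "hello world" "%s" (fun s -> s)

(* "%[\000-\255]" accepts every character, whitespace included. *)
let everything = Scanf.sscanf "hello world" "%[\000-\255]" (fun s -> s)

(* "%[^\000-\255]" accepts no character at all, so it reads the empty
   string and leaves the input for the following "%s". *)
let nothing, untouched =
  Scanf.sscanf "hello" "%[^\000-\255]%s" (fun a b -> (a, b))

let () = Printf.printf "%S %S %S %S\n" word everything nothing untouched
```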

Finally, I also want to elaborate a bit on the last sentence of
the Scanf documentation because many people misunderstand it:

   Note: the [scanf] facility is not intended for heavy duty
   lexical analysis and parsing.

As you said ``I meant what I wrote''. I did not mean that [scanf] is
slow (it is not). I did not mean that it is not powerful (since it is
much more powerful than the corresponding C facility). I meant that it
is perfectly ok to write complex input reading functions (up to, say,
a polymorphic list scanner for instance), but [scanf] is not the right
tool to write a full-fledged Caml parser.

On the other hand, writing a polymorphic list scanner using lex and
yacc seems to me way more difficult than the 10 or 15 lines of code
you need to implement the polymorphic list scanner, if you use the
[Scanf] module facilities :)
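
For reference, one possible shape of such a scanner is sketched below.
This is not the code from the message (which gives none); the names
scan_list, scan_int and scan_string are hypothetical, and the sketch
assumes lists written in OCaml syntax, "[e1; e2; ...]", with a current
Scanf module.

```ocaml
(* Scan a list "[e1; e2; ...]" from the scanning buffer [ib].
   [scan_elem] is any function that reads one element from [ib]. *)
let scan_list scan_elem ib =
  let rec elems acc =
    let x = scan_elem ib in
    (* After an element, expect either ';' (more to come) or ']'. *)
    Scanf.bscanf ib " %c" (function
      | ';' -> elems (x :: acc)
      | ']' -> List.rev (x :: acc)
      | c -> failwith (Printf.sprintf "scan_list: unexpected %C" c))
  in
  (* "%0c" peeks at the next character without consuming it. *)
  Scanf.bscanf ib " [ %0c" (function
    | ']' -> Scanf.bscanf ib "]" []   (* empty list *)
    | _ -> elems [])

(* Element scanners for two element types. *)
let scan_int ib = Scanf.bscanf ib " %d" (fun n -> n)
let scan_string ib = Scanf.bscanf ib " %S" (fun s -> s)

let ints = scan_list scan_int (Scanf.Scanning.from_string "[1; 2; 3]")
let strs =
  scan_list scan_string (Scanf.Scanning.from_string "[\"a\"; \"b]\"]")
let nested =
  scan_list (scan_list scan_int)
    (Scanf.Scanning.from_string "[[1; 2]; []; [3]]")
```

Because scan_list takes the element scanner as an argument, the same
function handles int lists, string lists, and lists of lists.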

Maybe this note should be removed since, in the end, it just means
``Use the right tool for the job at hand'', which evidently goes
without saying when you are talking to people who have already chosen
Objective Caml to write their programs!

Pierre Weis

INRIA, Projet Cristal
