int_of_string bug
Date: 2007-03-30 (06:23)
From: skaller <skaller@u...>
Subject: Re: [Caml-list] int_of_string bug
On Fri, 2007-03-30 at 15:59 +1000, Erik de Castro Lopo wrote:
> skaller wrote:
> > On Thu, 2007-03-29 at 21:26 -0400, Yaron Minsky wrote:
> > > # int_of_string "1073741824";;
> > > - : int = -1073741824
> > > # int_of_string "1073741825";;
> > > Exception: Failure "int_of_string".
> That's the behaviour on 32-bit systems.
> > # int_of_string "1073741824";;
> > - : int = 1073741824
> > # int_of_string "1073741825";;
> > - : int = 1073741825
> But 64 bit systems get it right.

The point being .. the behaviour for large values is
platform dependent anyhow, so in the abstract
you can say the behaviour is undefined for large values,
where 'large' isn't specified.

If you want to get it RIGHT: if you have a user input string
possibly containing digits, and you want to convert it,
you must already write a parser to parse the input,
so you won't be using int_of_string anyhow.

If the input was generated (say by another Ocaml program),
then it will already be correct.

In the Felix compiler, after lexing 'string of digits'
I use the Big_int module to convert to an integer:
that behaviour is platform independent.

If I really want an int (say for indexing), and there's
a risk of the conversion overflowing, then there's also a
risk that the data is wrong even without overflowing and
will blow up anyway, eg .. I'm not going to be indexing
arrays with max_int elements .. :)

If I really want to check, I'll use an application specific
bound such as 16000, and check the big_int against that
before converting. Thus, all the operations are deterministic
and platform independent if you do things properly.
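
A minimal sketch of that kind of bounded check, under stated assumptions:
it does not use Big_int (which lives in the separate num library) but
accumulates the digits directly and tests against the bound before each
step, so the arithmetic never overflows on a 32-bit or a 64-bit int. The
name bounded_int_of_digits and its ~bound argument are hypothetical, made
up for illustration, and are not taken from Felix.

```ocaml
(* Hypothetical helper: convert a non-empty string of decimal digits to an
   int, returning None for anything that is not all digits or that exceeds
   the application-chosen [bound]. *)
let bounded_int_of_digits ~bound s =
  let n = String.length s in
  if n = 0 || bound < 0 then None
  else
    let rec go i acc =
      if i = n then Some acc
      else
        match s.[i] with
        | '0' .. '9' as c ->
          let d = Char.code c - Char.code '0' in
          (* acc * 10 + d <= bound  is equivalent to
             acc <= (bound - d) / 10  when d <= bound,
             and the second form cannot overflow. *)
          if d > bound || acc > (bound - d) / 10 then None
          else go (i + 1) (acc * 10 + d)
        | _ -> None
    in
    go 0 0
```

With a bound of 16000, bounded_int_of_digits ~bound:16000 "1073741825"
gives None on every platform, rather than a result that depends on the
word size.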

So the 'bug' in int_of_string is just an inconvenience.

IMHO there is a 'bug' in some Ocaml documentation, where the
abstract language is not clearly distinguished from the
implementation. Throwing exceptions on error should generally
NOT be considered a specified part of the language.

Undefined behaviour is sometimes the right specification because it
allows superior optimisation and prevents programmers
relying on exceptions. This doesn't prevent the implementation
throwing them, it just means catching them locally in your
code is a bug (because you can't be sure they will be thrown).

Bounds violations are a good example of this, and indeed
since Ocaml provides the -unsafe switch to disable bounds checks
you'd better NOT rely on catching them. The same applies
to match failures -- use a wildcard if you want to catch
unmatched cases (otherwise be willing to sketch a proof
to your boss that there can't be a violation) :)
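
A short sketch of what that looks like in practice, assuming nothing
beyond the standard library: test the index yourself instead of catching
Invalid_argument, and close a match with a wildcard instead of relying on
Match_failure. The names get_opt and digit_name are hypothetical, chosen
only for this example.

```ocaml
(* Hypothetical helper: explicit range test, so the result does not depend
   on whether the compiler was asked to emit bounds checks. *)
let get_opt a i =
  if i >= 0 && i < Array.length a then Some a.(i) else None

(* Hypothetical example: the wildcard arm makes the match total, so no
   Match_failure can be raised here. *)
let digit_name n =
  match n with
  | 0 -> "zero"
  | 1 -> "one"
  | _ -> "other"
```

Both functions behave the same with or without -unsafe, because neither
depends on the runtime raising an exception.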

John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net