[Caml-list] string_of_float less accurate than sprintf "%f" ?
From: Xavier Leroy <xavier.leroy@i...>
Subject: Re: [Caml-list] string_of_float less accurate than sprintf "%f" ?
> while doing some time measurements with Unix.gettimeofday() I
> discovered a problem with string_of_float:
> # string_of_float 123456789.123456789;;
> - : string = "123456789.123"
> OK, it may just be an inaccuracy of the float type. However,
> sprintf returns a different result:
> # sprintf "%f" 123456789.123456789;;
> - : string = "123456789.1234567"

This is not an inaccuracy of the underlying float value, which is
indeed double-precision, but a consequence of the definition of
string_of_float, which is essentially sprintf "%.12g".  So, you get a
different rounding than sprintf "%f".  Here, %f is more precise, but
not always (try with 1e-12 for instance).
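
To make the comparison concrete, here is a small sketch of the two formats side by side. The helper names g12 and fixed are made up for this illustration; the claim that string_of_float behaves like "%.12g" comes from the paragraph above.

```ocaml
(* Hypothetical helpers, named only for this illustration. *)
let g12 (x : float) : string = Printf.sprintf "%.12g" x
let fixed (x : float) : string = Printf.sprintf "%f" x

let () =
  let x = 123456789.123456789 in
  (* string_of_float gives the same text as "%.12g" on this value *)
  Printf.printf "string_of_float: %s\n" (string_of_float x);
  Printf.printf "%%.12g:           %s\n" (g12 x);
  (* With a tiny value the situation is reversed: "%f" has no
     significant digits left, while "%.12g" switches to exponent form. *)
  let y = 1e-12 in
  Printf.printf "%%.12g on 1e-12:  %s\n" (g12 y);
  Printf.printf "%%f on 1e-12:     %s\n" (fixed y)
```

On 1e-12 the "%f" result is all zeros, which is the case where "%.12g" is the more precise of the two.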

> My application needs to be fast (that's why I am using OCaml :-) and
> sprintf is of course slower than string_of_float.

Not by much.  I recommend that you use sprintf with the floating-point
format appropriate for your application, e.g. "%.3f" for printing
times with a millisecond precision.  
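
As a minimal sketch of that recommendation, a time in seconds can be printed with millisecond precision as follows. The name format_seconds is hypothetical, and the value passed in is a plain float standing in for whatever Unix.gettimeofday() would return.

```ocaml
(* Hypothetical helper: render a time in seconds with three digits
   after the decimal point, i.e. millisecond precision. *)
let format_seconds (t : float) : string = Printf.sprintf "%.3f" t

let () =
  (* A literal stands in for a measured time here. *)
  print_endline (format_seconds 1.5);
  print_endline (format_seconds 123456789.123456789)
```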

Now, you might wonder why string_of_float doesn't "do the right thing"
and print its float argument with as many digits as necessary to ensure
e.g. float_of_string(string_of_float f) = f.  The main reason is
pragmatic: OCaml's float-to-string conversions are built on top of the
sprintf() function from the C runtime library, and the latter doesn't
provide a "print a float with enough digits to represent it exactly"
format.  David Chase mentioned some third-party source that does this
(thanks for the pointer); I wish the C library would provide something
like this.
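
The round-trip property can be checked directly. The sketch below uses "%.17g", which prints 17 significant digits; that is enough to recover any double-precision value, assuming the C library's printf and strtod round correctly, but it is not the shortest such string (0.1 comes out with 17 digits). The names round_trips and to_string_17 are made up for this example.

```ocaml
(* Hypothetical helper: does converting to a string and back
   give exactly the same float? *)
let round_trips (to_string : float -> string) (f : float) : bool =
  float_of_string (to_string f) = f

(* 17 significant digits: sufficient for doubles, but often more
   digits than strictly necessary. *)
let to_string_17 (f : float) : string = Printf.sprintf "%.17g" f

let () =
  let x = 123456789.123456789 in
  Printf.printf "string_of_float round-trips: %b\n"
    (round_trips string_of_float x);
  Printf.printf "%%.17g round-trips:           %b\n"
    (round_trips to_string_17 x)
```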

There might be a more philosophical issue behind this.  For a
numerical analyst, or physicist, or experimental scientist in general,
floating-point numbers are just approximate measurements of
experimental quantities, or results of computations on these approximate
measurements.  Hence, there is no such thing as "the" string
representation of a floating-point value: not all digits are meaningful,
and how many significant digits to print depends on the physical
problem being modeled and solved.  With this viewpoint,
string_of_float doesn't make any sense, and you should always use
sprintf with the float format appropriate for your problem.

Then, there is the computer engineering viewpoint on floating-point
numbers, which are collections of (say) 64 bits with well-defined (if
a bit convoluted) operations on them such as addition, multiplication,
etc, as specified in IEEE 754.  From this viewpoint, it makes sense to
have conversions to and from strings that are exact, i.e. without
information loss.  (It is feasible, but much harder than it sounds;
there were two full papers at PLDI 94 (?) on this problem.)
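
One way to see the "collection of 64 bits" view in code is to convert the bit pattern itself rather than a decimal expansion. This is only an illustration of lossless conversion, not the shortest-decimal algorithm those papers address; the function names are hypothetical, and the sketch relies on Int64.bits_of_float and Int64.float_of_bits from the standard library.

```ocaml
(* Hypothetical helpers: an exact but not human-friendly string form,
   namely the 64-bit IEEE 754 pattern written as 16 hex digits. *)
let bits_string_of_float (f : float) : string =
  Printf.sprintf "%016Lx" (Int64.bits_of_float f)

let float_of_bits_string (s : string) : float =
  (* The "0x" prefix lets Int64.of_string accept the full 64-bit range. *)
  Int64.float_of_bits (Int64.of_string ("0x" ^ s))

let () =
  let x = 123456789.123456789 in
  let s = bits_string_of_float x in
  Printf.printf "%s -> same float: %b\n" s (float_of_bits_string s = x)
```

The result is exact by construction, but unreadable, which is why the decimal version of the problem is the interesting one.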

I'm not taking sides here, just noticing that Java takes the computer
engineering viewpoint and C (and Caml, by inheritance of
implementation :-) takes the physicist's viewpoint...

- Xavier Leroy