New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Big_int.float_of_big_int is slightly unportable (was: Int64.to_float is slightly unportable) #5021
Comments
Comment author: @xavierleroy Hi Pascal, As far as I know, all x86-64 code generators (gcc, ocamlopt, etc) use the "cvtsi2sdq" instruction to convert from 64-bit int to 64-bit float, and this instruction rounds according to the current rounding mode. So, this behavior is not new with Snow Leopard; the same code is generated e.g. in MacOS 10.4 or in Linux. I haven't yet checked what other processors do. The workaround you propose is clever (as usual) but I'm concerned about two things: 1- It's slower than the default long->double conversion. Probably not a big deal for Int64.to_float and Nativeint.to_float, but a bit of a concern for floatofint. 2- Cross-platform compatibility is good, but compatibility between ocamlopt-generated code and C compiled code on the same platform is also desirable. If the C compiler chooses to round towards zero, your code sequence behaves differently. What do you think? |
Comment author: pascal_cuoq I am really sorry, I think I got sidetracked when I was investigating the portability issue I was having. I have not been able to produce any problem with Int64.to_float, which rounds according to rounding mode everywhere I tried. It is Big_int.float_of_big_int I meant to complain about. Here is this time a bona fide bug report: conv.ml: external round_up: unit -> unit = "round_up" ;; let b = 2000000001L ;; let b = Big_int.big_int_of_int64 (Int64.mul b b) ;; round_up () ;; Format.printf "%Ld@." (Int64.bits_of_float (Big_int.float_of_big_int b)) ;; stub.c: #include <fenv.h> #include <caml/mlvalues.h> value round_up(value dummy) Commandline: ocamlopt nums.cmxa stub.o conv.ml Result on Snow Leopard: Result on Leopard on an Intel mac (and pretty much everywhere else where people run Frama-C's regression tests): Note that Mac OS X started to use the FPU rounding mode for displaying floating-point numbers in Snow Leopard, unlike every other platform. This other change may be part of the same wave. According to Stephen Canon from stackoverflow.com, Snow Leopard's behavior is correct and every other platform is wrong to systematically use round-to-nearest for displaying floats. |
Comment author: @xavierleroy If you look at the definition of float_of_big_int, you'll see that it is really a hack: let float_of_big_int bi = So, what you observe is the behavior of float_of_string, nee strtod(). And, no, I'm not surprised that the last bit of the mantissa differs from platform to platform; I wouldn't even be surprised if several low bits were wrong... If you're using float_of_big_int in a context where every bit matters, it's time to reimplement it better, perhaps along the lines you mentioned. (Extract the top two 32-bit ints from the bigint, then convert to float, then combine and scale by the appropriate power of 2.) Sounds like a job for a floating-point expert :-) Are you interested in giving it a try? |
Comment author: pascal_cuoq You are not giving the current implementation of float_of_big_int enough credit. The only two behaviors we have seen so far are "using the current rounding mode", and "using nearest-even", which is better than what you get if you adapt my suggestion (which was originally for int64s only). First, a bit of context: I had solved my problem by setting the rounding mode to nearest-even before calling float_of_big_int. But that's not good enough for going into OCaml, because OCaml currently does not depend on the C99-only, sometimes missing, fenv.h functions, because a programmer who sets the rounding mode does not expect float_of_big_int to change it, because setting the rounding mode can be a bit slow, ... Also for context, I am only interested in converting to big_ints those floats that would fit into an int64, so I still had plenty of circumvention possibilities, I realize now that I understand what was happening. Now, let us assume someone wants to implement a better version, and can choose a (reasonable) specification as long as he documents it. The better version is not allowed to get or set the current rounding mode, but it can specify that it does not take it into account and always rounds to nearest or truncates.
That still doesn't mean it can't be done, and I will give it a try, but the more I think about it, the more the current implementation seems elegant. |
Comment author: pascal_cuoq Xavier said: "Sounds like a job for a floating-point expert" Well, let me put it this way: if you give yourself the objective of producing a conversion function from Ocaml nats to IEEE 754 doubles, IEEE 754 doubles are going to be the least of your problems. In the reasoning above, the wisest path was to backtrack to the point where we were using floating-point addition, and to decide not to use it. From there it is simply a matter of specifying that the function always rounds towards zero and the rest is simple. The function below seems to do the job for nats at least for 64-bit OCaml. It tried to write it so it would work in 32-bit. I do not know if OCaml supports architectures where doubles and ints have a different endianness, but this function surely won't work there. Please, please, please, if you decide to use it for a float_of_big_int lookalike, give it a name other than "float_of_big_int" and preserve the old function for a while, otherwise it will be impossible to write code that is portable across 3.11 and 3.12. I vaguely inferred the documentation of Nat functions from the type and random google results. If I may, keeping around the non-nativeint digit access functions when the digits are 64 (or presumably 32) bit is particularly vicious. open Nat;; let float_of_nat_truncate n = |
Comment author: pascal_cuoq The function above does not handle overflows for nats of more than ~ 308 decimal digits. |
Comment author: @xavierleroy If you're (mostly) happy with the current implementation of float_of_big_int, we can go on with it for a while, it's not urgent to reimplement. I still think a better implementation is possible (better = more predictable w.r.t. rounding + faster), but at this point it's just a feature wish. GMP takes the same approach as the implementation you proposed: synthesize the IEEE float directly, and always round toward zero. But such rounding is inconsistent with other int-to-float conversions (that's the starting point of this PR, isn't it?).
The latter issue is easily avoided by left-shifting the big int so that the most significant bit is 1, then converting the top 64 bits to a float, then compensating for the shift via exponent scaling. The former issue (with nearest-even) is more troublesome...
Left-shifting ensures that we would combine at most 2 float values (if we convert the top 64 bits of the bigint as two 32-bit conversions combined). |
Comment author: @xavierleroy Fixed 2015-07-24 by commit 16247: complete reimplementation of float_of_big_int that does not go through strings. |
Original bug ID: 5021
Reporter: Pascal Cuoq
Status: closed (set by @xavierleroy on 2017-03-07T17:36:32Z)
Resolution: fixed
Priority: normal
Severity: feature
Version: 3.11.2
Fixed in version: 4.03.0
Category: otherlibs
Tags: patch
Related to: #6896
Monitored by: @ygrek "Julien Signoles" @dbuenzli
Bug description
Int64.to_float is implemented in byterun/ints.c as:
CAMLprim value caml_int64_to_float(value v)
{
int64 i = Int64_val(v);
return caml_copy_double(I64_to_double(i));
}
where I64_to_double is defined by
#define I64_to_double(x) ((double)(x))
This is implementation-defined when the int64 cannot be represented exactly as a double.
Unfortunately, Mac OS X Snow Leopard has made the choice of using the FPU rounding mode for the choice of the resulting double, whereas most other platforms seem always to use the nearest-even double regardless of the FPU rounding mode. This means OCaml programs behave differently on Snow Leopard than elsewhere..
I think that one way to obtain the same behavior everywhere is to mask the int64 with 0x00000000FFFFFFFF and with 0xFFFFFFFF00000000, convert the two results to doubles separately (the result of each conversion is then specified because each conversion is exact), and add the two resulting doubles. This gives FPU-rounding everywhere, at a slight cost for all platforms, including the one that didn't need it. I can't think of a better way at the moment.
The same applies to native ints when those are 63-bit.
The text was updated successfully, but these errors were encountered: