Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Special floating-point values aren't converted to strings correctly under Windows #4688

Closed
vicuna opened this issue Jan 8, 2009 · 22 comments
Assignees
Labels

Comments

@vicuna
Copy link

vicuna commented Jan 8, 2009

Original bug ID: 4688
Reporter: @dra27
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2017-09-24T15:32:15Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 3.11.0+beta
Category: ~DO NOT USE (was: OCaml general)
Related to: #5443 #5739 #7069 #7186 #7218
Monitored by: @gasche @protz @hcarty

Bug description

Because of a lack of ANSI C99 (?) support in the MSVC runtime, using Printf.printf "%f" renders incorrect strings for the floating point special values neg_infinity, infinity and nan. Although this is technically speaking a bug in the MS libraries rather than OCaml, it would be helpful if the Windows Port of OCaml worked around this limitation.

The top level displays the values correctly because oprint.ml doesn't go via printf (possibly because this has been noticed before?).

Additional information

Both on Windows Vista:

a) MinGW Port

nan;;

  • : float = nan

Printf.printf "%f\n" nan;;

1.#QNAN0

  • : unit = ()

b) MSVC64 Port

nan;;

  • : float = nan

Printf.printf "%f" nan;;

1.#SNAN0

  • : unit = ()

File attachments

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @gasche

I've just been hit by this issue.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @gasche

There also is a problem with precision modifiers: it has been reported that Printf.sprintf "%.*f" 1 nan (or infinity or neg_infinity) returns the empty string. On my GNU/Linux machine it returns the same sane thing as "%f", and I think that is the expected behavior.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: thelema

Batteries has failed some tests under windows related to this, there's definitely some interest in getting this issue resolved.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @protz

http://www.mingw.org/wiki/C99 contains some insights on the C99 issue for mingw-compiled programs under windows.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @protz

After investigation, it turns out that this is the behavior of the underlying libc printf function that you're observing. It turns out that the MSVCRT DLL (Microsoft's implementation of the C library) prints floats that way. Both mingw32 and MSVC generate programs linke against MSVCRT, so fixing this would require writing a compatibility layer in the OCaml runtime. I don't think it's reasonable to expect that from OCaml.

Thanks,

jonathan

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @dra27

With respect, it is entirely reasonable to expect this of OCaml - the limitations of an older-standard C runtime should not get in the way of OCaml's specified behaviour.

More importantly, this problem also affects Pervasives.string_of_float as it's the caml_format_float primitive (byterun/floats.c) which is broken. float_of_string (string_of_float nan) should never raise an exception! This primitive is already a wrapper around printf so I fail to see the problem. I've just rapidly patched and tested byterun/floats.c in my 3.12.0 installation here as:

CAMLprim value caml_format_float(value fmt, value arg)
{
#define MAX_DIGITS 350
/* Max number of decimal digits in a "natural" (not artificially padded)
representation of a float. Can be quite big for %f format.
Max exponent for IEEE format is 308 decimal digits.
Rounded up for good measure. */
char format_buffer[MAX_DIGITS + 20];
int prec, i;
char * p;
char * dest;
value res;
double d = Double_val(arg);

if (isfinite(d)) {
prec = MAX_DIGITS;
for (p = String_val(fmt); *p != 0; p++) {
if (*p >= '0' && *p <= '9') {
i = atoi(p) + MAX_DIGITS;
if (i > prec) prec = i;
break;
}
}
for( ; *p != 0; p++) {
if (*p == '.') {
i = atoi(p+1) + MAX_DIGITS;
if (i > prec) prec = i;
break;
}
}
if (prec < sizeof(format_buffer)) {
dest = format_buffer;
} else {
dest = caml_stat_alloc(prec);
}
sprintf(dest, String_val(fmt), d);
res = caml_copy_string(dest);
if (dest != format_buffer) {
caml_stat_free(dest);
}
}
else
{
if (isnan(d))
{
res = caml_copy_string("nan");
}
else
{
if (d > 0)
{
res = caml_copy_string("inf");
}
else
{
res = caml_copy_string("-inf");
}
}
}
return res;
}

It would obviously be quite reasonable to wrap the test in the relevant #ifdefs so that only the Windows ports use it (I expect #ifdef WIN32 would be enough). The above patch fixes both Pervasives.string_of_float and also the %f format specifier. Pervasives.float_of_string works anyway because C89 does specify "-inf", "inf" and "nan" as the string representations for strtod.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @protz

You wrote:

It would obviously be quite reasonable to wrap the test in the relevant #ifdefs so that only the Windows ports use it (I expect #ifdef WIN32 would be enough). The above patch fixes both Pervasives.string_of_float and also the %f format specifier. Pervasives.float_of_string works anyway because C89 does specify "-inf", "inf" and "nan" as the string representations for strtod.

While I can see why you may feel that way, the problem is manifold:

  • subtle differences between windows and linux will probably remain, even if we are to integrate your patch,
  • if we go that route, we'll end up writing a float manipulation library, that worksaround every single glitch in both glibc and MSVCRT.

It is indeed desirable to have a way to manipulate floats correctly, and convert them back and forth "the right way", but again, I do not think this is the role of OCaml. Your modification is trivial to write in OCaml (I think), and by patching byterun.c, you're placing a maintenance burden on OCaml, which I don't think we want, sorry :).

Cheers,

jonathan

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @dra27

Ultimately, it's your call! I build from sources anyway so can patch whatever I need to. I find the logic a little strange in the light of things like the thousand lines of C used to workaround the Win32 select function in the Unix module (although I appreciate the maintenance burden that's causing - see at least PR #5327-5239).

Resolution would be clearer as "won't fix" rather than "not fixable", FWIW.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @alainfrisch

Honestly, I think this should be fixed. The fix is much simpler than other workarounds implemented for Windows, it should should not be difficult to maintain, and several people have been bitten by the bug.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @protz

Yes, I was thinking about it in the subway and while fixing Printf seems too big of a task, fixing string_of_float seems reasonable.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @alainfrisch

I'm glad you reconsider (part) of the decision.

Too big of a task? Doesn't the provided patch fix Printf?

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @gasche

When we found the bug in Batteries, I implemented a fix for our function using Printf internally (as string_of_float does) using classify_float:
https://gitorious.org/ocaml-batteries/gasche-batteries/commit/9a90298bac711b757897485108289e5ca1ebe421

Just my two cents. The Printf patch looks rather reasonable and I wouldn't mind the issue being fixed once and for all, but you never know with C code and I definitely understand the cautiousness.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @dra27

Yes - the patch does fix both Printf.printf and Pervasives.string_of_float (technically I've only tested it in bytecode, but the file is shared between both). Printf.printf is only related to the C function by its design (it's definitely not a wrapper around C's one) - the actual function is largely implemented in OCaml, it uses that C stub to format float values hence the exposed C89/C99 issue.

@vicuna
Copy link
Author

vicuna commented Dec 16, 2011

Comment author: @protz

Yep, sorry I misread the code. As I said, after thinking about it, your patch is definitely short, so no objections for me :).

@vicuna
Copy link
Author

vicuna commented Dec 18, 2011

Comment author: @xavierleroy

This is one of these small issues where it is hard to tell what the right action is.

C99 fully specifies the handling of infinities and NaNs in string->float and float->string conversions (strtod, scanf, printf, etc).

C89 doesn't say anything about infinities and NaNs, both in the language and in the library functions. There is, however, a reasonable expectation that whatever printf generates, strtod can read back.

From the PR, I understand that Microsoft's standard C library does not conform to C99 (12 years later...) and moreover that the "reasonable expectation" above doesn't hold. This is incredibly sloppy and low-quality. One more reason to use the Cygwin port of OCaml: at least it benefits from a quality, non-Microsoft C standard library.

Yes, we've worked hard to emulate some useful Unix features under Win32, but that's normal given that Win32 is a different OS that isn't expected to conform to any Unix standard. In contrast, Microsoft's C implementation is expected to conform to the C standards, however, and it is not normal for us to work around Microsoft's failures in this department.

Also, note that the proposed fix can also create new problems. Assume a pre-C99 C library that satisfies the "reasonable expectation" above. In the current situation, float_of_string can handle whatever is produced by string_of_float or by sprintf "%g". If we change the way sprintf prints infinities and NaNs, but leave float_of_string unchanged, we break this property. And, no, I'm not suggesting that float_of_string should also be changed to special-case infinities and NaNs, because I think this is going too far.

So, I'm in favor of a "won't fix". One change that we should consider, though, is to ensure that the %F printf format prints infinities and NaNs in Caml syntax, like the toplevel does (file oprint.ml).

@vicuna
Copy link
Author

vicuna commented Dec 18, 2011

Comment author: @protz

(sorry, edited the wrong bug)

@vicuna
Copy link
Author

vicuna commented Dec 19, 2011

Comment author: @dra27

Xavier - (I'm only commenting further because of the part on %F - if this is closed and done then a new PR may be needed for that instead)

[There's an over-arching reason not to use the Cygwin port, sadly - namely its non-permissive licence! It would be nice to be able to GPL everything, but...]

Fixing "%F" would involve coding this exact change for a specific branch of Printf but leaving the others broken which really does sound like cutting off your nose to spite your face.

But: "%F" does not produce a valid caml lexeme on any platform for the special values (I can't tell from the docs whether that's by design) - oprint.ml's handling correctly produces nan, neg_infinity and infinity whereas printf.ml's "%F" on Linux produces nan, inf and -inf the latter two of which are syntax errors, not valid ocaml lexemes. Granted, though, if special handling for float special values is required for "%F" on all platforms then I just undermined my own argument against fixing "%F" only ;)

The problem you propose with sane pre-C99 libraries is reasonbly solved by guarding the change with an appropriate #ifdef, which I had suggested (I don't know whether #ifdef WIN32 is too broad - it might cause Cygwin to include it).

Indeed, in the perfect build-world, the configure script (which would handle the Windows ports too) would contain a test of the C compiler's C89/C99 handling of printf and strtod resulting in a flag in s.h being set and the appropriate code being brought into play...

@vicuna
Copy link
Author

vicuna commented Dec 19, 2011

Comment author: @alainfrisch

After some internal discussion, it seems we could reach a middle ground. Could you produce a patch with the proper #ifdefs so that it applies only to the MSVC and Mingw ports? If there is absolutely no chance to break the Cygwin version (and of course, the other Unix ports), I think we can get the patch through.

@vicuna
Copy link
Author

vicuna commented Jan 13, 2012

Comment author: @dra27

Patch attached - it works by declaring a new variable in s.h for the native Windows ports. Cygwin generates s.h using configure (not s-nt.h) and I have tested that the alternate code is definitely not compiled when building for Cygwin.

@vicuna
Copy link
Author

vicuna commented Jan 13, 2012

Comment author: @alainfrisch

Did you try your patch with the MSVC port? I don't think isfinite and isnan exist for MSVC. One could use _finite and _isnan or call caml_classify_float (or duplicate its code).

@vicuna
Copy link
Author

vicuna commented Jan 13, 2012

Comment author: @dra27

Whoops - see the attached patch appended "v2". I've made sure that the Cygwin port doesn't use the code (by building with syntax errors inserted inside the #ifdefs) and compiled and tested both the MinGW and MSVC ports with this new patch.

@vicuna
Copy link
Author

vicuna commented Jan 16, 2012

Comment author: @alainfrisch

David, thanks for your work! I've integrated your latest patch.

@vicuna vicuna added the bug label Mar 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants