Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0004688OCamlOCaml generalpublic2009-01-08 13:592012-08-25 18:50
Reporterdra 
Assigned Tofrisch 
PrioritynormalSeverityminorReproducibilityalways
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version3.11.0+beta 
Target VersionFixed in Version 
Summary0004688: Special floating-point values aren't converted to strings correctly under Windows
DescriptionBecause of a lack of ANSI C99 (?) support in the MSVC runtime, using Printf.printf "%f" renders incorrect strings for the floating point special values neg_infinity, infinity and nan. Although this is technically speaking a bug in the MS libraries rather than OCaml, it would be helpful if the Windows Port of OCaml worked around this limitation.

The top level displays the values correctly because oprint.ml doesn't go via printf (possibly because this has been noticed before?).
Additional InformationBoth on Windows Vista:

a) MinGW Port
# nan;;
- : float = nan
# Printf.printf "%f\n" nan;;
1.#QNAN0
- : unit = ()

b) MSVC64 Port
# nan;;
- : float = nan
# Printf.printf "%f" nan;;
1.#SNAN0
- : unit = ()
TagsNo tags attached.
Attached Filespatch file icon ocaml-3.12.0-PR4688.patch [^] (1,357 bytes) 2012-01-13 15:51 [Show Content]
patch file icon ocaml-3.12.0-PR4688-v2.patch [^] (1,590 bytes) 2012-01-13 21:39 [Show Content]

- Relationships
related to 0005443closedlefessan Writing an integer to a file as a string requires allocation 
related to 0005739resolved Printf.printf "%F" (-.nan) returns -nan 

-  Notes
(0006317)
gasche (developer)
2011-12-16 05:30

I've just been hit by this issue.
(0006319)
gasche (developer)
2011-12-16 10:05

There also is a problem with precision modifiers: it has been reported that `Printf.sprintf "%.*f" 1 nan` (or `infinity` or `neg_infinity`) returns the empty string. On my GNU/Linux machine it returns the same sane thing as "%f", and I think that is the expected behavior.
(0006323)
thelema (reporter)
2011-12-16 16:10

Batteries has failed some tests under windows related to this, there's definitely some interest in getting this issue resolved.
(0006325)
protz (manager)
2011-12-16 16:24

http://www.mingw.org/wiki/C99 [^] contains some insights on the C99 issue for mingw-compiled programs under windows.
(0006327)
protz (manager)
2011-12-16 16:58

After investigation, it turns out that this is the behavior of the underlying libc printf function that you're observing. It turns out that the MSVCRT DLL (Microsoft's implementation of the C library) prints floats that way. Both mingw32 and MSVC generate programs linke against MSVCRT, so fixing this would require writing a compatibility layer in the OCaml runtime. I don't think it's reasonable to expect that from OCaml.

Thanks,

jonathan
(0006329)
dra (reporter)
2011-12-16 17:55

With respect, it is entirely reasonable to expect this of OCaml - the limitations of an older-standard C runtime should not get in the way of OCaml's specified behaviour.

More importantly, this problem also affects Pervasives.string_of_float as it's the caml_format_float primitive (byterun/floats.c) which is broken. float_of_string (string_of_float nan) should never raise an exception! This primitive is already a wrapper around printf so I fail to see the problem. I've just rapidly patched and tested byterun/floats.c in my 3.12.0 installation here as:

CAMLprim value caml_format_float(value fmt, value arg)
{
#define MAX_DIGITS 350
/* Max number of decimal digits in a "natural" (not artificially padded)
   representation of a float. Can be quite big for %f format.
   Max exponent for IEEE format is 308 decimal digits.
   Rounded up for good measure. */
  char format_buffer[MAX_DIGITS + 20];
  int prec, i;
  char * p;
  char * dest;
  value res;
  double d = Double_val(arg);

  if (isfinite(d)) {
    prec = MAX_DIGITS;
    for (p = String_val(fmt); *p != 0; p++) {
      if (*p >= '0' && *p <= '9') {
        i = atoi(p) + MAX_DIGITS;
        if (i > prec) prec = i;
        break;
      }
    }
    for( ; *p != 0; p++) {
      if (*p == '.') {
        i = atoi(p+1) + MAX_DIGITS;
        if (i > prec) prec = i;
        break;
      }
    }
    if (prec < sizeof(format_buffer)) {
      dest = format_buffer;
    } else {
      dest = caml_stat_alloc(prec);
    }
    sprintf(dest, String_val(fmt), d);
    res = caml_copy_string(dest);
    if (dest != format_buffer) {
      caml_stat_free(dest);
    }
  }
  else
  {
    if (isnan(d))
    {
      res = caml_copy_string("nan");
    }
    else
    {
      if (d > 0)
      {
        res = caml_copy_string("inf");
      }
      else
      {
        res = caml_copy_string("-inf");
      }
    }
  }
  return res;
}

It would obviously be quite reasonable to wrap the test in the relevant #ifdefs so that only the Windows ports use it (I expect #ifdef WIN32 would be enough). The above patch fixes both Pervasives.string_of_float and also the %f format specifier. Pervasives.float_of_string works anyway because C89 does specify "-inf", "inf" and "nan" as the string representations for strtod.
(0006331)
protz (manager)
2011-12-16 18:09

You wrote:
> It would obviously be quite reasonable to wrap the test in the relevant #ifdefs so that only the Windows ports use it (I expect #ifdef WIN32 would be enough). The above patch fixes both Pervasives.string_of_float and also the %f format specifier. Pervasives.float_of_string works anyway because C89 does specify "-inf", "inf" and "nan" as the string representations for strtod.

While I can see why you may feel that way, the problem is manifold:
- subtle differences between windows and linux will probably remain, even if we are to integrate your patch,
- if we go that route, we'll end up writing a float manipulation library, that worksaround every single glitch in both glibc and MSVCRT.

It is indeed desirable to have a way to manipulate floats correctly, and convert them back and forth "the right way", but again, I do not think this is the role of OCaml. Your modification is trivial to write in OCaml (I think), and by patching byterun.c, you're placing a maintenance burden on OCaml, which I don't think we want, sorry :).

Cheers,

jonathan
(0006333)
dra (reporter)
2011-12-16 18:28

Ultimately, it's your call! I build from sources anyway so can patch whatever I need to. I find the logic a little strange in the light of things like the thousand lines of C used to workaround the Win32 select function in the Unix module (although I appreciate the maintenance burden that's causing - see at least PR 0005327-5239).

Resolution would be clearer as "won't fix" rather than "not fixable", FWIW.
(0006336)
frisch (developer)
2011-12-16 19:43

Honestly, I think this should be fixed. The fix is much simpler than other workarounds implemented for Windows, it should should not be difficult to maintain, and several people have been bitten by the bug.
(0006337)
protz (manager)
2011-12-16 19:50

Yes, I was thinking about it in the subway and while fixing Printf seems too big of a task, fixing string_of_float seems reasonable.
(0006338)
frisch (developer)
2011-12-16 19:58

I'm glad you reconsider (part) of the decision.

Too big of a task? Doesn't the provided patch fix Printf?
(0006339)
gasche (developer)
2011-12-16 20:06

When we found the bug in Batteries, I implemented a fix for our function using Printf internally (as string_of_float does) using classify_float:
  https://gitorious.org/ocaml-batteries/gasche-batteries/commit/9a90298bac711b757897485108289e5ca1ebe421 [^]

Just my two cents. The Printf patch looks rather reasonable and I wouldn't mind the issue being fixed once and for all, but you never know with C code and I definitely understand the cautiousness.
(0006341)
dra (reporter)
2011-12-16 20:36

Yes - the patch does fix both Printf.printf and Pervasives.string_of_float (technically I've only tested it in bytecode, but the file is shared between both). Printf.printf is only related to the C function by its design (it's definitely not a wrapper around C's one) - the actual function is largely implemented in OCaml, it uses that C stub to format float values hence the exposed C89/C99 issue.
(0006342)
protz (manager)
2011-12-16 20:45

Yep, sorry I misread the code. As I said, after thinking about it, your patch is definitely short, so no objections for me :).
(0006368)
xleroy (administrator)
2011-12-18 09:42

This is one of these small issues where it is hard to tell what the right action is.

C99 fully specifies the handling of infinities and NaNs in string->float and float->string conversions (strtod, scanf, printf, etc).

C89 doesn't say anything about infinities and NaNs, both in the language and in the library functions. There is, however, a reasonable expectation that whatever printf generates, strtod can read back.

From the PR, I understand that Microsoft's standard C library does not conform to C99 (12 years later...) and moreover that the "reasonable expectation" above doesn't hold. This is incredibly sloppy and low-quality. One more reason to use the Cygwin port of OCaml: at least it benefits from a quality, non-Microsoft C standard library.

Yes, we've worked hard to emulate some useful Unix features under Win32, but that's normal given that Win32 is a different OS that isn't expected to conform to any Unix standard. In contrast, Microsoft's C implementation is expected to conform to the C standards, however, and it is not normal for us to work around Microsoft's failures in this department.

Also, note that the proposed fix can also create new problems. Assume a pre-C99 C library that satisfies the "reasonable expectation" above. In the current situation, float_of_string can handle whatever is produced by string_of_float or by sprintf "%g". If we change the way sprintf prints infinities and NaNs, but leave float_of_string unchanged, we break this property. And, no, I'm not suggesting that float_of_string should also be changed to special-case infinities and NaNs, because I think this is going too far.

So, I'm in favor of a "won't fix". One change that we should consider, though, is to ensure that the %F printf format prints infinities and NaNs in *Caml* syntax, like the toplevel does (file oprint.ml).
(0006372)
protz (manager)
2011-12-18 11:00

(sorry, edited the wrong bug)
(0006386)
dra (reporter)
2011-12-19 13:51

Xavier - (I'm only commenting further because of the part on %F - if this is closed and done then a new PR may be needed for that instead)

[There's an over-arching reason not to use the Cygwin port, sadly - namely its non-permissive licence! It would be nice to be able to GPL everything, but...]

Fixing "%F" would involve coding this exact change for a specific branch of Printf but leaving the others broken which really does sound like cutting off your nose to spite your face.

But: "%F" does not produce a valid caml lexeme on *any* platform for the special values (I can't tell from the docs whether that's by design) - oprint.ml's handling correctly produces nan, neg_infinity and infinity whereas printf.ml's "%F" on Linux produces nan, inf and -inf the latter two of which are syntax errors, not valid ocaml lexemes. Granted, though, if special handling for float special values is required for "%F" on *all* platforms then I just undermined my own argument against fixing "%F" only ;)

The problem you propose with sane pre-C99 libraries is reasonbly solved by guarding the change with an appropriate #ifdef, which I had suggested (I don't know whether #ifdef WIN32 is too broad - it might cause Cygwin to include it).

Indeed, in the perfect build-world, the configure script (which would handle the Windows ports too) would contain a test of the C compiler's C89/C99 handling of printf and strtod resulting in a flag in s.h being set and the appropriate code being brought into play...
(0006391)
frisch (developer)
2011-12-19 17:07

After some internal discussion, it seems we could reach a middle ground. Could you produce a patch with the proper #ifdefs so that it applies only to the MSVC and Mingw ports? If there is absolutely no chance to break the Cygwin version (and of course, the other Unix ports), I think we can get the patch through.
(0006669)
dra (reporter)
2012-01-13 15:52

Patch attached - it works by declaring a new variable in s.h for the native Windows ports. Cygwin generates s.h using configure (not s-nt.h) and I have tested that the alternate code is definitely not compiled when building for Cygwin.
(0006670)
frisch (developer)
2012-01-13 16:13

Did you try your patch with the MSVC port? I don't think isfinite and isnan exist for MSVC. One could use _finite and _isnan or call caml_classify_float (or duplicate its code).
(0006672)
dra (reporter)
2012-01-13 21:41

Whoops - see the attached patch appended "v2". I've made sure that the Cygwin port doesn't use the code (by building with syntax errors inserted inside the #ifdefs) and compiled and tested *both* the MinGW and MSVC ports with this new patch.
(0006688)
frisch (developer)
2012-01-16 11:26

David, thanks for your work! I've integrated your latest patch.

- Issue History
Date Modified Username Field Change
2009-01-08 13:59 dra New Issue
2009-04-27 17:00 doligez Status new => acknowledged
2011-12-16 05:30 gasche Note Added: 0006317
2011-12-16 10:05 gasche Note Added: 0006319
2011-12-16 16:10 thelema Note Added: 0006323
2011-12-16 16:24 protz Note Added: 0006325
2011-12-16 16:58 protz Note Added: 0006327
2011-12-16 16:58 protz Status acknowledged => resolved
2011-12-16 16:58 protz Resolution open => not fixable
2011-12-16 16:58 protz Assigned To => protz
2011-12-16 17:55 dra Note Added: 0006329
2011-12-16 18:09 protz Note Added: 0006331
2011-12-16 18:28 dra Note Added: 0006333
2011-12-16 19:43 frisch Note Added: 0006336
2011-12-16 19:50 protz Note Added: 0006337
2011-12-16 19:58 frisch Note Added: 0006338
2011-12-16 20:06 gasche Note Added: 0006339
2011-12-16 20:36 dra Note Added: 0006341
2011-12-16 20:45 protz Note Added: 0006342
2011-12-18 09:42 xleroy Note Added: 0006368
2011-12-18 10:59 protz Status resolved => acknowledged
2011-12-18 10:59 protz Resolution not fixable => open
2011-12-18 11:00 protz Status acknowledged => resolved
2011-12-18 11:00 protz Resolution open => won't fix
2011-12-18 11:00 protz Note Added: 0006372
2011-12-19 13:51 dra Note Added: 0006386
2011-12-19 17:07 frisch Note Added: 0006391
2011-12-19 17:08 frisch Status resolved => assigned
2011-12-19 17:08 frisch Resolution won't fix => open
2011-12-29 23:21 doligez Relationship added related to 0005443
2012-01-13 15:51 dra File Added: ocaml-3.12.0-PR4688.patch
2012-01-13 15:52 dra Note Added: 0006669
2012-01-13 16:13 frisch Note Added: 0006670
2012-01-13 21:39 dra File Added: ocaml-3.12.0-PR4688-v2.patch
2012-01-13 21:41 dra Note Added: 0006672
2012-01-16 10:54 frisch Assigned To protz => frisch
2012-01-16 11:26 frisch Note Added: 0006688
2012-01-16 11:26 frisch Status assigned => resolved
2012-01-16 11:26 frisch Resolution open => fixed
2012-08-25 18:50 gasche Relationship added related to 0005739


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker