Mantis Bug Tracker

ID: 0007892
Project: OCaml
Category: language features
View Status: public
Date Submitted: 2019-01-03 14:30
Last Update: 2019-01-07 16:21
Reporter: sfuric
Assigned To:
Priority: normal
Severity: minor
Reproducibility: N/A
Status: new
Resolution: open
Platform:
OS:
OS Version:
Product Version: 4.07.0
Target Version:
Fixed in Version:
Summary: 0007892: OCaml lacks a proper implementation of min and max for floats
Description: There is an old issue (0005781) concerning IEEE 754 conformance of min and max with float arguments. Many advanced floating-point users would expect the min and max functions to behave as follows:

min nan neg_infinity = min neg_infinity nan = neg_infinity

and

max nan infinity = max infinity nan = infinity

The rationale is: if partial evaluation of a two-argument float function yields a constant one-argument function for all floats but nans, then it should also return the same value when evaluated with nan. Indeed, nan is not only a value representing erroneous calculations; it also represents "absence of value". This explains the current (and correct, i.e., expected) behavior of ** in OCaml:

1.0 ** nan = 1.0

and

nan ** 0.0 = 1.0
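
The two identities above can be checked directly. The following is a minimal sketch, assuming a platform whose C library `pow` follows C99 Annex F (OCaml's `**` delegates to it); the helper `is_nan` and the value names are introduced here for illustration only.

```ocaml
(* Helper for this sketch only: a float is a NaN iff it differs from itself. *)
let is_nan (x : float) = x <> x

(* The two cases quoted in the description: the partial applications
   (fun y -> 1.0 ** y) and (fun x -> x ** 0.0) are constant on non-NaN inputs. *)
let pow_one_nan = 1.0 ** nan
let pow_nan_zero = nan ** 0.0

(* For contrast: (fun y -> 2.0 ** y) is not constant, so NaN propagates. *)
let pow_two_nan = 2.0 ** nan

let () =
  Printf.printf "1.0 ** nan is NaN: %b\n" (is_nan pow_one_nan);
  Printf.printf "nan ** 0.0 is NaN: %b\n" (is_nan pow_nan_zero);
  Printf.printf "2.0 ** nan is NaN: %b\n" (is_nan pow_two_nan)
```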

Having a separate Float module is an opportunity to provide correct implementations of min and max that do not interfere with polymorphic min and max provided by Pervasives.
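
For context, the polymorphic min is defined in the standard library as `if x <= y then x else y`, so its result with a NaN argument depends on argument order. The sketch below only illustrates that asymmetry; `is_nan` and the value names are hypothetical helpers, not part of any library.

```ocaml
(* Helper for this sketch only. *)
let is_nan (x : float) = x <> x

(* Any comparison involving NaN is false, so polymorphic min
   always falls through to its second argument. *)
let min_one_nan = min 1.0 nan            (* second argument: NaN *)
let min_nan_one = min nan 1.0            (* second argument: 1.0 *)

(* The identity requested in the description holds in one order only. *)
let min_nan_ninf = min nan neg_infinity  (* neg_infinity *)
let min_ninf_nan = min neg_infinity nan  (* NaN *)

let () =
  Printf.printf "min 1.0 nan is NaN: %b\n" (is_nan min_one_nan);
  Printf.printf "min nan 1.0 is NaN: %b\n" (is_nan min_nan_one);
  Printf.printf "min nan neg_infinity is NaN: %b\n" (is_nan min_nan_ninf);
  Printf.printf "min neg_infinity nan is NaN: %b\n" (is_nan min_ninf_nan)
```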
Tags: No tags attached.

-  Notes
(0019524)
dbuenzli (reporter)
2019-01-03 22:03

This has been added in https://github.com/ocaml/ocaml/pull/1794 and will be part of 4.08. See the {min,max}_num functions in that PR.

(0019525)
sfuric (reporter)
2019-01-04 12:01
edited on: 2019-01-04 12:11

I was not talking about the {min, max}_num functions but about the minm and maxm functions, as Kahan (the "father" of IEEE 754) calls them. He says:

"maxm(+Infinity, NaN) must be +Infinity, NOT NaN, to honor the rule that if a function f(x, y) has the property for some value X that f(X, y) is independent of y, be it finite or infinite, then that f(X, NaN) must be the same as f(X, y)."

Notice that [min_num 1.0 nan] returns [1.0] while [min 1.0 nan] returns [nan]. I incorrectly used "absence of value" above instead of "inability to deliver a meaningful result", sorry, my fault. I had a look at the current implementation of {min, max} in Float and I disagree with the treatment of NaNs. It does not follow Kahan's recommendation (however, I like the proper treatment of [+0] and [-0], even though it is not mandatory, since it honors [f (min x y)] <= [f (max x y)] when [f] is monotonically increasing).
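
The signed-zero treatment mentioned above can be observed as follows. This is a small sketch, assuming OCaml 4.08 or later (where Float.min and Float.max are available); the helpers `is_neg_zero`, `is_pos_zero` and `f` are hypothetical names used only here.

```ocaml
(* Helpers for this sketch: distinguish -0.0 from +0.0 through the sign
   of the infinity obtained by dividing 1.0 by the zero. *)
let is_neg_zero (x : float) = x = 0.0 && 1.0 /. x = neg_infinity
let is_pos_zero (x : float) = x = 0.0 && 1.0 /. x = infinity

(* Float.min picks -0.0 and Float.max picks +0.0, whatever the argument order. *)
let min_a = Float.min (-0.0) 0.0
let min_b = Float.min 0.0 (-0.0)
let max_a = Float.max (-0.0) 0.0
let max_b = Float.max 0.0 (-0.0)

(* A non-decreasing function that separates the two zeros:
   -1.0 for negative inputs (including -0.0), 1.0 otherwise. *)
let f (x : float) = copysign 1.0 x

let () =
  Printf.printf "f (min) <= f (max): %b\n" (f min_a <= f max_a);
  Printf.printf "min is -0.0 in both orders: %b\n"
    (is_neg_zero min_a && is_neg_zero min_b);
  Printf.printf "max is +0.0 in both orders: %b\n"
    (is_pos_zero max_a && is_pos_zero max_b)
```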

A last remark: many comments in Float say "[nan] is returned". This is a bit confusing IMO because it gives the wrong impression of a unique [nan] value, which goes against the idea of NaNs possibly carrying user-defined error codes (there are reserved bits for that purpose). IMO the documentation should follow the IEEE 754 recommendations: say that the same NaN is returned when exactly one argument is a NaN and returning a NaN is appropriate, and that either of the NaNs is returned when both arguments are NaNs (IEEE 754 does not say more; however, it is possible to return the greater error code in order to honor "commutativity" when appropriate).
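
To make the point about non-unique NaNs concrete, here is a sketch that builds two quiet NaNs with different payload bits. It assumes the usual binary64 layout and a platform where Int64.float_of_bits and Int64.bits_of_float move bits unchanged; the payload values 42 and 7 and all names are arbitrary choices for illustration.

```ocaml
(* Helper for this sketch only. *)
let is_nan (x : float) = x <> x

(* Exponent all ones, quiet bit (bit 51) set, payload in the low 51 bits. *)
let nan_a = Int64.float_of_bits 0x7FF8_0000_0000_002AL  (* payload 42 *)
let nan_b = Int64.float_of_bits 0x7FF8_0000_0000_0007L  (* payload 7 *)

(* Extract the low 51 payload bits of a float's representation. *)
let payload (x : float) =
  Int64.logand (Int64.bits_of_float x) 0x0007_FFFF_FFFF_FFFFL

let () =
  Printf.printf "both are NaN: %b\n" (is_nan nan_a && is_nan nan_b);
  Printf.printf "payloads: %Ld and %Ld\n" (payload nan_a) (payload nan_b)
```

Whether a given arithmetic operation preserves such a payload depends on the hardware and on the operation, so the sketch only inspects the values themselves.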

(0019526)
dbuenzli (reporter)
2019-01-04 13:33

I don't understand what your objections are here. Basically, depending on context, you want two kinds of min and max functions:

1. Those that treat NaN as a problem in your computation and propagate the NaNs. If that is the case you can use Float.{min,max}.

2. Those that treat NaN as absence of value ({min,max}Num in IEEE 754). If that is the case you can use Float.{min,max}_num.
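
The two behaviors listed above can be compared side by side. A minimal sketch, assuming OCaml 4.08 or later; `is_nan` and the value names are hypothetical helpers for this example.

```ocaml
(* Helper for this sketch only. *)
let is_nan (x : float) = x <> x

(* Kind 1: NaN signals a problem and is propagated. *)
let propagating = Float.min 1.0 nan

(* Kind 2: NaN is treated as a missing value and ignored. *)
let ignoring = Float.min_num 1.0 nan

(* When both arguments are NaN there is nothing else to return. *)
let both_missing = Float.min_num nan nan

let () =
  Printf.printf "Float.min 1.0 nan is NaN: %b\n" (is_nan propagating);
  Printf.printf "Float.min_num 1.0 nan = %g\n" ignoring;
  Printf.printf "Float.min_num nan nan is NaN: %b\n" (is_nan both_missing)
```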

What exactly is bothering you here: the names or the implementations?

Regarding your last remark, the dev team decided it was not worth bothering end users with NaN subtleties; see https://caml.inria.fr/mantis/view.php?id=4948

(0019527)
sfuric (reporter)
2019-01-04 15:58

Yes, depending on context I want two kinds of {min,max} functions.
But I don't want to choose between one which ALWAYS returns a NaN whenever one of its arguments is a NaN and one which NEVER returns a NaN whenever one of its arguments is not a NaN.
The "correct" (i.e. expected by advanced users) behavior, as explained in my first note (partial evaluation of a two-argument function) and in my last note (see quotation from Kahan) is:

- [min neg_infinity x] = [min x neg_infinity] = [neg_infinity] WHATEVER THE VALUE OF [x], otherwise the result is the one returned by the current implementation of [Float.min].
- [max infinity x] = [max x infinity] = [infinity] WHATEVER THE VALUE OF [x], otherwise the result is the one returned by the current implementation of [Float.max].
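
The semantics described in the two items above could be sketched as follows. This is only an illustration, assuming OCaml 4.08 or later for Float.min and Float.max; the names `min_kahan` and `max_kahan` are hypothetical and do not exist in the standard library.

```ocaml
(* Helper for this sketch only. *)
let is_nan (x : float) = x <> x

(* neg_infinity absorbs everything, including NaN;
   otherwise defer to the NaN-propagating Float.min. *)
let min_kahan (x : float) (y : float) =
  if x = neg_infinity || y = neg_infinity then neg_infinity
  else Float.min x y

(* infinity absorbs everything, including NaN;
   otherwise defer to the NaN-propagating Float.max. *)
let max_kahan (x : float) (y : float) =
  if x = infinity || y = infinity then infinity
  else Float.max x y

let () =
  Printf.printf "min_kahan nan neg_infinity = %g\n" (min_kahan nan neg_infinity);
  Printf.printf "max_kahan infinity nan = %g\n" (max_kahan infinity nan);
  Printf.printf "min_kahan 1.0 nan is NaN: %b\n" (is_nan (min_kahan 1.0 nan))
```

The comparisons with neg_infinity and infinity are false for NaN arguments, so a NaN only reaches the result through the fallback to Float.min or Float.max.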

This is clearly a highly questionable choice, at least from our functional programmers' point of view, but this is an expectation of many advanced users (see the behavior of ** in my first note).

Regarding the remark about "NaN subtleties", it is really a pity to choose the wrong option for the documentation, since the implementation is correct (i.e. it does what IEEE 754 suggests)...

(0019529)
dbuenzli (reporter)
2019-01-04 19:34

Okay so that's yet another possible definition and I can see the point of the definition (for reference the full Kahan quote seems to stem from here [1]).

However, you mention "this is an expectation from many advanced users", but as far as I remember from what we reviewed during GPR1794, no one (libraries, languages) seemed to have this particular "expected" definition.

It was either definition 1 or 2 (under the name nan{min,max}) that I mentioned. One should be careful about introducing yet another definition, since this could introduce subtle result discrepancies when porting numerical algorithms from other languages.

Do you have references to systems that actually implement the semantics you mention?

[1] https://github.com/JuliaLang/julia/issues/7866#issuecomment-53079589

(0019534)
sfuric (reporter)
2019-01-07 16:21

I cannot give you a reference to a system that implements these semantics. This is a typical problem with software implementations of IEEE 754: everybody has a good reason not to follow the spirit of the original designers as far as Infs and NaNs are concerned.
Also, if your argument is to "follow the masses" then the winner is, by far, the polymorphic {min, max} pair!
My suggestion is to follow the convention already adopted for ** (i.e. when partial evaluation yields a constant function, then NaNs are ignored), because it is common and because OCaml already adopted it (BTW this avoids having to remember which convention is followed by which function).

- Issue History
Date Modified Username Field Change
2019-01-03 14:30 sfuric New Issue
2019-01-03 22:03 dbuenzli Note Added: 0019524
2019-01-04 12:01 sfuric Note Added: 0019525
2019-01-04 12:11 sfuric Note Edited: 0019525
2019-01-04 13:33 dbuenzli Note Added: 0019526
2019-01-04 15:58 sfuric Note Added: 0019527
2019-01-04 18:44 dbuenzli Note Added: 0019528
2019-01-04 18:45 dbuenzli Note Deleted: 0019528
2019-01-04 19:34 dbuenzli Note Added: 0019529
2019-01-04 20:44 sfuric Note Added: 0019530
2019-01-07 11:38 sfuric Note Edited: 0019530
2019-01-07 16:20 sfuric Note Deleted: 0019530
2019-01-07 16:21 sfuric Note Added: 0019534

