Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comparison operations on abstract types and max/min are not specialized #7440

Closed
vicuna opened this issue Dec 23, 2016 · 4 comments
Closed

Comments

@vicuna
Copy link

vicuna commented Dec 23, 2016

Original bug ID: 7440
Reporter: markghayden
Status: acknowledged (set by @xavierleroy on 2017-01-14T15:23:38Z)
Resolution: open
Priority: normal
Severity: feature
Platform: AMD64
OS: MacOS
OS Version: 10.12.2
Version: 4.04.0
Target version: later
Category: middle end (typedtree to clambda)
Duplicate of: #7441 #7442
Related to: #5592
Monitored by: @gasche @ygrek

Bug description

Specialization of integer and floating point comparisons (presumably also char and bool) seems to be brittle.

See the examples below where in many cases the comparison is not specialized and so external C call is made to the runtime comparison routine, losing an easy optimization opportunity to replace C call with assembly.

Also, Pervasives.min/max seem to never be specialized, even when inlined (which they should probably always be). To get specialization, you appear to need to write own functions, with type-casts.

Actual testing was done with 4.05.0 trunk.

(*********************************************************************)

let float_zero = 0.0 ;;

let int_zero = 0 ;;

(* Is specialized
*)
let float_test0 v =
v >= float_zero
;;

(* Is specialized
*)
let int_test0 v =
v >= int_zero
;;

(*********************************************************************)

module AbsFloat : sig type t val zero : t end = struct
type t = float
let zero = 0.0
end

(* Not specialized
*)
let float_test1 v =
v >= AbsFloat.zero
;;

(*********************************************************************)

module AbsInt : sig type t val zero : t end = struct
type t = int
let zero = 0
end

(* Not specialized
*)
let int_test1 v =
v >= AbsInt.zero
;;

(*********************************************************************)

(* Add inline for good measure.
*)
let [@inline] min x y = if x <= y then x else y ;;
let [@inline] max x y = if x >= y then x else y ;;

(* Not specialized
*)
let float_max0 (v0:float) (v1:float) =
max v0 v1
;;

(* Is specialized
*)
let float_max1 (v0:float) (v1:float) =
if v0 >= v1 then v0 else v1
;;

(* Not specialized
*)
let int_max0 (v0:int) (v1:int) =
max v0 v1
;;

(* Is specialized
*)
let int_max1 (v0:int) (v1:int) =
if v0 >= v1 then v0 else v1
;;

(*********************************************************************)

Steps to reproduce

Compile attached ml file with various examples.

File attachments

@vicuna
Copy link
Author

vicuna commented Jan 14, 2017

Comment author: @xavierleroy

Your three reports (#7440, #7441, #7442) are manifestations of the same fact: the OCaml compiler performs type-based optimizations first, then erases types, then performs all the other optimizations. This is very unlikely to change in the near future, as it would require a total rewrite of the compiler.

@vicuna
Copy link
Author

vicuna commented Jan 14, 2017

Comment author: markghayden

Thanks for the extra information. I guess if the Ocaml compiler was not designed to use type information for optimization in later optimization stages, that is the way it is.

Since there is little prospect of change, could limitations like these be documented? For instance, some developers might think "Array.fold_left (+.) 0.0 a" is good programming style, without understanding the resulting program will be 400-1000% slower than a hand-written loop (in addition to flushing the minor heap).

It seems as though for the foreseeable future, best-practices for developing efficient software with Ocaml include:

  • Avoid use of abstract data types
  • Avoid use of polymorphic array iterators, including all iterators in the Array module.
  • Avoid use of polymorphic comparisons and similar functions such as min/max

Maybe there are some other suggestions the Ocaml team is aware of?

I think consideration should be given for addressing these at some point. In addition to the obvious direct effect on performance, these likely have follow-on effects when inlining is considered. For instance, when a call to caml_compare is specialized with an integer comparison or a polymorphic Array.get (which tests for type of array and makes possible allocation of boxed float) is replaced with instructions for simple array lookup, that makes the resulting code more attractive for inlining.

@vicuna
Copy link
Author

vicuna commented Jan 15, 2017

Comment author: dobenour

xleroy: Would it be possible to add a partial monomorphization pass, when the whole program is available?

By partial monomorphization I mean a pass that monomorphizes everything it can. It can't get everything (due to polymorphic recursion, among other things). But it can get a lot of it.

This is an area where GHC's strongly typed Core Wins Big, but it does so via hand-written rewrite rules.

In Rust, using iterators doesn't cost ANY performance – in fact, in helps performance by eliminating bounds checks that aren't needed. This is possible because of monomorphization.

@github-actions
Copy link

github-actions bot commented May 9, 2020

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant