Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[patch] Avoid boxing float/int32/int64 when doing direct call #5894

Closed
vicuna opened this issue Jan 17, 2013 · 10 comments
Closed

[patch] Avoid boxing float/int32/int64 when doing direct call #5894

vicuna opened this issue Jan 17, 2013 · 10 comments

Comments

@vicuna
Copy link

vicuna commented Jan 17, 2013

Original bug ID: 5894
Reporter: @chambart
Status: acknowledged (set by @damiendoligez on 2013-06-19T18:48:03Z)
Resolution: open
Priority: normal
Severity: feature
Target version: later
Category: back end (clambda to assembly)
Tags: patch
Related to: #5917
Monitored by: meyer tgazagna meurer @diml @jmeber @hcarty

Bug description

Functions taking floating points parameters always receive them boxed.
When a direct call is done, it is possible to specialise the call to
pass those parameters in registers. This also applies to boxed integers.
The return value can also be unboxed.

This allow avoiding all allocations in a function like

let (+) = Nativeint.add
let (-) = Nativeint.sub
let rec fib n =
if n < 2n then 0n + 1n else fib(n-1n) + fib(n-2n)

Additional information

This patch take care of not unboxing parameters when it is possible that this
could increase the number of allocation.

For instance a function like this

let bad (x:float) =
ignore [x] (* force boxing of x *)

bad will not be specialised because its parameter is used boxed inside its body.

In a case like:

let rec f x y =
let _ = x +. 1. in
let _ = y +. 1. in
h 1. x

and g a b =
let _ = a+.1. in
let _ = b+.1. in
f b 1.

and h c d =
let _ = c +. 1. in
let _ = d +. 1. in
let _ = g 1. c in
let _ = g 1. d in
bad d;

only the y parameter of f and a of g will be unboxed.

The return value will be unboxed if every possible returned value is unboxed.

For instance

x +. y is considered unboxed

but

if b
then x +. y
else List.hd l

is considered boxed because List.hd return an already boxed value.

Current problems:

  • Constants are considered boxed because they are already boxed at
    compile time, but this may not be the right default choice. It can still be
    circumvented by things like 0n + 1n in the fibonacci example.

  • Specialised functions code is duplicated:

    • It is possible to know that some functions will never be used in direct call:
      those can be compiled only one time
    • specialisation could be forbidden over a certain size
    • the generic version of function specialised only for parameters (not return
      value) could be compiled by adding a stub before the specialised version.

File attachments

@vicuna
Copy link
Author

vicuna commented Jan 17, 2013

Comment author: @gasche

Have you tested the efficiency of this optimization on some of the float-heavy benchmark code that started flowing in (from Alain for example)?

Specialised functions code is duplicated

It should be possible to have the boxed version just unbox, and then call the unboxed version; that would result in little additional compiled code. But this is only efficient if the inliner reliably removes this indirection layer (which amounts to unboxing at call-site).

Can the inliner be relied upon for this? How much does this degrade the performances in the case where no inlining can happen anyway (no .cmx, functor application...)?

(A way to work around the "not sure if boxing will be needed" problem is to specialize to a function that takes two arguments, the boxed and the unboxed data. This is better than the boxing version because you don't have to unbox at definition site, could actually not have much overhead if the unboxed data is passed efficiently, and guarantees that no additional allocation is necessary -- it's essentially a specialization of the technique of "On the runtime complexity of type-directed unboxing", Garrigue and Minamide, 1998. I don't expect it to work in practice, but you never know.)

@vicuna
Copy link
Author

vicuna commented Jan 18, 2013

Comment author: @alainfrisch

This is great news. I fully support this project!

will not be specialised because its parameter is used boxed inside its body.

I'm not sure this is a good criterion, although it is in line with the current intra-function behavior (don't unbox if there is a potential use site in boxed form). It could very well be the case that the parameter might be used boxed inside the body, but only in rare cases (e.g. to print an error message if some invariant is broken). If this disable the optimization, it means the caller might need to box the parameter.

Have you tried to combine your patch with the strategy I propose to deal with boxing (#5204), i.e. box "lazily" (on demand + memoization)?

@vicuna
Copy link
Author

vicuna commented Jan 21, 2013

Comment author: @chambart

On Alain's bench on my computer:
without patch:
time: 21.0s
minor_words: 2243147047
promoted_words: 450541

with patch (and a few forgoten cases present in my git branch):
time: 20.1s
minor_words: 1007129106
promoted_words: 238960

Notice that more than half the running time is taken in __ieee754_exp

The majority of other float heavy benchmarks I encountered, where written with
a big while loop such that nothing was boxed.

I will update the patch with a few more things: function parameters
where not considered as unboxable float so a function like

let f x = let a = if Random.bool () then x else x in a +.a

was compiled allocating x and loading from it.

This is fixed in my git branch:
https://github.com/chambart/ocaml/tree/float_args

testing it with opam:

opam remote add comp https://github.com/chambart/opam-compilers-repository.git
opam switch install trunk+floatarg

Gabriel:
No the inliner cannot (yet) generate direct calls so a code like:

let apply f x = f x
let g x = apply (+.) x

will generate a generic call to (+.) I have another patch in preparation
for that kind of things.

The main interest of avoiding boxing is to avoid some allocation, passing both
parameters would loose that. The cost of loading from the boxed value is not
that much and you would still have to pay for if at each function call (reloading
spilled registers).

Alain:
I didn't merge both patch yet, I think that I will need to change the
stategy a bit. I am not completely sure of which cases will not be worth
unboxing, maybe when we can show that every possible path in the function
use the value boxed.

Off topic: How do I remove an attached file ? Is the bug reporter
supposed to have the rights to do that ?

@vicuna
Copy link
Author

vicuna commented Dec 4, 2015

Comment author: @alainfrisch

Will the many changes since Jan. 2013, and the ones coming with flambda, this is going to take some work bringing this up to date. But it is certainly something worth pushing forward.

@vicuna
Copy link
Author

vicuna commented Sep 6, 2017

Comment author: @alainfrisch

I've started to work on a similar approach: #1322

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label Apr 26, 2021
@gasche
Copy link
Member

gasche commented Apr 26, 2021

So here is my understanding of the history of this work:

My conclusions from this history are as follows:

  • @chambart is surprisingly bad at getting his work merged upstream
  • @alainfrisch is surprisingly persistent to get his work merged upstream
  • we still don't have any unboxing of parameters to direct function calls (only for static/local functions and in general static raises), as proposed by @chambart in this pre-github issue and again by @alainfrisch in Unboxed calling convention #1322.

I wonder if @alainfrisch is still planning to work on unboxing of direct call parameters? The criticisms in #1322 were varied:

  • This is less useful in presence of inlining (and now static function optimization). (I don't find this argument fully convincing.)
  • The optimization duplicate function bodies, maybe it should be opt-in to avoid bad consequences.
  • @lpw25 criticized ( Unboxed calling convention #1322 (comment) ) some of the recent changes to let unboxing (not Keep more type information in Lambda #2156, which came only later; possibly Better unboxing strategy #336 ? ) which caused pessimizations for certain programs (as Xavier was warning about in the discussion), and I don't know if the issue has been addressed and is now resolved.

@alainfrisch
Copy link
Contributor

@gasche : you're (unsurprisingly) good at reconstructing and summarizing the history, thanks!

No, I don't have immediate plans to resume working on this topic. Mostly by lack of time; I still believe that supporting numerical code without forcing boxing on function boundaries is important. Btw, in the same vein, it could be good to support specialized calling conventions e.g. for functions returning multiple values (unallocated tuples). So perhaps a more general design would be needed.

Note that #1322 was already adapted to use an opt-in strategy (through attributes); also, it would also be possible to avoid duplication of bodies by having the generic version be implemented as a wrapper around the specialized one (except that it could add extra boxing).

@github-actions github-actions bot removed the Stale label Apr 28, 2021
@github-actions
Copy link

github-actions bot commented May 2, 2022

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label May 2, 2022
@github-actions github-actions bot closed this as completed Aug 3, 2022
@bluddy
Copy link
Contributor

bluddy commented Aug 11, 2022

Coming back to this, I'd really love to see this work continue. I've come to realize that relying on flambda isn't enough - we need the normative performance to be good. Having bad floating-point performance by default is not a good place to be for a general purpose language.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants