Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

intermediate float boxing with inlined functions that contain a branch #6512

Closed
vicuna opened this issue Aug 11, 2014 · 6 comments
Closed

intermediate float boxing with inlined functions that contain a branch #6512

vicuna opened this issue Aug 11, 2014 · 6 comments
Assignees
Milestone

Comments

@vicuna
Copy link

vicuna commented Aug 11, 2014

Original bug ID: 6512
Reporter: drichman
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2017-02-16T14:16:20Z)
Resolution: fixed
Priority: normal
Severity: feature
Version: 4.02.0+beta1 / +rc1
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: middle end (typedtree to clambda)
Monitored by: @diml @chambart @alainfrisch

Bug description

I've found a few examples where a float, loaded from a record that is using the only-contains-floats optimisation, is boxed unnecessarily.

I think it happens when both in-lining and a branch are involved (an if, match, (Core) Option.value ...); see the MWEs below.

This is relevant due to this fairly common pattern:

type t = { mutable a : float }

let update t v = t.a <- Float.max t.a v

which still allocates 2 words after Float.max has been in-lined.

Thanks,
Daniel

Steps to reproduce

type t = { mutable a : float }

let float_max (x : float) (y : float) = if x > y then x else y

let update t v = t.a <- float_max t.a v

update, after in-lining, contains code that boxes and then immediately un-boxes a float.

Another example:

type t = { mutable a : float }

let at_least_5 x = if x < 5. then x else 5.

let update t = t.a <- at_least_5 t.a

Finally:

type t = { mutable a : float }

let func3 t =
  let r = t.a in
  if r = 5. then 1 else 2

let func4 t =
  let r = if t.a <= 0. then t.a else t.a in
  if r = 5. then 1 else 2

The ASM generated for func3 loads (%rax) into %xmm0;

the ASM generated for func4 loads (%rax) into %xmm0 (duplicate code in each branch), boxes it, and then immediately unboxes it into %xmm1.

File attachments

@vicuna
Copy link
Author

vicuna commented Aug 11, 2014

Comment author: @diml

A "let" is not unboxed if it binds something too complicated, such as an ifthenelse construct.

I attached a patch to unbox the ifthenelse case. It can be extended to other constructions.

@vicuna
Copy link
Author

vicuna commented Aug 11, 2014

Comment author: drichman

(The full definition of Float.max is here: https://github.com/janestreet/core_kernel/blob/master/lib/float.ml#L504; it's slightly longer than the dummy I've used in my MWE, in case this is an issue.)

@vicuna
Copy link
Author

vicuna commented Aug 12, 2014

Comment author: drichman

I think there may actually be two separate feature requests here, sorry.

The first: unboxing with an ifthenelse, which I believe the above patch solves

MWE (as above):

type t = { mutable a : float }

let func3 t =
  let r = t.a in
  if r = 5. then 1 else 2

let func4 t =
  let r = if t.a <= 0. then t.a else t.a in
  if r = 5. then 1 else 2

And unboxing in the case of inlining; MWE:

type t = { mutable a : float }

let id x = x

let update t = t.a <- id t.a

With the overall goal of making the abovementioned use of Float.max unboxed.

@vicuna
Copy link
Author

vicuna commented Aug 15, 2014

Comment author: @chambart

It is a bit more intricate than that.

Inlining Float.max in your update function give something equivalent to writing:

let update t v =
t.a <-
let a = t.a in
if a > v
then a
else v

That won't be unboxed because transl is working bottom up:

At first it converts the if
if a > v
then a
else v

to something like that:

if Cload(a) > Cload(v)
then a
else v

Then the let and it recognize that t.a is unboxable (with transl_unbox_let):

let a_unboxed = Cload(t, 0) in
let a_boxed = Box(a_unboxed) in
if a_unboxed > Cload(v)
then a_boxed
else v

And finaly it tries to unbox this whole expression, amounting here to adding loads:

let a_unboxed = Cload(t, 0) in
let a_boxed = Box(a_unboxed) in
if a_unboxed > Cload(v)
then Cload(a_boxed)
else Cload(v)

In that particular case, it would have been better if the inlined version was:

let update t v =
let a = t.a in
t.a <-
if a > v
then a
else v

because in that case it would first have generated

t.a <-
if Cload(a) > Cload(v)
then Cload(a)
else Cload(v)

then tried to unbox 'a' everywhere and, there would not have been any use of
the boxed version anymore, so no allocation. But moving the let would not be a
reasonable approach: it improves that case, but it can degrade others.

There may be multiple approaches to improve that case:

  • try to generate the right unboxed access from the start: add a parameter to
    transl to tell wether we are generating an unboxed expression.
    That would change quite a lot of code

  • Have a smarter unbox_float function with an environment mapping variables to
    their unboxed version which get populated for expressions like
    Clet(var, Calloc(other_var), body)
    To do that we must add a way to distinguish between alloc for boxing and other
    allocations (for a reference).
    Then, when potential acces to the boxed version of the variable have been
    removed, the dead let should be removed.

  • That way, it looks like a job for the CSE. That would also need some way to
    distinguish boxing from other allocations.

  • And at last, my prefered would be to handle boxing and unboxing earlier in the
    compilation chain, at the Flambda level, by adding some Box and Unbox primitives.
    (I didn't do something like that in Flambda yet because it would be quite heavy
    to maintain the diff on cmmgen.ml between my branch and the trunk)

@vicuna
Copy link
Author

vicuna commented Sep 2, 2014

Comment author: @alainfrisch

FWIW, the more_unboxing branch in the SVN uses a very different unboxing strategy, which compiles all examples as expected (no boxing at all).

@vicuna
Copy link
Author

vicuna commented Dec 2, 2015

Comment author: @alainfrisch

There have been many changes in how unboxing is performed. Now, all examples in this ticket are boxing free.

@vicuna vicuna closed this as completed Feb 16, 2017
@vicuna vicuna added this to the 4.03.0 milestone Mar 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants