Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006512OCamlOcaml optimizationpublic2014-08-11 17:522014-09-02 10:57
Reporterdrichman 
Assigned To 
PrioritynormalSeverityfeatureReproducibilityalways
StatusacknowledgedResolutionopen 
PlatformOSOS Version
Product Version4.02.0+beta1 / +rc1 
Target Version4.03.0+devFixed in Version 
Summary0006512: intermediate float boxing with inlined functions that contain a branch
DescriptionI've found a few examples where a float, loaded from a record that is using the only-contains-floats optimisation, is boxed unnecessarily.

I think it happens when both in-lining and a branch are involved (an if, match, (Core) Option.value ...); see the MWEs below.

This is relevant due to this fairly common pattern:

type t = { mutable a : float }

let update t v = t.a <- Float.max t.a v


which still allocates 2 words after Float.max has been in-lined.

Thanks,
Daniel
Steps To Reproduce
type t = { mutable a : float }

let float_max (x : float) (y : float) = if x > y then x else y

let update t v = t.a <- float_max t.a v


update, after in-lining, contains code that boxes and then immediately un-boxes a float.

Another example:

type t = { mutable a : float }

let at_least_5 x = if x < 5. then x else 5.

let update t = t.a <- at_least_5 t.a


Finally:

type t = { mutable a : float }

let func3 t =
  let r = t.a in
  if r = 5. then 1 else 2

let func4 t =
  let r = if t.a <= 0. then t.a else t.a in
  if r = 5. then 1 else 2


The ASM generated for func3 loads (%rax) into %xmm0;

the ASM generated for func4 loads (%rax) into %xmm0 (duplicate code in each branch), boxes it, and then immediately unboxes it into %xmm1.
TagsNo tags attached.
Attached Filesdiff file icon unbox-ifthenelse.diff [^] (638 bytes) 2014-08-11 18:00 [Show Content]

- Relationships

-  Notes
(0012000)
dim (developer)
2014-08-11 18:04

A "let" is not unboxed if it binds something too complicated, such as an ifthenelse construct.

I attached a patch to unbox the ifthenelse case. It can be extended to other constructions.
(0012001)
drichman (reporter)
2014-08-11 18:33

(The full definition of Float.max is here: https://github.com/janestreet/core_kernel/blob/master/lib/float.ml#L504; [^] it's slightly longer than the dummy I've used in my MWE, in case this is an issue.)
(0012002)
drichman (reporter)
2014-08-12 13:07
edited on: 2014-08-12 13:07

I think there may actually be two separate feature requests here, sorry.


The first: unboxing with an ifthenelse, which I believe the above patch solves

MWE (as above):

type t = { mutable a : float }

let func3 t =
  let r = t.a in
  if r = 5. then 1 else 2

let func4 t =
  let r = if t.a <= 0. then t.a else t.a in
  if r = 5. then 1 else 2


And unboxing in the case of inlining; MWE:

type t = { mutable a : float }

let id x = x

let update t = t.a <- id t.a


With the overall goal of making the abovementioned use of Float.max unboxed.

(0012012)
chambart (developer)
2014-08-15 19:32

It is a bit more intricate than that.

Inlining Float.max in your update function give something equivalent to writing:

let update t v =
  t.a <-
    let a = t.a in
    if a > v
    then a
    else v

That won't be unboxed because transl is working bottom up:

At first it converts the if
  if a > v
  then a
  else v

to something like that:

  if Cload(a) > Cload(v)
  then a
  else v

Then the let and it recognize that t.a is unboxable (with transl_unbox_let):

  let a_unboxed = Cload(t, 0) in
  let a_boxed = Box(a_unboxed) in
  if a_unboxed > Cload(v)
  then a_boxed
  else v

And finaly it tries to unbox this whole expression, amounting here to adding loads:

  let a_unboxed = Cload(t, 0) in
  let a_boxed = Box(a_unboxed) in
  if a_unboxed > Cload(v)
  then Cload(a_boxed)
  else Cload(v)


In that particular case, it would have been better if the inlined version was:

let update t v =
  let a = t.a in
  t.a <-
    if a > v
    then a
    else v

because in that case it would first have generated

  t.a <-
    if Cload(a) > Cload(v)
    then Cload(a)
    else Cload(v)

then tried to unbox 'a' everywhere and, there would not have been any use of
the boxed version anymore, so no allocation. But moving the let would not be a
reasonable approach: it improves that case, but it can degrade others.


There may be multiple approaches to improve that case:
- try to generate the right unboxed access from the start: add a parameter to
  transl to tell wether we are generating an unboxed expression.
  That would change quite a lot of code

- Have a smarter unbox_float function with an environment mapping variables to
  their unboxed version which get populated for expressions like
    Clet(var, Calloc(other_var), body)
  To do that we must add a way to distinguish between alloc for boxing and other
  allocations (for a reference).
  Then, when potential acces to the boxed version of the variable have been
  removed, the dead let should be removed.

- That way, it looks like a job for the CSE. That would also need some way to
  distinguish boxing from other allocations.

- And at last, my prefered would be to handle boxing and unboxing earlier in the
  compilation chain, at the Flambda level, by adding some Box and Unbox primitives.
  (I didn't do something like that in Flambda yet because it would be quite heavy
  to maintain the diff on cmmgen.ml between my branch and the trunk)
(0012072)
frisch (developer)
2014-09-02 10:57

FWIW, the more_unboxing branch in the SVN uses a very different unboxing strategy, which compiles all examples as expected (no boxing at all).

- Issue History
Date Modified Username Field Change
2014-08-11 17:52 drichman New Issue
2014-08-11 18:00 dim File Added: unbox-ifthenelse.diff
2014-08-11 18:04 dim Note Added: 0012000
2014-08-11 18:33 drichman Note Added: 0012001
2014-08-12 13:07 drichman Note Added: 0012002
2014-08-12 13:07 drichman Note Edited: 0012002 View Revisions
2014-08-12 13:07 drichman Note Edited: 0012002 View Revisions
2014-08-15 19:32 chambart Note Added: 0012012
2014-08-29 14:30 doligez Severity feature => minor
2014-08-29 14:30 doligez Category OCaml backend (code generation) => Ocaml optimization
2014-08-29 14:30 doligez Target Version => 4.02.1+dev
2014-09-01 15:23 xleroy Severity minor => feature
2014-09-01 15:23 xleroy Status new => acknowledged
2014-09-01 15:23 xleroy Target Version 4.02.1+dev => 4.03.0+dev
2014-09-02 10:57 frisch Note Added: 0012072


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker