Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0007017OCamlback end (clambda to assembly)public2015-10-13 18:402018-11-27 18:03
Reporterfrisch 
Assigned To 
PrioritynormalSeverityfeatureReproducibilityhave not tried
StatusconfirmedResolutionopen 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0007017: Allow unboxing across "static fail"
DescriptionCurrently, arguments of "jumps" in cmm (i.e. Cexit/Ccatch) are always assumed to be generic values; they cannot transport e.g. unboxed floats. Here is an example of code which illustrates the addition of float boxing to cross those jumps:

  let f xs ys =
    match xs, ys with
    | [|a|], b
    | b, [|a|] -> let _ = b.(0) +. a in ()
    | _ -> ()


This produces:

(function camlFoo__f_1324 (xs/1325: val ys/1326: val)
 (catch
   (if (!= (or (>>u (load int (+a xs/1325 -4)) 10) 1) 3)
     (if (!= (or (>>u (load int (+a ys/1326 -4)) 10) 1) 3) 1a
       (let a/1345 (load float64u ys/1326)
         (exit 2 (alloc 2301 a/1345) xs/1325)))
     (let a/1344 (load float64u xs/1325)
       (exit 2 (alloc 2301 a/1344) ys/1326)))
 with(2 a/1327:val b/1328:val)
   (let match/1346 (+f (load float64u b/1328) (load float64u a/1327)) 1a)))


I attach a patch that: (i) adds support for passing other "types" of data in Ccatch/Cexit parameters; (ii) detect when, for a given Ccatch and a given parameter position, all corresponding Cexit sites pass a "boxing expression" (float or boxed integers), and in this case, uses an unboxed parameter instead. I have only been able to test it on float unboxing (with the code above), and I don't know if it would currently apply to boxed integers in practice. That said, various optimization proposals are based on using Ccatch/Cexit, and would certainly benefit from the optimization. I think in particular of 0004800 (where gasche proposes to use Ccatch/Cexit to avoid allocating some tuples; but without the current patch, gasche's proposal forces to box floats); and of 0006242, which compiles local functions used in tail position into Ccatch/Cexit.

With the patch, the cmm for the example above becomes

(function camlFoo__f_1324 (xs/1325: val ys/1326: val)
 (catch
   (if (!= (or (>>u (load int (+a xs/1325 -4)) 10) 1) 3)
     (if (!= (or (>>u (load int (+a ys/1326 -4)) 10) 1) 3) 1a
       (let a/1345 (load float64u ys/1326) (exit 2 a/1345 xs/1325)))
     (let a/1344 (load float64u xs/1325) (exit 2 a/1344 ys/1326)))
 with(2 a/1347:float b/1328:val)
   (let match/1346 (+f (load float64u b/1328) a/1347) 1a)))
Tagsoptimization, patch
Attached Filesdiff file icon unbox_across_jumps.diff [^] (9,669 bytes) 2015-10-13 18:41 [Show Content]

- Relationships
related to 0004800closedfrisch native compilation optimization of tuple assignment 
related to 0006242resolvedfrisch Better compilation of local functions only used for tail calls 

-  Notes
(0014542)
frisch (developer)
2015-10-15 16:14

Branch created on github, adapted to current trunk:

https://github.com/alainfrisch/ocaml/tree/unbox_across_jumps [^]
(0014547)
frisch (developer)
2015-10-15 17:06
edited on: 2015-10-15 17:07

I've has to extend cmm in order to detect float/boxed int constants at this level (see github).

I'm still not very satisfied with the current implementation, since it lacks simple cases such as:

let f a =
  let x = a +. 1. in
  let y = a *. 3. in
  let x, y =
    if x < y then x, y
    else y, x
  in
  x +. 30. *. y

the reason being that the unboxing of x and y happens after the compilation of the tuple binding (into a "jump"), so the code for compiling catch doesn't "see" that the argument is going to be immediately boxed. This results in:

(function camlFoo__f_1203 (a/1204: val)
 (let
   (x/1269 (+f (load float64u a/1204) 1.)
    y/1268 (*f (load float64u a/1204) 3.))
   (catch
     (if (<f x/1269 y/1268) (exit 1 (alloc 2301 y/1268) (alloc 2301 x/1269))
       (exit 1 (alloc 2301 x/1269) (alloc 2301 y/1268)))
   with(1 y/1208:val x/1207:val)
     (alloc 2301 (+f (load float64u x/1207) (*f 30. (load float64u y/1208)))))))



Forcing unboxing earlier with:

let f a =
  let x = a +. 1. in
  let y = a *. 3. in
  let x, y =
    if x < y then x +. 0., y +. 0.
    else y +. 0., x +. 0.
  in
  x +. 30. *. y


indeed avoids boxing:

(function camlFoo__f_1203 (a/1204: val)
 (let
   (x/1275 (+f (load float64u a/1204) 1.)
    y/1274 (*f (load float64u a/1204) 3.))
   (catch
     (if (<f x/1275 y/1274)
       (let (y/1271 (+f y/1274 0.) x/1270 (+f x/1275 0.))
         (exit 1 y/1271 x/1270))
       (let (y/1269 (+f x/1275 0.) x/1268 (+f y/1274 0.))
         (exit 1 y/1269 x/1268)))
   with(1 y/1272:float x/1273:float)
     (alloc 2301 (+f x/1273 (*f 30. y/1272))))))


One would need to reapply the optimization in subst_boxed_number. Alternatively, one should compile the body of a "let" binding already knowing that the bound id will be unboxed (as in 0006866).

(0014565)
frisch (developer)
2015-10-16 15:31

The github branch has been merged with unbox_earlier, and now it succeeds in eliminating all allocations in:

let () =
  let module M = Nativeint in
  let x = ref (M.of_int 1) in
  for i = 1 to 1000000000 do
    let a, b =
      if i mod 2 = 0 then !x, M.of_int 1
      else M.of_int 2, !x
    in
    x := M.add (M.mul !x a) b
  done;


trunk / unbox_earlier: 3.6
unbox_across_jumps: 1.65
(0015322)
doligez (administrator)
2016-02-08 12:30

@frisch, can we close this PR now that the github branch is merged?
(0015324)
frisch (developer)
2016-02-08 14:19

> can we close this PR now that the github branch is merged?

Which one? There is no PR corresponding to this branch. So please keep this open, I think this is a rather important (future) work.
(0019491)
frisch (developer)
2018-11-27 18:03

https://github.com/ocaml/ocaml/pull/2165 [^]

- Issue History
Date Modified Username Field Change
2015-10-13 18:40 frisch New Issue
2015-10-13 18:41 frisch File Added: unbox_across_jumps.diff
2015-10-15 15:36 frisch Relationship added related to 0004800
2015-10-15 15:36 frisch Relationship added related to 0006242
2015-10-15 16:14 frisch Note Added: 0014542
2015-10-15 17:06 frisch Note Added: 0014547
2015-10-15 17:07 frisch Note Edited: 0014547 View Revisions
2015-10-16 15:31 frisch Note Added: 0014565
2016-02-08 12:30 doligez Note Added: 0015322
2016-02-08 12:30 doligez Status new => feedback
2016-02-08 12:30 doligez Target Version => 4.03.0+dev / +beta1
2016-02-08 14:19 frisch Note Added: 0015324
2016-02-08 14:19 frisch Status feedback => new
2016-02-09 23:25 frisch Target Version 4.03.0+dev / +beta1 => later
2017-02-23 16:42 doligez Category Ocaml optimization => -Ocaml optimization
2017-02-24 11:25 doligez Tag Attached: patch
2017-02-24 11:40 doligez Status new => confirmed
2017-02-24 11:40 doligez Category -Ocaml optimization => back end (clambda to assembly)
2017-02-24 11:40 doligez Target Version later =>
2017-02-24 11:40 doligez Tag Attached: optimization
2017-02-24 16:13 frisch Severity minor => feature
2018-11-27 18:03 frisch Note Added: 0019491


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker