Original bug ID: 6932
Reporter: @mmottl
Status: resolved (set by @damiendoligez on 2016-03-22T15:17:43Z)
Resolution: suspended
Priority: normal
Severity: tweak
Version: 4.02.2
Target version: 4.03.0+dev / +beta1
Category: middle end (typedtree to clambda)
Monitored by: @ygrek @hcarty @yakobowski @mmottl
Bug description
Consider the following functor example:
module type Arg = sig val x : int end

module Make (Arg : Arg) = struct
  open Arg
  let f1 () = x
  let x = x
  let f2 () = x
end
The definitions of f1 and f2 look identical, but the generated machine code differs: f1 first loads the pointer to the functor argument Arg and then dereferences it to obtain x, whereas f2 accesses the x bound in the functor body, which saves one instruction.
This may seem like a minor issue, but if the functor argument itself contains deeply nested submodules, it can lead to a lot of superfluous pointer chasing and may force the user to "lift" many argument entries by hand.
It might be worth considering an optimization that automatically "lifts" into the functor body any entries (like "x" above) of an argument module, or of its submodules, that are referenced from functions in the functor body.
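For illustration, here is a hedged sketch of what such lifting looks like when written by hand; Make_lifted is a hypothetical name, not part of the report:

```ocaml
(* Sketch: manually "lifting" the argument entry into the functor body,
   which is what the proposed optimization would do automatically. *)
module type Arg = sig val x : int end

module Make_lifted (Arg : Arg) = struct
  let x = Arg.x   (* one load from Arg, paid once at instantiation *)
  let f () = x    (* reads the local binding; no chase through Arg *)
end

module M = Make_lifted (struct let x = 42 end)
```

Here M.f behaves like f2 in the example above: it reads the copy stored in the instantiated module rather than dereferencing the functor argument on every call.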
The downside would be that an instantiated functor results in a "bigger" module in terms of memory and may require a little more time to instantiate. But I think it's fair to assume that functor applications are rare whereas function calls into instantiated functors are frequent.
Even better would be inlining of functor applications / defunctorization. I'm not sure whether this project ever got close to that goal:
http://www.ocamlpro.com/blog/2013/07/11/inlining-progress-report.html
As annoying as it may be, I have come to the conclusion that this problem may not be easily solved without manually binding entries for which this optimization makes sense. Unless anybody has a better idea, I think this issue can be closed.
@doligez: Thanks for motivating me to look into Flambda some more with this problem. I had actually tried it a while back, but hadn't seen much improvement. But that was a rather sloppy test with a development version, and I didn't know as much about Flambda back then.
I have just done some more extensive testing and have seen quite favorable results. The solution was to amp up the optimization settings. Just passing "-O2" will probably do the trick in many cases. Code annotations can probably help with a more targeted approach. The compiler with Flambda produces astoundingly good code when it knows functor arguments and if you give it enough leeway to specialize and inline. It can definitely get rid of the pointer chasing observed before Flambda came along.
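As a hedged sketch of the more targeted approach mentioned above, the standard [@@inline] item attribute can request inlining explicitly; whether Flambda actually specializes away the access to Arg.x depends on the compiler being an flambda-enabled switch and on the optimization level (e.g. -O2):

```ocaml
(* Sketch, assuming an flambda-enabled compiler (e.g. a 4.03+flambda
   switch) built with -O2.  The attribute asks the compiler to inline
   f at its call sites; with a known functor argument, Flambda can
   then fold away the indirection through Arg. *)
module type Arg = sig val x : int end

module Make (Arg : Arg) = struct
  let f () = Arg.x [@@inline]
end

module M = Make (struct let x = 7 end)
```

On a non-flambda compiler the attribute is simply less effective; the code still compiles and behaves the same, only without the specialization.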
There are likely scenarios where Flambda cannot possibly specialize the code, e.g. when the functor argument isn't known at compile time. I guess in these cases it's up to the user to lift some frequently needed values by hand from the functor argument to avoid some dereferencing.