|Anonymous | Login | Signup for a new account||2016-02-09 12:54 CET|
|Main | My View | View Issues | Change Log | Roadmap|
|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0006345||OCaml||OCaml backend (code generation)||public||2014-03-12 15:13||2015-12-11 19:25|
|Priority||normal||Severity||minor||Reproducibility||have not tried|
|Target Version||Fixed in Version||4.02.0+dev|
|Summary||0006345: Better compilation of optional arguments with default values|
|Description||The current compilation strategy for optional arguments with default values often creates many allocations on the call site to box the provided arguments with a "Some" constructor, which will be immediately thrown away by the function's body.|
An example would be:
let rec f ?(flag = false) ?(acc = 0) = function |  -> if flag then acc else acc + 1 | hd :: tl -> f ~flag ~acc:(acc + hd) tl
which turns out to be about 3 times slower than:
let rec f ~flag ~acc = function |  -> if flag then acc else acc + 1 | hd :: tl -> f ~flag ~acc:(acc + hd) tl let f ?(flag = false) ?(acc = 0) l = f ~flag ~acc l
when used in a loop such as:
let () = let l = [1;2;3;4;5;6;7;8;9] in for i = 1 to 10000000 do ignore (f l) done
It could be interesting to generate an inner function where all optional arguments with a default value are turned into non-optional arguments, and a wrapper function in charge of filling in the default values when needed. In most cases, this wrapper could be inlined and simplified on the call site, where one often knows statically if the optional arguments are provided or not (i.e. they are omitted or applied with ~, not ?). As the example above shows, it is worthwhile treating the case of recursive functions (where the inner function and the wrapper are mutually recursive), which is not currently the case for the generic inlining stuff.
|Tags||No tags attached.|
|Attached Files|| pr_6345_inlining.diff [^] (6,478 bytes) 2014-03-12 18:08 [Show Content]
pr_6345.diff [^] (9,308 bytes) 2014-03-17 11:44 [Show Content]
edited on: 2014-03-12 18:11
I attach a first patch, which improves the inlining mechanism enough so that the following code is compiled as expected:
let rec f ?(flag = false) ?(acc = 100) l = f_inner ~flag ~acc l and f_inner ~flag ~acc = function |  -> if flag then acc else acc + 1 | hd :: tl -> f ~flag ~acc:(acc + hd) tl let () = let l = [1;2;3;4;5;6;7;8;9] in for i = 1 to 10000000 do ignore (f l) done
Namely, the two calls to [f] are inlined, resulting in direct calls to [f_inner]. Moreover, the code in the [f] wrapper responsible for checking optional arguments is evaluated statically to remove the check and the allocation of the "Some" blocks.
The patch implements the following changes, which could have a beneficial (I hope) effect in other contexts:
- Recursive functions can be marked as "inlinable". Only functions defined *after* an inlinable function in a recursive group can benefit from inlining. This is restrictive and sensible to the ordering, but this is enough for our use case (if the wrapper is put before the inner function).
- The compilation of functions is changed a little bit to fail earlier when the environment is detected to be actually used (cf NotClosed exception in the patch). This is required to avoid internal errors during compilation, caused by inlining when we change during compilation the actual number of parameters to add the environment pseudo-parameter. I believe the new version is also more efficient (compile-time), since it avoids useless compilation of further functions in the group.
- During inlining, when a parameter corresponding to a optional argument with a default value is bound to an argument "Some expr", we instead bind it to "expr" and inline the "Some" constructor when it is referenced. (This is currently triggered by matching on the internal name of the parameter, which is always *opt*. We could make this more general if we detect that all use site can be simplified away to avoid creating the "Some" block.)
- The simplification pass applied after inlining is enriched to simplify away checks and field accesses related to such inlined "Some" block (concretely: a field access on a pure Pmakeblock primitive is resolved statically to its nth field, and a "if" check on a Pmakeblock primitive is resolved to the "then" branch). These optimizations could be useful for other kinds of inlining.
The next step is to generate automatically the wrapper function. This could be done at different levels. The simplest would probably be during type-checking, in typecore.ml, where the *opt* identifiers are synthesized. This might also simplify the rather ugly "push_defaults" code in translcore.ml.
The second patch (pr_6345.diff) includes the first one and adds the required machinery to split a function with default arguments into a wrapper function and an inner function. I've implemented the split during closure conversion, by matching the lambda code generated by typecore.ml to compile default values for optional arguments. This is somewhat fragile, and if this patch is accepted, the effectiveness of the optimization should be checked by non-regression tests.
With this patch, the original example is properly split into two functions, and the inner one is called by the main loop and for the recursive call, thus eliminating all the overhead of optional arguments with default values in that example.
I've got a look at pr_6345.diff, and what it does seems reasonable (and correct).
Doing it after push_defaults seems right, because you don't want to work directly on the typed tree.
Also, it seems hard to replace push_defaults, because it has to be done on the typed tree...
The only weak part seems to be the hard-coded check on "*opt*", as there is no guarantee this could not be used for something else...
Maybe it could be a good idea to check that none of these *opt* are free in the body of the function.
Also, it _is_ necessary to freshen the ids. At least, some optimizations on the lambda code rely on that.
They don't matter here, since we are in the closure conversion, but there could be other optimizations at latter stages.
I was wondering whether this wrapper/worker scheme could be generalized.
The basic idea here is that we want to do some dispatch statically, by inlining the dispatch code.
This is clearly relevant for optional arguments, which are going to be an explicit Some or None in most cases, but are there other such situations which could be detected at function definition?
Jacques, thanks for your review. I've added a check that the *opt* identifiers are indeed removed from the body, and a test in testsuite/tests/asmcomp to ensure that the new compilations scheme is effective (and that it doesn't break later).
Commited to trunk, rev 14466.
> I was wondering whether this wrapper/worker scheme could be generalized.
I don't know how common it is to pass freshly constructed data structures to function which immediately pattern-match on them. Maybe this appear after inlining. At least, contrary to optional arguments, the construction of the data structure is explicit at the call site, so people might tend to provide specialized functions if performance matters. But yes, it's certainly an idea to be investigated.
|2014-03-12 15:13||frisch||New Issue|
|2014-03-12 17:34||frisch||File Added: pr_6345_inlining.diff|
|2014-03-12 18:07||frisch||Note Added: 0011038|
|2014-03-12 18:08||frisch||File Deleted: pr_6345_inlining.diff|
|2014-03-12 18:08||frisch||File Added: pr_6345_inlining.diff|
|2014-03-12 18:11||frisch||Note Edited: 0011038||View Revisions|
|2014-03-17 11:44||frisch||File Added: pr_6345.diff|
|2014-03-17 11:48||frisch||Note Added: 0011046|
|2014-03-18 07:41||garrigue||Note Added: 0011049|
|2014-03-18 17:59||frisch||Note Added: 0011052|
|2014-03-19 12:43||frisch||Status||new => resolved|
|2014-03-19 12:43||frisch||Fixed in Version||=> 4.02.0+dev|
|2014-03-19 12:43||frisch||Resolution||open => fixed|
|2014-03-19 12:43||frisch||Assigned To||=> frisch|
|2015-12-11 19:25||xleroy||Status||resolved => closed|
|Copyright © 2000 - 2011 MantisBT Group|