Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better compilation of optional arguments with default values #6345

Closed
vicuna opened this issue Mar 12, 2014 · 4 comments
Closed

Better compilation of optional arguments with default values #6345

vicuna opened this issue Mar 12, 2014 · 4 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Mar 12, 2014

Original bug ID: 6345
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2015-12-11T18:25:52Z)
Resolution: fixed
Priority: normal
Severity: minor
Fixed in version: 4.02.0+dev
Category: back end (clambda to assembly)
Monitored by: @diml @ygrek @jmeber @hcarty

Bug description

The current compilation strategy for optional arguments with default values often creates many allocations on the call site to box the provided arguments with a "Some" constructor, which will be immediately thrown away by the function's body.

An example would be:

let rec f ?(flag = false) ?(acc = 0) = function
  | [] -> if flag then acc else acc + 1
  | hd :: tl -> f ~flag ~acc:(acc + hd) tl

which turns out to be about 3 times slower than:

let rec f ~flag ~acc = function
  | [] -> if flag then acc else acc + 1
  | hd :: tl -> f ~flag ~acc:(acc + hd) tl

let f ?(flag = false) ?(acc = 0) l = f ~flag ~acc l

when used in a loop such as:

let () =
  let l = [1;2;3;4;5;6;7;8;9] in
  for i = 1 to 10000000 do
    ignore (f l)
  done

It could be interesting to generate an inner function where all optional arguments with a default value are turned into non-optional arguments, and a wrapper function in charge of filling in the default values when needed. In most cases, this wrapper could be inlined and simplified on the call site, where one often knows statically if the optional arguments are provided or not (i.e. they are omitted or applied with ~, not ?). As the example above shows, it is worthwhile treating the case of recursive functions (where the inner function and the wrapper are mutually recursive), which is not currently the case for the generic inlining stuff.

File attachments

@vicuna
Copy link
Author

vicuna commented Mar 12, 2014

Comment author: @alainfrisch

I attach a first patch, which improves the inlining mechanism enough so that the following code is compiled as expected:

let rec f ?(flag = false) ?(acc = 100) l = f_inner ~flag ~acc l
and f_inner ~flag ~acc = function
  | [] -> if flag then acc else acc + 1
  | hd :: tl -> f ~flag ~acc:(acc + hd) tl

let () =
  let l = [1;2;3;4;5;6;7;8;9] in
  for i = 1 to 10000000 do
    ignore (f l)
  done

Namely, the two calls to [f] are inlined, resulting in direct calls to [f_inner]. Moreover, the code in the [f] wrapper responsible for checking optional arguments is evaluated statically to remove the check and the allocation of the "Some" blocks.

The patch implements the following changes, which could have a beneficial (I hope) effect in other contexts:

  • Recursive functions can be marked as "inlinable". Only functions defined after an inlinable function in a recursive group can benefit from inlining. This is restrictive and sensible to the ordering, but this is enough for our use case (if the wrapper is put before the inner function).

  • The compilation of functions is changed a little bit to fail earlier when the environment is detected to be actually used (cf NotClosed exception in the patch). This is required to avoid internal errors during compilation, caused by inlining when we change during compilation the actual number of parameters to add the environment pseudo-parameter. I believe the new version is also more efficient (compile-time), since it avoids useless compilation of further functions in the group.

  • During inlining, when a parameter corresponding to a optional argument with a default value is bound to an argument "Some expr", we instead bind it to "expr" and inline the "Some" constructor when it is referenced. (This is currently triggered by matching on the internal name of the parameter, which is always opt. We could make this more general if we detect that all use site can be simplified away to avoid creating the "Some" block.)

  • The simplification pass applied after inlining is enriched to simplify away checks and field accesses related to such inlined "Some" block (concretely: a field access on a pure Pmakeblock primitive is resolved statically to its nth field, and a "if" check on a Pmakeblock primitive is resolved to the "then" branch). These optimizations could be useful for other kinds of inlining.

The next step is to generate automatically the wrapper function. This could be done at different levels. The simplest would probably be during type-checking, in typecore.ml, where the opt identifiers are synthesized. This might also simplify the rather ugly "push_defaults" code in translcore.ml.

@vicuna
Copy link
Author

vicuna commented Mar 17, 2014

Comment author: @alainfrisch

The second patch (pr_6345.diff) includes the first one and adds the required machinery to split a function with default arguments into a wrapper function and an inner function. I've implemented the split during closure conversion, by matching the lambda code generated by typecore.ml to compile default values for optional arguments. This is somewhat fragile, and if this patch is accepted, the effectiveness of the optimization should be checked by non-regression tests.

With this patch, the original example is properly split into two functions, and the inner one is called by the main loop and for the recursive call, thus eliminating all the overhead of optional arguments with default values in that example.

@vicuna
Copy link
Author

vicuna commented Mar 18, 2014

Comment author: @garrigue

I've got a look at pr_6345.diff, and what it does seems reasonable (and correct).
Doing it after push_defaults seems right, because you don't want to work directly on the typed tree.
Also, it seems hard to replace push_defaults, because it has to be done on the typed tree...
The only weak part seems to be the hard-coded check on "opt", as there is no guarantee this could not be used for something else...
Maybe it could be a good idea to check that none of these opt are free in the body of the function.
Also, it is necessary to freshen the ids. At least, some optimizations on the lambda code rely on that.
They don't matter here, since we are in the closure conversion, but there could be other optimizations at latter stages.

I was wondering whether this wrapper/worker scheme could be generalized.
The basic idea here is that we want to do some dispatch statically, by inlining the dispatch code.
This is clearly relevant for optional arguments, which are going to be an explicit Some or None in most cases, but are there other such situations which could be detected at function definition?

@vicuna
Copy link
Author

vicuna commented Mar 18, 2014

Comment author: @alainfrisch

Jacques, thanks for your review. I've added a check that the opt identifiers are indeed removed from the body, and a test in testsuite/tests/asmcomp to ensure that the new compilations scheme is effective (and that it doesn't break later).

Commited to trunk, rev 14466.

I was wondering whether this wrapper/worker scheme could be generalized.

I don't know how common it is to pass freshly constructed data structures to function which immediately pattern-match on them. Maybe this appear after inlining. At least, contrary to optional arguments, the construction of the data structure is explicit at the call site, so people might tend to provide specialized functions if performance matters. But yes, it's certainly an idea to be investigated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants