|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0006343||OCaml||back end (clambda to assembly)||public||2014-03-07 12:24||2015-12-11 19:25|
|Priority||normal||Severity||feature||Reproducibility||have not tried|
|Target Version||Fixed in Version||4.02.0+dev|
|Summary||0006343: Making better use of extra slots in the symbol corresponding to the current unit|
|Description||The compilation of units has recently been changed to map value identifiers defined in sub-modules into extra slots of the symbol corresponding to the current unit. This makes it possible to access those values directly, with a single indirection from the global symbol. Unfortunately, this ability is not used optimally. For instance, consider:|
module A1 = struct
  module A2 = struct
    let r = ref 1
  end
end
let f () = !A1.A2.r
Here, A2 and r are stored in extra slots of the global (fields number 2 and 3 respectively). Unfortunately, the reference to r in f still goes through 4 indirections from the global symbol, while 2 would be enough.
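To make the two access paths concrete, here is a toy model of the unit's global block, with plain OCaml variants standing in for heap blocks. The `value` type, the `field` helper and the position of A1 and f in fields 0 and 1 are assumptions made for illustration; only the extra slots 2 and 3 come from the description above. This is a sketch of the layout, not of how the compiler represents it internally.

```ocaml
(* Toy model only: variants stand in for heap blocks. *)
type value =
  | Int of int
  | Block of value array

(* One indirection: read field [i] of a block. *)
let field v i =
  match v with
  | Block fields -> fields.(i)
  | Int _ -> invalid_arg "field: not a block"

let r = Block [| Int 1 |]   (* [ref 1] is a one-field block *)
let a2 = Block [| r |]      (* module A2 = struct let r = ... end *)
let a1 = Block [| a2 |]     (* module A1 = struct module A2 = ... end *)

(* Assumed layout: A1, f (placeholder value), then the extra slots A2 and r. *)
let global = Block [| a1; Int 0; a2; r |]

(* Long path: global -> A1 -> A2 -> r -> contents (4 indirections). *)
let slow () = field (field (field (field global 0) 0) 0) 0

(* Path through the extra slot: global -> r -> contents (2 indirections). *)
let fast () = field (field global 3) 0
```

Both functions return the same contents; the only difference is the number of `field` steps, which is what the counts of 4 and 2 refer to.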
On a related note, if we put an .mli on this unit which forces some non-trivial coercion on A1 (e.g. an empty signature for A1), then it's even worse: the indirections don't start from the global symbol, but from the function's environment (which is now required, since the closure is no longer constant). To fix this part, one could keep extra slots in the global to store the uncoerced version of module identifiers, so that they can be accessed from this root symbol.
|Tags||No tags attached.|
|Attached Files||pr_6343.diff (2,807 bytes) 2014-03-07 14:53|
I've attached a patch which attacks the main issue (not using existing global slots as a faster way to access nested value identifiers) by enriching the notion of approximation tracked during the closure conversion with a new case to represent the nth field of a global.
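As a rough illustration of the idea, the sketch below tracks a simplified approximation alongside each expression and redirects an access to a global slot when the approximation says the value lives there. The types `approx` and `expr`, the functions `field_approx` and `access`, and the symbol name `camlExample` are all hypothetical and much simpler than the compiler's own data structures; this is not the code of the attached patch.

```ocaml
(* Hypothetical, simplified approximations; not the compiler's actual types. *)
type approx =
  | Unknown
  | Block of approx array          (* a block whose fields are known *)
  | Global_field of string * int   (* field n of a global symbol *)

(* A simplified target expression. *)
type expr =
  | Symbol of string
  | Field of expr * int            (* one load *)

(* Approximation of field [n] of a value whose approximation is [a]. *)
let field_approx a n =
  match a with
  | Block fields when n >= 0 && n < Array.length fields -> fields.(n)
  | _ -> Unknown

(* Access field [n] of [e]. If the result is known to live in a global
   slot, ignore [e] and load from that slot directly. *)
let access (e, a) n =
  let a' = field_approx a n in
  match a' with
  | Global_field (sym, i) -> (Field (Symbol sym, i), a')
  | _ -> (Field (e, n), a')

(* The example from the description: r is known to sit in slot 3. *)
let unit_sym = "camlExample"       (* hypothetical symbol name *)
let approx_a2 = Block [| Global_field (unit_sym, 3) |]
let approx_a1 = Block [| approx_a2 |]

(* A1.A2.r, starting from A1 read out of field 0 of the global. *)
let a1 = (Field (Symbol unit_sym, 0), approx_a1)
let (r_access, _) = access (access a1 0) 0
```

With these definitions, `r_access` is `Field (Symbol "camlExample", 3)`: the two intermediate loads through A1 and A2 are dropped because the approximation of the final field names the global slot.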
In the long tradition of meaningless but funny micro-benchmarks (Xavier, you're welcome :-)), I've tested the patch with:
module A1 = struct
  module A2 = struct
    let r = ref 1
  end
end

let f () =
  for i = 1 to 10000 do
    incr A1.A2.r
  done;
  !A1.A2.r

let () =
  for i = 1 to 100000 do
    ignore (f ())
  done
and it runs about twice as fast as with the current trunk.
This would need to be checked, but I believe the patch could make it possible to simplify Translmod, which would no longer need to keep track of a lambda-substitution mapping identifiers to global field references.
That said, another approach would be to attack the issue directly in Translmod, in order to get nicer lambda code. One advantage is that bytecode could in theory benefit more easily (if we switch to the transl_store way of compiling structures), which might be nicer for js_of_ocaml in particular.
|This seems to be related to 0005537 and 0005573.|
The patch also fixed the "related" note of the original ticket. Consider for instance:
(* interface (.mli) *)
module A1 : sig end
val f : unit -> int
(* implementation (.ml) *)
module A1 = struct let r = ref 1 end
let f () = !A1.r
The "r" value is put in a global slot, and the "A1" block is created by accessing this global slot. This information is preserved in the approximation of A1, so that when the lambda code accesses A1.r, the closure conversion knows that the value can be retrieved from the global slot.
If A1 is completely hidden from the interface (or if "r" is added to it, so that there is no coercion involved), the lambda code for accessing A1.r will be quite different (it will start from the global symbol, not from the local A1 block), but the end result will be the same.
|Patch committed to trunk (commit 14452). It has the effect of turning more functions into closed ones, even if they seemingly have free variables in the lambda code (this already used to be the case, but less frequently). In such cases, we can allocate the corresponding closures statically (commit 14453).|
|2014-03-07 12:24||frisch||New Issue|
|2014-03-07 14:53||frisch||File Added: pr_6343.diff|
|2014-03-07 15:06||frisch||Note Added: 0011024|
|2014-03-07 15:20||frisch||Note Added: 0011025|
|2014-03-07 15:33||frisch||Relationship added||related to 0005537|
|2014-03-07 17:52||frisch||Description Updated|
|2014-03-07 17:59||frisch||Note Added: 0011028|
|2014-03-10 11:08||frisch||Note Added: 0011033|
|2014-03-13 10:40||frisch||Status||new => resolved|
|2014-03-13 10:40||frisch||Fixed in Version||=> 4.02.0+dev|
|2014-03-13 10:40||frisch||Resolution||open => fixed|
|2014-03-13 10:40||frisch||Assigned To||=> frisch|
|2015-12-11 19:25||xleroy||Status||resolved => closed|
|2017-02-23 16:35||doligez||Category||OCaml backend (code generation) => Back end (clambda to assembly)|
|2017-02-23 16:44||doligez||Category||Back end (clambda to assembly) => back end (clambda to assembly)|