Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005916OCamlOCaml typingpublic2013-02-07 06:242014-05-25 19:52
Reportersliquister 
Assigned To 
PrioritynormalSeverityminorReproducibilityalways
StatusacknowledgedResolutionopen 
PlatformOSOS Version
Product Version4.00.0 
Target Version4.03.0+devFixed in Version 
Summary0005916: Compile time performance of opens
DescriptionThe attached patch is a try at improving compilation speed.

It does two things:
1. compose the two substitutions applied when typing [open] constructs like [open Core.Std], instead of applying the two substitutions one after the other
2. apply the resulting substitution lazily. The substitution is only applied on inner modules of, say, [Core.Std] when they are looked up.

The reason for this patch is because many libraries at jane street like core and async are packed as one module, and are accessed through the Std submodule. As a consequence, the aforementioned substitutions are currenty applied on the whole library (twice) when writing [open Lib.Std], as opposed to once on the modules used by the current compilation unit with the patch.

I haven't tried the patch completely by itself, but compiling (without linking) one big tree using
1. a standard 4.00.0 (or 4.00.1?) compiler took 168min (wall clock time).
2. the same compiler with OCAMLRUNPARAM=s=32M took between 141min and 151min.
3. the same compiler with a patch equivalent to the one attached and OCAMLRUNPARAM=s=32M took 109min.

If we just looked at user time, the improvement would be more visible (3. would be roughly 3x faster than 1. if I remember right).
Tagspatch
Attached Filesdiff file icon lazy_subst.diff [^] (13,259 bytes) 2013-02-07 06:24 [Show Content]
diff file icon lazy-subst2.diff [^] (16,693 bytes) 2013-02-20 15:18 [Show Content]
diff file icon lazy-subst3.diff [^] (18,138 bytes) 2013-07-15 17:18 [Show Content]

- Relationships
has duplicate 0005877resolvedfrisch multiple "open" can become expensive in memory 

-  Notes
(0008823)
frisch (developer)
2013-02-07 06:39

We already incorporated a patch to improve performance of opens in the compiler (0005877). Do you think your proposal brings an additional improvement, or maybe that it should replace the other patch?
(0008824)
sliquister (reporter)
2013-02-07 07:07

I don't understand the other patch but judging by the description, I don't think they overlap. This one is about making opening of big structure faster because even one open is already quite costly. It aims for this kind of code:
  open Core.Std
  open Core_extended.Std
  open Async.Std
  ...
There are no multiple opens here so the other patch would do nothing.
(0008825)
lefessan (developer)
2013-02-07 10:37

The two patches are independant. From what I understood, this one lazily applies the substitution, but if there are two opens, then the substitution is (lazily) applied twice. Maybe such laziness is enough to solve the problem of multiple opens, it would need more investigation to know.

Anyway, I think both patches should be applied for now.
(0008827)
frisch (developer)
2013-02-07 11:15

I'm a bit worried by the introduction of a lazy component in the environment, which needs to be serialized. Shouldn't the "explicit lazy" mechanism that you (Fabrice) introduced be used as well here?
(0008828)
sliquister (reporter)
2013-02-07 15:33

I had the impression that the environment was not serialized, but I looked at at cmi files. Do you mean serialized in cmt files?
(0008830)
garrigue (manager)
2013-02-08 03:17

Actually, if I remember correctly the discussion we had about binannot, the full environment is not serialized, only the env_summary field is.
So I'm not sure that EnvLazy is useful at all.
Why is it there ?
(0008831)
frisch (developer)
2013-02-08 06:44

Precisely, the patch adds laziness to the summary:

- | Env_module of summary * Ident.t * module_type
+ | Env_module of summary * Ident.t * module_type Lazy.t
(0008832)
sliquister (reporter)
2013-02-08 14:42

Well it adds laziness both to the summary and to the environment. I can change that but what is the problem with serializing lazy values?
(0008833)
garrigue (manager)
2013-02-08 15:18

It is just that a non-evaluated lazy value contains a closure, which cannot be serialized.
The EnvLazy module avoids that by providing the evaluation function explicitly, at extraction time.
So it could solve that problem.
Another solution would be to force the evaluation of these lazy values before serialization, but then you get back your performance problem if you use binannot.
(0008834)
frisch (developer)
2013-02-08 15:30

AFAIK, environment summaries are also used by the bytecode debugger.
(0008883)
sliquister (reporter)
2013-02-20 15:18

It took me some time, but I attached a new version of the patch. This one was written against a recent version of trunk, so the conflicts with Fabrice's changes have been solved.
I replaced the lazy by an EnvLazy. Since Env.summary contains this lazy value and it is exposed, I could have exposed the Env module altogether but I thought it was simpler to just expose a function to force these lazy values.

I retried the whole tree build and the performance improvement is the same.

What is new though is that now my /tmp contains 50000 ocamlpp* files (and I don't compile anything else on that machine), but I did not investigate that.
(0008978)
gasche (developer)
2013-03-17 17:40
edited on: 2013-03-17 17:42

What is the status of this patch? Alain, Fabrice, do you think it could be applied now that EnvLazy is used?

PS: the ocamlpp* problem was independently submitted by sliquister (thanks!) as 0005930 and is now solved.

(0008989)
sliquister (reporter)
2013-03-17 20:34

Just a note: this patch would probably be much less useful if we had namespaces, assuming the design of namespaces allowed to only load (thus substitute) the submodules of packed modules that are referenced (or the parts of whatever would replace packed modules).
I didn't know namespaces where in the works when I started that, and I don't know how sure it is that they will be incorporated in caml and how far away they are.
(0008991)
frisch (developer)
2013-03-18 09:34

I don't have anything against the patch but I won't have time to review it soon. Gabriel: if you review the patch and if you feel comfortable with the patch, I think you can apply it.
(0009779)
frisch (developer)
2013-07-15 17:02
edited on: 2013-07-15 17:20

I've uploaded a new version of the patch adapted to current trunk (resolving conflicts with 0005965 and 0005980).

I'm still unable to produce a micro-benchmark to illustrate the gain. I've tried generating huge nested structures and opening only a sub-structure but I could not observe a difference in compilation time (yet).

(0010001)
sliquister (reporter)
2013-07-30 00:24

I would need to look at the patch again, but right now I can't compile the compiler at home. I think the description I initially gave was more general than what the patch actually does. Have you tried:

a.ml: big structure
b.ml: empty file
c.ml: module A = A module B = B
pack.cmi: pack of a.cmi b.cmi c.cmi

and then comparing the compilation time of a file containing only 'open C.B' or only 'open C.A' (both should be slow without the patch whereas the first one should be fast with the patch).
(0010373)
frisch (developer)
2013-09-18 12:23

a.ml: let x1 = 1;; let x2 = 2;; ...;; let x10000 = 10000
b.ml: (* empty file *)
c.ml: module A = A module B = B
da.ml: open C.A;; open C.A;; open C.A;; (* 10 times *)
db.ml: open C.B;; open C.B;; open C.B;; (* 10 times *)

Compiling with trunk:

$ time ocamlc -c da.ml
ocamlc -c da.ml 0.48s user 0.02s system 101% cpu 0.493 total
$ time ocamlc -c db.ml
ocamlc -c db.ml 0.02s user 0.00s system 99% cpu 0.020 total


So "open C.B" already seems to be quite fast. To eliminate the effect of the memoization patch, I've also tried with a single "open C.A" vs "open C.B":

$ time ocamlc -c da.ml
ocamlc -c da.ml 0.11s user 0.00s system 106% cpu 0.103 total
$ time ocamlc -c db.ml
ocamlc -c db.ml 0.02s user 0.00s system 99% cpu 0.020 total
$ time ocamlc.opt -c da.ml
ocamlc.opt -c da.ml 0.03s user 0.01s system 100% cpu 0.040 total
$ time ocamlc.opt -c db.ml
ocamlc.opt -c db.ml 0.01s user 0.01s system 129% cpu 0.015 total

Even there, "open C.B" is much faster than "open C.A" (without the patch).

Could you run your experiment again on your large code base to confirm the impact of the patch?

- Issue History
Date Modified Username Field Change
2013-02-07 06:24 sliquister New Issue
2013-02-07 06:24 sliquister File Added: lazy_subst.diff
2013-02-07 06:36 frisch Relationship added has duplicate 0005877
2013-02-07 06:39 frisch Note Added: 0008823
2013-02-07 07:07 sliquister Note Added: 0008824
2013-02-07 10:37 lefessan Note Added: 0008825
2013-02-07 11:15 frisch Note Added: 0008827
2013-02-07 15:33 sliquister Note Added: 0008828
2013-02-08 03:17 garrigue Note Added: 0008830
2013-02-08 06:44 frisch Note Added: 0008831
2013-02-08 14:42 sliquister Note Added: 0008832
2013-02-08 15:18 garrigue Note Added: 0008833
2013-02-08 15:30 frisch Note Added: 0008834
2013-02-20 15:18 sliquister Note Added: 0008883
2013-02-20 15:18 sliquister File Added: lazy-subst2.diff
2013-03-17 17:40 gasche Note Added: 0008978
2013-03-17 17:42 gasche Note Edited: 0008978 View Revisions
2013-03-17 20:34 sliquister Note Added: 0008989
2013-03-18 09:34 frisch Note Added: 0008991
2013-04-15 12:13 lefessan Assigned To => lefessan
2013-04-15 12:13 lefessan Status new => confirmed
2013-07-12 09:30 doligez Target Version => 4.01.0+dev
2013-07-15 17:01 frisch Assigned To lefessan => frisch
2013-07-15 17:01 frisch Status confirmed => assigned
2013-07-15 17:02 frisch Note Added: 0009779
2013-07-15 17:18 frisch File Added: lazy-subst3.diff
2013-07-15 17:20 frisch Note Edited: 0009779 View Revisions
2013-07-16 16:25 frisch Target Version 4.01.0+dev => 4.01.1+dev
2013-07-30 00:24 sliquister Note Added: 0010001
2013-09-18 12:23 frisch Note Added: 0010373
2013-09-18 12:43 frisch Target Version 4.01.1+dev => after-4.02.0
2013-11-14 10:41 frisch Assigned To frisch =>
2013-11-14 10:41 frisch Status assigned => acknowledged
2013-12-16 14:16 doligez Tag Attached: patch
2014-05-25 19:52 doligez Target Version after-4.02.0 => 4.03.0+dev


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker