Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006030OCamlOCaml typingpublic2013-06-04 14:502013-06-04 21:24
Reportergmelquiond 
Assigned Tofrisch 
PrioritynormalSeveritymajorReproducibilityalways
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version4.00.1 
Target VersionFixed in Version4.01.0+dev 
Summary0006030: Dramatic performance increase when compiling with annotations
DescriptionWhen compiling the core library of Why3 (lib/why3/why3.cma) on my computer, it takes about 28 seconds. What is interesting is the ratio between user time and system time: actual compilation takes 7 seconds, while the kernel wastes 21 seconds outputting files.

A look at the top four entries of the kernel profile shows:

 49,56% ocamlc.opt [kernel.kallsyms] [k] aes_encrypt
  7,17% ocamlc.opt [kernel.kallsyms] [k] memcpy
  4,08% ocamlc.opt [kernel.kallsyms] [k] crypto_xor
  1,82% ocamlc.opt [kernel.kallsyms] [k] crypto_cbc_encrypt

(As you can guess from the profile, the disk is encrypted.)

If we now take a look at the compilation of a single file, e.g. src/core/term.cmo, strace shows that ocamlc does 47,000 writes to src/core/term.annot. In some cases, it performs one system call per two characters!

The culprits are the print_ident_annot and print_info functions from typing/stype.ml. They contain format strings such as "@.)@.", which are definitely a recipe for disaster.

With the attached patch, overall compilation speed is multiplied by 3.5, system time is divided by 12, and the kernel profile now looks a lot saner:

  6,60% ocamlc.opt ocamlc.opt [.] mark_slice
  5,21% ocamlc.opt ocamlc.opt [.] caml_page_table_lookup
  3,74% ocamlc.opt ocamlc.opt [.] caml_oldify_one
  2,48% ocamlc.opt [kernel.kallsyms] [k] aes_encrypt

Please apply.
TagsNo tags attached.
Attached Filespatch file icon stypes.patch [^] (2,092 bytes) 2013-06-04 14:50 [Show Content]

- Relationships

-  Notes
(0009388)
frisch (developer)
2013-06-04 14:55

You should try the trunk version. It has already been optimized a lot, after observing (as you did) how slow the previous version was.

Commits:

http://caml.inria.fr/cgi-bin/viewvc.cgi?view=revision&revision=13070 [^]

http://caml.inria.fr/cgi-bin/viewvc.cgi?view=revision&revision=13071 [^]
(0009389)
gasche (developer)
2013-06-04 14:57

Alain has already made those changes in trunk, commits 13070 and 13071. Can you confirm that compiling trunk as-is solves your problem?
(0009391)
gmelquiond (reporter)
2013-06-04 15:20

This is a bit better with trunk, but nowhere near as good as with my patch. The profile with trunk looks like:

 35,36% ocamlc.opt [kernel.kallsyms] [k] aes_encrypt
  4,84% ocamlc.opt [kernel.kallsyms] [k] memcpy
  3,80% ocamlc.opt ocamlc.opt [.] mark_slice
  2,98% ocamlc.opt [kernel.kallsyms] [k] crypto_xor

System time is at 8 seconds, while total time is at 14 seconds. A decent compiler should never waste 60% of its time doing I/O. (With my patch, system time is at 1.5 seconds.)

For instance, if we take a look again at the compilation of src/core/term.cmo in Why3, we see that:

- ocaml-4.00.01 performs 47,000 calls to write,
- ocaml-trunk performs 15,000 calls to write,
- my patch performs only 18 calls to write.
(0009392)
gasche (developer)
2013-06-04 15:33

The only difference I found (by code inspection) that may be relevant to performances is the fact that Alain's code has a flush in the [Ti_expr] case of [print_info], after the opening parenthesis, while Guillaume's patch only uses "@\n" (newline with no flush). If I can validate this hypothesis on some long file, we can validate the change.
(0009393)
frisch (developer)
2013-06-04 15:40

I'm working on it.
(0009394)
frisch (developer)
2013-06-04 15:48
edited on: 2013-06-04 15:52

Commit 13742 on trunk should fix the problem. When compiling typing/typecore.cmo (with -annot), the number of writes is reduced from 29494 to 39.

Can you try it and report how it affects the performances on your example?

(0009395)
gmelquiond (reporter)
2013-06-04 15:48

Note that this is not the only flush in Alain's code; there is also one hidden behind pp_print_newline. Also, I don't know the internals of Format, but the fact that there is no final flush for annotations in trunk worries me a bit. Could annotations be abruptly truncated once the compiler terminates? (If not, then the explicit flush I put in my patch is redundant.)
(0009396)
frisch (developer)
2013-06-04 15:51

> Could annotations be abruptly truncated once the compiler terminates?

Annotations are collected and all dumped at once (Stypes.dump), so I don't see how this could happen, except if you kill the process explicitly.
(0009397)
frisch (developer)
2013-06-04 15:57
edited on: 2013-06-04 15:57

> Note that this is not the only flush in Alain's code; there is also one hidden behind pp_print_newline.

The commit also get rid of this one. All uses of Format are now on Format.str_formatter (no syscall).

(0009398)
gmelquiond (reporter)
2013-06-04 16:00

I understand that they are all dumped at once, but I don't see where the final half-filled buffer is flushed to disk in the code.

Anyway, as far as I am concerned, the number of system calls for writing annotations is now optimal, so the bug can be marked as fixed.
(0009399)
frisch (developer)
2013-06-04 17:10

> I understand that they are all dumped at once, but I don't see where the final half-filled buffer is flushed to disk in the code.

All open out_channels are flushed when the process exits (Pervasives.flush_all is the default exit_function in Pervasives).
(0009400)
frisch (developer)
2013-06-04 17:16

Guillaume, thanks for the detailed report and quick feedback.
(0009402)
frisch (developer)
2013-06-04 17:38

Actually, it would be useful to compare the total time between the trunk and your patch (which is much less invasive that my attempt to reduce the use of Format and of format strings).
(0009404)
gmelquiond (reporter)
2013-06-04 18:44

I have measured again how long it takes to compile why3.cma. There is no doubt about it: your version is so much better than mine. In fact, the huge timing difference on the actual compilation time (20% overhead) kind of frightens me about the performance of Format.

Trunk:

5.30user 1.80system 0:08.72elapsed 81%CPU
5.30user 1.80system 0:08.71elapsed 81%CPU
5.29user 1.78system 0:08.66elapsed 81%CPU
5.25user 1.96system 0:08.84elapsed 81%CPU
5.35user 1.75system 0:08.71elapsed 81%CPU

Trunk + reverting stypes.ml + applying my patch - the final flush:

6.36user 1.79system 0:09.75elapsed 83%CPU
6.34user 1.78system 0:09.72elapsed 83%CPU
6.25user 1.80system 0:09.68elapsed 83%CPU
6.18user 1.92system 0:09.72elapsed 83%CPU
6.34user 1.82system 0:09.77elapsed 83%CPU
(0009405)
frisch (developer)
2013-06-04 21:24

Thanks again for the results. They justify the extra effort put into de-Formatting this piece of code. And indeed, there are reasons to be concerned with the performance of Format and format strings. See:

 http://www.lexifi.com/blog/note-about-performance-printf-and-format [^]

Hopefully, 0006017 will remove some overhead related to parsing format strings at runtime (but not the overhead related to Format itself, I guess).

- Issue History
Date Modified Username Field Change
2013-06-04 14:50 gmelquiond New Issue
2013-06-04 14:50 gmelquiond File Added: stypes.patch
2013-06-04 14:55 frisch Note Added: 0009388
2013-06-04 14:57 gasche Note Added: 0009389
2013-06-04 14:57 gasche Status new => feedback
2013-06-04 15:20 gmelquiond Note Added: 0009391
2013-06-04 15:20 gmelquiond Status feedback => new
2013-06-04 15:33 gasche Note Added: 0009392
2013-06-04 15:38 frisch Assigned To => frisch
2013-06-04 15:38 frisch Status new => assigned
2013-06-04 15:40 frisch Note Added: 0009393
2013-06-04 15:48 frisch Note Added: 0009394
2013-06-04 15:48 gmelquiond Note Added: 0009395
2013-06-04 15:50 frisch Note Edited: 0009394 View Revisions
2013-06-04 15:51 frisch Note Added: 0009396
2013-06-04 15:52 frisch Note Edited: 0009394 View Revisions
2013-06-04 15:57 frisch Note Added: 0009397
2013-06-04 15:57 frisch Note Edited: 0009397 View Revisions
2013-06-04 16:00 gmelquiond Note Added: 0009398
2013-06-04 17:10 frisch Note Added: 0009399
2013-06-04 17:14 frisch Status assigned => resolved
2013-06-04 17:14 frisch Resolution open => fixed
2013-06-04 17:14 frisch Fixed in Version => 4.01.0+dev
2013-06-04 17:16 frisch Note Added: 0009400
2013-06-04 17:38 frisch Note Added: 0009402
2013-06-04 18:44 gmelquiond Note Added: 0009404
2013-06-04 21:24 frisch Note Added: 0009405


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker