Excessive memory consumption while compiling the OCaml distribution #6001

Closed
vicuna opened this issue Apr 27, 2013 · 12 comments

Comments

vicuna commented Apr 27, 2013

Original bug ID: 6001
Reporter: hgouraud
Assigned to: @bobzhang
Status: closed (set by @bobzhang on 2013-06-06T01:34:23Z)
Resolution: fixed
Priority: normal
Severity: minor
Platform: 1&1 mutualised server
OS: Linux
OS Version: Debian GNU/Linux
Version: 4.00.1
Category: configure and build/install
Tags: compiler-time-or-space-regression
Child of: #7630
Monitored by: @bobzhang @alainfrisch

Bug description

During make world, on a 1&1 mutualised server (restricted memory)

Fatal error: exception Out_of_memory
Exit code 2 while executing this command:
../ocamlcomp.sh -c -g -warn-error A -w a -I camlp4/boot -I camlp4 -I stdlib -o camlp4/boot/camlp4boot.cmo camlp4/boot/camlp4boot.ml
make[1]: *** [camlp4out] Error 2
make[1]: Leaving directory `/homepages/30/d456759237/htdocs/ocaml-4.00.1'
make: *** [world] Error 2

File attachments

vicuna commented Apr 27, 2013

Comment author: @gasche

This is not a "bug" per se: your particular environment (a mutualized server) does not provide you enough memory to compile Camlp4 (camlp4boot.ml is the bootstrap file that is indeed quite large and takes a good portion of the compilation time and memory for Camlp4).

As far as I know, the time and memory consumption of the OCaml distribution is extremely reasonable when compared to other software projects (LLVM, GHC, Java... not to mention Firefox or LibreOffice). If it doesn't compile on your environment, we'd rather blame it.

You could try to:

  • cross-compile Camlp4 (or OCaml altogether): compile it from a computer having a similar development environment, and move the resulting binaries to your server
  • compile OCaml without Camlp4 (-no-camlp4 option during ./configure)
  • tweak the compilation settings, hoping to consume less memory (I see that the -g flag is passed; it may be better to compile without it)

Any of these may be better discussed on the mailing-list rather than the bugtracker: I'm sure some people reading the list have done some of these, and they don't necessarily participate on the bugtracker.
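
As a rough sketch of the second option above: the function name build_without_camlp4 and its optional source-directory argument are made up for illustration; only the -no-camlp4 configure option and the make world target come from this report. It assumes an unpacked OCaml 4.00.x source tree.

```shell
# Hypothetical helper: configure and build OCaml with Camlp4 disabled.
# $1 is the path to the OCaml source tree (defaults to the current directory).
# The body runs in a subshell, so the caller's working directory is unchanged.
build_without_camlp4() (
    cd "${1:-.}" || exit 1
    # -no-camlp4 skips Camlp4 entirely, so camlp4boot.ml is never compiled.
    ./configure -no-camlp4 && make world
)
```

Usage would look like `build_without_camlp4 ~/ocaml-4.00.1`. The function returns a non-zero status if the directory does not exist or any build step fails.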

vicuna commented May 28, 2013

Comment author: @gasche

For the record, I've heard several other reports of Camlp4's compilation consuming too much memory, and I now think it may be interesting to look for ways to reduce it. If time allows, I will have a look, but I welcome any previous experience on monitoring the memory consumption of the OCaml compiler, or help in that matter.

vicuna commented May 28, 2013

Comment author: @bobzhang

The bootstrapping model for Camlp4 is that it tries to marshal all files into one single module, resulting in a very large file, camlp4boot.ml, 55163 lines. Is this the cause of the problem?

vicuna commented May 28, 2013

Comment author: @gasche

Yes, I believe that this is the file whose compilation consumes so much memory; if you have insights on non-invasive changes to split it up, that would be interesting. However, given that I don't remember hearing of such memory problems before 4.00, it might also be a regression in memory use by the compiler (related to new warnings or something else), which would be worrying as well and worth tracking down.

vicuna commented May 28, 2013

Comment author: @bobzhang

It turns out that the following command takes up to 700M of memory. The issue has nothing to do with Camlp4 itself, since camlp4boot.ml is simply a plain OCaml file.

ocamlopt.opt -c camlp4boot.ml


Steps to reproduce

Assuming you are at the top level of the compiler source tree:

ocaml>ocamlbuild camlp4/Camlp4_import.ml
ocaml>cp _build/camlp4/Camlp4_import.ml camlp4/Camlp4_config.ml camlp4/Camlp4_config.mli camlp4/boot/
ocaml>cd camlp4/boot/
boot>ocamlopt.opt dynlink.cmxa Camlp4_import.ml Camlp4_config.mli Camlp4_config.ml Camlp4.ml camlp4boot.ml

I tried to test it under different compiler versions, but OCaml 3.11.2 is broken on my machine. If you have OCaml 3.11.2 installed, it would be very helpful to compile the same source code with it (it should compile) and report the result.

Other issues:
It's possible to change the build system without breaking compatibility, but doing so may invalidate the knowledge other maintainers have about Camlp4's boot system. I am not sure whether it's worth it, though.

vicuna commented May 31, 2013

Comment author: @bobzhang

I think this is probably a bug in the compiler; 50,000 lines of code is not that big in reality.

vicuna commented Jun 4, 2013

Comment author: @gasche

A memory usage analysis (/usr/bin/time -v for peak memory usage, http://ygrek.org.ua/p/code/pmpa for finer-grained traces of the code that allocates) indicates that the problem is related to the long sequence of Grammar.extend calls in Camlp4OCamlRevisedParser.ml (a sub-file of camlp4boot.ml), which provokes bad behavior in the register allocator: constructing the coloring graph consumes hundreds of megabytes of memory.
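
For anyone wanting to repeat the peak-memory measurement, here is a minimal sketch. It assumes GNU time is installed at /usr/bin/time (the shell builtin `time` does not accept -v); the helper name peak_rss_kb is made up for illustration.

```shell
# Hypothetical helper: print the peak resident set size (in kilobytes)
# of the command given as arguments, as reported by GNU time.
peak_rss_kb() {
    # GNU time writes its report to stderr. Send stderr into the pipe,
    # discard the command's own stdout, and keep only the peak RSS line.
    /usr/bin/time -v "$@" 2>&1 >/dev/null |
        awk -F': ' '/Maximum resident set size/ { print $2 }'
}
```

For example, `peak_rss_kb ocamlopt.opt -c camlp4boot.ml` should print a single number, which can be compared across compiler versions or before and after a patch.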

As with all problems where unnaturally long code sequences are passed to the OCaml compiler, a possible fix is to change the code generator to break these sequences into smaller pieces. That is exactly what I did in the attached patch, and it does seem to solve the memory consumption issue for camlp4boot.ml (it remains to check that this is the only issue).

The patch changes the way the EXTEND syntactic sugar works, to avoid making too many Grammar.extend calls in a sequence.

This is only a preliminary attempt at solving the problem (and the code is in a very rough state). It would be more interesting to see if the compiler itself can be fixed, without a change in readability. That said, it may be interesting to include a simpler hack like this one in time for the next release.

Hongbo (or whoever is interested), could you comment on whether this patch indeed fixes the memory consumption problem?

One difficulty is that one needs to perform a Camlp4 bootstrap cycle for the changes to affect camlp4boot.ml. As bootstrapping Camlp4 is not easy, I have also uploaded a patch that does this camlp4 bootstrap -- to be applied on top of trunk.

vicuna commented Jun 4, 2013

Comment author: @alainfrisch

Would something like #4074 improve the situation?

vicuna commented Jun 5, 2013

Comment author: @bobzhang

gasche: I uploaded a simplified patch that does not change the code generator. Now ocamlc.opt and ocamlopt.opt consume the same amount of memory, around 200M on my machine. But 200M of memory may still seem like a lot?

alain: could you provide a patch against the current trunk?

Thanks

vicuna commented Jun 5, 2013

Comment author: @gasche

Alain, I think that the problem is orthogonal, as the "register use worst case" is here happening entirely inside one expression, not spread among several structure items.

Hongbo, your patch is indeed much simpler. Mine had the advantage of also fixing the problem for users writing large grammars with EXTEND, but that's a secondary aspect at best because I'm not sure there are actual uses of this.

I think you should push your patch upstream; there is really no reason not to, as it is nicely orthogonal to any solution we could find to the graph-coloring memory use issue.

vicuna commented Jun 5, 2013

Comment author: @alainfrisch

> Alain, I think that the problem is orthogonal, as the "register use worst case" is here happening entirely inside one expression, not spread among several structure items.

I see, thanks.

It would be interesting to see how the linear scan allocator (#5324) would behave on that file.

vicuna commented Jun 6, 2013

Comment author: @bobzhang

Fixed in r13749
