ocamltest: parallel build failure in 4.06 branch #7649

Closed
vicuna opened this issue Oct 2, 2017 · 7 comments

Comments

@vicuna

vicuna commented Oct 2, 2017

Original bug ID: 7649
Reporter: @gasche
Status: acknowledged (set by @damiendoligez on 2017-10-02T11:47:56Z)
Resolution: open
Priority: normal
Severity: minor
Version: 4.06.0 +dev/beta1/beta2/rc1
Category: configure and build/install

Bug description

When running "make world.opt -j5" in the 4.06 branch, I sometimes get the following failure:

File "none", line 1:
Error: Files testlib.cmx and ../compilerlibs/ocamlcommon.cmxa
make inconsistent assumptions over implementation Location
Makefile:136: recipe for target 'ocamltest.opt' failed
make[3]: *** [ocamltest.opt] Error 2
make[3]: Leaving directory '/home/gasche/Prog/ocaml/github-4.06/ocamltest'

@vicuna
Author

vicuna commented Oct 7, 2017

Comment author: @gasche

The error message seems to indicate that there is a race to location.cmx happening during the build.

ocamltest includes modules directly from (see ocamltest/Makefile):

directories = ../utils ../parsing ../stdlib ../compilerlibs

I first looked for a race between ocamltest and compilerlibs/ocamlcommon.cmxa themselves, but this is not possible: ocamlc.opt is built sequentially before ocamltest.opt in opt.opt, and ocamlc.opt depends on ocamlcommon.cmxa.

Grepping for location.cmx in the .depend files shows which other targets may trample the location.cmx file:

$ git grep --files-with-matches location.cmx **depend
.depend
debugger/.depend
ocamldoc/.depend
tools/.depend

There is no native build of the debugger, so I am making the hypothesis that toolsopt.opt and ocamldoc.opt are suspect (this is incomplete: it could also be .depend, or a transitive dependency).

Indeed, opt.opt runs both toolsopt and ocamldoc.opt in parallel with ocamltest.opt:

$(MAKE) ocamllex.opt ocamltoolsopt ocamltoolsopt.opt $(OCAMLDOC_OPT) \
  ocamltest.opt

I would propose to avoid the race by sequentializing more:

$(MAKE) ocamllex.opt ocamltoolsopt
$(MAKE) ocamltoolsopt.opt
if test -n "$(OCAMLDOC_OPT)"; then \
  $(MAKE) $(OCAMLDOC_OPT); \
fi
$(MAKE) ocamltest.opt

This is obviously safe -- I'm considering proposing such a patch soon, because broken parallel builds are very annoying.

Note that I think a better long-term solution would be to have ocamltools, ocamldep and ocamltest all depend on ocamlcommon.cmxa only, instead of using files from the typing/ and parsing/ directories directly. But this would be a more invasive change.

@vicuna
Author

vicuna commented Oct 7, 2017

Comment author: @xavierleroy

Go ahead with the proposed change. No need for a pull request.

These parallel make problems are elusive. I ran your "make world.opt -j5" in a loop on the barsac server and killed it after 100 runs because nothing happened. It can fail almost consistently for you and almost never for everyone else.
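A loop of this kind can be sketched as follows. This is only an illustration, not the script that was actually run: the helper name run_until_failure and the run-N.log file names are hypothetical, and the log of the failing run is kept so that it can be examined afterwards.

```shell
# Hypothetical helper: run a command up to N times and stop at the first failure.
# Usage: run_until_failure N command [args...]
run_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        # Capture stdout and stderr of this run in its own log file.
        if ! "$@" > "run-$i.log" 2>&1; then
            echo "failed on run $i (log kept in run-$i.log)"
            return 1
        fi
        # Successful runs are not interesting; drop their logs.
        rm -f "run-$i.log"
        i=$((i + 1))
    done
    echo "no failure in $max runs"
    return 0
}
```

Assuming the tree is cleaned between runs, it could be invoked as: run_until_failure 100 sh -c 'make clean && make world.opt -j5'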

@vicuna
Author

vicuna commented Oct 13, 2017

Comment author: @alainfrisch

Note that I think a better long-term solution would be to have ocamltools, ocamldep and ocamltest all depend on ocamlcommon.cmxa only

I think ocamldep already links against compiler-libs. Are you suggesting that we arrange things so that it doesn't "see" the .cmx files from typing/ and parsing/?

@vicuna
Author

vicuna commented Oct 13, 2017

Comment author: @gasche

Aside: I have thought more about the build problem and I think my explanation above is probably wrong: ocamlc.opt is sequentially-before all these parallel targets in opt.opt, and it builds location.cmx, so there is no reason that the parallel targets would race with each other unless they provoke a rebuild (and the rebuild would come from a .cmi race during ocamlc.opt or earlier, I suppose?).

Alain, I suppose that it is ok for them to "see" the .cmx files in other directories, but they should never build them, and they should only be run when those .cmx files have been built (which is guaranteed by depending on the compiler-libs .cmxa). Right now with the .depend that mentions files in ../parsing etc., for all I know the ocamldep Makefile may trigger a rebuild of location.cmx (using its own compilation flags).

@vicuna
Author

vicuna commented Oct 20, 2017

Comment author: @damiendoligez

Is there any way to find the reason for the failure by examining the log of a failed build?
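One possible check, sketched here under assumptions: if the saved log contains the echoed compiler command lines, counting how many of them compile location.ml with ocamlopt would show whether location.cmx was rebuilt by a second rule. The function name count_native_compiles and the exact shape of the command lines are hypothetical; the pattern would need adjusting to what the real log looks like.

```shell
# Hypothetical helper: count the log lines that look like a native-code
# compilation of a given source file.
# Usage: count_native_compiles LOGFILE SOURCE   (e.g. SOURCE=location.ml)
count_native_compiles() {
    log=$1
    src=$2
    # Match lines that invoke ocamlopt and name SOURCE as a whole path component.
    # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
    grep -c -E "ocamlopt.*[ /]${src}( |\$)" "$log" || true
}
```

A count above 1 for location.ml in a failed log would support the rebuild hypothesis; a count of 1 would point elsewhere.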

@github-actions

github-actions bot commented May 7, 2020

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label May 7, 2020
@xavierleroy
Contributor

High-parallelism builds are now tested as part of the Jenkins CI and seem to work. I'm optimistically closing this PR.
