Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double linking of bytecode modules #5461

Closed
vicuna opened this issue Jan 3, 2012 · 25 comments · Fixed by #11635
Closed

Double linking of bytecode modules #5461

vicuna opened this issue Jan 3, 2012 · 25 comments · Fixed by #11635

Comments

@vicuna
Copy link

vicuna commented Jan 3, 2012

Original bug ID: 5461
Reporter: gerd
Assigned to: @mshinwell
Status: assigned (set by @alainfrisch on 2017-03-01T15:50:17Z)
Resolution: fixed
Priority: low
Severity: minor
Version: 3.12.1
Fixed in version: 3.13.0+dev
Category: compiler driver
Tags: patch
Has duplicate: #4231
Monitored by: @gasche @protz mehdi @ygrek @glondu @xclerc @jmeber "Julien Signoles" jm @Chris00 ertai @alainfrisch @damiendoligez

Bug description

It is possible to link toplevel modules several times (bytecode only). This leads to errors that are hard to find and to track down, like that exceptions cannot be caught anymore.

Steps to reproduce

--- z.ml ---

let m ex =
match ex with
| Unix.Unix_error(,,_) ->
print_endline "match"
| _ ->
print_endline "nomatch"

--- y.ml ---

let () =
Z.m(Unix.Unix_error(Unix.ENOENT,"",""))

Now link as follows:

$ ocamlc -o z unix.cma z.ml unix.cma y.ml

And:

$ ./z
nomatch

ocamlopt is immune:

ocamlopt -o z unix.cmxa z.ml unix.cmxa y.ml
File "y.ml", line 1, characters 0-1:
Error: Files /opt/godi-3.12/lib/ocaml/std-lib/unix.cmxa
and /opt/godi-3.12/lib/ocaml/std-lib/unix.cmxa
both define a module named Unix

File attachments

@vicuna
Copy link
Author

vicuna commented Jan 3, 2012

Comment author: gerd

Of course it's usually not such a simple ocamlc invocation. The problem can be well hidden, e.g. there can be two libraries l1.cma and l2.cma, and both contain the same module by accident.

As bytecode can be partially linked, it is easy to get into such situations even with standard modules like Unix (the problem report I got showed exactly this problem) - just have funny users who do "ocamlc -a -o out.cma unix.cma mylib.cma".

@vicuna
Copy link
Author

vicuna commented Jan 3, 2012

Comment author: @ygrek

This is indeed a rather hard to find issue for the novice. Even in the ocamlopt case - the error message could be better (e.g. add the check that both files are not only defining the same module but are actually the same file).

@vicuna
Copy link
Author

vicuna commented Jan 3, 2012

Comment author: @gasche

I wrote a quick patch (attached) to tackle the issue, that just reproduces the error logic of asmlink.ml into bytelink.ml. It turns out that such an easy patch does not work. My understanding (which may be wrong) is that the way dependencies between compilation units are detected is fundamentally different between native and bytecode compilation, and is accurate in the native case, but approximate in the bytecode case. While that doesn't make a difference in most cases, here it turns out that the bytecode heuristic is not precise enough, and raises a lot of "multiple definitions" when compiling existing code (the OCaml compiler).

My conclusion is that fixing this is harder than it seems, and it might even not be entirely fixable (without losing the separate compilation properties that .cmo have today, and that .cmx don't have). I would go for a warning, instead of failing with an error, when this situation is (approximately) detected.

Of course, this is limited by my own understanding. A developer from the Ocaml team may provide further information and find a convenient way to fix this. I'm just saying that it does not look simple.

Below is a typical error I get when trying to "make world" with this patch applied:

  • boot/ocamlrun boot/myocamlbuild.boot -tag-line true: -use_stdlib -log _boot_log1 ocamlbuild/ocamlbuildlightlib.cma ocamlbuild/ocamlbuildlight.byte
    ../ocamlcomp.sh -g -I stdlib stdlib/pervasives.cmo ocamlbuild/ocamlbuild_pack.cmo ocamlbuild/ocamlbuildlight.cmo -o ocamlbuild/ocamlbuildlight.byte
    File "none", line 1:
    Error: Files stdlib/pervasives.cmo and ../stdlib/stdlib.cma(Pervasives)
    both define a module named Pervasive

@vicuna
Copy link
Author

vicuna commented Jan 4, 2012

Comment author: @alainfrisch

The problem can also be triggered without libraries and also with ocamlopt (using -pack in this case); actually, it is not only about weird semantics, this can really break the soundness of the type system. This was reported (in French) in #4231. One should also be careful with dynlink (#4229, #4839).

@vicuna
Copy link
Author

vicuna commented Jan 5, 2012

Comment author: @lefessan

Gabriel, the fact that your patch triggers compilation errors for OCaml itself shows that the problem is even worse, since in the example you give, Pervasives is linked twice in ocamlbuildlight.byte (what is then, for example, the behavior of "at_exit", since it appears now twice in the executable ?).

We should actually use your patch to fix these problems first in the distribution, and then maybe add an option -linkonce to trigger your patch only on demand (once namespaces will be available in OCaml, there will be no reason to accept double linking of the same module, since namespaces can be used to solve this issue).

@vicuna
Copy link
Author

vicuna commented Jan 5, 2012

Comment author: @gasche

I agree that those could be bugs, but that is for maintainers to decide at each occurence. I'll ping Xavier Clerc.

I think the simplest course of action would be to create a warning enabled by default, that could be turned off (in the strange cases where this is the intended behavior) or into an error (for sane developpers) by default. I don't think we could have an error right now, if that breaks existing programs.

The reason why I haven't provided a patch with the warning yet is that adding a warning is heavier than adding an error (you need to pick a warning number, etc.), so it's best to know if that is considered a good choice first.

Alain, what do you think? Your #4231 conclusion was rather on the error side. We could also implement the heuristic your suggest for packed modules.

@vicuna
Copy link
Author

vicuna commented Jan 6, 2012

Comment author: @alainfrisch

I've some preference for the error solution; we are talking about a way to circumvent the type system here (-> type unsoundness), and there is no reason to accept it... But there might some code base around which depend on the current behavior and would not be trivial to fix, so a warning could be a good (temporary?) solution.

@vicuna
Copy link
Author

vicuna commented Jan 10, 2012

Comment author: @lefessan

I tried for one hour to tell ocamlbuild to not link with stdlib/pervasives.cmo, but I don't even understand when it decides to link with it (I tried multiple options, -no-stdlib, changing the tag line, -nopervasives, logging and going through the sources). As I don't want to become crazy, could somebody who understand the tool prevent it from double linking pervasives.cmo ?

What about switching to Makefile to build ocamlbuild and camlp4 ? If somebody agrees with it, I can fill a bug report asking for it...

@vicuna
Copy link
Author

vicuna commented Jan 11, 2012

Comment author: @lefessan

Gabriel, I updated your patch to prevent ocamlbuild from linking twice several modules (I had to put all the modules of stdlib in a -ignore '...' argument, but it worked). So, now, we just have to replace the error by a warning to make it ready for inclusion.

@vicuna
Copy link
Author

vicuna commented Jan 11, 2012

Comment author: @lefessan

I added it as warning 31. What should we do now, merge in SVN ?

@vicuna
Copy link
Author

vicuna commented Jan 11, 2012

Comment author: ertai

What about adding the nopervasives tag to ocamlbuildlight.byte in the ocamlbuild/_tags?

I did some light testing and seen the extra pervasives.cmo disapear.

I acknowledge that the handling of stdlib vs nostdlib, pervasives vs nopervasives, ocamlbuildlight vs ocamlbuild, ... makes the whole thing complicated. However I think that any building tool end up like this if we do not make the extra work of simplifying the problem from time to time.

diff ocamlbuild/_tags
-"ocamlbuildlight.byte": -use_unix
+"ocamlbuildlight.byte": -use_unix, nopervasives

@vicuna
Copy link
Author

vicuna commented Jan 12, 2012

Comment author: @gasche

In any case, I think we could separate the fix in two, with the compiler changes in one patch, and eventual ocamlbuild (or anything else) fixes in a second patch. This would make review, blaming etc. easier.

I don't know about the ocamlbuild part, but on the purely-compiler side Fabrice's patch looks good to me -- thanks for taking the time to write it.

@vicuna
Copy link
Author

vicuna commented Jan 17, 2012

Comment author: @lefessan

I applied my last patch, with Nicolas' modification to avoid the double linking while building ocamlbuild_light.byte (commit r12033 in SVN trunk).

@vicuna
Copy link
Author

vicuna commented Feb 10, 2012

Comment author: @gasche

As of the current trunk (12114) I still get some related warnings. I have attached the output of grep "Warning 31" -B 3 on my make world.opt output. There is still one warning for pervasives.cmo for ocamlbuildlight.byte, and a bunch of warnings for camlp4boot.byte.

How should they be fixed?

@vicuna
Copy link
Author

vicuna commented Feb 10, 2012

Comment author: @lefessan

I don't understand: these are the original warnings, that were supposed to be removed by the modification of _tags provided by Nicolas. Moreover, they used to trigger an error, not a warning.

I won't fight again with ocamlbuild. My previous patch (attached) fixes this problem by asking ocamlbuild to ignore stdlib modules. It can still be used unless a more elegant solution is found.

@vicuna
Copy link
Author

vicuna commented Feb 18, 2013

Comment author: @lefessan

Still present in 4.00.1

@vicuna
Copy link
Author

vicuna commented Feb 18, 2013

Comment author: @lefessan

I submitted a fix in revision 13296 of the SVN trunk. Anybody can double check ?

@vicuna
Copy link
Author

vicuna commented Jun 14, 2013

Comment author: @alainfrisch

Now that the compiler is free of warning 31 (I guess!), I'd suggest to turn it into a proper error, since it can lead to segfaults (see example in #4231). We could use OPAM to test that it doesn't break existing projects (or fix them).

@vicuna
Copy link
Author

vicuna commented Jun 18, 2013

Comment author: @damiendoligez

Let's keep it as a warning for 4.01 and think about having "+31" as the default warn-error for 4.02.

@vicuna
Copy link
Author

vicuna commented Jun 18, 2013

Comment author: @alainfrisch

Let's keep it as a warning for 4.01 and think about having "+31" as the default warn-error for 4.02.

Ok for 4.01, but then, shouldn't we turn it into a proper error?

@vicuna
Copy link
Author

vicuna commented Dec 9, 2015

Comment author: @alainfrisch

Warning 31 is now a "warn-error" by default. I'd prefer to have a real error at some point, so I leave this ticket open.

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label Apr 26, 2021
@gasche
Copy link
Member

gasche commented Apr 26, 2021

Warning 31 has been a warn-error for years, and I suspect that the problem of redundant linking has now gone away from the ecosystem. It may now be time to turn warning 31 into a hard error, to match the native-compilation behavior?

@gasche gasche removed the Stale label Apr 26, 2021
@xavierleroy
Copy link
Contributor

Warning 31 has been a warn-error for years, and I suspect that the problem of redundant linking has now gone away from the ecosystem. It may now be time to turn warning 31 into a hard error, to match the native-compilation behavior?

Why don't you go ahead? 5.00 feels like the right time for this kind of changes.

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants