|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0006462||OCaml||OCaml runtime system||public||2014-06-18 21:26||2015-02-09 13:18|
|Platform||amd64||OS||Linux||OS Version||Ubuntu 12.04|
|Target Version||4.03.0+dev||Fixed in Version|
|Summary||0006462: Dynlinking duplicate module clobbers host program state|
|Description||If you inadvertently duplicate a module between the executable and a dynamically loaded library, for example by adding an extraneous -linkpkg when building a .cmxs, loading the library will "re-initialize" the static data owned by the executable's copy of the module.|
I've attached a tarball which demonstrates this. I would expect there to be a private copy of "myval" in lib2.cmxs, so that main continues to see 69105 instead of 42. (Alternatively, there could be an explicit treatment of symbol visibility and overriding, so that the user can control what happens, but that seems to be opening a can of worms.)
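The clobbering can be modelled in a single file. The following is only a sketch, not the contents of the attached tarball and not how ocamlopt or the system linker work internally: the table `symbols`, the function `load_unit`, and the key "Lib1.myval" are hypothetical names, and it assumes that myval is initialized to 42 and later set to 69105 by main, which is one reading of the numbers quoted above. Each name owns exactly one slot, and loading a second copy of the module re-runs its initializer on that same slot.

```ocaml
(* Hypothetical model of a flat symbol namespace: one storage slot per
   name, shared by every copy of the module that defines it. *)
let symbols : (string, int ref) Hashtbl.t = Hashtbl.create 8

(* "Loading" a unit runs its initializer. If the name is already bound,
   the existing slot is overwritten in place. *)
let load_unit name init_value =
  match Hashtbl.find_opt symbols name with
  | Some slot ->
    slot := init_value;
    slot
  | None ->
    let slot = ref init_value in
    Hashtbl.add symbols name slot;
    slot

let run_demo () =
  (* The executable's copy of the module is initialized (assumed: 42). *)
  let myval = load_unit "Lib1.myval" 42 in
  (* The host program updates its own state. *)
  myval := 69105;
  (* A plugin that mistakenly embeds the same module is loaded, and
     its initializer runs again on the shared slot. *)
  let _plugin_view = load_unit "Lib1.myval" 42 in
  !myval

let () = Printf.printf "main sees myval = %d\n" (run_demo ())
(* prints: main sees myval = 42 *)
```

With a private copy per library, the second `load_unit` would allocate a fresh slot and main would still read 69105.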
This is related to issue 0004839, but applies even if you have compatible signatures. It's not a type-correctness problem so much as a general semantic bug.
It is worth mentioning that this also seems to make the GC corrupt the program. Perhaps the root set gets clobbered somehow? The smallest example I have is a null CIL plugin, which is also included in the tarball -- "make run-cilly". This segfaults on my machine. If you dig around in gdb using watchpoints, you find that the storage allocated by the second initializer (e.g. try watching Pretty.aligns, which for me is at &camlPretty + 0x190 bytes) gets silently re-used as if it were unreachable (e.g. I have seen it being updated to point to a function, not a list, which is clearly wrong). Since the old pointer is still live, this quickly crashes the program. I'll be happy to help anybody reproduce this.
|Steps To Reproduce||Extract the tarball and run make.|
To see the GC problem, make sure you have CIL installed and then run make run-cilly.
|Additional Information||I was hoping the simple test case would illustrate the GC problems too, which is why I made it run in a loop and keep allocating... but it doesn't crash for me.|
|Tags||No tags attached.|
|Attached Files||ocaml-dynlink-clobber-bug.tar.gz (1,126 bytes) 2014-06-18 21:26|
|Note 0013243||doligez||2015-02-06 19:07|
> I would expect there to be a private copy of "myval" in lib2.cmxs
We would like to do that, but the Unix linker does not support it: its namespace is desperately flat, and it provides no renaming facilities. GNU binutils does provide renaming for object files, but that introduces its own set of problems. BTW, this is also the reason for the existence of the -for-pack option.
For the GC-related crash, I'm not surprised: the second copy of the module will indeed clobber the roots of the first. I'm not sure how this leads to a dangling pointer, but that's probably not worth investigating.
The only solution I can see is to forbid dynlinking a module that has the same name as an existing module (static or dynlinked). I just hope this can be done without big changes to the compiler.
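The proposed check can be sketched as follows. This is an illustration under stated assumptions, not the runtime's or the Dynlink library's actual mechanism: the set `loaded`, the exception `Duplicate_unit`, and the function `check_and_register` are hypothetical names, and the unit names are made up. The point it shows is that every unit name in a plugin is checked before any initializer would run, so a refused plugin leaves the host's state untouched.

```ocaml
module StringSet = Set.Make (String)

(* Hypothetical error raised when a plugin redefines a known unit. *)
exception Duplicate_unit of string

(* Names of units already present, whether statically linked or
   dynlinked earlier. The initial contents are made up. *)
let loaded = ref (StringSet.of_list [ "Main"; "Lib1" ])

(* Refuse the whole plugin if any of its units is already present;
   otherwise record all of them. Nothing is recorded on failure. *)
let check_and_register plugin_units =
  List.iter
    (fun u -> if StringSet.mem u !loaded then raise (Duplicate_unit u))
    plugin_units;
  loaded :=
    List.fold_left (fun acc u -> StringSet.add u acc) !loaded plugin_units

let () =
  (* A plugin containing only new units is accepted. *)
  check_and_register [ "Lib2" ];
  (* A plugin that also embeds Lib1 is refused as a whole. *)
  match check_and_register [ "Lib3"; "Lib1" ] with
  | () -> print_endline "loaded"
  | exception Duplicate_unit name ->
    Printf.printf "refused: %s is already loaded\n" name
(* prints: refused: Lib1 is already loaded *)
```

Checking by name alone is deliberately coarse: it also rejects a duplicate whose interface and implementation are identical to the copy already loaded.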
|Note 0013261||stephenrkell||2015-02-09 13:18|
In hindsight I was over-hasty to say that lib2 should have a private instance of myval. It's more reasonable for there to be a unique global myval. So the real issue is: why does lib2 want to initialise state belonging to lib1? Surely lib1, and only lib1, should do that? There's nothing in Unix linking that prevents this; it's the usual behaviour.
(Of course it should be possible to use static linking to arrange that lib2 has a private copy of myval. But that is a separate configuration and probably should not be the default.)
I don't think namespacing is the issue. Although quirky, there is quite a bit of namespacing support available in Unix linkers: symbol scope, symbol visibility and (at dynamic-load time) RTLD_LOCAL.
If I'm understanding -pack correctly, it's like ld -r (relocatable output). With both this and the problem I reported, it seems as though ocamlopt is duplicating linker functionality in a way that adds overall complexity. What are the reasons for doing this, rather than using the linker's actual features directly? Would it be possible/helpful to write down a mapping from OCaml's linking semantics to ELF linker features? I could potentially help with such an effort, since I know a little about linkers.
|2014-06-18 21:26||stephenrkell||New Issue|
|2014-06-18 21:26||stephenrkell||File Added: ocaml-dynlink-clobber-bug.tar.gz|
|2014-07-16 10:22||doligez||Relationship added||related to 0004839|
|2014-07-16 10:23||doligez||Status||new => acknowledged|
|2014-07-16 16:43||doligez||Target Version||=> 4.02.1+dev|
|2014-09-04 00:25||doligez||Target Version||4.02.1+dev => undecided|
|2014-09-14 22:38||doligez||Target Version||undecided => 4.02.2+dev|
|2015-02-06 18:56||doligez||Description Updated||View Revisions|
|2015-02-06 19:07||doligez||Note Added: 0013243|
|2015-02-06 19:07||doligez||Target Version||4.02.2+dev => 4.03.0+dev|
|2015-02-09 13:18||stephenrkell||Note Added: 0013261|