Keeping locations in cmi files #5817

Closed
vicuna opened this issue Nov 9, 2012 · 35 comments

vicuna commented Nov 9, 2012

Original bug ID: 5817
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2015-12-11T18:24:10Z)
Resolution: fixed
Priority: normal
Severity: feature
Fixed in version: 4.02.0+dev
Category: ~DO NOT USE (was: OCaml general)
Monitored by: @jmeber @hcarty

Bug description

Currently, locations of value and type declarations are discarded in .cmi files (cf for_saving mode in subst.ml). Keeping those locations would give several nice features "for free":

  • In case of a mismatch between an .mli interface and its .ml implementation, the location of the two incompatible declarations could be displayed in the error message (currently, only the location in the implementation is shown).

  • The .annot file could trivially show the location of external declarations (either in the .mli file or in the .ml file if there is no .mli). This would make it very easy to implement a "jump to definition" feature in emacs.

  • It would become very simple to implement a global detector of unused declarations (parsing all .cmi and .cmt files of a project and marking used declarations in .cmi files based on their location, found in .cmt files).

The patch is trivial (s/if s.for_saving then Location.none else // in typing/subst.ml, three occurrences). This is more a matter of deciding whether it's a good idea to keep those locations or not. The only (minor) drawback I can see is that changing a comment or whitespace in the .mli file will change the resulting .cmi file and force more recompilation (for build systems based on file content/digest). Of course, this could be decided by a command-line switch.
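
To make the proposed substitution concrete, here is a minimal toy model of the location-handling step. The types and function names are simplified stand-ins invented for illustration; they are not the actual definitions in typing/subst.ml.

```ocaml
(* Toy model only: the real types live in the compiler's own modules. *)
type loc = { file : string; line : int }

(* Stand-in for a "no location" value. *)
let none = { file = "_none_"; line = 0 }

type subst = { for_saving : bool }

(* Current behaviour: locations are erased when a signature is prepared
   for saving into a .cmi file. *)
let loc_current s l = if s.for_saving then none else l

(* Proposed behaviour: the location is kept unconditionally. *)
let loc_proposed (_ : subst) l = l

let () =
  let l = { file = "a.mli"; line = 12 } in
  let s = { for_saving = true } in
  let c = loc_current s l and p = loc_proposed s l in
  Printf.printf "current:  %s:%d\n" c.file c.line;
  Printf.printf "proposed: %s:%d\n" p.file p.line
```

In this model the whole change amounts to dropping the `if s.for_saving then none else` prefix, which matches the sed-style pattern quoted above.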

vicuna commented Nov 9, 2012

Comment author: @bobzhang

This is great. Something I have thought about for a while is that it would be helpful to add documentation strings to the .cmi file. Then, as in Python, you could get help from the toplevel.

vicuna commented Nov 9, 2012

Comment author: @gasche

> The only (minor) drawback I can see is that changing a comment or whitespace in the .mli file will change the resulting .cmi file and force more recompilation (for build system based on file content/digest).

I'm rather favorable to the change you propose, but this is a major drawback rather than a minor one. How easy is it to access the internal interface checksum for an external program? It might be that those build systems could be converted to asking the internal checksum instead of hashing the .cmi to decide recompilation.

I think the other option would be to generate a .cmti file along with the .cmt, that contains the typedtree of the signature if it exists (otherwise the tool needs to look into the .cmt). I realize that the change would maybe not be as simple, but it seems more principled and less dangerous tool-wise.

vicuna commented Nov 9, 2012

Comment author: @lefessan

The change in the checksum does not only force a rebuild of the library; it forces a rebuild of everything. For example, adding a documentation comment in a .mli file would force a binary package maintainer to rebuild all the packages depending on that interface.

Moreover, interfaces are sometimes copied from one package to another one. For example, you might want to copy the .mli file of some module from one library to provide an alternative implementation in your library, so that users can decide at the last moment which implementation they want to use. That would be impossible now, as long as the .mli file is changing but not the interface (especially if some file content is generated, such as (* $Id *) comments). It was already made difficult (but not impossible) by cross-module value propagation, it would now be impossible, unless the locations are not used in the checksum computation.

vicuna commented Nov 9, 2012

Comment author: @alainfrisch

> How easy is it to access the internal interface checksum for an external program? It might be that those build systems could be converted to asking the internal checksum instead of hashing the .cmi to decide recompilation.

The internal checksum is really computed from the content of the generated .cmi itself, so this wouldn't help. Moreover, if we store locations in a .cmi file, we want other modules that depend on it to be recompiled, so that they get correct locations in their .cmt/.annot files and error messages.

Do you really think that avoiding recompilation when comments/whitespace change in a .mli file is so important?

> I think the other option would be to generate a .cmti file along with the .cmt

I'm not sure I understand your proposal here. .cmti files are already generated for .mli files compiled with -bin-annot.

vicuna commented Nov 9, 2012

Comment author: @lefessan

So ok, the checksum computation should be done in memory, by first generating an equivalent signature without locations, and finally storing the signature with the locations. It would be a little more expensive in time, probably not much.
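
A minimal sketch of that idea, under stated assumptions: the signature is modelled as a list of toy items, and the "equivalent signature without locations" is a plain canonical string. Neither the record types nor `semantic_text` correspond to anything in the compiler; only `Digest` from the standard library is real.

```ocaml
type loc = { file : string; line : int }
type item = { name : string; ty : string; loc : loc }

(* Canonical text of a signature that deliberately ignores locations. *)
let semantic_text (sg : item list) =
  String.concat ";" (List.map (fun i -> i.name ^ ":" ^ i.ty) sg)

(* Checksum computed in memory from the location-free form only; the
   signature that gets stored would still carry its locations. *)
let interface_digest sg = Digest.to_hex (Digest.string (semantic_text sg))

let () =
  let at line = { file = "a.mli"; line } in
  let before = [ { name = "f"; ty = "int -> int"; loc = at 3 } ] in
  (* Same declaration after a comment was added above it. *)
  let after = [ { name = "f"; ty = "int -> int"; loc = at 5 } ] in
  (* A genuine interface change. *)
  let changed = [ { name = "f"; ty = "int -> bool"; loc = at 3 } ] in
  Printf.printf "comment-only edit keeps the digest: %b\n"
    (interface_digest before = interface_digest after);
  Printf.printf "type change alters the digest: %b\n"
    (interface_digest before <> interface_digest changed)
```

The extra cost mentioned above shows up here as the second traversal of the signature needed to build the location-free form.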

vicuna commented Nov 9, 2012

Comment author: @garrigue

I have one small concern about having the checksum depend on locations: what happens if you change from unix text to dos text and recompile?
Having a different checksum would be a problem, I think.
So I kind of agree with Fabrice: computing the checksum on a copy without locations would be a good idea, even if we eventually keep the locations.
(On the other hand, it is almost impossible to keep the same checksum after even trivial changes in the compiler, so I got used to their being fragile.)

vicuna commented Nov 9, 2012

Comment author: @alainfrisch

Yes, that's doable, but I don't see the point. If a.mli changes textually, then external locations to be stored in b.cmt (where b depends on module A) change as well, so we need to recompile b.ml anyway.

If this is really important, we could introduce a command-line switch, but I'm still not convinced that it's so important to avoid recompilation in this special case (and people still using Makefiles wouldn't see the difference anyway).

vicuna commented Nov 9, 2012

Comment author: @alainfrisch

Oh, I see, you were referring to the digest checksum performed by the compiler. It makes sense, indeed.

vicuna commented Nov 9, 2012

Comment author: @gasche

I had forgotten about (or never looked into) the current .cmti files. My proposal would be to keep a filesystem interface where the .cmi enforces semantic compatibility as tightly as possible, to maximize separate compilation, and expose information aimed at the user in other files (the .cmti). The .cmt* could evolve to become a general place to look at not-strictly-semantics information.

It is not necessary to have this separation (likewise we could actually decide to store the content of the .cmt in the .cm[ox]), as a Sufficiently Smart Tool could always do the right thing to explore it wisely, but the filesystem level is the right boundary choice to interface with other tools, at least in the Unix world.

In short: if you want the fancy stuff, use -bin-annot. I don't think it is a good deal to potentially blow up compilation times each time I change a documentation comment in the utils.mli of my project, in exchange for not having to use -bin-annot for the features you're requesting.

(Of course that means the compiler maintainer, being you in this case, has to consider looking for .cmt* files when printing error messages or other user interactions. Do you think the complexity costs would be excessive?)

vicuna commented Nov 9, 2012

Comment author: @alainfrisch

> Do you think the complexity costs would be excessive?

If I understood your proposal correctly, yes, this would be quite complex to do. Basically, when type-checking in -bin-annot mode, you'd need to load .cmti files in addition to .cmi files and somehow recombine the information. A value lookup would get the value description from the .cmi file and then it will need to find the location in the .cmti file so that the location stored in the Typedtree (hence in the .cmt file) is correct.

This seems rather overkill to me, and rather useless. If you compile a client module in -annot or -bin-annot mode, you will have dependencies on .cmti files (and you'll need to tell your build system about it, btw), so you won't really avoid any recompilation anyway. So maybe we could just say that locations are kept in .cmi files if and only if we compile in -annot or -bin-annot mode (instead of introducing an extra command-line switch).

vicuna commented Nov 9, 2012

Comment author: @gasche

> Basically, when type-checking in -bin-annot mode, you'd need to load .cmti files in addition to .cmi files and somehow recombine the information.

More or less: when printing an error message about some signature, you can look at whether there is a .cmti file available for it (which depends on whether the corresponding .mli was compiled with the -bin-annot option). Getting the annotation corresponding to a given Longident would then be simple, I assume (just as you would from the signature in the .cmi: Env.add_signature then Env.lookup_; add_signature is not free but you only pay for it when the user interface requests it).

> If you compile a client module in -annot or -bin-annot mode, you will have dependencies on .cmti files (and you'll need to tell your build system about it, btw), so you won't really avoid any recompilation anyway.

I'm not sure where the dependencies on the .cmti would come from in this model. Semantically (in bytecode at least), if compilation unit A depends on compilation unit B, A needs to be recompiled only if b.cmi changed. A change in the .cmti should not incur recompilation of A (or I don't understand what you've been doing with the .cmti). Avoiding excessive recompilation is important.

Granted, there is the issue that the .cmti may become outdated -- just as this may happen with .cmt files. I admit I haven't thought much about this aspect; the build system will have logic to re-update the .cmt*, the editor should have some as well, but maybe it's a bit unreasonable to add such logic in the compiler just to check whether we can output nice error messages. I'd rather run the risk of having occasionally stale location information than doing (even) more recompilation than necessary.

vicuna commented Nov 10, 2012

Comment author: @alainfrisch

One of the goals was to ensure that references to external value declarations in .cmt files would have the correct location. This means that the type-checker needs to have access to this information, either from the .cmi or .cmti files. Either way, the resulting .cmt file produced by the compiler would depend on compiled files with locations, and this requires recompilation when the .mli files change, even only syntactically. Same story for .annot files.

Having locations directly in .cmt files makes it quite simple to implement a jump to definition feature or a global detector of unused exported values. This is a very low-hanging fruit (removing a few tens of characters from subst.ml). Actually, if we do it, we automatically get location information for external declarations in .annot files today.

The alternative is to do nothing in the compiler (except maybe for error messages, but this is the less important benefit). We already have the location information in the .cmti files, and by redoing some non-trivial lookup logic in external tools, we could probably achieve the same effect; it's just more work.

vicuna commented Nov 10, 2012

Comment author: @lefessan

What's the point of this discussion if computation of checksums in memory makes keeping locations invisible to the compiler, and only a minor rebuild is necessary at the build system level (i.e. only files directly depending on the interface)? Actually, I would save the checksum in raw form at the beginning of the .cmi file, so that "smart" build systems can detect that the file changed but not the checksum, and so not rebuild at all.
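
As a rough illustration of the "checksum at the beginning of the file" idea, the sketch below models a compiled interface as an in-memory string made of a raw 16-byte MD5 digest followed by an opaque payload. This layout and the names `make_cmi`, `header` and `needs_rebuild` are assumptions made for the example; they do not describe the real .cmi format.

```ocaml
(* Length of a raw MD5 digest as returned by Digest.string. *)
let digest_len = 16

(* Toy file: digest of the location-free interface, then a payload that
   may contain locations. *)
let make_cmi ~semantic ~payload = Digest.string semantic ^ payload

(* What a "smart" build system would read: only the leading digest. *)
let header cmi = String.sub cmi 0 digest_len

let needs_rebuild ~old_cmi ~new_cmi = header old_cmi <> header new_cmi

let () =
  let v1 = make_cmi ~semantic:"f:int->int" ~payload:"f@a.mli:3" in
  (* Same interface, but the declaration moved two lines down. *)
  let v2 = make_cmi ~semantic:"f:int->int" ~payload:"f@a.mli:5" in
  Printf.printf "files differ: %b, rebuild needed: %b\n"
    (v1 <> v2)
    (needs_rebuild ~old_cmi:v1 ~new_cmi:v2)
```

A build tool that only compares timestamps or whole-file hashes would still see a change here, which is the concern raised about plain Makefiles later in the thread.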

vicuna commented Nov 10, 2012

Comment author: @gasche

Alain, needing to recompile the .cmti and .annot when the .mli changes is fine, as long as other compilation units don't have to be recompiled.

> The alternative is to do nothing in the compiler (except maybe for error messages, but this is the less important benefit)

I'm not sure I follow: besides error messages, what other uses did you envision directly in the compiler? Couldn't the global "unused declaration" pass be implemented as an external tool?

Fabrice, will make (the tool) be able to know that, even if the file changed, its "important information" (the checksum) did not? I'm quite sure other build systems could be adapted, but supporting raw Makefiles is quite important. If that works, that would be a good compromise.

vicuna commented Nov 10, 2012

Comment author: @alainfrisch

Let me try to explain better... What I say is that, if locations of external declarations are stored in .cmt files, then:

  1. external tools can use this information very easily.

  2. if b.ml refers to external module A, it must be recompiled when a.mli changes, even purely syntactically (because b.cmt contains locations in a.mli).

I claim that the benefits of 1) make 2) acceptable. Now, if we want to avoid 2) it means that we must keep the status quo (.cmt files don't include location of external declarations), i.e. do not change anything in the compiler.

vicuna commented Nov 10, 2012

Comment author: @gasche

You mean .cmi, not .cmt, right? If so, then I agree with your 1/2 analysis, and I disagree with the conclusion: I find (2) unacceptable and (1) of only marginal utility in presence of .cmti files that have been designed for this purpose. In my view maximizing separate compilation is fundamental and locations are nice, not the other way around.

Besides the staleness problem (which I can live with), I don't understand why you think the external tools couldn't use the information from the .cmti right now, just as well as from the .cmi after the change you propose.

vicuna commented Nov 10, 2012

Comment author: @alainfrisch

No, I mean .cmt, not .cmi. Keeping locations in a.cmi makes it possible to have them in b.cmt when b.ml refers to declarations from a.mli. Just by reading b.cmt, you can find the location of external declarations. And the same applies to .annot files: keeping locations in a.cmi automatically makes b.annot contain references to a.mli, which is very nice.

Unless I missed something (I never tried it), ocamlspotter would be entirely replaced by this proposal.

> I don't understand why you think the external tools couldn't use the information from the .cmti right now, just as well as from the .cmi after the change you propose.

It would just be much more complex. To be able to find the location of an external declaration in b.ml, the tool would need to look at b.cmt (to find the fully qualified path of the identifier), find and load a.cmti, and do some lookup in it to find the location in a.mli of the declarations. If we keep locations (in a.cmi, hence in b.cmt), the tool can simply read b.cmt and directly report the location.
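
The difference between the two lookup strategies can be sketched with toy data. Everything below is hypothetical: `reference` stands for an identifier use recorded in b.cmt, and `cmti_index` stands for whatever a tool would build after locating and loading a.cmti.

```ocaml
type loc = { file : string; line : int }

(* A use found in b.cmt: the qualified path, plus a declaration location
   only if the .cmi it came from kept locations. *)
type reference = { path : string; decl_loc : loc option }

(* Hypothetical index built by loading a.cmti: qualified path -> location. *)
let cmti_index : (string, loc) Hashtbl.t = Hashtbl.create 16

(* With locations kept: the answer is read straight from b.cmt. *)
let locate_direct r = r.decl_loc

(* Without them: a second file has to be found, loaded and searched. *)
let locate_via_cmti r = Hashtbl.find_opt cmti_index r.path

let () =
  Hashtbl.replace cmti_index "A.f" { file = "a.mli"; line = 3 };
  let with_loc = { path = "A.f"; decl_loc = Some { file = "a.mli"; line = 3 } } in
  let without_loc = { path = "A.f"; decl_loc = None } in
  (* Both routes reach the same location; they differ in how much work
     the tool has to do to get there. *)
  assert (locate_direct with_loc = locate_via_cmti without_loc)
```

The sketch hides the hard part of the second route, namely resolving the path to the right file and handling nested modules, which is where the extra complexity would sit.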

vicuna commented Nov 10, 2012

Comment author: @lpw25

I may be missing something, but couldn't you fetch the location information from the .cmti when creating the .cmt? This would mean that the .cmt would depend on the .cmi and the .cmti, but the .cm[ox] would only depend on the .cmi.

Then if you made purely syntactic changes it would only change the .cmti file. So that, unless you specifically ask the build tool to use the .cmt file as a target, there is no need for recompilation.

vicuna commented Nov 10, 2012

Comment author: @alainfrisch

The nice thing with -bin-annot is that at very little cost (performance and complexity in the compiler), we can get rich feedback from the compiler, readily available for external tools, simply by dumping the internal representation produced by the type-checker. Now, Leo's proposal is that the compiler would need to enrich this internal representation by loading more files (.cmti), only to retrieve location information that has been explicitly discarded from .cmi files. This seems very convoluted to me, and I expect this to have a non-negligible cost (complexity in the compiler, complexity in build systems, compilation time).

vicuna commented Nov 10, 2012

Comment author: @alainfrisch

As a side note, it might be useful for this discussion to realize that changing a comment or whitespace in an implementation file (.ml) can already force recompilation of all modules depending on this module, in native code (this is because some constructions, like "assert", depend on the concrete location of the source code).
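
A small self-contained example of why "assert" ties compiled code to the concrete source layout: the position of the assertion is embedded in the `Assert_failure` exception it raises. The function name `assertion_site` is just for illustration.

```ocaml
(* Returns the source position that the compiler recorded for the
   assertion below. *)
let assertion_site () =
  try assert false
  with Assert_failure (file, line, column) -> (file, line, column)

let () =
  let file, line, column = assertion_site () in
  Printf.printf "assert recorded at %s, line %d, column %d\n" file line column
```

Adding a blank line or a comment above `assertion_site` changes the reported line, and therefore the compiled code, even though the program is otherwise unchanged.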

vicuna commented Nov 11, 2012

Comment author: @gasche

Indeed, I think we should have more support for separate compilation in native code (currently you have to manually hide .cmx at compile-time to avoid a hard dependency on the implementation), but that is a separate issue.

vicuna commented Nov 11, 2012

Comment author: @alainfrisch

My last point about comments and whitespace in .ml files was to illustrate that binary files produced by the compiler (.cmo/.cmx) already depend on syntactic details of the source files (.ml). I would personally object to introducing extra complexity in the compiler (like having it search and load new kinds of files, and do non-trivial postprocessing of the typedtree) only to avoid having the same situation for interfaces (.mli -> .cmi).

vicuna commented Nov 12, 2012

Comment author: @gasche

(Originally posted in the bad place, sorry for the confusion and thanks to Alain for being understanding.)

I have given the current implementation a more thorough look this week-end. I see your point regarding complexity: loading locations from the .cmi when they become needed is not simple, because the current code is not planned for it. More precisely, locations are used in a lot of different places for different purposes (eg. indexing into lookup tables) and it would require some work to examine all these uses to determine which one should use this improved location, and flow enough context towards them so they have enough information to refine the location locally (for example the identifier that was used to lookup the declaration). On the other hand, changing the locations of the values coming from the .cmi files may require a careful study of these different uses anyway (because some "is_ghost" logic is used in different places; I think the change should be ok, and you certainly know that better than me, but it still requires some thought).

A point I have encountered during this code study is the following: the Subst code "sanitizes" interfaces for saving, by both stripping locations and freshening (if I understand correctly) identifiers and levels. The .cmi data is sanitized in this way, but the .cmti data is not. Is it correct to not sanitize the .cmti data? My intuition would be that the locations should not be stripped, but the type levels and stuff should still be updated for saving in the .cmti. I would therefore suggest that the Subst.ml handles "location" and "semantic stuff" with two different flags (rather than only for_saving: for_saving and discard_loc, for example), with the .cmi using both for_saving and discard_loc, and .cmti only for_saving. I don't really know what the for_saving stuff (besides discard_loc) does, so I would welcome feedback here.

Other than that, I still believe that adding location information to the .cmi is a bad idea because of the weakened dependency precision that introduces. I would see two ways to move forward that avoid that defect:

(best way) add "provenance" data to the concerned structures, of type (file_path * longident) option, that would be set if the description comes from a .cmi and contain the .cmi path and corresponding longident. Then update display logic at the relevant places to refine location information through this field (loading the .cmti file if present). My initial idea was to have a lazy field directly returning the refined location, but those descriptions need to be easily serializable into the .cmi, precisely. I suppose the "provenance data" could be inserted into the signature during the substitution pass (where we know the final path and can compute the longident rather easily), adding little computation overhead.
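
A rough sketch of what such "provenance" data could look like, under heavy simplification: the longident is flattened to a string, the value description is a toy record, and `load_cmti_locations` is a hypothetical loader faked with a fixed table. None of these names exist in the compiler.

```ocaml
type loc = { file : string; line : int }

(* Toy value description; `provenance` mirrors the suggested
   (file_path * longident) option. *)
type value_description = {
  ty : string;
  provenance : (string * string) option;
}

(* Hypothetical loader: the locations recorded in the .cmti that sits
   next to the given .cmi, if such a file is available. *)
let load_cmti_locations cmi_path =
  if cmi_path = "a.cmi" then Some [ ("A.f", { file = "a.mli"; line = 3 }) ]
  else None

(* Only display code would call this, so the extra file is read on demand. *)
let refine_location vd =
  match vd.provenance with
  | None -> None
  | Some (cmi_path, lid) ->
    (match load_cmti_locations cmi_path with
     | None -> None
     | Some table -> List.assoc_opt lid table)

let () =
  let external_f = { ty = "int -> int"; provenance = Some ("a.cmi", "A.f") } in
  match refine_location external_f with
  | Some l -> Printf.printf "%s declared at %s:%d\n" external_f.ty l.file l.line
  | None -> print_endline "no refined location available"
```

Because the provenance is plain data, the description stays serializable, which is the constraint that ruled out a lazy field in the paragraph above.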

(simple way) as a first step, we could simply change read_pers_struct to read declaration information from the .cmti rather than the .cmi when it is available. To avoid introducing any change of behaviour, this means that the .cmti declarations must have been sanitized with Subst.for_saving before storing (hence my previous question). There are two possibilities to check that the .cmti is up-to-date:

  1. .cmt* files store a hash of the corresponding source file, so we could re-hash the .mli and check equality, or else fall back to the .cmi
  2. the .cmti contains the whole content of the .cmi file, so we could load both and test for equality of the .cmi (or else fall back to the .cmi)

(1) is a good general strategy for editor tools, but has the drawback of requiring the original source file to still be around, which isn't generally true for compiled interfaces. (2) is therefore a better choice in the current situation.
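
Check (2) can be sketched as follows, with a toy `cmti` record that embeds a copy of the interface it was generated from. The record, its field names and `locations_for` are assumptions for the example; the fallback is modelled as "no location information".

```ocaml
(* Toy .cmti: a copy of the interface plus declaration line numbers. *)
type cmti = { embedded_cmi : string; locations : (string * int) list }

(* Trust the .cmti only when its embedded interface matches the .cmi on
   disk; otherwise behave as if no .cmti existed. *)
let locations_for ~cmi ~cmti =
  match cmti with
  | Some c when c.embedded_cmi = cmi -> c.locations
  | Some _ | None -> []

let () =
  let fresh = { embedded_cmi = "val f : int -> int"; locations = [ ("f", 3) ] } in
  let stale = { embedded_cmi = "val f : int -> bool"; locations = [ ("f", 3) ] } in
  let cmi = "val f : int -> int" in
  Printf.printf "fresh .cmti gives %d location(s), stale .cmti gives %d\n"
    (List.length (locations_for ~cmi ~cmti:(Some fresh)))
    (List.length (locations_for ~cmi ~cmti:(Some stale)))
```

Unlike check (1), nothing here needs the original .mli source to be present.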

As Alain already noted, the simple way increases compilation time (a file with a lot of external dependencies opens a lot of .cmi). If the best way is judged too complex, it could still make sense to implement the simple way to measure the performance overhead. We could also consider making the "refine locations" logic optional and configurable. I'd rather have it on-demand and always-enabled.

Summing up:

  • I still think we should avoid having locations in .cmi and see ways to do that.
  • Granted, refining locations lazily is a bit complex as it requires inspecting how the current code uses location information. I think this is mostly external complexity due to the arguably messy multi-use of locations in the codebase.
  • There is an intermediate strategy (refine locations eagerly, making this configurable to avoid the performance penalty) that I think would be quite simple to implement.

Alain, do you think the "best way" is feasible? If so, would you accept to review a patch implementing it?

vicuna commented Nov 12, 2012

Comment author: @gasche

PS: a downside of the "simple way" is that it is more invasive. The "best way" keeps the exact same behavior in most parts of the compiler, except those dedicated to user interface (whose correctness is less important). The "simple way" on the contrary changes the value provenance for the whole type-checking process. The validation of this change (checking that the cmi_information saved in the .cmti is equal to the .cmi) is in principle a strong guarantee of correctness, but that still makes this choice less robust -- which is one reason why I would be more satisfied with the "best way".

vicuna commented Nov 12, 2012

Comment author: @alainfrisch

Gabriel: I'm not sure I understand the goal of your proposal. If we are to implement extra logic to read back information from .cmti files, I think it's better to do it in extra tools than to complexify the compiler. ocamlspotter demonstrates that "jump to definition" can already be implemented purely as an external tool. Improving error messages in the compiler would be the only place where the compiler itself could benefit from more precise locations. But: (i) this does not bring much and in my opinion does not justify any extra complexity in the compiler; (ii) if we do that (better error messages with external locations), I don't see how you could avoid a dependency on syntactic details of .mli files. By definition, if the error message you get when compiling b.ml shows locations in a.mli, then you need to replay the compiler on b.ml as soon as a.mli changes, even because of whitespace (well, build systems don't typically memoize failed compilations, but it would seem crazy to say that the dependency of the command "ocamlc -c b.ml" does not include, transitively, "b.mli" even though its behavior depends on the content of this file).

The point of my proposal is that tools like ocamlspotter can be subsumed or largely simplified by keeping locations in .cmi files (hence locations of external declarations in .cmt files). Personally, I'm ready to pay the price of having to recompile all dependencies when I change comments or whitespace in an .mli file if this can give me better and simpler tools. Concretely, if we keep locations, .annot files automatically contain enough information to implement "jump to definition" in the emacs mode. This is a huge step forward to me, compared to the complexity of ocamlspotter, and it certainly deserves to sacrifice a property I don't find very important and don't rely upon (and which does not hold for those using time-stamp based build systems like make). If you don't agree, I claim it's better to do nothing in the compiler: the extra complexity is better located in external tools. To satisfy both points of view, one could make keeping locations optional with a command-line flag (-keep-locations?).

vicuna commented Nov 12, 2012

Comment author: @alainfrisch

To illustrate my point, I've attached a small tool which parses .cmi and .cmt files from a project (compiled with -bin-annot and the patch to keep locations in .cmi files) in order to report exported value declarations (found in .cmi files with an explicit .mli source) which are never used.
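
The attached tool is not reproduced in this thread. As a hedged sketch of the general approach it describes (index exported declarations by location, mark the locations that are referenced, report the rest), the toy below uses in-memory tables in place of real .cmi and .cmt files. The function names are illustrative and are not taken from the attachment.

```ocaml
type loc = { file : string; line : int }

(* Declarations exported by .cmi files, keyed by where they were declared. *)
let exported : (loc, string) Hashtbl.t = Hashtbl.create 16

(* Locations of declarations that some .cmt file refers to. *)
let used : (loc, unit) Hashtbl.t = Hashtbl.create 16

let collect_export name loc = Hashtbl.replace exported loc name
let collect_reference loc = Hashtbl.replace used loc ()

(* Everything exported but never referenced, sorted for stable output. *)
let unused () =
  Hashtbl.fold
    (fun loc name acc -> if Hashtbl.mem used loc then acc else (loc, name) :: acc)
    exported []
  |> List.sort compare

let () =
  collect_export "A.f" { file = "a.mli"; line = 3 };
  collect_export "A.g" { file = "a.mli"; line = 7 };
  (* b.cmt refers to the declaration at a.mli:3 only. *)
  collect_reference { file = "a.mli"; line = 3 };
  List.iter
    (fun (loc, name) -> Printf.printf "%s:%d: unused %s\n" loc.file loc.line name)
    (unused ())
```

With this data only `A.g` is reported. The matching works purely on locations, which is why the approach depends on external declarations carrying their locations into .cmt files.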

vicuna commented Nov 12, 2012

Comment author: @gasche

To use .cmti files instead of .cmi, I only had to change the load_file function of your tool in the following way, and it seems to work on OCaml projects without any modification to the compiler:

```ocaml
let rec load_file fn =
  if Sys.is_directory fn then
    (* Recurse into every entry of a directory. *)
    Array.iter (fun s -> load_file (Filename.concat fn s)) (Sys.readdir fn)
  else if Filename.check_suffix fn ".cmt" || Filename.check_suffix fn ".cmti" then begin
    let open Cmt_format in
    match (read_cmt fn).cmt_annots with
    | Implementation impl -> collect_references # structure impl
    | Interface intf -> List.iter collect_export intf.sig_type
    | _ -> () (* todo: support partial_implementation? *)
  end
```

vicuna commented Nov 12, 2012

Comment author: @gasche

About your larger remark: I disagree that if we used .cmti files to produce user information about dependent modules (eg. error messages on the compilation of b.ml that depends on a.mli, based on locations in a.cmti) we would have to add .cmti as a dependency on all those dependent modules. Of course if you do that you have to recompile just like if you change the .cmi, so there is no point.
But given that changing a.mli (and a.cmti, or having a.cmti become stale) only changes the "fancy user interaction" during compilation of b.ml, not the semantics of the actual compiled files produced by the compiler, a.mli/a.cmti do not need to become a dependency of b.ml.

The point is slightly degraded by -annot: if you use -annot you can observe the fact that a.cmti became stale (the output has more external_ref without locations). I would still prefer, as a default behavior, no recompilation and a potentially lossy update of .annot (because it's only meant to cache information to be displayed to the user, which can then decide to update .cmti if she desires more precision). But you're free to tell your build system that you depend on a.cmti (through the same way it currently guesses that b.mli depends on a.cmi) if you want stronger location-availability guarantees. I'm ready to help update ocamldep to add dependencies on .cmti files on demand, for example -- though a simple postprocessing step on its output may suffice.

vicuna commented Nov 12, 2012

Comment author: @alainfrisch

> To use .cmti files instead of .cmi, I only had to change the load_file function of your tool in the following way

Yes, but this is not the important part: you still need to keep locations in .cmi files, otherwise the .cmt files don't get the proper locations for external declarations, and the tool does not work.

> without any modification to the compiler:

I don't think so! Correct me if I'm wrong, but I really don't see how it could work.

(As a minor point: your patch would consider value declarations in .ml files (signatures in the implementation) as candidates for being reported as unused, even though they are not exported).

vicuna commented Nov 12, 2012

Comment author: @alainfrisch

> But given that changing a.mli (and a.cmti, or having a.cmti become stale) only changes the "fancy user interaction" during compilation of b.ml, not the semantics of the actual compiled files produced by the compiler, a.mli/a.cmti do not need to become a dependency of b.ml.

I'm ready to accept this argument, but I still don't like the idea of complexifying the compiler (and have it read new kinds of files) only for adding locations of external declarations in error messages, especially since a much simpler technical solution exists. I believe that the argument about avoiding recompilation when comments/whitespace change in .mli files is quite theoretical, but if you really insist on it, I'd prefer you accept not to have those locations in error messages than to complexify the compiler for them. In other words, I'd be against a patch which lets the compiler read .cmti files only to enrich error messages; but I'm ok with making it optional to keep locations in .cmi files (through a command-line switch).

vicuna commented Nov 15, 2012

Comment author: @alainfrisch

> But given that changing a.mli (and a.cmti, or having a.cmti become stale) only changes the "fancy user interaction" during compilation of b.ml, not the semantics of the actual compiled files produced by the compiler, a.mli/a.cmti do not need to become a dependency of b.ml.

After giving it more thought, I'm even more convinced that it's a really bad idea to hide such dependencies from the build system, even when the dependency only matters in case of a type error. For instance, we have in our omake-based build system some implicit rules to copy compiled files from one directory to another one. If the build system does not know that compiling b.ml might need to access a.cmti, it will never copy this file in the right place and the compiler will not be able to find it in case of a type error. (For the same reason, we believe it is problematic that the compiler can open a .cmi file even though the corresponding unit is not reported as a dependency by ocamldep; this can happen when the compiler wants to unroll an external type abbreviation. We have disabled this behavior in our local version.)

vicuna commented Jul 8, 2013

Comment author: @alainfrisch

Is anyone against the addition of a "-keep-locs" command-line argument, whose effect is to keep locations in .cmi files (on value and type declarations)?

vicuna commented Jul 8, 2013

Comment author: @hcarty

-keep-locs would be nice to have available through OCAMLCOMPPARAM as well.

vicuna commented Sep 17, 2013

Comment author: @alainfrisch

Commit 14157 on trunk: new -keep-locs option, also available through OCAMLPARAM.

vicuna commented Sep 17, 2013

Comment author: @alainfrisch

I've also committed a slightly improved version of the unused_exported_values tool to the trunk under experimental/frisch. I've attached the output of running it as:

experimental/frisch/unused_exported_values.exe bytecomp driver parsing tools toplevel typing utils asmcomp

after a "OCAMLPARAM=bin-annot=1,keep-locs=1,_ make clean world opt".

Of course, many of the unused exported values are actually useful for code which is not part of the compiler itself, although there seem to be a few true positives.
