Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ocamlbuild does not observe OCAMLPARAM #6999

Closed
vicuna opened this issue Sep 27, 2015 · 7 comments
Closed

ocamlbuild does not observe OCAMLPARAM #6999

vicuna opened this issue Sep 27, 2015 · 7 comments

Comments

@vicuna
Copy link

vicuna commented Sep 27, 2015

Original bug ID: 6999
Reporter: @whitequark
Status: resolved (set by @damiendoligez on 2017-02-24T10:49:26Z)
Resolution: suspended
Priority: normal
Severity: minor
Target version: later
Category: -for ocamlbuild use https://github.com/ocaml/ocamlbuild/issues

Bug description

This will lead to incorrect results if the value of OCAMLPARAM changes between runs, as ocamlbuild will use cached build products.

@vicuna
Copy link
Author

vicuna commented Sep 27, 2015

Comment author: @gasche

Meh: one more reason why OCAMLPARAM, while offering enticing convenience to the user in a hurry, is a pain to handle well. But this needs to be fixed nonetheless.

@vicuna
Copy link
Author

vicuna commented Oct 19, 2015

Comment author: @gasche

So there are many possible approaches to fix this issue, and I would
be interested in your (@whitequark) opinion.

Basically the problem is that the current way OCamlbuild decides
whether a given command must be re-executed does not take into acount
"external environment" parameters (such as OCAMLPARAM, but also for
example findlib.conf) that do affect the reslut of those commands.

The current code taking this rerun-or-not decision can be found in the
Rule.call function; it computes the digest of a series of things,
compare then to what is in cache from the previous run, and either
decides cache_hit (do not recompute) or cache_miss_* (have to
recompute, for some reason).

The information taken into acount for the hit-or-miss decision is: the
digest of all productions of the rule, of all static and dynamic
dependencies, and importantly of the "action digest", which is the
digest of the command returned by the action (computed by the
Command.digest function).

Environment parameters, as/if they do not appear in the final command,
will not be taken into account at all, and in particular never require
rebuilding upon change.

Here are some ways to fix this:

  1. We could add an "environment digest" which contains the environment
    values that we know affect OCaml compilation (but this would be
    hard-coded for OCaml rules only?)

  2. We could add a more general notion of "side-inputs" to all rules,
    that users could use to add arbitrary information to the rule
    digest, and add known-env-variables as side-inputs to all OCaml
    rules. (How to do that while preserving backward-compatibility of
    the Rule API is still to be determined.)

  3. We could add a way for Command values to embed some side-input
    knowledge (instead of making it an independent input of the
    cache check), and in particular stuff the ocaml-relevant
    env-variables in OCaml commands.

I feel that (1) is too specific. (3) is probably the least invasive
option, but (2) may be the best API approach long-term (I think of
a user with an arbitrary new rule that does take some extra
non-deterministic input, embedding it in the command is clever but
maybe not the most natural move, while it seems intuitive to look for
an API to signal this dependency; on the other hand, clearly
documenting the embedding-in-command trick may give o a more
declarative API, which is nice).

@vicuna
Copy link
Author

vicuna commented Oct 19, 2015

Comment author: @whitequark

The fact that the contents or even location of findlib.conf is not recorded has really made my day worse a lot of times with opam-android.

(1) is awful. I do not have a strong preference between (2) and (3), there are pros and cons to both. I think one reason to go (3) is that typically, *nix commands look at all the same things in the environment, not different things based on the rule (invocation).

@vicuna
Copy link
Author

vicuna commented Oct 19, 2015

Comment author: @gasche

I think one reason to go (3) is that typically, *nix commands look at all the same things in the environment, not different things based on the rule (invocation).

I don't understand your point. My idea was to encode/store this extra state in the Command.t value returned by the rule's action. This would not be done implicitly, so the user would write something like, instead of

Cmd (S [!Options.ocamlc; flag; T tags;
        atomize_paths deps; A"-o"; Px out])

something like

Cmd (S [!Options.ocamlc; flag; T tags;
atomize_paths deps; A"-o"; Px out; E ocamlc_env])

where ocamlc_env would be a list of environment variable names that would be turned into (name,value) pairs at the time where the command is turned into a real runnable command.
We could think of a more general extension to the Command datatype than just handling environment variable, the point is that they would be part of the command digest but not part of the command execution, or commented out; well for environment variables we don't need to comment them out except to avoid the command-line-too-long bugs, but for other nondeterministic inputs we would need to include them in the digest but not the final command.

In any case, the code of the rule action (or of the auxiliary functions called by the rule action, etc.) would need to be modified to take this side-information into account. So I don't understand why this better reflect the model where "*nix commands look at all the same things", because we would have to modify every command in our actions to support that (we can factorize the modification under a common name, say for all ocaml-related actions, but that seems equally true of (2)).

(Let's clarify now that I am not suggesting to take a snapshot of the whole user environment, as this changes much too often. I don't think anyone believed that -- but it means we should be explicit about which part of the environment to track, potentially in all rules.)

@vicuna
Copy link
Author

vicuna commented Oct 19, 2015

Comment author: @gasche

I think giving a concrete code example idea above may be useful to drive the discussion -- it's more explicit. For (2), I was thinking of turning a rule of the form

(fun env build ->
...
build dyndeps;
...
Cmd ...)

into something like (assuming we add an optional parameter as a broken way to hope to remain backward-compatible):

(fun env build ~dyn_input ->
...
build dyndeps;
...
dyn_input (ocaml_compiler_variables ());
...
Cmd ...)

which would record whatever dyn_input is called with as a "dynamic side-input of the rule", much like whatever is passed to "build" is recorded as a dynamic dependency. (A monadic API would be nicer, because we could have a small sub-library for dynamic inputs that would pack "compute the dynamic inputs" and "record them" into a single bind action, but that's a much more invasive change for another day.)

Again, that requires modification of all rules, possibly in an uglier way -- but one may claim that hacking the Command datatype into a side-channel container also distorts our API in unpleasant ways.

@vicuna
Copy link
Author

vicuna commented Oct 20, 2015

Comment author: @whitequark

Oh, I see your point now. I can say I do not have any preference between those at all, not the least because I lack any high-level 'architectural vision' of ocamlbuild, I just hack things together using it. I would trust your judgement here.

@vicuna
Copy link
Author

vicuna commented Feb 24, 2017

Comment author: @damiendoligez

ocamlbuild is now a separate project that lives on GitHub.
PR transferred to ocaml/ocamlbuild#147

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant