Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can type-information flow be strengthened for non-recursive lets? #7389

Closed
vicuna opened this issue Oct 18, 2016 · 8 comments
Closed

Can type-information flow be strengthened for non-recursive lets? #7389

vicuna opened this issue Oct 18, 2016 · 8 comments

Comments

@vicuna
Copy link

vicuna commented Oct 18, 2016

Original bug ID: 7389
Reporter: @gasche
Assigned to: @garrigue
Status: acknowledged (set by @garrigue on 2016-10-18T23:00:20Z)
Resolution: open
Priority: normal
Severity: feature
Category: typing
Has duplicate: #7582
Related to: #7344 #7388
Child of: #7409
Monitored by: @ygrek @jmeber @yallop

Bug description

As with #7388, I'm working on examples given by Andreas Rossberg that are (simplified forms) of trying to use type-based disambiguation for an AST built using the variant/record alternation pattern.

The following example warns when arguably it should not:

type 'a w = {a : 'a}

type t = t' w
and t' = {l : int}

type u = u' w
and u' = {l : int}

let f (x : u) = let {l; _} = x.a in l
File "warn-41.ml", line 9, characters 20-26 ("{l; _} ="):
Warning 41: these field labels belong to several types: u' t'
The first one was selected. Please disambiguate if this is wrong.

The issue here is that while (x.a) is known to be of type u (for example, "let a = x.a in a.l" does not warn), there is no information flowing from the let-defining expression to the let-pattern, so the let-pattern is type-checked without this information available.

For non-recursive let bindings, it should be possible to type-check the let-defining expression, and flow the inferred information into the let-pattern, to accept this example without a warning. Would such a strengthening be reasonable?

(It's a bit irritating to have two different type-inference paths for recursive and non-recursive binding, while we currently have a regular implementation for both. But if it comes with a usability win for a reasonable usage pattern...)

Additional information

The type-checker type-checks the left-hand-side (lhs) of the let-binding before the right-hand-side (rhs), and furthermore flows information from the lhs-check to the rhs-check in the form of an updated typing environment:

https://github.com/ocaml/ocaml/blob/2da2fcd/typing/typecore.ml#L2059-L2075

  | Pexp_let(rec_flag, spat_sexp_list, sbody) ->
    [...]
    let (pat_exp_list, new_env, unpacks) =
      type_let env rec_flag spat_sexp_list scp true in
    let body =
      type_expect new_env (wrap_unpacks sbody unpacks) ty_expected in
    [...]

This is necessary for recursive declarations as the pattern-bound variables may be used in the let-defining expression. But I don't see why that would be necessary for non-recursive declarations. (Of course in the function form "let f p1 p2 ... = e" you need to type-check the pi before e.)

Note that there is an equivalent way to write the same code using "match", where the typing information flow is as expected:

let f (x : u) = match x.a with {l; _} -> l

However, this code pattern is a bit unnatural, and as Andreas writes (thinking of non-expert OCaml programmers), "it's unlikely that anybody will understand why they should be doing that".

@vicuna
Copy link
Author

vicuna commented Oct 18, 2016

Comment author: @gasche

Note that GADT patterns may need to flow type equations from a pattern to the code in scope of the pattern variables. I initially thought that this would also prevent my proposal in the non-recursive case, but in fact it does not: it only means that the let-pattern has to be type-checked before the let body, not the let-defining expression. (In general GADT equations flow exactly as binding variable scopes, so you can mostly forget about it and just make sure you never suggest type-checking code with open variables not yet in the environment.)

@vicuna
Copy link
Author

vicuna commented Oct 18, 2016

Comment author: @alainfrisch

I also wished type would flow from the bound expression to the pattern, not the other way around. This would avoid a lot of type annotations. But it could easily break existing code that depend on the current propagation direction, as in:

    let (x : t) = A in  (*  => select A from type t *)

One could try to keep some amount of propagation from the pattern to the bound expression, by extracting "manifest type information" from the pattern (explicit type annotations, including under tuples, etc).

@vicuna
Copy link
Author

vicuna commented Oct 18, 2016

Comment author: @garrigue

I'll think about this.
Indeed, this is strange that whether the pattern contains a GADT or not changes the control flow, and as a result may cause the propagation to occur or not.

By the way, IIRC, the back propagation from the pattern from the expression is already syntactical, i.e. the type constraint is just copied to the expression to allow the propagation, so this is orthogonal.

@vicuna
Copy link
Author

vicuna commented Oct 18, 2016

Comment author: @alainfrisch

let x : t = e is indeed parsed as let x = (e : t), but this is not the case for let (x : t) = e, nor for more complex cases such as let (p : t) as x = e.

@vicuna
Copy link
Author

vicuna commented Oct 19, 2016

Comment author: @garrigue

It looks like my memory is bad.
It is actually the other way round:
[let x : t = e] is handled as [let (x : t) = (e : t)]
so the copy is from the expression to the pattern, not the other way round.

@vicuna
Copy link
Author

vicuna commented Oct 19, 2016

Comment author: @lpw25

I think let (x : t) = is very rare. I suspect that very little code needs information to flow from the pattern to the defining expression (other than t in let x : t = ... but that is already handled specially).

Side-note: Could we make let p : t = e be actually represented directly in the parsetree? I find these syntactic sugars quite annoying.

@vicuna
Copy link
Author

vicuna commented Oct 19, 2016

Comment author: @gasche

When another part of the code complains that there is not enough information for disambiguation, I think sometimes it makes sense to go back to the binding site of x to add a type constraint (typically when it is of record type and the following ambiguity is on its projection). This is very common for function parameters (fun (x : t) -> ...), but it seems reasonable to me to assume that it would also sometime occur in this position, in particular as Alain suggested deeper inside a pattern (let ((x : t), ...) = ...).

That is an argument for why we may have (p : t) in patterns. It is not an argument for why we would need to propagate that information in the let-defining expression (my explanation above covers information needs in the let-body). I would think that not propagating the information except in the top-level (let .. : t) case is a good first choice that can be iterated on, but I see why Alain was thinking of a more complete solution.

(From a designer prespective I wonder if there is a nicer formulation of type-inference with type-based disambiguation that avoids having to make these approximations. For example, in an inference-through-constraint-solving framework, one could think of propagating annotations and delaying disambiguation-requiring fragments as long as possible. Another approach might be to do the inference within a structural presentation of fields and variants, collect type information at the same time, and "close" the polymorphic names after the fact.)

To summarize: I believe that the feature request represents a reasonably well-defined compromise between power and simplicity, even when amended to do the propagation of the (let ... : t) annotation as is currently done.

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants