Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006000OCamlOCaml typingpublic2013-04-25 16:352014-04-03 04:57
Reporterlpw25 
Assigned Tolpw25 
PrioritynormalSeverityminorReproducibilityN/A
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version4.01.0+dev 
Target Version4.02.0+devFixed in Version 
Summary0006000: Warning 40 ("Constructor or label name out of scope") should be an error
DescriptionWe have had record disambiguation on trunk now for a few months,
and I think that it would be wise to look again at the issue of
out of scope labels and constructors before the next major
release.

The issue is around the new ability to use a constructor or label
even though that constructor or label is not in scope, as long as
the type information is available.

For example, the following is allowed:

  module M = struct
    type foo = Foo
  end

  let x: M.foo = Foo

Note that previously this would be rejected, forcing the user to
write:

  let x = M.Foo

Currently, using this feature produces a warning (40 "Constructor
or label name used out of scope"), however I think that it should
be upgraded to a full error.

This was discussed in 0005759, however more issues have emerged
since that discussion, and I think now is a good time to
re-evaluate this descision.

My arguments against allowing out of scope labels and
constructors are as follows:

- It allows references to constructors to be used in files that
  contain no reference to the file defining those
  constructors. For example, I have already come across the
  following code in the wild:

    (* types.ml *)
      ...
      module Project = struct
        ...
        type project = {
          project_id: string;
          ...
        }
        ...
      end
      ...

    (* data.ml *)
      ...
      open Types
      ...
      module Projects = struct
        open Project
        module Platform = struct
  
          let project =
            { project_id="platform";
              ...
            }
          ...
        end
        ...
        let all = [ Platform.project; ... ]
        ...
      end
  
    (* gantt.ml *)
      ...
      open Data
      ...
      List.map Projects.all
        ~f:(fun proj -> ... proj.project_id ...)
      ...

  Note that gantt.ml uses project_id but contains no reference to
  types.ml where that label is defined. When a bug occurred, it
  was non-trivial to fix due to the difficulty of finding the
  appropriate definition of project_id. This bug caused a lot of
  confusion among less experienced OCaml programmers, and I think
  that it is a very bad idea to allow it.

- It only works "in the presence of type information". This is an
  unspecifed criteria that most users will probably find
  confusing.

- It doesn't work with GADTs. If the constructor in question is
  declared using GADT syntax then it cannot be used out of scope.

- It can't work with exceptions.

- It won't work with "open types" (0005584) or "pattern synonyms"
  which may be added to the language in future.

- It will confuse users by adding a small element of structural
  typing to an otherwise nominative system. I think that most
  users picture variant constructors as named values, and all
  named values in OCaml obey the same scoping rules. To suddenly
  allow these values to be used as if they were not named values,
  but structural values (like the methods in an object) will
  confuse people.

In return for these issues, all that we gain is the ability to
refer to out of scope constructors and labels without using a
module name, but only under certain difficult to define
conditions.

Last time that this issue was discussed, it only appeared to be
discussed by 4 people. I think perhaps it would be a good idea
to get wider feedback before including a potentially risky
feature like this in a release.

It is also worth pointing out that it would be easier to add this
feature in the future than to remove it later if it causes
problems.
TagsNo tags attached.
Attached Files

- Relationships
related to 0005759resolvedgarrigue Using well-disciplined type-propagation to disambiguate label and constructor names 

-  Notes
(0009210)
frisch (developer)
2013-04-25 17:27
edited on: 2013-04-25 17:41

> It allows references to constructors to be used in files that contain no reference to the file defining those constructors.

This is not really new, the same happens e.g. with objects. The problem, for me, is more general: ocamlc should not be allowed to open foo.cmi when compiling bar.ml if bar.ml does not somehow mention Foo. At LexiFi, we have a local patch to do that, and it ensures that "ocamldep" returns (by construction) a safe upper-bound of the dependencies.

Also, I guess that proper tooling (based on -annot or -binannot) should mitigate the bad user experience you mention.

----

What will happen if we remove the feature? Assume we have two modules Ast and TypedAst which define two variant types with the same constructors. In the type checker, we map from one to the other, and it's quite nice to be able to write:

 open TypedAst

 let type_check env (e : Ast.t) =
   let loc = e.loc in
   match e.desc with
   | Foo -> {desc=Foo; loc; typ=...}
   | ....

How would you write this kind of code?


----

Also, it we turn warning 40 into an error, people will be strongly encouraged to use "open" more often. I consider "open" as a dangerous (even though quite useful) feature in itself, and being able to avoid it as much as possible is an advantage. In particular, the "out of scope" styles permit nice library APIs, like:

  type flags = {visible: bool; enabled: bool}
  type mode = Foo | Bar
  val do_something: flags:flags -> mode:mode -> ...

...

  TheLib.do_something ~flags:{visible=true; enabled=false} ~mode:Foo ...

If we disallow the "out of scope" style, either library authors will be encouraged to use structural types (and then we don't gain any readability), or users will use "open" more often. The local open mitigates the problems with "open", but it is not a panacea.

(0009213)
yminsky (reporter)
2013-04-25 18:44

I do agree with Alain on this point. The whole approach of doing constructor disambiguation that is based on having multiple conflicting constructors dumped into the same scope strikes me as an unscaleable mess. Leo's point that this is all very hard to reason about is perhaps right. But without this feature, the whole thing of having constructor disambiguation also strikes me as fairly low-value. From my perspective, it will change how we write programs at Jane Street almost not at all.

And, if this feature doesn't make it, we should figure out a way of doing a constructor-only local open. What I typically want is to get access to the constructors and only the constructors for a given type. There's no clean way of doing this now.
(0009214)
lpw25 (developer)
2013-04-25 18:59
edited on: 2013-04-25 19:27

> This is not really new, the same happens e.g. with objects.

That isn't really the same. Objects are structurally typed and
methods are essentially constants. So for example:

  let f o = o#m + 3

always means the same thing and always has a type. Whereas

  let f r = r.m + 3

only has a type in the presence of a definition of the m label
with type int.

> Also, I guess that proper tooling (based on -annot or -binannot) should mitigate the bad user experience you mention.

Only if the bug has left the correct type information in place,
otherwise they will not be able to help. For instance, changing a
type annotation in one file can cause an "Unbound label" error in
another file, with no easy way to work out which definition the
label originally referred to.

>open TypedAst
>
> let type_check env (e : Ast.t) =
> let loc = e.loc in
> match e.desc with
> | Foo -> {desc=Foo; loc; typ=...}
> | ....
>
>How would you write this kind of code?

Depending on the circumstances, I would use one of:

----
 open Ast
 open TypedAst

 let type_check env (e : Ast.t) =
   let loc = e.loc in
   match e.desc with
   | Foo -> {desc=Foo; loc; typ=...}
   | ....
----
 open TypedAst

 let type_check env (e : Ast.t): t =
   let open module Ast in
   let loc = e.loc in
   match e.desc with
   | Foo -> {desc=Foo; loc; typ=...}
   | ....
----
 open TypedAst

 let type_check env (e : Ast.t) =
   let loc = e.Ast.loc in
   match e.desc with
   | Ast.Foo -> {desc=Foo; loc; typ=...}
   | ....
----

> Also, it we turn warning 40 into an error, people will be strongly encouraged to use "open" more often. I consider "open" as a dangerous (even though quite useful) feature in itself, and being able to avoid it as much as possible is an advantage.

I certainly agree that "open" needs improvements to make it
safer (allowing explicit signatures on them would be a good
start). However, the problems with open should be addressed by
solutions that apply to all values, not to a subset of values
that (to the average user) will appear completely arbitrary.

> In particular, the "out of scope" styles permit nice library APIs, like:
>
> type flags = {visible: bool; enabled: bool}
> type mode = Foo | Bar
> val do_something: flags:flags -> mode:mode -> ...
>
>...
>
> TheLib.do_something ~flags:{visible=true; enabled=false} ~mode:Foo ...

For these APIs what you want is structural types, and I see no
problem with using a polymorphic variant for the "mode" argument
in your example.

For the "flags" argument, you could use one of the following:

  type mode = [`Foo | `Bar];;
  val do_something: visible:bool -> enabled:bool -> mode:mode -> ...
  ...
  TheLib.do_something ~visible:true ~enabled:false ~mode:`Foo ...

or

  type mode = [`Foo | `Bar];;
  type flags = [`Visible | `Enabled];;
  val do_something_else: flags:flags list -> mode:mode -> ...
  ...
  TheLib.do_something ~flags:[`Visible] ~mode:`Foo ...

The second example behaving much like the flags used in C APIs.

However, what your example really shows is a desire for a
light-weight object syntax. That would allow this code:

  type flags = <visible: bool; enabled: bool>
  type mode = [`Foo | `Bar];;
  val do_something: flags:flags -> mode:mode -> ...
  ...
  TheLib.do_something ~flags:{{ visible=true; enabled=false }} ~mode:`Foo ...

(0009215)
lpw25 (developer)
2013-04-25 19:15
edited on: 2013-04-25 19:21

> The whole approach of doing constructor disambiguation that is based on having multiple conflicting constructors dumped into the same scope strikes me as an unscaleable mess.

This is essentially what we do with regular values. The only difference for labels and constructors is that you cannot rename them, hence the need for record disambiguation.

> And, if this feature doesn't make it, we should figure out a way of doing a constructor-only local open. What I typically want is to get access to the constructors and only the constructors for a given type. There's no clean way of doing this now.

I agree, allowing something like:

  open Foo.t

to get access to local access to t and all of its constructors would be very convenient.

In addition, providing:

  open (M: S)

that only adds the elements of M mentioned in the signature S into the environment would make local opens much safer.

(0009217)
garrigue (manager)
2013-04-26 06:35

Leo, I don't understand your point.
The warning is already enabled by default.
It means that using this feature is a deliberate choice.
You may have to take extra care, but this is because you decided to do so.

If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.
Note that it is already pretty easy to obtain this information by adding a type annotation.

On the different points you mention, I see nothing very convincing.
* On your concrete exemple, why couldn't you just get the type of proj?
   This is enough to know everything needed.
   Was the warning disabled? If this is not the case, Ignoring warnings is bad coding.
* "presence of type information" is well-defined if you use -principal; even otherwise it is relatively predictable.
  Moreover this argument is against type-based ambiguity resolution as a whole, it is not directly related with scope.
* it doesn't work with GADT only by choice; it could work if needed
* it doesn't work for exceptions, but exceptions have a different semantics
* we already had the discussion on nominal vs. structural, and personally I don't buy it, but again nobody forces anybody to use this feature
(0009218)
lpw25 (developer)
2013-04-26 10:35

> The warning is already enabled by default.
> It means that using this feature is a deliberate choice.

Programmers tend to just ignore warnings that they don't understand. Which is what had happened in the case I described.

> If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.

This would certainly help. The message should also be clearer.

> On your concrete exemple, why couldn't you just get the type of proj?
> This is enough to know everything needed.

It is unusual to have to solve an "unbound value" error by looking at the types. It did not occur to the programmers in question to even consider this. (It barely occurred to me).

This problem was also exacerbated because the program was being edited by some people using trunk and some others using 4.00.1. The improved error messages from trunk might have made the problem clearer. I think Hongbo mentioned similar problems in another bug report.

> "presence of type information" is well-defined if you use -principal; even
> otherwise it is relatively predictable.

Only to those of us who understand the type system. Currently, we are saying:

  "If your value is a constructor and there is type information available and
   it is not a GADT and it is not an exception, then you don't need to write
   the module name"

I think that most programmers will just approximate this to

  "Sometimes you don't need to write the module name"

> it doesn't work with GADT only by choice; it could work if needed

Then it definitely should. The type information criteria is confusing enough without taking into account the syntax used to define the constructor.
(0009219)
lpw25 (developer)
2013-04-26 11:26
edited on: 2013-04-26 11:30

>> If what you ask for is a better warning (one that tells you where the constructor comes from for instance), I think this is perfectly doable.

> This would certainly help. The message should also be clearer.

How about something like:

"Constructor M.Foo is not available as Foo in the current environment"

or

"Constructor M.Foo is not available as Foo in the current scope"

(0009220)
garrigue (manager)
2013-04-26 15:13

I improved the message in commit 13612.

  Warning 40: x was selected from type M.t.
  It is not visible in the current scope, and will not
  be selected if the type becomes unknown.

Note that if you do not believe in using type information,
you might want to enable warning 42 too.
(0009238)
hcarty (reporter)
2013-05-01 21:58

It feels surprising that a value of Types.Project.project can not be created without type annotations and/or explicit pathing of the record fields but one can sometimes access record fields with no annotation or pathing.

The reasoning makes sense but the rules seem difficult to explain, particularly to someone new to the language:

  module M = struct type t = { x : float; y : float } end;;

  (* pre-4.01.0 method which still works *)
  let p_pathed_ok = { M.x = 1.0; y = 1.0 };;
  (* Does not work, no type information available *)
  let p_not_ok = { x = 1.0; y = 1.0 };;
  (* Specifying the type allows this definition to work *)
  let p_annotation_ok : M.t = { x = 1.0; y = 1.0 };;

  (* Accessing p_*_ok.x is fine here because we know the type already... *)
  print_float p_annotation_ok.x;;

  (* ... but this does not work (no type information) even though it
     looks very similar to the 'print_float' line above... *)
  let f_not_ok p = p.x;;
  (* ... except that we were missing an annotation for this 'p' *)
  let f_annotation_ok (p : M.t) = p.x;;
(0009239)
hcarty (reporter)
2013-05-01 22:03

This example is more surprising:

  module M = struct type t = { x : float; y : float } end;;
  type t = { x : float; y : float };;

  (* OK - val f : M.t -> float *)
  let f p = p.M.x +. p.y;;

  (* Not OK -
     Error: The field M.y belongs to the record type M.t
     but a field was expected belonging to the record type t *)
  let f p = p.x +. p.M.y;;
(0009240)
gasche (developer)
2013-05-02 14:36

This example is the sign of the ambivalent situation we are in.

- if out-of-scope label access was not possible, there would be no such confusion
- if out-of-scope label access is considered the correct, usual case (without the warning 40), then "p.x" in the second program below is ambiguous, and the type-checker should at least warn when selecting the candidate "t"; the presence of this warning would make it clear what the problem is
- currently out-of-scope label access has an intermediary status, so it doesn't really makes sense to mark an ambiguity when there is only one choice in scope (and more outside), and that would result in too much warnings to pay for a feature that the user may not use

I would personally support marking the out-of-scope access warning (40) as an error by default, and enlarging the set of ambiguity candidates (for warning 41) with out-of-scope candidates when warning 40 is not an error.
(0009241)
garrigue (manager)
2013-05-02 17:07
edited on: 2013-05-02 17:07

Just for you information:

$ ocaml -principal
....
# let f p = p.M.x +. p.y;;
Warning 40: y was selected from type M.t.
It is not visible in the current scope, and will not
be selected if the type becomes unknown.
Warning 18: this type-based field disambiguation is not principal.
val f : M.t -> float = <fun>

I don't see that much asymmetry between the two cases...

I may be repeating myself, but ignoring warnings, even for beginners,
is not the right attitude. I hope that the new message is self explaining.
Warnings are not errors because they are warnings. This does not mean
you should ignore them.

If you really want to use labels out of scope, the right options are
  ocaml -principal -w -40 -warn-error +18


Your other comment about hard to understand type propagation makes
sense (yet your examples don't look hard to understand).
But then the only solution is to disable all use of inferred type
information. All you examples can be rewritten without out of scope
access, and exhibiting exactly the same problem.

type t = { x : float; y : float };;
type u = { x : float; y : float; z:float };;

let f (p:t) = p.x;;

Does this mean that warning 42 should be enabled by default?
Or something limited to type disambiguation (42 also covers field set
disambiguation) ? This could be useful to remind that -principal
is needed for a precise definition.

(0009293)
doligez (administrator)
2013-05-17 16:07

Jacques,

Does all this mean that we should have -principal as the default? And warning 40 as an error by default?

As for Gabriel's suggestion of making the behaviour of the type-checker dependent on whether a warning is error-enabled, that will be over my dead body. If you want to do something like that, make up a new option.
(0009644)
doligez (administrator)
2013-06-28 19:54

A side question for Jacques about -principal:

  # fun b -> if b then format_of_string "x" else "y";;
  - : bool -> ('a, 'b, 'c, 'd, 'd, 'a) format6 = <fun>
  # fun b -> if b then "x" else format_of_string "y";;
  Error: This expression has type ('a, 'b, 'c, 'd, 'd, 'a) format6
         but an expression was expected of type string

The typing of string constants is not principal because of the overloading with formats. Is there an easy way to make the compiler output a warning for the first expression with -principal?
(0009786)
lpw25 (developer)
2013-07-16 11:41

> And warning 40 as an error by default?

The much improved wording of the warning is probably sufficient.

I still have some concerns about using constructors from files that are not explicitly referenced. Alain mentioned that Lexifi's version of OCaml does not allow an ml file to have an unmentioned module as one of its imports. Maybe we could provide a warning for such cases.
(0009821)
frisch (developer)
2013-07-22 12:38

> Alain mentioned that Lexifi's version of OCaml does not allow an ml file to have an unmentioned module as one of its imports. Maybe we could provide a warning for such cases.

A warning detecting exactly those cases where our modification would fail is difficult to do. In many cases, disallowing a .cmi to be loaded (because its module is not mentioned in the source code being compiled) does not lead to an error: the compiler tries to expand a type abbreviation and since the .cmi cannot be loaded, it considers it as an abstract type, but this works fine (the code doesn't actually require the expansion to be type-checked). This is not to say that the warning would not be useful as well...
(0011192)
lpw25 (developer)
2014-04-03 01:28

I think my original objections were satisfied by Jacques rewording of the warning message, so I'm closing this issue.
(0011194)
garrigue (manager)
2014-04-03 04:57

> The typing of string constants is not principal because of the overloading with formats. Is there an easy way to make the compiler output a warning for the first expression with -principal?

Yes, basically we just need to check that the expected type "format6" is generalizable.
This is done in trunk at revision 14523.

# fun b -> if b then format_of_string "x" else "y";;
Warning 18: this coercion to format6 is not principal.

Sorry for not answering earlier.

As for how much -principal is important, the performance obstacle seems the main problem.
Alain tells me that he couldn't use it on the Lexifi code base because it was too slow.
In general I would say that you don't need to use it, but if you start to get unstable behavior in you code, it can help you to make it more robust, and give you a better intuition of where annotations are needed.

- Issue History
Date Modified Username Field Change
2013-04-25 16:35 lpw25 New Issue
2013-04-25 16:36 lpw25 Relationship added related to 0005759
2013-04-25 16:37 lpw25 Description Updated View Revisions
2013-04-25 17:27 frisch Note Added: 0009210
2013-04-25 17:41 frisch Note Edited: 0009210 View Revisions
2013-04-25 18:44 yminsky Note Added: 0009213
2013-04-25 18:59 lpw25 Note Added: 0009214
2013-04-25 19:15 lpw25 Note Added: 0009215
2013-04-25 19:16 lpw25 Note Edited: 0009215 View Revisions
2013-04-25 19:21 lpw25 Note Edited: 0009215 View Revisions
2013-04-25 19:27 lpw25 Note Edited: 0009214 View Revisions
2013-04-25 19:38 lpw25 Description Updated View Revisions
2013-04-26 06:35 garrigue Note Added: 0009217
2013-04-26 10:35 lpw25 Note Added: 0009218
2013-04-26 11:26 lpw25 Note Added: 0009219
2013-04-26 11:27 lpw25 Note Edited: 0009219 View Revisions
2013-04-26 11:30 lpw25 Note Edited: 0009219 View Revisions
2013-04-26 15:13 garrigue Note Added: 0009220
2013-05-01 21:58 hcarty Note Added: 0009238
2013-05-01 22:03 hcarty Note Added: 0009239
2013-05-02 14:36 gasche Note Added: 0009240
2013-05-02 17:07 garrigue Note Added: 0009241
2013-05-02 17:07 garrigue Note Edited: 0009241 View Revisions
2013-05-17 16:07 doligez Note Added: 0009293
2013-06-28 19:54 doligez Note Added: 0009644
2013-06-28 19:54 doligez Status new => feedback
2013-07-16 11:41 lpw25 Note Added: 0009786
2013-07-16 11:41 lpw25 Status feedback => new
2013-07-16 11:44 doligez Assigned To => doligez
2013-07-16 11:44 doligez Status new => acknowledged
2013-07-17 16:05 doligez Assigned To doligez =>
2013-07-22 12:38 frisch Note Added: 0009821
2013-07-22 12:42 frisch Target Version 4.01.0+dev => 4.02.0+dev
2014-04-03 01:28 lpw25 Note Added: 0011192
2014-04-03 01:29 lpw25 Status acknowledged => resolved
2014-04-03 01:29 lpw25 Resolution open => fixed
2014-04-03 01:29 lpw25 Assigned To => lpw25
2014-04-03 04:57 garrigue Note Added: 0011194


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker