Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0007857OCamltypingpublic2018-09-29 15:002018-10-04 05:23
Assigned To 
PlatformAllOSAllOS VersionAll
Product Version4.06.1 
Target VersionFixed in Version 
Summary0007857: Incomplete Abbreviations not allowed
DescriptionFor recursive polymorphic types incomplete abbreviations are rejected even when textual copy/paste works. For example:

type t = [`A of t | `B of e]
and e = [t | `C of t | `D of e | `E]

fails even though this works:

type t = [`A of t | `B of e]
and e = [`A of t | `B of e | `C of t | `D of e | `E]

I happen to have about 20 mutually recursive types and the t and e cases have 20 and 40 constructors or so, and had to use the copy and paste method.

Additional InformationI have no idea if allowing incomplete abbreviations in *all* such cases makes sense but in those cases it does, it would be nice if the type system could handle it to avoid fragile textual cut and paste.

The type abbreviation *can* be used as the argument of a constructor even though it is not complete.

Although this problem can be solved with open recursion it is too hard when there are a lot of types involved (20 types would need to become 20 types with 20 parameters and then you'd need 20 fixations, which is more work than copy/paste).
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
gasche (administrator)
2018-09-29 15:09

Jacques, do you know what the restriction is there for soundness? If this is simply a productivity restriction (all cycles through the definitions have to be guarded by at least one constructor), I would be curious to maybe look at implementing this.
skaller (reporter)
2018-09-30 00:39

Cyclic dependence can't be allowed though:

type a = [`A | b] and b = [`B | a]

Also, if the types involved are parametric, a different kind
of cycle might be present:

type a = [`A | a t] (* some type t with parameter *)
garrigue (manager)
2018-10-01 03:57

The restriction is not for soundness, but removing it would introduce some symmetry issue.
I.e., should we only allow including definitions occurring before (while these are recursive definitions), or do something more clever (topological sorting)?
Note that first behavior coincides with what is done for class types (and even classes?)
IIRC, this issue was discussed when I introduced this feature, and the consensus among developers was that it was better just to forbid it. In particular some people were insistent that the behavior with class types was _not_ a good reason to do it for polymorphic variants.

On simple cases, the work around is the same as with combinations of class types and normal types: make the definition you want to include parametric in those that will use it.

type 'e t_ = [`A of 'e t_ | `B of 'e]
type t = e t
and e = [e t_ | `C of t | `D of e | `E]

But of course, depending on the mutual dependencies, this may be harder to write.
skaller (reporter)
2018-10-02 12:05

Unfortunately the workaround is not always viable.
I actually looked at doing this in my particular code.

For reference here is the file: [^]

and the type which is "copied" is typecode_t, lines 65 to 127, copied into expr_t, lines 145 to 330. It is, I think, possible in this case but cut and paste was just a lot easier and more reliable.

In my language Felix everything is automatically mutually recursive and polymorphic variants in Felix can use abbreviations of any polymorphic type. The compiler simply keeps a trail, and bombs out if it finds a circularity.

In C++, member functions can call others that aren't defined yet, the compiler does a pass to find the interfaces first, then a second pass. A recursive type definition can be handled the same way. In fact probably one pass is enough if you use back-patching. C has no problem with incomplete types, its embarrassing that C does recursion better than Ocaml!
gasche (administrator)
2018-10-02 12:21

Jacques: my proposal was to allow mutually-recursive declarations as long as cycles between the type names are guarded by ocurring under a constructor (which guarantees that the definition is "productive" in the usual sense, that there exists a unique infinite unfolding). So

  type t = [u | ...]
   and u = [t | ...]

would be rejected, but

  type t = [u | ...]
   and u = [`Foo of (.. t ...) | ...]

would be accepted.

I think that we don't need a full topological sort to decide this, but we do need to check that the graph of the "inherits from" relation between the recursive definitions is acyclic. We can also do this for classes.
garrigue (manager)
2018-10-02 14:44

Actually you need a full topological sort, because polymorphic variant expansion is a purely syntactic feature: expansion is done immediately when translating the type, so the type you include must be already defined. Same problem for classes.
gasche (administrator)
2018-10-02 15:16

I see.

If I understand correctly:

- The expansion you mention is done in the Rinherit case of the reference to the inherited type is kept (goes into ctype_desc), but the global/inlined set of fields/variants is computed at the same time (goes into ctype_type).

- The way recursive declaration blocks are handled by Typedecl.transl_type_decl is to first add "unitialized" bindings for the block type names in the typing environment, whose definitions are fresh unification variables, compute "real" type-checked declarations in this environment (so (non-dereferencing) references to mutually-defined types are accepted), and backpatch the typing environment with these a-posteriori computations.

The two-phase approach of typedecl wouldn't quite work here without a topological sort, because Rinherit typing would encounter an inherited type name whose fields are not yet known, so their expansion could not be precomputed.

Conceptually might be possible to extend to a three-phase approach (delaying the computation of the final list of fields in ctype_type to after all the ctype_type have been computed in the temporary environment), but this sounds like an invasive change to the current approach, while a second phase in topological order suffices and would be easier to implement (if we do it right, it may not even change the actual traversal order in all currently-accepted cases).
skaller (reporter)
2018-10-04 05:23

As an added issue (heh): if you're "doing it for polymorphic variants" and "doing it for classes" can't that be included to "do it for mixed types and classes"?

Its the same kind of problem: lack of recursion. Sometimes a term algebra needs to include class types as arguments, and the classes have methods with terms of the algebra as arguments.

Also .. I am wondering if there is a syntax which "automates" open recursion. Consider you have 10 mutually recursive types, then to open them each needs 10 parameters. Then you need to close them in a recursion with each taking 10 arguments. Fine. But, suppose you want to change something, add a type or add a constructor .. thats a LOT of editing to get right. Especially if you're adding a 2 constructors, 2 new types, and modifying some things. And that's just the types. If you have open/closed functions as well they have to all be edited as well. What i'm getting at is that if there is a systematic way to take a bunch of recursive types and open/close them .. then there has to be an automatic algorithm that does it as well (if only we can dream up syntax for it).

- Issue History
Date Modified Username Field Change
2018-09-29 15:00 skaller New Issue
2018-09-29 15:09 gasche Note Added: 0019393
2018-09-30 00:39 skaller Note Added: 0019394
2018-10-01 03:57 garrigue Note Added: 0019395
2018-10-02 12:05 skaller Note Added: 0019396
2018-10-02 12:21 gasche Note Added: 0019397
2018-10-02 14:44 garrigue Note Added: 0019398
2018-10-02 15:16 gasche Note Added: 0019399
2018-10-04 05:23 skaller Note Added: 0019401

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker