Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0007783OCamldocumentationpublic2018-04-23 19:032018-04-29 16:18
Reporterpmetzger 
Assigned Tooctachron 
PrioritynormalSeverityminorReproducibilityN/A
StatusassignedResolutionopen 
PlatformN/AOSN/AOS VersionN/A
Product Version4.06.1 
Target VersionFixed in Version 
Summary0007783: Better explanation of multi-argument vs. tuple constructors in sum types
DescriptionThe manual does not sufficiently explain a subtle point. Consider the following declaration:

type foo =
| A of int*int
| B of (int*int)

Although these look similar, A and B are quite different. A is a constructor taking two arguments, even though it looks like it takes a tuple, while B of course takes a tuple.

This means that the two cannot be used interchangeably and in fact have different enough meanings to matter in a lot of contexts.

This is confusing to a beginner because it would seem like the parentheses are just meaningless precedence grouping, but they aren’t, they’re syntactically quite significant, and it would seem like the * in the A case means “tuple” but it does not.

(Note that I myself didn't understand this distinction. The manual does not make it particularly clear. That is why I am filing this bug report. A discussion on the discord chat channel seemed to reveal that _many_ beginners were confused about this.)

It would be good to make this _very_ explicit in the manual.
Steps To Reproduce1. Try learning OCaml
2. Become confused about multi-argument constructors vs. tuple constructors in sum types.
TagsNo tags attached.
Attached Files

- Relationships
related to 0005883closedgarrigue BNF insufficient for S (3, 4) 

-  Notes
(0019044)
pmetzger (reporter)
2018-04-23 19:19

BTW, during discussion, what cemented the distinction in my mind was someone noting that, if constructors had function types (as in Haskell), then:

A: int -> int -> foo
while
B: (int * int) -> foo

This made the difference much clearer to me. (The use of * in the A case is unfortunate I think, but that decision was made decades ago.)
(0019045)
sampo (reporter)
2018-04-24 12:44

OCaml provides pairs, triples, quadruples and so forth. For n >= 2, the collection of n values is called an n-tuple, or just a tuple. The tuple whose components are x1,x2,...,xn is written (x1,x2,...,xn). Such a value is created by an expression of the form (E1,E2,...,En). With functions, tuples give the effect of multiple arguments and results.
#type foo =
 | A of int * int
 | B of (int * int);;
type foo = A of int * int | B of (int * int)
# A(1,2);;
- : foo = A (1, 2)
# B(1,2);;
- : foo = B (1, 2)
#
# let average(x,y) = (x+y) / 2;;
val average : int * int -> int = <fun>

This writting is also the same:
val average : (int * int) -> int = <fun>

I hope this will help.
(0019046)
pmetzger (reporter)
2018-04-24 14:53
edited on: 2018-04-24 15:20

Sampo, I'm not sure what you were suggesting? The point is that there's a difference between a constructor that takes multiple values and a constructor that takes a single tuple, but the syntax is so similar that beginners have problems with this, and it is important to document the difference carefully. I am not talking about a misunderstanding that I myself am experiencing, I'm suggesting that documentation on this subtlety be improved. If you were proposing documentation, I'm not sure that really explains it well enough.

(0019047)
pmetzger (reporter)
2018-04-24 15:22

Here is some possible documentation that could be added in section 1.4 of the manual ("Records and Variants"):

Note that although the syntax is almost the same, there is a significant difference between a variant constructor that takes multiple arguments and one that takes a single tuple as an argument. Consider the following type:

type foo =
| A of int*int
| B of (int*int)

Although the syntax for A and B seems nearly the same, the parentheses are quite significant. A is a constructor that takes two arguments, both of which are integer typed values, while B is a constructor that takes a single argument, a tuple of two integers. These behave quite differently in many circumstances, such as when pattern matching. Be careful not to confuse them.
(0019049)
sampo (reporter)
2018-04-24 17:43

@pmetzger.
A is a constructor taking a tuple, not arguments. This is a "Cartesian product". The values themselves can be grouped in pairs or even in tuples.
int * int is a pair or a tuple likewise (int * int). There is no difference.
However, external parentheses are not obligatory. The "Cartesian product" is denoted by the symbol *.
type such as these:
t1 * t2 * t3
(t1 * t2) * t3
t1 * (t2 * t3)
The first indicates a set of triples.
The second indicates a set of pairs where the first component is also a pair.
The third indicates a set of pairs where the second component is also a pair.
(0019050)
octachron (developer)
2018-04-24 17:54

@sampo, I am afraid that you are mistaken. Constructor arguments are not packed in a tuple by default: there are differences in both behaviors in pattern matching and memory representations between type t = A of int * int and type s = B of (int * int). I recommand playing with caml-inspect and reading the memory representation chapter of RWO to learn about the difference in memory representation.
(0019051)
sampo (reporter)
2018-04-24 18:50

@octachron,a pair or a 2-tuple will always remain a pair or a 2-tuple, with or without parentheses and this has existed since the birth of CAML and continues with Ocaml. This is a fact.
The representation in memory is another matter.
(0019052)
trefis (manager)
2018-04-24 19:31

No.

# type t = A of int * string;;
type t = A of int * string
# let x = (3, "haha");;
val x : int * string = (3, "haha")
# A x;;
Characters 0-3:
  A x;;
  ^^^
Error: The constructor A expects 2 argument(s),
       but is applied here to 1 argument(s)
(0019053)
pmetzger (reporter)
2018-04-24 19:44

Sampo: apparently, the documentation update I'm requesting is even more necessary than I had naively expected.
(0019054)
sampo (reporter)
2018-04-24 21:31

@trefis, yes, this is an example that shows the power of Ocaml typing. But the two expected arguments have a name and are called a pair or a 2-tuple. And that makes the difference that apparently you do not discern. (respectfully speaking) As I said above, a type int * int or (int * int) is the same and is a component value written in parenthesis like this (1,2), a comma serves as the separator between elements. In this case (your example) the argument is not a value and another value but it is a pair or a 2-tuple that is expected. Reading the error that the compiler gives can cause confusion in the understanding, and apparently that's
what he does.
(0019055)
pmetzger (reporter)
2018-04-24 21:59
edited on: 2018-04-24 21:59

Sampo:

Abusing notation (since unlike Haskell, constructors don't have types):

The type of "A" in "type t = A of int * int" would be "A: int -> int -> int"
The type of "B" in "type s = B of (int * int)" would be "B: (int * int) -> int"

These are not the same. The "*" in the first case is rather unfortunate notation, but we're stuck with it.

This is why I think this really needs to be documented. It confuses people, even quite smart people, and nothing in the documentation really explains it.

(0019056)
pmetzger (reporter)
2018-04-24 23:48

(And again note that if you feed a tuple to A, it won't work it is expecting two arguments, but if you feed a tuple to B it will. They are _not_ the same.)
(0019058)
sampo (reporter)
2018-04-25 11:37

@pmetzger
This is a matter of writing. Let's go back to the past, at the time of CAML.
Write these examples in the Caml top level.
#type t = A of int * int;;
Le type t est défini.
#type s = B of (int * int);;
Le type s est défini.
As you can see there is no signature representing the types with or without parenthesis. Only an acknowledgment of receipt. Then write:
>A = B;;
> ^
Cette expression est de type int * int -> s,
mais est utilisée avec le type int * int -> t.
#B = A;;
Entrée interactive:
>B = A;;
> ^
Cette expression est de type int * int -> t,
mais est utilisée avec le type int * int -> s.
The corresponding signature is int * int, no parenthesis.

Come back to today, with Ocaml write those same sentences in the top level:
# type t = A of int * int;;
type t = A of int * int
# type s = B of (int * int);;
type s = B of (int * int)
The writing (the answer) of the interpreter is different. Then write:
# A = B;;
Characters 0-1:
  A = B;;
  ^
Error: The constructor A expects 2 argument(s),
       but is applied here to 0 argument(s)
# B = A;;
Characters 0-1:
  B = A;;
  ^
Error: The constructor B expects 1 argument(s),
       but is applied here to 0 argument(s)
#
As you can see, the writing (the answer) of the interpreter is different.
However, it's still the same thing in all these past years.

Perhaps, to summarize this situation, it is necessary to simply improve the error message written in the top level, for a better comprehension?
(0019060)
octachron (developer)
2018-04-25 11:59

Sampo, you are getting confused by CAML promotion of constructor to function when they are partially applied. I assure you that constructors do have an arity in OCaml, as stated by the compiler error messages and the reference manual(e.g. http://caml.inria.fr/pub/docs/manual-ocaml/expr.html#sec146 [^]).
(0019066)
hhugo (reporter)
2018-04-27 00:33

Would it help to make the parallel with inline record ?
(0019067)
pmetzger (reporter)
2018-04-27 00:44

See: https://discuss.ocaml.org/t/sum-type-constructor-declaration-subtlety-and-documentation-wanted/1898 [^] for some proposed language.
(0019068)
sampo (reporter)
2018-04-27 11:43

I read the debate with particular attention. Many agree and apparently I will be the only one who does not agree.
In fact, I understand your reasoning at all, but I think that in this particular situation you have to take a solid foundation, a base, that is the Cartesian product. And a Cartesian product always indicates an n-tuple, composed of argument(s), singleton, 2-tuples, 3-tuples,..., n-tuples with or without parenthesis.
So to resume the example of bobbypriambodo I will say that:
type account =
 | Facebook(string, int) /* not 2 arguments but a 2-tuple or pair*/;
type account2 =
 | Instagram((string, int)) /* not 1 argument -not happens to be a 2-tuple but also a 2-tuple or pair*/;
This is my point of view and I do not get confused.
(0019080)
sampo (reporter)
2018-04-29 11:31

@octachron. I would like to say one more thing before you change the documentation. I resumed the example of trefis with a small change.
# type bar = int * string;;
type bar = int * string
# type t = A of bar;;
type t = A of bar
# let x = (3, "haha");;
val x : int * string = (3, "haha")
# A x;;
- : t = A (3, "haha")
Here, below, another example that shows some difficulty in calculating the number of arguments.
# let addpoint ((x1,y1), (x2,y2)) = (x1+.x2, y1+.y2);;
val addpoint : (float * float) * (float * float) -> float * float = <fun>
This function take a pair of points. Its arguments pattern is a pair of pairs.
Look again at the argument pattern of addpoint. We may equivalently view this function as taking:
- one argument: a pair of pairs of float numbers
- two arguments: each a pair of float numbers
- four arguments: all float numbers, oddly grouped
There is a probability that the compiler is mistaken in calculating the arity of the expressions given to it in a constructor with or without parentheses, between an argument and two arguments like the example above.

- Issue History
Date Modified Username Field Change
2018-04-23 19:03 pmetzger New Issue
2018-04-23 19:19 pmetzger Note Added: 0019044
2018-04-24 12:44 sampo Note Added: 0019045
2018-04-24 14:53 pmetzger Note Added: 0019046
2018-04-24 15:20 pmetzger Note Edited: 0019046 View Revisions
2018-04-24 15:22 pmetzger Note Added: 0019047
2018-04-24 16:14 octachron Assigned To => octachron
2018-04-24 16:14 octachron Status new => assigned
2018-04-24 17:43 sampo Note Added: 0019049
2018-04-24 17:54 octachron Note Added: 0019050
2018-04-24 18:50 sampo Note Added: 0019051
2018-04-24 19:31 trefis Note Added: 0019052
2018-04-24 19:44 pmetzger Note Added: 0019053
2018-04-24 21:31 sampo Note Added: 0019054
2018-04-24 21:59 pmetzger Note Added: 0019055
2018-04-24 21:59 pmetzger Note Edited: 0019055 View Revisions
2018-04-24 23:48 pmetzger Note Added: 0019056
2018-04-24 23:50 octachron Relationship added related to 0005883
2018-04-25 11:37 sampo Note Added: 0019058
2018-04-25 11:59 octachron Note Added: 0019060
2018-04-27 00:33 hhugo Note Added: 0019066
2018-04-27 00:44 pmetzger Note Added: 0019067
2018-04-27 11:43 sampo Note Added: 0019068
2018-04-29 11:31 sampo Note Added: 0019080


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker