Re: Constructive criticism

Andrew Conway (arc@labri.u-bordeaux.fr)
Wed, 30 Aug 1995 15:07:34 +0200

Message-Id: <9508301307.AA27653@arazek>
From: Andrew Conway <arc@labri.u-bordeaux.fr>
To: Pierre Weis <Pierre.Weis@inria.fr>
Subject: Re: Constructive criticism
In-Reply-To: Your message of "Tue, 29 Aug 1995 20:08:11 +0200."
<199508291808.UAA11790@pauillac.inria.fr>
Date: Wed, 30 Aug 1995 15:07:34 +0200

>Thank you very much for the bugs report, we'll correct them as soon as
>possible.
>
>It's a pleasure to answer to your numerous and so constructive suggestions:

Thank you for your very extensive answers.

>Patterns in fun matches:
>------------------------
>> - The following error seems difficult to understand:
>>
>> #let testit = fun | 4 -> 2 | 1 | 2 -> 3;;
>> Toplevel input:
>> >let testit = fun | 4 -> 2 | 1 | 2 -> 3;;
>> > ^
>> Syntax error.
>> #
>
>> Why is this OK with "function" but not with "fun"? It is
>> not possible to match something to the symbol "|" is it?
>
>The rule of thumb is that complex patterns have to be bracketed when
>used inside ``fun'' pattern matchings. And since ``or'' patterns, that
>is patterns separated by the or symbol ``|'', are not considered basic.
>
>Technically, this strange treatment of patterns is mainly due to Yacc
>restrictions, and partly due to real syntactic ambiguities, since fun
>patterns may bind several parameters (when function patterns only bind
>one). Consider the pattern ``C D'', where C and D are constructors.
>This patterns is syntactically ambiguous: is it the application of
>constructor C to constructor D, or two successive parameters,
>parameter C followed by parameter D ? In a function pattern the first
>interpretation is used, in a fun pattern the second one is assumed
>(hence, when you need the first meaning in a fun pattern you must
>bracket the application of constructor C to D, writing (C D)).

I appreciated the reason for the existance of "function", I
just hadn't realised that or patterns are not considered basic.

>Library considerations:
>-----------------------

(
Important comment --- I realise that the implementations I
gave for the routines below are far from optimal, and just
provided them as a definition for exactly what I meant the
routine to do, such as

>> let list_of_n n x = list_of_vect (make_vect n x);;

whose operation is more transparent than a more efficient
version. I understand the dislike of increasing the core
library...it is mainly the assymetries that I missed...I would
see that there existed map for lists and map for vects, and assumed
that the other things would carry over. Similarly for compy_vect
and copy_string.
)

The examples you gave of similar functions to mine --- were they
from the caml 3.1 library? Can this library be used with caml-light?

>Packed arrays:
>--------------
>
>>Probably futile wish list for implementors with infinite amounts of time.
>
>> - Packed arrays of enumerated (sum) types with only a small number of
>+
>> possibilities and with no arguements -- like boolean.
>
>Why do you want them packed ? You can use vectors. I agree that in
>some cases vectors waste memory locations, in particular for caracters
>(but Caml provides strings).
>In some cases, you can consider to use bit_vectors (I can send you such a
>library if you want), otherwise you can use integers and logical
>shift operations as in C, but this is very low-level. The best
>solution is probably to consider these packed arrays as a compiler
>optimisation, but is it worth the work it needs ?

Unfortunately it is worth the work. One of my main applications spends most
of its time testing, compying, manipulating, etc. some arrays of up to
about 30 terms where each element is 1 bit --- perfect for using integers,
especially since some of the most frequent operations naturally become bit
shifts and logical operations. Unfortunately, I now have to change this
to two bits per element, which means they will no longer for into an
integer...and jumping up to 30 element arrays (120 bytes) is a little more
inefficient than I would like. Especially since I have to store many
of them permanently and memory use (70MB or so) is becoming a problem.

I would appreciate the bit_vector library thanks.

>Unary minus:
------------
> [ there is a really tough syntactical problem ]

I understand this problem. I cannot suggest a good solution. I'm
just commenting that it is an annoyance for me (which will probably
go away with practice).

>Extended pattern matching:
>--------------------------
>
>> - I would like to be able to use a symbol twice in
>> pattern matching. For instance,
>
>> let compare = fun
>> | x x -> 0
>> | x y -> if x>y then 1 else -1
>> ;;
>
>You can encode this using a guard (a ``when'' side-condition) in your
>pattern matching clause:
>let compare = function
> | (x, y) when x = y -> 0
> | (x, y) -> if x > y then 1 else -1;;
>compare : 'a * 'a -> int = <fun>
>
>> I realise that this involves (in some cases) a generalised
>> comparison which is not always possible for functions, but a
>> generalised "=" is used elsewhere, and the exception for functional
>> values seems like a reasonable solution.
>
>You're right. It is conceivable to add this kind of ``non-linear''
>patterns to Caml: the compiler just has to generate the right ``when''
>conditions. This may be done some day. This was not done to maintain a
>good property of Caml pattern matching: pattern matching selection
>is fast, and in particular it cannot loop for ever. In the presence of
>cyclic data, the equality test generated by non linear patterns may
>need an unpredictable amount of computation to complete, or may even
>loop for ever.
>
>In fact we prefered to implement guards rather non linear patterns,
>since guards are more general, allowing to test not only variable
>equalities but arbitrary boolean conditions (in particular they allow
>user defined equality rather than the built-in ad hoc equality).

I realise that guards are more general, and I use them extensively.
It is just that testing for equality in a guard is less readable
(in my opinion --- I realise that it forces you to notice the
duplication) than having it in the pattern. I.e.

>> | x x -> 0

is more readable than

> | (x, y) when x = y -> 0

By the way: I apologise if I seem like I am complaining... I
know that such an implementation takes time and has other tradeoffs,
I'm just putting in my opinion for what should take precedence
in future work.

>Continue in patterns:
>---------------------

> [ It exists in Caml V3.1 ]

>Generalized orpats:
> -------------------

> [ It exists in Caml V3.1 ]

>Definitions of record values:
>-----------------------------
>
>> - I would like to be able to define a new structure instance with
>> a declaration like { x ; Field3=5 ; Field7 = 9 } which would make a
>> copy of the structure x, whilst changing the two given fields to th
>+ e
>> appropriate values. Why? So I can write a program that has lots of
>> things that it will eventually do with slight changes to a structur
>+ e,
>> and I don't have to go through and change routines that are
>> irrelevant to the change.
>
>This a well known problem: allocation of record values is much more
>messy than allocation of sum type values. As you mentioned, this is
>particularly evident when you have to modify a few part of a given
>record.

>However, in Caml you get the capability to treat the fields that have to be
>modified in an imperative way: you just declare those fields ``mutable''
>when defining the record type. This way you share the record and its
>modified version, and just have to change what have to be modified.
>Your example would be something like:
>r -> r.Field3 <- 5; r.Field7 <- 9; r
>
>Unfortunately, when you need a new (fresh) data structure, this
>solution does not work.
>Some partial solutions have been proposed: for instance Caml V3.1 provides a
>short-hand to define the content of a record field: when no expression
>is associated to a field name then a variable expression equal to the
>field name is assumed.
>This way { x; y} is read as {x = x; y = y}. Your example is then a bit
>simplified
>{ Field1; Field2; Field3; Field4; Field5; Field6; Field7 } ->
>{ Field1; Field2; Field3 = 5 ; Field4; Field5; Field6; Field7 = 9 }
>
>Unfortunately, this does not solve your maintenance problem.
>By contrast, your generalised ellipse is an exiting alternative, since
>it allows an easy modification of allocating routines.

I ended up using a list of sum type values, and had specialised
routines that went through and extracted/modified/whatever the
particular value. For the specific application I was using, there
were only 2 or 3 or so terms in it in any specific compilation, so
efficiency was not a big problem here. I considered making everything
mutable and then making a mutable_structure_copy function (perhaps in C), but
that seemed rather ugly.

>Overloading:
>------------
>>Really grasping at straws:
>
>> - I would like to be able to perform "function overloading" as
>> in C++. Furthurmore, greedy person that I am, I would like this
>> overloaded-ability to automagically be passed through into
>> functions that use the overloaded function ...
>
>This is really difficult to provide, and is a long-time research problem.
>However, there exists a restricted form of overloading in Caml V3.1
>that we called ``static overloading''.
>
>We recently made progress on that point and proposed new type systems
>and new compilation schemes to provide a general overloading facility
>(see the article ``Extensional Polymorphism'' in POPL 95).
>This work still has to be fully incorporated into a working Caml system.

Great!

Thanks for your comments,

Andrew.