array operations on abstract types are never specialized aka "Information Hiding Considered Harmful to Performance" #7442

vicuna · 2016-12-26T02:46:53Z

Original bug ID: 7442
Reporter: markghayden
Status: acknowledged (set by @xavierleroy on 2017-01-14T15:25:26Z)
Resolution: open
Priority: normal
Severity: feature
Platform: AMD64
OS: MacOS
OS Version: 10.12.2
Version: 4.04.0
Target version: later
Category: middle end (typedtree to clambda)
Has duplicate: #7440 #7441
Monitored by: @dbuenzli

Bug description

When you create an array of a abstract but concrete type, array operations on the type are never specialized. For instance see AbsInt2 below;

module AbsInt2 :
sig
type t val zero : t
val add_int : int -> t -> int

val array_length : t array -> int
val array_get : t array -> int -> t
end = struct
type t = int
let zero = 0
let add_int = (+)
let array_length a = Array.length (a:t array)
let array_get a i = Array.unsafe_get (a:t array) i
end

If you use Array.get to access an array of type AbsInt.t then the compiled code will check for whether the array is a float array (even though the type t is internally "int") and will compile code to allocate the boxed float if the array is a float array. On the other hand, using AbsInt2.array_get will be specialized as operations on an int array, avoiding the need to check for a float array. This is even though the operations are in the same compilation unit and the compiler has knowledge of the exact type. Using information hiding in the type system should not prevent optimizations by the compiler.

Note that the need to check for float array in the compiled code presumably has follow-on effects in the compiler, since the resulting code is larger I'd guess that this becomes less likely to be inlined -- even though much of it is "dead code".

Steps to reproduce

Compile the attached file with -O3 -unbox-closures with 4.05.0 compiler and review the compiled assembly code. In includes variations on the AbsInt module above. You can expose the internal type as int ("type t =int" in the signature. Or make use of specialized "array_length" functions to avoid the "float array" checks.

File attachments

b.ml

vicuna · 2016-12-26T04:24:33Z

Comment author: @garrigue

Private abbreviations can help in this case:

type t = private int

It allows to break abstraction, but only in a safe way (in this case).

vicuna · 2016-12-26T05:01:20Z

Comment author: markghayden

Thanks for the suggestion. I don't want my program to break abstraction. I want the compiler to break it, but only for the purposes of optimization. I would think we would want the compiler to do this 100% of the time when it can find some benefit, such as it does for inlining where the abstraction barrier is pierced to extract code for use in external modules. Are there cases where we do not want the compiler to have this information available in case it could help in code generation (such as specializing arrays)? If not, then I don't understand why an annotation should be required when arguably should be used 100% of the time. Maybe I'm missing something?

vicuna · 2016-12-28T06:11:03Z

Comment author: markghayden

Thanks again for the suggestion. I can tell you this is quite a bit of trouble to use 'private' annotations. We have over 300,000 lines of ML code in our application (see http://seaiq.com) and 100's of source files that we need to review looking for cases where 'private' may help. Also, it seems the 'private' annotation does break abstraction, "leaking" to the rest of our program what the underlying type is. We then need to be careful to ensure this information is not actually used. If people think that requiring programmers to make these 'private' annotation is a good resolution to this issue, I strongly disagree. The goal of a programming language and compiler should be to allow programmers to use good programming style but then still compile with good performance. Requiring that programmers not use good programming style (information hiding) in order to avoid run-time checks for floating-point-array on array accesses seems to violate this principle. Maybe I'm missing something here?

vicuna · 2016-12-28T09:47:52Z

Comment author: @garrigue

My answer was just meant as a temporary workaround.
I believe there is already work in flambda to alleviate your concerns (i.e. propagate information under the hood), and there will be more in the future.
But what I would like to emphasize is that this propagation under the hood does not amount to running the type inference again, with abstraction off, but to some more adhoc approach. So it may not be exactly what you would expect in the end.

vicuna · 2017-01-14T15:25:26Z

Comment author: @xavierleroy

See my comment in #7440

github-actions · 2020-05-09T04:16:47Z

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

vicuna added the middle-end label Mar 14, 2019

This was referenced Mar 7, 2017

comparison operations on abstract types and max/min are not specialized #7440

Closed

Array operations not specialization during inlining. #7441

Closed

vicuna added the feature-wish label Mar 20, 2019

github-actions bot added the Stale label May 9, 2020

github-actions bot closed this as completed Jun 10, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

array operations on abstract types are never specialized aka "Information Hiding Considered Harmful to Performance" #7442

array operations on abstract types are never specialized aka "Information Hiding Considered Harmful to Performance" #7442

vicuna commented Dec 26, 2016

vicuna commented Dec 26, 2016

vicuna commented Dec 26, 2016

vicuna commented Dec 28, 2016

vicuna commented Dec 28, 2016

vicuna commented Jan 14, 2017

github-actions bot commented May 9, 2020