Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

array operations on abstract types are never specialized aka "Information Hiding Considered Harmful to Performance" #7442

Closed
vicuna opened this issue Dec 26, 2016 · 6 comments

Comments

@vicuna
Copy link

vicuna commented Dec 26, 2016

Original bug ID: 7442
Reporter: markghayden
Status: acknowledged (set by @xavierleroy on 2017-01-14T15:25:26Z)
Resolution: open
Priority: normal
Severity: feature
Platform: AMD64
OS: MacOS
OS Version: 10.12.2
Version: 4.04.0
Target version: later
Category: middle end (typedtree to clambda)
Has duplicate: #7440 #7441
Monitored by: @dbuenzli

Bug description

When you create an array of a abstract but concrete type, array operations on the type are never specialized. For instance see AbsInt2 below;

module AbsInt2 :
sig
type t val zero : t
val add_int : int -> t -> int

val array_length : t array -> int
val array_get : t array -> int -> t
end = struct
type t = int
let zero = 0
let add_int = (+)
let array_length a = Array.length (a:t array)
let array_get a i = Array.unsafe_get (a:t array) i
end

If you use Array.get to access an array of type AbsInt.t then the compiled code will check for whether the array is a float array (even though the type t is internally "int") and will compile code to allocate the boxed float if the array is a float array. On the other hand, using AbsInt2.array_get will be specialized as operations on an int array, avoiding the need to check for a float array. This is even though the operations are in the same compilation unit and the compiler has knowledge of the exact type. Using information hiding in the type system should not prevent optimizations by the compiler.

Note that the need to check for float array in the compiled code presumably has follow-on effects in the compiler, since the resulting code is larger I'd guess that this becomes less likely to be inlined -- even though much of it is "dead code".

Steps to reproduce

Compile the attached file with -O3 -unbox-closures with 4.05.0 compiler and review the compiled assembly code. In includes variations on the AbsInt module above. You can expose the internal type as int ("type t =int" in the signature. Or make use of specialized "array_length" functions to avoid the "float array" checks.

File attachments

@vicuna
Copy link
Author

vicuna commented Dec 26, 2016

Comment author: @garrigue

Private abbreviations can help in this case:

type t = private int

It allows to break abstraction, but only in a safe way (in this case).

@vicuna
Copy link
Author

vicuna commented Dec 26, 2016

Comment author: markghayden

Thanks for the suggestion. I don't want my program to break abstraction. I want the compiler to break it, but only for the purposes of optimization. I would think we would want the compiler to do this 100% of the time when it can find some benefit, such as it does for inlining where the abstraction barrier is pierced to extract code for use in external modules. Are there cases where we do not want the compiler to have this information available in case it could help in code generation (such as specializing arrays)? If not, then I don't understand why an annotation should be required when arguably should be used 100% of the time. Maybe I'm missing something?

@vicuna
Copy link
Author

vicuna commented Dec 28, 2016

Comment author: markghayden

Thanks again for the suggestion. I can tell you this is quite a bit of trouble to use 'private' annotations. We have over 300,000 lines of ML code in our application (see http://seaiq.com) and 100's of source files that we need to review looking for cases where 'private' may help. Also, it seems the 'private' annotation does break abstraction, "leaking" to the rest of our program what the underlying type is. We then need to be careful to ensure this information is not actually used. If people think that requiring programmers to make these 'private' annotation is a good resolution to this issue, I strongly disagree. The goal of a programming language and compiler should be to allow programmers to use good programming style but then still compile with good performance. Requiring that programmers not use good programming style (information hiding) in order to avoid run-time checks for floating-point-array on array accesses seems to violate this principle. Maybe I'm missing something here?

@vicuna
Copy link
Author

vicuna commented Dec 28, 2016

Comment author: @garrigue

My answer was just meant as a temporary workaround.
I believe there is already work in flambda to alleviate your concerns (i.e. propagate information under the hood), and there will be more in the future.
But what I would like to emphasize is that this propagation under the hood does not amount to running the type inference again, with abstraction off, but to some more adhoc approach. So it may not be exactly what you would expect in the end.

@vicuna
Copy link
Author

vicuna commented Jan 14, 2017

Comment author: @xavierleroy

See my comment in #7440

@github-actions
Copy link

github-actions bot commented May 9, 2020

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant