Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose a GADT representation for Bigarray.kind #6064

Closed
vicuna opened this issue Jul 4, 2013 · 7 comments
Closed

Expose a GADT representation for Bigarray.kind #6064

vicuna opened this issue Jul 4, 2013 · 7 comments

Comments

@vicuna
Copy link

vicuna commented Jul 4, 2013

Original bug ID: 6064
Reporter: @yallop
Assigned to: @gasche
Status: closed (set by @xavierleroy on 2016-12-07T10:34:36Z)
Resolution: fixed
Priority: normal
Severity: feature
Version: 4.00.1
Fixed in version: 4.02.0+dev
Category: otherlibs
Related to: #3962 #6263 #6523
Monitored by: @lpw25 @hcarty @avsm @dbuenzli @Chris00

Bug description

The current interface to Bigarray.kind looks like this:

type ('a, 'b) kind

val float32 : (float, float32_elt) kind
val float64 : (float, float64_elt) kind
[...]
val char : (char, int8_unsigned_elt) kind

The kind type is almost completely abstract -- there are no operations on kinds exposed except for bigarray construction and querying the element kind of an existing bigarray.

Exposing the representation of kind as a GADT would make kinds much more useful. The interface is the obvious one -- there's a nullary constructor for each kind value currently in the signature -- and we can even keep the same in-memory representation (with the small exception described below), since kinds are currently ints:

type ('a, 'b) kind =
| Float32 : (float, float32_elt) kind
| Float64 : (float, float64_elt) kind
[...]
| Char : (char, int8_unsigned_elt) kind

With this change in place it becomes possible to write intensionally-polymorphic functions over bigarrays, such as the following trivial one that sets every element to zero:

let zero : type a b. (a, b) kind -> a = function
| Float32 -> 0.0 | Complex32 -> Complex.zero
| Float64 -> 0.0 | Complex64 -> Complex.zero
| Int8_signed -> 0 | Int8_unsigned -> 0
| Int16_signed -> 0 | Int16_unsigned -> 0
| Int32 -> 0l | Int64 -> 0L
| Int -> 0 | Nativeint -> 0n
| Char -> '\000'

let zero_fill arr = Genarray.fill arr (zero (Genarray.kind arr))

Exposing the representation of kinds in this way would make bigarray/ctypes integration easier, and is apparently useful to other people (e.g. #3962).

I've attached a patch, written with Leo White, that makes this change and also exposes a GADT representation for Bigarray.layout. It's mostly backwards-compatible, but the representation of char is now distinct from the representation of int8_unsigned.

File attachments

@vicuna
Copy link
Author

vicuna commented Jul 4, 2013

Comment author: @lpw25

That version of the patch was incomplete. I've attached the finished version.

@vicuna
Copy link
Author

vicuna commented Jul 4, 2013

Comment author: @gasche

This patch is doing three different things:

(1) it adds a new CHAR kind to the C runtime part of bigarrays (previously only INT8 was added, and both kind were aliased on the OCaml side)

(2) it adds GADTs to allow richly-typed destruction of kinds by pattern-matching

(3) it modifies the C runtime to take as input from the OCaml side not the previous integers representing kind and layouts, but directly the GADTs values. As constant constructors are represented as integers from 0 to N-1, the kind-handling code (whose OCaml values where consecutive integers) mostly doesn't change, but the layout-handling code (which used 0 and 0x100) needs to change.

This is all rather clever, maybe a tad too much. Another possible option would be to separate the GADT-values used for their rich typing, and the integer-values used to communicate with the runtime. One could either:

(A) keep the code as is (Bigarray.int32 would still be an integer), add GADTs definitions, and add functions turning the integers into the GADTs values; I suspect those int-to-GADT functions need to be unsafe, but that's a reasonable kind of unsafe under the usual phatom type hypotheses

(B) make the GADT definitions the authoritative values on the OCaml side (Bigarray.int32 would be a GADT constructor as in your patch), but write proper tag-to-int conversion functions to avoid having to change the C runtime.

Approach (A) has the defect of making the OCaml-side interface more complex that it is now (two different presentations of kind, one phantom and one GADT).

Approach (B) has the same OCaml-side interface than you propose, is less invasive on the runtime part, but may introduce a slight performance cost (non-magic, non-dubious) conversion. However, one problem is that the Bigarray interface (.mli file) advertizes that, for example, GenArray.create, GenArray.kind and stuff are the C externals, so there is no room for intermediate OCaml-side translations between the input of those functions and the values passed on the C side.

All in all, I think your choice of changing the C side is reasonable. However, I would rather see more explicit and systematic conversions between the OCaml tags and the C-side values. You have added logic for layouts because you had to, but cleverly used 1 as a kind value to make it work transparently. I would rather see back-and-forth macros for both (possibly doing nothing for kinds) to help reviewing the code.

@vicuna
Copy link
Author

vicuna commented Jul 4, 2013

Comment author: @gasche

What are the backward-compatibility implications of changing the char kind?

I guess that bigarrays marshalled with previous versions of OCaml may cause trouble.

@vicuna
Copy link
Author

vicuna commented Jul 5, 2013

Comment author: @yallop

Thanks for the review, Gabriel. I've uploaded an updated patch with macros that convert between the OCaml and C representations of kinds and layouts.

@vicuna
Copy link
Author

vicuna commented Jul 16, 2013

Comment author: @lpw25

What are the backward-compatibility implications of changing the char kind?

I think that there are no backward-compatibility implications for type-safe programs. However, some type unsafe code (e.g. marshalling and C bindings) could potentially break, but probably only if it is quite fragile.

As long as code does not assume that Bigarray.int8_unsigned = Bigarray.char it will be fine.

I guess that bigarrays marshalled with previous versions of OCaml may cause trouble.

Old char bigarrays would be read as int8_unsigned bigarrays, other than that there should be no problems.

@vicuna
Copy link
Author

vicuna commented Jan 9, 2014

Comment author: @gasche

Included in trunk. Thanks for the patch!

I think this is technically the first use of GADT for greater good included in the OCaml codebase, but to be fair I think that Benoît Vaugon's enormous work in #6017, which I hope to include as well soon-ish, will by release time have taken precedence in scope and practical impact.

@vicuna
Copy link
Author

vicuna commented Jan 30, 2014

Comment author: @dbuenzli

Btw. I don't know if there's a way to add this without breaking everything. But one thing I miss is kinds for unsigned int32 and int64. They would still be read/written with int32 and int64 but you'd actually know how you are supposed to interpret them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants