Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006064OCamlOCaml otherlibspublic2013-07-04 16:112016-12-07 11:34
Assigned Togasche 
PlatformOSOS Version
Product Version4.00.1 
Target VersionFixed in Version4.02.0+dev 
Summary0006064: Expose a GADT representation for Bigarray.kind
DescriptionThe current interface to Bigarray.kind looks like this:

  type ('a, 'b) kind

  val float32 : (float, float32_elt) kind
  val float64 : (float, float64_elt) kind
  val char : (char, int8_unsigned_elt) kind

The kind type is almost completely abstract -- there are no operations on kinds exposed except for bigarray construction and querying the element kind of an existing bigarray.

Exposing the representation of kind as a GADT would make kinds much more useful. The interface is the obvious one -- there's a nullary constructor for each kind value currently in the signature -- and we can even keep the same in-memory representation (with the small exception described below), since kinds are currently ints:
  type ('a, 'b) kind =
    | Float32 : (float, float32_elt) kind
    | Float64 : (float, float64_elt) kind
    | Char : (char, int8_unsigned_elt) kind

With this change in place it becomes possible to write intensionally-polymorphic functions over bigarrays, such as the following trivial one that sets every element to zero:

  let zero : type a b. (a, b) kind -> a = function
    | Float32 -> 0.0 | Complex32 ->
    | Float64 -> 0.0 | Complex64 ->
    | Int8_signed -> 0 | Int8_unsigned -> 0
    | Int16_signed -> 0 | Int16_unsigned -> 0
    | Int32 -> 0l | Int64 -> 0L
    | Int -> 0 | Nativeint -> 0n
    | Char -> '\000'

  let zero_fill arr = Genarray.fill arr (zero (Genarray.kind arr))

Exposing the representation of kinds in this way would make bigarray/ctypes integration easier, and is apparently useful to other people (e.g. 0003962).

I've attached a patch, written with Leo White, that makes this change and also exposes a GADT representation for Bigarray.layout. It's mostly backwards-compatible, but the representation of char is now distinct from the representation of int8_unsigned.
TagsNo tags attached.
Attached Filespatch file icon bigarray_kind_gadt2.patch [^] (13,831 bytes) 2013-07-04 16:41 [Show Content]
patch file icon bigarray_kind_gadt_macros.patch [^] (14,347 bytes) 2013-07-05 15:25 [Show Content]

- Relationships
related to 0003962closedgasche polymorphic Bigarray.kind_eq 
related to 0006263closed A function Bigarray.kind_byte_size is missing 
related to 0006523resolvedshinwell caml_ba_get_N treats char elements as signed 

-  Notes
lpw25 (developer)
2013-07-04 16:41

That version of the patch was incomplete. I've attached the finished version.
gasche (developer)
2013-07-04 16:42

This patch is doing three different things:

(1) it adds a new CHAR kind to the C runtime part of bigarrays (previously only INT8 was added, and both kind were aliased on the OCaml side)

(2) it adds GADTs to allow richly-typed destruction of kinds by pattern-matching

(3) it modifies the C runtime to take as input from the OCaml side not the previous integers representing kind and layouts, but directly the GADTs values. As constant constructors are represented as integers from 0 to N-1, the kind-handling code (whose OCaml values where consecutive integers) mostly doesn't change, but the layout-handling code (which used 0 and 0x100) needs to change.

This is all rather clever, maybe a tad too much. Another possible option would be to separate the GADT-values used for their rich typing, and the integer-values used to communicate with the runtime. One could either:

(A) keep the code as is (Bigarray.int32 would still be an integer), add GADTs definitions, and add functions turning the integers into the GADTs values; I suspect those int-to-GADT functions need to be unsafe, but that's a reasonable kind of unsafe under the usual phatom type hypotheses

(B) make the GADT definitions the authoritative values on the OCaml side (Bigarray.int32 would be a GADT constructor as in your patch), but write proper tag-to-int conversion functions to avoid having to change the C runtime.

Approach (A) has the defect of making the OCaml-side interface more complex that it is now (two different presentations of kind, one phantom and one GADT).

Approach (B) has the same OCaml-side interface than you propose, is less invasive on the runtime part, but may introduce a slight performance cost (non-magic, non-dubious) conversion. However, one problem is that the Bigarray interface (.mli file) advertizes that, for example, GenArray.create, GenArray.kind and stuff are the C externals, so there is no room for intermediate OCaml-side translations between the input of those functions and the values passed on the C side.

All in all, I think your choice of changing the C side is reasonable. However, I would rather see more explicit and systematic conversions between the OCaml tags and the C-side values. You have added logic for layouts because you had to, but cleverly used 1 as a kind value to make it work transparently. I would rather see back-and-forth macros for both (possibly doing nothing for kinds) to help reviewing the code.
gasche (developer)
2013-07-04 16:46

What are the backward-compatibility implications of changing the char kind?

I guess that bigarrays marshalled with previous versions of OCaml may cause trouble.
yallop (developer)
2013-07-05 15:26

Thanks for the review, Gabriel. I've uploaded an updated patch with macros that convert between the OCaml and C representations of kinds and layouts.
lpw25 (developer)
2013-07-16 12:03

> What are the backward-compatibility implications of changing the char kind?

I think that there are no backward-compatibility implications for type-safe programs. However, some type unsafe code (e.g. marshalling and C bindings) could potentially break, but probably only if it is quite fragile.

As long as code does not assume that Bigarray.int8_unsigned = Bigarray.char it will be fine.

> I guess that bigarrays marshalled with previous versions of OCaml may cause trouble.

Old char bigarrays would be read as int8_unsigned bigarrays, other than that there should be no problems.
gasche (developer)
2014-01-09 17:28

Included in trunk. Thanks for the patch!

I think this is technically the first use of GADT for greater good included in the OCaml codebase, but to be fair I think that BenoƮt Vaugon's enormous work in PR#6017, which I hope to include as well soon-ish, will by release time have taken precedence in scope and practical impact.
dbuenzli (reporter)
2014-01-31 00:37

Btw. I don't know if there's a way to add this without breaking everything. But one thing I miss is kinds for *unsigned* int32 and int64. They would still be read/written with int32 and int64 but you'd actually know how you are supposed to interpret them.

- Issue History
Date Modified Username Field Change
2013-07-04 16:11 yallop New Issue
2013-07-04 16:11 yallop File Added: bigarray_kind_gadt.patch
2013-07-04 16:41 lpw25 File Added: bigarray_kind_gadt2.patch
2013-07-04 16:41 lpw25 Note Added: 0009693
2013-07-04 16:42 gasche Note Added: 0009694
2013-07-04 16:46 gasche Note Added: 0009695
2013-07-04 23:07 doligez Status new => acknowledged
2013-07-05 15:25 yallop File Added: bigarray_kind_gadt_macros.patch
2013-07-05 15:26 yallop Note Added: 0009699
2013-07-05 15:47 gasche File Deleted: bigarray_kind_gadt.patch
2013-07-16 12:03 lpw25 Note Added: 0009787
2013-07-16 12:10 lpw25 Relationship added related to 0003962
2013-12-07 18:59 gasche Relationship added related to 0006263
2014-01-09 17:28 gasche Note Added: 0010786
2014-01-09 17:30 gasche Status acknowledged => resolved
2014-01-09 17:30 gasche Fixed in Version => 4.02.0+dev
2014-01-09 17:30 gasche Resolution open => fixed
2014-01-09 17:30 gasche Assigned To => gasche
2014-01-31 00:37 dbuenzli Note Added: 0010870
2014-08-28 19:11 yallop Relationship added related to 0006523
2016-12-07 11:34 xleroy Status resolved => closed

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker