Re: ocaml limitations

From: Xavier Leroy (Xavier.Leroy@inria.fr)
Date: Tue Dec 14 1999 - 13:31:14 MET


Date: Tue, 14 Dec 1999 13:31:14 +0100
From: Xavier Leroy <Xavier.Leroy@inria.fr>
To: John Carol Langford <jcl@gs176.sp.cs.cmu.edu>, caml-list@inria.fr
Subject: Re: ocaml limitations
In-Reply-To: <199912130434.XAA12728@gs176.sp.cs.cmu.edu>; from John Carol Langford on Sun, Dec 12, 1999 at 11:34:02PM -0500

> The performance problems from two limitations. The first is in the
> compiler and runtime - it's the limitation on the array size on 32 bit
> machines - I only have linux PC's available to work on. It appears
> that the garbage collector needs some type information to work with
> arrays and enough bits are set aside for type information that not
> enough bits are allowed to specify a large array size. You can get
> around this large array size problem by simulating a large array with
> an array of arrays, but there is a significant performance penalty.

This is correct. We plan to address this by adding "large arrays"
that would be allocatd outside the Caml heap, and handled via an extra
indirection.

> The second problem is a language failure - there is no 'short int'
> type in ocaml. Due to the combinatorics of my problem it would be
> very convenient to use 16 bit integers. Using 32 bit integers instead
> doubles the footprint of the program - which is unacceptable in this
> case. Consequently, I simulate 16 bit integers using masking games -
> which again incurs a performance penalty.

Right. Notice that the code you sent only simulates 15 bit integers
(because there are only 31 bits in a value of type "int"). Arrays of
16 bit integers can be expressed using strings and byte accesses, but
the performance isn't going to be much better than with your code.

> Consequently, I'm wondering if there are plans to
> remove either (or both) of these limitations in the near future and
> lacking that if there are better workarounds.

We have plans to support large integer and float arrays with various
sizes of elements at some point in the future, but the design isn't
complete yet. In particular, there is a tension between flexible but
slow accesses (e.g. multidimensionnal arrays with automatic conversion
between the C and Fortran layouts) on the one hand, and basic but fast
primitives.

- Xavier Leroy



This archive was generated by hypermail 2b29 : Sun Jan 02 2000 - 11:58:29 MET