Array 4 MB size limit
| From: | Aleksey Nogin <nogin@c...> |
| Subject: | Re: [Caml-list] Re: immutable strings (Re: Array 4 MB size limit) |
On 24.05.2006 22:56, Martin Jambon wrote:
>> I think it's OK to have (mutable) byte arrays, but strings should simply
>> always be immutable.
>
> OCaml strings are compact byte arrays which serve their purpose well.
Yes, however immutable strings are also very useful and that
functionality is simply missing in OCaml. The usage I am very interested
in is essentially using strings as "printable tokens". In other words, a
data type that is easy to compare and has an obvious I/O representation.
> Having a whole different type for immutable strings is in my opinion a
> waste of energy. The problem is that freezing or unfreezing a string
> safely involves a copy of the whole string. And obviously it's not
> possible to handle only immutable strings since somehow you have to
> create them, and unlike record fields, they won't be set in one
> operation but in n operations, n being the length of the string.
This is not true. All I want is a purely functional interface with:
- Constants (a compiler flag for turning "..." constants into immutable
  strings instead of mutable ones).
- Input from a channel.
- Concatenation.
- Things like string_of_int for immutable strings.
Of course, the standard library might have to use some sort of "unsafe"
operations that would "inappropriately" mutate the newly created
immutable string buffer, but this is IMHO no different from how the
unsafe operations are already used in the standard library for arrays
and strings.
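
The interface listed above might look roughly like the following sketch.
The module name Istring and every function name in it are hypothetical,
and Bytes.t stands in for the mutable string type of this discussion.
Concatenation fills a private buffer and freezes it once with
Bytes.unsafe_to_string, which is the kind of internal "unsafe" step
described above; nothing outside the module ever sees the buffer.

```ocaml
(* Hypothetical sketch: a purely functional immutable-string interface.
   All names are illustrative, not an existing library API. *)
module Istring : sig
  type t                              (* abstract: no mutation is exposed *)
  val of_string : string -> t         (* stands in for "..." constants *)
  val input_line : in_channel -> t    (* input from a channel *)
  val concat : t -> t -> t
  val of_int : int -> t
  val compare : t -> t -> int
  val to_string : t -> string
end = struct
  type t = string
  let of_string s = s
  (* Non-recursive let: the right-hand side is the standard input_line. *)
  let input_line ic = input_line ic
  let concat a b =
    let la = String.length a and lb = String.length b in
    let buf = Bytes.create (la + lb) in
    Bytes.blit_string a 0 buf 0 la;
    Bytes.blit_string b 0 buf la lb;
    (* Safe only because buf is never used again after this point. *)
    Bytes.unsafe_to_string buf
  let of_int = string_of_int
  let compare = String.compare
  let to_string s = s
end
```

A client can build and compare tokens but has no way to modify one in
place, so values can be shared freely without defensive copies.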
> So I'd really love to see actual examples where using immutable strings
> would be such an improvement over mutable strings.
> If the problem is just to ensure that string data won't be changed by
> the user of a library, then it is trivial using module signatures and
> String.copy for the conversions.
Such a copy operation can be prohibitively expensive in a setting that
assumes that a data structure is immutable and tries really hard to
preserve sharing (for example, by using a sharing-preserving version of
map (*)). In such a setting, these extra copies can have a devastating
effect on memory usage, cache performance, etc. And this situation is
exactly what we have in our MetaPRL project: there we have resorted to
simply using strings and pretending they are immutable, but this is
clearly suboptimal.
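
To make the cost concrete, here is a minimal sketch of the
copy-at-the-boundary approach. The module Frozen and its functions
freeze and thaw are hypothetical names, and Bytes.t again stands in for
the mutable string type. Every conversion allocates a fresh buffer, so
two reads of the same frozen value no longer share memory.

```ocaml
(* Hypothetical sketch: protecting data by copying at the module boundary. *)
module Frozen : sig
  type t
  val freeze : bytes -> t   (* copies, so later writes to the argument are not seen *)
  val thaw : t -> bytes     (* copies, so the caller cannot mutate the frozen value *)
end = struct
  type t = bytes
  let freeze = Bytes.copy
  let thaw = Bytes.copy
end

let () =
  let buf = Bytes.of_string "token" in
  let f = Frozen.freeze buf in
  Bytes.set buf 0 'T';                          (* does not affect f *)
  let a = Frozen.thaw f and b = Frozen.thaw f in
  assert (Bytes.to_string a = "token");         (* frozen contents are intact *)
  assert (Bytes.equal a b);                     (* same contents *)
  assert (a != b)                               (* but two separate allocations *)
```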
----
(*)
let rec smap f = function
    [] -> []
  | (hd :: tl) as l ->
      let hd' = f hd in
      let tl' = smap f tl in
      if hd == hd' && tl == tl' then l else hd' :: tl'
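
A short usage sketch of smap, repeated here so the block is
self-contained. The test lists are made up for illustration, and Bytes.t
stands in for the mutable string type. It checks with physical equality
(==) that an unchanged list comes back as the very same value, that an
unchanged tail is shared, and that copying every element defeats the
sharing.

```ocaml
let rec smap f = function
  | [] -> []
  | (hd :: tl) as l ->
      let hd' = f hd in
      let tl' = smap f tl in
      (* Reuse the original cell when neither head nor tail changed. *)
      if hd == hd' && tl == tl' then l else hd' :: tl'

let () =
  let l = [ Bytes.of_string "a"; Bytes.of_string "b" ] in
  (* f returns its argument unchanged: the original list is returned. *)
  assert (smap (fun s -> s) l == l);
  (* Only the first element changes: the tail is still shared. *)
  let first = List.hd l in
  let l' = smap (fun s -> if s == first then Bytes.copy s else s) l in
  assert (l' != l && List.tl l' == List.tl l);
  (* f copies every element: nothing is shared with the original. *)
  assert (smap Bytes.copy l != l)
```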
--
Aleksey Nogin
Home Page: http://nogin.org/
E-Mail: nogin@cs.caltech.edu (office), aleksey@nogin.org (personal)
Office: Moore 04, tel: (626) 395-2200