| From: | Xavier Leroy <xavier.leroy@i...> |
| Subject: | Re: [Caml-list] internal representation of string |
> What is the internal representation of string? Is it basically a C-string
> [with or without terminating '\0'] plus integer storing its size? Or is it
> something more sophisticated?
Like all heap blocks, strings have a header giving the size of the
block in machine words. The actual block contents are:
- the characters of the string
- padding bytes to align the block on a word boundary.
The padding is one of
00
00 01
00 00 02
00 00 00 03
on a 32-bit machine, and up to 00 00 .... 07 on a 64-bit machine.
Thus, the string is always zero-terminated, and its length can be
computed as follows:
number_of_words_in_block * sizeof(word) - last_byte_of_block - 1
The null-termination comes in handy when passing a string to C, but is
not relied upon to compute the length (in Caml), allowing the string
to contain nulls.
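
The layout can be checked with a small model. This is only a sketch, not the runtime's code: the function names and the word_size parameter (in bytes) are made up for illustration. It assumes the final byte of the block holds the number of padding bytes minus one, so the length is the block size in bytes, minus that byte, minus one.

```ocaml
(* Model of the string block layout; word_size is in bytes (4 or 8). *)

(* Words occupied by a string of [len] characters: the characters plus
   at least one padding byte, rounded up to a whole number of words. *)
let words_for_length ~word_size len = (len + word_size) / word_size

(* Value of the final byte: the number of padding bytes minus one. *)
let last_padding_byte ~word_size len =
  words_for_length ~word_size len * word_size - len - 1

(* Recover the length from the block size and the final byte. *)
let length_of_block ~word_size ~words ~last_byte =
  words * word_size - last_byte - 1

let () =
  List.iter
    (fun word_size ->
      for len = 0 to 40 do
        let words = words_for_length ~word_size len in
        let last_byte = last_padding_byte ~word_size len in
        assert (last_byte >= 0 && last_byte < word_size);
        assert (length_of_block ~word_size ~words ~last_byte = len)
      done)
    [ 4; 8 ];
  (* An embedded null does not cut the string short in Caml. *)
  assert (String.length "a\000b" = 3)
```

For instance, a 3-character string on a 32-bit machine fits in one word with the single padding byte 00, while a 4-character string needs two words and ends in 00 00 00 03.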
> Also, do functions like String.sub implement
> copy-on-write mechanism or do they copy when they are called?
They copy when they are called. Caml strings really behave like
compactly-represented character arrays.
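
The copy can be observed directly. A minimal sketch, assuming a recent OCaml where mutable byte sequences are the Bytes type (string itself is immutable there, so the mutation is done through Bytes); the function name sub_is_a_copy is made up for the example:

```ocaml
(* Returns true if Bytes.sub produced an independent copy. *)
let sub_is_a_copy () =
  let original = Bytes.of_string "hello world" in
  let piece = Bytes.sub original 0 5 in
  (* Mutate the original after taking the substring. *)
  Bytes.set original 0 'J';
  (* The substring is unaffected, so no storage is shared. *)
  Bytes.to_string piece = "hello"
  && Bytes.to_string original = "Jello world"

let () = assert (sub_is_a_copy ())
```

Since the substring has its own block, taking it costs time and space proportional to its length.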
Hope this answers your question,
- Xavier Leroy