| Date: | -- (:) |
| From: | Loup Vaillant <loup.vaillant@g...> |
| Subject: | Re: [Caml-list] Re: Rope is the new string |
2007/10/9, Vincent Hanquez <tab@snarc.org>:
> On Tue, Oct 09, 2007 at 02:40:48PM +0100, Jon Harrop wrote:
> > Out of curiosity, do your ropes handle UTF-8 and UTF-16?
>
> Out of curiosity, why would a string implementation (has a handle of
> chars bundle together) has to handle UTF-X ?

My 2 cents:

It is more convenient to consider strings as arrays of characters. These characters are then handled as atoms, even if they take several bytes in the chosen encoding. Of course, multi-byte characters must be supported as well.

Still, I can use byte arrays as strings. But that limits me to ASCII and Latin-like encodings: if I want to do UTF-X, then I have to worry about multi-byte characters myself. Internationalization made hard...

I would find it very convenient to have plain Unicode strings (and chars), with appropriate scan, print, byte_array_from_string, and string_from_byte_array functions, one bundle per supported encoding. That way I would not need to think about the internals of such a string.

Loup Vaillant
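One way the "one bundle per supported encoding" idea could look in OCaml is sketched below. This is only an illustration under assumptions, not an existing library: the type ustring, the signature ENCODING, and the modules Utf8 and Latin1 are hypothetical names, a byte array is represented by an ordinary OCaml string, and the scan and print functions are left out. It relies on String.get_utf_8_uchar, so it needs OCaml 4.14 or later.

```ocaml
(* Hypothetical sketch: a Unicode string is an array of code points, so each
   character is one atom regardless of how many bytes it takes when encoded. *)
type ustring = Uchar.t array

(* One bundle of conversion functions per supported encoding. *)
module type ENCODING = sig
  val string_from_byte_array : string -> ustring
  val byte_array_from_string : ustring -> string
end

module Utf8 : ENCODING = struct
  let string_from_byte_array s =
    let rec go i acc =
      if i >= String.length s then Array.of_list (List.rev acc)
      else
        let d = String.get_utf_8_uchar s i in
        (* Invalid input decodes to U+FFFD and still consumes at least one
           byte, so the loop always terminates. *)
        go (i + Uchar.utf_decode_length d) (Uchar.utf_decode_uchar d :: acc)
    in
    go 0 []

  let byte_array_from_string u =
    let b = Buffer.create (Array.length u) in
    Array.iter (Buffer.add_utf_8_uchar b) u;
    Buffer.contents b
end

module Latin1 : ENCODING = struct
  (* Every Latin-1 byte maps directly to the code point of the same value. *)
  let string_from_byte_array s =
    Array.init (String.length s) (fun i -> Uchar.of_int (Char.code s.[i]))

  let byte_array_from_string u =
    String.init (Array.length u) (fun i ->
      let c = Uchar.to_int u.(i) in
      (* Code points outside Latin-1 are replaced with '?'. *)
      if c < 256 then Char.chr c else '?')
end
```

With such modules, converting between encodings is a decode followed by an encode, for example Latin1.byte_array_from_string (Utf8.string_from_byte_array bytes), and code working on a ustring never has to look at the byte layout.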