Unicode (was RE: JIT-compilation for OCaml?)
Date: 2001-01-12 (09:02)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: Unicode (was RE: JIT-compilation for OCaml?)

> I thought Unicode was a recognised subset of ISO-10646, corresponding to the
> range 0-2^16. Also, don't Windows NT/2000 use Unicode?

Yes, Win32 (i.e. 95, 98, ME, NT, 2000 and whatnot) uses 16-bit characters. Java too. But Unix C libraries that support wide chars seem to prefer 32-bit characters. Remember: "Standards are great: there are so many to choose from."

> (I realise this isn't directly on-topic, but it may be relevant for future
> extensions to OCaml?)

It is very relevant indeed. We've been contemplating adding some simple support for wide characters and wide strings, e.g. as two new library modules, but the stumbling point is whether to use 16-bit or 32-bit wide characters. While 32 bits is probably the wave of the future, 16 bits is what we need to interface easily with Java and with many Microsoft products (e.g. COM dispatch components, Visual Basic, various Win32 APIs).

Shall we "do it right" (for some notion of "right") or favor interoperability? Hard question. My current answer is to procrastinate...

Actually, multi-byte encoded strings (UTF-8) are not so bad and already have full support in OCaml :-)

- Xavier Leroy
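
As a rough illustration of the trade-off discussed in this message (this sketch is not part of the original message), the OCaml code below treats an ordinary string as a sequence of UTF-8 bytes, decodes it into code points, and counts how many 16-bit units the same text would need. The function names `decode_utf8`, `code_points` and `utf16_units` are hypothetical helpers, not library functions, and the decoder assumes well-formed input and does no validation.

```ocaml
(* Decode one UTF-8 sequence starting at byte offset [i] of [s].
   Returns (code point, number of bytes consumed).
   Assumption: [s] is well-formed UTF-8; no validation is attempted. *)
let decode_utf8 s i =
  let byte k = Char.code s.[i + k] in
  let cont k = (byte k) land 0x3F in   (* payload of a continuation byte *)
  let b0 = byte 0 in
  if b0 < 0x80 then (b0, 1)
  else if b0 < 0xE0 then
    (((b0 land 0x1F) lsl 6) lor (cont 1), 2)
  else if b0 < 0xF0 then
    (((b0 land 0x0F) lsl 12) lor ((cont 1) lsl 6) lor (cont 2), 3)
  else
    (((b0 land 0x07) lsl 18) lor ((cont 1) lsl 12)
     lor ((cont 2) lsl 6) lor (cont 3), 4)

(* All code points of a UTF-8 encoded string, in order. *)
let code_points s =
  let rec go i acc =
    if i >= String.length s then List.rev acc
    else
      let (cp, n) = decode_utf8 s i in
      go (i + n) (cp :: acc)
  in
  go 0 []

(* Number of 16-bit units needed in a 16-bit encoding with surrogate
   pairs: code points above 0xFFFF take two units. *)
let utf16_units cps =
  List.fold_left (fun n cp -> if cp > 0xFFFF then n + 2 else n + 1) 0 cps

let () =
  (* "naive" with a diaeresis (U+00EF), a space, then U+1D11E,
     which lies outside the 16-bit range. *)
  let s = "na\xC3\xAFve \xF0\x9D\x84\x9E" in
  let cps = code_points s in
  Printf.printf "bytes=%d code_points=%d utf16_units=%d\n"
    (String.length s) (List.length cps) (utf16_units cps)
```

A 32-bit representation would use one unit per code point, a 16-bit one needs two units for anything above 0xFFFF, and the UTF-8 string needs no new type at all, at the cost of losing constant-time indexing by character.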