Date: 2001-09-11 (09:13)
From: Jun P. FURUSE <Jun.Furuse@i...>
Subject: Re: [Caml-list] Q: multibyte encoding for CJK
Hi,

> When I tested multibyte variables in caml-light,
> it showed "Illegal character".
>
> Do you have any idea
> how to use multibyte variables for Chinese, Japanese, Korean
> in caml-light or ocaml?

Caml Light (and O'Caml) is not designed for multibyte Asian languages.
In Caml Light, an identifier (variable) must begin with a letter,
followed by letters, digits, _, or '. The "letters" are A-Z, a-z and
the accented characters like á ç (in the HTML encoding).

However, if you are lucky enough, you can still use your Asian keywords.
The condition is: you must use the EUC (= Extended Unix Code) encoding,
and your identifier cannot contain any character code except
0xc0-0xd6 0xd8-0xf6 0xf8-0xff in Unix... (The legal characters above
0x7f for identifiers are restricted to the European accented letters.)

Well, as far as I know, this means that the use of Japanese identifiers
is practically impossible. I am not an expert on Asian encodings, but
I am afraid the same is true for Chinese and Korean.

BTW, using your language inside strings "..." is no problem, if you use
the EUC encoding. But of course you will have trouble with
string_length, sub_string, etc., because they work on bytes, not on
multibyte characters.

Hope this helps,
--
JPF

-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs
FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr
Archives: http://caml.inria.fr
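
The identifier rule described in the message can be sketched in O'Caml as follows. This is only an illustration of the byte ranges quoted above, not the actual Caml lexer; the names is_letter, is_ident_char and is_legal_ident are hypothetical, and the sample strings are assumed to be EUC-JP byte sequences written with hex escapes.

```ocaml
(* Sketch of the identifier rule from the message; not the real lexer. *)

(* Bytes accepted as "letters": ASCII letters plus the accented ranges
   0xc0-0xd6, 0xd8-0xf6, 0xf8-0xff quoted in the message. *)
let is_letter c =
  let n = Char.code c in
  (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
  || (n >= 0xc0 && n <= 0xd6)
  || (n >= 0xd8 && n <= 0xf6)
  || (n >= 0xf8 && n <= 0xff)

(* Bytes accepted after the first position: letters, digits, _ and '. *)
let is_ident_char c =
  is_letter c || (c >= '0' && c <= '9') || c = '_' || c = '\''

(* An identifier is legal if it is non-empty, starts with a letter,
   and every byte is an identifier character. *)
let is_legal_ident s =
  String.length s > 0
  && is_letter s.[0]
  && (let ok = ref true in
      String.iter (fun c -> if not (is_ident_char c) then ok := false) s;
      !ok)

(* Assumed EUC-JP bytes: two kanji whose four bytes all happen to fall
   inside the accepted ranges, and one hiragana whose bytes do not. *)
let lucky = "\xc6\xfc\xcb\xdc"
let unlucky = "\xa4\xa2"

let () =
  Printf.printf "%b %b %d\n"
    (is_legal_ident lucky)      (* true: every byte is in range *)
    (is_legal_ident unlucky)    (* false: 0xa4 is outside the ranges *)
    (String.length lucky)       (* 4: bytes, although it is 2 characters *)
```

The last value shows the string_length issue: the length is counted in bytes, so a two-character EUC string reports 4.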