Re: localization, internationalization and Caml

From: Francis Dupont
Date: Fri Oct 15 1999 - 10:26:12 MET DST

From: Francis Dupont
To: skaller
Subject: Re: localization, internationalization and Caml
Date: Fri, 15 Oct 1999 10:26:12 +0200

 In your previous mail you wrote:

   The current 'support' for 8 bit characters in ocaml should be
   deprecated immediately. It is an extremely bad thing to have, since
   Latin-1 et al are archaic 8 bit standards incompatible with the
   international standard for ISO10646 communication, namely
   the UTF-8 encoding.

=> there is rather strong opposition to UTF-8 in France because it
is not a natural encoding (i.e. ASCII maps to ASCII, but this is not
the case for ISO 8859-* characters; imagine a new UTF-X encoding that
mapped ASCII to strange things and you would understand our concern).

   Yes, I know Latin-1 is useful now for French.

=> it is more than useful: Latin-1 (soon ISO IS 8859-15) is necessary
if you need truly readable text in French.

   The way forward may well be to provide an input filter to convert
   Latin-1 (or any other encoding) to UTF8, and have ocaml process that.

=> my problem is that the output of the filter will no longer be readable
once I have put a lot of French in the program (in comments, for instance).
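
To make the filter under discussion concrete, here is a minimal sketch in
OCaml. The function name latin1_to_utf8 is hypothetical and is not part of
any proposal in this thread. It assumes the input is ISO 8859-1, so every
byte is a code point in the range 0x00-0xFF: bytes below 0x80 (ASCII) are
copied unchanged, and each byte from 0x80 upwards becomes a two-byte UTF-8
sequence.

```ocaml
(* Hypothetical helper, for illustration only: re-encode a string that is
   assumed to be ISO 8859-1 (Latin-1) as UTF-8. *)
let latin1_to_utf8 (s : string) : string =
  (* Worst case: every input byte expands to two output bytes. *)
  let buf = Buffer.create (2 * String.length s) in
  String.iter
    (fun c ->
      let code = Char.code c in
      if code < 0x80 then
        (* ASCII maps to itself. *)
        Buffer.add_char buf c
      else begin
        (* 0x80-0xFF: lead byte 110xxxxx, continuation byte 10xxxxxx. *)
        Buffer.add_char buf (Char.chr (0xC0 lor (code lsr 6)));
        Buffer.add_char buf (Char.chr (0x80 lor (code land 0x3F)))
      end)
    s;
  Buffer.contents buf

(* "caf\xE9" is "café" in Latin-1: four bytes in, five bytes out. *)
let example = latin1_to_utf8 "caf\xE9"
```

Here the single Latin-1 byte 0xE9 becomes the pair 0xC3 0xA9. Viewed in a
Latin-1 editor, those two bytes display as two unrelated accented
characters, which is the readability problem raised above.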

   This requires almost no changes to the compiler: the design should
   open the set of characters acceptable in identifiers, probably
   to some subset of the set recommended in one of the ISO10646 related
   documents; the other change required is to accept \uXXXX and \UXXXXXXXX
   escapes in strings. String processing functions should generally
   continue to be 8 bit [per octet]: full internationalisation of client
   string handling functions is a very complex, non-trivial task.

=> I believe internationalization should not be done by countries
where English is the only language in use: this is awkward, to say
the least...

