localization, internationalization and Caml
Date: -- (:)
From: skaller <skaller@m...>
Subject: Re: localization, internationalization and Caml
STARYNKEVITCH Basile wrote:
> 
> By the way, I more and more believe that the printf interface is (in C
> as in Ocaml) a big mistake (which could easily be avoided in Ocaml,
> thanks to its typing)

	I agree, but ...
 
> We should code
> 
>   print [Int 2; String " < "; Float 3.14]
> 
> instead of
> 
>   printf "%d < %g" 2 3.14

	However, I do not agree with the solution.
The correct method, IMHO, is to provide some proper formatting
functions (OCaml's are plain WRONG!) such as

	formatted_string_of_int justify width value

[where justify is LeftSpace | RightSpace | LeftZero]

	and then use the power of functional programming
to create output strings. [The above is only a quick example
interface, not a well-considered one.]
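
A minimal sketch of what such a function could look like, for
illustration only. Only the call shape "formatted_string_of_int justify
width value" and the three constructor names come from the text above;
the meaning given to each constructor (LeftSpace pads with spaces on the
left, RightSpace pads with spaces on the right, LeftZero pads with zeros
on the left) is an assumption, as is the choice never to truncate.

```ocaml
(* Hypothetical interface; the padding semantics are assumed. *)
type justify = LeftSpace | RightSpace | LeftZero

let formatted_string_of_int justify width value =
  let s = string_of_int value in
  (* Number of padding characters needed; never truncate. *)
  let pad = max 0 (width - String.length s) in
  match justify with
  | LeftSpace -> String.make pad ' ' ^ s
  | RightSpace -> s ^ String.make pad ' '
  | LeftZero ->
    if value < 0 then
      (* Keep the sign in front of the zero padding. *)
      "-" ^ String.make pad '0' ^ String.sub s 1 (String.length s - 1)
    else String.make pad '0' ^ s

(* Output strings are then built with ordinary string functions. *)
let line =
  String.concat ""
    [ formatted_string_of_int LeftSpace 4 2; " < "; string_of_float 3.14 ]
```

Each formatter is an ordinary typed function, so a wrong argument is a
compile-time type error and no format-string language is needed.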
 
> Again, I am *not* asking for localization in Ocaml, but if somebody
> needs it (I don't) I still hope it would be implemented better than in
> C. And I think that Unicode would be more useful than localization.

	Please, ISO10646, not Unicode.
We have International Standards. There is a lot of work to be done in
internationalisation. If it is worth doing, it is worth doing right.

	The current 'support' for 8-bit characters in OCaml should be
deprecated immediately. It is an extremely bad thing to have, since
Latin-1 et al. are archaic 8-bit standards incompatible with the
international standard encoding for ISO10646 communication, namely
UTF-8. Yes, I know Latin-1 is useful now for French.
The way forward may well be to provide an input filter to convert
Latin-1 (or any other encoding) to UTF-8, and have OCaml process that.
This requires almost no changes to the compiler: the design should
open up the set of characters acceptable in identifiers, probably
to some subset of the set recommended in one of the ISO10646-related
documents; the other change required is to accept \uXXXX and \UXXXXXXXX
escapes in strings. String processing functions should generally
continue to be 8-bit [per octet]; full internationalisation of client
string handling functions is a very complex, non-trivial task.
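
The input filter idea can be sketched roughly as follows. The function
names add_utf8 and utf8_of_latin1 are hypothetical, and the encoder
assumes code points are limited to 0 .. 0x10FFFF (at most four octets);
it does not reject surrogate values. Since every Latin-1 octet has the
same numeric value as its ISO10646 code point, converting is just a
matter of encoding each octet's value as UTF-8.

```ocaml
(* Hypothetical helper: append the UTF-8 encoding of one code point.
   Assumes the range 0 .. 0x10FFFF; surrogates are not checked. *)
let add_utf8 buf cp =
  let cont shift =
    (* One continuation octet: 10xxxxxx carrying 6 bits of cp. *)
    Buffer.add_char buf (Char.chr (0x80 lor ((cp lsr shift) land 0x3F)))
  in
  if cp < 0 || cp > 0x10FFFF then invalid_arg "add_utf8"
  else if cp < 0x80 then Buffer.add_char buf (Char.chr cp)
  else if cp < 0x800 then begin
    Buffer.add_char buf (Char.chr (0xC0 lor (cp lsr 6)));
    cont 0
  end else if cp < 0x10000 then begin
    Buffer.add_char buf (Char.chr (0xE0 lor (cp lsr 12)));
    cont 6; cont 0
  end else begin
    Buffer.add_char buf (Char.chr (0xF0 lor (cp lsr 18)));
    cont 12; cont 6; cont 0
  end

(* Hypothetical filter: each Latin-1 octet is its own code point. *)
let utf8_of_latin1 s =
  let buf = Buffer.create (2 * String.length s) in
  String.iter (fun c -> add_utf8 buf (Char.code c)) s;
  Buffer.contents buf

(* Convenience wrapper: the octets a single code point expands to. *)
let utf8_of_code_point cp =
  let buf = Buffer.create 4 in
  add_utf8 buf cp;
  Buffer.contents buf
```

The same encoder is what a lexer would need in order to expand a \uXXXX
or \UXXXXXXXX escape into octets, which is why string functions can stay
per-octet: ASCII input passes through unchanged, and only octets above
0x7F grow to two octets.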