[Caml-list] Bug with really_input under cygwin
Date: 2004-03-11 (03:38)
From: skaller <skaller@u...>
Subject: Re: [Caml-list] Bug with really_input under cygwin
On Thu, 2004-03-11 at 02:25, Nuutti Kotivuori wrote:

> > even if you're processing text. Never depend on the
> > language or OS conversion functions, it's very unlikely
> > they'll be right. Do all the conversions needed yourself.
> > At least when you find a problem you're not handling
> > correctly you can fix it.
> Luckily not everybody sees the world as glum :-)

I'm not seeing it as glum. I'm pointing out that
today the situation is vastly more complex due to
belated recognition of the need for Standards to
support I18N issues.

Because of this, the idea that \r\n <-> \n is the
only real encoding issue across platforms is wrong.
If only that were the case today, it would be a trivial
problem to resolve.
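
To illustrate how small the line-ending part of the problem is, here is a minimal sketch of doing that one conversion yourself on data read in binary mode. The name normalize_newlines is hypothetical, and the choice to leave a lone '\r' untouched is an assumption, not something the thread specifies.

```ocaml
(* Hypothetical helper: rewrite every "\r\n" pair as "\n".
   Assumption: a lone '\r' is ordinary data and is kept as-is. *)
let normalize_newlines (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  let i = ref 0 in
  while !i < n do
    if s.[!i] = '\r' && !i + 1 < n && s.[!i + 1] = '\n' then begin
      (* Collapse the two-byte sequence into a single newline. *)
      Buffer.add_char buf '\n';
      i := !i + 2
    end else begin
      Buffer.add_char buf s.[!i];
      incr i
    end
  done;
  Buffer.contents buf
```

This only makes sense for byte-oriented encodings such as ASCII or UTF-8; applied to UCS-2 data it would miss or corrupt line endings, which is the point of the paragraphs that follow.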

For example, text files may contain certain header bytes
that indicate whether the file is encoded as UTF-8, or as UCS-2
with big or little endian byte order. These bytes, if found, must not
be considered 'text'; they're just encoding indicators.
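
A rough sketch of recognising those header bytes (the byte-order mark) and separating them from the text might look like the following. The type bom and the functions detect_bom and strip_bom are hypothetical names introduced for illustration; the byte values are the standard marks for UTF-8 (EF BB BF) and for 16-bit big-endian (FE FF) and little-endian (FF FE) data.

```ocaml
(* Hypothetical type: which encoding indicator, if any, starts the data. *)
type bom = Utf8 | Utf16_be | Utf16_le | No_bom

(* Returns the detected marker and the number of leading bytes it occupies.
   Assumption: only these three marks are checked; a UTF-32 little-endian
   mark (FF FE 00 00) would be reported as Utf16_le by this sketch. *)
let detect_bom (s : string) : bom * int =
  let n = String.length s in
  let byte i = Char.code s.[i] in
  if n >= 3 && byte 0 = 0xEF && byte 1 = 0xBB && byte 2 = 0xBF then (Utf8, 3)
  else if n >= 2 && byte 0 = 0xFE && byte 1 = 0xFF then (Utf16_be, 2)
  else if n >= 2 && byte 0 = 0xFF && byte 1 = 0xFE then (Utf16_le, 2)
  else (No_bom, 0)

(* Drop the marker so that only the text payload remains. *)
let strip_bom (s : string) : string =
  let _, skip = detect_bom s in
  String.sub s skip (String.length s - skip)
```

The input is assumed to have been read from a channel opened in binary mode, so that no OS-level conversion has already touched the bytes.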

Even within Unicode/ISO-10646 there are myriad
'encoding' problems, the famous ones being the use
of combining characters -- and that's *after* you have found
the ISO-10646 code points :)
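
As a small illustration of the combining-character problem, the same visible character "é" can be written as one code point (U+00E9) or as two (U+0065 followed by the combining acute accent U+0301). The sketch below uses hypothetical names and assumes well-formed UTF-8 input; it shows that the two spellings differ both as bytes and in code point count.

```ocaml
(* Two UTF-8 spellings of the same visible character. *)
let precomposed = "\xC3\xA9"   (* U+00E9 *)
let decomposed = "e\xCC\x81"   (* U+0065 U+0301 *)

(* Hypothetical helper: count code points by counting every byte that is
   not a UTF-8 continuation byte (10xxxxxx).
   Assumption: the input is valid UTF-8. *)
let count_code_points (s : string) : int =
  let count = ref 0 in
  String.iter (fun c -> if Char.code c land 0xC0 <> 0x80 then incr count) s;
  !count

(* Byte-wise comparison treats the two spellings as different strings. *)
let same_bytes = (precomposed = decomposed)
```

Treating the two as equal requires a normalization step on top of decoding, which none of the code above attempts.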

So, if you want to handle *text* in a portable way,
you have some work ahead of you. Don't even try to render
it correctly, the required algorithm competes with Mr Ackermann
in performance :D

As long as these kinds of comments are labelled as 'rants',
people will continue to write non-portable software and
fail to face up to the issues.

John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
