Storing UTF-8 in plain strings
Date: 2009-08-13 (09:37)
From: Dario Teixeira <darioteixeira@y...>
Subject: Re: [Caml-list] Storing UTF-8 in plain strings

> I'm using Ulex + Menhir to parse UTF-8 encoded source code, and I'm relying
> on plain strings for processing and storing data.  I *think* I can get away
> with using only the String module to handle this variable-length encoding
> as long as I am careful with the way I treat these strings.  Here are the
> assumptions I am making:

Thank you all for your comments.  Ulex has caught all the intentionally
malformed code points I've inserted in the stream, so I'm fairly confident
it's up to the task.  If I do find a problem, though, I'll keep
Netconversion's and Extlib's validation functions in mind.

Best regards,
Dario Teixeira