Windows filenames and Unicode
Date: 2010-09-29 (06:24)
From: David Allsopp <dra-news@m...>
Subject: RE: [Caml-list] Windows filenames and Unicode
Paul Steckler wrote:
> In Windows, NTFS filenames are specified in Unicode (UTF-16).  Am I right
> in thinking that OCaml file primitives, like open_in, readdir, etc. cannot
> handle NTFS filenames containing characters with codepoints greater than
> 255?

Since the WinAPI "wide" functions use UTF-16, you can of course fake UTF-16 on top of normal OCaml strings. I think you'll hit a brick wall, though: the I/O primitives are built on the underlying C library functions, which ultimately call the ANSI versions of the Windows API functions, not the Unicode ones.
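
To illustrate what "faking UTF-16 on top of normal OCaml strings" could look like, here is a minimal sketch. The function name utf16le_of_code_points is hypothetical (it is not part of the standard library or Camomile), and the choice of little-endian byte order is an assumption based on what the Windows wide functions expect. It only builds the bytes; it does nothing about the I/O primitives themselves.

```ocaml
(* Sketch: encode a list of Unicode code points as UTF-16LE bytes stored
   in an ordinary (8-bit) OCaml string. Hypothetical helper, for
   illustration only. *)
let utf16le_of_code_points (cps : int list) : string =
  let buf = Buffer.create 16 in
  (* Append one 16-bit code unit, low byte first (little-endian). *)
  let add_unit u =
    Buffer.add_char buf (Char.chr (u land 0xFF));
    Buffer.add_char buf (Char.chr ((u lsr 8) land 0xFF))
  in
  List.iter
    (fun cp ->
      if cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF) then
        invalid_arg "utf16le_of_code_points: not a Unicode scalar value"
      else if cp < 0x10000 then
        (* Basic Multilingual Plane: a single code unit. *)
        add_unit cp
      else begin
        (* Supplementary planes: a surrogate pair. *)
        let v = cp - 0x10000 in
        add_unit (0xD800 lor (v lsr 10));
        add_unit (0xDC00 lor (v land 0x3FF))
      end)
    cps;
  Buffer.contents buf
```

For example, utf16le_of_code_points [0x3A9] (GREEK CAPITAL LETTER OMEGA) gives the two bytes A9 03. A string like this could be handed to a C stub that calls a wide API function, but passing it to open_in directly would not work, for the reason given above.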

> I'm aware of the Camomile library, which gives the ability to manipulate
> UTF-16 strings inside of OCaml.  But it looks like crucial points of
> OCaml's I/O, like Sys.argv and file primitives are strictly limited to 8-
> bit characters.
> Is there a way around this limitation, other than rewriting the file I/O
> primitives?

One way (not foolproof, because on Windows 7 and Windows Server 2008 R2 short-name generation can be disabled) would be to wrap the GetShortPathName Windows API function[1], which converts the pathname to its DOS 8.3 form; that form will not contain Unicode characters. Another way might be to wrap the Unicode version of CreateFile (CreateFileW) and convert the resulting handle into one compatible with the standard library functions, but I reckon that could be tricky!



> -- Paul