Windows filenames and Unicode
Date: 2010-09-29 (07:59)
From: David Allsopp <dra-news@m...>
Subject: RE: [Caml-list] Windows filenames and Unicode
Paul Steckler wrote:
> On Wed, Sep 29, 2010 at 4:23 PM, David Allsopp wrote:
> > A way (but not foolproof on Windows 7 and Windows 2008 R2 because you
> > can disable it) would be to wrap the GetShortPathName Windows API
> > function[1] which will convert the pathname to its DOS 8.3 format which
> > will not contain Unicode characters. Another way might be to wrap the
> Unicode version of CreateFile (CreateFileW) and convert the result into a handle
> > compatible with the standard library functions but I reckon that could be
> > tricky!
> For Linux, I was planning on enforcing the invariant that all strings
> inside my program are UTF-8.
> For Windows, I could use the same invariant, and modify the OCaml runtime
> so that all calls to Windows file primitives have those strings translated
> to UTF-16 (and return values translated back to UTF-8).  That is, I'd have
> to build a custom version of OCaml and wrap CreateFile, etc. with such
> Unicode translation functions.

Rather than hacking the OCaml runtime (the relevant code is {byte,asm}run/sys.c, btw), I'd personally write a separate module with two implementations: one for Linux, which just uses the built-in primitives, and one for Windows, which calls the WinAPI functions directly. A cursory glance at the runtime code suggests that bolting wide-character support onto the runtime is not a "one-liner".
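
As a rough illustration of the two-implementation idea, here is a minimal C sketch of the kind of helper such a module might sit on top of. The names utf8_to_utf16 and fopen_utf8 are hypothetical, and using _wfopen (rather than wrapping CreateFile) is an assumption made to keep the example short; it is not what the OCaml runtime does. The converter rejects malformed sequences and surrogates but does not check for overlong encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <wchar.h>
#endif

/* Hypothetical helper: decode NUL-terminated UTF-8 into UTF-16 code units.
   Returns the number of units written (excluding the terminating 0), or
   (size_t)-1 on malformed input or if dst is too small. */
static size_t utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_len)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_len == 0) return (size_t)-1;

    while (*s) {
        uint32_t cp;
        int extra;

        /* The lead byte gives the sequence length and the top bits. */
        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return (size_t)-1;
        s++;

        /* Each continuation byte contributes six more bits. */
        for (; extra > 0; extra--, s++) {
            if ((*s & 0xC0) != 0x80) return (size_t)-1;
            cp = (cp << 6) | (*s & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp >= 0x10000) {
            /* Outside the BMP: emit a surrogate pair. */
            if (n + 2 >= dst_len) return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= dst_len) return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        }
    }
    dst[n] = 0;
    return n;
}

/* Hypothetical wrapper: open a file whose name is given in UTF-8.
   On Windows the name is converted to UTF-16 and passed to _wfopen;
   elsewhere the bytes go straight to fopen. */
FILE *fopen_utf8(const char *path, const char *mode)
{
#ifdef _WIN32
    /* UTF-16 never needs more code units than UTF-8 needs bytes. */
    size_t cap = strlen(path) + 1;
    uint16_t *wpath = malloc(cap * sizeof *wpath);
    uint16_t wmode[16];
    FILE *f = NULL;

    if (wpath != NULL
        && utf8_to_utf16(path, wpath, cap) != (size_t)-1
        && utf8_to_utf16(mode, wmode, 16) != (size_t)-1)
        f = _wfopen((const wchar_t *)wpath, (const wchar_t *)wmode);
    free(wpath);
    return f;
#else
    return fopen(path, mode);
#endif
}
```

An OCaml C stub could then call fopen_utf8 (or an equivalent built on CreateFileW) with the bytes of an OCaml string, which keeps the "all strings are UTF-8" invariant inside the program and confines the UTF-16 conversion to the Windows side of the module.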

> All this is made slightly more complicated by the fact that I'm using the
> MinGW version of OCaml.

Shouldn't make it (too much) harder - I use the MinGW build of OCaml for C stubs without issue, after a slightly steep learning curve. The w32api package in Cygwin provides all of the headers and link libraries for the Windows libraries (it's installed by default with gcc-mingw). If you ever have to link with more exotic libraries, dlltool is (sort of) your friend: with a little bit of help, it lets you generate the .a files needed for the DLL you're linking with, since 3rd-party libraries on Windows tend to ship only the MSVC .lib files.

> Hmmm, I shouldn't have to do this.  Are there plans afoot to modernize
> OCaml's string-handling?

Can o' worms, I expect! Windows (by which I mean Windows NT) took the simplest route of 16-bit wchars for Unicode, but that's not necessarily the best way to program in general... the problem with going Unicode is *how* you go Unicode.