8-bit characters on command line
From: Dmitry Bely <dmitry.bely@g...>
Subject: Re: [Caml-list] 8-bit characters on command line
On Fri, May 14, 2010 at 8:36 AM, Paul Steckler
<Paul.Steckler@nicta.com.au> wrote:
> I have an OCaml 3.11 program that prints out the arguments on the command line:
>
>  let main = Array.iter (Printf.printf "arg = %s\n") Sys.argv
>
> On Linux, if I provide a command line argument containing 8-bit characters,
> like é (an e with an acute accent), the program above, compiled with ocamlopt
> or ocamlc, prints them faithfully.
>
> For Windows, I can compile the program above with ocamlc on Windows, or cross-compile
> it with MinGW-ocaml on Linux.  In both cases, any 8-bit characters in the command
> line are printed as garbage.  I've tried running the program from rxvt (a shell for
> Cygwin) and Windows cmd.exe.

I believe that's because Windows actually has two current code pages:
the "OEM" code page for console input/output and the "ANSI" code page
for everything else. In more detail:

http://msdn.microsoft.com/en-us/library/dd317752%28VS.85%29.aspx

E.g. on my system the ANSI/OEM code pages are 1251/866. In your case
they are probably 1252/437.

Program arguments and any 8-bit character strings inside an
application are considered to be ANSI-encoded (as that's what the
non-Unicode Windows API functions expect), but the console interprets
the bytes written to it in the OEM code page, and nothing translates
ANSI to OEM along the way. So you see garbage.
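
To make the mismatch concrete, here is a minimal sketch that assumes the
1252/437 pair guessed above. In code page 1252 an e with an acute accent
is byte 0xE9, while code page 437 puts that letter at 0x82, so the byte
has to be remapped before it reaches the console. The table below is
deliberately partial and the names cp1252_to_cp437 and ansi_to_oem are
made up for illustration; it also uses String.map, which needs a newer
OCaml than 3.11.

```ocaml
(* Partial, illustrative CP1252 -> CP437 mapping. Only a handful of
   accented letters are covered; every other byte passes through. *)
let cp1252_to_cp437 = function
  | '\xE9' -> '\x82'  (* e acute *)
  | '\xE8' -> '\x8A'  (* e grave *)
  | '\xE0' -> '\x85'  (* a grave *)
  | '\xE7' -> '\x87'  (* c cedilla *)
  | '\xFC' -> '\x81'  (* u diaeresis *)
  | c -> c

(* Hypothetical helper: re-encode an ANSI (CP1252) string as OEM (CP437). *)
let ansi_to_oem s = String.map cp1252_to_cp437 s

(* Same program as in the question, with the arguments translated first. *)
let () =
  Array.iter (fun a -> Printf.printf "arg = %s\n" (ansi_to_oem a)) Sys.argv
```

A real program would more likely rely on the Win32 CharToOem function
through a small C stub than on a hand-written table, since the actual
code pages depend on the user's system.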

- Dmitry Bely