A few character codes do not match in ISO 8859-1 with an azerty keyboard #7737
Comments
Comment author: @Octachron The encoding error is on your side: you are not using the ISO 8859-1 character encoding and may be using code page 437, which encodes 'é' and '£' as 130 and 156 respectively. In other words, when you type 'é', you are actually sending '\130' to the REPL. May I encourage you to address further inquiries to https://discuss.ocaml.org/?
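To make the mismatch concrete, here is a minimal sketch of the two code points mentioned above under both encodings. The table and the helper `latin1_of_cp437` are hypothetical names written by hand for illustration; they are not part of OCaml's standard library and cover only these two characters.

```ocaml
(* Hand-written table for illustration only:
   (description, ISO 8859-1 code, code page 437 code). *)
let codes = [
  ("e-acute", 233, 130);
  ("pound sign", 163, 156);
]

(* Hypothetical helper: map a code page 437 byte back to its
   ISO 8859-1 code, for the two characters listed above only. *)
let latin1_of_cp437 byte =
  match List.find_opt (fun (_, _, cp437) -> cp437 = byte) codes with
  | Some (_, latin1, _) -> Some latin1
  | None -> None

let () =
  List.iter
    (fun (name, latin1, cp437) ->
      Printf.printf "%s: ISO 8859-1 = %d, code page 437 = %d\n"
        name latin1 cp437)
    codes
```

The point is only that the same glyph has a different byte value in each encoding, so the value OCaml reports depends on which bytes the terminal actually sends.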
Comment author: vanto I do not agree. With Caml Light version 0.74, the result is:
#int_of_char (
Reference manual of Caml Light, page 12 (book by Xavier Leroy and Pierre Weis). It is written: Are you convinced of what I say?
Comment author: @Octachron Am I wrong to think that you are using a Windows terminal and that you are comparing with the graphical version of Caml Light? If so, please check the code page of your terminal (probably 850). Otherwise, it would be helpful if you described your system and terminal settings.
Comment author: vanto Do not focus on Caml Light.
Comment author: vanto For code pages, please look at these two addresses.
Comment author: @dra27 It's not obvious that there's a bug in OCaml, because there isn't one - you'd do better to answer the questions asked! As @Octachron notes, the issue is not having code page 1252 selected. In the session below, foo.ml contains let c = 'é';; with 'é' stored as the single byte 233:

    C:\Users\DRA>chcp
    Active code page: 437

    C:\Users\DRA>type foo.ml
    let c = '?';;

    C:\Users\DRA>ocaml
            OCaml version 4.05.0

    # #use "foo.ml";;
    val c : char = '\233'
    # let c2 = 'é';;
    val c2 : char = '\130'
    # #quit;;

    C:\Users\DRA>chcp 1252
    Active code page: 1252

    C:\Users\DRA>type foo.ml
    let c = 'é';;

    C:\Users\DRA>ocaml
            OCaml version 4.05.0

    # #use "foo.ml";;
    val c : char = '\233'
    # let c2 = 'é';;
    val c2 : char = '\233'
    # #quit;;

The Caml Light graphical application will use code page 1252, and hence "work". It would take more time than I can be bothered to invest to spin up an x86 Windows box to run the binary distribution's CAML.EXE, but I would put considerable metamoney on Caml Light behaving the same as OCaml when run from a console. Haskell in interactive mode uses Haskeline, which calls ReadConsoleW directly and thus bypasses code pages completely. OCaml, via ReadFile, uses ReadConsoleA, which performs code page conversion on the input. As you can see from my example, if you supply the correct input (e.g. via a file) then it performs correctly - note that even the console itself cannot display the character correctly when it bounces through 1252->437/850 conversion and back. Note that Haskell's Data.Char type is equivalent to OCaml's Uchar.t rather than char.
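The distinction between char (a single byte) and Uchar.t (a Unicode scalar value) can be sketched as follows. This is an illustration only: `utf8_of_latin1` is a hypothetical helper name, and it assumes OCaml 4.06 or later, where `Buffer.add_utf_8_uchar` is available, and that the input bytes are meant to be read as ISO 8859-1.

```ocaml
(* Hypothetical helper: re-encode a string whose bytes are read as
   ISO 8859-1 into UTF-8. Uchar.of_char maps a byte value n to the
   Unicode scalar value U+00nn, which coincides with ISO 8859-1. *)
let utf8_of_latin1 (s : string) : string =
  let b = Buffer.create (2 * String.length s) in
  String.iter (fun c -> Buffer.add_utf_8_uchar b (Uchar.of_char c)) s;
  Buffer.contents b

let () =
  (* '\233' is one byte; as a Uchar.t it is U+00E9. *)
  let e_acute = '\233' in
  Printf.printf "char code: %d\n" (Char.code e_acute);
  Printf.printf "scalar value: U+%04X\n" (Uchar.to_int (Uchar.of_char e_acute));
  (* In UTF-8 the same character takes two bytes, 0xC3 0xA9. *)
  Printf.printf "UTF-8 length: %d\n" (String.length (utf8_of_latin1 "\233"))
```

Nothing here touches the console: the escapes fix the byte values in the source, so the result does not depend on the active code page.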
Original bug ID: 7737
Reporter: vanto
Assigned to: @Octachron
Status: resolved (set by @Octachron on 2018-02-20T22:02:48Z)
Resolution: not a bug
Priority: normal
Severity: minor
Version: 4.06.0
Category: compiler driver
Related to: #7740
Bug description
Reference manual page 110. It is written: "The current implementation interprets character codes between 128 and 255 following the ISO 8859-1 standard."
But the codes returned are wrong above code 126 (7Eh).
examples:
int_of_char('£');;
instead of 163 (A3h)
int_of_char('é');;
instead of 233 (E9h)
... and so on.
The result is the same if I open the module Char.
open Char;;
code 'é';;
code '£';;
... and so on.
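One way to check the documented ISO 8859-1 codes without depending on what the keyboard or terminal sends is to write the characters as numeric escapes. This is a minimal sketch, not taken from the report; the names `e_acute` and `pound` are chosen here for illustration.

```ocaml
(* Decimal escapes name the byte directly, so the result does not
   depend on the terminal's code page or the source file encoding. *)
let e_acute = '\233'   (* 'é' in ISO 8859-1 *)
let pound = '\163'     (* '£' in ISO 8859-1 *)

let () =
  Printf.printf "%d %d\n" (int_of_char e_acute) (int_of_char pound);
  (* The hexadecimal escape denotes the same byte. *)
  Printf.printf "%b\n" ('\xE9' = e_acute)
```

Run as a script, this prints `233 163` and then `true`, matching the codes quoted from the manual.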