Unix.open_process_full has trouble with UTF-8 environment variables on Linux #7561

Closed
vicuna opened this issue Jun 21, 2017 · 6 comments

vicuna commented Jun 21, 2017

Original bug ID: 7561
Reporter: @nojb
Status: resolved (set by @xavierleroy on 2017-06-22T18:06:23Z)
Resolution: not a bug
Priority: normal
Severity: minor
Category: otherlibs
Monitored by: @dbuenzli

Bug description

I found the following puzzling behaviour while working on #1200.

Unix.open_process_full seems to have problems with UTF-8 characters in the "env" parameter. See steps to reproduce below.

  • When passing ASCII characters it works fine.

  • Using the Unix.execve call (rather than Unix.open_process_full) seems to work fine.

  • It also works fine on Windows (where a different implementation is used).

  • I tried it with trunk and 4.05.

Steps to reproduce

Put the following in printenv.c:

#include <stdio.h>

int main(int argc, char ** argv, char ** envp)
{
  int i = 0;
  while (envp[i]) printf("%s\n", envp[i++]);
  return 0;
}

and in test.ml:

let envp = [|
  "e?te?=????????";
  "simple=??";
  "sœur=????";
  "??=????";
|]

let () =
  let (ic, _, _) as proc = Unix.open_process_full "./printenv.exe" envp in
  let rec loop () =
    match input_line ic with
    | s -> s :: loop ()
    | exception End_of_file -> []
  in
  List.iter print_endline (loop ());
  ignore (Unix.close_process_full proc)

Compile & run by doing

gcc printenv.c -o printenv.exe
ocaml unix.cma test.ml

On my machine, I see only the second line of envp and PWD as output.

vicuna commented Jun 21, 2017

Comment author: @nojb

I uploaded test.ml since Mantis messed up the non-ASCII characters.

vicuna commented Jun 22, 2017

Comment author: @nojb

This morning, on an Arch Linux box, I cannot reproduce the problem (yesterday I tried it in an Ubuntu VM).

vicuna commented Jun 22, 2017

Comment author: @nojb

However, it is still there in my Ubuntu 16.04 (Xenial) VM. It would be useful if someone could try it out in a non-VM setting.

vicuna commented Jun 22, 2017

Comment author: @nojb

It seems the problem is when the "keys" (to the left of the equal sign) are not ASCII.
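
One way to check this hypothesis is to flag the entries whose key is not a portable name. The following is only a sketch: `key_of_entry` and `is_portable_name` are hypothetical helpers (not part of the Unix library), the rule applied is the usual POSIX one for portable variable names (ASCII letters, digits and underscore, not starting with a digit), and `value` is a placeholder because the original values were garbled by Mantis.

```ocaml
(* Hypothetical helper: the part of "KEY=VALUE" before the first '='.
   An entry without '=' is treated as a bare key. *)
let key_of_entry entry =
  match String.index_opt entry '=' with
  | Some i -> String.sub entry 0 i
  | None -> entry

(* Hypothetical helper: true if [name] is non-empty, contains only ASCII
   letters, digits and '_', and does not start with a digit. *)
let is_portable_name name =
  let is_start c =
    (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c = '_' in
  let is_rest c = is_start c || (c >= '0' && c <= '9') in
  let n = String.length name in
  let rec check i = i >= n || (is_rest name.[i] && check (i + 1)) in
  n > 0 && is_start name.[0] && check 1

let () =
  (* Placeholder values; only the keys matter here. *)
  let envp = [| "simple=value"; "sœur=value" |] in
  Array.iter
    (fun entry ->
       if not (is_portable_name (key_of_entry entry)) then
         Printf.printf "non-portable key: %s\n" entry)
    envp
```

With the two entries above, only the `sœur` entry is flagged, since its key contains bytes outside the ASCII range.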

vicuna commented Jun 22, 2017

Comment author: @nojb

Aha! Mystery solved. Some shells (e.g. dash, but not bash) do not allow any non-ASCII character in variable names. Unix.open_process_full uses /bin/sh to run the command. On Ubuntu, /bin/sh is a symlink to dash, hence the observed behaviour. Under OS X (and I bet Arch, but can't check now), /bin/sh is an alias to /bin/bash, so no problem there.

So, not a bug after all!
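
For anyone who needs non-ASCII keys to reach the child regardless of what /bin/sh is, one possible workaround is to avoid the shell entirely. The sketch below uses Unix.create_process_env with a pipe instead of Unix.open_process_full; `run_without_shell` is a hypothetical name, and the code assumes the program can be executed directly (no shell syntax in the command) and only captures stdout.

```ocaml
(* Hypothetical helper: run [prog] with arguments [args] and environment
   [env] without going through /bin/sh, and return its stdout as lines. *)
let run_without_shell prog args env =
  let (out_read, out_write) = Unix.pipe () in
  (* The child does not need the read end of the pipe. *)
  Unix.set_close_on_exec out_read;
  let pid =
    Unix.create_process_env prog args env
      Unix.stdin out_write Unix.stderr in
  (* Close our copy of the write end so that we see EOF when the child exits. *)
  Unix.close out_write;
  let ic = Unix.in_channel_of_descr out_read in
  let rec loop acc =
    match input_line ic with
    | s -> loop (s :: acc)
    | exception End_of_file -> List.rev acc
  in
  let lines = loop [] in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  lines

(* Possible usage with the program from the report (assumes ./printenv.exe
   has been built as described above):
     List.iter print_endline
       (run_without_shell "./printenv.exe" [| "./printenv.exe" |] envp) *)
```

Because no shell sits between the OCaml process and the child, the environment array is handed to the child as-is, which matches the observation that Unix.execve works fine.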

vicuna commented Jun 22, 2017

Comment author: @xavierleroy

Thanks for the detective work! Marking this as "resolved" because it's not an OCaml issue.
