Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005256OCamlOCaml generalpublic2011-04-25 20:032015-12-08 11:54
Assigned To 
PlatformOSOS Version
Product Version3.11.2 
Target VersionlaterFixed in Version 
Summary0005256: Processes opened using Unix.open_process* inherit all opened file descriptors (including sockets)

Processes opened using Unix.open_process in POSIX systems are spawned using fork(). Therefore, they inherit all file descriptors opened by the calling process.

This leads to many unfortunate minor issues. For instance, if the calling process has opened a port and crashes after calling the external process, then the port remains open until the external process terminates..

More importantly, this is a source of potentially important security issues. By default, processes spawned with Unix.open_process* will have access to any opened file or socket, allowing them to read files' content or sniff network traffic..

I am attaching a simple file that spawns an external process reading a "secret" file.

I do not know whether this is considered as a bug or not for you (or maybe even already known) but I think this may be of sufficent importance to put a Private status until you decide what to do.

Concerning a possible fix, I think the problem is not obvious. In believe that at least this issue should be properly documented in Unix.mli.

I have been reading through different source code and I found the following interesting portable implementation of "closefrom" from openssh's code: [^]

I believe that a good OCaml solution would consists of the following:
 * A optional hook in Unix.open_process* to execute any given function before spawning a process
 * A default implementation of closefrom
This way, the programmer could simply pass fun () -> Unix.closefrom 3 to Unix.open_process* and make sure that all file descriptors >= 3 are closed before spawning the external process.

Finally, I have no idea at all whether this issue is present or not under windows. It should be relatively easy to test, though.

I am willing to provide some patches/fixes if you wish so.
TagsNo tags attached.
Attached Files? file icon [^] (508 bytes) 2011-04-25 20:03 [Show Content]

- Relationships
related to 0005568closed Unix.open_flag lacks O_CLOEXEC 
related to 0005569closed missing Unix.dup_cloexec, Unix.get_cloexec and Unix.set_cloexec 
related to 0006947acknowledged Missing support for SOCK_NONBLOCK and SOCK_CLOEXEC 

-  Notes
toots (reporter)
2011-04-25 20:33

I forgot to add: the issue also happends with Unix.create_process.

A quick check through ocsigen's source code shows that create_process is isued for their implementation of CGI script handlers. I have not actually written an exploit but I would not be surprised if in this case CGI scripts executed through ocsigen could possibly do some non expected things..

Also, I am aware of Unix.set_close_on_exec. However, file descriptors can be opened by external modules and it may be impossible to track them all and mark them close_on_exec...
toots (reporter)
2011-04-28 03:56

Hi again,

Just a follow-up concerning the possible fix mentioned above. I realize now that "closefrom" is not ideal because by default Unix fd are abstract objects..

I see two possibilities:
 * Implement "closeall" which closes all fd except std{in,out,err}. Win32 has such function and it would not be hard to implement it for POSIX. However, this is not a POSIX function per-say..
 * Implement a int_of_fd function in OCaml and, then, "closefrom".

I personally preffer the second one because I also believe int_of_fd could be useful for other uses, such as passing the C descriptor to an external process/code.
toots (reporter)
2012-03-14 00:22


Any input on this one? I really think it should be taken care of.. If nothing, I'm considering making it public so that it is at least documented..
xleroy (administrator)
2012-03-14 09:22

Feel free to publicize this PR so that Unix experts out there can chime in.

This is an old issue that popped up during the development of Cash (the Caml shell), in particular. The "ideal" solution is to have all file descriptors in close-on-exec mode, but it requires discipline from the programmers. (File descriptors created by Pervasives.open_in and Pervasives.open_out are close-on-exec, but not those created by Unix.openfile or Unix.socket, for compatibility with the Unix specs.)

Also, a long time ago we experimented with a closeall()/closefrom() function that did not use /proc/<pid>/fd, and it was really slow to call close() on thousands of potential file descriptors. The /proc/<pid>/fd trick is nice but not terribly portable.
toots (reporter)
2012-03-15 01:08

Thanks for the details! I totally understand your concern from the language point of view.

I'll write about the issue but I do not know of any way to switch this into a public issue. If you know one, it'd be nice to do it in order to preserve the history, otherwise I'll submit another one..
xleroy (administrator)
2012-03-15 09:30

PR is in public mode now.
gerd (reporter)
2012-03-15 15:26

The Shell library in Ocamlnet solved this problem in deed by iterating over all possible descriptors, and by closing them, ignoring errors. I cannot say that this solution is terribly slow - but it is of course Unix-only. I'd not expect that reading /proc/<pid>/fd is generally faster, at least for the default 1024 descriptors (but if the limit is higher it could be).

Generally spoken, this is a design error of the Unix API. Descriptors should have better been close-on-exec by default. Currently, there is even a race condition if you exec when you have multiple threads, because the exec can happen between open() and the fcntl() setting close-on-exec. So you'd not catch this descriptor, and it remains open. (Recently, Linux added a new bunch of syscalls to address this issue, e.g. pipe2 allows one to set the flag from the beginning.)

That said, requiring from the programmer to set close-on-exec is not a safe solution (in multi-threaded programs).

My wisdom points into the direction to let the user select what is best in his case, i.e. add a switch to open_process_* whether to close.
goswin (reporter)
2012-04-04 18:50

There seems to be 4 ways to close all FDs:

- close 3-1024 - totaly unsave if dup2() is used
- closefrom(lowfd) - not present on my linux (Debian unstable, kernel 3.2)
- fcntrl F_CLOSEM - not present on my linux (Debian unstable, kernel 3.2)
- /proc/<pid>/fd - /proc might not be mounted, I would rather not rely on that

The first three also will be lots of fun if some library uses dup2(0, max_int) and the user needs that FD to remain open. There is a function fdwalk() that might be more suitable there, but again not present on my linux.

In conclusion: There simply is no sane way to close all filedescriptors, which is why CLOEXEC was added. If defaulting to CLOEXEC is not an option then the Unix API should be extended to allow the user to specify the flag (e.g. open_files) or have 2 flavours of functions (e.g. socket + socket_close_on_exec).
gerd (reporter)
2012-04-04 20:49

goswin: under Unix, you cannot do dup2(0,max_int). At most you can do dup2(0,sysconf(_SC_OPEN_MAX)-1), where sysconf(_SC_OPEN_MAX)-1 is the current maximum value for a file descriptor (e.g. 1023 on Linux or 255 on OS X). So refined method 1 is totally applicable, although inelegant. This means that there is a safe fallback method if the OS does not provide something better.
goswin (reporter)
2012-04-05 11:21

Yes, there usualy is an upper limit. My point was rather that the limit might be large:

#include <stdio.h>
#include <unistd.h>

int main() {
  int max = sysconf(_SC_OPEN_MAX);
  int fd = dup2(0, 1000000);
  printf("max = %d, dup2(0, 1000000) = %d\n", max, fd);
  return 0;
# ./foo
max = 1000001, dup2(0, 1000000) = 1000000

The usual value of 1024 (or 256 on OS X) isn't a hard limit. It can be raised and there are situations where more than 1024 file descriptors can be needed by an application under normal operations. Specifically P2P software or fuse filesystems can easily exceed the normal limit.

Having to iterate over all possible file descriptors on every exec is a fallback option but it is not something I want to do. POSIX does provide something better called CLOEXEC so I would rather use that.
gmelquiond (reporter)
2012-08-02 20:56

I just want to chime in to say that I just encountered this issue. And it is not just a matter of security; it is also a matter of usability! In order to achieve some portability, I tried to replace fork with create_process in a program; the program stopped working on Linux as a consequence (and presumably on any fork-based system too).

Basically, using create_process + pipe does not work (except on Windows). Here is a toy example that just executes the pipeline "echo toto | cat":

open Unix
let (pin,pout) = pipe ()
let p2 = create_process "cat" [|"cat"|] pin stdout stderr
let _ = close pin
let p1 = create_process "echo" [|"echo"; "toto"|] stdin pout stderr
let _ = close pout
let _ = waitpid [] p2

On Linux, this program never terminates!

Indeed, cat never receives an EOF on its input pipe, since fork has duplicated the output pipe and it can no longer be closed. Obviously, changing the creation+closing order solves the issue. But in my actual use case, this is not possible because the second process has to be created before the output pipe stops being useful.

Using closefrom in the create_process call was a lightweight and efficient solution, but it kind of defeated my goal of having a portable program. So, in the end, the best I could come with was to redefine create_process and pipe (and all the other file-descriptor producers):

let pipe () =
  let (a,b) = pipe () in
  set_close_on_exec a; set_close_on_exec b;

let create_process m n a b c =
  clear_close_on_exec a; clear_close_on_exec b; clear_close_on_exec c;
  let p = create_process m n a b c in
  set_close_on_exec a; set_close_on_exec b; set_close_on_exec c;

For the sake of readability, this is just the short version without any error handling. The correct code is much uglier.
xleroy (administrator)
2012-08-06 18:09

I agree something should be done to address this issue. Trying to make progress, let me list several options.

- Option 1: change the Unix library so that all file descriptors it creates are in close-on-exec mode by default.

Probably what we should have done in the first place, in retrospect. However, this change will break all codes that use fork + exec instead of create_process. Too risky?

- Option 2: record all file descriptors created by the Unix library (we have a nifty skip list implementation in the OCaml runtime system that would be perfect for this usage); provide a "Unix.close_all_except [list of FDs]" function that uses this list; use it in Unix.create_process.

Pros: backward compatible; portable. Cons: doesn't close FDs not created by OCaml.

- Option 3: provide and use a "Unix.close_all_except" function, as in option 2, but that works over all file descriptors, created by OCaml or not. The implementation would use all tricks in the book: scanning /proc/fd, using closefrom() if available, and falling back on a "for" loop up to SC_OPEN_MAX.

Pros: a radical solution to the problem; backward compatible. Cons: #ifdef fest; can be slow in the worst cases.

[ FYI: I ran once into an IBM AIX server that had SC_OPEN_MAX = 2^15. ]


gmelquiond (reporter)
2012-08-06 21:51

Forget what I said about my small testcase working fine on Windows, it doesn't. It was just a local setup issue that was causing the waitpid call to fail, and therefore the program to terminate properly. The program does deadlock on Windows too. And that is not really surprising: the Windows port goes to great lengths to ensure that all the open file descriptors are actually inherited by the child process (while the default behavior on Windows is that they are not). So on Windows too, set_close_on_exec is mandatory for my testcase to pass.

Anyway, back to the topic at hand. If one were to start from a blank state, then option 1 would be the most sensible solution in my opinion. Breaking current programs isn't even too bad, since it is easy to modify them in a way that would work for both newer and older versions of the caml runtime. But it still feels a bit too disruptive unfortunately.

For option 2, the cons you mention are probably mitigated in practice. On Windows, a library would presumably not bother marking its descriptors as inheritable, so they would be closed anyway. And on Unix-like systems, hopefully libraries are good citizens that mark them cloexec.

Regarding option 3, I'm not so sure any longer that closefrom is a good idea. I'm thinking about bug 0005266 here. Note that I don't know if this bug can actually occur in practice; I somehow doubt it. But still, assuming it is a valid concern, what the forked process can do is rather limited. Thus it might be impossible to implement a clever closefrom, if the system does not provide it.

In the end, I'm not even sure there is something that needs to be done, except perhaps on the documentation side. Explicitly telling the user in the documentation of open_file, pipe, dup, and so on, that set_close_on_exec should systematically be called on the values returned by these functions would go a long way toward solving the problem. Then create_process could be changed to take care that the passed fds are not flagged cloexec.
gmelquiond (reporter)
2012-08-07 08:06

And for the sake of completeness, here is the suggested addendum to the documentation of openfile, dup, dup2, pipe, socket, socketpair, accept (I may have missed some):

"For backward-compatibility reasons, the file descriptor is created in a way such that it is automatically inherited by any child process (fork+exec, system, create_process, open_process*). This can lead to unexpected behaviors and has severe security implications. As such, it is mandatory to call set_close_on_exec on the returned file descriptor. For multi-threaded processes, use instead the variant XXX, which combines both calls into a single one to avoid race conditions."
gerd (reporter)
2012-08-07 12:46

gmelquiond: So far I understand, bug 5266 makes it mainly impossible to call malloc - most system calls are in fact async-signal safe, but library calls are often not. But you are right, using the Ocaml runtime is not allowed, and an implementation of the post-fork part needs to be done in C.

IMHO, the most reasonable option is number 3, together with a fast path the user can enable where descriptors are not closed - for the case the user controls the file descriptors carefully. I don't really see the slowness - even if you increase OPEN_MAX you normally do this only for processes also using that many descriptors (or you did not understand how Unix works).

Option 1 is too risky, and it will create a lot of confusion when the functions in the Unix module have different defaults than specified by POSIX. At most, one could add a ?cloexec:bool optional parameter everywhere.

Option 2: The real problem are descriptors that are partly controlled from Ocaml, and partly from a C library. You cannot e.g. track when the C library closes or dups a descriptor that was opened before on the OCaml side. So, it is also possible that there are too many descriptors in the skip list, not only too few. Overall, I think this will make the behavior more random.

I'd like to point out that I went through the implementation of option 3 already, and the result is the Netsys_posix.spawn function: [^] [^]

Note that it always iterates over the file descriptors and closes them (controlled by the Fda_close_except array), i.e. it does not even try to use closefrom or another accelerator. A different kind of optimization seems to be far more important, namely using posix_spawn instead of fork/exec. posix_spawn is often implemented via vfork or is a system call itself (OS X), and it is way faster then fork/exec even on an OS like Linux that delays the most expensive parts of duplicating a process. (The downside of posix_spawn: closing possibly closed descriptors requires a trick, and not all OS have it - although the "important" OS do have it - but you need a traditional fallback using fork/exec.)

If there is anything useful in netsys_c_spawn.c, feel free to copy it.
gerd (reporter)
2012-08-07 14:38

Note that we are not the only ones thinking about the problem. There was a hot debate in LKML about whether to add a nextfd() syscall:!topic/fa.linux.kernel/YCmgXbgH7qo [^]

AFAIK, there is no resolution yet, neither in the kernel nor in glibc.

There is also a very interesting description of the problem space from the Austin Group (i.e. the POSIX maintainers): [^]

But they seemed to have delayed any work on the problem. The opinion is that it is more important to work on new system calls allowing to atomically set the CLOEXEC flag (like they are available in Linux now), and avoid the problem altogether.

Interestingly, if you add up all arguments, the conclusion is that there is no generic full solution (what we've called option 3 is - from the standpoint of POSIX - illegal, because you may also close internal descriptors the system needs for proper function).
xleroy (administrator)
2013-06-14 13:55
edited on: 2013-06-14 13:55

Ten months later, do we have new information to help choose a fix for this issue? For what is worth, I have a first cut of solution number 3 ("close_all_except" with a reasonably efficient implementation), but it won't be ready for release 4.01.

gmelquiond (reporter)
2014-03-17 20:59

For the record, in the freshly released Python (that is, 3.4), they went from inheritable by default to non-inheritable by default. They also switched to non-inheritable by default in Ruby 2.0, which was released last year. As for Go, they went for non-inheritable right from the start, but that is only in 1.3 (about be released) that it will be effective on all supported platforms. [^]
frisch (developer)
2015-12-02 16:08

No obvious strategy, so I don't think anything will be done for 4.03.
goswin (reporter)
2015-12-08 10:47

Might I suggest going a step further than "close_all_except"? When you start an external program you often want to catch the output and not have it go to stdout, which means running dup2() after fork. So closing unwanted FDs is not the only issue. So instead of a "close_all_except" list I would like the option of a "map_fds_to_and_close_rest" list.
xleroy (administrator)
2015-12-08 11:54

@goswin : for this use case you'd better use higher-level functions such as Unix.create_process rather than fork + dup2 + exec. (For one thing, Unix.create_process also works under Windows.) The "close_all_except" function mentioned here would be a low-level function used e.g. in the implementation of Unix.create_process.

- Issue History
Date Modified Username Field Change
2011-04-25 20:03 toots New Issue
2011-04-25 20:03 toots File Added:
2011-04-25 20:33 toots Note Added: 0005869
2011-04-28 03:56 toots Note Added: 0005871
2011-04-29 11:43 doligez Status new => acknowledged
2012-03-14 00:22 toots Note Added: 0007061
2012-03-14 09:22 xleroy Note Added: 0007067
2012-03-15 01:08 toots Note Added: 0007083
2012-03-15 09:30 xleroy Note Added: 0007084
2012-03-15 09:30 xleroy View Status private => public
2012-03-15 15:26 gerd Note Added: 0007088
2012-04-03 15:18 xleroy Relationship added related to 0005568
2012-04-03 15:18 xleroy Relationship added related to 0005569
2012-04-04 18:50 goswin Note Added: 0007279
2012-04-04 20:49 gerd Note Added: 0007280
2012-04-05 11:21 goswin Note Added: 0007283
2012-07-06 16:23 doligez Target Version => 4.01.0+dev
2012-07-31 13:36 doligez Target Version 4.01.0+dev => 4.00.1+dev
2012-08-02 20:56 gmelquiond Note Added: 0007877
2012-08-06 18:09 xleroy Note Added: 0007907
2012-08-06 18:09 xleroy Status acknowledged => feedback
2012-08-06 18:09 xleroy Target Version 4.00.1+dev => 4.01.0+dev
2012-08-06 21:51 gmelquiond Note Added: 0007921
2012-08-07 08:06 gmelquiond Note Added: 0007926
2012-08-07 12:46 gerd Note Added: 0007927
2012-08-07 14:38 gerd Note Added: 0007928
2012-09-06 19:23 frisch Target Version 4.01.0+dev => 4.00.2+dev
2013-06-06 23:11 frisch Target Version 4.00.2+dev => 4.01.0+dev
2013-06-14 13:55 xleroy Note Added: 0009490
2013-06-14 13:55 xleroy Target Version 4.01.0+dev => 4.02.0+dev
2013-06-14 13:55 xleroy Note Edited: 0009490 View Revisions
2013-07-12 18:15 doligez Target Version 4.02.0+dev => 4.01.1+dev
2014-03-17 20:59 gmelquiond Note Added: 0011048
2014-05-25 20:20 doligez Target Version 4.01.1+dev => 4.02.0+dev
2014-07-17 16:16 doligez Target Version 4.02.0+dev => 4.02.1+dev
2014-09-04 00:25 doligez Target Version 4.02.1+dev => undecided
2014-09-24 20:21 doligez Target Version undecided => 4.03.0+dev
2015-11-16 11:41 xleroy Relationship added related to 0006947
2015-12-02 16:08 frisch Note Added: 0014952
2015-12-02 16:09 frisch Target Version 4.03.0+dev => later
2015-12-08 10:47 goswin Note Added: 0015071
2015-12-08 11:54 xleroy Note Added: 0015072

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker