Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005266OCamlotherlibspublic2011-05-16 17:482016-12-07 15:16
Assigned To 
PlatformOSOS Version
Product Version 
Target VersionlaterFixed in Version 
Summary0005266: Unix.{system,open_process_{in,out,full}} are not threadsafe
DescriptionWhen calling fork in a multi-threaded program the forked side can only call async-signal-safe functions [ [^]] this means that we probably cannot rely at all on using the ocaml runtime on the forked side.

Even if we can use the ocaml runtime (because we do no allocation and therefor are compiled down to fairly straghtforward c calls) we shouldn't call the c [exit] function on errors but rather the [_exit] function which doesn't call the [at_exit] registered handler (similarly we do not want to run the ones registered in ocaml). These are not necessarily async-signal-safe and might perform flushing/cleaning up that only makes sense in the parent process (e.g.: cleaning up temporary files).
Attached Filesc file icon fork_exec.c [^] (9,020 bytes) 2011-05-17 00:17 [Show Content]
c file icon create_process.c [^] (9,351 bytes) 2012-09-19 17:51 [Show Content]

- Relationships

-  Notes
till (reporter)
2011-05-16 18:30

Attached a file that implements a thread safe fork-exec.

The signature of the function is:

external fork_exec :
  stdin : file_descr
  -> stdout : file_descr
  -> stderr : file_descr
  -> ?working_dir : string
  -> ?env : (string) array
  -> string
  -> string array
  -> int
  = "extended_ml_create_process_bc" "extended_ml_create_process"

I'd be happy to clean up this file and make it a patch to the ocaml Unix library if you want.
gmelquiond (reporter)
2012-08-05 23:25
edited on: 2012-08-05 23:25

I have been experimenting with your code since it has some nice properties that the original implementation of create_process lacks. In particular, it has proper error handling (e.g. calling _exit in case of error) and sends back to the parent process the errno value. It also avoids a rare race condition that can bring the system to its knees, since it does not allow the parent process to perform a major collection before the child process has finished discarding the obsolete address space (at least for single-threaded programs). And finally, it supports setting the working directory, as does the system call on Windows. Thanks a lot.

I have encountered a few issues though, that I will list here in case someone else stumbles upon your code:
- closing pfd[1] must be performed before reading the pipe, otherwise the parent process will deadlock,
- the pipe descriptors must be moved above fd 3 before forking, otherwise they might get overwritten by the standard descriptors, hence breaking error reporting,
- clearing cloexec on the duplicated file descriptors is useless, since duplication does not preserve this attribute,
- however, cloexec should be set on the original file descriptors (either that or they should be manually erased), otherwise the child process will possibly end up with up to 6 file descriptors instead of 3 only.

till (reporter)
2012-09-19 17:35

Good points. This code was surprisingly hard to write and I am sad to see that I still let mistakes slip through (especially the fd leak). I will try to submit a fix soon.

Coming back to the pfd issue: can you think of a way that would make the parent deadlock? pdf[0] being set cloexec should be closed when the forked-side fails (and thus exits) or succeed (and exec's)
gmelquiond (reporter)
2012-09-19 17:54

It has been some time, so my memories are a bit hazy. The read on pfd[0] will abort only if all the write ends of the pipe are closed, which is not the case if pfd[1] is still open in the parent, hence the deadlock. Am I missing something? Anyway, I have uploaded the variant of your code I have been using successfully.
till (reporter)
2012-09-19 18:27

Oh. Palm/forehead! that is definitely a bug. I must have uploaded that file before fixing it. There's no way this could work. Thanks!
shinwell (developer)
2016-12-07 15:16

FWIW, [^] contains a process creation library that shouldn't suffer from async signal safety problems. (I think there may be an update to that library soon.)

- Issue History
Date Modified Username Field Change
2011-05-16 17:48 till New Issue
2011-05-16 18:30 till Note Added: 0005895
2011-05-17 00:17 till File Added: fork_exec.c
2011-05-17 14:16 doligez Status new => acknowledged
2012-07-10 15:11 doligez Target Version => 4.01.0+dev
2012-07-31 13:36 doligez Target Version 4.01.0+dev => 4.00.1+dev
2012-08-05 23:25 gmelquiond Note Added: 0007898
2012-08-05 23:25 gmelquiond Note Edited: 0007898 View Revisions
2012-09-19 15:37 doligez Target Version 4.00.1+dev => 4.00.2+dev
2012-09-19 17:35 till Note Added: 0008118
2012-09-19 17:51 gmelquiond File Added: create_process.c
2012-09-19 17:54 gmelquiond Note Added: 0008119
2012-09-19 18:27 till Note Added: 0008120
2013-06-07 15:11 xleroy Target Version 4.00.2+dev => 4.02.0+dev
2013-07-12 18:15 doligez Target Version 4.02.0+dev => 4.01.1+dev
2013-10-08 15:33 doligez Tag Attached: patch
2014-05-25 20:20 doligez Target Version 4.01.1+dev => 4.02.0+dev
2014-07-31 11:51 doligez Target Version 4.02.0+dev => 4.02.1+dev
2014-09-04 00:25 doligez Target Version 4.02.1+dev => undecided
2014-09-26 19:39 doligez Target Version undecided => 4.02.2+dev / +rc1
2015-01-16 19:42 doligez Severity minor => feature
2015-01-16 19:42 doligez Target Version 4.02.2+dev / +rc1 => 4.03.0+dev / +beta1
2016-01-27 09:28 frisch Target Version 4.03.0+dev / +beta1 => later
2016-12-07 15:14 shinwell Category OCaml general => OCaml otherlibs
2016-12-07 15:16 shinwell Note Added: 0016720
2017-02-23 16:42 doligez Category OCaml otherlibs => otherlibs

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker