Communication Between Processes

The use of processes in application development allows you to delegate work. Nevertheless, these jobs may not be independent and it may be necessary for the processes to communicate with each other.

We introduce two methods of communication between processes: communication pipes and signals. This chapter does not discuss all possibilities of process communication. It is only a first approach to the applications developed in chapters 19 and 20.

Communication Pipes

Processes can communicate directly with each other in a file-oriented style.

Pipes are something like virtual files, which can be read and written with the input/output functions read and write. They are of limited size, the exact limit depending on the system. They behave like queues: the first data written is also the first data read. Whenever data is read from a pipe, it is also removed from it.

This queue behavior is realized by the association of two descriptors with a pipe: one corresponding to the end of the pipe where new entries are written and one for the end where they are read. A pipe is created by the function:


# Unix.pipe ;;
- : unit -> Unix.file_descr * Unix.file_descr = <fun>

The first component of the resulting pair is the exit of the pipe used for reading. The second is the entry of the pipe used for writing. All processes knowing them can close the descriptors.
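As a minimal sketch of which descriptor plays which role, the following function writes a short message into the entry of a fresh pipe and reads it back from the exit, all within one process. The name roundtrip is ours. The code assumes a recent OCaml release, in which buffers have type bytes and Unix.write_substring writes from a string, and a message small enough to fit in the pipe, since nobody reads while we write.

```ocaml
(* Requires the unix library. *)

(* Write [msg] into the entry of a new pipe, then read it back
   from the exit.  [roundtrip] is an illustrative name. *)
let roundtrip (msg : string) : string =
  let rd, wr = Unix.pipe () in          (* rd: exit, wr: entry *)
  let written = Unix.write_substring wr msg 0 (String.length msg) in
  let buf = Bytes.create written in
  let n = Unix.read rd buf 0 written in
  Unix.close rd;
  Unix.close wr;
  Bytes.sub_string buf 0 n
```

For a short message, roundtrip returns the message unchanged.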

Reading from an empty pipe is blocking, unless all processes knowing its entry descriptor (and therefore able to write to it) have closed it; in the latter case, the function read returns 0. If a process tries to write to a full pipe, it is suspended until another process has done a read operation. If a process tries to write to a pipe while no process is left to read from it (all having closed the exit descriptor), the writing process receives the signal sigpipe, which, unless handled otherwise, leads to its termination.
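The two boundary cases can be observed in a single process. The sketch below is only an illustration under stated assumptions: the function names are ours, buffers have type bytes as in recent OCaml releases, and sigpipe is ignored during the write so that the failure shows up as the error EPIPE rather than terminating the process.

```ocaml
(* Requires the unix library. *)

(* Reading when every entry descriptor is closed returns 0 at once. *)
let read_after_close () : int =
  let rd, wr = Unix.pipe () in
  Unix.close wr;
  let n = Unix.read rd (Bytes.create 8) 0 8 in
  Unix.close rd;
  n

(* Writing when every exit descriptor is closed.  With sigpipe
   ignored, the write raises Unix_error EPIPE instead of killing us.
   Returns true when that error was observed. *)
let write_without_reader () : bool =
  let previous = Sys.signal Sys.sigpipe Sys.Signal_ignore in
  let rd, wr = Unix.pipe () in
  Unix.close rd;
  let broken =
    try ignore (Unix.write_substring wr "x" 0 1); false
    with Unix.Unix_error (Unix.EPIPE, _, _) -> true in
  Unix.close wr;
  Sys.set_signal Sys.sigpipe previous;   (* restore the old behavior *)
  broken
```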

The following example shows a use of pipes in which grandchildren report their process numbers to their grandparent.

let output, input = Unix.pipe();;

let write_pid input =
  try
    let m = "(" ^ (string_of_int (Unix.getpid ())) ^ ")" 
    in ignore (Unix.write input m 0 (String.length m)) ;
       Unix.close input 
   with
     Unix.Unix_error(n,f,arg) -> 
       Printf.printf "%s(%s) : %s\n" f arg (Unix.error_message n) ;;

match Unix.fork () with
    0 -> for i=0 to 5 do 
           match Unix.fork() with
               0 -> write_pid input ; exit 0
             | _ -> ()
         done ;
         Unix.close input 
  | _  -> Unix.close input;     
          let s = ref ""  and  buff = String.create 5 
          in while true do
               match Unix.read output buff 0 5 with
                   0 -> Printf.printf "My grandchildren are %s\n" !s ; exit 0
                 | n -> s := !s ^ (String.sub buff 0 n) ^ "."
            done ;;

We obtain the trace:

My grandchildren are (1067.3).(1067.4).(1067.8).(1067.7).(1067.6).(1067.5).

We have inserted a period after each chunk read. This makes it possible to see from the trace the successive contents of the pipe. Note how the reading is desynchronized: whenever an entry is made, even a partial one, it is consumed.
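Because a single read may return only part of what was written, a reader usually loops until read returns 0. The helper below is a sketch of that loop; the name read_all and the chunk size of 5 (the buffer size of the example) are our choices, and the code assumes a recent OCaml release with the bytes type.

```ocaml
(* Requires the unix library. *)

(* Collect everything readable from [fd] until end of file,
   reading at most 5 bytes at a time. *)
let read_all (fd : Unix.file_descr) : string =
  let chunk = Bytes.create 5 in
  let acc = Buffer.create 64 in
  let rec loop () =
    match Unix.read fd chunk 0 (Bytes.length chunk) with
    | 0 -> Buffer.contents acc            (* all writers have closed *)
    | n -> Buffer.add_subbytes acc chunk 0 n; loop ()
  in
  loop ()
```

The chunks are concatenated in the order in which they were read, so the queue order of the pipe is preserved.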

Named pipes.

Some Unix systems support named pipes, which look as if they were normal files. Two processes that are not related by descent can then communicate using the name of the pipe. The following function allows you to create such a pipe.


# Unix.mkfifo ;;
- : string -> Unix.file_perm -> unit = <fun>

The file descriptors necessary to use the pipe are obtained by openfile, as for usual files, but their behavior is that of pipes. In particular, the function lseek can not be used, since pipes are queues.

Warning

mkfifo is not implemented for Windows.
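The following sketch shows a named pipe in use on a Unix system. For the sake of a self-contained example the two processes are created by fork, but they share nothing except the name of the pipe. The file name, the message and the function names are hypothetical, and the code assumes a recent OCaml release with the bytes type.

```ocaml
(* Requires the unix library; Unix systems only. *)

(* false when the descriptor refuses lseek, as pipes do *)
let seekable fd =
  try ignore (Unix.lseek fd 0 Unix.SEEK_CUR); true
  with Unix.Unix_error (Unix.ESPIPE, _, _) -> false

let fifo_demo () : string * bool =
  (* hypothetical file name in the temporary directory *)
  let path =
    Filename.concat (Filename.get_temp_dir_name ())
      (Printf.sprintf "fifo_demo_%d" (Unix.getpid ())) in
  Unix.mkfifo path 0o600;
  match Unix.fork () with
  | 0 ->
    (* child: open the entry by name, write, and leave *)
    (try
       let wr = Unix.openfile path [Unix.O_WRONLY] 0 in
       ignore (Unix.write_substring wr "ping" 0 4);
       Unix.close wr
     with Unix.Unix_error _ -> ());
    exit 0
  | pid ->
    (* parent: opening the exit blocks until a writer opens the entry *)
    let rd = Unix.openfile path [Unix.O_RDONLY] 0 in
    let can_seek = seekable rd in
    let buf = Bytes.create 16 in
    let n = Unix.read rd buf 0 16 in
    Unix.close rd;
    ignore (Unix.waitpid [] pid);
    Unix.unlink path;                    (* remove the name again *)
    (Bytes.sub_string buf 0 n, can_seek)
```

The second component of the result records whether lseek was accepted on the descriptor.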

Communication Channels

The Unix module provides a high level function allowing you to start a program associating with it input or output channels of the calling program:


# Unix.open_process ;;
- : string -> in_channel * out_channel = <fun>

The argument is the name of the program, or more precisely the calling path of the program, as we would write it to a command line interpreter. The string may contain arguments for the program to execute. The two values returned are channels: the input channel is connected to the standard output of the started program, and the output channel to its standard input. The program is executed in parallel with the calling program.

Warning

The program started by open_process is executed via a call to the Unix command line interpreter /bin/sh.
The use of that function is therefore only possible for systems that have this interpreter.

We can end the execution of a program started by open_process by using:


# Unix.close_process ;;
- : in_channel * out_channel -> Unix.process_status = <fun>

The argument is the pair of channels associated with the process we want to close. The return value is the execution status of the process whose termination we wait for.
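As a small sketch of the pair open_process / close_process, the function below sends one line to the standard Unix command cat and reads back what cat copies to its output. The function name is ours, and we assume that cat can be found by /bin/sh. The output channel is closed before reading, so that cat sees the end of its input whatever buffering it uses.

```ocaml
(* Requires the unix library and a Unix shell providing cat. *)

let echo_through_cat (line : string) : string * Unix.process_status =
  let ic, oc = Unix.open_process "cat" in
  output_string oc (line ^ "\n");
  close_out oc;                  (* flushes, then signals end of input *)
  let answer = input_line ic in  (* what cat wrote to its stdout *)
  let status = Unix.close_process (ic, oc) in
  (answer, status)
```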

There are variants of these functions, opening and closing only one input or output channel:


# Unix.open_process_in ;;
- : string -> in_channel = <fun>
# Unix.close_process_in ;;
- : in_channel -> Unix.process_status = <fun>
# Unix.open_process_out ;;
- : string -> out_channel = <fun>
# Unix.close_process_out ;;
- : out_channel -> Unix.process_status = <fun>
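When only the output of a command is of interest, the one-channel variant is enough. The following sketch, with a function name of our choosing, returns the first line printed by a command together with its termination status.

```ocaml
(* Requires the unix library and /bin/sh. *)

let first_line_of (cmd : string) : string option * Unix.process_status =
  let ic = Unix.open_process_in cmd in
  (* None when the command prints nothing *)
  let line = try Some (input_line ic) with End_of_file -> None in
  let status = Unix.close_process_in ic in
  (line, status)
```

For instance, first_line_of "echo hello" yields the line hello and the status of a normal exit with code 0.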

Here is a nice small example for the use of open_process: we start ocaml from ocaml!


# let n_print_string s = print_string s ; print_string "(* <-- *)" ;;
val n_print_string : string -> unit = <fun>
# let p () =
   let oc_in, oc_out = Unix.open_process "/usr/local/bin/ocaml" 
   in n_print_string (input_line oc_in) ; print_newline() ;
      n_print_string (input_line oc_in) ; print_newline() ;
      print_char (input_char oc_in) ;
      print_char (input_char oc_in) ;
      flush stdout ;
      let s = input_line stdin 
      in output_string oc_out s ;
         output_string oc_out "#quit\n" ;
         flush oc_out ;
         let r = String.create 250 in
         let n = input oc_in r 0 250 
         in n_print_string (String.sub r 0 n) ;
            print_string "Thank you for your visit\n" ;
            flush stdout ;
            Unix.close_process (oc_in, oc_out) ;;
val p : unit -> Unix.process_status = <fun>

Calling the function p starts an Objective CAML toplevel. We note that the version found in directory /usr/local/bin is 2.03. The first four read operations retrieve the header displayed by the toplevel. The line let x = 1.2 +. 5.6;; is read from the keyboard, then sent to oc_out (the output channel bound to the standard input of the new process). The new process evaluates this Objective CAML expression and writes the result to its standard output, which is bound to the input channel oc_in. This result is read by the function input and displayed, followed by the string "Thank you for your visit". We send the command #quit;; to exit the new process.

# p();;
        Objective Caml version 2.03

# let x = 1.2 +. 5.6;;
val x : float = 6.8
Thank you for your visit
- : Unix.process_status = Unix.WSIGNALED 13
#

Signals under Unix

One possibility to communicate with a process is to send it a signal. A signal may be received at any moment during the execution of a program. Reception of a signal causes a logical interruption. The execution of a program is interrupted to treat the received signal. Then the execution continues at the point of interruption. The number of signals is quite restricted (32 under Linux). The information carried by a signal is quite rudimentary: it is only the identity (the number) of the signal. The processes have a predefined reaction to each signal. However, the reactions can be redefined for most of the signals.

The data and functions to handle signals are distributed between the modules Sys and Unix. The module Sys contains signals conforming to the POSIX norm (described in [Ste92]) as well as some functions to handle signals. The module Unix defines the function kill to send a signal. The use of signals under Windows is restricted to sigint.

A signal may have several sources: the keyboard, an illegal attempt to access memory, etc. A process may send a signal to another by calling the function


# Unix.kill ;;
- : int -> int -> unit = <fun>

Its first parameter is the PID of the receiver. The second is the signal which we want to send.
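A minimal sketch of kill in action: a parent creates a child that merely sleeps, sends it sigterm, and collects its termination status. The function name is ours, and we assume that the child keeps the default reaction to sigterm, which is termination.

```ocaml
(* Requires the unix library; Unix systems only. *)

let kill_child () : Unix.process_status =
  match Unix.fork () with
  | 0 -> Unix.sleep 30; exit 0       (* child: idle until signalled *)
  | pid ->
    Unix.kill pid Sys.sigterm;       (* receiver first, then the signal *)
    snd (Unix.waitpid [] pid)        (* wait for the child to end *)
```

The status returned is of the form Unix.WSIGNALED s, where s is the signal that ended the child.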

Handling Signals

There are three categories of reactions associated with a signal. For each category there is a constructor of type signal_behavior:

Signal_default: the default behavior defined by the system. In most cases this is the termination of the process, with or without the creation of a file describing the process state (core file).
Signal_ignore: the signal is ignored.
Signal_handle: the behavior is redefined by an Objective CAML function of type int -> unit which is passed as an argument to the constructor. When the signal is received, its number is passed to the handling function.

On reception of a signal, the execution of the receiving process is diverted to the function handling the signal. The function allowing you to redefine the behavior associated with a signal is provided by the module Sys:


# Sys.set_signal;;
- : int -> Sys.signal_behavior -> unit = <fun>

The first argument is the signal to redefine. The second is the associated behavior.
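As an illustration, the sketch below installs a handler, has the process send the signal to itself, and reports the number the handler received. The function name is ours. Since the runtime may defer a handler to a safe point, the function sleeps briefly (Unix.sleepf, available in recent OCaml releases) until the handler has run.

```ocaml
(* Requires the unix library; Unix systems only. *)

let catch_own_signal (signo : int) : int option =
  let received = ref None in
  Sys.set_signal signo (Sys.Signal_handle (fun s -> received := Some s));
  Unix.kill (Unix.getpid ()) signo;
  (* wait at most about one second for the handler to run *)
  let tries = ref 100 in
  while !received = None && !tries > 0 do
    Unix.sleepf 0.01;
    decr tries
  done;
  Sys.set_signal signo Sys.Signal_default;   (* back to the default *)
  !received
```

The handler receives the same value that was passed to set_signal, so catch_own_signal Sys.sigusr1 returns Some Sys.sigusr1.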

The module Sys provides another modification function to handle signals:


# Sys.signal ;;
- : int -> Sys.signal_behavior -> Sys.signal_behavior = <fun>

It behaves like set_signal, except that it returns in addition the value associated with the signal before the modification. So we can write a function returning the behavioral value associated with a signal. This can be done even without changing this value:


# let signal_behavior s =
  let b = Sys.signal s Sys.Signal_default 
  in Sys.set_signal s b ; b ;;
val signal_behavior : int -> Sys.signal_behavior = <fun>
# signal_behavior Sys.sigint;;
- : Sys.signal_behavior = Sys.Signal_handle <fun>

However, the behavior associated with some signals can not be changed. Therefore our function can not be used for all signals:


# signal_behavior Sys.sigkill ;;
Uncaught exception: Sys_error("Invalid argument")
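The value returned by Sys.signal lends itself to a save-and-restore idiom. The wrapper below is a sketch with a name of our choosing: it installs a behavior for the duration of a computation and restores the previous one afterwards, whether the computation returns or raises an exception. It assumes a recent OCaml release, where match accepts exception cases.

```ocaml
(* Run [f ()] with [behavior] installed for signal [signo],
   then put the previous behavior back. *)
let with_signal_behavior signo behavior f =
  let previous = Sys.signal signo behavior in
  match f () with
  | result -> Sys.set_signal signo previous; result
  | exception e -> Sys.set_signal signo previous; raise e
```

For example, with_signal_behavior Sys.sigint Sys.Signal_ignore f runs f while CTRL-C is ignored.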

Some Signals

We illustrate the use of some essential signals.

sigint.

This signal is generally associated with the key combination CTRL-C. In the following small example we modify the reaction to this signal so that the receiving process is not interrupted until the third occurrence of the signal.

We create the following file ctrlc.ml:

let sigint_handle =
 let n = ref 0 
 in function _ -> incr n ;
                  match !n with
                      1 -> print_string "You just pushed CTRL-C\n"
                    | 2 -> print_string "You pushed CTRL-C a second time\n"
                    | 3 -> print_string "If you insist ...\n" ; exit 1
                    | _ -> () ;;
Sys.set_signal Sys.sigint (Sys.Signal_handle sigint_handle) ;;
match Unix.fork () with 
    0 -> while true do () done 
  | pid -> Unix.sleep 1 ; Unix.kill pid Sys.sigint  ;
           Unix.sleep 1 ; Unix.kill pid Sys.sigint  ;
           Unix.sleep 1 ; Unix.kill pid Sys.sigint  ;;

This program simulates the push of the key combination CTRL-C by sending the signal sigint. We obtain the following execution trace:

$ ocamlc -i -o ctrlc ctrlc.ml
val sigint_handle : int -> unit
$ ctrlc
You just pushed CTRL-C
You pushed CTRL-C a second time
If you insist ...

sigalrm.

Another frequently used signal is sigalrm, which is associated with the system clock. It can be sent by the function


# Unix.alarm ;;
- : int -> int = <fun>

The argument specifies the number of seconds to wait before the signal sigalrm is sent. The return value is the number of seconds that remained before a previously scheduled alarm would have gone off, or 0 if no alarm was set.
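The return value can be observed without ever receiving the signal, because an argument of 0 cancels the pending alarm. The following sketch uses a function name of our choosing and a delay of 10 seconds picked for illustration.

```ocaml
(* Requires the unix library; Unix systems only. *)

let alarm_demo () : int * int =
  ignore (Unix.alarm 0);              (* cancel any alarm already pending *)
  let none_pending = Unix.alarm 10 in (* schedule sigalrm in 10 seconds *)
  let remaining = Unix.alarm 0 in     (* cancel it, get the seconds left *)
  (none_pending, remaining)
```

The first component is 0, since no alarm was pending; the second is the time that was left on the cancelled alarm, at most 10.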

We use this function and the associated signal to define the function timeout, which starts the execution of another function and interrupts it, if necessary, when the indicated time has elapsed. More precisely, the function timeout takes as arguments a function f, the argument arg expected by f, the duration (time) of the timeout and the value (default_value) to be returned when the duration time has elapsed.

A timeout is handled as follows:

We modify the behavior associated with the signal sigalrm so that a Timeout exception is thrown.
We take care to remember the behavior associated originally with sigalrm, so that it can be restored.
We start the clock.
We distinguish two cases:
1. If everything goes well, we restore the original state of sigalrm and return the value of the calculation.
2. If not, we restore sigalrm, and if the duration has elapsed, we return the default value.

Here are the corresponding definitions and a small example:


# exception Timeout ;;
exception Timeout
# let sigalrm_handler = Sys.Signal_handle (fun _ -> raise Timeout) ;;
val sigalrm_handler : Sys.signal_behavior = Sys.Signal_handle <fun>
# let timeout f arg time default_value =
   let old_behavior = Sys.signal Sys.sigalrm sigalrm_handler in
   (* cancel a still pending alarm, then restore the old behavior *)
   let reset_sigalrm () = 
     ignore (Unix.alarm 0) ; Sys.set_signal Sys.sigalrm old_behavior 
   in ignore (Unix.alarm time) ;
      try  let res = f arg in reset_sigalrm () ; res  
      with Timeout -> reset_sigalrm () ; default_value
         | exc -> reset_sigalrm () ; raise exc ;;
val timeout : ('a -> 'b) -> 'a -> int -> 'b -> 'b = <fun>
# let iterate n = for i = 1 to n do () done ; n ;;
val iterate : int -> int = <fun>


# Printf.printf "1st execution : %d\n" (timeout iterate 10 1 (-1));
  Printf.printf "2nd execution : %d\n" (timeout iterate 100000000 1 (-1)) ;;

1st execution : 10
2nd execution : -1
- : unit = ()

sigusr1 and sigusr2.

These two signals are provided only for the programmer. They are not used by the operating system.

In this example, reception of the signal sigusr1 by the child triggers the output of the content of variable i.

let i = ref 0  ;;
let write_i s = Printf.printf "signal received (%d) -- i=%d\n" s !i ;
                  flush stdout ;;
Sys.set_signal Sys.sigusr1 (Sys.Signal_handle write_i) ;;

match Unix.fork () with 
    0 -> while true do incr i done 
  | pid -> Unix.sleep 0  ; Unix.kill pid Sys.sigusr1  ;
           Unix.sleep 3 ; Unix.kill pid Sys.sigusr1  ;
           Unix.sleep 1  ; Unix.kill pid Sys.sigkill

Here is the trace of a program execution:

signal received (10) -- i=0
signal received (10) -- i=167722808

When we examine the trace, we can see that after having executed the code associated with signal sigusr1 the first time, the child process continues to execute the loop and to increment i.

sigchld.

This signal is sent to a parent on termination of a process. We will use it to make a parent more attentive to the evolution of its children. Here's how:

We define a function handling the signal sigchld. On reception of this signal, it handles all terminated children and terminates the parent when it does not have any more children (exception Unix_error). In order not to block the parent while some of its children are still alive, we use waitpid instead of wait.
The main program, after having redefined the reaction associated with sigchld, loops to create five children. After this, the parent does something else (loop while true) until its children have terminated.

let rec sigchld_handle s =
  try  let pid, _ = Unix.waitpid [Unix.WNOHANG] 0 
       in if pid <> 0 
          then ( Printf.printf "%d is dead and buried at signal %d\n" pid s ;
                 flush stdout ;
                 sigchld_handle s )
  with Unix.Unix_error(_, "waitpid", _) -> exit 0 ;;

let i = ref 0 
in Sys.set_signal Sys.sigchld (Sys.Signal_handle sigchld_handle) ;
   while true do
     match Unix.fork() with
         0 -> let pid = Unix.getpid () 
              in Printf.printf "Creation of %d\n" pid ; flush stdout ;
                 Unix.sleep (Random.int (5+ !i)) ;
                 Printf.printf "Termination of %d\n" pid ; flush stdout ;
                 exit 0
       | _ -> incr i ; if !i = 5 then while true do () done 
   done ;;

We obtain the trace:

Creation of 10658
Creation of 10659
Creation of 10662
Creation of 10661
Creation of 10660
Termination of 10662
10662 is dead and buried at signal 17
Termination of 10658
10658 is dead and buried at signal 17
Termination of 10660
Termination of 10659
10660 is dead and buried at signal 17
10659 is dead and buried at signal 17
Termination of 10661
10661 is dead and buried at signal 17
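The non-blocking behavior of waitpid on which the handler relies can also be observed in isolation. In the sketch below, whose function name is ours, the parent polls a child that is still asleep, then kills it and waits for it in the usual blocking way.

```ocaml
(* Requires the unix library; Unix systems only. *)

let poll_then_reap () : int * bool * Unix.process_status =
  match Unix.fork () with
  | 0 -> Unix.sleep 30; exit 0       (* child: still alive when polled *)
  | pid ->
    (* with WNOHANG, waitpid returns 0 at once if the child is alive *)
    let polled, _ = Unix.waitpid [Unix.WNOHANG] pid in
    Unix.kill pid Sys.sigkill;
    (* without it, waitpid blocks until the child has terminated *)
    let reaped, status = Unix.waitpid [] pid in
    (polled, reaped = pid, status)
```

The first component is 0 because no child had terminated at the time of the poll; this is the case tested by pid <> 0 in sigchld_handle.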