Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

threads and marshalling with I/O #2359

Closed
vicuna opened this issue Jan 14, 2000 · 4 comments
Closed

threads and marshalling with I/O #2359

vicuna opened this issue Jan 14, 2000 · 4 comments
Labels

Comments

@vicuna
Copy link

vicuna commented Jan 14, 2000

Original bug ID: 25
Reporter: administrator
Status: closed
Resolution: fixed
Priority: normal
Severity: minor
Category: ~DO NOT USE (was: OCaml general)

Bug description

Full_Name: Gerd Stolpmann
Version: 2.04 -with-pthread
OS: Linux-2.2.13 with glibc-6.1
Submission from: master.proxy.ision.net (195.180.208.40)

Marshalling, including the input/output_value functions in Pervasives, is not
reentrant at all. This very clear from the C source code in byterun/extern.c
and
intern.c:

extern.c: There is a static variable extern_block which is actually the buffer
that is written in do_write; writing happens in a "blocking section" and so
concurrent Marshal.to_channel calls can lead to situations where more than one
thread is executing the code within this section with the same extern_block
buffer.

intern.c: The same problems with the buffer intern_input which is also filled
in a blocking section.

This piece of code often segfaults with Marshal.to_channel:

let l = ref [] in
let s = String.make 10000 'x' in
for i = 1 to 50 do
let id =
Thread.create
(fun () ->
for j = 1 to 100 do
let ch = open_out "/dev/null" in
Marshal.to_channel ch s [];
close_out ch;
done;
)
()
in
l := id :: !l
done;

List.iter
(fun id ->
Thread.join id)
!l
;;

The same with Marshal.from_channel:

let s = String.make 10000 'x' in
let out = open_out "r.data" in
Marshal.to_channel out s [];
close_out out;
let l = ref [] in
for i = 1 to 50 do
let id =
Thread.create
(fun () ->
for j = 1 to 100 do
let ch = open_in "r.data" in
ignore(Marshal.from_channel ch);
close_in ch;
done;
)
()
in
l := id :: !l
done;

List.iter
(fun id ->
Thread.join id)
!l
;;

@vicuna
Copy link
Author

vicuna commented Jan 17, 2000

Comment author: administrator

Marshalling, including the input/output_value functions in Pervasives, is not
reentrant at all. This very clear from the C source code in byterun/extern.c
and intern.c:
[global variables]

Please keep in mind that the two thread libraries for Caml are set up in such a
way
that only one thread at a time is in Caml code or in most parts of the runtime
system. For native threads, this is achieved by having a master lock that
any thread must acquire before running Caml code, and releases only when it
is about to perform a blocking I/O operation. While the main motivation for
the master lock is that the allocator and GC aren't concurrent, the master lock
also protects a bunch of global variables such as those in extern.c and
intern.c.

Marshalling is a two-step process: first, the stream of bytes is built in
memory,
without releasing the master lock; then, it is written on the channel,
releasing
the master lock for the duration of the (potentially blocking) write. But the
global variables are used only during the first step (I think -- I'll have to
check the code again!).

This piece of code often segfaults with Marshal.to_channel:

I'll look at that. One quick question: is this with bytecode threads or native
threads?

Best regards,

  • Xavier Leroy

@vicuna
Copy link
Author

vicuna commented Jan 17, 2000

Comment author: administrator

On Mon, 17 Jan 2000, you wrote:

Marshalling, including the input/output_value functions in Pervasives, is not
reentrant at all. This very clear from the C source code in byterun/extern.c
and intern.c:
[global variables]

Please keep in mind that the two thread libraries for Caml are set up in such a
way
that only one thread at a time is in Caml code or in most parts of the runtime
system. For native threads, this is achieved by having a master lock that
any thread must acquire before running Caml code, and releases only when it
is about to perform a blocking I/O operation. While the main motivation for
the master lock is that the allocator and GC aren't concurrent, the master lock
also protects a bunch of global variables such as those in extern.c and
intern.c.

I already took the master lock into account. The buffers that are used for
the "read" and "write" system calls are not protected (enough). This is very
clear from "output_val":

void output_val(struct channel *chan, value v, value flags)
{
long len;
if (! channel_binary_mode(chan))
failwith("output_value: not a binary channel");
alloc_extern_block();
len = extern_value(v, flags);
really_putblock(chan, extern_block, len);
stat_free(extern_block);
}

Although "extern_block" is copied into the channel's private buffer, it is still
possible that several threads use "extern_block" in parallel. Why that? Consider
thread A and thread B want to "output_val" almost at the same time. Now this
can happen:

  • thread A runs until the "write" system call is reached (somewhere in
    really_putblock). The master lock is released at this time.
  • "write" blocks for a moment
  • thread B can now continue, do "alloc_extern_block" again although the previous
    buffer is still in use
  • thread B does not block; it frees extern_block
  • thread A wakes up and tries to free extern_block again. Crash.

A possible solution would be to free extern_block immediately after it has
been copied to the channel buffer, and before the master lock will be released.

The same problem applies to "input_val", because "intern_input" can be used by
several threads in parallel, too. Here, the solution is very simple: do not
use the global variable "intern_input" while "really_getblock" is in progress
but a local pointer. Assign this local pointer to "intern_input" after the input
operation is over.

There is another problem with input_val:

value input_val(struct channel *chan)
{
...
intern_input = (unsigned char *) stat_alloc(block_len);
...
if (really_getblock(chan, (char *)intern_input, block_len) == 0) {
...
}
...
}

really_getblock simply does getblock multiple times with the same buffer.

int getblock(struct channel *channel, char p, long int len)
{
...
if (n <= avail) { ...
} else if (avail > 0) { ...
} else if (n< IO_BUFFER_SIZE) { ...
} else {
nread = do_read(channel->fd, p, n); /
WRONG */
channel->offset += nread;
return nread;
}
}

"do_read" then passes "p" over to "read", with released master lock.

This may also be the reason for the Pervasives.input bug I reported, too,
because "caml_input" passes a Caml-allocated string to "getblock". This
string can be corrupted by a garbage collection at the same time.

I'll look at that. One quick question: is this with bytecode threads or native
threads?

Native threads, of course.

Gerd


Gerd Stolpmann Telefon: +49 6151 997705 (privat)
Viktoriastr. 100
64293 Darmstadt EMail: Gerd.Stolpmann@darmstadt.netsurf.de (privat)
Germany

@vicuna
Copy link
Author

vicuna commented Feb 7, 2000

Comment author: administrator

I already took the master lock into account. The buffers that are used for
the "read" and "write" system calls are not protected (enough). This is very
clear from "output_val":
Although "extern_block" is copied into the channel's private buffer, it is
still
possible that several threads use "extern_block" in parallel.

You are correct, "stat_free(extern_block)" in output_val isn't protected by
the master lock.

A possible solution would be to free extern_block immediately after it has
been copied to the channel buffer, and before the master lock will be
released.

Unfortunately, the channel buffer doesn't always have enough free room to copy
extern_block in it at once.

My fix is to save the value of extern_block (the address of the extern buffer)
in a local variable before calling really_putblock, then pass that local
variable
to stat_free at the end.

The same problem applies to "input_val", because "intern_input" can be used
by
several threads in parallel, too. Here, the solution is very simple: do not
use the global variable "intern_input" while "really_getblock" is in progress
but a local pointer. Assign this local pointer to "intern_input" after the
input
operation is over.

Right. (My fix above is symmetrical). This will be in 3.00.

[In getblock:]
"do_read" then passes "p" over to "read", with released master lock.
This may also be the reason for the Pervasives.input bug I reported, too,
because "caml_input" passes a Caml-allocated string to "getblock". This
string can be corrupted by a garbage collection at the same time.

Correct. There was a similar bug in Unix.read and Unix.write that was fixed a
while ago, but I didn't realize the same problem occurred in buffered I/O as
well.

I have fixed this by suppressing the optimization in getblock (that we read
directly into the Caml string, bypassing entirely the buffer, if the buffer is
empty
and the request large enough). In other terms, all data read goes first
through
the buffer (which doesn't move during GC), then is copied back to the Caml
string
while the master lock is held. It's slightly less efficient (all reads are now
decomposed into 4K chunks) but I don't see any other way to remain GC-safe.

Thanks for the bug reports and diagnostics,

  • Xavier Leroy

I'll look at that. One quick question: is this with bytecode threads or
native
threads?

Native threads, of course.

Gerd


Gerd Stolpmann Telefon: +49 6151 997705 (privat)
Viktoriastr. 100
64293 Darmstadt EMail: Gerd.Stolpmann@darmstadt.netsurf.de (privat)
Germany

@vicuna
Copy link
Author

vicuna commented Feb 7, 2000

Comment author: administrator

Fixed in 3.00

@vicuna vicuna closed this as completed Feb 7, 2000
@vicuna vicuna added the bug label Mar 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant