| Anonymous | Login | Signup for a new account | 2013-05-19 16:35 CEST | ![]() |
| Main | My View | View Issues | Change Log | Roadmap |
| View Issue Details [ Jump to Notes ] | [ Issue History ] [ Print ] | |||||||
| ID | Project | Category | View Status | Date Submitted | Last Update | |||
| 0000025 | OCaml | OCaml general | public | 2000-01-14 23:05 | 2000-02-07 14:46 | |||
| Reporter | administrator | |||||||
| Assigned To | ||||||||
| Priority | normal | Severity | minor | Reproducibility | always | |||
| Status | closed | Resolution | fixed | |||||
| Platform | OS | OS Version | ||||||
| Product Version | ||||||||
| Target Version | Fixed in Version | |||||||
| Summary | 0000025: threads and marshalling with I/O | |||||||
| Description | Full_Name: Gerd Stolpmann Version: 2.04 -with-pthread OS: Linux-2.2.13 with glibc-6.1 Submission from: master.proxy.ision.net (195.180.208.40) Marshalling, including the input/output_value functions in Pervasives, is not reentrant at all. This very clear from the C source code in byterun/extern.c and intern.c: extern.c: There is a static variable extern_block which is actually the buffer that is written in do_write; writing happens in a "blocking section" and so concurrent Marshal.to_channel calls can lead to situations where more than one thread is executing the code within this section with the same extern_block buffer. intern.c: The same problems with the buffer intern_input which is also filled in a blocking section. This piece of code often segfaults with Marshal.to_channel: let l = ref [] in let s = String.make 10000 'x' in for i = 1 to 50 do let id = Thread.create (fun () -> for j = 1 to 100 do let ch = open_out "/dev/null" in Marshal.to_channel ch s []; close_out ch; done; ) () in l := id :: !l done; List.iter (fun id -> Thread.join id) !l ;; The same with Marshal.from_channel: let s = String.make 10000 'x' in let out = open_out "r.data" in Marshal.to_channel out s []; close_out out; let l = ref [] in for i = 1 to 50 do let id = Thread.create (fun () -> for j = 1 to 100 do let ch = open_in "r.data" in ignore(Marshal.from_channel ch); close_in ch; done; ) () in l := id :: !l done; List.iter (fun id -> Thread.join id) !l ;; | |||||||
| Tags | No tags attached. | |||||||
| Attached Files | ||||||||
Notes |
|
|
(0000297) administrator (administrator) 2000-01-17 13:37 |
> Marshalling, including the input/output_value functions in Pervasives, is not > reentrant at all. This very clear from the C source code in byterun/extern.c > and intern.c: > [global variables] Please keep in mind that the two thread libraries for Caml are set up in such a way that only one thread at a time is in Caml code or in most parts of the runtime system. For native threads, this is achieved by having a master lock that any thread must acquire before running Caml code, and releases only when it is about to perform a blocking I/O operation. While the main motivation for the master lock is that the allocator and GC aren't concurrent, the master lock also protects a bunch of global variables such as those in extern.c and intern.c. Marshalling is a two-step process: first, the stream of bytes is built in memory, without releasing the master lock; then, it is written on the channel, releasing the master lock for the duration of the (potentially blocking) write. But the global variables are used only during the first step (I think -- I'll have to check the code again!). > This piece of code often segfaults with Marshal.to_channel: I'll look at that. One quick question: is this with bytecode threads or native threads? Best regards, - Xavier Leroy |
|
(0000298) administrator (administrator) 2000-01-17 20:38 |
On Mon, 17 Jan 2000, you wrote: >> Marshalling, including the input/output_value functions in Pervasives, is not >> reentrant at all. This very clear from the C source code in byterun/extern.c >> and intern.c: >> [global variables] > >Please keep in mind that the two thread libraries for Caml are set up in such a >way >that only one thread at a time is in Caml code or in most parts of the runtime >system. For native threads, this is achieved by having a master lock that >any thread must acquire before running Caml code, and releases only when it >is about to perform a blocking I/O operation. While the main motivation for >the master lock is that the allocator and GC aren't concurrent, the master lock >also protects a bunch of global variables such as those in extern.c and >intern.c. I already took the master lock into account. The buffers that are used for the "read" and "write" system calls are not protected (enough). This is very clear from "output_val": void output_val(struct channel *chan, value v, value flags) { long len; if (! channel_binary_mode(chan)) failwith("output_value: not a binary channel"); alloc_extern_block(); len = extern_value(v, flags); really_putblock(chan, extern_block, len); stat_free(extern_block); } Although "extern_block" is copied into the channel's private buffer, it is still possible that several threads use "extern_block" in parallel. Why that? Consider thread A and thread B want to "output_val" almost at the same time. Now this can happen: - thread A runs until the "write" system call is reached (somewhere in really_putblock). The master lock is released at this time. - "write" blocks for a moment - thread B can now continue, do "alloc_extern_block" again although the previous buffer is still in use - thread B does not block; it frees extern_block - thread A wakes up and tries to free extern_block again. Crash. A possible solution would be to free extern_block *immediately* after it has been copied to the channel buffer, and before the master lock will be released. The same problem applies to "input_val", because "intern_input" can be used by several threads in parallel, too. Here, the solution is very simple: do not use the global variable "intern_input" while "really_getblock" is in progress but a local pointer. Assign this local pointer to "intern_input" after the input operation is over. There is another problem with input_val: value input_val(struct channel *chan) { ... intern_input = (unsigned char *) stat_alloc(block_len); ... if (really_getblock(chan, (char *)intern_input, block_len) == 0) { ... } ... } really_getblock simply does getblock multiple times with the same buffer. int getblock(struct channel *channel, char *p, long int len) { ... if (n <= avail) { ... } else if (avail > 0) { ... } else if (n< IO_BUFFER_SIZE) { ... } else { nread = do_read(channel->fd, p, n); /* WRONG */ channel->offset += nread; return nread; } } "do_read" then passes "p" over to "read", with released master lock. This may also be the reason for the Pervasives.input bug I reported, too, because "caml_input" passes a Caml-allocated string to "getblock". This string can be corrupted by a garbage collection at the same time. >I'll look at that. One quick question: is this with bytecode threads or native >threads? Native threads, of course. Gerd -- ---------------------------------------------------------------------------- Gerd Stolpmann Telefon: +49 6151 997705 (privat) Viktoriastr. 100 64293 Darmstadt EMail: Gerd.Stolpmann@darmstadt.netsurf.de (privat) Germany ---------------------------------------------------------------------------- |
|
(0000299) administrator (administrator) 2000-02-07 14:41 |
> I already took the master lock into account. The buffers that are used for > the "read" and "write" system calls are not protected (enough). This is very > clear from "output_val": > Although "extern_block" is copied into the channel's private buffer, it is still > possible that several threads use "extern_block" in parallel. You are correct, "stat_free(extern_block)" in output_val isn't protected by the master lock. > A possible solution would be to free extern_block *immediately* after it has > been copied to the channel buffer, and before the master lock will be released. Unfortunately, the channel buffer doesn't always have enough free room to copy extern_block in it at once. My fix is to save the value of extern_block (the address of the extern buffer) in a local variable before calling really_putblock, then pass that local variable to stat_free at the end. > The same problem applies to "input_val", because "intern_input" can be used by > several threads in parallel, too. Here, the solution is very simple: do not > use the global variable "intern_input" while "really_getblock" is in progress > but a local pointer. Assign this local pointer to "intern_input" after the input > operation is over. Right. (My fix above is symmetrical). This will be in 3.00. > [In getblock:] > "do_read" then passes "p" over to "read", with released master lock. > This may also be the reason for the Pervasives.input bug I reported, too, > because "caml_input" passes a Caml-allocated string to "getblock". This > string can be corrupted by a garbage collection at the same time. Correct. There was a similar bug in Unix.read and Unix.write that was fixed a while ago, but I didn't realize the same problem occurred in buffered I/O as well. I have fixed this by suppressing the optimization in getblock (that we read directly into the Caml string, bypassing entirely the buffer, if the buffer is empty and the request large enough). In other terms, all data read goes first through the buffer (which doesn't move during GC), then is copied back to the Caml string while the master lock is held. It's slightly less efficient (all reads are now decomposed into 4K chunks) but I don't see any other way to remain GC-safe. Thanks for the bug reports and diagnostics, - Xavier Leroy > >>I'll look at that. One quick question: is this with bytecode threads or native >>threads? > > Native threads, of course. > > Gerd > -- > ---------------------------------------------------------------------------- > Gerd Stolpmann Telefon: +49 6151 997705 (privat) > Viktoriastr. 100 > 64293 Darmstadt EMail: Gerd.Stolpmann@darmstadt.netsurf.de (privat) > Germany > ---------------------------------------------------------------------------- > > |
|
(0000300) administrator (administrator) 2000-02-07 14:46 |
Fixed in 3.00 |
Issue History |
|||
| Date Modified | Username | Field | Change |
| 2005-11-18 10:13 | administrator | New Issue | |
| Copyright © 2000 - 2011 MantisBT Group |