Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006276OCamlOCaml runtime systempublic2013-12-17 10:202014-02-08 12:49
Reporterdim 
Assigned Todim 
PrioritynormalSeverityminorReproducibilityalways
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version 
Target Version4.01.1+devFixed in Version4.01.1+dev 
Summary0006276: BLocking stubs not releasing the runtime in stdlib and unix
DescriptionThe following functions might block and do not release the runtime lock:

  Sys.chdir
  Sys.close
  Sys.file_exists
  Sys.is_directory
  Sys.readdir
  Sys.remove
  Sys.rename
  Unix.readdir
  Unix.opendir
  Unix.close
  Unix.closedir
  Unix.rename
  Unix.remove

This can be a problem when using NFS for instance (that's how we noticed it).
TagsNo tags attached.
Attached Filesdiff file icon release-runtime.diff [^] (25,866 bytes) 2013-12-18 09:32 [Show Content]
? file icon traverse.ml [^] (530 bytes) 2013-12-18 09:32 [Show Content]

- Relationships
related to 0004801closedshinwell calls to lseek() don't release the runtime lock 
has duplicate 0004995assignedshinwell calls to stat() don't release the runtime lock 

-  Notes
(0010736)
dim (developer)
2013-12-17 10:21

I don't think there is any particular reason not to release the runtime in these functions? I'm preparing a patch.
(0010740)
dim (developer)
2013-12-17 18:05

Patch attached. If there is no objections I'll commit it.
(0010742)
gasche (developer)
2013-12-17 23:16

Have I correctly understood that then change involves doing copy for the string arguments (usually file paths) of those functions, when there often was not before? Would that make a performance difference for, say, recursive traversal of a large filesystem?
(0010746)
dim (developer)
2013-12-18 09:38
edited on: 2013-12-18 09:38

> Have I correctly understood that then change involves doing copy for the string arguments (usually file paths) of those functions, when there often was not before?

Yes

> Would that make a performance difference for, say, recursive traversal of a large filesystem?

It is going to be slightly slower, yes. With the attached traverse.ml, on a hierarchy of 1M files, I get a 3% slowdown.

I can understand that all these copy are a bit frustrating, but not releasing the lock means that you can block all threads because one of them is accessing a NFS.

(0010748)
gasche (developer)
2013-12-18 11:30

> With the attached traverse.ml, on a hierarchy of 1M files, I get a 3% slowdown.

That seems reasonable.

> I can understand that all these copy are a bit frustrating,
> but not releasing the lock means that you can block all threads
> because one of them is accessing a NFS.

I agree copying and releasing is a better compromise.


I'm not familiar with the FFI, so I can't say whether your patch is correct. (There are good explanations of enter_section/leave_section in http://d.hatena.ne.jp/camlspotter/20100309/1268111257 [^] ).

Can you explain a bit when exactly do you need to add CAMLparam/CAMLreturn? My understanding was that they were need when the GC is called (and caml_stat_alloc{_string} does not call the GC). Does releasing the lock require using those? Why does your patch add them to caml_sys_remove, but not caml_sys_rename, for example?
(0010749)
dim (developer)
2013-12-18 11:42

> Can you explain a bit when exactly do you need to add CAMLparam/CAMLreturn?

Some of the stubs are using the [path] variable for the error message after the call. So we need to ensure it is still valid.

> My understanding was that they were need when the GC is called (and caml_stat_alloc{_string} does not call the GC). Does releasing the lock require using those?

Yes, the GC might run in another thread for instance.

> Why does your patch add them to caml_sys_remove, but not caml_sys_rename, for example?

caml_sys_remove uses [path] for the error message but caml_sys_rename doesn't. I guess it wouldn't harm to add them everywhere.
(0010763)
dim (developer)
2013-12-23 17:25

commited, revision 14384 and 14385.
(0010764)
yallop (developer)
2013-12-24 11:32

I think 'File_offset_val' needs to be moved outside the blocking section in 'unix_truncate_64', and the call to 'access' in 'unix_access' needs to use 'p' instead of 'String_val(path)'.
(0010765)
dim (developer)
2013-12-24 13:10

Indeed. Fixed in commits 14387 and 14388. Thanks!
(0010773)
yallop (developer)
2014-01-02 02:31

Another thought: it's probably very slightly better to have the calls to caml_stat_free inside the blocking sections.
(0010855)
dsheets (reporter)
2014-01-29 02:26
edited on: 2014-01-29 02:34

I believe ftruncate, fchown, and fchmod also need to have the lock released.

(0010857)
dim (developer)
2014-01-29 10:36

Indeed. Fixed in commits 14425 and 14426. Thanks!

- Issue History
Date Modified Username Field Change
2013-12-17 10:20 dim New Issue
2013-12-17 10:20 dim Status new => assigned
2013-12-17 10:20 dim Assigned To => dim
2013-12-17 10:21 dim Note Added: 0010736
2013-12-17 17:59 dim File Added: release-runtime.diff
2013-12-17 18:05 dim Note Added: 0010740
2013-12-17 23:16 gasche Note Added: 0010742
2013-12-18 09:31 dim File Deleted: release-runtime.diff
2013-12-18 09:32 dim File Added: release-runtime.diff
2013-12-18 09:32 dim File Added: traverse.ml
2013-12-18 09:38 dim Note Added: 0010746
2013-12-18 09:38 dim Note Edited: 0010746 View Revisions
2013-12-18 11:30 gasche Note Added: 0010748
2013-12-18 11:42 dim Note Added: 0010749
2013-12-23 17:25 dim Note Added: 0010763
2013-12-23 17:26 dim Status assigned => resolved
2013-12-23 17:26 dim Fixed in Version => 4.01.1+dev
2013-12-23 17:26 dim Resolution open => fixed
2013-12-24 11:32 yallop Note Added: 0010764
2013-12-24 13:10 dim Note Added: 0010765
2014-01-02 02:31 yallop Note Added: 0010773
2014-01-03 01:36 yallop Relationship added related to 0004801
2014-01-29 02:26 dsheets Note Added: 0010855
2014-01-29 02:33 dsheets Note Added: 0010856
2014-01-29 02:34 dsheets Note Edited: 0010855 View Revisions
2014-01-29 02:34 dsheets Note Deleted: 0010856
2014-01-29 10:36 dim Note Added: 0010857
2014-02-08 12:49 yallop Relationship added has duplicate 0004995


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker