|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0005325||OCaml||OCaml general||public||2011-08-02 22:50||2012-05-24 19:11|
|Target Version||Fixed in Version||4.01.0+dev|
|Summary||0005325: Blocked Unix.recv in one thread blocks Unix.send in another thread under Windows|
|Description||It appears not to be possible to write to a socket in one thread if another thread is blocked in a [recv] call on the same socket (and presumably vice versa).|
This behaviour only seems to affect Windows: I tried both Windows 7 with 3.12.0 and Windows XP with 3.10.1, with the same effect.
|Additional Information||The attached file should be compiled with ocamlopt -o foo.exe -thread unix.cmxa threads.cmxa Foo.ml|
It can then be invoked with, say:
and the line
GET / HTTP/1.0
followed by enter twice. On Linux, you'll see an HTTP redirect response. On Windows, the program will hang after the first carriage return, eventually crashing with an exception when Google (or whoever) closes the socket at the other end on a timeout.
The equivalent code in C definitely works, so it appears to be a problem with the Unix module or the OCaml runtime.
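For readers without the attachments, the pattern at issue can be sketched in C on a POSIX system, where it is expected to succeed. This is not the attached Foo.ml or foo.c: the function name send_while_recv_blocked, the use of socketpair in place of a real TCP connection, and the 100 ms pause are all assumptions made for illustration.

```c
#include <poll.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* State shared between the main thread and the receiver thread. */
struct receiver {
    int fd;        /* socket the receiver blocks on */
    char buf[64];  /* bytes received */
    ssize_t len;   /* result of recv() */
};

/* Receiver thread: blocks in recv() until data arrives on r->fd. */
static void *receiver_main(void *arg)
{
    struct receiver *r = arg;
    r->len = recv(r->fd, r->buf, sizeof r->buf, 0);
    return NULL;
}

/* Hypothetical helper: returns 0 if send() on a socket completes while
   another thread is blocked in recv() on that same socket, and the
   peer's reply then reaches the blocked receiver; returns -1 otherwise. */
int send_while_recv_blocked(void)
{
    int sv[2];
    struct receiver r;
    pthread_t tid;
    char peer_buf[4];
    int exchange_ok;

    /* Assumption: a local socket pair stands in for the TCP connection. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    memset(&r, 0, sizeof r);
    r.fd = sv[0];
    if (pthread_create(&tid, NULL, receiver_main, &r) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Sleep about 100 ms so the receiver is very likely inside recv().
       This is a heuristic, not a guarantee. */
    poll(NULL, 0, 100);

    /* Send on the socket the other thread is reading from, then let the
       peer end read the request and answer it. */
    exchange_ok =
        send(sv[0], "ping", 4, 0) == 4 &&
        recv(sv[1], peer_buf, 4, MSG_WAITALL) == 4 &&
        memcmp(peer_buf, "ping", 4) == 0 &&
        send(sv[1], "pong", 4, 0) == 4;

    /* Closing the peer also wakes the receiver if the exchange failed. */
    close(sv[1]);
    pthread_join(tid, NULL);
    close(sv[0]);

    if (exchange_ok && r.len == 4 && memcmp(r.buf, "pong", 4) == 0)
        return 0;
    return -1;
}
```

Calling send_while_recv_blocked() from main and checking for 0 is enough to exercise it. Going by the report, the corresponding send on an affected Windows build would be expected to stall until the receiver's recv returns.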
|Tags||No tags attached.|
|Attached Files|| Foo.ml (1,723 bytes) 2011-08-02 22:50|
foo.c (1,680 bytes) 2011-12-21 10:21
patch-delete.patch (2,483 bytes) 2011-12-22 09:35
patch-invert.patch (2,182 bytes) 2011-12-22 09:36
|I checked the Win32 implementation of send and recv, and the master lock is properly released, allowing both system calls to run in parallel. So, I suspect the mutual exclusion occurs in the Win32/Winsock system calls themselves. Concerning your comparison with C, did you keep in mind that OCaml creates sockets in blocking mode, which may not be the case in your C program? That could at least explain the difference.|
You're correct - I hadn't seen the setsockopt calls in select.c and accept.c. Attached is my hacked-up C equivalent code (compiled with gcc -mno-cygwin -o foo.exe foo.c -lws2_32).
According to MSDN (http://msdn.microsoft.com/en-us/library/windows/desktop/ms740532(v=vs.85).aspx), SO_OPENTYPE is deprecated and WSASocket should be used instead, with WSA_FLAG_OVERLAPPED not given in dwFlags. If you alter the call to WSASocket in my example to have 0 for the last parameter then the C code exhibits the same problem as the OCaml code - so it's Winsock, not OCaml specifically.
But why is this Microsoft-specific socket option being set in the first place? As far as I can see, all you're doing is creating sockets which can't use overlapped operations (which aren't used anyway) - can't the call simply be removed? Note that overlapped I/O and blocking/non-blocking in Microsoft parlance are *unrelated* concepts: see the first paragraph of http://msdn.microsoft.com/en-us/library/windows/desktop/ms740087(v=VS.85).aspx - ioctlsocket still controls blocking or non-blocking settings.
The "SO_OPENTYPE = SO_SYNCHRONOUS_NONALERT" hack probably goes back to the early days of the win32unix library, i.e. Windows NT 4 / Windows 98 (or maybe even 3.5 / 95). At that time, I think it was necessary so that sockets would be created in blocking mode. But many things have changed since then. We'd gladly consider a patch to modernize this aspect of win32unix (and maybe others too), provided you give it some testing on your applications first.
|I'm just putting a Windows 7 virtual PC together for some serious testing ... I agree; I think all these bugs are related to this. Will get back to you later today!|
OK - I've tested two possible ways of fixing it.
The first is simply to remove lines 31-41 & 49 in otherlibs/win32unix/select.c and 29, 34-42 & 48-52 in otherlibs/win32unix/accept.c.
The second is to invert the logic of the getsockopt block instead - i.e. ensure that SO_OPENTYPE is zero rather than non-zero before calling socket (or accept).
I don't know which you'd prefer - the first option (deleting the problem code) was my instinct; the second one guards against another linked-in C library messing around with Winsock options (I don't know whether this is something you'd want to care about in general or not, though).
Happy to provide a patch for either, if necessary.
Note that neither option fixes PR 5327, surprisingly.
|Patches against trunk for both options attached...|
Thanks for the patches. For the sake of simplicity, I went with the "delete" patch, which is now applied in SVN trunk, commit 11966.
|Because we lack the time to fix PR#5578 before release 4.00, I reverted the patch on branch version/4.00 but left it on trunk.|
|2011-08-02 22:50||dra||New Issue|
|2011-08-02 22:50||dra||File Added: Foo.ml|
|2011-12-20 13:40||xleroy||Note Added: 0006410|
|2011-12-20 13:40||xleroy||Status||new => feedback|
|2011-12-21 10:21||dra||File Added: foo.c|
|2011-12-21 10:36||dra||Note Added: 0006427|
|2011-12-21 10:36||dra||Status||feedback => new|
|2011-12-21 12:08||xleroy||Relationship added||related to 0005327|
|2011-12-21 12:21||xleroy||Note Added: 0006439|
|2011-12-21 12:21||xleroy||Status||new => feedback|
|2011-12-21 12:41||dra||Note Added: 0006440|
|2011-12-21 12:41||dra||Status||feedback => new|
|2011-12-21 18:14||dra||Note Added: 0006470|
|2011-12-22 09:35||dra||File Added: patch-delete.patch|
|2011-12-22 09:36||dra||File Added: patch-invert.patch|
|2011-12-22 09:36||dra||Note Added: 0006484|
|2011-12-28 11:40||xleroy||Note Added: 0006547|
|2011-12-28 11:40||xleroy||Status||new => resolved|
|2011-12-28 11:40||xleroy||Resolution||open => fixed|
|2011-12-28 11:40||xleroy||Fixed in Version||=> 3.13.0+dev|
|2012-04-07 12:32||xleroy||Relationship added||related to 0005578|
|2012-05-24 19:11||xleroy||Note Added: 0007455|
|2012-05-24 19:11||xleroy||Fixed in Version||3.13.0+dev => 4.01.0+dev|