Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005578OCamlOCaml windowspublic2012-04-06 17:372013-09-17 15:45
Reporterfrisch 
Assigned To 
PrioritynormalSeveritymajorReproducibilityhave not tried
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version4.00.0 
Summary0005578: Windows: Exception raised when reading from a socket
DescriptionIn recent versions of the trunk, the following piece of code raises an exception on input_line:

let () =
  let sock = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
  Unix.setsockopt sock Unix.SO_REUSEADDR true;
  Unix.bind sock (Unix.ADDR_INET (Unix.inet_addr_any, 8090));
  Unix.listen sock 10;
  let (client, _) = Unix.accept sock in
  Unix.set_close_on_exec client;
  let ic = Unix.in_channel_of_descr client in
  print_endline (input_line ic)


(compiled with: ocamlc -custom -o test_http.exe unix.cma test_http.ml
 to trigger the error, just connect to 127.0.0.1:8090 (with telnet, a browser, ...)

The exception is:

Fatal error: exception Sys_error("Invalid argument")

I've tested this with the trunk (msvc and mingw ports). I'm pretty sure this used to work with previous versions.

Replacing input_line with Unix.recv seems to work.
TagsNo tags attached.
Attached Files

- Relationships
related to 0005325confirmed Blocked Unix.recv in one thread blocks Unix.send in another thread under Windows 

-  Notes
(0007289)
gerd (reporter)
2012-04-06 20:51

As far as I remember, the issue always existed - the Ocaml runtime does not make an attempt to recv instead of read from a socket. So it depends only on Windows whether this works. What I can imagine is that some (newer) Windows versions support it, and older don't.
(0007290)
xleroy (administrator)
2012-04-07 12:32

Could be related to the changes for PR#5325.

For the record: under Win32, Unix.file_descr values carry a "socket" / "not a socket" flag, which enables functions such as Unix.read to adapt their behavior, calling ReadFile for non-sockets and recv for sockets.

The buffered I/O system (types in_channel and out_channel), however, calls into the read() and write() functions of the C library, which Microsoft's C runtime library implement using ReadFile and WriteFile, without making a special case for sockets.

So, the issue boils down to whether ReadFile works correctly when given a socket handle. The answer seems to be "yes" on modern Windows versions (it was "no" under Windows 95/98), provided the socket is in synchronous mode. And that's where the changes described in PR#5325 could be problematic.
(0007291)
dra (reporter)
2012-04-07 14:55

Are the changes in 0005325 definitely the problem? That was about ensuring that the socket was definitely in synchronous mode? I too have seen this with my Win32 patched version of 3.12.1 and it definitely did work in previous versions of OCaml (I only hadn't raised a bug report because I hadn't had a chance yet to try it on a "clean" 3.12.1 installation).
(0007292)
xleroy (administrator)
2012-04-07 16:34

My understanding is:
- for Alain Frisch's example to work, the socket must be in synchronous mode (cf. MSDN doc for ReadFile())
- we used to have SO_OPENTYPE hackery in win32unix/, probably to ensure that all sockets are created in synchronous mode;
- the hackery was removed as part of fixing PR#5325, and now all sockets are created in the default asynchronous mode (cf. MSDN page for socket()).

All this is guesswork to be confirmed by experiments. Any experimental data is most welcome.
(0007303)
frisch (developer)
2012-04-10 06:50

I confirm that reverting socket.c and accept.c to r11965 fixes the current issue.

Does the current Unix interface make it possible to put a socket in synchronous mode?

Would it make sense to keep the socket flag in the buffered I/O system channel structure?
(0007317)
xleroy (administrator)
2012-04-10 18:54

> Does the current Unix interface make it possible to put a socket in synchronous mode?

AFAIK, Win32 provides *no way* to change the synchronous/asynchronous mode of a socket or a file handle after it's been created. This mode can only be specified when the socket is created or the file is opened.

> Would it make sense to keep the socket flag in the buffered I/O system channel structure?

By itself it wouldn't help because the buffered I/O code is shared between Unix and Win32 and uses read() and write() for I/O, which Microsoft's C runtime implements using ReadFile() and WriteFile(). At a minimum you'd have to #ifdef in byterun/io.c and improve on MSVCRT's read()/write() emulation.

At this point, I'm tempted to revert 0005325 to fix this 0005578 bug, on the ground that 0005325 is less critical than 0005578.
(0007318)
frisch (developer)
2012-04-10 19:27

> This mode can only be specified when the socket is created or the file is opened.

Just out of curiosity, is there even a clean way to specify this mode, e.g. with "accept"? The current approach looks like a (thread-unsafe, if not protected) hack.

> At this point, I'm tempted to revert 0005325 to fix this 0005578 bug, on the ground that 0005325 is less critical than 0005578.

I agree. (Disclaimer: I'm strongly biased, we don't do multi-threading.)
(0007319)
gerd (reporter)
2012-04-10 22:07

When I understand it correctly, ReadFile behaves differently when it gets an async file handle (like a socket - unless we set the socket to synchronous). I guess the "Invalid argument" is due to the missing OVERLAPPED structure ReadFile needs to process async handles.

Wouldn't be the clean solution to just handle async reads? This KB article explains how to wait for the completion of an async request: http://support.microsoft.com/kb/156932/en-us [^] . We only need to call GetOverlappedResult. This would be simple enough (if it really worked).
(0007322)
frisch (developer)
2012-04-10 23:29

> the missing OVERLAPPED structure ReadFile needs to process async handles.

The I/O code in byterun/ relies on libc read/write functions, which probably calls ReadFile/WriteFile without an OVERLAPPED structure. As Xavier wrote, a proper solution might indeed be to call directly the Win32 functions, which would allow us to pass do whatever is the right solution under Windows. This is a non-trivial change. Now we just need a volunteer to propose and test a patch! Xavier, can you confirm that such an approach would have chances to be accepted?
(0007325)
dra (reporter)
2012-04-11 11:06

Alain - yes, there is a clean way. SO_OPENTYPE is deprecated - WSASocket should be used instead with a value of 0 for dwFlags (i.e. WSA_FLAG_OVERLAPPED not included). I haven't tried it, but looking at the documentation you should then use AcceptEx instead of accept/WSAAccept as that allows you to create the socket yourself (so you can create a non-overlapped socket). It doesn't look like WSAAccept can do that directly.

Might an alternative fix be to include either extra functions or add an optional parameter to the Win32 Unix modules accept and socket functions so either accept_async and socket_async - which create *synchronous* but threadable sockets (or Unix.socket: ?async:bool -> Unix.socket_domain -> Unix.socket_type -> int etc., with the default being false).

But, I'd be happy to have a stab at "fixing" io.c - it looks to me as though caml_do_read and do_write are the only two functions requiring the extra work? Wrapped in #ifdef WINDOWS, if errno is EINVAL then we want to attempt either recv or send instead (_get_osfhandle maps the CRT FD back to the Win32 socket handle) - both will return WSAENOTSOCK for an *actual* EINVAL (i.e. for an invalid FD) so we'd just need to call caml_sys_io_error in that instance.

That does mean that reading/writing through the CRT becomes even more inefficient than it was before - but accessing sockets in that way under Windows is more about compatibility because ReadFile is already a considerably slower way of reading a socket (according to MSDN). And, frankly, if one were in the business of trying to write a high performance sockets application for Windows in OCaml, you'd write your own bindings for Winsock, rather than using the Unix module...
(0007326)
frisch (developer)
2012-04-11 11:18

dra: thanks for all the info. I suggest you wait for some feedback from Xavier before writing the patch.

I agree that degrading performance when working with sockets under Windows through OCaml channels is not a big deal. But (out of curiosity, mainly), if we keep in the channel structure the fact that a filedescr is a socket, we should be able to call directly recv/send directly, right? Why would this be slower than the current situation?
(0007327)
dra (reporter)
2012-04-11 11:35

I guess it could be done that way (it certainly wouldn't be slower, I agree) - the "problem" is that caml_do_read is used in lots of places (although a quick grep reveals that it's only in fact in io.c).

Although caml_do_read and do_write take an fd, they're in fact only ever used in the context of a channel so the signature could be safely altered to pass the channel structure instead for the Windows version. Of course, a certain amount of care is required as caml_do_read is an exported primitive.

Let's wait and see what Xavier says... it's obviously beneficial not to hurt performance where possible, but that will involve peppering io.c with more #ifdefs!
(0007328)
gerd (reporter)
2012-04-11 12:47

dra: not sure whether this is the right way. We have currently two options:

(a) Stick to non-overlapped sockets, and hope that read() works
(b) Use overlapped sockets, and replace read() with ReadFile + postprocessing for the overlapped case

I see several problems with (a). Some socket functionality seems to be unavailable. MSDN mentions the socket timeouts (SO_RCVTIMEO, SO_SNDTIMEO) on the page for WSASocket. It is unclear what this means exactly. At least Unix exports these options. The other problem is the strange AcceptEx function, which is not directly available, and seems to have restrictions. "The AcceptEx function uses overlapped I/O, unlike the accept function." This sounds as if you cannot use it with non-overlapped sockets at all (but who knows). There is also a strange sentence that the accept socket can only be passed to certain functions, but the whole description is contradictory, because the page mentions other functions like getpeeraddr and CreateIoCompletionPort. All in all, the description for AcceptEx sounds as if this function is not for general use, but only for specific cases even Microsoft does not know exactly.

I think (b) would have fewer uncertainties. You would only have to get the file handle from the fd (get_osfhandle), and then call ReadFile/WriteFile directly - i.e. we develop our own versions of read/write that can deal with overlapped I/O.
(0007329)
dra (reporter)
2012-04-11 13:12

I agree - option b) is the best. It's the only way to get threading working properly (which is what was fixed in PR 5325 but has then caused this PR).

The WSASocket/AcceptEx stuff was just what "should" be being done if you wanted non-overlapped sockets using "modern" Winsock calls as MSDN lists SO_OPENTYPE as deprecated - it wasn't clear which version of accept/WSAAccept/AcceptEx is required to create non-overlapping sockets. It may just be that there's a gap in the API (Microsoft release an incomplete API: never!!)
(0007377)
frisch (developer)
2012-04-18 14:39

I mark this issue as "block", because I really think we need to make a decision before the next release.
(0007456)
xleroy (administrator)
2012-05-24 19:14

Reverted patch for 0005325 in version/4.00, so that the present issue is not present in the upcoming 4.00.0 release. I leave this PR open because we still need to do something about it in a later release. Downgrading the "block" status to "major" as a consequence.
(0007457)
xleroy (administrator)
2012-05-24 19:17

My thoughts on how we could address this issue. It means reimplementing bits of Microsoft's C runtime library and make them better Here is a plan of action:

byterun/io.c: in do_read and do_write, #ifdef the calls to read() and
write(), replacing them by calls to caml_win32_read and
caml_win32_write if _WIN32 is defined.

byterun/win32.c: implement the caml_win32_{read,write} functions as
follows (taking read as an example):

Minimal:
        call read(fd, buf, len) // from the C runtime library
        if error, call recv(_get_osfhandle(fd), buf, len, 0)

Problem: if the fd is in O_TEXT mode, read() converts the text,
but recv() does not. Do we live with this or go out of our way to fix
it? Is there any use for O_TEXT mode over a socket? Can we just report an error if [set_binary_mode_in ic false] is ever called on a channel obtained from a socket? Opinions from RFC experts are welcome.

(0009425)
frisch (developer)
2013-06-06 22:59

Xavier: do you remember what's the current status of this on the trunk (according to the note, you only reverted in version/4.00)? It seems the proper fix won't be done before the next release, and we should at least ensure that we don't introduce a regression here.
(0009491)
frisch (developer)
2013-06-14 14:00

There seem to be some confusion here. The original patch (commit 11966) was undone by Xavier (commit 12480) on branch 4.00 only, with the hope we would reach a better solution on the trunk. This commit was imported by Damien (commit 12720) while merging changes from 4.00 onto the trunk, and he later undid it (commit 13743), because that change was not intended to be on the trunk. I've undid that last commit on 4.01 (commit 13779) so that we don't get a regression in the next release. So, to summarize: 4.01 has today the same old version (4.00 and before); trunk has still the one which fixes 0005325 but breaks 0005578.

I propose to undo the original patch from the trunk as well, so that we don't replay the same vaudeville again. Keeping 0005325 open should be enough to remember about it, and this would reduce the risk of introducing a regression in a future release. I'll do it in a few days unless someone objects to it.
(0009493)
frisch (developer)
2013-06-14 14:03

Pushing to 4.02 now that 4.01 is out of danger.
(0009660)
frisch (developer)
2013-07-01 10:32

As said, commit 13860 re-undoes commit 11966 (on trunk). The trunk is now synchronized with 4.01 and free of this bug 0005578.
(0010056)
xleroy (administrator)
2013-08-01 11:05
edited on: 2013-08-14 16:52

If I correctly followed the vaudeville, this PR was fixed both in 4.00 and in 4.01 even though it reappeared somewhere in between. I'm marking it resolved. The one that remains open in 5325.

(0010367)
waern (reporter)
2013-09-17 15:45

In case this information is helpful to anyone (perhaps in making a decision on how to proceed): I did a naive test implementation of xleroy's plan of action and it seems to work fine together with the fix of 0005325.

- Issue History
Date Modified Username Field Change
2012-04-06 17:37 frisch New Issue
2012-04-06 20:51 gerd Note Added: 0007289
2012-04-07 12:32 xleroy Note Added: 0007290
2012-04-07 12:32 xleroy Status new => feedback
2012-04-07 12:32 xleroy Relationship added related to 0005325
2012-04-07 14:55 dra Note Added: 0007291
2012-04-07 16:34 xleroy Note Added: 0007292
2012-04-10 06:50 frisch Note Added: 0007303
2012-04-10 06:50 frisch Status feedback => new
2012-04-10 15:53 doligez Status new => acknowledged
2012-04-10 18:54 xleroy Note Added: 0007317
2012-04-10 19:27 frisch Note Added: 0007318
2012-04-10 22:07 gerd Note Added: 0007319
2012-04-10 23:29 frisch Note Added: 0007322
2012-04-11 11:06 dra Note Added: 0007325
2012-04-11 11:18 frisch Note Added: 0007326
2012-04-11 11:35 dra Note Added: 0007327
2012-04-11 12:47 gerd Note Added: 0007328
2012-04-11 13:12 dra Note Added: 0007329
2012-04-18 14:39 frisch Note Added: 0007377
2012-04-18 14:39 frisch Severity minor => block
2012-05-24 19:14 xleroy Note Added: 0007456
2012-05-24 19:17 xleroy Note Added: 0007457
2012-05-24 19:17 xleroy Severity block => major
2012-06-20 11:26 frisch Category OCaml otherlibs => OCaml windows
2012-07-06 16:14 doligez Target Version => 4.01.0+dev
2012-07-31 13:36 doligez Target Version 4.01.0+dev => 4.00.1+dev
2012-09-06 19:22 frisch Target Version 4.00.1+dev => 4.00.2+dev
2013-06-06 22:59 frisch Note Added: 0009425
2013-06-06 22:59 frisch Target Version 4.00.2+dev => 4.01.0+dev
2013-06-14 14:00 frisch Note Added: 0009491
2013-06-14 14:03 frisch Note Added: 0009493
2013-06-14 14:03 frisch Target Version 4.01.0+dev => 4.02.0+dev
2013-07-01 10:32 frisch Note Added: 0009660
2013-07-12 18:15 doligez Target Version 4.02.0+dev => 4.01.1+dev
2013-08-01 11:05 xleroy Note Added: 0010056
2013-08-01 11:05 xleroy Status acknowledged => resolved
2013-08-01 11:05 xleroy Resolution open => fixed
2013-08-01 11:05 xleroy Fixed in Version => 4.00.0
2013-08-01 11:05 xleroy Target Version 4.01.1+dev =>
2013-08-14 16:52 doligez Note Edited: 0010056 View Revisions
2013-09-17 15:45 waern Note Added: 0010367


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker