New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows: Exception raised when reading from a socket #5578
Comments
Comment author: gerd As far as I remember, the issue always existed - the Ocaml runtime does not make an attempt to recv instead of read from a socket. So it depends only on Windows whether this works. What I can imagine is that some (newer) Windows versions support it, and older don't. |
Comment author: @xavierleroy Could be related to the changes for #5325. For the record: under Win32, Unix.file_descr values carry a "socket" / "not a socket" flag, which enables functions such as Unix.read to adapt their behavior, calling ReadFile for non-sockets and recv for sockets. The buffered I/O system (types in_channel and out_channel), however, calls into the read() and write() functions of the C library, which Microsoft's C runtime library implement using ReadFile and WriteFile, without making a special case for sockets. So, the issue boils down to whether ReadFile works correctly when given a socket handle. The answer seems to be "yes" on modern Windows versions (it was "no" under Windows 95/98), provided the socket is in synchronous mode. And that's where the changes described in #5325 could be problematic. |
Comment author: @dra27 Are the changes in #5325 definitely the problem? That was about ensuring that the socket was definitely in synchronous mode? I too have seen this with my Win32 patched version of 3.12.1 and it definitely did work in previous versions of OCaml (I only hadn't raised a bug report because I hadn't had a chance yet to try it on a "clean" 3.12.1 installation). |
Comment author: @xavierleroy My understanding is:
All this is guesswork to be confirmed by experiments. Any experimental data is most welcome. |
Comment author: @alainfrisch I confirm that reverting socket.c and accept.c to r11965 fixes the current issue. Does the current Unix interface make it possible to put a socket in synchronous mode? Would it make sense to keep the socket flag in the buffered I/O system channel structure? |
Comment author: @xavierleroy
AFAIK, Win32 provides no way to change the synchronous/asynchronous mode of a socket or a file handle after it's been created. This mode can only be specified when the socket is created or the file is opened.
By itself it wouldn't help because the buffered I/O code is shared between Unix and Win32 and uses read() and write() for I/O, which Microsoft's C runtime implements using ReadFile() and WriteFile(). At a minimum you'd have to #ifdef in byterun/io.c and improve on MSVCRT's read()/write() emulation. At this point, I'm tempted to revert #5325 to fix this #5578 bug, on the ground that #5325 is less critical than #5578. |
Comment author: @alainfrisch
Just out of curiosity, is there even a clean way to specify this mode, e.g. with "accept"? The current approach looks like a (thread-unsafe, if not protected) hack.
I agree. (Disclaimer: I'm strongly biased, we don't do multi-threading.) |
Comment author: gerd When I understand it correctly, ReadFile behaves differently when it gets an async file handle (like a socket - unless we set the socket to synchronous). I guess the "Invalid argument" is due to the missing OVERLAPPED structure ReadFile needs to process async handles. Wouldn't be the clean solution to just handle async reads? This KB article explains how to wait for the completion of an async request: http://support.microsoft.com/kb/156932/en-us . We only need to call GetOverlappedResult. This would be simple enough (if it really worked). |
Comment author: @alainfrisch
The I/O code in byterun/ relies on libc read/write functions, which probably calls ReadFile/WriteFile without an OVERLAPPED structure. As Xavier wrote, a proper solution might indeed be to call directly the Win32 functions, which would allow us to pass do whatever is the right solution under Windows. This is a non-trivial change. Now we just need a volunteer to propose and test a patch! Xavier, can you confirm that such an approach would have chances to be accepted? |
Comment author: @dra27 Alain - yes, there is a clean way. SO_OPENTYPE is deprecated - WSASocket should be used instead with a value of 0 for dwFlags (i.e. WSA_FLAG_OVERLAPPED not included). I haven't tried it, but looking at the documentation you should then use AcceptEx instead of accept/WSAAccept as that allows you to create the socket yourself (so you can create a non-overlapped socket). It doesn't look like WSAAccept can do that directly. Might an alternative fix be to include either extra functions or add an optional parameter to the Win32 Unix modules accept and socket functions so either accept_async and socket_async - which create synchronous but threadable sockets (or Unix.socket: ?async:bool -> Unix.socket_domain -> Unix.socket_type -> int etc., with the default being false). But, I'd be happy to have a stab at "fixing" io.c - it looks to me as though caml_do_read and do_write are the only two functions requiring the extra work? Wrapped in #ifdef WINDOWS, if errno is EINVAL then we want to attempt either recv or send instead (_get_osfhandle maps the CRT FD back to the Win32 socket handle) - both will return WSAENOTSOCK for an actual EINVAL (i.e. for an invalid FD) so we'd just need to call caml_sys_io_error in that instance. That does mean that reading/writing through the CRT becomes even more inefficient than it was before - but accessing sockets in that way under Windows is more about compatibility because ReadFile is already a considerably slower way of reading a socket (according to MSDN). And, frankly, if one were in the business of trying to write a high performance sockets application for Windows in OCaml, you'd write your own bindings for Winsock, rather than using the Unix module... |
Comment author: @alainfrisch dra: thanks for all the info. I suggest you wait for some feedback from Xavier before writing the patch. I agree that degrading performance when working with sockets under Windows through OCaml channels is not a big deal. But (out of curiosity, mainly), if we keep in the channel structure the fact that a filedescr is a socket, we should be able to call directly recv/send directly, right? Why would this be slower than the current situation? |
Comment author: @dra27 I guess it could be done that way (it certainly wouldn't be slower, I agree) - the "problem" is that caml_do_read is used in lots of places (although a quick grep reveals that it's only in fact in io.c). Although caml_do_read and do_write take an fd, they're in fact only ever used in the context of a channel so the signature could be safely altered to pass the channel structure instead for the Windows version. Of course, a certain amount of care is required as caml_do_read is an exported primitive. Let's wait and see what Xavier says... it's obviously beneficial not to hurt performance where possible, but that will involve peppering io.c with more #ifdefs! |
Comment author: gerd dra: not sure whether this is the right way. We have currently two options: (a) Stick to non-overlapped sockets, and hope that read() works I see several problems with (a). Some socket functionality seems to be unavailable. MSDN mentions the socket timeouts (SO_RCVTIMEO, SO_SNDTIMEO) on the page for WSASocket. It is unclear what this means exactly. At least Unix exports these options. The other problem is the strange AcceptEx function, which is not directly available, and seems to have restrictions. "The AcceptEx function uses overlapped I/O, unlike the accept function." This sounds as if you cannot use it with non-overlapped sockets at all (but who knows). There is also a strange sentence that the accept socket can only be passed to certain functions, but the whole description is contradictory, because the page mentions other functions like getpeeraddr and CreateIoCompletionPort. All in all, the description for AcceptEx sounds as if this function is not for general use, but only for specific cases even Microsoft does not know exactly. I think (b) would have fewer uncertainties. You would only have to get the file handle from the fd (get_osfhandle), and then call ReadFile/WriteFile directly - i.e. we develop our own versions of read/write that can deal with overlapped I/O. |
Comment author: @dra27 I agree - option b) is the best. It's the only way to get threading working properly (which is what was fixed in PR 5325 but has then caused this PR). The WSASocket/AcceptEx stuff was just what "should" be being done if you wanted non-overlapped sockets using "modern" Winsock calls as MSDN lists SO_OPENTYPE as deprecated - it wasn't clear which version of accept/WSAAccept/AcceptEx is required to create non-overlapping sockets. It may just be that there's a gap in the API (Microsoft release an incomplete API: never!!) |
Comment author: @alainfrisch I mark this issue as "block", because I really think we need to make a decision before the next release. |
Comment author: @xavierleroy Reverted patch for #5325 in version/4.00, so that the present issue is not present in the upcoming 4.00.0 release. I leave this PR open because we still need to do something about it in a later release. Downgrading the "block" status to "major" as a consequence. |
Comment author: @xavierleroy My thoughts on how we could address this issue. It means reimplementing bits of Microsoft's C runtime library and make them better Here is a plan of action: byterun/io.c: in do_read and do_write, #ifdef the calls to read() and byterun/win32.c: implement the caml_win32_{read,write} functions as Minimal: Problem: if the fd is in O_TEXT mode, read() converts the text, |
Comment author: @alainfrisch Xavier: do you remember what's the current status of this on the trunk (according to the note, you only reverted in version/4.00)? It seems the proper fix won't be done before the next release, and we should at least ensure that we don't introduce a regression here. |
Comment author: @alainfrisch There seem to be some confusion here. The original patch (commit 11966) was undone by Xavier (commit 12480) on branch 4.00 only, with the hope we would reach a better solution on the trunk. This commit was imported by Damien (commit 12720) while merging changes from 4.00 onto the trunk, and he later undid it (commit 13743), because that change was not intended to be on the trunk. I've undid that last commit on 4.01 (commit 13779) so that we don't get a regression in the next release. So, to summarize: 4.01 has today the same old version (4.00 and before); trunk has still the one which fixes #5325 but breaks #5578. I propose to undo the original patch from the trunk as well, so that we don't replay the same vaudeville again. Keeping #5325 open should be enough to remember about it, and this would reduce the risk of introducing a regression in a future release. I'll do it in a few days unless someone objects to it. |
Comment author: @alainfrisch Pushing to 4.02 now that 4.01 is out of danger. |
Comment author: @alainfrisch As said, commit 13860 re-undoes commit 11966 (on trunk). The trunk is now synchronized with 4.01 and free of this bug #5578. |
Comment author: @xavierleroy If I correctly followed the vaudeville, this PR was fixed both in 4.00 and in 4.01 even though it reappeared somewhere in between. I'm marking it resolved. The one that remains open in 5325. |
Comment author: waern In case this information is helpful to anyone (perhaps in making a decision on how to proceed): I did a naive test implementation of xleroy's plan of action and it seems to work fine together with the fix of #5325. |
…ve a proper fix. git-svn-id: http://caml.inria.fr/svn/ocaml/version/4.01@13779 f963ae5c-01c2-4b8c-9fe0-0dff7051ff02
Original bug ID: 5578
Reporter: @alainfrisch
Status: closed (set by @xavierleroy on 2015-07-24T08:38:58Z)
Resolution: fixed
Priority: normal
Severity: major
Fixed in version: 4.00.0
Category: platform support (windows, cross-compilation, etc)
Related to: #4466 #5325
Monitored by: @dra27
Bug description
In recent versions of the trunk, the following piece of code raises an exception on input_line:
let () =
let sock = Unix.socket Unix.PF_INET Unix.SOCK_STREAM 0 in
Unix.setsockopt sock Unix.SO_REUSEADDR true;
Unix.bind sock (Unix.ADDR_INET (Unix.inet_addr_any, 8090));
Unix.listen sock 10;
let (client, _) = Unix.accept sock in
Unix.set_close_on_exec client;
let ic = Unix.in_channel_of_descr client in
print_endline (input_line ic)
(compiled with: ocamlc -custom -o test_http.exe unix.cma test_http.ml
to trigger the error, just connect to 127.0.0.1:8090 (with telnet, a browser, ...)
The exception is:
Fatal error: exception Sys_error("Invalid argument")
I've tested this with the trunk (msvc and mingw ports). I'm pretty sure this used to work with previous versions.
Replacing input_line with Unix.recv seems to work.
The text was updated successfully, but these errors were encountered: