Mantis Bug Tracker

ID: 0007276
Project: OCaml
Category: platform support (windows, cross-compilation, etc)
View Status: public
Date Submitted: 2016-06-16 13:12
Last Update: 2017-09-24 17:32
Reporter: djs55
Assigned To: frisch
Priority: normal
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 4.03.0
Fixed in Version: 4.04.0 +dev / +beta1 / +beta2
Summary: 0007276: Unix.select fast path misses events if the list of fds is greater than FD_SETSIZE (typically 64)
Description:

Hi,

On Windows there's a problem with the Unix.select fast path used when all
the fds are sockets: the MSDN docs for select[1] say

> The variable FD_SETSIZE determines the maximum number of descriptors in
> a set. (The default value of FD_SETSIZE is 64, which can be modified by
> defining FD_SETSIZE to another value before including Winsock2.h.)

With the default FD_SETSIZE of 64, if the sockets list is of length 65+
then later sockets will be silently ignored. The attached program demonstrates
this effect on OCaml 4.03 and 4.02.3.
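
To illustrate the silent drop, here is a minimal portable model of a count-plus-array set, assuming (as described above) that adding to a full set simply does nothing. All names (`model_fd_set`, `MODEL_FD_SETSIZE`, `model_dropped`, ...) are hypothetical stand-ins, not the Winsock definitions, so the sketch builds without Windows headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the default capacity quoted from the MSDN docs. */
#define MODEL_FD_SETSIZE 64

/* Assumed layout: a counter plus an array of opaque handles. */
typedef struct {
  unsigned int count;
  unsigned long handles[MODEL_FD_SETSIZE];
} model_fd_set;

void model_fd_zero(model_fd_set *set) {
  assert(set != NULL);
  set->count = 0;
}

/* Adds a handle; when the set is full the handle is dropped
   and no error is reported to the caller. */
void model_fd_add(model_fd_set *set, unsigned long handle) {
  unsigned int i;
  for (i = 0; i < set->count; i++)
    if (set->handles[i] == handle) return; /* already present */
  if (set->count < MODEL_FD_SETSIZE)
    set->handles[set->count++] = handle;
}

int model_fd_isset(const model_fd_set *set, unsigned long handle) {
  unsigned int i;
  for (i = 0; i < set->count; i++)
    if (set->handles[i] == handle) return 1;
  return 0;
}

/* Adds n distinct handles and returns how many never made it into the set. */
unsigned int model_dropped(unsigned int n) {
  model_fd_set set;
  unsigned int i, dropped = 0;
  model_fd_zero(&set);
  for (i = 0; i < n; i++) model_fd_add(&set, 1000UL + i);
  for (i = 0; i < n; i++)
    if (!model_fd_isset(&set, 1000UL + i)) dropped++;
  return dropped;
}
```

In this model a list of 64 handles loses nothing, while a list of 65 loses its last entry, which matches the symptom of activity on later sockets going unnoticed.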

A possible fix appears to be to fall back to the slow path for lists of fds longer
than FD_SETSIZE (as we already do for lists of fds of varying types).
For example, the following patch:

diff --git a/otherlibs/win32unix/select.c b/otherlibs/win32unix/select.c
index 0e21db8..b592e3a 100644
--- a/otherlibs/win32unix/select.c
+++ b/otherlibs/win32unix/select.c
@@ -909,9 +909,12 @@ static int fdlist_to_fdset(value fdlist, fd_set *fdset)
 {
   value l, c;
   FD_ZERO(fdset);
+ int used = 0;
   for (l = fdlist; l != Val_int(0); l = Field(l, 1)) {
     c = Field(l, 0);
     if (Descr_kind_val(c) == KIND_SOCKET) {
+ used++;
+ if (used > FD_SETSIZE) return 0;
       FD_SET(Socket_val(c), fdset);
     } else {
       DEBUG_PRINT("Non socket value encountered");

This is similar to "PR#5563: harden Unix.select against file descriptors
above FD_SETSIZE" except that we can still handle these larger lists using
the more generic (but more complicated) path.
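
The decision the patch makes can be sketched in isolation as follows. This is only an illustration under assumptions: the OCaml list is modelled as a C array, and `sketch_descr`, `sketch_kind`, `SKETCH_FD_SETSIZE` and `sketch_can_use_fast_path` are hypothetical names, not identifiers from select.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the default FD_SETSIZE on Windows. */
#define SKETCH_FD_SETSIZE 64

enum sketch_kind { SKETCH_KIND_SOCKET, SKETCH_KIND_OTHER };

struct sketch_descr {
  enum sketch_kind kind;
  unsigned long handle; /* opaque handle value */
};

/* Returns 1 when every descriptor is a socket and the whole list fits
   in one set, so the fast path is safe. Returns 0 otherwise, in which
   case the caller is expected to take the generic slow path. */
int sketch_can_use_fast_path(const struct sketch_descr *fds, size_t n) {
  size_t used = 0, i;
  assert(fds != NULL || n == 0);
  for (i = 0; i < n; i++) {
    if (fds[i].kind != SKETCH_KIND_SOCKET) return 0; /* mixed kinds */
    used++;
    if (used > SKETCH_FD_SETSIZE) return 0;          /* too many sockets */
  }
  return 1;
}
```

Counting entries rather than checking handle values is the point here: the limit being enforced is the capacity of the set, not the magnitude of any descriptor.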


[1] https://msdn.microsoft.com/en-gb/library/windows/desktop/ms740141(v=vs.85).aspx
Steps To Reproduce:

On Windows (I'm using the Cygwin-based installer from https://fdopen.github.io/opam-repository-mingw/)

ocamlfind ocamlopt -package unix -linkpkg -o test.exe test.ml
./test.exe

For me the large Unix.select will time out and the program prints "ERROR" and exits with code 1. On OSX and on Windows with the patch above applied, it prints "OK" and exits with code 0.
Additional Information:

I encountered this when filtering all network connections from a VM through the Mirage TCP/IP stack and out into the Internet via regular sockets. Occasionally the I/O would stop (while other Lwt timer threads continued), and then after some timeout it would wake up again. I believe this is because the list of active socket connections was longer than 64 and all the activity was on the fds at the end of the list.
Tags: No tags attached.
Attached Files: test.ml (1,705 bytes) 2016-06-16 13:12

Relationships:
related to 0005563 (closed): Caml.Unix.select doesn't bounds-check file-descriptor integer

Notes:
(0015987)
frisch (developer)
2016-06-21 09:25

Can you confirm the following:

   - FD_SETSIZE is supposed to give the maximum number of file descriptors that can go into an fd_set, independently of their numerical values. Hence the fix suggested here, which counts the number of such file descriptors.

   - On some systems, FD_SETSIZE gives an upper bound on the fd values that can go into an fd_set. This corresponds to the fix in 0005563 (which would be overly restrictive under the other interpretation above). I assume that most Unix-like systems on which OCaml is supported follow this convention. Is that right?
(0015988)
frisch (developer)
2016-06-21 09:36

I confirm the behavior (tested with the MSVC 32-bit port), and reversing the list before the final select in the example has the expected effect.
(0015989)
frisch (developer)
2016-06-21 12:41

Fixed by commit f642817.

It would be nice to include the test in the testsuite, but I'm concerned that its use of the network interface might break it. (E.g. on my machine, it triggered a Windows security popup on first use.)
(0015990)
frisch (developer)
2016-06-21 12:41

@djs55: feel free to give your real name for the Changes file if you wish.
(0015994)
djs55 (reporter)
2016-06-22 17:51

Many thanks for resolving this! For the Changes file could you put my real name "David Scott"?

Regarding your two statements, the Microsoft docs for select say:
> The variable FD_SETSIZE determines the maximum number of descriptors in a set.
and
> Internally, socket handles in an fd_set structure are not represented as bit flags as in Berkeley Unix. Their data representation is opaque.
so I think this confirms your first statement is correct on Windows.

I'm not enough of a Unix expert to confirm your second statement ("FD_SETSIZE gives an upper bound to fd values that can go into an fd_set") but I think that is correct. I believe Unix-like systems all use low integers as file descriptors (unlike Windows which uses opaque handles, probably pointer values). I believe select on Unix has traditionally used simple bitmaps where including fd "n" means setting bit "n" to 1. Therefore I think on Unix people are more concerned with ensuring the bitmaps are large enough to represent the highest-possible fd value (and the performance implications of frequently scanning large bitmaps).
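
For contrast, the bitmap interpretation can be modelled as below, where including fd "n" sets bit "n" and the limit therefore bounds the fd value rather than the number of entries. This is a sketch, not any system's actual fd_set: the names (`bitmap_fd_set`, `bitmap_add`, ...) are hypothetical and the limit of 1024 is an assumed value chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit for illustration only. */
#define BITMAP_FD_SETSIZE 1024
#define BITMAP_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

typedef struct {
  unsigned long bits[BITMAP_FD_SETSIZE / BITMAP_BITS_PER_WORD];
} bitmap_fd_set;

void bitmap_zero(bitmap_fd_set *s) {
  assert(s != NULL);
  memset(s, 0, sizeof *s);
}

/* Sets bit fd. Returns 0 without touching the set when fd is out of
   range, which is the kind of bounds check on the fd value that the
   related issue 0005563 is about. */
int bitmap_add(bitmap_fd_set *s, int fd) {
  if (fd < 0 || fd >= BITMAP_FD_SETSIZE) return 0;
  s->bits[(size_t)fd / BITMAP_BITS_PER_WORD] |=
      1UL << ((size_t)fd % BITMAP_BITS_PER_WORD);
  return 1;
}

int bitmap_isset(const bitmap_fd_set *s, int fd) {
  if (fd < 0 || fd >= BITMAP_FD_SETSIZE) return 0;
  return (s->bits[(size_t)fd / BITMAP_BITS_PER_WORD] >>
          ((size_t)fd % BITMAP_BITS_PER_WORD)) & 1UL ? 1 : 0;
}
```

Under this representation the number of descriptors in the set never matters, only whether each value is below the limit, which is why the two fixes check different things.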

Issue History:
Date Modified Username Field Change
2016-06-16 13:12 djs55 New Issue
2016-06-16 13:12 djs55 File Added: test.ml
2016-06-21 09:09 frisch Relationship added related to 0005563
2016-06-21 09:25 frisch Note Added: 0015987
2016-06-21 09:36 frisch Note Added: 0015988
2016-06-21 09:36 frisch Assigned To => frisch
2016-06-21 09:36 frisch Status new => confirmed
2016-06-21 12:41 frisch Note Added: 0015989
2016-06-21 12:41 frisch Note Added: 0015990
2016-06-21 12:42 frisch Status confirmed => resolved
2016-06-21 12:42 frisch Fixed in Version => 4.04.0 +dev / +beta1 / +beta2
2016-06-21 12:42 frisch Resolution open => fixed
2016-06-22 17:51 djs55 Note Added: 0015994
2017-02-23 16:46 doligez Category OCaml windows => platform support (windows, etc)
2017-02-23 17:16 doligez Category platform support (windows, etc) => platform support (windows, cross-compilation, etc)
2017-09-24 17:32 xleroy Status resolved => closed

