Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006120OCamlOCaml otherlibspublic2013-08-06 12:472015-11-19 10:54
ReporterChristophe Troestler 
Assigned To 
Platformx86_64OSWindowsOS Version7
Product Version4.00.1 
Target Version4.03.0+devFixed in Version 
Summary0006120: Windows links
DescriptionThe documentation of the Unix module [^] says that "link", "symlink" and "readlink" are not implemented. This is no longer true for "link". Moreover, it seems symlinks exist on recent versions of Windows: [^] [^]
TagsNo tags attached.
Attached Filespatch file icon PR6120.patch [^] (29,619 bytes) 2015-07-29 14:34 [Show Content]
patch file icon PR6120-st_ino.patch [^] (5,730 bytes) 2015-07-30 16:43 [Show Content]

- Relationships

-  Notes
dra (reporter)
2015-07-21 12:36

Is this being actively worked on? I've just done some research/C-based PoC on supporting native Windows symlinks for OPAM's infrastructure and would be happy to work on the stubs needed for symlink and readlink.

As far as I can see, apart from lots of documentation explaining the subtleties and irritations of Windows symbolic links, what is needed is one additional function has_symlink: unit -> bool (returning HAS_SYMLINK for *nix, but using OpanProcessToken to test for the SeCreateSymbolicLinkPrivilege on Windows) and an optional parameter to the symlink function, because on Windows you have to specify if the symbolic link is to a directory or a regular file. I'd suggest that the type should become symlink: ?kind:file_kind -> string -> string -> unit where specifying ~kind on *nix causes Invalid_argument to be raised (similarly on Windows if the file_kind is not S_DIR or S_REG). The default behaviour for this parameter on Windows would be to determine the type of the symlink target (if it exists) and otherwise create a file symlink.
dbuenzli (reporter)
2015-07-21 17:05
edited on: 2015-07-21 17:05

> I'd suggest that the type should become symlink: ?kind:file_kind -> string -> string -> unit where specifying ~kind on *nix causes Invalid_argument to be raised (similarly on Windows if the file_kind is not S_DIR or S_REG).

If it's going to raise invalid_argument on Unix then there is no point of changing the interface of this function and rather an alternative function should be provided for windows (as you will have to conditionalize the code on Sys.win32 anyways).

But then, can't the windows stub simply determine if the symbolic link is to a directory or regular file and hide this peculiarity to the Unix module user ?

dra (reporter)
2015-07-21 17:27
edited on: 2015-07-21 17:27

Why does the code calling the proposed Unix.symlink have to be conditional for Sys.win32?

dbuenzli (reporter)
2015-07-21 18:37

Well since the ?kind argument raises on Unix I cannot use Unix.symlink in a cross-platform fashion and need to conditionalize the code according to windows or unix. I would like to be able to use Unix.symlink without having to bother whether I'm on Unix or Windows (as the rest of the Unix module allows me).
dra (reporter)
2015-07-21 18:50

You can't use Unix.symlink in a cross-platform way without a vast amount of worrying if you're on Windows unless your definition of cross-platform excludes Windows (that's kind of already the case - and there's at least one other instance in Unix of differences in behaviour).

In a "cross-platform" usage you just wouldn't specify ?kind - and accept the default behaviour on Windows (which will be correct in all except the case of a symlink targeting a directory which hasn't been created).

If you need to create a symbolic link to a non-existent directory, you might write:

if Sys.win32 then
  Unix.symlink ~kind:Unix.S_DIR "foo" "bar"
  Unix.symlink "foo" "bar"

but therefore the unix version of Unix still needs either the optional argument (or a dummy version of Unix.windows_symlink if you want a separate function instead of an optional argument).

I suppose there's a case for having the unix version of Unix just silently ignore the optional argument?
dbuenzli (reporter)
2015-07-21 19:00
edited on: 2015-07-21 19:00

My question is what prevents you in the stub of Unix.symlink on windows to actually perform what is needed to match the semantics of symlink(2) ?

Alternatively we could also simply return ENOENT for the case of the non-existent directory (and document the fact that this happens on windows).

> I suppose there's a case for having the unix version of Unix just silently ignore the optional argument?

That would be at least a saner option but then I still don't exactly see why we can't simply avoid breaking compatibility here by handling this in the windows stub (namely doing the Unix.symlink ~kind:Unix.S_DIR in the stub if the target dir doesn't exist).

dra (reporter)
2015-07-21 19:04

There are two kinds of symlink on Windows - if the target doesn't exist you need to know whether it's a file or a directory (if you don't specify ~kind:S_DIR then it would assume S_REG).

The Unix module, like Cygwin, could try to wallpaper over this corner case - but like Cygwin's support, it would make it largely worthless on Windows, because you'd need to implement it properly in another library.
dbuenzli (reporter)
2015-07-21 19:15

> There are two kinds of symlink on Windows - if the target doesn't exist you need to know whether it's a file or a directory (if you don't specify ~kind:S_DIR then it would assume S_REG)

Ah ok this is what I didn't get. Then yes adding an argument to denote the intent seems to be the way to go as the program may actually know this. But then this should simply be *ignored* on Unix, so that we don't have to specialize code paths for no reasons. Also I would simply have this:

Unix.symlink : ?is_dir:bool -> string -> string

as it avoids the need for Invalid_arguments.
dra (reporter)
2015-07-21 19:25

Agreed - I was too set on the "this argument shouldn't be specified for unix" model!

is_dir is a much better solution (I was only lazily using file_kind to avoid creating yet another variant type). There is a third kind of link (junctions - which could possibly have been represented with S_LNK) - but these really should be handled by a separate Windows-only function, as they're something conceptually very different, and handled by a different Windows API call.
xleroy (administrator)
2015-07-25 18:09

OK for an optional "is_dir" argument, if we absolutely must. But by all means let's ignore it under Unix!

As to whether "this is actively worked on", the answer is no to the best of my knowledge. A proof-of-concept patch for otherlibs/win32unix would be appreciated.
dra (reporter)
2015-07-29 14:40
edited on: 2015-07-30 00:59

Patch attached - individual commits available from [^]

In related work, I discovered that the CRT's implementation of stat has had a potted history (see the comments added in stat.c). The latest implementation (see the Great C Runtime Refactoring at [^] - the Microsoft CRT is now amusingly written in C++) includes a basically working implementation of stat (although I think they have broken st_dev). However, no implementation of lstat is provided, so I hope you won't baulk too much at the inclusion of a full implementation of stat because it's needed in order to provide lstat! Microsoft make the relevant code for the CRT available, which allows us to ensure that the behaviour of stat is equivalent. The only thing I "fixed" is that for some reason Microsoft hard-code st_nlink = 1, even though the actual value is available.

I've tested all 4 Windows native ports on 64-bit Windows 7 and also verified the build on Ubuntu 14.4. The only thing which made me slightly nervous is the necessity (for mingw) to switch from _fstati64 to _fstat64 but I'm reasonably certain looking at the history of win32unix/stat.c that the i64 variants would have been chosen to maximise compatibility with older C runtimes which are now irrelevant.

I have also verified that a 32-bit compiled executable still runs correctly on Windows XP, although personally I think that all software should be trying as noisily as possible to cease supporting Windows XP!

Incidentally, as I happened to have a Windows 2000 Virtual Box on this machine, OCaml no longer supports Windows 2000 (I don't know when that broke).

dra (reporter)
2015-07-30 16:43

One further commit - included as a separate patch on top of the previous one (and pushed to github).

My motivation for fixing stat has been trying to get ocamlbuild to work with a source directory which includes symbolic links (which the dose library uses). I originally tracked down the inconsistency in the CRTs because ocamlbuild could build dose when compiled with mingw64 but not with msvc.

The fix for stat fixed the problem with ocamlbuild - but the inclusion of a working lstat promptly broke it again! I discovered that the reason is in ocamlbuild/ which, quite reasonably, assumes that (st_dev, st_ino) (in this case My_unix.stat_key as defined in ocamlbuild/ uniquely identifies a regular file or directory on the system, as per POSIX. Because the Microsoft implementations always give st_ino = 0, I had done this too.

The additional patch uses dwVolumeSerialNumber to populate st_dev (which is a potentially breaking change if any weird Windows application has relied on the value of st_dev being 0=A:, 1=B:, 2=C:, etc.) and uses nFileIndex to populate st_ino (see [^]). However, for maximum compatibility, the patch silently drops any bits beyond max_int - however, as far as I can tell NTFS allocates index numbers sequentially so this is highly unlikely to be a problem in 64-bit OCaml and reasonably unlikely to be an issue in 32-bit OCaml. What is does mean is that for files and directories (st_dev, st_ino) should uniquely represent a file (I think it may even be hard-link compliant, though I haven't double-checked)

I haven't, however, applied this change to fstat and I'm wondering if for consistency it should be too - it's not a lot of extra code over what I've already done for stat, because various parts are shared. I notice that we already have the case that Unix.fstat returns st_kind = S_FIFO instead of S_SOCK for sockets.

Xavier - assuming you're content with my patch so far (I use the word "content", rather than "happy"!), would you be similarly content for it to be extended to fstat and, if so, would a separate PR be appropriate?

I have to say looking at the implementations, I'd take the view that the Microsoft CRT doesn't provide stat or fstat, in the same way as it doesn't provide symlink or readlink - in other words, this doesn't feel like round a bug!
xleroy (administrator)
2015-07-30 17:21

I'm not complaining about the reimplementation of stat() -- it is clear that the implementation from the CRT is not doing enough. fstat() is not terribly useful, but why not give it the same treatment, if you can share as much code as possible. I'll let the intended users (the OPAM people, the ocamlbuild people) say whether the new implementations fit their needs.
gasche (developer)
2015-11-16 15:36
edited on: 2015-11-17 22:59

I am bumping the status of this PR a bit, based on the reports by David "dra" Allsopp that this is the main upstream change that would be necessary to have a good support for OPAM on native windows -- and cannot be worked around easily outside the distribution.

Edit: below is a comment/explanation from dra I received by email

> Since Windows Vista there is support for Unix-style symbolic
> links in Windows (since Windows 2000 there was support for
> a limited kind, but it was for directories only and only
> allowed absolute paths, amongst other limitations). There are
> two downsides to the support: firstly, in default
> configuration, only members of the Administrators
> group (i.e. root) can create them and Administrators can only
> do that when elevated (somewhat akin to having to do sudo ln
> all the time, but it’s much much much less convenient than just
> running a command like sudo!). That said, it is possible to
> configure Windows so that ordinary users can create them with
> no limitations, so they are useful, just awkward to
> set-up. Secondly, Windows has two kinds of symbolic links –
> directory symbolic links and file symbolic links. You have to
> specify which kind you are creating – it sounds very weird from
> the Unix-perspective, but it does make ‘sense’ when you see how
> they’re implemented in Windows. In particular, if a directory
> symbolic link actually dereferences to a file, then trying to
> read the directory (or open the file) results in a file system
> error.
> My patch ( [^])
> implements the missing readlink and symlink for
> win32unix. There is a small ABI-breaking but not code-breaking
> change to UNIX’s Unix module in that symlink now has an
> optional [?is_dir] parameter (to deal with the second
> problem above). I also add the function Unix.has_symlink which
> on Windows is a dynamic test to determine if the user is able
> to create symbolic links (and on other OSes is simply the value
> for HAS_SYMLINK found by configure)
> The other thing added to the patch is that Microsoft’s CRT
> implementation of stat is really broken (and they have never
> added lstat) – I add a Windows-implementation of lstat to the
> Unix module and also completely re-implemented stat in C. This
> is important as the low-quality structure returned by
> Microsoft’s stat caused ocamlbuild on Windows not to work if
> native symbolic links were being used.
> The only thing remaining, for completeness, is to re-implement
> fstat so that stat, lstat and fstat return consistent
> information.

gasche (developer)
2015-11-18 17:08

So we discussed this (but not in depth) at the developer meeting, and Xavier is sympathetic to the need and would be ready to merge a patch if there is a reasonable final patch proposed. The feature freeze for 4.03 should be in around a month, so David (dra) and Xavier would need to converge rather quickly.
frisch (developer)
2015-11-19 10:54

So, by default, regular users are not able to create symlinks? This means that to make OPAM work, we'd necessarily need to change Windows's configuration?

- Issue History
Date Modified Username Field Change
2013-08-06 12:47 Christophe Troestler New Issue
2013-08-19 17:18 doligez Status new => acknowledged
2013-08-19 17:18 doligez OS Debian GNU/Linux => Windows
2013-08-19 17:18 doligez Target Version => 4.01.1+dev
2013-08-19 17:18 doligez OS Version 3.10.0 => 7
2014-05-25 20:20 doligez Target Version 4.01.1+dev => 4.02.0+dev
2014-07-24 22:53 doligez Target Version 4.02.0+dev => 4.02.1+dev
2014-09-04 00:25 doligez Target Version 4.02.1+dev => undecided
2014-09-24 00:02 doligez Target Version undecided => 4.03.0+dev
2015-07-21 12:36 dra Note Added: 0014226
2015-07-21 17:05 dbuenzli Note Added: 0014230
2015-07-21 17:05 dbuenzli Note Edited: 0014230 View Revisions
2015-07-21 17:27 dra Note Added: 0014231
2015-07-21 17:27 dra Note Edited: 0014231 View Revisions
2015-07-21 18:37 dbuenzli Note Added: 0014232
2015-07-21 18:50 dra Note Added: 0014233
2015-07-21 19:00 dbuenzli Note Added: 0014234
2015-07-21 19:00 dbuenzli Note Edited: 0014234 View Revisions
2015-07-21 19:04 dra Note Added: 0014235
2015-07-21 19:15 dbuenzli Note Added: 0014236
2015-07-21 19:25 dra Note Added: 0014237
2015-07-25 18:09 xleroy Note Added: 0014270
2015-07-25 18:23 dra Note Added: 0014272
2015-07-29 14:34 dra File Added: PR6120.patch
2015-07-29 14:35 dra Note Deleted: 0014272
2015-07-29 14:40 dra Note Added: 0014298
2015-07-30 00:59 dra Note Edited: 0014298 View Revisions
2015-07-30 16:43 dra Note Added: 0014305
2015-07-30 16:44 dra File Added: PR6120-st_ino.patch
2015-07-30 17:21 xleroy Note Added: 0014306
2015-11-16 15:34 gasche Status acknowledged => confirmed
2015-11-16 15:36 gasche Note Added: 0014697
2015-11-17 22:59 gasche Note Edited: 0014697 View Revisions
2015-11-18 17:08 gasche Note Added: 0014720
2015-11-19 10:54 frisch Note Added: 0014723

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker