Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows links #6120

Closed
vicuna opened this issue Aug 6, 2013 · 19 comments
Closed

Windows links #6120

vicuna opened this issue Aug 6, 2013 · 19 comments
Assignees
Milestone

Comments

@vicuna
Copy link

vicuna commented Aug 6, 2013

Original bug ID: 6120
Reporter: @Chris00
Assigned to: @gasche
Status: closed (set by @xavierleroy on 2017-09-24T15:32:21Z)
Resolution: fixed
Priority: normal
Severity: minor
Platform: x86_64
OS: Windows
OS Version: 7
Version: 4.00.1
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: otherlibs
Monitored by: @gasche @dbuenzli @alainfrisch

Bug description

The documentation of the Unix module http://caml.inria.fr/pub/docs/manual-ocaml/manual036.html says that "link", "symlink" and "readlink" are not implemented. This is no longer true for "link". Moreover, it seems symlinks exist on recent versions of Windows:
http://www.maximumpc.com/article/howtos/howto_master_your_file_system_mklink
http://www.windows7home.net/how-to-create-symbolic-link-in-windows-7/

File attachments

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dra27

Is this being actively worked on? I've just done some research/C-based PoC on supporting native Windows symlinks for OPAM's infrastructure and would be happy to work on the stubs needed for symlink and readlink.

As far as I can see, apart from lots of documentation explaining the subtleties and irritations of Windows symbolic links, what is needed is one additional function has_symlink: unit -> bool (returning HAS_SYMLINK for *nix, but using OpanProcessToken to test for the SeCreateSymbolicLinkPrivilege on Windows) and an optional parameter to the symlink function, because on Windows you have to specify if the symbolic link is to a directory or a regular file. I'd suggest that the type should become symlink: ?kind:file_kind -> string -> string -> unit where specifying ~kind on *nix causes Invalid_argument to be raised (similarly on Windows if the file_kind is not S_DIR or S_REG). The default behaviour for this parameter on Windows would be to determine the type of the symlink target (if it exists) and otherwise create a file symlink.

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dbuenzli

I'd suggest that the type should become symlink: ?kind:file_kind -> string -> string -> unit where specifying ~kind on *nix causes Invalid_argument to be raised (similarly on Windows if the file_kind is not S_DIR or S_REG).

If it's going to raise invalid_argument on Unix then there is no point of changing the interface of this function and rather an alternative function should be provided for windows (as you will have to conditionalize the code on Sys.win32 anyways).

But then, can't the windows stub simply determine if the symbolic link is to a directory or regular file and hide this peculiarity to the Unix module user ?

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dra27

Why does the code calling the proposed Unix.symlink have to be conditional for Sys.win32?

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dbuenzli

Well since the ?kind argument raises on Unix I cannot use Unix.symlink in a cross-platform fashion and need to conditionalize the code according to windows or unix. I would like to be able to use Unix.symlink without having to bother whether I'm on Unix or Windows (as the rest of the Unix module allows me).

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dra27

You can't use Unix.symlink in a cross-platform way without a vast amount of worrying if you're on Windows unless your definition of cross-platform excludes Windows (that's kind of already the case - and there's at least one other instance in Unix of differences in behaviour).

In a "cross-platform" usage you just wouldn't specify ?kind - and accept the default behaviour on Windows (which will be correct in all except the case of a symlink targeting a directory which hasn't been created).

If you need to create a symbolic link to a non-existent directory, you might write:

if Sys.win32 then
Unix.symlink ~kind:Unix.S_DIR "foo" "bar"
else
Unix.symlink "foo" "bar"

but therefore the unix version of Unix still needs either the optional argument (or a dummy version of Unix.windows_symlink if you want a separate function instead of an optional argument).

I suppose there's a case for having the unix version of Unix just silently ignore the optional argument?

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dbuenzli

My question is what prevents you in the stub of Unix.symlink on windows to actually perform what is needed to match the semantics of symlink(2) ?

Alternatively we could also simply return ENOENT for the case of the non-existent directory (and document the fact that this happens on windows).

I suppose there's a case for having the unix version of Unix just silently ignore the optional argument?

That would be at least a saner option but then I still don't exactly see why we can't simply avoid breaking compatibility here by handling this in the windows stub (namely doing the Unix.symlink ~kind:Unix.S_DIR in the stub if the target dir doesn't exist).

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dra27

There are two kinds of symlink on Windows - if the target doesn't exist you need to know whether it's a file or a directory (if you don't specify ~kind:S_DIR then it would assume S_REG).

The Unix module, like Cygwin, could try to wallpaper over this corner case - but like Cygwin's support, it would make it largely worthless on Windows, because you'd need to implement it properly in another library.

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dbuenzli

There are two kinds of symlink on Windows - if the target doesn't exist you need to know whether it's a file or a directory (if you don't specify ~kind:S_DIR then it would assume S_REG)

Ah ok this is what I didn't get. Then yes adding an argument to denote the intent seems to be the way to go as the program may actually know this. But then this should simply be ignored on Unix, so that we don't have to specialize code paths for no reasons. Also I would simply have this:

Unix.symlink : ?is_dir:bool -> string -> string

as it avoids the need for Invalid_arguments.

@vicuna
Copy link
Author

vicuna commented Jul 21, 2015

Comment author: @dra27

Agreed - I was too set on the "this argument shouldn't be specified for unix" model!

is_dir is a much better solution (I was only lazily using file_kind to avoid creating yet another variant type). There is a third kind of link (junctions - which could possibly have been represented with S_LNK) - but these really should be handled by a separate Windows-only function, as they're something conceptually very different, and handled by a different Windows API call.

@vicuna
Copy link
Author

vicuna commented Jul 25, 2015

Comment author: @xavierleroy

OK for an optional "is_dir" argument, if we absolutely must. But by all means let's ignore it under Unix!

As to whether "this is actively worked on", the answer is no to the best of my knowledge. A proof-of-concept patch for otherlibs/win32unix would be appreciated.

@vicuna
Copy link
Author

vicuna commented Jul 29, 2015

Comment author: @dra27

Patch attached - individual commits available from https://github.com/dra27/ocaml/tree/pr6120.

In related work, I discovered that the CRT's implementation of stat has had a potted history (see the comments added in stat.c). The latest implementation (see the Great C Runtime Refactoring at http://blogs.msdn.com/b/vcblog/archive/2014/06/10/the-great-crt-refactoring.aspx - the Microsoft CRT is now amusingly written in C++) includes a basically working implementation of stat (although I think they have broken st_dev). However, no implementation of lstat is provided, so I hope you won't baulk too much at the inclusion of a full implementation of stat because it's needed in order to provide lstat! Microsoft make the relevant code for the CRT available, which allows us to ensure that the behaviour of stat is equivalent. The only thing I "fixed" is that for some reason Microsoft hard-code st_nlink = 1, even though the actual value is available.

I've tested all 4 Windows native ports on 64-bit Windows 7 and also verified the build on Ubuntu 14.4. The only thing which made me slightly nervous is the necessity (for mingw) to switch from _fstati64 to _fstat64 but I'm reasonably certain looking at the history of win32unix/stat.c that the i64 variants would have been chosen to maximise compatibility with older C runtimes which are now irrelevant.

I have also verified that a 32-bit compiled executable still runs correctly on Windows XP, although personally I think that all software should be trying as noisily as possible to cease supporting Windows XP!

Incidentally, as I happened to have a Windows 2000 Virtual Box on this machine, OCaml no longer supports Windows 2000 (I don't know when that broke).

@vicuna
Copy link
Author

vicuna commented Jul 30, 2015

Comment author: @dra27

One further commit - included as a separate patch on top of the previous one (and pushed to github).

My motivation for fixing stat has been trying to get ocamlbuild to work with a source directory which includes symbolic links (which the dose library uses). I originally tracked down the inconsistency in the CRTs because ocamlbuild could build dose when compiled with mingw64 but not with msvc.

The fix for stat fixed the problem with ocamlbuild - but the inclusion of a working lstat promptly broke it again! I discovered that the reason is in ocamlbuild/slurp.ml which, quite reasonably, assumes that (st_dev, st_ino) (in this case My_unix.stat_key as defined in ocamlbuild/ocamlbuild_unix_plugin.ml) uniquely identifies a regular file or directory on the system, as per POSIX. Because the Microsoft implementations always give st_ino = 0, I had done this too.

The additional patch uses dwVolumeSerialNumber to populate st_dev (which is a potentially breaking change if any weird Windows application has relied on the value of st_dev being 0=A:, 1=B:, 2=C:, etc.) and uses nFileIndex to populate st_ino (see https://msdn.microsoft.com/en-us/library/windows/desktop/aa363788.aspx). However, for maximum compatibility, the patch silently drops any bits beyond max_int - however, as far as I can tell NTFS allocates index numbers sequentially so this is highly unlikely to be a problem in 64-bit OCaml and reasonably unlikely to be an issue in 32-bit OCaml. What is does mean is that for files and directories (st_dev, st_ino) should uniquely represent a file (I think it may even be hard-link compliant, though I haven't double-checked)

I haven't, however, applied this change to fstat and I'm wondering if for consistency it should be too - it's not a lot of extra code over what I've already done for stat, because various parts are shared. I notice that we already have the case that Unix.fstat returns st_kind = S_FIFO instead of S_SOCK for sockets.

Xavier - assuming you're content with my patch so far (I use the word "content", rather than "happy"!), would you be similarly content for it to be extended to fstat and, if so, would a separate PR be appropriate?

I have to say looking at the implementations, I'd take the view that the Microsoft CRT doesn't provide stat or fstat, in the same way as it doesn't provide symlink or readlink - in other words, this doesn't feel like round a bug!

@vicuna
Copy link
Author

vicuna commented Jul 30, 2015

Comment author: @xavierleroy

I'm not complaining about the reimplementation of stat() -- it is clear that the implementation from the CRT is not doing enough. fstat() is not terribly useful, but why not give it the same treatment, if you can share as much code as possible. I'll let the intended users (the OPAM people, the ocamlbuild people) say whether the new implementations fit their needs.

@vicuna
Copy link
Author

vicuna commented Nov 16, 2015

Comment author: @gasche

I am bumping the status of this PR a bit, based on the reports by David "dra" Allsopp that this is the main upstream change that would be necessary to have a good support for OPAM on native windows -- and cannot be worked around easily outside the distribution.

Edit: below is a comment/explanation from dra I received by email

Since Windows Vista there is support for Unix-style symbolic
links in Windows (since Windows 2000 there was support for
a limited kind, but it was for directories only and only
allowed absolute paths, amongst other limitations). There are
two downsides to the support: firstly, in default
configuration, only members of the Administrators
group (i.e. root) can create them and Administrators can only
do that when elevated (somewhat akin to having to do sudo ln
all the time, but it’s much much much less convenient than just
running a command like sudo!). That said, it is possible to
configure Windows so that ordinary users can create them with
no limitations, so they are useful, just awkward to
set-up. Secondly, Windows has two kinds of symbolic links –
directory symbolic links and file symbolic links. You have to
specify which kind you are creating – it sounds very weird from
the Unix-perspective, but it does make ‘sense’ when you see how
they’re implemented in Windows. In particular, if a directory
symbolic link actually dereferences to a file, then trying to
read the directory (or open the file) results in a file system
error.

My patch (https://github.com/dra27/ocaml/tree/pr6120)
implements the missing readlink and symlink for
win32unix. There is a small ABI-breaking but not code-breaking
change to UNIX’s Unix module in that symlink now has an
optional [?is_dir] parameter (to deal with the second
problem above). I also add the function Unix.has_symlink which
on Windows is a dynamic test to determine if the user is able
to create symbolic links (and on other OSes is simply the value
for HAS_SYMLINK found by configure)

The other thing added to the patch is that Microsoft’s CRT
implementation of stat is really broken (and they have never
added lstat) – I add a Windows-implementation of lstat to the
Unix module and also completely re-implemented stat in C. This
is important as the low-quality structure returned by
Microsoft’s stat caused ocamlbuild on Windows not to work if
native symbolic links were being used.

The only thing remaining, for completeness, is to re-implement
fstat so that stat, lstat and fstat return consistent
information.

@vicuna
Copy link
Author

vicuna commented Nov 18, 2015

Comment author: @gasche

So we discussed this (but not in depth) at the developer meeting, and Xavier is sympathetic to the need and would be ready to merge a patch if there is a reasonable final patch proposed. The feature freeze for 4.03 should be in around a month, so David (dra) and Xavier would need to converge rather quickly.

@vicuna
Copy link
Author

vicuna commented Nov 19, 2015

Comment author: @alainfrisch

So, by default, regular users are not able to create symlinks? This means that to make OPAM work, we'd necessarily need to change Windows's configuration?

@vicuna
Copy link
Author

vicuna commented Dec 2, 2015

Comment author: @alainfrisch

Ping?

@vicuna
Copy link
Author

vicuna commented Dec 10, 2015

Comment author: @dra27

Sorry for delay replying - Mantis ate my original reply, and I didn't have time to rewrite it at the time (is there a Mantis tracker for Mantis - the reason it ate my comment was a stupid piece of error handling for expired sessions...)

I am literally just about to do some work on this, which I'll push to https://github.com/dra27/ocaml/tree/pr6120 hopefully no later than Saturday. Given that OCaml is now being hosted on GitHub, is it now preferred to move to a GitHub pull request at that point?

Anyway, answering the question: indeed, by default regular users cannot create symlinks. OPAM will provide a command which will allow users who are able to obtain administrative access to their system to enable/disable them for ordinary users (see https://github.com/dra27/opam-experiments/blob/master/symlinks/symlink.c). Administrative users are forced to run elevated to use them (that's a highly tedious, and self-evidently stupid, Windows policy limitation).

There will of course be users who physically cannot change the configuration of the machine they're on - for this reason, users of OPAM will not be required to have symbolic links enabled, they'll just be faced with a warning (that can be disabled) and OPAM will attempt to workaround the lack of real symbolic links (see preliminary work in https://github.com/dra27/opam-experiments/blob/master/symlinks/convert-symlinks.sh)

However, by having this change in 4.03, we have the benefit that we can advertise an unpatched version of OCaml which supports them "properly".

@vicuna
Copy link
Author

vicuna commented Apr 17, 2016

Comment author: @gasche

This was resolved by dra's pull request #462

#462

commited both in trunk ( 76265a1) and 4.03 ( 76265a1 ).

Thanks David!

@vicuna vicuna closed this as completed Sep 24, 2017
@vicuna vicuna added this to the 4.03.0 milestone Mar 14, 2019
@vicuna vicuna added the bug label Mar 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants