Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filename.concat "C:" "a" returns "C:a" instead of "C:\\a" #6614

Closed
vicuna opened this issue Oct 15, 2014 · 2 comments
Closed

Filename.concat "C:" "a" returns "C:a" instead of "C:\\a" #6614

vicuna opened this issue Oct 15, 2014 · 2 comments

Comments

@vicuna
Copy link

vicuna commented Oct 15, 2014

Original bug ID: 6614
Reporter: Camarade_Tux
Status: closed (set by @xavierleroy on 2016-12-07T10:36:52Z)
Resolution: not a bug
Priority: normal
Severity: minor
Category: platform support (windows, cross-compilation, etc)

Bug description

Title says it all:

Filename.concat "C:" "a" returns "C:a" instead of "C:\a"

Looking at the code (that's from 4.01), the reason is quite obvious:

let is_dir_sep s i = let c = s.[i] in c = '/' || c = '\' || c = ':'

let concat dirname filename =
let l = String.length dirname in
if l = 0 || is_dir_sep dirname (l-1)
then dirname ^ filename
else dirname ^ dir_sep ^ filename

I'm not sure what the reason was. In any case, from http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx (some more windows madness):
If a file name begins with only a disk designator but not the backslash after the colon, it is interpreted as a relative path to the current directory on the drive with the specified letter.
[...]
"C:tmp.txt" refers to a file named "tmp.txt" in the current directory on drive C.

I don't think anyone expect and relies on that behaviour. I also don't think many people still use the Windows "feature" mentioned above.

I'd simply drop the " || c = ':'" from is_dir_sep's code but this is Windows and there could be weird consequences behind any change. Can anyone think of reason not to remove that clause?

I've traced back this bit of code to commit ... 637 from Xavier Leroy in February 1996; the commit message is "Portage Windows". :)

@vicuna
Copy link
Author

vicuna commented Oct 16, 2014

Comment author: @xavierleroy

You gave a good explanation of the current behavior:

  • "C:" designates the current directory of drive C
  • "C:a" is file "a" in this current directory
  • "C:\" designates the root directory of drive C
  • "C:\a" is file "a" at root of drive C.

So, the behavior of Filename.concat is consistent with Windows's conventions for file paths, as bizarre as these conventions may be.

@vicuna
Copy link
Author

vicuna commented Oct 26, 2014

Comment author: Camarade_Tux

I see and it makes perfect sense. Unfortunately it makes Filename quite difficult to use at times (and I did something wrong myself).

I believe it would be good to expand the documentation on the topic inside Filename but that would need lengthy and contrived explanations, or adding Filename.split (or "split_path").
Merlin has such a function; its implementation is simple:

let rec split_path path acc =
match Filename.dirname path, Filename.basename path with
| dir, _ when dir = path -> dir :: acc
| dir, base -> split_path dir (base :: acc)

I've been working on tests for it and it looks good. UNC paths ("\\server\foo") don't work but that's because dirname itself doesn't handle them.

The approach to tests is to start with a string like "C:\foo\bar", split it, compare the result to a reference, then 'List.fold_left Filename.concat "" components' and compare the result to a reference again.
I would have liked to cd into each intermediate directory and check the remainder of the path still points to the same file but that's impractical for absolute paths and makes testing for systems other than the current one impossible.
For the testsuite, I think "filename.ml" would be copid to "./myFilename.ml" which would then be 's/Sys.os_type/Sys.argv.(1)/' in order to be able to test other systems. This also give access to the internal Win32/Unix/Cygwin modules which might prove useful in the future.

So, would it be ok to add Filename.split along with tests and documentation on the expected behaviour?

(I'm proposing this because I strongly believe it is useful, simple, fills a hole in the stdlib (not being able to roundtrip) and will avoid many bugs; this won't immediately shrink my own codebase since I'm still depending on 3.12.1 and my next step will be 4.01 considering what distros are going to be shipping for the next few years)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant