Mantis Bug Tracker

ID: 0004549
Project: OCaml
Category: OCaml general
View Status: public
Date Submitted: 2008-05-07 15:01
Last Update: 2013-08-31 12:44
Reporter: till
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
Platform:
OS:
OS Version:
Product Version: 3.10.1
Target Version:
Fixed in Version: 3.13.0+dev
Summary: 0004549: Filename.dirname is not handling multiple / on Unix
Description: Under Unix, / and // in a path are semantically the same. Paths containing a double / are rare, but they cause dirname to return erroneous results:

Filename.dirname (Filename.dirname "a/b//c") -> "a/b" instead of "a"
Tags: No tags attached.
Attached Files

- Relationships
related to 0005429 (acknowledged): Unix.stat behaves differently on win32 and linux
related to 0005395 (assigned, xclerc): OCamlbuild ignores relative-symlinked subdirectories or subdirectories with a trailing slash.

-  Notes
(0004570)
doligez (administrator)
2008-08-04 17:20

We currently treat an empty path component as designating the current directory ".". It's not clear to me which is the best way.
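
To make the "empty path component" reading concrete, here is a minimal sketch. The helper names `components` and `empty_as_dot` are illustrative only and are not part of the Filename module; the sketch only shows what splitting on '/' produces, not how the standard library is implemented.

```ocaml
(* Split a path on '/'; a doubled slash yields an empty component. *)
let components path = String.split_on_char '/' path

(* Hypothetical reading of the behavior described above: an empty
   component is treated like the current directory ".". *)
let empty_as_dot path =
  List.map (fun c -> if c = "" then "." else c) (components path)

let () =
  assert (components "a/b//c" = ["a"; "b"; ""; "c"]);
  assert (empty_as_dot "a/b//c" = ["a"; "b"; "."; "c"])
```

Under this reading, the parent of `c` is `a/b/`, and stripping the empty last component of that gives `a/b`, which is consistent with the result in the report.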
(0004847)
till (reporter)
2009-03-03 17:37
edited on: 2009-03-03 17:42

It seems that all the Linux C functions remove empty components from the path. The full Unix semantics of dirname can be found here:

http://www.opengroup.org/onlinepubs/9699919799/utilities/dirname.html

OCaml's behavior also differs for a trailing "/" (under POSIX, dirname "b/a/" is "b"). This difference is more likely to break code. I, for one, would like to see it be congruent with Unix.

(0004848)
till (reporter)
2009-03-03 18:31

Here is a full implementation which, I believe, respects the POSIX semantics (I chose to project // to /; this is left unspecified in the POSIX guideline). It should behave roughly the same as the version currently in the standard library, but some corner cases are going to be different. This code is up for grabs, and both my employer and I would be happy to give the copyright to INRIA if that can help.

let rec skip_end_slashes s n =
  if n <= 0 || s.[n-1] <> '/' then
    n
  else
    skip_end_slashes s (n-1)


(*
  http://www.opengroup.org/onlinepubs/9699919799/utilities/basename.html
*)
let basename s =
  if s = "" then
    "."
  else
    let end_pos = skip_end_slashes s (String.length s) in
    if end_pos = 0 then
      "/"
    else
      let start_pos =
        let rec loop n =
          if n <= 0 || s.[n-1] = '/' then
            n
          else
            loop (n-1)
        in
        loop end_pos
      in
      String.sub s start_pos (end_pos - start_pos)

(*
  http://www.opengroup.org/onlinepubs/9699919799/utilities/dirname.html
*)
let dirname s =
  if s = "" then
    "."
  else
    let end_pos = skip_end_slashes s (String.length s) in
    if end_pos = 0 then
      "/"
    else
      let rec loop n =
        if n = 0 then
          "."
        else if s.[n-1] = '/' then
          let end_pos = skip_end_slashes s (n-1) in
          if end_pos = 0 then
            "/"
          else
            String.sub s 0 end_pos
        else
          loop (n-1)
      in
      loop end_pos
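
One way to exercise a candidate dirname against the POSIX rules is a small table-driven check. This is only a sketch: `posix_dirname_cases` and `failures` are hypothetical names, the expected values follow my reading of the POSIX dirname steps, and the input "//" is left out on purpose because POSIX leaves it implementation-defined.

```ocaml
(* POSIX-style expectations for dirname, as (input, expected) pairs.
   The "a/b//c" and "b/a/" cases come from this report. *)
let posix_dirname_cases = [
  "", ".";
  "a", ".";
  "/", "/";
  "///", "/";
  "/a", "/";
  "a/b", "a";
  "b/a/", "b";
  "a/b//c", "a/b";
]

(* Return the (input, expected, actual) triples on which [f] disagrees. *)
let failures (f : string -> string) =
  List.filter_map
    (fun (input, expected) ->
       let actual = f input in
       if actual = expected then None else Some (input, expected, actual))
    posix_dirname_cases
```

Calling `failures dirname` with the dirname under test should return the empty list if it agrees with every case in the table.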
(0006245)
rixed (reporter)
2011-12-12 18:41

So, anything new wrt. this minor annoyance ?
If you come up with a patch I'd gladly test it (on Unix only, though).
(0006375)
jjb (reporter)
2011-12-18 13:57

I guess that it would be more portable to use Filename.dir_sep instead of '/'. Is this guaranteed to be a string of length 1, or would using something like Filename.check_suffix and Filename.chop_suffix instead of skip_end_slashes be needed?

Also note that the filename_concat function in ocamlbuild/my_std.ml involves some (non-portable) special case code for a related issue with Filename.concat. My limited understanding is that analogously changing Filename.concat would allow removing ocamlbuild's filename_concat.
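
The length question can at least be checked at run time, since Filename.dir_sep is a plain string. The following is a sketch only: `dir_sep_is_single_char`, `sep_char` and `skip_end_seps` are illustrative names, and the fallback to '/' when dir_sep is longer than one character is an assumption of this example, not documented behavior.

```ocaml
(* Filename.dir_sep is a string, so its length can be checked directly. *)
let dir_sep_is_single_char = String.length Filename.dir_sep = 1

(* Hypothetical variant of skip_end_slashes that compares against
   Filename.dir_sep instead of the literal '/'. It assumes dir_sep has
   length 1 and falls back to '/' otherwise. *)
let sep_char =
  if dir_sep_is_single_char then Filename.dir_sep.[0] else '/'

let rec skip_end_seps s n =
  if n <= 0 || s.[n-1] <> sep_char then n
  else skip_end_seps s (n-1)
```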
(0006380)
till (reporter)
2011-12-19 06:12

Unix (i.e. POSIX) has very precisely defined semantics (e.g. two dir_sep in a row in a path are equivalent to just one unless they are at the beginning of the path); I don't know Windows well enough to be sure that changing the dir_sep is enough.
I'd be happy to review any Unix specific code; I do not feel competent for the Windows part.
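
The rule about repeated separators can be sketched as a small normalizer. `normalize_slashes` is a hypothetical helper, not a Filename function. It assumes the POSIX pathname rule as I understand it: exactly two leading slashes are implementation-defined (kept as-is here), while three or more leading slashes are treated as one.

```ocaml
(* Collapse runs of '/' into one. A path starting with exactly two
   slashes keeps them, since POSIX leaves that case to the implementation;
   three or more leading slashes collapse to one. *)
let normalize_slashes path =
  let n = String.length path in
  let buf = Buffer.create n in
  let keep_double =
    n >= 2 && path.[0] = '/' && path.[1] = '/' && (n = 2 || path.[2] <> '/')
  in
  (* Emit one extra '/' up front so the leading pair survives. *)
  if keep_double then Buffer.add_char buf '/';
  String.iteri
    (fun i c ->
       if c <> '/' || i = 0 || path.[i-1] <> '/' then Buffer.add_char buf c)
    path;
  Buffer.contents buf

let () =
  assert (normalize_slashes "a/b//c" = "a/b/c");
  assert (normalize_slashes "//a//b" = "//a/b");
  assert (normalize_slashes "///a" = "/a")
```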
(0006383)
jjb (reporter)
2011-12-19 11:56

The Windows documentation unfortunately does not have the same level of detail:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247.aspx

Appealing to Cygwin, cygpath strips repeated slashes when consuming Windows paths, except at the beginning of the path where the meaning is significant.

Also, the Windows system calls have always interpreted both '\' and '/' as a path separator. In the command shell '/' is used for command line options and therefore not reliable as a path separator, but if paths are quoted, then '/' can be used as a separator. Quoting is admittedly a nightmare. I am not familiar enough with the ocaml code base to know if using pathnames only as arguments to system calls, and removing the special casing for dir_sep, is feasible.
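
If both '\' and '/' count as separators on Windows, the slash-skipping logic could take the separator test as a parameter. This is a rough sketch with hypothetical names (`is_unix_sep`, `is_windows_sep`, `strip_trailing_seps`); it ignores drive letters and leading "\\" prefixes, which a real Windows implementation would have to handle.

```ocaml
(* Hypothetical separator predicates; the names are illustrative. *)
let is_unix_sep c = c = '/'
let is_windows_sep c = c = '/' || c = '\\'

(* Strip trailing separators according to [is_sep], keeping at least
   one character so that a root such as "/" survives. *)
let strip_trailing_seps is_sep s =
  let rec go n =
    if n > 1 && is_sep s.[n-1] then go (n-1) else n
  in
  String.sub s 0 (go (String.length s))

let () =
  assert (strip_trailing_seps is_unix_sep "a/b//" = "a/b");
  assert (strip_trailing_seps is_windows_sep "a\\b\\/" = "a\\b")
```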
(0006518)
doligez (administrator)
2011-12-23 15:06

dir_sep is used by concat, and concat is used by users of Filename to build paths that can be passed to shell commands, so I don't think you can get rid of dir_sep.
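
As an illustration of that use case, a path built with Filename.concat is typically quoted before it goes into a command line. The function `list_command` and the `ls -l` command are made up for this example; Filename.concat and Filename.quote are the standard library functions.

```ocaml
(* Build a path with Filename.concat, which inserts Filename.dir_sep,
   then quote it before embedding it in a shell command line. *)
let list_command dir file =
  let path = Filename.concat dir file in
  "ls -l " ^ Filename.quote path

let () =
  (* concat joins the two parts with the platform's dir_sep. *)
  assert (Filename.concat "a" "b" = "a" ^ Filename.dir_sep ^ "b")
```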
(0006609)
doligez (administrator)
2012-01-06 15:27

I have committed a POSIX-compliant version of basename and dirname in trunk (rev 11999).
Still needs to be tested on Windows.
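
A quick way to confirm the fix on a given installation is to assert the cases discussed in this report. This sketch assumes a Unix system and an OCaml release that includes the change; the checks are skipped on other platforms because the Windows behavior was not verified here.

```ocaml
let () =
  if Sys.unix then begin
    (* The case from the report: two successive dirname calls. *)
    assert (Filename.dirname (Filename.dirname "a/b//c") = "a");
    (* The trailing-slash case raised in the notes. *)
    assert (Filename.dirname "b/a/" = "b");
    (* basename ignores trailing slashes as well. *)
    assert (Filename.basename "b/a/" = "a")
  end
```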

- Issue History
Date Modified Username Field Change
2008-05-07 15:01 till New Issue
2008-08-04 17:20 doligez Note Added: 0004570
2008-08-04 17:20 doligez Status new => acknowledged
2009-03-03 17:37 till Note Added: 0004847
2009-03-03 17:42 till Note Edited: 0004847
2009-03-03 18:31 till Note Added: 0004848
2011-12-12 18:41 rixed Note Added: 0006245
2011-12-15 11:55 gasche Relationship added related to 0005429
2011-12-17 20:13 meyer Assigned To => meyer
2011-12-17 20:13 meyer Status acknowledged => assigned
2011-12-18 13:57 jjb Note Added: 0006375
2011-12-19 06:12 till Note Added: 0006380
2011-12-19 11:56 jjb Note Added: 0006383
2011-12-23 15:06 doligez Note Added: 0006518
2011-12-28 22:03 meyer Assigned To meyer =>
2012-01-03 18:34 doligez Relationship added related to 0005395
2012-01-06 15:27 doligez Note Added: 0006609
2012-01-06 15:27 doligez Status assigned => resolved
2012-01-06 15:27 doligez Resolution open => fixed
2012-01-06 15:27 doligez Fixed in Version => 3.13.0+dev
2013-08-31 12:44 xleroy Status resolved => closed

