Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0004444OCamlOCaml generalpublic2007-11-11 15:482012-09-25 20:07
ReporterSnark 
Assigned Toxclerc 
PrioritynormalSeverityfeatureReproducibilityalways
StatusclosedResolutionfixed 
PlatformOSOS Version
Product Version3.10.0 
Target VersionFixed in Version3.12.1+dev 
Summary0004444: Lack of String.strip
DescriptionHere is an implementation, hopefully not too buggy :
let strip str =
    let len = String.length str
    and is_space char = (char = ' ' || char = '\t') in
    let start = ref 0
    and stop = ref (len - 1)
    and seen_non_space = ref false in
        for ii = 0 to len - 1 do
            if is_space str.[ii] then
                (if not !seen_non_space then
                    start := ii + 1)
            else
                (seen_non_space := true;
                stop := ii)
        done;
        String.sub str !start (!stop - !start +1)
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0004281)
Snark (reporter)
2007-11-11 22:13

The function is_space_char should probably have a || char = '\n' so it's possible to strip a full line.
(0004302)
thelema (reporter)
2007-11-13 19:40

Is the intent to trim whitespace off the ends of a string? i.e. " foo " -> "foo"? If so, I might write it like this (not tested):

let trim s =
  let rec skip_space dir i =
    if s.[i] = ' ' || s.[i] = '\t' (* throws Invalid_argument *)
    then skip_space dir (i+dir)
    else i in
  try
    let start = skip_space 1 0
    and end = skip_space -1 (String.length s - 1) in
    String.sub str start (end-start+1)
  with Invalid_argument _ -> "" (* out of bounds searching for non-space *)

I admit to cheating a bit on the bounds handling, but this seems a more caml-esque implementation. As to the question of whether this should be in the standard library, I'd vote yes, but I'm for a heavier stdlib than OCaml has even now.
(0004304)
Snark (reporter)
2007-11-13 21:00

I propose pushing the test on what is a whitespace in a function like I did -- and let's not forget to cut '\n' too.

Your solution looks better : you only go on the whitespaces while mine goes on the whole string. Notice though that it doesn't work because you use the 'end' keyword as a variable and miss some ( ) here and there.

Amended solution :

let trim s =
    let is_space c = (c = ' ' || c = '\t' || c = '\n') in
    let rec skip_space dir i =
      if is_space s.[i] (* throws Invalid_argument *)
      then skip_space dir (i+dir)
      else i
    in
      try (
        let start = skip_space 1 0
        and stop = skip_space (-1) ((String.length s) - 1) in
          String.sub s start (stop -start+1)
      ) with Invalid_argument _ -> "";;
(0004305)
thelema (reporter)
2007-11-13 21:29

Even with a perfect implementation, the question quickly comes "does this function belong in the stdlib?"

The two best arguments against inclusion are: 1) having to maintain more code as part of the core OCaml distribution, and 2) Those who need this function are probably doing quick & dirty parsing and using this function to clean up improperly cut strings.

Considering the first point, imagine a more shared ownership of maintenance responsibility. The OCaml community has resources (time, programming energy) to contribute to finding and fixing bugs. At the moment, those at INRIA seem to allow us outsiders to only find & report bugs, but want to control strongly what makes it into the ocaml distribution. This makes some sense for the core compiler, as its architecture must remain solid in order for them to use it as a base for their research. And users of OCaml come and go, while those at INRIA will continue to have a hand in code they contribute for a long time. OTOH, the stdlib seems like a fine place for the community to take part in making OCaml more friendly, as the barrier of entry is *much* lower.

As to the second point, I find only one fault with it: users still want this functionality, even if there exists a better way.

- Issue History
Date Modified Username Field Change
2007-11-11 15:48 Snark New Issue
2007-11-11 22:13 Snark Note Added: 0004281
2007-11-12 14:08 xleroy Status new => acknowledged
2007-11-13 19:40 thelema Note Added: 0004302
2007-11-13 21:00 Snark Note Added: 0004304
2007-11-13 21:29 thelema Note Added: 0004305
2011-09-15 15:19 xclerc Status acknowledged => resolved
2011-09-15 15:19 xclerc Fixed in Version => 3.12.1+dev
2011-09-15 15:19 xclerc Resolution open => fixed
2011-09-15 15:19 xclerc Assigned To => xclerc
2012-09-25 20:07 xleroy Status resolved => closed


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker