Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006111OCamlOCaml standard librarypublic2013-07-31 08:192013-08-28 15:55
Reporterberenger 
Assigned To 
PrioritynormalSeverityfeatureReproducibilityalways
StatusclosedResolutionno change required 
PlatformOSOS Version
Product Version4.00.1 
Target VersionFixed in Version 
Summary0006111: Set.Make.split not very useful
DescriptionHello,

Set.Make.split : elt -> t -> t * bool * t

As a scientific programmer, I would most times
want to split like this:
split_lt = (elements < x, elements >= x)
or
split_lte = (elements <= x, elements > x)

Would it be possible to introduce split_lt and split_lte?

Or maybe: split_lt_gte and split_lte_gt?

Thanks,
Francois.

PS: remember that x may be different from the
    elements that compare = to x but that is
    actually in the set being split
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0010048)
berenger (reporter)
2013-08-01 04:08

Here is my naive implementation in file set.ml:

    let rec split_lt_gte x = function
      | Empty -> (Empty, Empty)
      | Node(l, v, r, _) ->
        let c = Ord.compare x v in
        if c = 0 then (l, add v r)
        else if c < 0 then
          let (ll, rl) = split_lt_gte x l in
          (ll, join rl v r)
        else
          let (lr, rr) = split_lt_gte x r in
          (join l v lr, rr)

    let rec split_lte_gt x = function
      | Empty -> (Empty, Empty)
      | Node(l, v, r, _) ->
        let c = Ord.compare x v in
        if c = 0 then (add v l, r)
        else if c < 0 then
          let (ll, rl) = split_lte_gt x l in
          (ll, join rl v r)
        else
          let (lr, rr) = split_lte_gt x r in
          (join l v lr, rr)

I say naive because I see in set.ml functions such that add_min_element
and add_max_element that should maybe be used.
However, as I am not so acquainted with set.ml, I leave this
to the INRIA hackers.

Regards,
F.
(0010049)
berenger (reporter)
2013-08-01 05:46

Maybe the smaller change is to create:

Set.Make.split_opt : elt -> t -> t * elt option * t

Then users could easily create the other functions
on top of this.
(0010058)
xleroy (administrator)
2013-08-01 11:33

Hello Francois,

I can see where those new functions would be useful, having recently done something similar (but more general) in Coq, as part of CompCert.

I am not sure this is general enough, though. Consider sets of pairs, lexicographically ordered, and assume you want to extract (efficiently) the set of pairs whose first component is equal to "x".

If it's a set of int * int pairs, your API works fine:

let (lt_x, ge_x) = split_lt (x, min_int) s in
let (eq_x, gt_x) = split_le (x, max_int) ge_x in
eq_x

But if it's a set of string * string pairs, it doesn't work, as strings have no least element, no greater element, and no successor operation.

The approach I used in CompCert is described here:
http://compcert.inria.fr/doc/html/FSetAVLplus.html [^]

It selects set elements according to two boolean predicates that must be compatible with the set order, in a way made precise in the Coq development. But I don't think this can be exposed as is in OCaml's Set API, because it can break the BST invariant.
(0010076)
berenger (reporter)
2013-08-02 04:56

I don't know Coq unfortunately.
Your problem reminds me of interval trees from computational geometry,
but in a more general/abstract setting.

https://github.com/UnixJunkie/interval-tree/blob/master/interval_tree.ml [^]

Would there be any problem with the following function for set.ml:

    let rec split_opt x = function
      | Empty -> (Empty, None, Empty)
      | Node(l, v, r, _) ->
          let c = Ord.compare x v in
          if c = 0 then (l, Some v, r)
          else if c < 0 then
            let (ll, maybe, rl) = split_opt x l in
            (ll, maybe, join rl v r)
          else
            let (lr, maybe, rr) = split_opt x r in
            (join l v lr, maybe, rr)
(0010077)
berenger (reporter)
2013-08-02 05:11

If we have a split_opt, then we can have:

    let split_lt_ge x s =
      let l, maybe, r = split_opt x s in
      match maybe with
        | Some eq_x -> l, add eq_x r
        | None -> l, r

    let split_le_gt x s =
      let l, maybe, r = split_opt x s in
      match maybe with
        | Some eq_x -> add eq_x l, r
        | None -> l, r

I personally need only the splitting functions.
But I guess some users may want to directly use split_opt
in some cases.
(0010082)
xleroy (administrator)
2013-08-02 16:40

The forthcoming 4.01 release has a new function "find" in module Set, which can be used to implement the splitting functions you need:

let split_opt x s =
  let (l, maybe, r) = S.split x s in
  (l, if maybe then Some(S.find x s) else None, r)

let split_lt_ge x s =
  let (l, maybe, r) = S.split x s in
  (l, if maybe then S.add (S.find x) r else r)

The cost of calling S.find (log n) is negligible compared with that of S.split (n).

The even more general variants of "split" that I mentioned cannot be encoded efficiently this way, though.
(0010090)
berenger (reporter)
2013-08-04 15:13

The cost of S.split is really O(n)?
I thought only S.partition had that bad a cost.

http://caml.inria.fr/pub/docs/manual-ocaml/libref/Set.S.html [^]

Sorry if I am wrong,
F.
(0010238)
berenger (reporter)
2013-08-26 03:34

The functions split_opt, split_lt and split_le
now have corresponding pending pull request in batteries included.
(0010242)
berenger (reporter)
2013-08-27 03:07

I think this issue can be closed: the functions
are in batteries now.
(0010251)
xleroy (administrator)
2013-08-28 15:55

Closing this PR as suggested. And, yes, S.split is logarithmic time, not linear; my mistake.

- Issue History
Date Modified Username Field Change
2013-07-31 08:19 berenger New Issue
2013-08-01 04:08 berenger Note Added: 0010048
2013-08-01 05:46 berenger Note Added: 0010049
2013-08-01 11:33 xleroy Note Added: 0010058
2013-08-01 11:33 xleroy Status new => acknowledged
2013-08-02 04:56 berenger Note Added: 0010076
2013-08-02 05:11 berenger Note Added: 0010077
2013-08-02 16:40 xleroy Note Added: 0010082
2013-08-04 15:13 berenger Note Added: 0010090
2013-08-26 03:34 berenger Note Added: 0010238
2013-08-27 03:07 berenger Note Added: 0010242
2013-08-28 15:55 xleroy Note Added: 0010251
2013-08-28 15:55 xleroy Status acknowledged => closed
2013-08-28 15:55 xleroy Resolution open => no change required


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker