Anonymous | Login | Signup for a new account | 2014-10-02 12:35 CEST |

Main | My View | View Issues | Change Log | Roadmap |

View Issue Details [ Jump to Notes ] | [ Issue History ] [ Print ] | |||||||

ID | Project | Category | View Status | Date Submitted | Last Update | |||

0006111 | OCaml | OCaml standard library | public | 2013-07-31 08:19 | 2013-08-28 15:55 | |||

Reporter | berenger | |||||||

Assigned To | ||||||||

Priority | normal | Severity | feature | Reproducibility | always | |||

Status | closed | Resolution | no change required | |||||

Platform | OS | OS Version | ||||||

Product Version | 4.00.1 | |||||||

Target Version | Fixed in Version | |||||||

Summary | 0006111: Set.Make.split not very useful | |||||||

Description | Hello, Set.Make.split : elt -> t -> t * bool * t As a scientific programmer, I would most times want to split like this: split_lt = (elements < x, elements >= x) or split_lte = (elements <= x, elements > x) Would it be possible to introduce split_lt and split_lte? Or maybe: split_lt_gte and split_lte_gt? Thanks, Francois. PS: remember that x may be different from the elements that compare = to x but that is actually in the set being split | |||||||

Tags | No tags attached. | |||||||

Attached Files | ||||||||

Notes | |

(0010048) berenger (reporter) 2013-08-01 04:08 |
Here is my naive implementation in file set.ml: let rec split_lt_gte x = function | Empty -> (Empty, Empty) | Node(l, v, r, _) -> let c = Ord.compare x v in if c = 0 then (l, add v r) else if c < 0 then let (ll, rl) = split_lt_gte x l in (ll, join rl v r) else let (lr, rr) = split_lt_gte x r in (join l v lr, rr) let rec split_lte_gt x = function | Empty -> (Empty, Empty) | Node(l, v, r, _) -> let c = Ord.compare x v in if c = 0 then (add v l, r) else if c < 0 then let (ll, rl) = split_lte_gt x l in (ll, join rl v r) else let (lr, rr) = split_lte_gt x r in (join l v lr, rr) I say naive because I see in set.ml functions such that add_min_element and add_max_element that should maybe be used. However, as I am not so acquainted with set.ml, I leave this to the INRIA hackers. Regards, F. |

(0010049) berenger (reporter) 2013-08-01 05:46 |
Maybe the smaller change is to create: Set.Make.split_opt : elt -> t -> t * elt option * t Then users could easily create the other functions on top of this. |

(0010058) xleroy (administrator) 2013-08-01 11:33 |
Hello Francois, I can see where those new functions would be useful, having recently done something similar (but more general) in Coq, as part of CompCert. I am not sure this is general enough, though. Consider sets of pairs, lexicographically ordered, and assume you want to extract (efficiently) the set of pairs whose first component is equal to "x". If it's a set of int * int pairs, your API works fine: let (lt_x, ge_x) = split_lt (x, min_int) s in let (eq_x, gt_x) = split_le (x, max_int) ge_x in eq_x But if it's a set of string * string pairs, it doesn't work, as strings have no least element, no greater element, and no successor operation. The approach I used in CompCert is described here: http://compcert.inria.fr/doc/html/FSetAVLplus.html [^] It selects set elements according to two boolean predicates that must be compatible with the set order, in a way made precise in the Coq development. But I don't think this can be exposed as is in OCaml's Set API, because it can break the BST invariant. |

(0010076) berenger (reporter) 2013-08-02 04:56 |
I don't know Coq unfortunately. Your problem reminds me of interval trees from computational geometry, but in a more general/abstract setting. https://github.com/UnixJunkie/interval-tree/blob/master/interval_tree.ml [^] Would there be any problem with the following function for set.ml: let rec split_opt x = function | Empty -> (Empty, None, Empty) | Node(l, v, r, _) -> let c = Ord.compare x v in if c = 0 then (l, Some v, r) else if c < 0 then let (ll, maybe, rl) = split_opt x l in (ll, maybe, join rl v r) else let (lr, maybe, rr) = split_opt x r in (join l v lr, maybe, rr) |

(0010077) berenger (reporter) 2013-08-02 05:11 |
If we have a split_opt, then we can have: let split_lt_ge x s = let l, maybe, r = split_opt x s in match maybe with | Some eq_x -> l, add eq_x r | None -> l, r let split_le_gt x s = let l, maybe, r = split_opt x s in match maybe with | Some eq_x -> add eq_x l, r | None -> l, r I personally need only the splitting functions. But I guess some users may want to directly use split_opt in some cases. |

(0010082) xleroy (administrator) 2013-08-02 16:40 |
The forthcoming 4.01 release has a new function "find" in module Set, which can be used to implement the splitting functions you need: let split_opt x s = let (l, maybe, r) = S.split x s in (l, if maybe then Some(S.find x s) else None, r) let split_lt_ge x s = let (l, maybe, r) = S.split x s in (l, if maybe then S.add (S.find x) r else r) The cost of calling S.find (log n) is negligible compared with that of S.split (n). The even more general variants of "split" that I mentioned cannot be encoded efficiently this way, though. |

(0010090) berenger (reporter) 2013-08-04 15:13 |
The cost of S.split is really O(n)? I thought only S.partition had that bad a cost. http://caml.inria.fr/pub/docs/manual-ocaml/libref/Set.S.html [^] Sorry if I am wrong, F. |

(0010238) berenger (reporter) 2013-08-26 03:34 |
The functions split_opt, split_lt and split_le now have corresponding pending pull request in batteries included. |

(0010242) berenger (reporter) 2013-08-27 03:07 |
I think this issue can be closed: the functions are in batteries now. |

(0010251) xleroy (administrator) 2013-08-28 15:55 |
Closing this PR as suggested. And, yes, S.split is logarithmic time, not linear; my mistake. |

Issue History | |||

Date Modified | Username | Field | Change |

2013-07-31 08:19 | berenger | New Issue | |

2013-08-01 04:08 | berenger | Note Added: 0010048 | |

2013-08-01 05:46 | berenger | Note Added: 0010049 | |

2013-08-01 11:33 | xleroy | Note Added: 0010058 | |

2013-08-01 11:33 | xleroy | Status | new => acknowledged |

2013-08-02 04:56 | berenger | Note Added: 0010076 | |

2013-08-02 05:11 | berenger | Note Added: 0010077 | |

2013-08-02 16:40 | xleroy | Note Added: 0010082 | |

2013-08-04 15:13 | berenger | Note Added: 0010090 | |

2013-08-26 03:34 | berenger | Note Added: 0010238 | |

2013-08-27 03:07 | berenger | Note Added: 0010242 | |

2013-08-28 15:55 | xleroy | Note Added: 0010251 | |

2013-08-28 15:55 | xleroy | Status | acknowledged => closed |

2013-08-28 15:55 | xleroy | Resolution | open => no change required |

Copyright © 2000 - 2011 MantisBT Group |