New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set.disjoint #6450
Comments
Comment author: @mshinwell If the answer to PR 6279 is that these functions should go in external libraries, I don't see how the inclusion of this (far less useful) function could be justified. |
Comment author: @alainfrisch #6279 is about a function which would not really benefit from being included in the functor. Well, actually, I just realized that the split function is exposed in Set, so the disjoint function could indeed be implemented outside the functor as well. But keeping it in Set would make it possible to apply further optimizations to it. |
Comment author: @chambart Note that this one cannot be efficiently implemented outside. The split function is effectively exported, but there is no function to find the root of the tree (here v1 in the pattern matching). And we wouldn't want such a root function, since it wouldn't have a clear semantics. |
Comment author: @alainfrisch Ah yes, indeed. |
Comment author: @mshinwell I get the feeling that, increasingly, the criterion for inclusion in the standard library should be whether or not the feature is needed now in the compiler distribution itself. If the feature is not needed there, then it goes outside, as per Damien's comment. If that is not the criterion, then the next most reasonable criterion would seem to be whether the feature assists in the provision of a comprehensive standard library. In that scenario, I cannot see how the inclusion of [Set.disjoint] but not [Set.map] could be justified. I cannot see why a good criterion for inclusion in the standard library would be one based on efficiency of some core set of functions, leaving users to write the remaining functions on top. That would seem to lead to a fragmented interface, and bugs. |
Comment author: @alainfrisch There is no consensus in the community about a good general purpose alternative to the standard library, and I believe the "official" stdlib will remain widely used, especially for libraries. If an extension fits well in the stdlib, I don't see why we should prevent ourselves from including it. Of course, we need to apply some reason and not include gazillions of tiny one-liner helper functions which only complicate the interface. I've no strong opinion about Set.map. This does not seem to be a very common operation on sets and it does not benefit particularly from the choice of the underlying concrete data structure; and it's a one-liner. I would have a more positive opinion if the function did something non-trivial, such as guaranteeing a linear complexity for the case where the mapped function is monotonic on the elements in the set (this does not seem very difficult). Checking disjointness of sets, or combining maps, on the other hand, seem more common and immediately benefit from being included in the stdlib.
So what do you propose? Rejecting simple and useful addition, esp. when they cannot be implemented externally without switching to an entirely different library, means that we encourage users to stop using the official stdlib. This is not a good approach while there is no good alternative to it. |
Comment author: ybarnoy There are advantages to having no dependencies. My old research project's approach was not to use any dependencies that were not in-tree. We therefore used the stdlib with our own enhancements. This makes the code very portable, especially for windows systems where opam doesn't work. Therefore, I see nothing wrong with enhancing the stdlib with useful functionality, even if it's never going to be at the level of Batteries or Core. Heck, had stdlib had low precedence function application (@@) earlier, we wouldn't have littered our code with our own symbol. I'm still amazed that there's no standard function composition operator in stdlib. Stdlib is the gold standard by which the other 2 libraries live, even if Core likes to flaunt its divergence from stdlib, and it's worth maintaining, even if it's not cutting edge IMO. |
Comment author: @alainfrisch For the records, we have been using the following version for some time: (* Same as split, but compute the left and right subtrees only if the pivot element is not in the set. The right subtree is computed on demand. *) type split_bis = | Found | NotFound of t * (unit -> t) let rec split_bis x = function Empty -> NotFound (Empty, (fun () -> Empty)) | Node(l, v, r, _) -> let c = Ord.compare x v in if c = 0 then Found else if c < 0 then match split_bis x l with | Found -> Found | NotFound (ll, rl) -> NotFound (ll, (fun () -> join (rl ()) v r)) else match split_bis x r with | Found -> Found | NotFound (lr, rr) -> NotFound (join l v lr, rr) let rec disjoint s1 s2 = match (s1, s2) with (Empty, _) | (_, Empty) -> true | (Node(l1, v1, r1, _), t2) -> if s1 == s2 then false else match split_bis v1 t2 with NotFound(l2, r2) -> disjoint l1 l2 && disjoint r1 (r2 ()) | Found -> false |
Comment author: @nojb PR merged. |
Original bug ID: 6450
Reporter: @alainfrisch
Assigned to: @nojb
Status: resolved (set by @nojb on 2018-08-21T17:01:03Z)
Resolution: fixed
Priority: normal
Severity: feature
Target version: 4.08.0+dev/beta1/beta2
Category: standard library
Bug description
Checking disjointness of sets is a common operation and it can be implemented more efficiently from inside the Set module:
(it could even probably be more clever, using respective depth of the subtree.)
Any opposition to including this function?
The text was updated successfully, but these errors were encountered: