Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006450OCamlOCaml standard librarypublic2014-06-03 17:502014-06-11 15:34
Reporterfrisch 
Assigned To 
PrioritynormalSeverityfeatureReproducibilityhave not tried
StatusacknowledgedResolutionopen 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0006450: Set.disjoint
DescriptionChecking disjointness of sets is a common operation and it can be implemented more efficiently from inside the Set module:

    let rec disjoint s1 s2 =
      match (s1, s2) with
        (Empty, _) | (_, Empty) -> true
      | (Node(l1, v1, r1, _), t2) ->
          if s1 == s2 then false
          else match split v1 t2 with
            (l2, false, r2) -> disjoint l1 l2 && disjoint r1 r2
          | (l2, true, r2) -> false


(it could even probably be more clever, using respective depth of the subtree.)

Any opposition to including this function?
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0011648)
shinwell (developer)
2014-06-04 09:54

If the answer to PR 6279 is that these functions should go in external libraries, I don't see how the inclusion of this (far less useful) function could be justified.
(0011651)
frisch (developer)
2014-06-04 14:07

0006279 is about a function which would not really benefit from being included in the functor. Well, actually, I just realized that the split function is exposed in Set, so the disjoint function could indeed be implemented outside the functor as well. But keeping it in Set would make it possible to apply further optimizations to it.
(0011665)
chambart (developer)
2014-06-04 23:36

Note that this one cannot be efficiently implemented outside. The split function is effectively exported, but there is no function to find the root of the tree (here v1 in the pattern matching). And we wouldn't want such a root function, since it wouldn't have a clear semantics.
(0011668)
frisch (developer)
2014-06-05 06:28

Ah yes, indeed.
(0011692)
shinwell (developer)
2014-06-06 10:51

I get the feeling that, increasingly, the criterion for inclusion in the standard library should be whether or not the feature is needed now in the compiler distribution itself. If the feature is not needed there, then it goes outside, as per Damien's comment.

If that is not the criterion, then the next most reasonable criterion would seem to be whether the feature assists in the provision of a comprehensive standard library. In that scenario, I cannot see how the inclusion of [Set.disjoint] but not [Set.map] could be justified.

I cannot see why a good criterion for inclusion in the standard library would be one based on efficiency of some core set of functions, leaving users to write the remaining functions on top. That would seem to lead to a fragmented interface, and bugs.
(0011699)
frisch (developer)
2014-06-06 20:23

There is no consensus in the community about a good general purpose alternative to the standard library, and I believe the "official" stdlib will remain widely used, especially for libraries. If an extension fits well in the stdlib, I don't see why we should prevent ourselves from including it. Of course, we need to apply some reason and not include gazillions of tiny one-liner helper functions which only complicate the interface. I've no strong opinion about Set.map. This does not seem to be a very common operation on sets and it does not benefit particularly from the choice of the underlying concrete data structure; and it's a one-liner. I would have a more positive opinion if the function did something non-trivial, such as guaranteeing a linear complexity for the case where the mapped function is monotonic on the elements in the set (this does not seem very difficult).

Checking disjointness of sets, or combining maps, on the other hand, seem more common and immediately benefit from being included in the stdlib.

> I cannot see why a good criterion for inclusion in the standard library would > be one based on efficiency of some core set of functions, leaving users to
> write the remaining functions on top. That would seem to lead to a fragmented
> interface, and bugs.

So what do you propose? Rejecting simple and useful addition, esp. when they cannot be implemented externally without switching to an entirely different library, means that we encourage users to stop using the official stdlib. This is not a good approach while there is no good alternative to it.
(0011735)
ybarnoy (reporter)
2014-06-11 15:34

There are advantages to having no dependencies. My old research project's approach was not to use any dependencies that were not in-tree. We therefore used the stdlib with our own enhancements. This makes the code very portable, especially for windows systems where opam doesn't work.

Therefore, I see nothing wrong with enhancing the stdlib with useful functionality, even if it's never going to be at the level of Batteries or Core. Heck, had stdlib had low precedence function application (@@) earlier, we wouldn't have littered our code with our own symbol. I'm still amazed that there's no standard function composition operator in stdlib.

Stdlib is the gold standard by which the other 2 libraries live, even if Core likes to flaunt its divergence from stdlib, and it's worth maintaining, even if it's not cutting edge IMO.

- Issue History
Date Modified Username Field Change
2014-06-03 17:50 frisch New Issue
2014-06-04 09:54 shinwell Note Added: 0011648
2014-06-04 09:54 shinwell Status new => acknowledged
2014-06-04 14:07 frisch Note Added: 0011651
2014-06-04 23:36 chambart Note Added: 0011665
2014-06-05 06:28 frisch Note Added: 0011668
2014-06-06 10:51 shinwell Note Added: 0011692
2014-06-06 20:23 frisch Note Added: 0011699
2014-06-11 15:34 ybarnoy Note Added: 0011735


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker