Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variants of Map.find #7277

Closed
vicuna opened this issue Jun 23, 2016 · 5 comments
Closed

Variants of Map.find #7277

vicuna opened this issue Jun 23, 2016 · 5 comments
Assignees
Milestone

Comments

@vicuna
Copy link

vicuna commented Jun 23, 2016

Original bug ID: 7277
Reporter: gerd
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2016-12-06T15:31:56Z)
Resolution: fixed
Priority: normal
Severity: feature
Version: 4.03.0
Target version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Fixed in version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Category: standard library

Bug description

I'm suggesting to add the following functions (impl is trivial):

val find_le : key -> 'a t -> (key * 'a) option * 'a option
val find_ge : key -> 'a t -> 'a option * (key * 'a) option

find_le k m returns a pair where the left component is the binding with the biggest key smaller than k (if existing) and the right component is the value for k (if existing). find_ge symmetrically returns the value for k and the binding with the smallest key bigger than k.

Additional information

We can of course already do that with Map.split, but in this case the submaps are constructed which may be expensive. So the motivation is performance.

Use case: Imagine you need a mapping where the keys are disjoint ranges (left,right), and you want to implement a function that returns the mapped value for any number in that range. With find_le you can do that by only putting the left points of the ranges into the map (and an additional member check).

Alternatives: The API could be a little bit "smaller" (not returning the exact match):

val find_lt : key -> 'a t -> (key * 'a) option
val find_gt : key -> 'a t -> (key * 'a) option

Or it could be "bigger" (always returning the previous and the next elements):

val find_around : key -> 'a t -> (key * 'a) option * 'a option * (key * 'a) option

IMHO, this is matter of taste. The use case I know wouldn't profit from find_around.

I'll happily submit a PR when the addition is accepted (and the style has been agreed upon).

File attachments

@vicuna
Copy link
Author

vicuna commented Jun 27, 2016

Comment author: @alainfrisch

I'm concerned by the number of small variants around the proposal that would all look as useful (e.g. find_lt, find_lte, ...). Since performance is the key reason here, the most general form (find_around) might not be enough (but still sometimes useful, since it would be more efficient than multiple calls to simple functions).

Gerd: in your case, would find_lt/find_gt be enough?

A slightly more general interface would be:

val find_first: (key -> bool) -> 'a t -> (key * 'a) option
val find_last: (key -> bool) -> 'a t -> (key * 'a) option

where the predicate is assumed to be monotonic. From these function, one can create find_lt/find_gt, but also variants with non-strict comparison (find_lte/find_gte), and the overhead could be acceptable.

@vicuna
Copy link
Author

vicuna commented Jun 27, 2016

Comment author: gerd

Well, the interface design would be a little bit simpler if flambda could optimize the unused parts of find_around away. However, to my understanding flambda cannot do this (you need to optimize the whole recursion, and track unused values through it - can somebody verify? - I've attached a sample impl of find_around).

Sure, find_lt/find_gt would be good enough for my application, i.e. a one-sided function. There is still the question where to return an option or whether to raise Not_found. The latter would be more consistent with the existing interface.

find_first/last is also a nice idea (and would solve my problem).

@vicuna
Copy link
Author

vicuna commented Jul 5, 2016

Comment author: @paurkedal

I like Alain's proposal, as it reminds my of Dedekind cuts, and after a bit reflection I realised that it extends the interface beyond the four possible inequalities:

To use find_lt and find_gt one needs to make up a suitable key. This key is not necessarily easy to construct for complex types. Moreover, the key does not necessarily exist. For instance, if the keys of the map are multi-precision fractions, then field_first would allow us to find the first binding after a transcendental number, while there is no fraction we could pass to find_gt which would guarantee the right result.

@vicuna
Copy link
Author

vicuna commented Jul 6, 2016

Comment author: gerd

Submitted a PR: #665

@vicuna
Copy link
Author

vicuna commented Dec 6, 2016

Comment author: @alainfrisch

#869

@vicuna vicuna closed this as completed Dec 6, 2016
@vicuna vicuna added the stdlib label Mar 14, 2019
@vicuna vicuna added this to the 4.05.0 milestone Mar 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants