| Anonymous | Login | Signup for a new account | 2013-05-25 19:04 CEST | ![]() |
| Main | My View | View Issues | Change Log | Roadmap |
| View Issue Details [ Jump to Notes ] | [ Issue History ] [ Print ] | ||||||||||
| ID | Project | Category | View Status | Date Submitted | Last Update | ||||||
| 0004986 | OCaml | OCaml general | public | 2010-02-26 16:13 | 2012-12-04 15:23 | ||||||
| Reporter | frisch | ||||||||||
| Assigned To | |||||||||||
| Priority | normal | Severity | feature | Reproducibility | have not tried | ||||||
| Status | acknowledged | Resolution | open | ||||||||
| Platform | OS | OS Version | |||||||||
| Product Version | |||||||||||
| Target Version | 4.01.0+dev | Fixed in Version | |||||||||
| Summary | 0004986: Add efficient functions to create sets from lists or arrays | ||||||||||
| Description | Creating a set from a list or an array is quite common. It could be useful to add such functions in the Set module. A trivial implementation is: let of_array a = Array.fold_right add a empty A more efficient implementation is to sort the array first, and then build the balanced tree directly: let of_sorted_array a = let rec sub i0 = function | 0 -> Empty | 1 -> Node (Empty, a.(i0), Empty, 1) | 2 -> Node (Node(Empty, a.(i0), Empty, 1), a.(i0 + 1), Empty, 2) | 3 -> Node (Node(Empty, a.(i0), Empty, 1), a.(i0 + 1), Node(Empty, a.(i0 + 2), Empty, 1), 2) | n -> let nl = n / 2 in let l = sub i0 nl in let r = sub (i0 + nl + 1) (n - nl - 1) in create l a.(i0 + nl) r in sub 0 (Array.length a) (The cases for n=1,2,3 are just extra optimizations.) When creating sets from arrays of random integers of size 100000, the optimized version is almost twice as fast as the naive one (ocamlopt, x86). | ||||||||||
| Tags | No tags attached. | ||||||||||
| Attached Files | |||||||||||
Notes |
|
|
(0007705) doligez (manager) 2012-07-11 11:58 |
Note: the code looks wrong, nl should probably be something like (i0 + n) / 2. |
|
(0007706) frisch (developer) 2012-07-11 12:01 |
Why is it wrong? nl represents the length of the subarray, i0 the index of its first element. |
|
(0008099) doligez (manager) 2012-09-17 17:48 |
I don't know what I was thinking when I wrote that. The code looks fine. |
|
(0008559) frisch (developer) 2012-12-04 15:23 |
A patch is attached, with also a new List.sort_dedup functions (like List.sort, but removes duplicates). Performance compared to an implementation based on folding Set.add is usually improved in a significant way, especially for long lists. The only case of degraded performances is for long lists with a lot of duplicates. Examples: - list of length 10000 made of random strings of length 5 (x1000): 6.61s -> 4.62s - list of length 10 made of random strings of length 5 (x1000000): 0.96s -> 0.76s - list of length 1000 made of random integers between 0 and 9 (x10000): 0.45s -> 0.49s - list of length 1000 made of random integers between 0 and 9999 (x10000): 1.89s -> 1.28s |
Issue History |
|||
| Date Modified | Username | Field | Change |
| 2010-02-26 16:13 | frisch | New Issue | |
| 2011-06-01 14:31 | doligez | Status | new => acknowledged |
| 2012-07-11 11:58 | doligez | Note Added: 0007705 | |
| 2012-07-11 11:58 | doligez | Target Version | => 4.01.0+dev |
| 2012-07-11 12:01 | frisch | Note Added: 0007706 | |
| 2012-07-31 13:36 | doligez | Target Version | 4.01.0+dev => 4.00.1+dev |
| 2012-09-17 17:48 | doligez | Note Added: 0008099 | |
| 2012-09-17 17:48 | doligez | Severity | minor => feature |
| 2012-09-17 17:48 | doligez | Target Version | 4.00.1+dev => 4.01.0+dev |
| 2012-12-04 15:15 | frisch | File Added: patch_set_of_list.diff | |
| 2012-12-04 15:23 | frisch | Note Added: 0008559 | |
| Copyright © 2000 - 2011 MantisBT Group |