Browse thread
Canonical Set/Map datastructure?
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: | 2008-03-07 (10:09) |
From: | Berke Durak <berke.durak@e...> |
Subject: | Re: [Caml-list] Canonical Set/Map datastructure? |
Harrison, John R a écrit : > Hi Berke, > > | However, the idea of combining hash-consing and Patricia trees, > | however elegant, does not suit my problem. Basically, you are > | cheating by using an auxiliary data structure, the hashtable (which is > | also O(n^2) worst-case). > > The code I pointed you to uses hashes, but not hash tables. As for the > worst-case efficiency, what you say is true, but it's unlikely to > matter in practice. And I suspect it may be unavoidable in principle, > which is the interesting thing I learned last time this was discussed. > See for example the following paper and its references: Hi John, Thanks for your code. I'm sorry I thought you maintained a separate hash table. It's very interesting code and I'll give it a try. But it seems to have two properties which make it unsuitable for the particular application I had in mind: - The theoretical worst-case performance of hash-based data structures can be attained if the hash function has bad dispersal. As the code relies on Hashtbl.hash, which does not hash deeply, this could potentially be a proble, in particular if your keys have long "common prefixes" that are not distinguished by the hash function. It might work well in practice but I feel a little uncomfortable. - Also, it does not preserve the natural order for keys, and that is particularly bad, because I often use, for instance, float-indexed maps or sets as a substitute for heaps. The paper with the inverse cubic lower bound is *very* interesting; we don't see plausible lower bounds often in complexity theory, especially with such large assumptions (just bounded out-degree for the graph nodes). So it seems there is little hope to have a drop-in replacement for Set or Map that is compatible with the natural order of things, a.k.a., Pervasives.compare. -- Berke DURAK