<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE message PUBLIC
  "-//MLarc//DTD MLarc output files//EN"
  "../../mlarc.dtd"[
  <!ATTLIST message
    listname CDATA #REQUIRED
    title CDATA #REQUIRED
  >
]>

  <?xml-stylesheet href="../../mlarc.xsl" type="text/xsl"?>


<message 
  url="2003/11/978db28d889e2aac657b467588e07c8c"
  from="Harrison, John R &lt;johnh@i...&gt;"
  author="Harrison, John R"
  date="2003-11-12T00:21:02"
  subject="RE: [Caml-list] Efficient and canonical set representation?"
  prev="2003/11/fd3a18e2126bc42ed73c26461de1220f"
  next="2003/11/a13ba4cc36c14e7d7021336bcbc04024"
  next-in-thread="2003/11/4a8990a1ef715e68cd8d34483ae5ee9b"
  prev-thread="2003/11/ae2317e348119907279987d644ffa2ec"
  next-thread="2003/11/a64bd700deb07acf9ea74cea02719abc"
  root="../../"
  period="month"
  listname="caml-list"
  title="Archives of the Caml mailing list">

<thread subject="RE: [Caml-list] Efficient and canonical set representation?">
<msg 
  url="2003/11/978db28d889e2aac657b467588e07c8c"
  from="Harrison, John R &lt;johnh@i...&gt;"
  author="Harrison, John R"
  date="2003-11-12T00:21:02"
  subject="RE: [Caml-list] Efficient and canonical set representation?">
<msg 
  url="2003/11/4a8990a1ef715e68cd8d34483ae5ee9b"
  from="Brian Hurt &lt;bhurt@s...&gt;"
  author="Brian Hurt"
  date="2003-11-12T01:05:57"
  subject="RE: [Caml-list] Efficient and canonical set representation?">
</msg>
<msg 
  url="2003/11/f0ae170ec3837507642505ac411c4dd9"
  from="Diego Olivier Fernandez Pons &lt;Diego.FERNANDEZ_PONS@e...&gt;"
  author="Diego Olivier Fernandez Pons"
  date="2003-11-12T16:17:58"
  subject="RE: [Caml-list] Efficient and canonical set representation?">
</msg>
</msg>
</thread>

<contents>
That seems to be the best suggestion so far. I guess it would work well
in practice. But theoretically it still doesn't give O(log n) lookup
and insertion without the kinds of assumptions you noted about the
distribution of elements w.r.t. the hash function. And relying on
polymorphic hashing seems a bit of a hack.

So I still can't help wondering if there's an elegant solution with the
desired worst-case behaviour, preferably relying only on pairwise
comparison. Is it just a coincidence that the numerous varieties of
balanced tree (AVL, 2-3-4, red-black, ...) all seem to be non-canonical?
Or is it essential to their efficiency? (Perhaps this is a question for
another forum.)

John.

-----Original Message-----
From: owner-caml-list@pauillac.inria.fr
[mailto:owner-caml-list@pauillac.inria.fr]On Behalf Of Diego Olivier
Fernandez Pons
Sent: Monday, November 10, 2003 5:25 AM
To: Harrison, John R
Cc: caml-list@inria.fr
Subject: RE: [Caml-list] Efficient and canonical set representation?


    Hello,

&gt; After your remarks and Brian's, I'm starting to wonder if it is
&gt; possible at all to do what I want. Maybe I should be looking for an
&gt; impossibility proof instead...

Patricia sets seem to be what you are looking for.
 (1). Efficient usual operations (lookup, insertion, union)
 (2). Structural equality

Their only problem is that they cannot handle polymorphic orderable
types but only integers...

Hash the data, use the hash as the key to insert it in a Patricia map,
and resolve collisions by chaining in an ordered list (using the
polymorphic [compare] function). (1) and (2) still hold under the usual
hypotheses on the rate of collisions.
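
A minimal sketch of this scheme in OCaml, for illustration only. It is
not JCF's actual code: the type and the names (insert_sorted, join,
of_list, ...) are assumptions made for this example, and Hashtbl.hash
stands in for the polymorphic hash function. Each leaf holds one hash
value together with the sorted, duplicate-free list of the elements
that hash to it.

```ocaml
(* Patricia tree keyed by hash value; illustrative names throughout. *)
type 'a t =
  | Empty
  | Leaf of int * 'a list            (* hash key, sorted bucket *)
  | Branch of int * int * 'a t * 'a t (* prefix, branching bit, subtrees *)

let zero_bit k m = (k land m) = 0
let lowest_bit x = x land (-x)
let branching_bit p0 p1 = lowest_bit (p0 lxor p1)
let mask p m = p land (m - 1)
let match_prefix k p m = (mask k m) = p

(* Combine two trees whose prefixes p0 and p1 differ. *)
let join p0 t0 p1 t1 =
  let m = branching_bit p0 p1 in
  if zero_bit p0 m then Branch (mask p0 m, m, t0, t1)
  else Branch (mask p0 m, m, t1, t0)

(* Insert into a sorted list without duplicates, using polymorphic compare. *)
let rec insert_sorted x = function
  | [] -> [x]
  | y :: rest as l ->
    let c = compare x y in
    if c = 0 then l
    else if c > 0 then y :: insert_sorted x rest
    else x :: l

let add x set =
  let k = Hashtbl.hash x in
  let rec ins = function
    | Empty -> Leaf (k, [x])
    | Leaf (j, bucket) as t ->
      if j = k then Leaf (k, insert_sorted x bucket)
      else join k (Leaf (k, [x])) j t
    | Branch (p, m, t0, t1) as t ->
      if match_prefix k p m then
        if zero_bit k m then Branch (p, m, ins t0, t1)
        else Branch (p, m, t0, ins t1)
      else join k (Leaf (k, [x])) p t
  in
  ins set

let mem x set =
  let k = Hashtbl.hash x in
  let rec look = function
    | Empty -> false
    | Leaf (j, bucket) -> if j = k then List.mem x bucket else false
    | Branch (_, m, l, r) -> look (if zero_bit k m then l else r)
  in
  look set

let of_list l = List.fold_left (fun s x -> add x s) Empty l
```

The shape of a Patricia tree depends only on the set of keys it
contains, and each bucket is kept sorted, so two sets with the same
elements are structurally equal whatever the insertion order: for
example, of_list [3; 1; 2] = of_list [2; 3; 1] holds. Lookup and
insertion still walk the bucket, so the cost degrades toward linear
if many elements share a hash value.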

A few changes to JCF's implementation should be enough.

        Diego Olivier



-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


</contents>

</message>

