This site is updated infrequently. For up-to-date information, please visit the new OCaml website at ocaml.org.

Re: Sort.array easily degenerates
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
 Date: 1999-03-10 (16:59) From: Xavier Leroy Subject: Re: Sort.array easily degenerates
```I have revised my Quicksort implementation based on that found in
glibc 2, which contains some clever optimizations.  Interested parties
can see the code at

http://camlcvs.inria.fr/cgi-bin/cvsweb.out/ocaml/stdlib/sort.ml

The behavior on extreme situations (input already sorted) is now much
better.

> There's no way to implement Sedgewick's quicksort with the interface
> given in sort.mli.  You'd need two comparison functions, one for ">="
> and one for "<=".  That explains the degenerate case when all the
> elements are equal.

It's true that ">=" vs. ">" can make a big difference, but >= is
easily definable from <= : a >= b is b <= a, a > b is not(a <= b),
a < b is not(b <= a).

> And it degenerates on already-sorted data because you swap the pivot
> with the right-most element.  As a consequence, one of the subarrays
> has its two highest elements in first and last positions, which makes
> the median-of-three degenerate.  I think this bug is also in
> Sedgewick's pseudo-code.

Sedgewick doesn't even give pseudo-code for the "median of three"
heuristic, but the glibc implementation avoids this swap altogether.

> Also, you should recurse on the smallest subarray first, not the
> largest.

Good point.

> I vote for Shellsort.

Well, I tried it, and it's noticeably slower than quicksort by almost
a factor of two on random input.  Perhaps because polymorphic array
access is quite expensive in OCaml.

- Xavier Leroy

```