Sort.array easily degenerates
 Markus Mottl
[
Home
]
[ Index:
by date

by threads
]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
Date:   (:) 
From:  Markus Mottl <mottl@m...> 
Subject:  Sort.array easily degenerates 
Hello, I have played around with the new functions and modules in the new OCAMLrelease. Besides other things I have tested the new function for sorting arrays (Sort.array). I am not sure where the problem in the implementation is, but the "qsort"function, which is applied in "Sort.array" degenerates easily on presorted and/or nonunique data. E.g.:  SNIP  let _ = let size = 5000 in let ar = Array.create size 0 in for i = 0 to (size1) do ar.(i) < i done; Sort.array (>=) ar  SNIP  The array to be sorted is initialized with its index. Then it is sorted with descending order. Running the same test with larger arrays clearly shows that we encounter the worstcase behaviour (n^2) of quicksort. Even worse: If we initialize the array with the same number, the time complexity stays at its worstcase but with an even higher constant factor. Initializing the array with random integers (of a large range) shows that in such cases "Sort.array" does not perform this badly (actually quite well). I have compared this to "qsort" in "stdlib.h", the standard library of C. It is faster than the OCAMLversion, as was to be expected. But in contrast to OCAML it behaves very nicely on lowentropy data. E.g.: (CCode):  SNIP  #include <stdlib.h> int int_comp (const void * a, const void * b) { return *(int *) a  *(int *) b; } const int size = 5000; int main () { int ar[size]; for (int i = 0; i < size; ++i) ar[i] = i; qsort (ar, size, sizeof(int), int_comp); }  SNIP  It would probably be a good idea to change the implementation of "Sort.array" so as to make it more unlikely to encounter such worstcase behaviour. Especially with data, where elements may occur more than once, the current implementation performs really badly. So here a question: has someone already written a quicksortfunction with inplace modification for arrays which demonstrates nicer behaviour? Best regards, Markus  Markus Mottl, mottl@miss.wuwien.ac.at, http://miss.wuwien.ac.at/~mottl