A partition sorting routine

Modern computing resources have ensured that a topic which used to be fiercely debated now retains mostly traditional value. However, having experimented with four sorting methods (including straight selection sort, treesort and mergesort), I found the subject kept me busy for long enough to deserve a section.

A partition sort is not as good as Quicksort, of course, though it is broadly based on the same principle: The list is partitioned around a pivot element, but a separate array is used to hold the partially ordered list.
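The core step can be sketched as follows. This is my own illustrative Python, not the original listing: One pass writes keys below the pivot to the front of a second array and the remaining keys to its back.

```python
def partition_pass(a, pivot):
    """One partition pass into a separate array: keys below the pivot go to
    the front, the rest to the back.  A sketch, not the original code."""
    aux = [None] * len(a)            # the second array holding the result
    left, right = 0, len(a) - 1
    for key in a:
        if key < pivot:
            aux[left] = key          # smaller keys fill the front...
            left += 1
        else:
            aux[right] = key         # ...the rest fill the back
            right -= 1
    return aux, left                 # left = index where the upper part begins

print(partition_pass([5, 2, 8, 1, 9, 3], 5))
```

After one pass the array is only partially ordered: Each half still needs further partitioning, which is where the recursion comes in.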

There are a couple of points to watch in practice: The list may not get partitioned on the first attempt.

When a sublist consists exclusively of identical keys, no further work is required on it. (Indeed, the program would never end, should processing continue on such a partition.) On the other hand, if an inadequate pivot has been selected, random re-selection of pivots will eventually ensure that the list is partitioned. (Repeated keys and boundary conditions require additional code in several good sorting algorithms.)
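The identical-key test mentioned above might look like this (an illustrative helper of my own, not the original code):

```python
def all_keys_equal(a, lo, hi):
    """True when the sublist a[lo..hi] consists of one repeated key.  Such a
    sublist is already in order, and no choice of pivot could ever split it,
    so the routine must skip it rather than repartition forever."""
    return all(a[i] == a[lo] for i in range(lo + 1, hi + 1))

print(all_keys_equal([7, 7, 7, 7], 0, 3))   # a partition of identical keys
print(all_keys_equal([7, 7, 3, 7], 0, 3))   # one stray key breaks the tie
```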

Many sublists of size two are present at the final stage of the algorithm: These are sorted directly, with a single comparison (otherwise that stage would last twice as long).

Presentations usually employ integer keys, but any field of a record can be considered: Real numbers, dates, alphabetic ordering of strings and so on.
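As a quick illustration of sorting on a non-integer field (the records and names here are made up, and Python's built-in sort stands in for the routine, which would compare the same extracted keys):

```python
from datetime import date

# Hypothetical records: (name, date of birth).  Any field can be the key.
people = [("Ada",   date(1815, 12, 10)),
          ("Alan",  date(1912,  6, 23)),
          ("Grace", date(1906, 12,  9))]

# Compare on the date field instead of the name field.
by_birth = sorted(people, key=lambda rec: rec[1])
print([name for name, _ in by_birth])   # → ['Ada', 'Grace', 'Alan']
```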

The space needed by the routine is twice that of the unprocessed list. Stack space is typically proportional to the base-two logarithm of the list size. (There are four input parameters/local variables per call.) Worst-case storage is proportional to the size of the list, but with long lists, the possibility of coming across this case is remote.

The average number of comparisons required by acceptable sorting routines is of the order of n log₂ n. (The worst case is n²/2, but again, with long lists, the probability of obtaining that is negligibly small.) Because of the randomness involved, analytic average-performance calculations are no easy task. However, simulation suggests that the partition sort is close to that figure.
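To put those figures in perspective, a few sample values of the two yardsticks (my own quick tabulation, not taken from the simulations):

```python
import math

# Average-case yardstick n*log2(n) against the worst case n*n/2.
for n in (100, 1_000, 10_000):
    avg = n * math.log2(n)
    worst = n * n / 2
    print(f"n = {n:6}: roughly {avg:10.0f} comparisons on average, "
          f"{worst:12.0f} in the worst case")
```

The gap widens rapidly, which is why the worst case matters so little for long lists when pivots are chosen at random.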

There were two simulation groups: The first involved keys which were all different; the second allowed repeated keys (in the range 1 to the list size) as produced by a random number generator. It was a surprise that there was not much difference between the two groups. Admittedly, though, a large number of identical keys does degrade performance.

There are twice as many copying operations as comparisons. However, it is fair to remark that copying part of an array as a block should be faster than copying the partition's elements individually, if low-level memory access is employed.

The drawbacks of the algorithm are the extra array, the need to verify whether a partition consists solely of repeated keys, and the occasional repartitioning forced by an unlucky choice of pivot.

If the routine is implemented in machine code on a small microcontroller, the programmer will have to maintain a private stack for non-static variables; it is unlikely that the native stack can cope if the list is long.

The code has been beautified and is shown below: The Rnd function is expected to provide a random number between 0 and 1.
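The original listing is not reproduced here; the following Python sketch is my own reconstruction from the description above, with random.random() standing in for Rnd. All names are mine.

```python
import random

def partition_sort(a):
    """Partition sort, reconstructed from the description: the list is
    partitioned around a pivot into a separate array of the same size."""
    aux = [None] * len(a)                 # the extra array, as large as the list
    _psort(a, aux, 0, len(a) - 1)

def _psort(a, aux, lo, hi):
    n = hi - lo + 1
    if n <= 1:
        return
    if n == 2:                            # size-two sublists are sorted directly
        if a[lo] > a[hi]:
            a[lo], a[hi] = a[hi], a[lo]
        return
    # A sublist of identical keys is already in order; no pivot could ever
    # split it, so it must be skipped or the routine would never end.
    if all(a[i] == a[lo] for i in range(lo + 1, hi + 1)):
        return
    while True:                           # repartition until the list splits
        # random.random(), like Rnd, gives a number between 0 and 1.
        pivot = a[lo + int(random.random() * n)]
        left, right = lo, hi
        for i in range(lo, hi + 1):
            if a[i] < pivot:              # smaller keys to the front of aux,
                aux[left] = a[i]
                left += 1
            else:                         # the rest to the back
                aux[right] = a[i]
                right -= 1
        if left > lo:                     # an unlucky pivot (the minimum key)
            break                         # leaves one part empty: pick again
    a[lo:hi + 1] = aux[lo:hi + 1]         # copy the block back in one go
    _psort(a, aux, lo, left - 1)          # keys below the pivot
    _psort(a, aux, left, hi)              # keys at or above the pivot
```

For example, after partition_sort(data) the list data is in ascending order in place; the auxiliary array is allocated once and shared by all the recursive calls.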

Errors will not be removed until their presence has been felt.