## Radix Sort, also known as Bucket Sort, also known as Hash Sort

Radix sort divides the list into a number of buckets, based on the values.

Each bucket is divided recursively until there's a single element.

Then all the buckets are gathered together in order.

Imagine you were sorting a large number of books

You put each book that starts with 'A' in the 'A' pile.

Each that starts with 'B' goes in the 'B' pile, 'C' goes in the 'C' pile and so on.

You now have 26 piles.

If the piles are small enough you sort them.

If a pile is large, sort it into 26 piles based on the second letter.

An ordinary sort has a compare function that compares two values, but radix sort uses an index function which can sort elements in to N buckets in one shot.

In pseudo code;

add_to_bucket( Bucket[first_symbol_of_element], element )

If you choose a Radix (number of buckets) greater than or equal to the number of elements,

it might seem like Radix Sort can be O(N), however, it's still O(N log N)

Instead of N×log_{2}N it's N×log_{radix}N, but that's still O(N log N).

For very large lists, the difference between log_{2}N and log_{radix}N can be substantial.

For small lists, the overhead of creating and managing the buckets can take more time than the sorting does.

Sorting evenly distributed or random elements, Radix sort performs extremely well but Radix sort can fail spectacularly when the values are clustered.

Imagine sorting the "King, Stephen" section of a library.

But if you know a lot about the data, it can be incredible.

Imagine sorting playing cards into 52 buckets - one pass and done.

Various attempts have been made to improve the clustering problem, including sorting from the last symbol first, sorting on a hash of the elements value.

These methods work in better in some cases and worse in others.

Radix sort is often worthy of consideration, since you often know a fair bit about the data you are sorting.

However, it's not generally used in a code library because a library function doesn't know much about the data it's sorting.

^ UP ^
< Back <
> Next >