Hi.

This is a short tutorial on the radix-sort. If you already know what a radix-sort
is, how it works and you don't use quicksort anymore you can skip this
tutorial.

The tutorial is intended for programmers who have to write a fast sorting
algorithm for small- and medium-sized lists. So if you want to sort a
database you should search the web for another tutorial. However, if you
write a 3d-engine or such stuff you should read on because radix-sort is very
good for this purpose.


THE BASICS
----------

  How do you sort a list of numbers? Basically you have an algorithm which
  compares numbers and tries to get the list more and more sorted.

  The Radix-sort is a little bit different. It doesn't compare anything, it
  sorts the numbers with several passes.

  Let's start with a list of bytes. A very simple radix-sort for bytes
  would work like this:

  source:     List of bytes
  source_n:   number of bytes to sort
  dest[256]:  256 lists of bytes. each list should have enough space to hold
                source_n elements.

  Here is pseudo-code to sort the list:

  for i = 0 to source_n do
    dest[source[i]].append (source[i])
  endfor

  This means you append all items with value #1 to list 1, all items with
  value #2 to list 2 and so on. As you can see this algorithm runs very fast
  (it only copies the bytes in a very tight loop). The disadvantage is that
  you need a lot of memory to sort the list becuase you don't know how many
  items of each type you have...



SAVING MEMORY
-------------

  Do you really not know how many items of each type you have? No, you can
  count them and optimize your memory requirements. We'll need another loop
  to count the distribution of the source values:

  int distribution[256]

  // fill the list with zeros.

  for i=0 to 255  do
     distribution[i]=0;

  // build a distribution history:

  for i=0 to source_n do
     distribution[source[i]] =  distribution[source[i]] +1;
  endfor

  // Now we build an index-list for each possible element:

  int index[256];
  index [0]=0;

  for i=0 to 255 do
    index[i]=index[i-1]+distribution[i-1];
  endfor


  Adapting the original radix-sort to make use of the new information
  will result in this code:

  dest: array of bytes with space for source_n bytes.

  for i = 0 to source_n do
    dest[index[source[i]]]=source[i];
    index[source[i]] = index[source[i]] + 1;
  endfor

  After this dest contains the sorted bytes.

  Now we only need the same amount of memory as the original source list
  took up. This saves quite a lot of memory, and if your lists don't contain
  millions of values we can live with the memory requirements.


EXTENDING THE BYTE-SORT
-----------------------

  Who wants to sort bytes? You? I normally sort ascii text, floats or
  long ints. Because I only want to explain how the extension works, I'll
  go on with long ints.

  You can increase the size of the elements in the sort in several ways.
  You can either increase the size of index and distribution or you sort in
  several passes.

  Increasing the size of the index only works for short values (sorting
  short ints would require at least 65536*2 bytes for the index and
  distribution lists, which finally need 256 kb of memory... I think this is
  too much (just think about how long it takes to clear the memory and build
  the index-list.. In this time a quick sort may have sorted your list
  already).

  Now we need a clever way to sort more bits with the same memory amount.

  We can do this with several passes. Let's do it in decimal.

  Unsorted list:

  523
  153
  088
  554
  235

  Sorted for Radix 0 (least significant digit):

  523
  153
  554
  235
  088
    ^

  Sorted for Radix 1 (2nd. significant digit):

  523
  235
  153
  554
  088
   ^

  Sorted for Radix 2 (most. significant digit):

  088
  153
  235
  523
  554
  ^

  As you can se we preserve the old sorting order and advance by one digit
  until we've sorted all digits. The illustration did this in decimal. The
  native digit format for computers are bits, bytes, words and dwords.

C Source
--------

 I know... real programmers code in assembler. (Hey! I thought you want to
 understand the radix-sort, not assembly). The program compiles well under
 Watcom C++. However, it should at least work on all 32-bit C compilers and
 on some 16-bit compilers.



 ----------------------------- cut here ---------------------------

#include <iostream.h>
#include <stdlib.h>
#include <string.h>

void radix (int byte, long N, long *source, long *dest)
{
  long count[256];
  long index[256];
  memset (count, 0, sizeof (count));
  for ( int i=0; i<N; i++ ) count[((source[i])>>(byte*8))&0xff]++;

  index[0]=0;
  for ( i=1; i<256; i++ ) index[i]=index[i-1]+count[i-1];
  for ( i=0; i<N; i++ ) dest[index[((source[i])>>(byte*8))&0xff]++] = source[i];
}

void radixsort (long *source, long *temp, long N)
{
  radix (0, N, source, temp);
  radix (1, N, temp, source);
  radix (2, N, source, temp);
  radix (3, N, temp, source);
}

void make_random (long *data, long N)
{
  for ( int i=0; i<N; i++ ) data[i]=rand()|(rand()<<16);
}


long data[100];
long temp[100];

void main (void)
{
 make_random(data, 100);
 radixsort (data, temp, 100);
 for ( int i=0; i<100; i++ ) cout << data[i] << '\n';
}

 ----------------------------- cut here ---------------------------

LAST WORDS
----------

 I hope you learned something.

 I have to thank Climax from Amable for making me interested in this
 algorithm (well, two weeks before I thought quick sort is the best).

 Also thanks go out to Peter Flynn who took a look at this file and
 corrected all my type-erros. Thank you!

 If you want to contact me write to

  Nils Pipenbrinck / Submissive from Cubic Team & $eeN
  100121.2450@compuserve.com (checked daily)
  subcode@tecs.de (checked at least once a week)
