
VERSION

    This is version 0 of the the Sort Flogger.

OVERVIEW

    This is a small collection of some of the more popular sort
algorithms, including:

      bubble sort -- any 1st semester course in CS
      heap sort -- contributed by der Mouse
      insertion sort -- any 1st semester course in CS
      merge sort -- roughly patterned after Knuth Vol. 3
      quick sort -- C.A.R. Hoare's as given in K&R 2 pg 87
      shell sort -- D.L. Shell as given in K&R 2 pg 62
      bogo sort -- worst-case sort by Richard Krehbiel

    A slightly weird test routine is provided which calculates some
performance measurements on the above algorithms and also on the
qsort() function provided with your system, as well as verifying that
the sort actually worked and that it didn't modify anything immediately
above or below the array, that it doesn't leak memory, and whether
or not it's stable.  These tests are carried out in a variety of
situations designed to highlight worst-case behavior.

    These implementations of the sort functions are all compatible with
qsort()'s parameter list, so if you think your application spends too
much time sorting, you can just s/qsort/merge_sort/ and link in the
sort library and give it a try.  If you think it doesn't spend *enough*
time sorting, try bubble_sort instead.

    The merge sort declares a global int called _maybe_sorted which
tells merge sort that the data in the array may already be mostly in
order.  It is advisory--the algorithm will produce the correct results
regardless of _maybe_sorted's value; however the execution time for
merge_sort will be quite a bit less if this is set and the array is
already mostly sorted.  If the array isn't sorted at all, this option
extracts about 2% speed penalty.

    The heapsort was contributed by mouse@larry.mcrcim.mcgill.edu
aka "der Mouse".

    The merge_sort() function is the only one I've taken any trouble to
optimize.

INSTALLATION

    See the INSTALLATION document which should be nearby.

USAGE

    Just type "flogger" or "flogger <number>" and watch what happens.

DESCRIPTION OF OUTPUT

    It seems like it doesn't really do anything but spit out numbers,
and rather slowly at that.  The version and patch level of the program
is printed; the resolution of the clock is printed; the size of the
elements is reported; and the number of elements to be sorted is
reported.

    For each sort algorithm, a table of measurements is reported.  The
rows describe the situation the sort algorithm was presented with for
those measurements and the columns represent the type of measurement
taken.  The rows (situations) are:

    random:   The contents of the array to be sorted are pseudo 
              random numbers generated by rand(), with two special
              values excluded for the purpose of detecting overwrites
              at the ends of the array.
    ascend:   The array is already completely in order.  To simply
              verify this fact is all that's required of the sort
              algorithm; ironically, bubble sort outperforms all
              the rest in this case.
    descend:  The array is exactly in reverse order.  Theoretically,
              an sort function could detect this and swap ends in
              linear time, but it hardly seems useful.
    fib asc:  The contents of the array are irrelevant; the comparison
              function passed to the routine always says that its
              first argument is less than its second.
    fib desc: Similarly, the comparison function always says that its
              first argument is greater than its second.
    surprise: A pseudorandom result is returned from the comparison 
              function in an attempt to confound the algorithms' efforts
              at sorting.
    mostly:   Similar to ascend, but the first and last elements
              are reversed.  This is meant to simulate an index which
              is pretty much already sorted and only a small number
              of new items need to be inserted.
    equiv:    The comparison function reports that all elements 
              compare equal.  The data is arranged so that a
              check can be made for whether the algorithm is
              stable; that is the property that records with
              equal keys stay in the same relative order in the
              array that they were in before the sort began.

The columns refer to these measurements:

    compares: The number of calls to the comparison routine during
              the sort.  I consider this very important, as most
              of the sorting bottlenecks I've experience center
              around the comparison function.
    stack:    (Estimated) Some semi-non-portable attempts to record
              the growth of the stack are made and a number comes
              out.  I consider O(log n) use of the stack acceptable
              on modern machines and none of these algorithms are
              worse than that. 
    heap:     Amount of malloc'd storage as reported by mallinfo().
              If your C library doesn't support mallinfo(), the
              compiler will probably barf on flogger.c.  The flogger
              checks for memory leaks.
    user:     Amount of user mode CPU time as reported by times().
              This includes time for both the operation of the
              sort function and the comparison function; as these
              are only marginally related, this number is not
              useful for estimating how long it might take to 
              sort useful data in some other program.
    system:   Amount of system CPU time as reported by times().
              Generally 0.00 or very small.  If a sort algorithm
              uses lots of system time it's probably a Trojan horse
              trying to break into the passwd file or delete your
              home directory or something so maybe you should
              start shuffling through your desk looking for a recent
              backup tape.  I may remove this in a future version
              (the report of system time, not the Trojan horse)
              in favor of another metric, perhaps wall clock time.

    But it isn't enough simply to be fast, a sort algorithm also has
to be correct.  To that end, the flogger routine verifies that
the array was actually sorted, at least when the comparison routine
wasn't lying.  It also checks that magic values in the memory locations
immediately above and below the array were not accidentally accessed
or overwritten.

    The bogo sort run is skipped if n is sufficiently large that
it's execution would run into trillions of cycles (approx n = 15).

MISCELLANEOUS

    When running the test routine, some sorts take so long that you
might lose patience.  Keyboard interrupt (aka Control-C) will abort the
current sort test and move on to the next.  If you really want the
program to end, give it a quit signal (Control-\) or send it a signal
with the kill command.   You can also background it (Control-Z) from
the C-shell then kill it.

    The size of the elements being sorted is hard-coded at the moment,
but it can be changed by going into flogger.c and changing the typedefs for

    #define TEST_TYPE double
    #define TEST_FUNC double_compare

into

    #define TEST_TYPE int
    #define TEST_FUNC int_compare

or back again.  Or add a new type and a new comparison function.

    Several of the sort algorithms are not re-entrant which means
you shouldn't call the sort from within the very same comparison 
function you passed into that sort in the first place.

COPYRIGHT

    Since none of the sort algorithms are mine and anyone with a copy
of Knuth Vol 3 and a little knowledge of C could type them in as easily
as I did, there isn't much point in claiming a copyright, much less
applying for a patent.  I wanted to disclaim liability though, so I put
a few minor restrictions on the code, as set forth in each of the
source files.
