
FTPFACTS.TXT         by  Jim Rector
                         Mike Clear

                         Pegasus Software
                                &
                          The Peg Board

This file is intended to provide facts concerning file transfer
protocols.  In previous files I have given my thoughts on the
evolution of new protocols and I have made some general
suggestions on the direction a new protocol might take.  This
file will not discuss a new protocol.  What I wish to discuss
are some technical facts concerning file transfer protocols and
how they might affect a protocol's performance.

In the past year we have been flooded with new protocols.  These
new protocols for the most part have come from the public domain.
I have noticed a wide variation of themes and block sizes.  I
would like to limit this discussion to two main areas of file
protocol design.  The two areas are block size and error
checking methods.  The main thrust of the new protocols has been
targeted towards improving throughput.  Many claims have been
made concerning the advantage of one protocol over another with
regard to throughput.

What is throughput?  In general this term relates to the amount
of data transmitted in a given time.  This is a limited
explanation of throughput, but the basic idea is to design a
protocol that can transfer a file, without error, in the shortest
possible time.  Several things affect throughput.  From a design
standpoint, block length, block overhead, error check time and
maximum modem error rate are the main factors that affect
throughput.

Other factors that can affect throughput are beyond the control
of the protocol.  The most important to consider is phone line
quality.  Phone line quality can vary greatly on every
connection.  It can also vary greatly during a single session.
This variation constantly changes the "expected error rate".

Expected error rate is a very deep subject and I will not try
to explain it here in great detail.  In simple terms the
expected error rate is a ratio between the number of characters
transmitted and the number of character errors encountered.
Modem design standards and phone line quality make this factor
very unstable.  There is a big difference between performance on
paper and performance in the field.  There have been many
studies on this subject, some of which have lasted for years.

I feel that most if not all of the recent protocols have failed
to consider the effects of these factors.  Most have literally
been born overnight.  I think that few authors have taken the
time to carefully plan their protocols.  I also think that some
of the side effects have been less than desirable.  There
appears to be a race going on to see who can produce the best
protocol.  Unfortunately the authors have evidently considered
being first out more important than being best.  Also,
authors have added features to their protocols to make them more
attractive.  Along with the "nice features" comes complexity.
Complexity is the most undesirable feature a protocol can have.

Most new protocols claim greater throughput.  In fact the case
is often the opposite.  Some of the new protocols have decreased
the throughput while maintaining that they are faster.  The
method of testing a new protocol is very important.  If a new
protocol is tested on a direct connect between two computers
then the modem error factor and line quality factors are ignored
when reporting the so-called performance of the new protocol.
I think that sometimes a new protocol is judged on paper only
rather than actual extensive field testing.  Great exaggeration
seems to dominate the recent reporting on new protocols.  A case
in point: some time ago I was given some advice on a BBS
modification that would increase the xmodem throughput about 15
to 20 percent.  I instantly figured out that I must be talking to
Houdini.  Xmodem's overhead is less than 4% of the transmission
time.  At best I could reduce the time by 4% if I eliminated the
blocks altogether and totally did away with the error checking.
Where is the other 11 to 16 percent that Houdini spoke of?

Current public domain file transfer protocols use many different
block sizes.  Anywhere from 128 to 1024 characters.  Why so many
different sizes?  Well this depends on who you talk to.  The most
popular protocols use 128 character blocks.  Why would a protocol
use a 1024 character block?  The authors of such protocols claim
that this block size increases the throughput.  Does it?  Let's
break things down and take a look at the individual factors that
affect throughput.

The common argument for larger block size is as follows.  For
each transmitted block there is something called block overhead.
Block overhead is the extra characters added to the block to
provide important information about the block.  These extra
characters represent the start of the block, the block number and
the error check.  For each block sent there is an overhead of a
fixed number of characters.  In the case of xmodem it's five per
block: four in the block itself and one character counted as
the acknowledge character.  Well, it now seems simple enough: just
increase the block size and reduce the number of overhead
characters that need to be transmitted.  This will obviously
increase the throughput.  Is it really that simple?  It is if you
consider the block size the only factor that affects the
throughput.  As mentioned above, there are other factors that
need to be considered when calculating the throughput, something
I feel has not been done with the most recent protocols.
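To make the overhead count concrete, here is a rough modern sketch
of an xmodem-style block in Python (my illustration, not code from
any of the protocols discussed; SOH and ACK are the standard xmodem
control characters):

```python
SOH = 0x01  # start-of-header, marks the beginning of a block
ACK = 0x06  # receiver answers each good block with one ACK

def build_block(block_num, data):
    """Build a 128 character xmodem block with its 4 overhead characters."""
    assert len(data) == 128              # short blocks are padded first
    checksum = sum(data) % 256           # simple 8-bit additive checksum
    return (bytes([SOH, block_num % 256, 255 - (block_num % 256)])
            + data + bytes([checksum]))

block = build_block(1, bytes(128))
overhead_in_block = len(block) - 128     # SOH + number + complement + check
overhead_per_block = overhead_in_block + 1   # plus the ACK coming back

print(overhead_in_block)    # 4
print(overhead_per_block)   # 5
```

This is where the five overhead characters per block in the
figures below come from.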

Let's compare the calculated overhead needed for 128 character
blocks and 1024 character blocks.  For all of the examples I will
assume a file size of 100,000 characters and a 1200 baud rate.

128 block size:

100,000 / 128 =  782 blocks (actual 781.25, but the .25 is padded
                             with 0's to fill out the last block)

782 * 5 = 3910   Character overhead (blocks times overhead per
                                     block)

103,910   Total characters transmitted to complete the file transfer

3910 / 103,910 = .0376 or 3.76 percent (percentage of total
                                   transfer used for overhead)

14.43 minutes file transfer time

0.55  minutes of total time used to transmit the overhead.

1024 block size:

100,000 / 1024 = 98 blocks (actual 97.65 w/pad on last block)

98 * 5 = 490    Character overhead

100,490   Total characters transmitted

490 / 100,490 = .0049 or .49 percent of total for overhead

13.96 minutes file transfer time

0.07 minutes of total to transmit overhead
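The figures above can be reproduced with a few lines of arithmetic.
Here is a sketch of my own in Python, assuming 10 bits per
transmitted character at 1200 baud and 5 overhead characters per
block, with pad bytes ignored just as in the figures above:

```python
import math

def transfer_stats(file_size, block_size, baud=1200, overhead=5):
    """Blocks, overhead characters, total characters, and minutes."""
    blocks = math.ceil(file_size / block_size)   # last block is padded
    overhead_chars = blocks * overhead
    total_chars = file_size + overhead_chars     # pad bytes ignored
    minutes = total_chars * 10 / baud / 60       # 10 bits per character
    return blocks, overhead_chars, total_chars, minutes

print(transfer_stats(100_000, 128))   # (782, 3910, 103910, 14.43...)
print(transfer_stats(100_000, 1024))  # (98, 490, 100490, 13.95...)
```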

Well, by these figures we have saved 0.47 minutes or about 28
seconds.  If you only look at the effect of the block size on the
file transfer you would think that using the 1024 block would
greatly improve the protocol.  But remember, the block size is
not the only factor that affects the throughput; if it were, we
would all be using larger block sizes.

What about phone line quality?  I think we can all agree that not
all downloads go error free.  By this I mean that there is an
occasional error that causes the protocol to retransmit a block.
This does not mean that the file is unusable; it just means that
the error check found an error and asked the sending computer to
re-send the block.  We need to look at the long term picture.
Some transfers will retransmit blocks and others will not.  When
looking at the ones that do, we will find that the number of
retransmissions will vary.  To get a fair picture of the
performance of a protocol we need to look at several transfers.  A
single user will download many files over the course of a year.

Let's look at the effect of retransmissions on the two different
block sizes.  The 128 block size takes about 1.1 seconds to send
a block.  The 1024 block size takes about 8.6 seconds to send a
block.  Just for argument, let's say that both protocols
experience an error in four blocks.  Here are the new total times
for each.

128 blocks:

14.43 minutes + 4.4 seconds = a new total time of 14.50 minutes

1024 blocks:

13.96 minutes + 34.4 seconds = a new time of 14.53 minutes
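The block times and retransmit totals can be checked the same way.
Another sketch of mine, counting the block plus its 5 overhead
characters at 10 bits per character and 1200 baud:

```python
def block_seconds(block_size, baud=1200, overhead=5):
    """Seconds to send one block plus its overhead characters."""
    return (block_size + overhead) * 10 / baud

def total_minutes(base_minutes, block_size, retransmits):
    """Add the cost of re-sending `retransmits` blocks to a base time."""
    return base_minutes + retransmits * block_seconds(block_size) / 60

print(round(block_seconds(128), 1))            # 1.1 seconds per block
print(round(block_seconds(1024), 1))           # 8.6 seconds per block
print(f"{total_minutes(14.43, 128, 4):.2f}")   # 14.50 minutes
print(f"{total_minutes(13.96, 1024, 4):.2f}")  # 14.53 minutes
```

With only four retransmits, the 1024 block transfer has already
lost its head start.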

Guess what?  With 4 retransmits the 1024 block protocol took
longer than the smaller 128 block protocol.  Did someone let
Houdini back in here?  This indicates to me that the 1024 block
protocol will work fine as long as the phone connection quality
is very good.  Dream on!  How often is the phone connection
good enough to guarantee that you will never have any retransmits
of blocks?  If you like to save money and use MCI or one of the
other long distance services, you will find the larger block
protocol to be a big fat negative!  At about this point there
will be a few who will say, "I don't get that many retransmits
and the bigger block size won't hurt me; it could only help."

Well, now for the big kicker.  The one factor most overlooked by
programmers: EXPECTED ERROR RATE.  What is that?  I mentioned
this factor in the text above.  It is the ratio of transmitted
characters to the errors in transmission.  Well, that covers the
error rate part.  What about the "expected" part?  This is a tough
subject to cram into a small space.  In general all modems and
phone lines carry an expected error rate; in other words, with the
transmission of X number of characters you will find one error.
What number do we put in X?  That's a hard one to answer.  If you
look on your modem you will find that it conforms to several
sections of the FCC code, part this and subsection that.  Modems
carry a fairly stable error rate, more or less.  The phone lines
account for the biggest swing in the error rate.  By the
standards set down, they should be very stable but rarely are.
There are too many outside influences, thunderstorms just to
name one.  Most people have seen the error rate vary without
paying too much attention to it.  Have you ever connected to a
BBS, downloaded a file with no problems, and called back another
day only to find that your download was riddled with retransmits?
This was a change in the error rate.  I am sure you have noticed
it, but now for some it may finally have a name.

How does error rate affect the choice of block size?  Well, for
the sake of example, let's say we have a 500 to 1 error rate.  This
would mean (on average) that for every 500 characters received,
one character would be bad.  If this were an actual case, the
1024 protocol would never get off the ground.  It needs to
receive 1024 valid characters in a row to have the block error
check read good.  On the other hand, the 128 protocol would
work.  It would experience some retransmits, but nonetheless
would work.  I have seen error rates high enough to cause the 128
protocols to time out on retransmits.  So in my opinion the 1024
size block is just asking for trouble.  There is one general fact
that applies to block size: the bigger the block, the greater the
likelihood that it will contain an error.  With bigger blocks you
reduce the number of gaps between them.  With smaller blocks you
increase the number of gaps.  When a protocol has received a
block and finished its error check, it starts looking for the
start character for a new block.  If it receives a character
(error) other than the start of block character, it drops it on
the floor.  The gaps make a nice place for errors to occur.  If
an error happens in a gap it does not cause a block
retransmission.  Now we can say: the more gaps, the more places
for an error to fall through without wasting time.  I look at the
block size situation this way: it's a lot easier to throw a rock
and hit the side of a barn than it is a tree stump!  Bigger
blocks make bigger targets.  More gap traps are less painful!
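The 500 to 1 example can be put in numbers.  If each character
independently has a 1 in 500 chance of being bad (a simplification
of mine; real line errors tend to come in bursts), the chance that
a whole block arrives clean is easy to compute:

```python
def clean_block_chance(block_size, error_rate=500):
    """Probability that every character in a block arrives good,
    assuming one bad character per `error_rate` characters and
    independent errors (a simplification, but it shows the trend)."""
    return (1 - 1 / error_rate) ** block_size

print(f"{clean_block_chance(128):.2f}")   # about 0.77
print(f"{clean_block_chance(1024):.2f}")  # about 0.13
```

Under this assumption roughly three out of four 128 character
blocks get through clean, while almost nine out of ten 1024
character blocks have to be re-sent.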

I would like to answer one question before it is asked.  Some
will ask about the time it takes to make an error check on a
block of data.  The answer is simple: the time needed to make the
error check is so small it's hardly worth mentioning, at least it
shouldn't be.  This is one area that depends on the programmer
and the type of error check.  With the Messenger software, Mike
has the error check time down to about 200 microseconds per
block, or in other words 2 tenths of a millisecond.  Even the
most extensive check should not take more than 1 millisecond.
These figures are based on the 128 block size.

Does the error check method used have any effect on throughput?
In theory it does.  When comparing the checksum method with the
CRC method, I find little difference.  On paper the CRC method
shows to be a little more reliable.  In actual field application,
little difference, if any, can be seen.  I think the
implementation of the CRC method has caused more trouble than the
slight advantage that it may have gained.  Compatibility with
standard xmodem was lost for such little gain.  I think this is
like a Porsche owner that spends $15,000 to make his car go 151
mph when before it would go 150 mph!
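For reference, the two error checks being compared look like this
in outline.  This is my sketch of the standard xmodem additive
checksum and the xmodem CRC-16 (polynomial 0x1021, published check
value 0x31C3 for the test string "123456789"), not code from any
particular implementation:

```python
def checksum8(data):
    """Standard xmodem additive checksum: low byte of the sum."""
    return sum(data) % 256

def crc16_xmodem(data):
    """CRC-16 with polynomial 0x1021 and initial value 0, bit by bit."""
    crc = 0
    for byte in data:
        crc ^= byte << 8                 # bring the next byte in at the top
        for _ in range(8):               # shift out each bit
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

block = b"A" * 128
print(checksum8(block))        # one character of check data
print(hex(crc16_xmodem(block)))  # two characters of check data
```

The CRC catches a few error patterns the checksum misses, which is
the slight paper advantage mentioned above.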

Summary:

I think that the large block protocols, over a long period of
time, will prove to be more error prone than advertised.  I
predict that they are no more than a passing fad.  I think that
modem and phone line improvements will do more for throughput
than any attempts to tweak the current protocols.  There is an
old saying that may apply to the recent flood of file protocols,
courtesy of Mike Clear: "If it isn't broke, don't fix it".

