TJSTRING CLASS:

TJString is a general purpose string class that encapsulates many of the
behaviours of traditional C strings.  This document provides the API for
the class as well as some technical notes about usage.


COPY ON WRITE:
TJString is a copy-on-write class.  This means that a TJString object only
has its own copy of the data if it is constructed from or set equal to a
C style string, or if a non-const member function is called on it.  For
example, consider the following code snippet:

TJString A("Hello World");
TJString B(A);
TJString C("Hello Everyone");
TJString D(C);

C = A;
D = A+B;

cout<<(const char*)A<<endl<<(const char*)B<<endl
    <<(const char*)C<<endl<<(const char*)D<<endl;

This would produce:
Hello World
Hello World
Hello World
Hello WorldHello World

There is only one copy of the string "Hello World" kept in memory for this
example.  If we add the statements,
B.reverse();
B.reverse();
there will now be two copies of the string "Hello World".  A and C both
point to the first, and B now has its own copy.
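
The sharing behaviour described above can be sketched with a reference
counted buffer.  The class below is a hypothetical stand-in named CowString,
written against C++11 or later using std::shared_ptr.  It is not TJString's
source, and the real implementation may differ.  The names CowString and
sharesBufferWith are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for illustration only; this is not TJString's code.
class CowString {
public:
    CowString(const char* text = "")
        : data_(std::make_shared<std::string>(text)) {}

    // Copy construction and assignment share the buffer: the implicitly
    // generated versions only copy the shared_ptr, not the characters.

    // Const access reads the shared buffer and never triggers a copy.
    char charAt(std::size_t index) const {
        assert(index < data_->size());
        return (*data_)[index];
    }
    const char* c_str() const { return data_->c_str(); }

    // Non-const members detach first, so other owners are unaffected.
    CowString& reverse() {
        detach();
        std::reverse(data_->begin(), data_->end());
        return *this;
    }

    // True when both objects currently point at the same buffer.
    bool sharesBufferWith(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    // Take a private copy of the buffer if any other object is using it.
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};
```

With this sketch, copying A into B and then calling B.reverse() twice leaves
B holding the original text in a private buffer, while A and any other copies
still share the first one.  The use_count() test is only adequate for single
threaded code.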


STRING COMPARISON:
Since it is possible to compare two strings that are not necessarily
alpha-numeric, a set of rules must be established for non-alpha-numeric
characters.  The following table describes the ordering of the ASCII
character set:
ASCII codes:         Priorities:        Characters:
000-031               0                 control
032-047               1 - 16            space ! " # $ % & ' ( ) * + , - . /
058-064              17 - 23            : ; < = > ? @
091-096              24 - 29            [ \ ] ^ _ `
123-127              30 - 34            { | } ~ DEL
048-057              36 - 45            0 1 2 3 4 5 6 7 8 9
065-090              46 - 71            A - Z
097-122              72 - 97            a - z
128-255              255                set dependent

For example, the string !#%%$ would come before &()'*+.  This ordering
is arbitrary for non-alpha-numeric characters and is based upon ASCII code
order only.
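
One way to read the table is as a function from character code to priority,
with comparison done priority by priority.  The following is a minimal
sketch under that reading.  The helper names charPriority and
comparePriority are hypothetical and are not part of the TJString API.  Two
details are assumptions: characters that share a priority ( all control
codes, and all codes from 128 to 255 ) compare as equal, and a string that
is a prefix of another sorts first.

```cpp
#include <cassert>
#include <cstddef>

// Map one character code to its priority, following the table above.
// Hypothetical helper; the name and signature are not part of TJString.
int charPriority(unsigned char c) {
    if (c <= 31)              return 0;                // control
    if (c <= 47)              return 1 + (c - 32);     // space .. /
    if (c >= 58 && c <= 64)   return 17 + (c - 58);    // : .. @
    if (c >= 91 && c <= 96)   return 24 + (c - 91);    // [ .. `
    if (c >= 123 && c <= 127) return 30 + (c - 123);   // { .. DEL
    if (c >= 48 && c <= 57)   return 36 + (c - 48);    // 0 .. 9
    if (c >= 65 && c <= 90)   return 46 + (c - 65);    // A .. Z
    if (c >= 97 && c <= 122)  return 72 + (c - 97);    // a .. z
    return 255;                                        // 128 .. 255
}

// Returns a negative value, zero, or a positive value, in the manner of
// strcmp, but orders characters by charPriority instead of raw code.
int comparePriority(const char* left, const char* right) {
    assert(left != nullptr && right != nullptr);
    std::size_t i = 0;
    while (left[i] != '\0' && right[i] != '\0') {
        int diff = charPriority(static_cast<unsigned char>(left[i]))
                 - charPriority(static_cast<unsigned char>(right[i]));
        if (diff != 0) {
            return diff;
        }
        ++i;
    }
    if (left[i] == '\0' && right[i] == '\0') {
        return 0;                      // same length, all priorities equal
    }
    return left[i] == '\0' ? -1 : 1;   // assumption: the prefix sorts first
}
```

Under this sketch punctuation sorts ahead of digits and digits ahead of
letters, so "_" compares less than "0" even though its ASCII code is larger.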


MEMBER FUNCTIONS:

TJString()
    This is the default constructor for TJString.  It creates an empty
    string ( "" ).

TJString( const TJString& Copy )
    This is the copy constructor for TJString.   The new TJString is created
    using the contents of Copy.

TJString( const char* strCopy )
    This constructor creates a new TJString from a null-terminated
    C string.

~TJString()
    This is the destructor of TJString.

const char& operator[]( index ) const
    These are the indexing operators for constant TJString objects.  Note
    that for this version of TJString, no index checking is performed.  Care
    should be taken that the given index is valid.

char& operator[]( index )
    These are the indexing operators for non-constant TJString objects.
    Calling the [] operator for a non-constant TJString object causes the
    object to create its own copy of the string data.  These operators
    should only be called when it is desired to change the contents
    of the string.

operator char*()
    This is the cast operator to a character string.  Care should be
    used when using this cast.  Since the TJString does not know what is
    being done with the buffer it goes into an internal state that
    indicates that the data held is unreliable.  In this state the
    TJString is much less efficient.  TJString objects created using an
    unreliable TJString will not use the unreliable object's data buffer
    for storage.  The unreliable TJString, not trusting its own character
    buffer, will be forced to calculate certain values ( string length
    for example ) every time that value is requested, rather than just
    returning a stored value.  It is also possible that the data buffer
    may be destroyed by the TJString object while the pointer obtained
    from the cast is still in use.  For example:

    char* tmpPtr;
    TJString Me("Hello World");

    tmpPtr = Me;  //cast to char*

    Me="Hello Everyone"; //The buffer is deleted here.

    cout<<tmpPtr<<endl;  // The effects of this are not defined.


operator const char*()
    This is the cast operator to a constant character pointer.  This is
    a "safe" cast and can be used without fear of object corruption.

TJString& operator=( const TJString& )
    This operator sets one TJString equal to another. 
    
TJString& operator+=( const TJString& )
    This operator concatenates the TJString parameter onto the
    target string, i.e. A="Hello", B=" World", A+=B yields "Hello World".

TJString& operator=( const char* )
    This operator sets the TJString equal to the C string.

TJString& operator+=( const char* )
    This operator concatenates the C string to the TJString.

operator ==
operator !=
operator <
operator >
operator <=
operator >=
    The normal boolean comparison operators are provided for comparison
    of two TJStrings and a TJString with a C string.

unsigned long length() const
    Returns the number of characters in the string.

char charAt( index ) const
    Returns the character at the given index.  This member should be used
    in place of the [] for non-const strings when the character is not
    being used as an l-value ( for example in an if statement ).  This
    does not cause the TJString to make its own copy of the data.

int contains( const TJString& ) const
int contains( const char* ) const
    These members check for the existence of the parameter string and return
    1 ( TRUE ) if it exists in the TJString object and 0 ( FALSE ) if
    it does not.

TJString operator()( unsigned long, unsigned long )
    This operator returns the substring from the first index parameter to
    the second index parameter, inclusive.  Indices count from zero, so if
    A="Hello World" then A(3, 6)="lo W".
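
    Inclusive bounds are easy to get wrong by one, so the following sketch
    shows the arithmetic on a std::string.  The helper inclusiveSubstring
    is hypothetical and is not part of TJString.  It assumes zero-based
    indices, which is an inference from subStringTo being described as
    subString( 0, parameter ).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: substring with both end indices included.
std::string inclusiveSubstring(const std::string& text,
                               std::size_t first, std::size_t last) {
    // The sketch checks its bounds; both indices must be inside the text.
    assert(first <= last && last < text.size());
    // Both ends are included, so the length is last - first + 1.
    return text.substr(first, last - first + 1);
}
```

    For example, inclusiveSubstring( "Hello World", 0, 4 ) yields "Hello",
    which is five characters even though 4 - 0 is four.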

TJString subString( const unsigned long, const unsigned long )
    This member is equivalent to operator()

TJString subStringTo( const unsigned long )
    This member is equivalent to subString( 0, parameter ).

TJString subStringFrom( const unsigned long )
    This member returns the substring from the parameter to the end of
    the TJString object.

TJString& toUpper()
    This member changes all alphabetic characters ( a - z ) to upper case.

TJString& toLower()
    This member changes all alphabetic characters ( A - Z ) to lower case.

TJString& reverse()
    This member reverses the string.  "Hello World" --> "dlroW olleH"

static const TJString TJNULLString
    This is the constant "".

void setBufferAddSize( const unsigned long newSize )
    When a string is created, extra space is kept at the end of the buffer
    to make any concatenation operations quicker.  This sets the size of
    the extra space.  The minimum value is 16 bytes.
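
    The effect of the extra space can be illustrated with a little
    arithmetic.  The exact allocation policy is not documented here, so
    the sketch below is only a guess at the idea: it assumes the buffer
    holds the characters, a terminating null and the extra space, and that
    a request below the minimum is raised to 16.  The names
    allocationSize, fitsWithoutRealloc and kMinimumAddSize are
    hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Minimum extra space, taken from the description above.
const std::size_t kMinimumAddSize = 16;

// Hypothetical: bytes to allocate for a string of `length` characters.
std::size_t allocationSize(std::size_t length, std::size_t addSize) {
    if (addSize < kMinimumAddSize) {
        addSize = kMinimumAddSize;   // assumption: small requests are raised
    }
    return length + 1 + addSize;     // +1 for the terminating null
}

// True when `extra` more characters fit in the buffer without reallocating.
bool fitsWithoutRealloc(std::size_t capacity, std::size_t length,
                        std::size_t extra) {
    assert(length + 1 <= capacity);  // the current string must already fit
    return length + extra + 1 <= capacity;
}
```

    Under these assumptions an 11 character string with the minimum extra
    space occupies 28 bytes, so up to 16 characters can be appended before
    a new buffer is needed.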

unsigned long bufferAddSize() const
    This member returns the extra buffer size.


TECHNICAL NOTES:
    This library was compiled using the Microsoft Visual C++ compiler V4.
    Code generation is set to blend and structure byte alignment is set
    to 16.  The source code may be obtained for a small fee.  E-mail to 
    wortcook@aol.com.  The member bodies were not included in the 
    header as a precaution against unauthorized use.


LEGAL NOTICE:
    This class is freely usable by individuals and educational institutions
    for non-commercial purposes only.  Commercial use of this code is 
    prohibited without permission by the author.


ENHANCEMENTS:
    - iostream support is not included.  This is more a matter of laziness
      on my part than for any technical reason.  Output streams can be
      used due to the inclusion of the const char* cast operator by using
      an explicit cast.  An implicit cast will not work.  If you want, just
      paste the following into your code:

      ostream& operator<<(ostream& out, const TJString& outString )
      {
          return (out<<(const char*)outString);
      }

      istream& operator>>(istream& in, TJString& inString )
      {
          in>>(char*)inString;
          inString=inString;
          return in;
      }

      The >> operator is a bit kludgy.  A better method would be to make
      the operator a friend of TJString and allow it direct access to the
      buffer.

    - Parsing, substring removal and replacement can easily be added 
      through derivation.  I had originally added this functionality but
      later decided that what I really wanted was a simple way to 
      handle strings, not a Swiss Army knife.

    - Index checking is not performed except for generating substrings.
      I did have exceptions included but they were part of a more general
      exception scheme that I did not want to include with this code.  The
      code is not "safe" in this sense but can easily be made so without
      a severe performance hit.

    - Another version of the string class that I have written uses a 
      template array base class for storage.  As with exceptions, I did
      not want to include that code here.

    - TJString should eventually handle 2 byte character codes.  An
      ideal would be a general purpose Unicode-compliant string class
      that could also deal with the traditional 1 byte character codes.
      This would require a larger set of comparison rules, specific to
      each language set as well as a set of rules for inter-language set
      string comparisons.

    - TJString is an ideal candidate to have its own memory management.
      Something that would keep old data buffers in memory to be reused.
      I have it partially written but don't have it completed.

    - This would eventually make a nice DLL once some more related classes
      are added.
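
    Regarding the iostream note above, one alternative to extracting
    through the char* cast is to stage the input in a temporary and then
    assign it.  The sketch below is not the author's method.  readToken is
    a hypothetical helper that works with any string type assignable from
    a const char*, which TJString's documented operator=( const char* )
    suggests it would be.  std::string stands in for TJString in the demo
    so that the example is self-contained.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token into `target`.
// The token is staged in a std::string, so the target's internal buffer is
// never written through a char* cast and cannot be overrun.
template <typename StringT>
std::istream& readToken(std::istream& in, StringT& target) {
    std::string token;
    if (in >> token) {
        target = token.c_str();   // relies on assignment from const char*
    }
    return in;                    // target is left untouched on failure
}

// Small self-check using std::string as the stand-in target type.
void readTokenDemo() {
    std::istringstream input("Hello World");
    std::string word;
    readToken(input, word);
    assert(word == "Hello");
}
```

    An operator>> for TJString could forward to a helper of this kind.
    The cost is one extra copy of each token.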
 
FINAL NOTES:
    - The TJString source code should compile with any compiler out there.
      It uses standard C++ and does not include any platform or compiler
      specific code.  There are many compilers that include a string class
      as part of the application framework.  I wrote this class as a
      replacement for one such framework.  The class that I wrote this to
      replace lacked most of the features that one would normally want to
      include in a string class ( operator[], const char* casts, etc. ).
      As such, this is my attempt.  I welcome all and any comments and
      critiques.

Thomas Jones
wortcook@aol.com

    
   
