Subject: C++ FAQ (#4 of 7)
Date: 6 Nov 1996 23:23:45 GMT
Summary: Please read this before posting to comp.lang.c++

Archive-name: C++-faq/part4
Posting-Frequency: monthly
URL: http://www.cerfnet.com/~mpcline/C++-FAQs-Lite/

AUTHOR: Marshall Cline / cline@parashift.com / Paradigm Shift, Inc. /
One Park St. / Norwood, NY 13668 / 315-353-6100 (voice) / 315-353-6110 (fax)

COPYRIGHT: This posting is part of "C++ FAQs Lite."  The entire "C++ FAQs Lite"
document is Copyright(C) 1991-96 Marshall P. Cline, Ph.D., cline@parashift.com.
All rights reserved.  Copying is permitted only under designated situations.
For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS.  THE AUTHOR PROVIDES NO
WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING
WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR
PURPOSE.

C++-FAQs-Lite != C++-FAQs-Book: This document, C++ FAQs Lite, is not the same
as the C++ FAQs Book.  The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is
500% larger than this document, and is available in bookstores.  For details,
see section [3].

==============================================================================

SECTION [16]: Freestore management


[16.1] Does delete p delete the pointer p, or the pointed-to-data *p?

The pointed-to-data.

The keyword should really be delete_the_thing_pointed_to_by.  The same abuse of
English occurs when freeing the memory pointed to by a pointer in C: free(p)
really means free_the_stuff_pointed_to_by(p).

==============================================================================

[16.2] Can I free() pointers allocated with new? Can I delete pointers
       allocated with malloc()?

No!

It is perfectly legal, moral, and wholesome to use malloc() and delete in the
same program, or to use new and free() in the same program.  But it is illegal,
immoral, and despicable to call free() with a pointer allocated via new, or to
call delete on a pointer allocated via malloc().

Beware! I occasionally get e-mail from people telling me that it works OK for
them on machine X and compiler Y.  That does not make it right! Sometimes
people say, "But I'm just working with an array of char." Nonetheless do not
mix malloc() and delete on the same pointer, or new and free() on the same
pointer! If you allocated via p = new char[n], you must use delete[] p; you
must not use free(p).  Or if you allocated via p = malloc(n), you must use
free(p); you must not use delete[] p or delete p! Mixing these up could cause a
catastrophic failure at runtime if the code was ported to a new machine, a new
compiler, or even a new version of the same compiler.

You have been warned.
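
As a rough illustration of the rule above, here is a minimal sketch in which
both allocation families appear in one function but every pointer goes back to
the family it came from.  The helper names copyWithMalloc(), copyWithNew() and
pairedUse() are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copies s into memory obtained from malloc().
// The caller must release the result with free().
char* copyWithMalloc(const char* s)
{
  char* p = (char*) std::malloc(std::strlen(s) + 1);
  if (p != NULL)
    std::strcpy(p, s);
  return p;
}

// Hypothetical helper: copies s into memory obtained from new[].
// The caller must release the result with delete[].
char* copyWithNew(const char* s)
{
  char* p = new char[std::strlen(s) + 1];
  std::strcpy(p, s);
  return p;
}

// Uses both families side by side and returns the combined length of the
// two copies.  Each pointer is released by the family that allocated it.
std::size_t pairedUse(const char* s)
{
  char* a = copyWithMalloc(s);
  char* b = copyWithNew(s);
  std::size_t total = (a != NULL ? std::strlen(a) : 0) + std::strlen(b);
  std::free(a);    // malloc() pairs with free()
  delete[] b;      // new[] pairs with delete[]
  return total;
}
```

Calling pairedUse("abc") is legal, moral, and wholesome; swapping the last two
release statements would not be.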

==============================================================================

[16.3] Why should I use new instead of trustworthy old malloc()?

Constructors/destructors, type safety, overridability.
 * Constructors/destructors: unlike malloc(sizeof(Fred)), new Fred() calls
   Fred's constructor.  Similarly, delete p calls *p's destructor.
 * Type safety: malloc() returns a void* which isn't type safe.  new Fred()
   returns a pointer of the right type (a Fred*).
 * Overridability: new is an operator that can be overridden by a class, while
   malloc() is not overridable on a per-class basis.
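
The first two points can be sketched as follows.  This is only an
illustration: the value_ member and both function names are assumptions made
for the example.  The malloc() version needs a cast-free placement new and an
explicit destructor call to get what new Fred() and delete p give you for free.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

class Fred {
public:
  Fred() : value_(42) { }             // The constructor establishes the state
  int value() const { return value_; }
private:
  int value_;
};

// new Fred() returns a Fred* whose constructor has already run.
int valueFromNew()
{
  Fred* p = new Fred();
  int v = p->value();
  delete p;                           // Destructor runs, then memory is freed
  return v;
}

// malloc() only hands back raw bytes as a void*.  To get a usable Fred you
// must construct it yourself (placement new) and destroy it yourself.
bool valueFromMalloc()
{
  void* raw = std::malloc(sizeof(Fred));
  if (raw == NULL)
    return false;
  Fred* p = new(raw) Fred();          // Explicit construction in raw memory
  bool ok = (p->value() == 42);
  p->~Fred();                         // Explicit destruction
  std::free(raw);
  return ok;
}
```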

==============================================================================

[16.4] Can I use realloc() on pointers allocated via new?

No!

When realloc() has to copy the allocation, it uses a bitwise copy operation,
which will tear many C++ objects to shreds.  C++ objects should be allowed to
copy themselves.  They use their own copy constructor or assignment operator.

Besides all that, the heap that new uses may not be the same as the heap that
malloc() and realloc() use!
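
If you need to resize an array that came from new[], one option is to allocate
a new array, let the elements copy themselves, and delete the old array.  The
following is a minimal sketch of that idea; resizeArray() is a hypothetical
helper, not a standard function, and it assumes T has a default constructor
and an assignment operator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a new array of newSize elements holding copies
// of the first min(oldSize, newSize) elements of old, then deletes old.
template <class T>
T* resizeArray(T* old, std::size_t oldSize, std::size_t newSize)
{
  T* fresh = new T[newSize];
  std::size_t n = (oldSize < newSize) ? oldSize : newSize;
  try {
    for (std::size_t i = 0; i < n; ++i)
      fresh[i] = old[i];        // Element-wise assignment, not a bitwise copy
  } catch (...) {
    delete[] fresh;             // Don't leak the new block if a copy throws
    throw;
  }
  delete[] old;
  return fresh;
}
```

A typical call looks like p = resizeArray(p, oldSize, newSize).  In practice a
container class does this bookkeeping for you.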

==============================================================================

[16.5] Do I need to check for NULL after p = new Fred()?

No! (But if you have an old compiler, you may have to force the compiler to
have this behavior[16.6]).

It turns out to be a real pain to always write explicit NULL tests after every
new allocation.  Code like the following is very tedious:

    Fred* p = new Fred();
    if (p == NULL)
      throw bad_alloc();

If your compiler doesn't support (or if you refuse to use) exceptions[17], your
code might be even more tedious:

    Fred* p = new Fred();
    if (p == NULL) {
      cerr << "Couldn't allocate memory for a Fred" << endl;
      abort();
    }

Take heart.  In C++, if the runtime system cannot allocate sizeof(Fred) bytes
of memory during p = new Fred(), a bad_alloc exception will be thrown.  Unlike
malloc(), new never returns NULL!

Therefore you should simply write:

    Fred* p = new Fred();   // No need to check if p is NULL

However, if your compiler is old, it may not yet support this.  Find out by
checking your compiler's documentation under "new".  If you have an old
compiler, you may have to force the compiler to have this behavior[16.6].
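
Here is a small sketch of what the caller sees when an allocation fails on a
compiler that throws bad_alloc.  Running a real machine out of memory is not
practical in an example, so this Fred has a class-specific operator new that
always fails; that operator is purely an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Fred {
public:
  // Hypothetical class-specific operator new that always fails, standing in
  // for a freestore that has run out of memory:
  static void* operator new(std::size_t) { throw std::bad_alloc(); }
  static void  operator delete(void*)    { }
};

// Returns true if new Fred() reported failure by throwing bad_alloc.
bool newThrowsOnFailure()
{
  try {
    Fred* p = new Fred();
    delete p;                 // Not reached in this sketch
    return false;
  }
  catch (const std::bad_alloc&) {
    return true;              // Failure arrives as an exception, not as NULL
  }
}
```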

==============================================================================

[16.6] How can I convince my (older) compiler to automatically check new to see
       if it returns NULL? [UPDATED!]

[Recently added comments on constructors of globals; thanks to William Carroll
carrolw@ttc.com (on 10/96).]

Eventually your compiler will.

If you have an old compiler that doesn't automagically perform the NULL
test[16.5], you can force the runtime system to do the test by installing a
"new handler" function.  Your "new handler" function can do anything you want,
such as print a message and abort() the program, delete some objects and return
(in which case operator new will retry the allocation), throw an exception,
etc.

Here's a sample "new handler" that prints a message and calls abort().  The
handler is installed using set_new_handler():

    #include <new.h>        // To get set_new_handler
    #include <stdlib.h>     // To get abort()
    #include <iostream.h>   // To get cerr

    void myNewHandler()
    {
      // This is your own handler.  It can do anything you want.
      // (A "new handler" takes no parameters and returns nothing.)
      cerr << "Attempt to allocate memory failed!" << endl;
      abort();
    }

    int main()
    {
      set_new_handler(myNewHandler);   // Install your "new handler"
      // ...
    }

After the set_new_handler() line is executed, operator new will call your
myNewHandler if/when it runs out of memory.  This means that new will never
return NULL:

    Fred* p = new Fred();   // No need to check if p is NULL

Note: Please use this abort() approach as a last resort.  If your compiler
supports exception handling[17], please consider throwing an exception instead
of calling abort().
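
A throwing handler might look something like the sketch below.  It assumes a
compiler whose library provides the header <new> with std::set_new_handler()
and std::bad_alloc; the names throwingNewHandler() and
installThrowingHandler() are made up for the example.

```cpp
#include <cassert>
#include <new>

// A hypothetical handler that reports failure by throwing instead of aborting.
void throwingNewHandler()
{
  throw std::bad_alloc();
}

// Installs the handler and returns whichever handler was installed before,
// so the caller can restore it later if desired.
std::new_handler installThrowingHandler()
{
  return std::set_new_handler(throwingNewHandler);
}
```

Keeping the returned value around lets you put the previous handler back with
another call to set_new_handler().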

Note: If some global/static object's constructor uses new, it won't use the
myNewHandler() function since that constructor will get called before main()
begins.  Unfortunately there's no convenient way to guarantee that the
set_new_handler() will be called before the first use of new.  For example,
even if you put the set_new_handler() call in the constructor of a global
object, you still don't know if the module ("compilation unit") that contains
that global object will be elaborated first or last or somewhere in between.
Therefore you still don't have any guarantee that your call of
set_new_handler() will happen before any other global's constructor gets
invoked.

==============================================================================

[16.7] Do I need to check for NULL before delete p?

No!

The C++ language guarantees that delete p will do nothing if p is equal to
NULL.  Since you might get the test backwards, and since most testing
methodologies force you to explicitly test every branch point, you should not
put in the redundant if test.

Wrong:

    if (p != NULL)
      delete p;

Right:

    delete p;

==============================================================================

[16.8] What are the two steps that happen when I say delete p?

delete p is a two-step process: it calls the destructor, then releases the
memory.  The code generated for delete p looks something like this (assuming p
is of type Fred*):

    // Original code: delete p;
    if (p != NULL) {
      p->~Fred();
      operator delete(p);
    }

The statement p->~Fred() calls the destructor for the Fred object pointed to by
p.

The statement operator delete(p) calls the memory deallocation primitive,
void operator delete(void* p).  This primitive is similar in spirit to
free(void* p).  (Note, however, that these two are not interchangeable; e.g.,
there is no guarantee that the two memory deallocation primitives even use the
same heap!).

==============================================================================

[16.9] In p = new Fred(), does the Fred memory "leak" if the Fred constructor
       throws an exception?

No.

If an exception occurs during the Fred constructor of p = new Fred(), the C++
language guarantees that the sizeof(Fred) bytes of memory that were allocated
will automagically be released back to the heap.

Here are the details: new Fred() is a two-step process:
 1. sizeof(Fred) bytes of memory are allocated using the primitive
    void* operator new(size_t nbytes).  This primitive is similar in spirit to
    malloc(size_t nbytes).  (Note, however, that these two are not
    interchangeable; e.g., there is no guarantee that the two memory allocation
    primitives even use the same heap!).
 2. It constructs an object in that memory by calling the Fred constructor.
    The pointer returned from the first step is passed as the this parameter to
    the constructor.  This step is wrapped in a try ... catch block to handle
    the case when an exception is thrown during this step.

Thus the actual generated code looks something like:

    // Original code: Fred* p = new Fred();
    Fred* p = (Fred*) operator new(sizeof(Fred));
    try {
      new(p) Fred();       // Placement new[11.10]
    } catch (...) {
      operator delete(p);  // Deallocate the memory
      throw;               // Re-throw the exception
    }

The statement marked "Placement new[11.10]" calls the Fred constructor.  The
pointer p becomes the this pointer inside the constructor, Fred::Fred().
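
One way to watch this guarantee at work is to give Fred its own operator new
and operator delete that count outstanding allocations.  The sketch below does
that; the outstanding counter and the fail parameter are assumptions made for
the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

class Fred {
public:
  explicit Fred(bool fail)
  {
    if (fail)
      throw std::runtime_error("Fred constructor failed");
  }

  // Class-specific allocation primitives that count outstanding blocks:
  static void* operator new(std::size_t nbytes)
  {
    void* p = std::malloc(nbytes);
    if (p == NULL)
      throw std::bad_alloc();
    ++outstanding;
    return p;
  }
  static void operator delete(void* p)
  {
    if (p != NULL) {
      --outstanding;
      std::free(p);
    }
  }

  static int outstanding;   // Hypothetical instrumentation
};

int Fred::outstanding = 0;

// Tries to create a Fred whose constructor throws, then reports how many
// allocations are still outstanding.
int outstandingAfterFailedNew()
{
  try {
    Fred* p = new Fred(true);   // Constructor throws
    delete p;                   // Not reached
  }
  catch (const std::runtime_error&) {
    // operator delete has already been called on the raw memory by now
  }
  return Fred::outstanding;
}
```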

==============================================================================

[16.10] How do I allocate / unallocate an array of things?

Use p = new T[n] and delete[] p:

    Fred* p = new Fred[100];
    // ...
    delete[] p;

Any time you allocate an array of objects via new (usually with the [n] in the
new expression), you must use [] in the delete statement.  This syntax is
necessary because there is no syntactic difference between a pointer to a thing
and a pointer to an array of things (something we inherited from C).
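
To make the pairing concrete, this sketch counts constructor and destructor
calls.  The live counter is an assumption of the example: new Fred[n]
constructs n objects and delete[] p destructs all n of them.

```cpp
#include <cassert>

class Fred {
public:
  Fred()  { ++live; }
 ~Fred()  { --live; }
  static int live;     // Hypothetical instrumentation: Freds currently alive
};

int Fred::live = 0;

// Allocates n Freds, notes how many are alive, then releases them with
// delete[].  Returns the number that were alive while the array existed.
int liveWhileArrayExists(unsigned n)
{
  Fred* p = new Fred[n];
  int seen = Fred::live;
  delete[] p;          // Runs all n destructors, then releases the memory
  return seen;
}
```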

==============================================================================

[16.11] What if I forget the [] when deleting an array allocated via new T[n]?

All life comes to a catastrophic end.

It is the programmer's --not the compiler's-- responsibility to get the
connection between new T[n] and delete[] p correct.  If you get it wrong,
neither a compile-time nor a run-time error message will be generated by the
compiler.  Heap corruption is a likely result.  Or worse.  Your program will
probably die.

==============================================================================

[16.12] Can I drop the [] when deleting an array of some built-in type (char,
        int, etc)?

No!

Sometimes programmers think that the [] in the delete[] p only exists so the
compiler will call the appropriate destructors for all elements in the array.
Because of this reasoning, they assume that an array of some built-in type such
as char or int can be deleted without the [].  E.g., they assume the following
is valid code:

    void userCode(int n)
    {
      char* p = new char[n];
      // ...
      delete p;     // <---- ERROR! Should be delete[] p !
    }

But the above code is wrong, and it can cause a disaster at runtime.  In
particular, the code that's called for delete p is operator delete(void*), but
the code that's called for delete[] p is operator delete[](void*).  The default
behavior for the latter is to call the former, but users are allowed to replace
the latter with a different behavior (in which case they would normally also
replace the corresponding new code in operator new[](size_t)).  If they
replaced the delete[] code so it wasn't compatible with the delete code, and
you called the wrong one (i.e., if you said delete p rather than delete[] p),
you could end up with a disaster at runtime.

==============================================================================

[16.13] After p = new Fred[n], how does the compiler know there are n objects
        to be destructed during delete[] p?

Short answer: Magic.

Long answer: The run-time system stores the number of objects, n, somewhere
where it can be retrieved if you only know the pointer, p.  There are two
popular techniques that do this.  Both these techniques are in use by
commercial grade compilers, both have tradeoffs, and neither is perfect.  These
techniques are:
 1. Over-allocate the array and put n just to the left of the first Fred
    object[33.4].
 2. Use an associative array with p as the key and n as the value[33.5].
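
To give a feel for the over-allocation technique, here is a hand-written
sketch of the idea.  It is not what any particular compiler generates: the
functions allocateFreds() and deallocateFreds() are hypothetical, and the
sketch assumes a size_t header keeps the first Fred suitably aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Fred {
public:
  Fred()  { ++live; }
 ~Fred()  { --live; }
  static int live;     // Hypothetical instrumentation: Freds currently alive
};

int Fred::live = 0;

// Assumption: a header of this size keeps the first Fred properly aligned.
const std::size_t headerBytes = sizeof(std::size_t);

// Over-allocates, stores n in the header, and constructs n Freds after it.
Fred* allocateFreds(std::size_t n)
{
  char* raw = (char*) operator new(headerBytes + n * sizeof(Fred));
  new(raw) std::size_t(n);                     // Remember n "to the left"
  Fred* first = (Fred*) (raw + headerBytes);
  std::size_t i = 0;
  try {
    for (; i < n; ++i)
      new(first + i) Fred();                   // Construct each element
  } catch (...) {
    while (i > 0)
      first[--i].~Fred();                      // Undo the ones that succeeded
    operator delete(raw);
    throw;
  }
  return first;
}

// Recovers n from the header, destructs n Freds, and releases the memory.
void deallocateFreds(Fred* first)
{
  if (first == NULL)
    return;
  char* raw = (char*) first - headerBytes;
  std::size_t n = *(std::size_t*) raw;
  while (n > 0)
    first[--n].~Fred();                        // Destruct in reverse order
  operator delete(raw);
}
```

The caller only ever sees the pointer to the first Fred, yet the count can be
found again when it is time to destruct the elements.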

==============================================================================

[16.14] Is it legal (and moral) for a member function to say delete this?

As long as you're careful, it's OK for an object to commit suicide (delete
this).

Here's how I define "careful":
 1. You must be absolutely 100% positive sure that this object was allocated
    via new (not by new[], nor by placement new[11.10], nor a local object on
    the stack, nor a global, nor a member of another object; but by plain
    ordinary new).
 2. You must be absolutely 100% positive sure that your member function will be
    the last member function invoked on this object.
 3. You must be absolutely 100% positive sure that the rest of your member
    function (after the delete this line) doesn't touch any piece of this
    object (including calling any other member functions or touching any data
    members).
 4. You must be absolutely 100% positive sure that no one even touches the this
    pointer itself after the delete this line.  In other words, you must not
    examine it, compare it with another pointer, compare it with NULL, print
    it, cast it, do anything with it.

Naturally the usual caveats apply in cases where your this pointer is a pointer
to a base class when you don't have a virtual destructor[20.4].
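
Here is a minimal sketch of a class that follows these rules.  Making the
constructor and destructor private is one possible way (an assumption of this
example, not a requirement) to help with rule 1, since it keeps users from
creating a Fred as a local, a global, or a member object.  The names create(),
destroy() and live are made up for the sketch.

```cpp
#include <cassert>

class Fred {
public:
  static Fred* create() { return new Fred(); }   // Plain ordinary new

  void destroy()
  {
    delete this;
    // Nothing after this line may touch *this or the this pointer itself.
  }

  static int live;     // Hypothetical instrumentation: Freds currently alive

private:
  Fred()  { ++live; }
 ~Fred()  { --live; }  // Private: only Fred's own members can delete a Fred
};

int Fred::live = 0;
```

After p->destroy() the caller must treat p as dead, exactly as it would after
delete p.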

==============================================================================

[16.15] How do I allocate multidimensional arrays using new?

There are many ways to do this, depending on how flexible you want the array
sizing to be.  On one extreme, if you know all the dimensions at compile-time,
you can allocate multidimensional arrays statically (as in C):

    class Fred { /*...*/ };

    void manipulateArray()
    {
      Fred matrix[10][20];

      // Use matrix[i][j]...

      // No need for explicit deallocation
    }

On the other extreme, if you want to allow the various slices of the matrix to
have different sizes, you can allocate everything off the freestore.  For
example, in the following function, nrows is the number of rows in the array
(i.e., the valid row numbers are from 0 to nrows-1 inclusive), and array
element ncols[r] is the number of columns in row r (where r is in the range 0
to nrows-1 inclusive):

    void manipulateArray(unsigned nrows, unsigned ncols[])
    {
      Fred** matrix = new Fred*[nrows];
      for (unsigned r = 0; r < nrows; ++r)
        matrix[r] = new Fred[ ncols[r] ];

      // ...

      // Deletion is the opposite of allocation:
      for (unsigned r = nrows; r > 0; --r)
        delete[] matrix[r-1];
      delete[] matrix;
    }

Note the funny use of matrix[r-1] in the deletion process.  This prevents
wrap-around of the unsigned value when r goes one step past zero.
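
Between these two extremes, one possible middle ground is a rectangular matrix
whose dimensions are both chosen at run-time, stored in a single block and
wrapped in a class so the delete[] can't be forgotten.  The Matrix class below
is a hypothetical sketch of that idea (it holds double rather than Fred, and
does no bounds checking).

```cpp
#include <cassert>
#include <cstddef>

class Matrix {
public:
  Matrix(unsigned nrows, unsigned ncols)
    : nrows_(nrows), ncols_(ncols),
      data_(new double[(std::size_t) nrows * ncols])
  {
    for (std::size_t i = 0; i < (std::size_t) nrows * ncols; ++i)
      data_[i] = 0.0;
  }

 ~Matrix() { delete[] data_; }       // One new[], one delete[]

  // Element access: row r, column c (row-major layout)
  double& operator() (unsigned r, unsigned c)
    { return data_[(std::size_t) r * ncols_ + c]; }
  double  operator() (unsigned r, unsigned c) const
    { return data_[(std::size_t) r * ncols_ + c]; }

  unsigned nrows() const { return nrows_; }
  unsigned ncols() const { return ncols_; }

private:
  Matrix(const Matrix&);             // Copying disabled in this sketch
  Matrix& operator= (const Matrix&); // (declared but never defined)

  unsigned nrows_, ncols_;
  double*  data_;
};
```

Users write m(i,j) rather than m[i][j], and the destructor handles the
deallocation.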

==============================================================================

[16.16] Does C++ have arrays whose length can be specified at run-time?

Yes, in the sense that STL[32.1] has a vector template that provides this
behavior.

No, in the sense that built-in array types need to have their length specified
at compile time.

Yes, in the sense that even built-in array types can specify the first index
bounds at run-time.  E.g., comparing with the previous FAQ, if you only need
the first array dimension to vary then you can just ask new for an array of
arrays, rather than an array of pointers to arrays:

    const unsigned ncols = 100;           // ncols = number of columns in the array

    class Fred { /*...*/ };

    void manipulateArray(unsigned nrows)  // nrows = number of rows in the array
    {
      Fred (*matrix)[ncols] = new Fred[nrows][ncols];
      // ...
      delete[] matrix;
    }

You can't do this if you need anything other than the first dimension of the
array to change at run-time.

But please, don't use arrays unless you have to.  Arrays are evil[21.5].  Use
some object of some class if you can.  Use arrays only when you have to.
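
As an example of the first "Yes" above, here is a short sketch using the
vector template.  It assumes your compiler ships the <vector> header; the
function makeSquares() is made up for the example.

```cpp
#include <cassert>
#include <vector>

// Returns a vector of n elements, where n is only known at run-time.
// Element i holds i*i.
std::vector<int> makeSquares(unsigned n)
{
  std::vector<int> v(n);             // Length chosen at run-time
  for (unsigned i = 0; i < n; ++i)
    v[i] = (int) (i * i);
  return v;                          // No delete[] needed anywhere
}
```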

==============================================================================

[16.17] How can I force objects of my class to always be created via new rather
        than as locals or global/static objects?

Use the Named Constructor Idiom[10.6].

As usual with the Named Constructor Idiom, the constructors are all private: or
protected:, and there are one or more public static create() methods (the
so-called "named constructors"), one per constructor.  In this case the
create() methods allocate the objects via new.  Since the constructors
themselves are not public, there is no other way to create objects of the
class.

    class Fred {
    public:
      // The create() methods are the "named constructors":
      static Fred* create()                 { return new Fred();     }
      static Fred* create(int i)            { return new Fred(i);    }
      static Fred* create(const Fred& fred) { return new Fred(fred); }
      // ...

    private:
      // The constructors themselves are private or protected:
      Fred();
      Fred(int i);
      Fred(const Fred& fred);
      // ...
    };

Now the only way to create Fred objects is via Fred::create():

    int main()
    {
      Fred* p = Fred::create(5);
      // ...
      delete p;
    }

Make sure your constructors are in the protected: section if you expect Fred to
have derived classes.

Note also that you can make another class Wilma a friend[14] of Fred if you
want to allow a Wilma to have a member object of class Fred, but of course this
is a softening of the original goal, namely to force Fred objects to be
allocated via new.

==============================================================================

[16.18] How do I do simple reference counting?

If all you want is the ability to pass around a bunch of pointers to the same
object, with the feature that the object will automagically get deleted when
the last pointer to it disappears, you can use something like the following
"smart pointer" class:

    // Fred.h

    class FredPtr;

    class Fred {
    public:
      Fred() : count_(0) /*...*/ { }  // All ctors set count_ to 0 !
      // ...
    private:
      friend class FredPtr;     // A friend class[14]
      unsigned count_;
      // count_ must be initialized to 0 by all constructors
      // count_ is the number of FredPtr objects that point at this
    };

    class FredPtr {
    public:
      Fred* operator-> () { return p_; }
      Fred& operator* ()  { return *p_; }
      FredPtr(Fred* p)    : p_(p) { ++p_->count_; }   // p must not be NULL
     ~FredPtr()           { if (--p_->count_ == 0) delete p_; }
      FredPtr(const FredPtr& p) : p_(p.p_) { ++p_->count_; }
      FredPtr& operator= (const FredPtr& p)
            { // DO NOT CHANGE THE ORDER OF THESE STATEMENTS!
              // (This order properly handles self-assignment[12.1])
              ++p.p_->count_;
              if (--p_->count_ == 0) delete p_;
              p_ = p.p_;
              return *this;
            }
    private:
      Fred* p_;    // p_ is never NULL
    };

Naturally you can use nested classes to rename FredPtr to Fred::Ptr.

Note that you can soften the "never NULL" rule above with a little more
checking in the copy constructor, assignment operator, and destructor.  If you
do that, you might as well put a p_ != NULL check into the "*" and "->"
operators (at least as an assert()).  I would recommend against an
operator Fred*() method, since that would let people accidentally get at the
Fred*.

If you want to be really safe, you can make all of Fred's constructors private,
and for each constructor have a public (static) create() method.  These
create() methods would return a FredPtr rather than a Fred*.  That way the only
way anyone could create a Fred object would be to get a FredPtr
("Fred* p = new Fred()" would be replaced by "FredPtr p = Fred::create()").
Thus no one could accidentally subvert the reference counted mechanism by
getting a Fred*.  For example, if Fred had a Fred::Fred() and a
Fred::Fred(int i, int j), the changes to class Fred would be:

    class Fred {
    public:
      static FredPtr create()             { return new Fred(); }
      static FredPtr create(int i, int j) { return new Fred(i,j); }
      // ...
    private:
      Fred();
      Fred(int i, int j);
      // ...
    };

The end result is that you now have a way to use simple reference counting to
provide "pointer semantics" for a given object.  Users of your Fred class
explicitly use FredPtr objects, which act more or less like Fred* pointers.
The benefit is that users can make as many copies of their FredPtr "smart
pointer" objects as they like, and the pointed-to Fred object will
automagically get deleted when the last such FredPtr object vanishes.

If you'd rather give your users "reference semantics" rather than "pointer
semantics," you can use reference counting to provide "copy on write"[16.19].
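
To see the pointer-semantics scheme from end to end, here is a compact,
self-contained sketch with a usage function.  The live counter and
liveWhileShared() are assumptions added for the demonstration.  Note that in
this sketch the FredPtr(Fred*) constructor increments count_, so that count_
always equals the number of FredPtr objects pointing at the Fred.

```cpp
#include <cassert>

class FredPtr;

class Fred {
public:
  Fred() : count_(0) { ++live; }
 ~Fred()             { --live; }
  static int live;       // Hypothetical instrumentation: Freds currently alive
private:
  friend class FredPtr;
  unsigned count_;       // Number of FredPtr objects that point at this
};

int Fred::live = 0;

class FredPtr {
public:
  FredPtr(Fred* p) : p_(p)             { ++p_->count_; }  // p must not be NULL
  FredPtr(const FredPtr& p) : p_(p.p_) { ++p_->count_; }
 ~FredPtr()                            { if (--p_->count_ == 0) delete p_; }
  FredPtr& operator= (const FredPtr& p)
  {
    // Increment first so self-assignment is handled properly
    ++p.p_->count_;
    if (--p_->count_ == 0) delete p_;
    p_ = p.p_;
    return *this;
  }
  Fred* operator-> () { return p_; }
  Fred& operator* ()  { return *p_; }
private:
  Fred* p_;              // p_ is never NULL
};

// Returns the number of live Freds while three FredPtrs share one Fred.
int liveWhileShared()
{
  FredPtr a(new Fred());
  FredPtr b = a;           // a and b point at the same Fred
  FredPtr c(new Fred());   // A second Fred, for the moment
  c = a;                   // The second Fred is deleted; a, b, c now share one
  return Fred::live;
}                          // Last FredPtr vanishes: the shared Fred is deleted
```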

==============================================================================

[16.19] How do I provide reference counting with copy-on-write semantics?

The previous FAQ[16.18] presented a simple reference counting scheme that provided users
with pointer semantics.  This FAQ describes an approach that provides users
with reference semantics.

The basic idea is to allow users to think they're copying your Fred objects,
but in reality the underlying implementation doesn't actually do any copying
unless and until some user actually tries to modify the underlying Fred object.

Class Fred::Data houses all the data that would normally go into the Fred
class.  Fred::Data also has an extra data member, count_, to manage the
reference counting.  Class Fred ends up being a "smart reference" that
(internally) points to a Fred::Data.

    class Fred {
    public:

      Fred();                               // A default constructor[10.4]
      Fred(int i, int j);                   // A normal constructor

      Fred(const Fred& f);
      Fred& operator= (const Fred& f);
     ~Fred();

      void sampleInspectorMethod() const;   // No changes to this object
      void sampleMutatorMethod();           // Change this object

      // ...

    private:

      class Data {
      public:
        Data();
        Data(int i, int j);

        // Since only Fred can access a Fred::Data object,
        // you can make Fred::Data's data public if you want.
        // But if that makes you uncomfortable, make the data private
        // and make Fred a friend class[14] via friend Fred;
        // ...

        unsigned count_;
        // count_ must be initialized to 1 by all constructors
        // count_ is the number of Fred objects that point at this
      };

      Data* data_;
    };

    Fred::Data::Data()             : count_(1) /*init other data*/ { }
    Fred::Data::Data(int i, int j) : count_(1) /*init other data*/ { }

    Fred::Fred()             : data_(new Data()) { }
    Fred::Fred(int i, int j) : data_(new Data(i, j)) { }

    Fred::Fred(const Fred& f)
      : data_(f.data_)
    {
      ++ data_->count_;
    }

    Fred& Fred::operator= (const Fred& f)
    {
      // DO NOT CHANGE THE ORDER OF THESE STATEMENTS!
      // (This order properly handles self-assignment[12.1])
      ++ f.data_->count_;
      if (--data_->count_ == 0) delete data_;
      data_ = f.data_;
      return *this;
    }

    Fred::~Fred()
    {
      if (--data_->count_ == 0) delete data_;
    }

    void Fred::sampleInspectorMethod() const
    {
      // This method promises ("const") not to change anything in *data_
      // Other than that, any data access would simply use "data_->..."
    }

    void Fred::sampleMutatorMethod()
    {
      // This method might need to change things in *data_
      // Thus it first checks if this is the only pointer to *data_
      if (data_->count_ > 1) {
        Data* d = new Data(*data_);    // Invoke Fred::Data's copy ctor
        d->count_ = 1;                 // The copy starts with a single owner
        -- data_->count_;
        data_ = d;
      }
      assert(data_->count_ == 1);

      // Now the method proceeds to access "data_->..." as normal
    }

If it is fairly common to call Fred's default constructor[10.4], you can avoid
all those new calls by sharing a common Fred::Data object for all Freds that
are constructed via Fred::Fred().  To avoid static initialization order
problems, this shared Fred::Data object is created "on first use" inside a
function.  Here are the changes that would be made to the above code (note that
the shared Fred::Data object's destructor is never invoked; if that is a
problem, either hope you don't have any static initialization order problems,
or drop back to the approach described above):

    class Fred {
    public:
      // ...
    private:
      // ...
      static Data* defaultData();
    };

    Fred::Fred() : data_(defaultData()) { }

    Fred::Data* Fred::defaultData()
    {
      static Data* p = NULL;
      if (p == NULL) {
        p = new Data();
        ++ p->count_;    // Make sure it never goes to zero
      }
      return p;
    }

Note: You can also provide reference counting for a hierarchy of classes[16.20]
if your Fred class would normally have been a base class.
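
To see copy-on-write behave as described, here is a compact, self-contained
sketch of the basic scheme with two int data members.  The member names i_ and
j_, the methods sum(), setI() and sharesDataWith(), and the private helper
detach() are all assumptions made for the example.  In this sketch Fred::Data
has an explicit copy constructor that starts the copy's count_ at 1.

```cpp
#include <cassert>

class Fred {
public:
  Fred(int i, int j)   : data_(new Data(i, j)) { }
  Fred(const Fred& f)  : data_(f.data_) { ++data_->count_; }
  Fred& operator= (const Fred& f)
  {
    // Increment first so self-assignment is handled properly
    ++f.data_->count_;
    if (--data_->count_ == 0) delete data_;
    data_ = f.data_;
    return *this;
  }
 ~Fred() { if (--data_->count_ == 0) delete data_; }

  int sum() const { return data_->i_ + data_->j_; }    // Inspector

  void setI(int i)                                     // Mutator
  {
    detach();              // Make sure nobody else sees the change
    data_->i_ = i;
  }

  // Hypothetical method, only here so the sharing can be observed:
  bool sharesDataWith(const Fred& f) const { return data_ == f.data_; }

private:
  class Data {
  public:
    Data(int i, int j)  : count_(1), i_(i), j_(j) { }
    Data(const Data& d) : count_(1), i_(d.i_), j_(d.j_) { }  // count_ is NOT copied
    unsigned count_;       // Number of Fred objects that point at this
    int i_, j_;
  };

  // If the Data is shared, replace it with a private copy.
  void detach()
  {
    if (data_->count_ > 1) {
      Data* d = new Data(*data_);
      --data_->count_;
      data_ = d;
    }
    assert(data_->count_ == 1);
  }

  Data* data_;             // Never NULL
};
```

Copying a Fred is cheap (a pointer copy and an increment); the real copy only
happens the first time one of the sharers calls a mutator.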

==============================================================================

[16.20] How do I provide reference counting with copy-on-write semantics for a
        hierarchy of classes?

The previous FAQ[16.19] presented a reference counting scheme that provided
users with reference semantics, but did so for a single class rather than for a
hierarchy of classes.  This FAQ extends the previous technique to allow for a
hierarchy of classes.  The basic difference is that Fred::Data is now the root
of a hierarchy of classes, which will probably cause it to have some virtual[20]
functions.  Note that class Fred itself will still not have any virtual
functions.

The Virtual Constructor Idiom[20.5] is used to make copies of the Fred::Data
objects.  To select which derived class to create, the sample code below uses
the Named Constructor Idiom[10.6], but other techniques are possible (a switch
statement in the constructor, etc).  The sample code assumes two derived
classes: Der1 and Der2.  Methods in the derived classes are unaware of the
reference counting.

    class Fred {
    public:

      static Fred create1(String s, int i);
      static Fred create2(float x, float y);

      Fred(const Fred& f);
      Fred& operator= (const Fred& f);
     ~Fred();

      void sampleInspectorMethod() const;   // No changes to this object
      void sampleMutatorMethod();           // Change this object

      // ...

    private:

      class Data {
      public:
        Data() : count_(1) { }
        virtual ~Data() { }                              // A virtual destructor[20.4]
        virtual Data* clone() const = 0;                 // A virtual constructor[20.5]
        virtual void sampleInspectorMethod() const = 0;  // A pure virtual function[22.4]
        virtual void sampleMutatorMethod() = 0;
      private:
        friend class Fred; // Fred manages count_ on behalf of all Data objects
        unsigned count_;   // count_ doesn't need to be protected
      };

      class Der1 : public Data {
      public:
        Der1(String s, int i);
        virtual void sampleInspectorMethod() const;
        virtual void sampleMutatorMethod();
        virtual Data* clone() const;
        // ...
      };

      class Der2 : public Data {
      public:
        Der2(float x, float y);
        virtual void sampleInspectorMethod() const;
        virtual void sampleMutatorMethod();
        virtual Data* clone() const;
        // ...
      };

      Fred(Data* data);
      // Creates a Fred smart-reference that owns *data
      // It is private to force users to use a createXXX() method
      // Requirement: data must not be NULL

      Data* data_;   // Invariant: data_ is never NULL
    };

    Fred::Fred(Data* data) : data_(data)  { assert(data != NULL); }

    Fred Fred::create1(String s, int i)   { return Fred(new Der1(s, i)); }
    Fred Fred::create2(float x, float y)  { return Fred(new Der2(x, y)); }

    Fred::Data* Fred::Der1::clone() const { return new Der1(*this); }
    Fred::Data* Fred::Der2::clone() const { return new Der2(*this); }

    Fred::Fred(const Fred& f)
      : data_(f.data_)
    {
      ++ data_->count_;
    }

    Fred& Fred::operator= (const Fred& f)
    {
      // DO NOT CHANGE THE ORDER OF THESE STATEMENTS!
      // (This order properly handles self-assignment[12.1])
      ++ f.data_->count_;
      if (--data_->count_ == 0) delete data_;
      data_ = f.data_;
      return *this;
    }

    Fred::~Fred()
    {
      if (--data_->count_ == 0) delete data_;
    }

    void Fred::sampleInspectorMethod() const
    {
      // This method promises ("const") not to change anything in *data_
      // Therefore we simply "pass the method through" to *data_:
      data_->sampleInspectorMethod();
    }

    void Fred::sampleMutatorMethod()
    {
      // This method might need to change things in *data_
      // Thus it first checks if this is the only pointer to *data_
      if (data_->count_ > 1) {
        Data* d = data_->clone();   // The Virtual Constructor Idiom[20.5]
        d->count_ = 1;              // The clone starts with a single owner
        -- data_->count_;
        data_ = d;
      }
      assert(data_->count_ == 1);

      // Now we "pass the method through" to *data_:
      data_->sampleMutatorMethod();
    }

Naturally the constructors and sampleXXX methods for Fred::Der1 and Fred::Der2
will need to be implemented in whatever way is appropriate.
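
For readers who want something they can compile, here is a trimmed-down sketch
of the same design.  It makes several assumptions of its own: Der1 holds an
int, Der2 holds a double, the sample methods are called value() and scale(),
and Fred::Data has a copy constructor that starts every clone's count_ at 1.

```cpp
#include <cassert>
#include <cstddef>

class Fred {
public:
  static Fred create1(int i)    { return Fred(new Der1(i)); }
  static Fred create2(double x) { return Fred(new Der2(x)); }

  Fred(const Fred& f) : data_(f.data_) { ++data_->count_; }
  Fred& operator= (const Fred& f)
  {
    // Increment first so self-assignment is handled properly
    ++f.data_->count_;
    if (--data_->count_ == 0) delete data_;
    data_ = f.data_;
    return *this;
  }
 ~Fred() { if (--data_->count_ == 0) delete data_; }

  double value() const { return data_->value(); }  // Inspector: pass through

  void scale(int k)                                // Mutator
  {
    if (data_->count_ > 1) {
      Data* d = data_->clone();    // The Virtual Constructor Idiom
      --data_->count_;
      data_ = d;
    }
    assert(data_->count_ == 1);
    data_->scale(k);               // Now pass the method through
  }

private:
  class Data {
  public:
    Data()            : count_(1) { }
    Data(const Data&) : count_(1) { }   // A clone starts with a single owner
    virtual ~Data() { }
    virtual Data*  clone() const = 0;
    virtual double value() const = 0;
    virtual void   scale(int k)  = 0;
  private:
    friend class Fred;                  // Only Fred touches count_
    unsigned count_;
  };

  class Der1 : public Data {
  public:
    Der1(int i) : i_(i) { }
    virtual Data*  clone() const { return new Der1(*this); }
    virtual double value() const { return i_; }
    virtual void   scale(int k)  { i_ *= k; }
  private:
    int i_;
  };

  class Der2 : public Data {
  public:
    Der2(double x) : x_(x) { }
    virtual Data*  clone() const { return new Der2(*this); }
    virtual double value() const { return x_; }
    virtual void   scale(int k)  { x_ *= k; }
  private:
    double x_;
  };

  Fred(Data* data) : data_(data) { assert(data != NULL); }

  Data* data_;   // Invariant: data_ is never NULL
};
```

Note that Der1 and Der2 never mention count_; all the reference counting lives
in Fred and in the Data base class.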

==============================================================================

SECTION [17]: Exceptions and error handling


[17.1] How can I handle a constructor that fails?

Throw an exception.

Constructors don't have a return type, so it's not possible to use error codes.
The best way to signal constructor failure is therefore to throw an exception.

If you don't have or won't use exceptions, here's a work-around.  If a
constructor fails, the constructor can put the object into a "zombie" state.
Do this by setting an internal status bit so the object acts sort of like it's
dead even though it is technically still alive.  Then add a query ("inspector")
member function to check this "zombie" bit so users of your class can find out
if their object is truly alive, or if it's a zombie (i.e., a "living dead"
object).  Also you'll probably want to have your other member functions check
this zombie bit, and, if the object isn't really alive, do a no-op (or perhaps
something more obnoxious such as abort()).  This is really ugly, but it's the
best you can do if you can't (or don't want to) use exceptions.
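
Both approaches can be sketched side by side.  The size parameter, the rule
that it must not be negative, and the names ZombieFred, isAlive() and
constructionFails() are all assumptions made for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Preferred: the constructor throws, so no half-built Fred ever exists.
class Fred {
public:
  explicit Fred(int size) : size_(size)
  {
    if (size < 0)
      throw std::invalid_argument("Fred: size must not be negative");
  }
  int size() const { return size_; }
private:
  int size_;
};

// Work-around: the constructor records failure in a "zombie" bit.
class ZombieFred {
public:
  explicit ZombieFred(int size) : size_(size), alive_(size >= 0) { }
  bool isAlive() const { return alive_; }          // The inspector
  int  size() const    { return alive_ ? size_ : 0; }  // No-op-ish when dead
private:
  int  size_;
  bool alive_;
};

// Returns true if constructing a Fred with this size fails.
bool constructionFails(int size)
{
  try {
    Fred f(size);
    return false;
  }
  catch (const std::invalid_argument&) {
    return true;
  }
}
```

With the zombie version every caller has to remember to ask isAlive(); with
the throwing version nobody can forget.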

==============================================================================

[17.2] How should I handle resources if my constructors may throw exceptions?

Every data member inside your object should clean up its own mess.

If a constructor throws an exception, the object's destructor is not run.  If
your object has already done something that needs to be undone (such as
allocating some memory, opening a file, or locking a semaphore), this "stuff
that needs to be undone" must be remembered by a data member inside the object.

For example, rather than allocating memory into a raw Fred* data member, put
the allocated memory into a "smart pointer" member object, and the destructor
of this smart pointer will delete the Fred object when the smart pointer dies.
The standard class auto_ptr is an example of such a "smart pointer" class.
You can also write your own reference counting smart pointer[16.18].  You can
also use smart pointers to "point" to disk records or objects on other
machines[13.3].
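
The following sketch shows the idea with a deliberately tiny, hand-written
owning pointer rather than any particular library class.  FredPtr, Owner, the
liveFreds counter, and leakedAfterConstruction() are hypothetical names used
only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

int liveFreds = 0;   // Counts Fred objects currently alive (demonstration only)

class Fred {
public:
  Fred()  { ++liveFreds; }
  ~Fred() { --liveFreds; }
};

// A minimal owning "smart pointer"; copying is disabled to keep it short.
class FredPtr {
public:
  explicit FredPtr(Fred* p) : p_(p) { }
  ~FredPtr() { delete p_; }
private:
  FredPtr(const FredPtr&);              // Declared but not implemented
  FredPtr& operator=(const FredPtr&);   // Declared but not implemented
  Fred* p_;
};

class Owner {
public:
  explicit Owner(bool fail)
    : fred_(new Fred())                 // The member is fully constructed here
  {
    if (fail)
      throw std::runtime_error("Owner constructor failed");
    // After the throw, ~Owner() does NOT run, but ~FredPtr() does
  }
private:
  FredPtr fred_;
};

// Returns how many Fred objects are left over after an Owner comes and goes.
int leakedAfterConstruction(bool fail)
{
  int before = liveFreds;
  try {
    Owner o(fail);
  } catch (const std::runtime_error&) {
    // Expected when fail is true
  }
  return liveFreds - before;
}
```

If fred_ were a raw Fred* instead, the failing case would leak one Fred, since
nothing would be left to delete it.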

==============================================================================

[17.3] How do I change the string-length of an array of char to prevent memory
       leaks even if/when someone throws an exception?

If what you really want to do is work with strings, don't use an array of char
in the first place, since arrays are evil[21.5].  Instead use an object of some
string-like class.

For example, suppose you want to get a copy of a string, fiddle with the copy,
then append another string to the end of the fiddled copy.  The array-of-char
approach would look something like this:

    #include <cstring>          // For strlen() and strcpy()

    void userCode(const char* s1, const char* s2)
    {
      // Get a copy of s1 into a new string called copy:
      char* copy = new char[strlen(s1) + 1];
      strcpy(copy, s1);

      // Now that we have a local pointer to freestore-allocated memory,
      // we need to use a try block to prevent memory leaks:
      try {

        // Now we fiddle with copy for a while...
        // ...

        // Later we want to append s2 onto the fiddled-with copy:
        // ... [Here's where people want to reallocate copy] ...
        char* copy2 = new char[strlen(copy) + strlen(s2) + 1];
        strcpy(copy2, copy);
        strcpy(copy2 + strlen(copy), s2);
        delete[] copy;
        copy = copy2;

        // Finally we fiddle with copy again...
        // ...

      } catch (...) {
        delete[] copy;   // Prevent memory leaks if we got an exception
        throw;           // Re-throw the current exception
      }

      delete[] copy;     // Prevent memory leaks if we did NOT get an exception
    }

Using char*s like this is tedious and error-prone.  Why not just use an object
of some string class? Your compiler probably supplies a string-like class, and
it's probably just as fast and certainly it's a lot simpler and safer than the
char* code that you would have to write yourself.  For example, if you're using
the string class from the standardization committee[6.12], your code might look
something like this:

    #include <string>           // Let the compiler see class string
    using namespace std;

    void userCode(const string& s1, const string& s2)
    {
      // Get a copy of s1 into a new string called copy:
      string copy = s1;         // NOTE: we do NOT need a try block!

      // Now we fiddle with copy for a while...
      // ...

      // Later we want to append s2 onto the fiddled-with copy:
      copy += s2;               // NOTE: we do NOT need to reallocate memory!

      // Finally we fiddle with copy again...
      // ...
    }                           // NOTE: we do NOT need to delete[] anything!

==============================================================================

SECTION [18]: Const correctness


[18.1] What is "const correctness"?

A good thing.  It means using the keyword const to prevent const objects from
getting mutated.

For example, if you want to create a function f() that accepts a String, and
you want to promise callers that f() won't change the String they pass in, you
can have f() receive its String parameter . . .
 * void f(String s);             // Pass by value
 * void f(const String& s);      // Pass by reference-to-const
 * void f(const String* sptr);   // Pass by pointer-to-const

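With the reference-to-const and pointer-to-const versions, any attempt to
change the caller's String within f() is flagged by the compiler as an error at
compile-time.  With the pass-by-value version, f() works on its own copy, so
nothing it does can reach the caller's String.  The const itself degrades
neither run-time space nor speed.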

As an opposite example, if you want to create a function g() that accepts a
String, but you want to let callers know that g() might change the caller's
String object, you can have g() receive its String parameter . . .
 * void g(String& s);      // Pass by reference-to-non-const
 * void g(String* sptr);   // Pass by pointer-to-non-const

The lack of const in g() tells the compiler that g() is allowed to (but is not
required to) change the caller's String object that was handed to g().  This
means that g() can pass its String to f(), but only the first version of f()
(the one that receives its parameter "by value") can pass its String to g().
If one of the other versions of f() wants to call g(), it must pass a local
copy of the parameter to g() rather than the parameter itself.  E.g.,

    void g(String& s);

    void f(const String& s)
    {
      g(s);           // Compile-time Error since s is const

      String localCopy = s;
      g(localCopy);   // OK since localCopy is not const
    }

Naturally in the above case, any changes that g() makes are made to the
localCopy object that is local to f().  In particular, no changes will be made
to the const parameter that was passed by reference to f().

==============================================================================

[18.2] How is "const correctness" related to ordinary type safety?

Declaring the const-ness of a parameter is just another form of type safety.
It is almost as if a const String, for example, is a different class than an
ordinary String, since the const variant is missing the various mutative
operations in the non-const variant (e.g., you can imagine that a const String
simply doesn't have an assignment operator).

If you find ordinary type safety helps you get systems correct (it does;
especially in large systems), you'll find const correctness helps also.

==============================================================================

[18.3] Should I try to get things const correct "sooner" or "later"?

At the very, very, very beginning.

Back-patching const correctness results in a snowball effect: every const you
add "over here" requires four more to be added "over there."

==============================================================================

[18.4] What does "const Fred* p" mean?

It means p points to an object of class Fred, but p can't be used to change
that Fred object (naturally p could also be NULL).

For example, if class Fred has a const member function[18.9] called inspect(),
saying p->inspect() is OK.  But if class Fred has a non-const member
function[18.9] called mutate(), saying p->mutate() is an error (the error is
caught by the compiler; no run-time tests are done, which means const doesn't
slow your program down).

==============================================================================

[18.5] What's the difference between "const Fred* p", "Fred* const p" and
       "const Fred* const p"?

You have to read pointer declarations right-to-left.
 * const Fred* p means "p points to a Fred that is const" -- that is, the Fred
   object can't be changed via p[18.13].
 * Fred* const p means "p is a const pointer to a Fred" -- that is, you can
   change the Fred object via p[18.13], but you can't change the pointer p
   itself.
 * const Fred* const p means "p is a const pointer to a const Fred" -- that is,
   you can't change the pointer p itself, nor can you change the Fred object
   via p[18.13].
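
A small sketch of the three declarations side by side.  The struct Fred with a
single value member and the function demoPointerConstness() are hypothetical,
made up for this illustration; the lines that would not compile are left
commented out.

```cpp
#include <cassert>

struct Fred {
  int value;
};

int demoPointerConstness()
{
  Fred a = { 1 };
  Fred b = { 2 };

  const Fred* p1 = &a;         // p1 points to a Fred that is const
  p1 = &b;                     // OK: the pointer itself can be changed
  // p1->value = 10;           // Error: can't change the Fred via p1

  Fred* const p2 = &a;         // p2 is a const pointer to a Fred
  p2->value = 10;              // OK: the Fred can be changed via p2
  // p2 = &b;                  // Error: can't change the pointer p2 itself

  const Fred* const p3 = &b;   // p3 is a const pointer to a const Fred
  // p3->value = 20;           // Error: can't change the Fred via p3
  // p3 = &a;                  // Error: can't change the pointer p3 itself

  return p1->value + p2->value + p3->value;
}
```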

==============================================================================

[18.6] What does "const Fred& x" mean?

It means x aliases a Fred object, but x can't be used to change that Fred
object.

For example, if class Fred has a const member function[18.9] called inspect(),
saying x.inspect() is OK.  But if class Fred has a non-const member
function[18.9] called mutate(), saying x.mutate() is an error (the error is
caught by the compiler; no run-time tests are done, which means const doesn't
slow your program down).

==============================================================================

[18.7] Does "Fred& const x" make any sense?

No, it is nonsense.

To find out what the above declaration means, you have to read it
right-to-left[18.5].  Thus "Fred& const x" means "x is a const reference to a
Fred".  But that is redundant, since references are always const.  You can't
reseat a reference[8.4].  Never.  With or without the const.

In other words, "Fred& const x" is functionally equivalent to "Fred& x".  Since
you're gaining nothing by adding the const after the &, you shouldn't add it
since it will confuse people.  I.e., the const will make some people think that
the Fred is const, as if you had said "const Fred& x".

==============================================================================

[18.8] What does "Fred const& x" mean?

"Fred const& x" is functionally equivalent to "const Fred& x"[18.6].

The problem with using "Fred const& x" (with the const before the &) is that it
could easily be mis-typed as the nonsensical "Fred &const x"[18.7] (with the
const after the &).

Better to simply use const Fred& x.

==============================================================================

[18.9] What is a "const member function"?

A member function that inspects (rather than mutates) its object.

A const member function is indicated by a const suffix just after the member
function's parameter list.  Member functions with a const suffix are called
"const member functions" or "inspectors." Member functions without a const
suffix are called "non-const member functions" or "mutators."

    class Fred {
    public:
      void inspect() const;   // This member promises NOT to change *this
      void mutate();          // This member function might change *this
    };

    void userCode(Fred& changeable, const Fred& unchangeable)
    {
      changeable.inspect();   // OK: doesn't change a changeable object
      changeable.mutate();    // OK: changes a changeable object

      unchangeable.inspect(); // OK: doesn't change an unchangeable object
      unchangeable.mutate();  // ERROR: attempt to change unchangeable object
    }

The error in unchangeable.mutate() is caught at compile time.  There is no
runtime space or speed penalty for const.

The trailing const on the inspect() member function means that the abstract
(client-visible) state of the object isn't going to change.  This is slightly
different from promising that the "raw bits" of the object's struct aren't
going to change.  C++ compilers aren't allowed to take the "bitwise"
interpretation unless they can solve the aliasing problem, which normally can't
be solved (i.e., a non-const alias could exist which could modify the state of
the object).  Another (important) insight from this aliasing issue: pointing at
an object with a pointer-to-const doesn't guarantee that the object won't
change; it promises only that the object won't change via that pointer.

==============================================================================

[18.10] What do I do if I want to update an "invisible" data member inside a
        const member function?

Use mutable, or use const_cast.

A small percentage of inspectors need to make innocuous changes to data members
(e.g., a Set object might want to cache its last lookup in hopes of improving
the performance of its next lookup).  By saying the changes are "innocuous," I
mean that the changes wouldn't be visible from outside the object's interface
(otherwise the member function would be a mutator rather than an inspector).

When this happens, the data member which will be modified should be marked as
mutable (put the mutable keyword just before the data member's declaration;
i.e., in the same place where you could put const).  This tells the compiler
that the data member is allowed to change during a const member function.  If
your compiler doesn't support the mutable keyword, you can cast away the
const'ness of this via the const_cast keyword.  E.g., in Set::lookup() const,
you might say,

    Set* self = const_cast<Set*>(this);

After this line, self will have the same bits as this (e.g., self == this), but
self is a Set* rather than a const Set*.  Therefore you can use self to modify
the object pointed to by this.
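
As a sketch of the mutable approach, here is one way the caching Set mentioned
above might look.  The fixed-size array, the member names (cachedKey_,
cachedResult_, cacheValid_), and the contains() and insert() functions are
assumptions made for this illustration, not a real Set class.

```cpp
#include <cassert>

class Set {
public:
  Set() : size_(0), cachedKey_(0), cachedResult_(false), cacheValid_(false) { }

  void insert(int key)             // A mutator
  {
    if (size_ < maxSize_ && !contains(key))
      keys_[size_++] = key;
    cacheValid_ = false;           // A visible change invalidates the cache
  }

  bool contains(int key) const     // An inspector...
  {
    if (cacheValid_ && cachedKey_ == key)
      return cachedResult_;
    bool found = false;
    for (int i = 0; i < size_; ++i)
      if (keys_[i] == key) found = true;
    cachedKey_    = key;           // ...that still updates the mutable cache
    cachedResult_ = found;
    cacheValid_   = true;
    return found;
  }

private:
  enum { maxSize_ = 16 };
  int keys_[maxSize_];
  int size_;
  mutable int  cachedKey_;         // "Invisible" state: not part of the
  mutable bool cachedResult_;      //   abstract value of the Set
  mutable bool cacheValid_;
};

bool demoMutableCache()
{
  Set s;
  s.insert(42);
  const Set& view = s;             // Only const member functions allowed here
  return view.contains(42) && view.contains(42) && !view.contains(7);
}
```

The second view.contains(42) is answered from the cache, yet callers can't tell
the difference, which is what makes the change "innocuous."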

==============================================================================

[18.11] Does const_cast mean lost optimization opportunities?

In theory, yes; in practice, no.

Even if the language outlawed const_cast, the only way to avoid flushing the
register cache across a const member function call would be to solve the
aliasing problem (i.e., to prove that there are no non-const pointers that
point to the object).  This can happen only in rare cases (when the object is
constructed in the scope of the const member function invocation, and when all
the non-const member function invocations between the object's construction and
the const member function invocation are statically bound, and when every one
of these invocations is also inlined, and when the constructor itself is
inlined, and when any member functions the constructor calls are inline).

==============================================================================

[18.12] Why does the compiler allow me to change an int after I've pointed at
        it with a const int*?

Because "const int* p" means "p promises not to change the *p," not "*p
promises not to change."

Causing a const int* to point to an int doesn't const-ify the int.  The int
can't be changed via the const int*, but if someone else has an int* (note: no
const) that points to ("aliases") the same int, then that int* can be used to
change the int.  For example:

    #include <cassert>
    #include <iostream>
    using namespace std;

    void f(const int* p1, int* p2)
    {
      int i = *p1;         // Get the (original) value of *p1
      *p2 = 7;             // If p1 == p2, this will also change *p1
      int j = *p1;         // Get the (possibly new) value of *p1
      if (i != j) {
        cout << "*p1 changed, but it didn't change via pointer p1!\n";
        assert(p1 == p2);  // This is the only way *p1 could be different
      }
    }

    int main()
    {
      int x = 5;
      f(&x, &x);           // This is perfectly legal (and even moral!)
    }

Note that main() and f(const int*,int*) could be in different compilation units
that are compiled on different days of the week.  In that case there is no way
the compiler can possibly detect the aliasing at compile time.  Therefore there
is no way we could make a language rule that prohibits this sort of thing.  In
fact, we wouldn't even want to make such a rule, since in general it's
considered a feature that you can have many pointers pointing to the same
thing.  The fact that one of those pointers promises not to change the
underlying "thing" is just a promise made by the pointer; it's not a promise
made by the "thing".

==============================================================================

[18.13] Does "const Fred* p" mean that *p can't change?

No! (This is related to the FAQ about aliasing of int pointers[18.12].)

"const Fred* p" means that the Fred can't be changed via pointer p, but any
aliasing pointers that aren't const can be used to change the Fred object.  For
example, if you have two pointers "const Fred* p" and "Fred* q" that point to
the same Fred object (aliasing), pointer q can be used to change the Fred
object but pointer p cannot.

    class Fred {
    public:
      void inspect() const;   // A const member function[18.9]
      void mutate();          // A non-const member function[18.9]
    };

    int main()
    {
      Fred f;
      const Fred* p = &f;
            Fred* q = &f;

      p->inspect();    // OK: No change to *p
      p->mutate();     // Error: Can't change *p via p

      q->inspect();    // OK: q is allowed to inspect the object
      q->mutate();     // OK: q is allowed to mutate the object
    }

==============================================================================

SECTION [19]: Inheritance -- basics


[19.1] Is inheritance important to C++?

Yep.

Inheritance is what separates abstract data type (ADT) programming from OO
programming.

==============================================================================

[19.2] When would I use inheritance?

As a specification device.

Human beings abstract things on two dimensions: part-of and kind-of.  A Ford
Taurus is-a-kind-of-a Car, and a Ford Taurus has-a Engine, Tires, etc.  The
part-of hierarchy has been a part of software since the ADT style became
relevant; inheritance adds "the other" major dimension of decomposition.

==============================================================================

[19.3] How do you express inheritance in C++?

By the : public syntax:

    class Car : public Vehicle {
    public:
      // ...
    };

We state the above relationship in several ways:
 * Car is "a kind of a" Vehicle
 * Car is "derived from" Vehicle
 * Car is "a specialized" Vehicle
 * Car is a "subclass" of Vehicle
 * Vehicle is the "base class" of Car
 * Vehicle is the "superclass" of Car (this is not as common in the C++
   community)

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[19.4] Is it OK to convert a pointer from a derived class to its base class?

Yes.

An object of a derived class is a kind of the base class.  Therefore the
conversion from a derived class pointer to a base class pointer is perfectly
safe, and happens all the time.  For example, if I am pointing at a car, I am
in fact pointing at a vehicle, so converting a Car* to a Vehicle* is perfectly
safe and normal:

    void f(Vehicle* v);
    void g(Car* c) { f(c); }  // Perfectly safe; no cast

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[19.5] What's the difference between public:, private:, and protected:?

 * A member (either data member or member function) declared in a private:
   section of a class can only be accessed by member functions and friends[14]
   of that class
 * A member (either data member or member function) declared in a protected:
   section of a class can only be accessed by member functions and friends[14]
   of that class, and by member functions and friends[14] of derived classes
 * A member (either data member or member function) declared in a public:
   section of a class can be accessed by anyone

==============================================================================

[19.6] Why can't my derived class access private: things from my base class?

To protect you from future changes to the base class.

Derived classes do not get access to private members of a base class.  This
effectively "seals off" the derived class from any changes made to the private
members of the base class.

==============================================================================

[19.7] How can I protect subclasses from breaking when I change internal parts?

A class has two distinct interfaces for two distinct sets of clients:
 * It has a public: interface that serves unrelated classes
 * It has a protected: interface that serves derived classes

Unless you expect all your subclasses to be built by your own team, you should
consider making your base class's bits be private:, and use protected: inline
access functions by which derived classes will access the private data in the
base class.  This way the private bits can change, but the derived class's code
won't break unless you change the protected access functions.
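
A minimal sketch of this arrangement.  The private member speedTenthsKph_ and
the protected access functions speedKph() and setSpeedKph() are hypothetical
names chosen for this illustration.

```cpp
#include <cassert>

class Vehicle {
public:
  explicit Vehicle(int speedKph) : speedTenthsKph_(speedKph * 10) { }
protected:
  // The protected: interface that derived classes rely on
  int  speedKph() const     { return speedTenthsKph_ / 10; }
  void setSpeedKph(int kph) { speedTenthsKph_ = kph * 10; }
private:
  int speedTenthsKph_;   // Private bits: the representation can change later
};

class Car : public Vehicle {
public:
  explicit Car(int speedKph) : Vehicle(speedKph) { }
  int accelerate(int deltaKph)
  {
    setSpeedKph(speedKph() + deltaKph);   // Uses only the protected interface
    return speedKph();
  }
};
```

If Vehicle later stores its speed in some other unit, only the two protected
access functions need to change; Car::accelerate() keeps working as written.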

==============================================================================

SECTION [20]: Inheritance -- virtual functions


[20.1] What is a "virtual member function"?

From an OO perspective, it is the single most important feature of C++: [6.8],
[6.9].

A virtual function allows derived classes to replace the implementation
provided by the base class.  The compiler makes sure the replacement is always
called whenever the object in question is actually of the derived class, even
if the object is accessed by a base pointer rather than a derived pointer.
This allows algorithms in the base class to be replaced in the derived class,
even if users don't know about the derived class.

The derived class can either fully replace ("override") the base class member
function, or the derived class can partially replace ("augment") the base class
member function.  The latter is accomplished by having the derived class member
function call the base class member function, if desired.
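
The difference between the two can be sketched as follows.  The classes Base,
Replacer, and Augmenter, and the describe() member function, are hypothetical
and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

class Base {
public:
  virtual ~Base() { }
  virtual std::string describe() const { return "Base"; }
};

class Replacer : public Base {
public:
  // Fully replaces ("overrides") the base class implementation
  virtual std::string describe() const { return "Replacer"; }
};

class Augmenter : public Base {
public:
  // Partially replaces ("augments"): calls the base version, then adds to it
  virtual std::string describe() const
  {
    return Base::describe() + "+Augmenter";
  }
};

// This user code knows only about Base, not about the derived classes
std::string describeViaBase(const Base& b)
{
  return b.describe();
}
```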

==============================================================================

[20.2] How can C++ achieve dynamic binding yet also static typing?

When you have a pointer to an object, the object may actually be of a class
that is derived from the class of the pointer (e.g., a Vehicle* that is
actually pointing to a Car object).  Thus there are two types: the (static)
type of the pointer (Vehicle, in this case), and the (dynamic) type of the
pointed-to object (Car, in this case).

Static typing means that the legality of a member function invocation is
checked at the earliest possible moment: by the compiler at compile time.  The
compiler uses the static type of the pointer to determine whether the member
function invocation is legal.  If the type of the pointer can handle the member
function, certainly the pointed-to object can handle it as well.  E.g., if
Vehicle has a certain member function, certainly Car also has that member
function since Car is a kind-of Vehicle.

Dynamic binding means that the address of the code in a member function
invocation is determined at the last possible moment: based on the dynamic type
of the object at run time.  It is called "dynamic binding" because the binding
to the code that actually gets called is accomplished dynamically (at run
time).  Dynamic binding is a result of virtual functions.
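
Both halves can be seen in one small sketch.  The member functions name(),
kind(), and openTrunk(), and the function demoBinding(), are hypothetical names
invented for this illustration.

```cpp
#include <cassert>
#include <string>

class Vehicle {
public:
  virtual ~Vehicle() { }
  virtual std::string name() const { return "Vehicle"; }   // Dynamically bound
  std::string kind() const         { return "Vehicle"; }   // Statically bound
};

class Car : public Vehicle {
public:
  virtual std::string name() const { return "Car"; }
  std::string kind() const         { return "Car"; }   // Hides; doesn't override
  void openTrunk() { }
};

std::string demoBinding()
{
  Car car;
  Vehicle* v = &car;     // Static type: Vehicle*; dynamic type: Car
  // v->openTrunk();     // Compile-time error: Vehicle has no openTrunk()
  return v->name() + "/" + v->kind();
}
```

The call v->name() reaches Car::name() because name() is virtual, while
v->kind() is chosen from the type of the pointer and so reaches
Vehicle::kind().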

==============================================================================

[20.3] What's the difference between how virtual and non-virtual member
       functions are called? [NEW!]

[Recently created (on 11/96).]

Non-virtual member functions are resolved statically.  That is, the member
function is selected statically (at compile-time) based on the type of the
pointer (or reference) to the object.

In contrast, virtual member functions are resolved dynamically (at run-time).
That is, the member function is selected dynamically (at run-time) based on the
type of the object, not the type of the pointer/reference to that object.  This
is called "dynamic binding." Most compilers use some variant of the following
technique: if the object has one or more virtual functions, the compiler puts a
hidden pointer in the object called a "virtual-pointer" or "v-pointer." This
v-pointer points to a global table called the "virtual-table" or "v-table."

The compiler creates a v-table for each class that has at least one virtual
function.  For example, if class Circle has virtual functions for draw() and
move() and resize(), there would be exactly one v-table associated with class
Circle, even if there were a gazillion Circle objects, and the v-pointer of
each of those Circle objects would point to the Circle v-table.  The v-table
itself has pointers to each of the virtual functions in the class.  For
example, the Circle v-table would have three pointers: a pointer to
Circle::draw(), a pointer to Circle::move(), and a pointer to Circle::resize().

During a dispatch of a virtual function, the run-time system follows the
object's v-pointer to the class's v-table, then follows the appropriate slot in
the v-table to the method code.
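
To make the mechanism concrete, here is a hand-written imitation of a v-pointer
and v-table using plain structs and pointers-to-functions.  This is only a
sketch of the general idea, not what any particular compiler generates, and
the names CircleVTable, vptr, diameter, and demoDispatch() are made up for
this illustration.

```cpp
#include <cassert>

struct Circle;   // Forward declaration

// One table per "class," with one slot per "virtual function"
struct CircleVTable {
  int  (*diameter)(const Circle*);
  void (*resize)(Circle*, int);
};

struct Circle {
  const CircleVTable* vptr;   // The normally hidden v-pointer, made explicit
  int radius;
};

int  circleDiameter(const Circle* c)     { return 2 * c->radius; }
void circleResize(Circle* c, int factor) { c->radius *= factor; }

// Exactly one v-table, shared by every Circle object
const CircleVTable circleVTable = { circleDiameter, circleResize };

int demoDispatch()
{
  Circle c = { &circleVTable, 2 };
  c.vptr->resize(&c, 5);          // Fetch the v-pointer, fetch the slot, call
  return c.vptr->diameter(&c);
}
```

Each call through vptr performs the two fetches described here: one for the
v-pointer, one for the slot in the table.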

The space-cost overhead of the above technique is nominal: an extra pointer per
object (but only for objects that will need to do dynamic binding), plus an
extra pointer per method (but only for virtual methods).  The time-cost
overhead is also fairly nominal: compared to a normal function call, a virtual
function call requires two extra fetches (one to get the value of the
v-pointer, a second to get the address of the method).  None of this runtime
activity happens with non-virtual functions, since the compiler selects the
member function exclusively at compile-time based on the type of the pointer.

Note: the above discussion is simplified considerably, since it doesn't account
for extra structural things like multiple inheritance, virtual inheritance,
RTTI, etc., nor does it account for space/speed issues such as page faults,
calling a function via a pointer-to-function, etc.  If you want to know about
those other things, please ask comp.lang.c++; PLEASE DO NOT SEND E-MAIL TO ME!

==============================================================================

[20.4] When should my destructor be virtual?

When you may delete a derived object via a base pointer.

virtual functions bind to the code associated with the class of the object,
rather than with the class of the pointer/reference.  When you say
delete basePtr, and the base class has a virtual destructor, the destructor
that gets invoked is the one associated with the type of the object *basePtr,
rather than the one associated with the type of the pointer.  This is generally
A Good Thing.

TECHNO-GEEK WARNING; PUT YOUR PROPELLER HAT ON.
Technically speaking, you need a base class's destructor to be virtual if and
only if you intend to allow someone to invoke an object's destructor via a base
class pointer (this is normally done implicitly via delete), and the object
being destructed is of a derived class that has a non-trivial destructor.  A
class has a non-trivial destructor if it either has an explicit destructor, or
if it has a member object or a base class that has a non-trivial destructor
(note that this is a recursive definition (e.g., a class has a non-trivial
destructor if it has a member object (which has a base class (which has a
member object (which has a base class (which has an explicit destructor)))))).
END TECHNO-GEEK WARNING; REMOVE YOUR PROPELLER HAT

If you had a hard time grokking the previous rule, try this (over)simplified
one on
for size: A class should have a virtual destructor unless that class has no
virtual functions.  Rationale: if you have any virtual functions at all, you're
probably going to be doing "stuff" to derived objects via a base pointer, and
some of the "stuff" you may do may include invoking a destructor (normally done
implicitly via delete).  Plus once you've put the first virtual function into a
class, you've already paid all the per-object space cost that you'll ever pay
(one pointer per object; note that this is theoretically compiler-specific; in
practice everyone does it pretty much the same way), so making the destructor
virtual won't generally cost you anything extra.
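
A short sketch of the situation the rule is meant to cover.  Base, Derived, the
buffer_ member, and the counter derivedDestructorRuns are hypothetical names
used only for this illustration.

```cpp
#include <cassert>

int derivedDestructorRuns = 0;   // Demonstration counter

class Base {
public:
  virtual ~Base() { }            // Virtual, so delete via Base* finds ~Derived()
  virtual void doStuff() { }
};

class Derived : public Base {
public:
  Derived() : buffer_(new int[8]) { }
  ~Derived() { delete[] buffer_; ++derivedDestructorRuns; }
private:
  Derived(const Derived&);              // Declared but not implemented
  Derived& operator=(const Derived&);   // Declared but not implemented
  int* buffer_;
};

// Returns how many times ~Derived() ran while deleting through a Base*.
int demoVirtualDestructor()
{
  int before = derivedDestructorRuns;
  Base* p = new Derived();
  delete p;                      // Runs ~Derived(), then ~Base()
  return derivedDestructorRuns - before;
}
```

Without the virtual on ~Base(), the delete p line would not be guaranteed to
run ~Derived(), and buffer_ would most likely be leaked.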

==============================================================================

[20.5] What is a "virtual constructor"?

An idiom that allows you to do something that C++ doesn't directly support.

You can get the effect of a virtual constructor by a virtual clone() member
function (for copy constructing), or a virtual create() member function (for
the default constructor[10.4]).

    class Shape {
    public:
      virtual ~Shape() { }                 // A virtual destructor[20.4]
      virtual void draw() = 0;             // A pure virtual function[22.4]
      virtual void move() = 0;
      // ...
      virtual Shape* clone()  const = 0;   // Uses the copy constructor
      virtual Shape* create() const = 0;   // Uses the default constructor[10.4]
    };

    class Circle : public Shape {
    public:
      Circle* clone()  const { return new Circle(*this); }
      Circle* create() const { return new Circle();      }
      // ...
    };

In the clone() member function, the new Circle(*this) code calls Circle's copy
constructor to copy the state of this into the newly created Circle object.  In
the create() member function, the new Circle() code calls Circle's default
constructor[10.4].

Users use these as if they were "virtual constructors":

    void userCode(Shape& s)
    {
      Shape* s2 = s.clone();
      Shape* s3 = s.create();
      // ...
      delete s2;    // You probably need a virtual destructor[20.4] here
      delete s3;
    }

This function will work correctly regardless of whether the Shape is a Circle,
Square, or some other kind-of Shape that doesn't even exist yet.

==============================================================================

-- 
Marshall Cline, Ph.D., President, Paradigm Shift, Inc.
315-353-6100 (voice)
315-353-6110 (fax)
mailto:cline@parashift.com
