Subject: C++ FAQ (#3 of 7)
Date: 6 Nov 1996 23:23:31 GMT
Summary: Please read this before posting to comp.lang.c++
X-Resurrected-By: dave@ferret.ocunix.on.ca

Reposting article removed by rogue canceller.  See news.admin.net-abuse.announce
for further information.

Archive-name: C++-faq/part3
Posting-Frequency: monthly
Last-modified: Nov 6, 1996
URL: http://www.cerfnet.com/~mpcline/C++-FAQs-Lite/

AUTHOR: Marshall Cline / cline@parashift.com / Paradigm Shift, Inc. /
One Park St. / Norwood, NY 13668 / 315-353-6100 (voice) / 315-353-6110 (fax)

COPYRIGHT: This posting is part of "C++ FAQs Lite."  The entire "C++ FAQs Lite"
document is Copyright(C) 1991-96 Marshall P. Cline, Ph.D., cline@parashift.com.
All rights reserved.  Copying is permitted only under designated situations.
For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS.  THE AUTHOR PROVIDES NO
WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING
WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR
PURPOSE.

C++-FAQs-Lite != C++-FAQs-Book: This document, C++ FAQs Lite, is not the same
as the C++ FAQs Book.  The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is
500% larger than this document, and is available in bookstores.  For details,
see section [3].

==============================================================================

SECTION [9]: Inline functions


[9.1] What's the deal with inline functions?

An inline function is a function whose code gets inserted into the caller's
code stream.  Like a #define macro, an inline function improves performance by
avoiding the overhead of the call itself and (especially!) by letting the
compiler optimize through the call ("procedural integration").

==============================================================================

[9.2] How can inline functions help with the tradeoff of safety vs. speed?

In straight C, you can achieve "encapsulated structs" by putting a void* in a
struct, in which case the void* points to the real data that is unknown to
users of the struct.  Therefore users of the struct don't know how to interpret
the stuff pointed to by the void*, but the access functions cast the void* to
the appropriate hidden type.  This gives a form of encapsulation.

Unfortunately it forfeits type safety, and also imposes a function call to
access even trivial fields of the struct (if you allowed direct access to the
struct's fields, anyone and everyone would be able to get direct access since
they would of necessity know how to interpret the stuff pointed to by the
void*; this would make it difficult to change the underlying data structure).

Function call overhead is small, but can add up.  C++ classes allow function
calls to be expanded inline.  This lets you have the safety of encapsulation
along with the speed of direct access.  Furthermore the parameter types of
these inline functions are checked by the compiler, an improvement over C's
#define macros.
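
As a rough sketch of that idea (the class Counter and the function countTo()
are made up for illustration; they are not part of this FAQ), a class can keep
its data private and still give callers access at roughly the cost of touching
the field directly, assuming the compiler chooses to expand the calls inline:

```cpp
#include <cassert>

// Hypothetical example class; the names are illustrative only.
class Counter {
public:
  Counter() : count_(0) { }
  void increment();          // Declared here, defined inline below
  int  value() const;        // Ditto
private:
  int count_;                // Hidden from users of the class
};

inline void Counter::increment()   { ++count_; }
inline int  Counter::value() const { return count_; }

// Sample user code: it never touches count_ directly, and the compiler
// checks the argument types of every call.
inline int countTo(int n)
{
  assert(n >= 0);
  Counter c;
  for (int i = 0; i < n; ++i)
    c.increment();
  return c.value();
}
```

Because count_ is private, the representation can change later without
breaking countTo() or any other user code.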

==============================================================================

[9.3] Why should I use inline functions? Why not just use plain old #define
      macros?

Because #define macros are evil.

Unlike #define macros, inline functions avoid infamous macro errors since
inline functions always evaluate every argument exactly once.  In other words,
invoking an inline function is semantically just like invoking a regular
function, only faster:

    // A macro that returns the absolute value of i
    #define unsafe(i)  \
            ( (i) >= 0 ? (i) : -(i) )

    // An inline function that returns the absolute value of i
    inline
    int safe(int i)
    {
      return i >= 0 ? i : -i;
    }

    int f();

    void userCode(int x)
    {
      int ans;

      ans = unsafe(x++);   // Error! x is incremented twice
      ans = unsafe(f());   // Danger! f() is called twice

      ans = safe(x++);     // Correct! x is incremented once
      ans = safe(f());     // Correct! f() is called once
    }

Also unlike macros, argument types are checked, and necessary conversions are
performed correctly.

Macros are bad for your health; don't use them unless you have to.

==============================================================================

[9.4] How do you tell the compiler to make a non-member function inline?

When you declare an inline function, it looks just like a normal function:

    void f(int i, char c);

But when you define an inline function, you prepend the function's definition
with the keyword inline, and you put the definition into a header file:

    inline
    void f(int i, char c)
    {
      // ...
    }

Note: It's imperative that the function's definition (the part between the
{...}) be placed in a header file, unless the function is used only in a single
.cpp file.  In particular, if you put the inline function's definition into a
.cpp file and you call it from some other .cpp file, you'll get an "unresolved
external" error from the linker.

==============================================================================

[9.5] How do you tell the compiler to make a member function inline?

When you declare an inline member function, it looks just like a normal member
function:

    class Fred {
    public:
      void f(int i, char c);
    };

But when you define an inline member function, you prepend the member
function's definition with the keyword inline, and you put the definition into
a header file:

    inline
    void Fred::f(int i, char c)
    {
      // ...
    }

It's usually imperative that the function's definition (the part between the
{...}) be placed in a header file.  If you put the inline function's definition
into a .cpp file, and if it is called from some other .cpp file, you'll get an
"unresolved external" error from the linker.

==============================================================================

[9.6] Is there another way to tell the compiler to make a member function
      inline?

Yep: define the member function in the class body itself:

    class Fred {
    public:
      void f(int i, char c)
        {
          // ...
        }
    };

Although this is easier on the person who writes the class, it's harder on all
the readers since it mixes "what" a class does with "how" it does it.
Because of this mixture, we normally prefer to define member functions outside
the class body with the inline keyword[9.5].  The insight that makes sense of
this: in a reuse-oriented world, there will usually be many people who use your
class, but there is only one person who builds it (yourself); therefore you
should do things that favor the many rather than the few.

==============================================================================

[9.7] Are inline functions guaranteed to make your performance better?

Nope.

Beware that overuse of inline functions can cause code bloat, which can in turn
have a negative performance impact in paging environments.

==============================================================================

SECTION [10]: Constructors


[10.1] What's the deal with constructors?

Constructors build objects from dust.

Constructors are like "init functions".  They turn a pile of arbitrary bits
into a living object.  Minimally they initialize internally used fields.  They
may also allocate resources (memory, files, semaphores, sockets, etc).

"ctor" is a typical abbreviation for constructor.

==============================================================================

[10.2] Is there any difference between List x; and List x();?

A big difference!

Suppose that List is the name of some class.  Then function f() declares a
local List object called x:

    void f()
    {
      List x;     // Local object named x (of class List)
      // ...
    }

But function g() declares a function called x() that returns a List:

    void g()
    {
      List x();   // Function named x (that returns a List)
      // ...
    }

==============================================================================

[10.3] How can I make a constructor call another constructor as a primitive?

No way.

Dragons be here: if you call another constructor, the compiler initializes a
temporary local object; it does not initialize this object.  You can combine
both constructors by using a default parameter, or you can share their common
code in a private init() member function.
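
Here is a minimal sketch of the init() approach, with the default-parameter
alternative shown in a comment.  The class Window, its members, and the
640x480 defaults are assumptions made up for this example.  (Later revisions
of the language, C++11 onward, added delegating constructors; the sketch
sticks to the two techniques named above.)

```cpp
#include <cassert>

// Hypothetical class; Window, init(), width_ and height_ are illustrative.
class Window {
public:
  Window()               { init(640, 480); }   // Both constructors share init()
  Window(int w, int h)   { init(w, h); }
  // Alternative: replace both constructors above with a single
  //   Window(int w = 640, int h = 480);
  int width()  const     { return width_; }
  int height() const     { return height_; }
private:
  void init(int w, int h);   // Common construction code lives here
  int width_, height_;
};

inline void Window::init(int w, int h)
{
  assert(w > 0 && h > 0);
  width_  = w;
  height_ = h;
}

inline void demoWindow()
{
  Window a;            // Uses the defaults
  Window b(10, 20);
  assert(a.width() == 640 && a.height() == 480);
  assert(b.width() == 10  && b.height() == 20);
}
```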

==============================================================================

[10.4] Is the default constructor for Fred always Fred::Fred()?

No.  A "default constructor" is a constructor that can be called with no
arguments.  Thus a constructor that takes no arguments is certainly a default
constructor:

    class Fred {
    public:
      Fred();   // Default constructor: can be called with no args
      // ...
    };

However it is possible (and even likely) that a default constructor can take
arguments, provided they are given default values:

    class Fred {
    public:
      Fred(int i=3, int j=5);   // Default constructor: can be called with no args
      // ...
    };

==============================================================================

[10.5] Which constructor gets called when I create an array of Fred objects?

Fred's default constructor[10.4].

There is no way to tell the compiler to call a different constructor.  If your
class Fred doesn't have a default constructor[10.4], attempting to create an
array of Fred objects is trapped as an error at compile time.

    class Fred {
    public:
      Fred(int i, int j);
      // ... assume there is no default constructor[10.4] in class Fred ...
    };

    int main()
    {
      Fred a[10];               // ERROR: Fred doesn't have a default constructor
      Fred* p = new Fred[10];   // ERROR: Fred doesn't have a default constructor
    }

However if you are creating an STL[32.1] vector<Fred> rather than an array of
Fred (which you probably should be doing anyway since arrays are evil[21.5]),
you don't have to have a default constructor in class Fred, since you can give
the vector a Fred object to be used to initialize the elements:

    #include <vector>
    using namespace std;

    int main()
    {
      vector<Fred> a(10, Fred(5,7));
      // The 10 Fred objects in vector a will be initialized with Fred(5,7).
      // ...
    }

==============================================================================

[10.6] What is the "Named Constructor Idiom"?

A technique that provides more intuitive and/or safer construction operations
for users of your class.

The problem is that constructors always have the same name as the class.
Therefore the only way to differentiate between the various constructors of a
class is by the parameter list.  But if there are lots of constructors, the
differences between the constructors become somewhat subtle and error prone.

With the Named Constructor Idiom, you declare all the class's constructors in
the private: or protected: sections, and you provide public static methods that
return an object.  These static methods are the so-called "Named Constructors."
In general there is one such static method for each different way to construct
an object.

For example, suppose we are building a Point class that represents a position
on the X-Y plane.  Turns out there are two common ways to specify a 2-space
coordinate: rectangular coordinates (X+Y) and polar coordinates (Radius+Angle).
(Don't worry if you can't remember these; the point isn't the particulars of
coordinate systems; the point is that there are several ways to create a Point
object).  Unfortunately the parameters for these two coordinate systems are the
same: two floats.  This would create an ambiguity error in the overloaded
constructors:

    class Point {
    public:
      Point(float x, float y);     // Rectangular coordinates
      Point(float r, float a);     // Polar coordinates (radius and angle)
      // ERROR: Overload is Ambiguous: Point::Point(float,float)
    };

    int main()
    {
      Point p = Point(5.7, 1.2);   // Ambiguous: Which coordinate system?
    }

One way to solve this ambiguity is to use the Named Constructor Idiom:

    #include <math.h>              // To get sin() and cos()

    class Point {
    public:
      static Point rectangular(float x, float y);      // Rectangular coord's
      static Point polar(float radius, float angle);   // Polar coordinates
      // These static methods are the so-called "named constructors"
      // ...
    private:
      Point(float x, float y);     // Rectangular coordinates
      float x_, y_;
    };

    inline Point::Point(float x, float y)
    : x_(x), y_(y) { }

    inline Point Point::rectangular(float x, float y)
    { return Point(x, y); }

    inline Point Point::polar(float radius, float angle)
    { return Point(radius*cos(angle), radius*sin(angle)); }

Now the users of Point have a clear and unambiguous syntax for creating Points
in either coordinate system:

    int main()
    {
      Point p1 = Point::rectangular(5.7, 1.2);   // Obviously rectangular
      Point p2 = Point::polar(5.7, 1.2);         // Obviously polar
    }

Make sure your constructors are in the protected: section if you expect your
class to have derived classes.

The Named Constructor Idiom can also be used to make sure your objects are
always created via new[16.17].
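
A minimal sketch of that last use, assuming a hypothetical static method named
create() (the name and the int parameter are invented for this example): with
the constructor private, user code cannot declare a local or global Fred, so
every Fred comes from new.

```cpp
#include <cassert>

// Hypothetical class; Fred::create() is an illustrative name.
class Fred {
public:
  static Fred* create(int i);   // The only way for users to make a Fred
  int value() const { return i_; }
private:
  Fred(int i) : i_(i) { }       // Private: "Fred x(3);" won't compile in user code
  int i_;
  // A complete version would also restrict copying, since the
  // compiler-supplied copy constructor is public.
};

inline Fred* Fred::create(int i)
{ return new Fred(i); }

inline int sampleUse()
{
  Fred* p = Fred::create(5);    // Always allocated via new
  assert(p != 0);
  int result = p->value();
  delete p;                     // The caller owns the object
  return result;
}
```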

==============================================================================

[10.7] Why can't I initialize my static member data in my constructor's
       initialization list?

Because you must explicitly define your class's static data members.

Fred.h:

    class Fred {
    public:
      Fred();
      // ...
    private:
      int i_;
      static int j_;
    };

Fred.cpp (or Fred.C or whatever):

    Fred::Fred()
      : i_(10)  // OK: you can (and should) initialize member data this way
      , j_(42)  // Error: you cannot initialize static member data like this
    {
      // ...
    }

    // You must define static data members this way:
    int Fred::j_ = 42;

==============================================================================

[10.8] Why are classes with static data members getting linker errors?

Because static data members must be explicitly defined in exactly one
compilation unit[10.7].  If you didn't do this, you'll probably get an
"undefined external" linker error.  For example:

    // Fred.h

    class Fred {
    public:
      // ...
    private:
      static int j_;   // Declares static data member Fred::j_
      // ...
    };

The linker will holler at you ("Fred::j_ is not defined") unless you define (as
opposed to merely declare) Fred::j_ in (exactly) one of your source files:

    // Fred.cpp

    #include "Fred.h"

    int Fred::j_ = some_expression_evaluating_to_an_int;

    // Alternatively, if you wish to use the implicit 0 value for static ints:
    // int Fred::j_;

The usual place to define static data members of class Fred is file Fred.cpp
(or Fred.C or whatever source file extension you use).

==============================================================================

SECTION [11]: Destructors


[11.1] What's the deal with destructors?

A destructor gives an object its last rites.

Destructors are used to release any resources allocated by the object.  E.g.,
class Lock might lock a semaphore, and the destructor will release that
semaphore.  The most common example is when the constructor uses new, and the
destructor uses delete.
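
The Lock example might be sketched as follows.  The Semaphore struct here is a
made-up stand-in (a real semaphore would come from the operating system or a
threading library), so treat this as an illustration of the constructor and
destructor pairing rather than as working synchronization code:

```cpp
#include <cassert>

// Stand-in for a real semaphore; it only remembers whether it is held.
struct Semaphore {
  Semaphore() : held(false) { }
  bool held;
};

class Lock {
public:
  Lock(Semaphore& s) : s_(s) { assert(!s_.held); s_.held = true; }   // Acquire
  ~Lock()                    { s_.held = false; }                    // Release
private:
  Semaphore& s_;
  Lock(const Lock&);               // Not defined: copying a Lock makes no sense
  Lock& operator= (const Lock&);   // Ditto
};

inline bool heldInsideBlock(Semaphore& s)
{
  Lock lock(s);
  return s.held;      // The return value is computed before lock is destructed
}

inline void demoLock()
{
  Semaphore s;
  assert(heldInsideBlock(s));   // Held while the Lock object was alive
  assert(!s.held);              // Released by ~Lock() at the close }
}
```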

Destructors are a "prepare to die" member function.  They are often abbreviated
"dtor".

==============================================================================

[11.2] What's the order that local objects are destructed?

In reverse order of construction: First constructed, last destructed.

In the following example, b's destructor will be executed first, then a's
destructor:

    void userCode()
    {
      Fred a;
      Fred b;
      // ...
    }
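
One way to watch this happen is with a small tracing class.  Tracer and
traceLocals() below are hypothetical helpers written for this sketch; each
object appends its id to a log when constructed and "~" plus its id when
destructed:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper class that records its own construction and destruction.
class Tracer {
public:
  Tracer(char id, std::string& log) : id_(id), log_(log) { log_ += id_; }
  ~Tracer() { log_ += '~'; log_ += id_; }
private:
  char id_;
  std::string& log_;
};

inline std::string traceLocals()
{
  std::string log;
  {
    Tracer a('a', log);
    Tracer b('b', log);
  }                      // b is destructed first, then a
  return log;
}

inline void demoLocals()
{
  assert(traceLocals() == "ab~b~a");
}
```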

==============================================================================

[11.3] What's the order that objects in an array are destructed?

In reverse order of construction: First constructed, last destructed.

In the following example, the order for destructors will be a[9], a[8], ...,
a[1], a[0]:

    void userCode()
    {
      Fred a[10];
      // ...
    }
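
The same kind of check works for arrays.  In this sketch the class Numbered
(a made-up name) hands each object the next id from a static counter, so the
ids match the array indices, and each destructor appends its id to a static
log:

```cpp
#include <cassert>
#include <string>

// Hypothetical class: objects are numbered 0, 1, 2, ... as they are constructed.
class Numbered {
public:
  Numbered() : id_(next_++) { }
  ~Numbered() { log_ += char('0' + id_); }
  static std::string log_;   // Ids in the order the destructors ran
  static int next_;          // Id for the next object constructed
private:
  int id_;
};

std::string Numbered::log_;
int Numbered::next_ = 0;

inline std::string traceArray()
{
  Numbered::log_ = "";
  Numbered::next_ = 0;
  {
    Numbered a[4];          // Constructed a[0], a[1], a[2], a[3]
  }                         // Destructed  a[3], a[2], a[1], a[0]
  return Numbered::log_;
}

inline void demoArrayOrder()
{
  assert(traceArray() == "3210");
}
```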

==============================================================================

[11.4] Can I overload the destructor for my class?

No.

You can have only one destructor for a class Fred.  It's always called
Fred::~Fred().  It never takes any parameters, and it never returns anything.

You can't pass parameters to the destructor anyway, since you never explicitly
call a destructor[11.5] (well, almost never[11.10]).

==============================================================================

[11.5] Should I explicitly call a destructor on a local variable?

No!

The destructor will get called again at the close } of the block in which the
local was created.  This is a guarantee of the language; it happens
automagically; there's no way to stop it from happening.  But you can get
really bad results from calling a destructor on the same object a second time!
Bang! You're dead!

==============================================================================

[11.6] What if I want a local to "die" before the close } of the scope in which
       it was created? Can I call a destructor on a local if I really want to?

No! [For context, please read the previous FAQ[11.5]].

Suppose the (desirable) side effect of destructing a local File object is to
close the File.  Now suppose you have an object f of a class File and you want
File f to be closed before the end (i.e., the }) of the scope of object f:

    void someCode()
    {
      File f;

      // ... [Code that should execute while f is still open] ...

      // <--- We want the side-effect of f's destructor here!

      // ... [Code that should execute after f is closed] ...
    }

There is a simple solution to this problem[11.7].  But in the meantime,
remember: Do not explicitly call the destructor![11.5]

==============================================================================

[11.7] OK, OK already; I won't explicitly call the destructor of a local; but
       how do I handle the above situation?

[For context, please read the previous FAQ[11.6]].

Simply wrap the extent of the lifetime of the local in an artificial block {
... }:

    void someCode()
    {
      {
        File f;
        // ... [This code will execute when f is still open] ...
      }
    // ^-- f's destructor will automagically be called here!

      // ... [This code will execute after f is closed] ...
    }

==============================================================================

[11.8] What if I can't wrap the local in an artificial block?

Most of the time, you can limit the lifetime of a local by wrapping the local
in an artificial block ({ ...  })[11.7].  But if for some reason you can't do
that, add a member function that has an effect similar to that of the
destructor.  But do not call the destructor itself!

For example, in the case of class File, you might add a close() method.
Typically the destructor will simply call this close() method.  Note that the
close() method will need to mark the File object so a subsequent call won't
re-close an already-closed File.  E.g., it might set the fileHandle_ data
member to some nonsensical value such as -1, and it might check at the
beginning to see if the fileHandle_ is already equal to -1:

    class File {
    public:
      void close();
      ~File();
      // ...
    private:
      int fileHandle_;   // fileHandle_ >= 0 if/only-if it's open
    };

    File::~File()
    {
      close();
    }

    void File::close()
    {
      if (fileHandle_ >= 0) {
        // ... [Perform some operating-system call to close the file] ...
        fileHandle_ = -1;
      }
    }

Note that the other File methods may also need to check if the fileHandle_ is
-1 (i.e., check if the File is closed).

==============================================================================

[11.9] But can I explicitly call a destructor if I've allocated my object with
       new?

Probably not.

Unless you used placement new[11.10], you should simply delete the object
rather than explicitly calling the destructor.  For example, suppose you
allocated the object via a typical new expression:

    Fred* p = new Fred();

Then the destructor Fred::~Fred() will automagically get called when you delete
it via:

    delete p;  // Automagically calls p->~Fred()

You should not explicitly call the destructor, since doing so won't release the
memory that was allocated for the Fred object itself.  Remember: delete p does
two things[16.8]: it calls the destructor and it deallocates the memory.

==============================================================================

[11.10] What is "placement new" and why would I use it?

There are many uses of placement new.  The simplest use is to place an object
at a particular location in memory.  This is done by supplying the place as a
pointer parameter to the new part of a new expression:

    #include <new>        // Must #include this to use "placement new"
    #include "Fred.h"     // Declaration of class Fred

    void someCode()
    {
      char memory[sizeof(Fred)];     // Line #1
      void* place = memory;          // Line #2

      Fred* f = new(place) Fred();   // Line #3 (see "DANGER" below)
      // The pointers f and place will be equal

      // ...
    }

Line #1 creates an array of sizeof(Fred) bytes of memory, which is big enough
to hold a Fred object.  Line #2 creates a pointer place that points to the
first byte of this memory (experienced C programmers will note that this step
was unnecessary; it's there only to make the code more obvious).  Line #3
essentially just calls the constructor Fred::Fred().  The this pointer in the
Fred constructor will be equal to place.  The returned pointer f will therefore
be equal to place.

ADVICE: Don't use this "placement new" syntax unless you have to.  Use it only
when you really care that an object is placed at a particular location in
memory.  For example, when your hardware has a memory-mapped I/O timer device,
and you want to place a Clock object at that memory location.

DANGER: You are taking sole responsibility that the pointer you pass to the
"placement new" operator points to a region of memory that is big enough and is
properly aligned for the object type that you're creating.  Neither the
compiler nor the run-time system makes any attempt to check whether you did this
right.  If your Fred class needs to be aligned on a 4 byte boundary but you
supplied a location that isn't properly aligned, you can have a serious
disaster on your hands (if you don't know what "alignment" means, please don't
use the placement new syntax).  You have been warned.

You are also solely responsible for destructing the placed object.  This is
done by explicitly calling the destructor:

    void someCode()
    {
      char memory[sizeof(Fred)];
      void* p = memory;
      Fred* f = new(p) Fred();
      // ...
      f->~Fred();   // Explicitly call the destructor for the placed object
    }

This is about the only time you ever explicitly call a destructor.
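
Regarding the alignment DANGER above: one possible way to get suitably aligned
storage is the alignas specifier, which arrived in C++11 (well after this FAQ
was written).  The sketch below assumes a C++11 or later compiler, and its
tiny Fred class is a made-up stand-in for whatever "Fred.h" declares:

```cpp
#include <cassert>
#include <new>        // Standard header for placement new

// Hypothetical class standing in for the Fred declared in "Fred.h".
class Fred {
public:
  Fred() : value_(42) { }
  int value() const { return value_; }
private:
  int value_;
};

inline int placeAndDestroy()
{
  // alignas(Fred) asks for storage aligned suitably for a Fred object
  alignas(Fred) unsigned char memory[sizeof(Fred)];
  void* place = memory;

  Fred* f = new(place) Fred();
  assert(static_cast<void*>(f) == place);   // f and place are equal
  int result = f->value();
  f->~Fred();          // The placed object must be destructed explicitly
  return result;
}
```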

==============================================================================

[11.11] When I write a destructor, do I need to explicitly call the destructors
        for my member objects?

No.  You never need to explicitly call a destructor (except with placement
new[11.10]).

A class's destructor (whether or not you explicitly define one) automagically
invokes the destructors for member objects.  They are destroyed in the reverse
of the order in which they appear within the declaration for the class.

    class Member {
    public:
      ~Member();
      // ...
    };

    class Fred {
    public:
      ~Fred();
      // ...
    private:
      Member x_;
      Member y_;
      Member z_;
    };

    Fred::~Fred()
    {
      // Compiler automagically calls z_.~Member()
      // Compiler automagically calls y_.~Member()
      // Compiler automagically calls x_.~Member()
    }

==============================================================================

[11.12] When I write a derived class's destructor, do I need to explicitly call
        the destructor for my base class?

No.  You never need to explicitly call a destructor (except with placement
new[11.10]).

A derived class's destructor (whether or not you explicitly define one)
automagically invokes the destructors for base class subobjects.  Base classes
are destructed after member objects.  In the event of multiple inheritance,
direct base classes are destructed in the reverse order of their appearance in
the inheritance list.

    class Member {
    public:
      ~Member();
      // ...
    };

    class Base {
    public:
      virtual ~Base();     // A virtual destructor[20.4]
      // ...
    };

    class Derived : public Base {
    public:
      ~Derived();
      // ...
    private:
      Member x_;
    };

    Derived::~Derived()
    {
      // Compiler automagically calls x_.~Member()
      // Compiler automagically calls Base::~Base()
    }

Note: Order dependencies with virtual inheritance are trickier.  If you are
relying on order dependencies in a virtual inheritance hierarchy, you'll need a
lot more information than is in this FAQ.
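
For the simple (non-virtual-inheritance) case, the order can be checked with a
sketch like the following.  The log parameter and the traceDerived() function
are additions invented for this example; each destructor appends its class
name to the log:

```cpp
#include <cassert>
#include <string>

class Member {
public:
  Member(std::string& log) : log_(log) { }
  ~Member() { log_ += "Member "; }
private:
  std::string& log_;
};

class Base {
public:
  Base(std::string& log) : log_(log) { }
  virtual ~Base() { log_ += "Base "; }
private:
  std::string& log_;
};

class Derived : public Base {
public:
  Derived(std::string& log) : Base(log), x_(log), log_(log) { }
  ~Derived() { log_ += "Derived "; }
private:
  Member x_;
  std::string& log_;
};

inline std::string traceDerived()
{
  std::string log;
  Base* p = new Derived(log);
  delete p;       // Virtual destructor: ~Derived() body, then x_, then Base
  return log;
}

inline void demoDerivedOrder()
{
  assert(traceDerived() == "Derived Member Base ");
}
```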

==============================================================================

SECTION [12]: Assignment operators


[12.1] What is "self assignment"?

Self assignment is when someone assigns an object to itself.  For example,

    #include "Fred.hpp"    // Declares class Fred

    void userCode(Fred& x)
    {
      x = x;   // Self-assignment
    }

Obviously no one ever explicitly does a self assignment like the above, but
since more than one pointer or reference can point to the same object
(aliasing), it is possible to have self assignment without knowing it:

    #include "Fred.hpp"    // Declares class Fred

    void userCode(Fred& x, Fred& y)
    {
      x = y;   // Could be self-assignment if &x == &y
    }

    int main()
    {
      Fred z;
      userCode(z, z);
    }

==============================================================================

[12.2] Why should I worry about "self assignment"?

If you don't worry about self assignment[12.1], you'll expose your users to
some very subtle bugs that have very subtle and often disastrous symptoms.  For
example, the following class will cause a complete disaster in the case of
self-assignment:

    class Wilma { };

    class Fred {
    public:
      Fred()                : p_(new Wilma())      { }
      Fred(const Fred& f)   : p_(new Wilma(*f.p_)) { }
     ~Fred()                { delete p_; }
      Fred& operator= (const Fred& f)
        {
          // Bad code: Doesn't handle self-assignment!
          delete p_;                // Line #1
          p_ = new Wilma(*f.p_);    // Line #2
          return *this;
        }
    private:
      Wilma* p_;
    };

If someone assigns a Fred object to itself, line #1 deletes both this->p_ and
f.p_ since *this and f are the same object.  But line #2 uses *f.p_, which is
no longer a valid object.  This will likely cause a major disaster.

The bottom line is that you, the author of class Fred, are responsible for
making sure self-assignment on a Fred object is innocuous[12.3].  Do not assume
that users won't ever do that to your objects.  It is your fault if your object
crashes when it gets a self-assignment.

Aside: the above Fred::operator= (const Fred&) has a second problem: If an
exception is thrown[17] while evaluating new Wilma(*f.p_) (e.g., an
out-of-memory exception[16.5] or an exception in Wilma's copy
constructor[17.1]), this->p_ will be a dangling pointer -- it will point to
memory that is no longer valid.  This can be solved by allocating the new
objects before deleting the old objects.

==============================================================================

[12.3] OK, OK, already; I'll handle self-assignment.  How do I do it?

You should worry about self assignment every time you create a class[12.2].
This does not mean that you need to add extra code to all your classes: as long
as your objects gracefully handle self assignment, it doesn't matter whether
you had to add extra code or not.

If you do need to add extra code to your assignment operator, here's a simple
and effective technique:

    Fred& Fred::operator= (const Fred& f)
    {
      if (this == &f) return *this;   // Gracefully handle self assignment[12.1]

      // Put the normal assignment duties here...

      return *this;
    }

This explicit test isn't always necessary.  For example, if you were to fix the
assignment operator in the previous FAQ[12.2] to handle exceptions thrown by
new[16.5] and/or exceptions thrown by the copy constructor[17.1] of class
Wilma, you might produce the following code.  Note that this code has the
(pleasant) side effect of automatically handling self assignment as well:

    Fred& Fred::operator= (const Fred& f)
    {
      // This code gracefully (albeit implicitly) handles self assignment[12.1]
      Wilma* tmp = new Wilma(*f.p_);   // It would be OK if an exception[17] got thrown here
      delete p_;
      p_ = tmp;
      return *this;
    }

Some programmers want to add "if (this == &f) return *this;" to make self
assignment more efficient.  This is generally the wrong tradeoff.  If self
assignment only occurs once in a thousand times, the if would waste cycles
99.9% of the time (a test-and-branch can put a bubble in the pipeline of many
superscalar processors).
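
To see the copy-first version survive self assignment, here is a small
self-contained sketch.  The int stored in Wilma, the Fred(int) constructor and
the value() accessor are assumptions added only so the result can be checked:

```cpp
#include <cassert>

class Wilma {
public:
  Wilma(int v = 0) : value(v) { }
  int value;
};

class Fred {
public:
  Fred(int v)           : p_(new Wilma(v))     { }
  Fred(const Fred& f)   : p_(new Wilma(*f.p_)) { }
 ~Fred()                { delete p_; }
  Fred& operator= (const Fred& f)
    {
      Wilma* tmp = new Wilma(*f.p_);   // Copy first; p_ is untouched if this throws
      delete p_;
      p_ = tmp;
      return *this;
    }
  int value() const { return p_->value; }
private:
  Wilma* p_;
};

inline void demoSelfAssignment()
{
  Fred x(7);
  Fred& alias = x;
  x = alias;               // Self assignment through an alias
  assert(x.value() == 7);  // x still holds a valid Wilma

  Fred y(9);
  x = y;                   // Ordinary assignment still works
  assert(x.value() == 9);
}
```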

==============================================================================

SECTION [13]: Operator overloading


[13.1] What's the deal with operator overloading?

It allows you to provide an intuitive interface to users of your class.

Operator overloading allows C/C++ operators to have user-defined meanings on
user-defined types (classes).  Overloaded operators are syntactic sugar for
function calls:

    class Fred {
    public:
      // ...
    };

    #if 0

      // Without operator overloading:
      Fred add(Fred, Fred);
      Fred mul(Fred, Fred);

      Fred f(Fred a, Fred b, Fred c)
      {
        return add(add(mul(a,b), mul(b,c)), mul(c,a));    // Yuk...
      }

    #else

      // With operator overloading:
      Fred operator+ (Fred, Fred);
      Fred operator* (Fred, Fred);

      Fred f(Fred a, Fred b, Fred c)
      {
        return a*b + b*c + c*a;
      }

    #endif

==============================================================================

[13.2] What are the benefits of operator overloading?

By overloading standard operators on a class, you can exploit the intuition of
the users of that class.  This lets users program in the language of the
problem domain rather than in the language of the machine.

The ultimate goal is to reduce both the learning curve and the defect rate.

==============================================================================

[13.3] What are some examples of operator overloading?

Here are a few of the many examples of operator overloading:
 * myString + yourString might concatenate two string objects
 * myDate++ might increment a Date object
 * a * b might multiply two Number objects
 * a[i] might access an element of an Array object
 * x = *p might dereference a "smart pointer" that actually "points" to a disk
   record -- it could actually seek to the location on disk where p "points"
   and return the appropriate record into x
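
As one hedged illustration of the myDate++ case, here is a sketch of a Date
class.  Storing the date as a single day number is an assumption made to keep
the example short; a real Date class would know about months and years:

```cpp
#include <cassert>

// Hypothetical Date class; the day-number representation is illustrative.
class Date {
public:
  Date(int dayNumber) : day_(dayNumber) { }
  int dayNumber() const { return day_; }
  Date& operator++ ()      { ++day_; return *this; }                  // Prefix:  ++d
  Date  operator++ (int)   { Date old(*this); ++day_; return old; }   // Postfix: d++
private:
  int day_;   // Days since some fixed starting date
};

inline bool operator== (const Date& a, const Date& b)
{ return a.dayNumber() == b.dayNumber(); }

inline void demoDate()
{
  Date myDate(100);
  Date before = myDate++;          // Postfix yields the value before the increment
  assert(before == Date(100));
  assert(myDate == Date(101));
  ++myDate;
  assert(myDate.dayNumber() == 102);
}
```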

==============================================================================

[13.4] But operator overloading makes my class look ugly; isn't it supposed to
       make my code clearer?

Operator overloading makes life easier for the users of a class[13.2], not for
the developer of the class!

Consider the following example.

    class Array {
    public:
      int& operator[] (unsigned i);      // Some people don't like this syntax
      // ...
    };

    inline
    int& Array::operator[] (unsigned i)  // Some people don't like this syntax
    {
      // ...
    }

Some people don't like the keyword operator or the somewhat funny syntax that
goes with it in the body of the class itself.  But the operator overloading
syntax isn't supposed to make life easier for the developer of a class.  It's
supposed to make life easier for the users of the class:

    int main()
    {
      Array a;
      a[3] = 4;   // User code should be obvious and easy to understand...
    }

Remember: in a reuse-oriented world, there will usually be many people who use
your class, but there is only one person who builds it (yourself); therefore
you should do things that favor the many rather than the few.

==============================================================================

[13.5] What operators can/cannot be overloaded?

Most can be overloaded. The only C operators that can't be are . and ?: (and
sizeof, which is technically an operator).  C++ adds a few of its own
operators, most of which can be overloaded except :: and .*.

Here's an example of the subscript operator (it returns a reference).  The
"#if 0" branches show the code without operator overloading; the "#else"
branches show it with operator overloading:

    class Array {
    public:
      #if 0
        int& elem(unsigned i)        { if (i > 99) error(); return data[i]; }
      #else
        int& operator[] (unsigned i) { if (i > 99) error(); return data[i]; }
      #endif
    private:
      int data[100];
    };

    int main()
    {
      Array a;

      #if 0
        a.elem(10) = 42;
        a.elem(12) += a.elem(13);
      #else
        a[10] = 42;
        a[12] += a[13];
      #endif
    }

==============================================================================

[13.6] Can I overload operator== so it lets me compare two char[] using a
       string comparison?

No: at least one operand of any overloaded operator must be of some class type
(or enumeration type).

But even if C++ allowed you to do this, which it doesn't, you wouldn't want to
do it anyway since you really should be using a string-like class rather than
an array of char in the first place[17.3] since arrays are evil[21.5].
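
A minimal sketch of the difference, assuming the standard library's
std::string as the "string-like class" (the FAQ text doesn't name a particular
class; sameArray(), sameText(), and compareGreetings() are hypothetical
helpers):

```cpp
#include <cassert>
#include <string>

// Built-in == on two char arrays compares their addresses, not their contents.
inline bool sameArray(const char* a, const char* b) { return a == b; }

// == on a string class compares the characters.
inline bool sameText(const std::string& a, const std::string& b) { return a == b; }

inline void compareGreetings()
{
  char a[] = "hello";
  char b[] = "hello";
  assert(!sameArray(a, b));   // Two distinct arrays, so the addresses differ
  assert(sameText(a, b));     // Same characters, so the strings are equal
}
```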

==============================================================================

[13.7] Can I create an operator** for "to-the-power-of" operations?

Nope.

The names of, precedence of, associativity of, and arity of operators are fixed
by the language.  There is no operator** in C++, so you cannot create one for a
class type.

If you're in doubt, consider that x ** y is the same as x * (*y) (in other
words, the compiler assumes y is a pointer).  Besides, operator overloading is
just syntactic sugar for function calls.  Although this particular syntactic
sugar can be very sweet, it doesn't add anything fundamental.  I suggest you
overload pow(base,exponent) (a double precision version is in <math.h>).

By the way, operator^ can work for to-the-power-of, except it has the wrong
precedence and associativity.
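
Here is one way an overloaded pow(base,exponent) might look.  Number is a
hypothetical integer-like class invented for this sketch, and the
repeated-squaring loop is just one possible implementation:

```cpp
#include <cassert>

// Hypothetical integer-like class, for illustration only.
class Number {
public:
  Number(long v) : v_(v) { }
  long value() const { return v_; }
  friend Number operator* (Number a, Number b) { return Number(a.v_ * b.v_); }
private:
  long v_;
};

// pow() overloaded for Number, using repeated squaring.
inline Number pow(Number base, unsigned exponent)
{
  Number result(1);
  while (exponent != 0) {
    if (exponent & 1u)
      result = result * base;
    base = base * base;
    exponent >>= 1;
  }
  return result;
}

inline long twoToThe(unsigned n)
{
  assert(n < 31);   // Keep the result within a 32-bit long
  return pow(Number(2), n).value();
}
```

User code then says pow(x, 3) instead of the unavailable x ** 3.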

==============================================================================

[13.8] How do I create a subscript operator for a Matrix class?

Use operator() rather than operator[].

When you have multiple subscripts, the cleanest way to do it is with operator()
rather than with operator[].  The reason is that operator[] always takes
exactly one parameter, but operator() can take any number of parameters (in the
case of a rectangular matrix, two parameters are needed).

For example:

    class Matrix {
    public:
      Matrix(unsigned rows, unsigned cols);
      double& operator() (unsigned row, unsigned col);
      double  operator() (unsigned row, unsigned col) const;
      // ...
     ~Matrix();                              // Destructor
      Matrix(const Matrix& m);               // Copy constructor
      Matrix& operator= (const Matrix& m);   // Assignment operator
      // ...
    private:
      unsigned rows_, cols_;
      double* data_;
    };

    inline
    Matrix::Matrix(unsigned rows, unsigned cols)
      : rows_ (rows),
        cols_ (cols),
        data_ (new double[rows * cols])
    {
      if (rows == 0 || cols == 0)
        throw BadIndex("Matrix constructor has 0 size");
    }

    inline
    Matrix::~Matrix()
    {
      delete[] data_;
    }

    inline
    double& Matrix::operator() (unsigned row, unsigned col)
    {
      if (row >= rows_ || col >= cols_)
        throw BadIndex("Matrix subscript out of bounds");
      return data_[cols_*row + col];
    }

    inline
    double Matrix::operator() (unsigned row, unsigned col) const
    {
      if (row >= rows_ || col >= cols_)
        throw BadIndex("const Matrix subscript out of bounds");
      return data_[cols_*row + col];
    }

Then you can access an element of Matrix m using m(i,j) rather than m[i][j]:

    int main()
    {
      Matrix m(10,10);   // Matrix has no default constructor
      m(5,8) = 106.15;
      cout << m(5,8);
      // ...
    }

==============================================================================

[13.9] Should I design my classes from the outside (interfaces first) or from
       the inside (data first)?

From the outside!

A good interface provides a simplified view that is expressed in the vocabulary
of a user[7.3].  In the case of OO software, the interface is normally to a
class or a tight group of classes[14.2].

First think about what the object logically represents, not how you intend to
physically build it.  For example, suppose you have a Stack class that will be
built by containing a LinkedList:

    class Stack {
    public:
      // ...
    private:
      LinkedList list_;
    };

Should the Stack have a get() method that returns the LinkedList? Or a set()
method that takes a LinkedList? Or a constructor that takes a LinkedList?
Obviously the answer is No, since you should design your interfaces from the
outside-in.  I.e., users of Stack objects don't care about LinkedLists; they
care about pushing and popping.

Now for another example that is a bit more subtle.  Suppose class LinkedList is
built using a linked list of Node objects, where each Node object has a pointer
to the next Node:

    class Node { /*...*/ };

    class LinkedList {
    public:
      // ...
    private:
      Node* first_;
    };

Should the LinkedList class have a get() method that will let users access the
first Node? Should the Node object have a get() method that will let users
follow that Node to the next Node in the chain? In other words, what should a
LinkedList look like from the outside? Is a LinkedList really a chain of Node
objects? Or is that just an implementation detail? And if it is just an
implementation detail, how will the LinkedList let users access each of the
elements in the LinkedList one at a time?

One man's answer: A LinkedList is not a chain of Nodes.  That may be how it is
built, but that is not what it is.  What it is is a sequence of elements.
Therefore the LinkedList abstraction should provide a "LinkedListIterator"
class as well, and that "LinkedListIterator" might have an operator++ to go to
the next element, and it might have a get()/set() pair to access its value
stored in the Node (the value in the Node element is solely the responsibility
of the LinkedList user, which is why there is a get()/set() pair that allows
the user to freely manipulate that value).

Starting from the user's perspective, we might want our LinkedList class to
support operations that look similar to accessing an array using pointer
arithmetic:

    void userCode(LinkedList& a)
    {
      for (LinkedListIterator p = a.begin(); p != a.end(); ++p)
        cout << *p << '\n';
    }

To implement this interface, LinkedList will need a begin() method and an end()
method.  These return a "LinkedListIterator" object.  The "LinkedListIterator"
will need a method to go forward, ++p; a method to access the current element,
*p; and a comparison operator, p != a.end().

The code follows.  The key insight is that the LinkedList class does not have
any methods that let users access the Nodes.  Nodes are an implementation
technique that is completely buried.  The LinkedList class could have its
internals replaced with a doubly linked list, or even an array, and the only
difference would be some performance differences with the prepend(elem) and
append(elem) methods.

    #include <assert.h>   // Poor man's exception handling

    typedef  int  bool;   // Only for compilers that lack a built-in bool

    class LinkedListIterator;
    class LinkedList;

    class Node {
      // No public members; this is a "private class"
      friend class LinkedListIterator;   // A friend class[14]
      friend class LinkedList;
      Node* next_;
      int elem_;
    };

    class LinkedListIterator {
    public:
      bool operator== (LinkedListIterator i) const;
      bool operator!= (LinkedListIterator i) const;
      void operator++ ();   // Go to the next element
      int& operator*  ();   // Access the current element
    private:
      friend class LinkedList;   // So LinkedList can construct an iterator
      LinkedListIterator(Node* p);
      Node* p_;
    };

    class LinkedList {
    public:
      void append(int elem);    // Adds elem after the end
      void prepend(int elem);   // Adds elem before the beginning
      // ...
      LinkedListIterator begin();
      LinkedListIterator end();
      // ...
    private:
      Node* first_;
    };

Here are the methods that are obviously inlinable (probably in the same header
file):

    inline bool LinkedListIterator::operator== (LinkedListIterator i) const
    {
      return p_ == i.p_;
    }

    inline bool LinkedListIterator::operator!= (LinkedListIterator i) const
    {
      return p_ != i.p_;
    }

    inline void LinkedListIterator::operator++()
    {
      assert(p_ != NULL);  // or if (p_==NULL) throw ...
      p_ = p_->next_;
    }

    inline int& LinkedListIterator::operator*()
    {
      assert(p_ != NULL);  // or if (p_==NULL) throw ...
      return p_->elem_;
    }

    inline LinkedListIterator::LinkedListIterator(Node* p)
      : p_(p)
    { }

    inline LinkedListIterator LinkedList::begin()
    {
      return first_;
    }

    inline LinkedListIterator LinkedList::end()
    {
      return NULL;
    }
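
For completeness, here is a condensed, self-contained sketch of the same
design that also fills in the parts the listing above leaves out.  The Node
constructor, the LinkedList constructor and destructor, the bodies of
prepend() and append(), the disabled copy operations, and the sum() helper are
all assumptions added for illustration; the FAQ's own listing doesn't show
them:

```cpp
#include <cassert>
#include <cstddef>

class LinkedListIterator;
class LinkedList;

class Node {
  // No public members; this is a "private class"
  friend class LinkedListIterator;
  friend class LinkedList;
  Node(int elem, Node* next) : next_(next), elem_(elem) { }
  Node* next_;
  int elem_;
};

class LinkedListIterator {
public:
  bool operator== (LinkedListIterator i) const { return p_ == i.p_; }
  bool operator!= (LinkedListIterator i) const { return p_ != i.p_; }
  void operator++ () { assert(p_ != NULL); p_ = p_->next_; }
  int& operator*  () { assert(p_ != NULL); return p_->elem_; }
private:
  friend class LinkedList;
  LinkedListIterator(Node* p) : p_(p) { }
  Node* p_;
};

class LinkedList {
public:
  LinkedList() : first_(NULL) { }
  ~LinkedList();
  void prepend(int elem) { first_ = new Node(elem, first_); }
  void append(int elem);
  LinkedListIterator begin() { return LinkedListIterator(first_); }
  LinkedListIterator end()   { return LinkedListIterator(NULL); }
private:
  LinkedList(const LinkedList&);              // Copying is not supported
  LinkedList& operator= (const LinkedList&);  // in this sketch
  Node* first_;
};

inline void LinkedList::append(int elem)
{
  Node** link = &first_;            // Walk to the final next_ pointer
  while (*link != NULL)
    link = &(*link)->next_;
  *link = new Node(elem, NULL);
}

inline LinkedList::~LinkedList()
{
  while (first_ != NULL) {
    Node* dead = first_;
    first_ = first_->next_;
    delete dead;
  }
}

// User code: sums the elements using only the iterator interface.
inline int sum(LinkedList& a)
{
  int total = 0;
  for (LinkedListIterator p = a.begin(); p != a.end(); ++p)
    total += *p;
  return total;
}
```

Note that sum() never mentions Node, so the list's internals could be swapped
out without touching it.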

Conclusion: The linked list had two different kinds of data.  The values of the
elements stored in the linked list are the responsibility of the user of the
linked list (and only the user; the linked list itself makes no attempt to
prohibit users from changing the third element to 5), and the linked list's
infrastructure data (next pointers, etc.), whose values are the responsibility
of the linked list (and only the linked list; e.g., the linked list does not
let users change (or even look at!) the various next pointers).

Thus the only get()/set() methods were to get and set the elements of the
linked list, but not the infrastructure of the linked list.  Since the linked
list hides the infrastructure pointers/etc., it is able to make very strong
promises regarding that infrastructure (e.g., if it was a doubly linked list,
it might guarantee that every forward pointer was matched by a backwards
pointer from the next Node).

So, we see here an example where the values of some of a class's data are the
responsibility of users (in which case the class needs to have get()/set()
methods for that data) but the data that the class wants to control does not
necessarily have get()/set() methods.

==============================================================================

SECTION [14]: Friends


[14.1] What is a friend?

Something to allow your class to grant access to another class or function.

Friends can be either functions or other classes.  A class grants access
privileges to its friends.  Normally a developer has political and technical
control over both the friend and member functions of a class (else you may need
to get permission from the owner of the other pieces when you want to update
your own class).

==============================================================================

[14.2] Do friends violate encapsulation?

If they're used properly, they actually enhance encapsulation.

You often need to split a class in half when the two halves will have different
numbers of instances or different lifetimes.  In these cases, the two halves
usually need direct access to each other (the two halves used to be in the same
class, so you haven't increased the amount of code that needs direct access to
a data structure; you've simply reshuffled the code into two classes instead of
one).  The safest way to implement this is to make the two halves friends of
each other.

If you use friends as just described, you'll keep private things private.
People who don't understand this often make naive efforts to avoid using
friendship in situations like the above, and often they actually destroy
encapsulation.  They either use public data (grotesque!), or they make the data
accessible between the halves via public get() and set() member functions.
Having a public get() and set() member function for a private datum is OK only
when the private datum "makes sense" from outside the class (from a user's
perspective).  In many cases, these get()/set() member functions are almost as
bad as public data: they hide (only) the name of the private datum, but they
don't hide the existence of the private datum.

Similarly, if you use friend functions as a syntactic variant of a class's
public: access functions, they don't violate encapsulation any more than a
member function violates encapsulation.  In other words, a class's friends
don't violate the encapsulation barrier: along with the class's member
functions, they are the encapsulation barrier.

==============================================================================

[14.3] What are some advantages/disadvantages of using friend functions?

They provide a degree of freedom in the interface design options.

Member functions and friend functions are equally privileged (100% vested).
The major difference is that a friend function is called like f(x), while a
member function is called like x.f().  Thus the ability to choose between
member functions (x.f()) and friend functions (f(x)) allows a designer to
select the syntax that is deemed most readable, which lowers maintenance costs.

The major disadvantage of friend functions is that they require an extra line
of code when you want dynamic binding.  To get the effect of a virtual friend,
the friend function should call a hidden (usually protected:) virtual[20]
member function.  This is called the Virtual Friend Function Idiom[15.6].  For
example:

    class Base {
    public:
      friend void f(Base& b);
      // ...
    protected:
      virtual void do_f();
      // ...
    };

    inline void f(Base& b)
    {
      b.do_f();
    }

    class Derived : public Base {
    public:
      // ...
    protected:
      virtual void do_f();  // "Override" the behavior of f(Base& b)
      // ...
    };

    void userCode(Base& b)
    {
      f(b);
    }

The statement f(b) in userCode(Base&) will invoke b.do_f(), which is
virtual[20].  This means that Derived::do_f() will get control if b is actually
an object of class Derived.  Note that Derived overrides the behavior of the
protected: virtual[20] member function do_f(); it does not have its own
variation of the friend function, f(Base&).

==============================================================================

[14.4] What does it mean that "friendship is neither inherited nor transitive"?

I may declare you as my friend, but that doesn't mean I necessarily trust
either your kids or your friends.
 * I don't necessarily trust the kids of my friends.  The privileges of
   friendship aren't inherited.  Derived classes of a friend aren't necessarily
   friends.  If class Fred declares that class Base is a friend, classes
   derived from Base don't have any automatic special access rights to Fred
   objects.
 * I don't necessarily trust the friends of my friends.  The privileges of
   friendship aren't transitive.  A friend of a friend isn't necessarily a
   friend.  If class Fred declares class Wilma as a friend, and class Wilma
   declares class Betty as a friend, class Betty doesn't necessarily have any
   special access rights to Fred objects.
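
The two rules can be sketched in code as follows.  The class names come from
the text above; the members (secret_, peek(), viaBase()) and the helper
readSecretThreeWays() are hypothetical.  The commented-out lines are the ones
a compiler would be expected to reject:

```cpp
#include <cassert>

class Fred {
public:
  Fred() : secret_(42) { }
private:
  friend class Base;     // Fred trusts Base...
  friend class Wilma;    // ...and Wilma, and nobody else
  int secret_;
};

class Base {
public:
  int peek(const Fred& f) const { return f.secret_; }    // OK: Base is a friend
};

class Derived : public Base {
public:
  // int sneak(const Fred& f) const { return f.secret_; }   // Error: not inherited
  int viaBase(const Fred& f) const { return peek(f); }   // OK: goes through Base
};

class Betty;

class Wilma {
  friend class Betty;    // Wilma trusts Betty; this says nothing about Fred
public:
  int peek(const Fred& f) const { return f.secret_; }    // OK: Wilma is a friend
};

class Betty {
public:
  // int sneak(const Fred& f) const { return f.secret_; }   // Error: not transitive
};

inline int readSecretThreeWays()
{
  Fred f;
  Base b;
  Derived d;
  Wilma w;
  assert(b.peek(f) == w.peek(f));   // Both friends see the same value
  return d.viaBase(f);              // Derived must ask Base to look
}
```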

==============================================================================

[14.5] Should my class declare a member function or a friend function?

Use a member when you can, and a friend when you have to.

Sometimes friends are syntactically better (e.g., in class Fred, friend
functions allow the Fred parameter to be second, while members require it to be
first).  Another good use of friend functions is the binary infix arithmetic
operators.  E.g., aComplex + aComplex should be defined as a friend rather than
a member if you want to allow aFloat + aComplex as well (member functions don't
allow promotion of the left hand argument, since that would change the class of
the object that is the recipient of the member function invocation).

In other cases, choose a member function over a friend function.
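
A minimal sketch of the aFloat + aComplex case, assuming a hypothetical
Complex class whose constructor doubles as a conversion from a floating point
value (the member names and addBothWays() are invented for this example):

```cpp
#include <cassert>

class Complex {
public:
  Complex(double re, double im = 0.0) : re_(re), im_(im) { }  // Also a conversion
  double real() const { return re_; }
  double imag() const { return im_; }
  friend Complex operator+ (const Complex& a, const Complex& b);
private:
  double re_, im_;
};

inline Complex operator+ (const Complex& a, const Complex& b)
{
  return Complex(a.re_ + b.re_, a.im_ + b.im_);
}

inline Complex addBothWays(float x, const Complex& c)
{
  Complex left  = x + c;   // Left operand is promoted to Complex
  Complex right = c + x;   // Right operand is promoted to Complex
  assert(left.real() == right.real() && left.imag() == right.imag());
  return left;
}
```

Had operator+ been a member, c + x would still compile but x + c would not,
since a float has no member functions to call.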

==============================================================================

SECTION [15]: Input/output via <iostream.h> and <stdio.h>


[15.1] Why should I use <iostream.h> instead of the traditional <stdio.h>?

Increase type safety, reduce errors, improve performance, allow extensibility,
and provide subclassability.

printf() is arguably not broken, and scanf() is perhaps livable despite being
error prone; however, both are limited with respect to what C++ I/O can do.  C++
I/O (using << and >>) is, relative to C (using printf() and scanf()):
 * Better type safety: With <iostream.h>, the type of object being I/O'd is
   known statically by the compiler.  In contrast, <stdio.h> uses "%" fields to
   figure out the types dynamically.
 * Less error prone: With <iostream.h>, there are no redundant "%" tokens that
   have to be consistent with the actual objects being I/O'd.  Removing
   redundancy removes a class of errors.
 * Extensible: The C++ <iostream.h> mechanism allows new user-defined types to
   be I/O'd without breaking existing code.  (Imagine the chaos if everyone was
   simultaneously adding new incompatible "%" fields to printf() and
   scanf()?!)
 * Subclassable: The C++ <iostream.h> mechanism is built from real classes such
   as ostream and istream.  Unlike <stdio.h>'s FILE*, these are real classes
   and hence subclassable.  This means you can have other user-defined things
   that look and act like streams, yet that do whatever strange and wonderful
   things you want.  You automatically get to use the zillions of lines of I/O
   code written by users you don't even know, and they don't need to know about
   your "extended stream" class.
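
The "extensible" and "subclassable" points above can be sketched together.
This example assumes the standard header names (<ostream>, <sstream>) and
namespace std rather than the <iostream.h> family used elsewhere in this
section; Point, label(), and labelToString() are hypothetical:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct Point { int x, y; };   // Hypothetical user-defined type

// Extensible: teach the stream mechanism about Point without touching it.
inline std::ostream& operator<< (std::ostream& o, const Point& p)
{
  return o << '(' << p.x << ',' << p.y << ')';
}

// Written against ostream only, so it works with any class derived from it.
inline void label(std::ostream& o, const Point& p)
{
  o << "p=" << p;
}

inline std::string labelToString(const Point& p)
{
  std::ostringstream out;   // A string stream is derived from ostream
  label(out, p);
  assert(out.good());
  return out.str();
}
```

label() has no idea whether it is writing to a file, the console, or a string;
that is the payoff of streams being real classes.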

==============================================================================

[15.2] Why does my input seem to process past the end of file?

Because the eof state is not set until after a read is attempted past the end
of file.  That is, reading the last byte from a file does not set the eof
state.

For example, the following code has an off-by-one error with the count i:

    int i = 0;
    while (! cin.eof()) {   // WRONG!
      cin >> x;
      ++i;
      // Work with x ...
    }

What you really need is:

    int i = 0;
    while (cin >> x) {      // RIGHT!
      ++i;
      // Work with x ...
    }
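
To see the off-by-one error without typing at a terminal, the two loops can be
run against a string stream.  This sketch assumes the standard <sstream>
header; the function names are hypothetical, and the input is deliberately
clean, whitespace-separated integers ending in a newline:

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// WRONG: tests eof() before the read that would discover it.
inline int countWithEofTest(std::istream& in)
{
  int x = 0, i = 0;
  while (!in.eof()) {
    in >> x;
    ++i;
  }
  return i;
}

// RIGHT: the read itself is the loop condition.
inline int countWithReadTest(std::istream& in)
{
  int x = 0, i = 0;
  while (in >> x)
    ++i;
  return i;
}

inline void compareCounts()
{
  std::istringstream a("1 2 3\n");
  std::istringstream b("1 2 3\n");
  assert(countWithEofTest(a) == 4);    // One too many
  assert(countWithReadTest(b) == 3);
}
```

After the 3 is read, only the newline remains, so eof is still not set and the
first loop goes around once more.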

==============================================================================

[15.3] Why is my program ignoring my input request after the first iteration?

Because the numerical extractor leaves non-digits behind in the input buffer.

For example, suppose your code looks like this:

    char name[1000];
    int age;

    for (;;) {
      cout << "Name: ";
      cin >> name;
      cout << "Age: ";
      cin >> age;
    }

What you really want is:

    for (;;) {
      cout << "Name: ";
      cin >> name;
      cout << "Age: ";
      cin >> age;
      cin.ignore(INT_MAX, '\n');   // INT_MAX comes from <limits.h>
    }
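
The effect can be reproduced with canned input.  In this sketch (standard
<sstream> and <string> assumed; readRecords() and compareReads() are
hypothetical) the first age is followed by stray letters, which is exactly the
kind of leftover the ignore() call throws away:

```cpp
#include <cassert>
#include <climits>
#include <istream>
#include <sstream>
#include <string>

// Reads name/age pairs; returns how many complete pairs were read.
inline int readRecords(std::istream& in, bool skipRestOfLine)
{
  std::string name;
  int age = 0;
  int count = 0;
  while (in >> name >> age) {
    ++count;
    if (skipRestOfLine)
      in.ignore(INT_MAX, '\n');   // Discard whatever follows the digits
  }
  return count;
}

inline void compareReads()
{
  std::istringstream a("Fred\n42abc\nWilma\n37\n");
  std::istringstream b("Fred\n42abc\nWilma\n37\n");
  assert(readRecords(a, false) == 1);   // "abc" is taken as the next name
  assert(readRecords(b, true)  == 2);   // Leftovers skipped; both pairs read
}
```

Without the ignore(), "abc" becomes the second name, "Wilma" then fails to
parse as an age, and the stream stops cooperating.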

==============================================================================

[15.4] How can I provide printing for my class Fred?

Use operator overloading[13] to provide a friend[14] left-shift operator,
operator<<.

    #include <iostream.h>

    class Fred {
    public:
      friend ostream& operator<< (ostream& o, const Fred& fred);
      // ...
    private:
      int i_;    // Just for illustration
    };

    ostream& operator<< (ostream& o, const Fred& fred)
    {
      return o << fred.i_;
    }

    int main()
    {
      Fred f;
      cout << "My Fred object: " << f << "\n";
    }

We use a friend[14] rather than a member since the Fred parameter is second
rather than first.

==============================================================================

[15.5] How can I provide input for my class Fred?

Use operator overloading[13] to provide a friend[14] right-shift operator,
operator>>.  This is similar to the output operator[15.4], except the parameter
doesn't have a const[18]: "Fred&" rather than "const Fred&".

    #include <iostream.h>

    class Fred {
    public:
      friend istream& operator>> (istream& i, Fred& fred);
      // ...
    private:
      int i_;    // Just for illustration
    };

    istream& operator>> (istream& i, Fred& fred)
    {
      return i >> fred.i_;
    }

    int main()
    {
      Fred f;
      cout << "Enter a Fred object: ";
      cin >> f;
      // ...
    }

==============================================================================

[15.6] How can I provide printing for an entire hierarchy of classes?

Provide a friend[14] operator<<[15.4] that calls a protected virtual[20]
function:

    class Base {
    public:
      friend ostream& operator<< (ostream& o, const Base& b);
      // ...
    protected:
      virtual void print(ostream& o) const;
    };

    inline ostream& operator<< (ostream& o, const Base& b)
    {
      b.print(o);
      return o;
    }

    class Derived : public Base {
    protected:
      virtual void print(ostream& o) const;
    };

The end result is that operator<< acts as if it was dynamically bound, even
though it's a friend[14] function.  This is called the Virtual Friend Function
Idiom.

Note that derived classes override print(ostream&) const.  In particular, they
do not provide their own operator<<.

Naturally if Base is an ABC[22.3], Base::print(ostream&) const can be declared
pure virtual[22.4] using the "= 0" syntax.
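
Here is a small sketch of the pure virtual variant.  Shape, Circle, Square,
and describe() are hypothetical names, and the example assumes the standard
<ostream> and <sstream> headers rather than <iostream.h>:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Shape {                       // Hypothetical ABC
public:
  virtual ~Shape() { }
  friend std::ostream& operator<< (std::ostream& o, const Shape& s);
protected:
  virtual void print(std::ostream& o) const = 0;   // Pure virtual
};

inline std::ostream& operator<< (std::ostream& o, const Shape& s)
{
  s.print(o);                       // Dynamically bound
  return o;
}

class Circle : public Shape {
protected:
  virtual void print(std::ostream& o) const { o << "Circle"; }
};

class Square : public Shape {
protected:
  virtual void print(std::ostream& o) const { o << "Square"; }
};

// Sees only a Shape&, yet gets the derived class's output.
inline std::string describe(const Shape& s)
{
  std::ostringstream out;
  out << s;
  assert(out.good());
  return out.str();
}
```

Neither Circle nor Square declares an operator<< of its own; the single friend
of Shape serves the whole hierarchy.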

==============================================================================

[15.7] How can I "reopen" cin and cout in binary mode under DOS and/or OS/2?

This is implementation dependent.  Check with your compiler's documentation.

For example, suppose you want to do binary I/O using cin and cout.  Suppose
further that your operating system (such as DOS or OS/2) insists on translating
"\r\n" into "\n" on input from cin, and "\n" to "\r\n" on output to cout or
cerr.

Unfortunately there is no standard way to cause cin, cout, and/or cerr to be
opened in binary mode.  Closing the streams and attempting to reopen them in
binary mode might have unexpected or undesirable results.

On systems where it makes a difference, the implementation might provide a way
to make them binary streams, but you would have to check the manuals to find
out.

==============================================================================

[15.8] Why can't I open a file in a different directory such as "..\test.dat"?

Because "\t" is a tab character.

You should use forward slashes in your filenames, even on an operating system
that uses backslashes such as DOS, Windows, OS/2, etc.  For example:

    #include <iostream.h>
    #include <fstream.h>

    int main()
    {
      #if 1
        ifstream file("../test.dat");     // RIGHT!
      #else
        ifstream file("..\test.dat");     // WRONG!
      #endif

      // ...
    }

Remember, the backslash ("\") is used in string literals to create special
characters: "\n" is a newline, "\b" is a backspace, "\t" is a tab, "\a" is an
"alert", "\v" is a vertical-tab, etc.  Therefore the file name
"\version\next\alpha\beta\test.dat" is interpreted as a bunch of very funny
characters; use "/version/next/alpha/beta/test.dat" instead, even on systems
that use a "\" as the directory separator such as DOS, Windows, OS/2, etc.
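
If you want to convince yourself, a few lines of code will show what the
compiler actually stored.  containsTab() and checkFileNames() are hypothetical
helpers written for this sketch:

```cpp
#include <cassert>
#include <cstring>

// True if the string holds a tab character anywhere.
inline bool containsTab(const char* s)
{
  return std::strchr(s, '\t') != NULL;
}

inline void checkFileNames()
{
  assert(containsTab("..\test.dat"));          // "\t" became a single tab
  assert(std::strlen("..\test.dat") == 10);    // ...so one character is gone
  assert(!containsTab("../test.dat"));
  assert(std::strlen("../test.dat") == 11);
}
```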

==============================================================================

-- 
Marshall Cline, Ph.D., President, Paradigm Shift, Inc.
315-353-6100 (voice)
315-353-6110 (fax)
mailto:cline@parashift.com
