Subject: C++ FAQ (#5 of 7)
Date: 6 Nov 1996 23:24:01 GMT
Summary: Please read this before posting to comp.lang.c++

Archive-name: C++-faq/part5
Posting-Frequency: monthly
URL: http://www.cerfnet.com/~mpcline/C++-FAQs-Lite/

AUTHOR: Marshall Cline / cline@parashift.com / Paradigm Shift, Inc. /
One Park St. / Norwood, NY 13668 / 315-353-6100 (voice) / 315-353-6110 (fax)

COPYRIGHT: This posting is part of "C++ FAQs Lite."  The entire "C++ FAQs Lite"
document is Copyright(C) 1991-96 Marshall P. Cline, Ph.D., cline@parashift.com.
All rights reserved.  Copying is permitted only under designated situations.
For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS.  THE AUTHOR PROVIDES NO
WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING
WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR
PURPOSE.

C++-FAQs-Lite != C++-FAQs-Book: This document, C++ FAQs Lite, is not the same
as the C++ FAQs Book.  The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is
500% larger than this document, and is available in bookstores.  For details,
see section [3].

==============================================================================

SECTION [21]: Inheritance -- proper inheritance and substitutability


[21.1] Should I hide member functions that were public in my base class?

Never, never, never do this.  Never.  Never!

Attempting to hide (eliminate, revoke, privatize) inherited public member
functions is an all-too-common design error.  It usually stems from muddy
thinking.

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)
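
A minimal sketch of why the "hiding" doesn't even work, assuming two
hypothetical classes Account and FrozenAccount (neither name comes from this
FAQ).  The derived class declares its override private, yet user code that
holds an Account& can still call it, because access is checked against the
static type of the reference:

```cpp
#include <string>

class Account {                        // hypothetical base class
public:
  virtual ~Account() { }
  virtual std::string close() { return "Account::close"; }
};

class FrozenAccount : public Account { // hypothetical derived class
private:
  // Attempt to "revoke" the inherited public member function
  virtual std::string close() { return "FrozenAccount::close"; }
};

// User code written against the base class interface
std::string userCode(Account& a)
{
  return a.close();  // access is checked against Account, where close() is public
}
```

Passing a FrozenAccount to userCode() invokes the supposedly private override,
so nothing was actually hidden; the only effect is an interface that behaves
differently depending on how the object is referred to.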

==============================================================================

[21.2] Derived* --> Base* works OK; why doesn't Derived** --> Base** work?

C++ allows a Derived* to be converted to a Base*, since a Derived object is a
kind of a Base object.  However trying to convert a Derived** to a Base** is
flagged as an error.  Although this error may not be obvious, it is nonetheless
a good thing.  For example, if you could convert a Car** to a Vehicle**, and if
you could similarly convert a NuclearSubmarine** to a Vehicle**, you could
assign those two pointers and end up making a Car* point at a NuclearSubmarine:

    class Vehicle                           { /*...*/ };
    class Car              : public Vehicle { /*...*/ };
    class NuclearSubmarine : public Vehicle { /*...*/ };

    int main()
    {
      Car   car;
      Car*  carPtr = &car;
      Car** carPtrPtr = &carPtr;
      Vehicle** vehiclePtrPtr = carPtrPtr;  // This is an error in C++
      NuclearSubmarine  sub;
      NuclearSubmarine* subPtr = &sub;
      *vehiclePtrPtr = subPtr;
      // This last line would have caused carPtr to point to sub !
    }

In other words, if it were legal to convert a Derived** to a Base**, the Base**
could be dereferenced (yielding a Base*), and the Base* could be made to point
to an object of a different derived class, which could cause serious problems
for national security (who knows what would happen if you invoked the
openGasCap() member function on what you thought was a Car, but in reality it
was a NuclearSubmarine!).

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[21.3] Is a parking-lot-of-Car a kind-of parking-lot-of-Vehicle?

Nope.

I know it sounds strange, but it's true.  You can think of this as a direct
consequence of the previous FAQ, or you can reason it this way: if the kind-of
relationship were valid, then someone could point a parking-lot-of-Vehicle
pointer at a parking-lot-of-Car.  But parking-lot-of-Vehicle has an
addNewVehicleToParkingLot(Vehicle&) member function which can add any Vehicle
object to the parking lot.  This would allow you to park a NuclearSubmarine in
a parking-lot-of-Car.  Certainly it would be surprising if someone removed what
they thought was a Car from the parking-lot-of-Car, only to find that it is
actually a NuclearSubmarine.

Another way to say this truth: a container of Thing is not a kind-of container
of Anything even if a Thing is a kind-of an Anything.  Swallow hard; it's true.

You don't have to like it.  But you do have to accept it.

One last example which we use in our OO/C++ training courses: "A Bag-of-Apple
is not a kind-of Bag-of-Fruit." If a Bag-of-Apple could be passed as a
Bag-of-Fruit, someone could put a Banana into the Bag, even though it is
supposed to only contain Apples!
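
A small sketch of the Bag-of-Apple point, assuming std::vector stands in for
the Bag and assuming a C++11 (or later) compiler for static_assert and
<type_traits>.  The helper function name is made up for illustration:

```cpp
#include <type_traits>
#include <vector>

class Fruit { public: virtual ~Fruit() { } };
class Apple : public Fruit { };

// An Apple* converts to a Fruit*...
static_assert(std::is_convertible<Apple*, Fruit*>::value,
              "Apple is a kind-of Fruit");

// ...but a pointer to a bag of Apple does not convert to a pointer to a bag
// of Fruit.
static_assert(!std::is_convertible<std::vector<Apple>*,
                                   std::vector<Fruit>*>::value,
              "Bag-of-Apple is not a kind-of Bag-of-Fruit");

// Hypothetical helper so the same fact can be inspected at run time
bool bagOfAppleIsBagOfFruit()
{
  return std::is_convertible<std::vector<Apple>&,
                             std::vector<Fruit>&>::value;
}
```

The compiler rejects the conversion for the same reason given above: if it
were allowed, someone could push a Banana into what is really a bag of Apple.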

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[21.4] Is an array of Derived a kind-of array of Base?

Nope.

This is a corollary of the previous FAQ.  Unfortunately this one can get you
into a lot of hot water.  Consider this:

    class Base {
    public:
      virtual void f();             // 1
    };

    class Derived : public Base {
    public:
      // ...
    private:
      int i_;                       // 2
    };

    void userCode(Base* arrayOfBase)
    {
      arrayOfBase[1].f();           // 3
    }

    int main()
    {
      Derived arrayOfDerived[10];   // 4
      userCode(arrayOfDerived);     // 5
    }

The compiler thinks this is perfectly type-safe.  Line 5 converts a Derived* to
a Base*.  But in reality it is horrendously evil: since Derived is larger than
Base, the pointer arithmetic done on line 3 is incorrect: the compiler uses
sizeof(Base) when computing the address for arrayOfBase[1], yet the array is an
array of Derived, which means the address computed on line 3 (and the
subsequent invocation of member function f()) isn't even at the beginning of
any object! It's smack in the middle of a Derived object.  Assuming your
compiler uses the usual approach to virtual[20] functions, this will
reinterpret the int i_ of the first Derived as if it pointed to a virtual
table, it will follow that "pointer" (which at this point means we're digging
stuff out of a random memory location), and grab one of the first few words of
memory at that location and interpret them as if they were the address of a C++
member function, then load that (random memory location) into the instruction
pointer and begin grabbing machine instructions from that memory location.  The
chances of this crashing are very high.

The root problem is that C++ can't distinguish between a pointer-to-a-thing and
a pointer-to-an-array-of-things.  Naturally C++ "inherited" this feature from
C.

NOTE: If we had used an array-like class (e.g., vector<Derived> from STL[32.1])
instead of using a raw array, this problem would have been properly trapped as
an error at compile time rather than a run-time disaster.
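
A sketch of the safer approach, assuming the standard vector class and a
slightly modified Base/Derived pair (here f() returns an int so the result can
be inspected; that is an assumption made for this example only).  userCode()
takes a vector of Base pointers, so every element is a whole pointer and the
indexing is always sized correctly:

```cpp
#include <cstddef>
#include <vector>

class Base {
public:
  virtual ~Base() { }
  virtual int f() const { return 1; }
};

class Derived : public Base {
public:
  Derived() : i_(2) { }
  virtual int f() const { return i_; }
private:
  int i_;                       // makes Derived larger than Base
};

int userCode(const std::vector<Base*>& v)
{
  return v[1]->f();             // indexes pointers, never Derived objects
}

int demo()                      // hypothetical driver function
{
  std::vector<Derived> derived(10);
  // userCode(derived);         // compile-time error: vector<Derived> is not
                                // a vector<Base*> (nor a vector<Base>)
  std::vector<Base*> ptrs;
  for (std::size_t i = 0; i < derived.size(); ++i)
    ptrs.push_back(&derived[i]);
  return userCode(ptrs);        // dispatches to Derived::f()
}
```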

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[21.5] Does array-of-Derived is-not-a-kind-of array-of-Base mean arrays are
       bad?

Yes, arrays are evil (only half kidding).

Seriously, arrays are very closely related to pointers, and pointers are
notoriously difficult to deal with.  But if you have a complete grasp of why
the above few FAQs were a problem from a design perspective (e.g., if you
really know why a container of Thing is not a kind-of container of Anything),
and if you think everyone else who will be maintaining your code also has a
full grasp on these OO design truths, then you should feel free to use arrays.
But if you're like most people, you should use a template container class such
as vector<T> from STL[32.1] rather than raw arrays.

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[21.6] Is a Circle a kind-of an Ellipse?

Not if Ellipse promises to be able to change its size asymmetrically.

For example, suppose Ellipse has a setSize(x,y) member function, and suppose
this member function promises the Ellipse's width() will be x, and its height()
will be y.  In this case, Circle can't be a kind-of Ellipse.  Simply put, if
Ellipse can do something Circle can't, then Circle can't be a kind of Ellipse.

This leaves two potential (valid) relationships between Circle and Ellipse:
 * Make Circle and Ellipse completely unrelated classes
 * Derive Circle and Ellipse from a base class representing "Ellipses that
   can't necessarily perform an unequal-setSize() operation"

In the first case, Ellipse could be derived from class AsymmetricShape, and
setSize(x,y) could be introduced in AsymmetricShape.  However Circle could be
derived from SymmetricShape which has a setSize(size) member function.

In the second case, class Oval could only have setSize(size) which sets both
the width() and the height() to size.  Ellipse and Circle could both inherit
from Oval.  Ellipse --but not Circle-- could add the setSize(x,y) operation
(but beware of the hiding rule[23.3] if the same member function name setSize()
is used for both operations).
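
Here is one way the second case might look.  This is only a sketch: the
protected helper setBoth() and the data member names are assumptions made for
illustration, and the using-declaration is one way to sidestep the hiding
rule when the name setSize() is reused:

```cpp
class Oval {
public:
  Oval() : width_(1), height_(1) { }
  virtual ~Oval() { }
  // Promise: afterwards width() == size and height() == size
  void setSize(double size) { width_ = size; height_ = size; }
  double width()  const { return width_; }
  double height() const { return height_; }
protected:
  // Hypothetical helper for derived classes that allow unequal sizes
  void setBoth(double x, double y) { width_ = x; height_ = y; }
private:
  double width_, height_;
};

class Circle : public Oval { };   // adds nothing that could break symmetry

class Ellipse : public Oval {
public:
  using Oval::setSize;            // keep setSize(size) visible in Ellipse
  // Promise: afterwards width() == x and height() == y
  void setSize(double x, double y) { setBoth(x, y); }
};
```

Both derived classes can honor every promise Oval makes, and only Ellipse
offers the asymmetric operation, so no user of an Oval& is ever surprised.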

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[21.7] Are there other options to the "Circle is/isnot kind-of Ellipse"
       dilemma?

If you claim that all Ellipses can be squashed asymmetrically, and you claim
that Circle is a kind-of Ellipse, and you claim that Circle can't be squashed
asymmetrically, clearly you've got to adjust (revoke, actually) one of your
claims.  Thus you've either got to get rid of Ellipse::setSize(x,y), get rid of
the inheritance relationship between Circle and Ellipse, or admit that your
Circles aren't necessarily circular.

Here are the two most common traps new OO/C++ programmers regularly fall into.
They attempt to use coding hacks to cover up a broken design (they redefine
Circle::setSize(x,y) to throw an exception, call abort(), choose the average of
the two parameters, or to be a no-op).  Unfortunately all these hacks will
surprise users, since users are expecting width() == x and height() == y.  The
one thing you must not do is surprise your users.
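
To see the surprise concretely, here is a sketch of the "average the two
parameters" hack.  The class bodies and the stretch() function are assumptions
made for illustration; stretch() plays the part of user code that relies on
Ellipse's promise:

```cpp
class Ellipse {
public:
  Ellipse() : width_(1), height_(1) { }
  virtual ~Ellipse() { }
  // Promise: afterwards width() == x and height() == y
  virtual void setSize(double x, double y) { width_ = x; height_ = y; }
  double width()  const { return width_; }
  double height() const { return height_; }
private:
  double width_, height_;
};

class Circle : public Ellipse {
public:
  // The hack: quietly replace both parameters with their average
  virtual void setSize(double x, double y)
  {
    double avg = (x + y) / 2;
    Ellipse::setSize(avg, avg);
  }
};

// User code: returns true if the Ellipse kept its promise
bool stretch(Ellipse& e)
{
  e.setSize(2, 6);
  return e.width() == 2 && e.height() == 6;
}
```

stretch() succeeds for a plain Ellipse and fails for a Circle, even though the
user did nothing wrong.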

If it is important to you to retain the "Circle is a kind-of Ellipse"
inheritance relationship, you can weaken the promise made by Ellipse's
setSize(x,y).  E.g., you could change the promise to, "This member function
might set width() to x and/or it might set height() to y, or it might do
nothing".  Unfortunately this dilutes the contract into drivel, since the user
can't rely on any meaningful behavior.  The whole hierarchy therefore begins to
be worthless (it's hard to convince someone to use an object if you have to
shrug your shoulders when asked what the object does for them).

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[21.8] But I have a Ph.D. in Mathematics, and I'm sure a Circle is a kind of an
       Ellipse! Does this mean Marshall Cline is stupid? Or that C++ is stupid?
       Or that OO is stupid?

Actually, it doesn't mean any of these things.  The sad reality is that it
means your intuition is wrong.

Look, I have received and answered dozens of passionate e-mail messages about
this subject.  I have taught it hundreds of times to thousands of software
professionals all over the place.  I know it goes against your intuition.  But
trust me; your intuition is wrong.

The real problem is your intuitive notion of "kind of" doesn't match the OO
notion of proper inheritance (technically called "subtyping").  The bottom line
is that the derived class objects must be substitutable for the base class
objects.  In the case of Circle/Ellipse, the setSize(x,y) member function
violates this substitutability.

You have three choices: (1) remove the setSize(x,y) member function from
Ellipse (thus breaking existing code that calls the setSize(x,y) member
function), (2) allow a Circle to have a different height than width (an
asymmetrical circle; hmmm), or (3) drop the inheritance relationship.  Sorry,
but there simply are no other choices.  Note that some people mention the
option of deriving both Circle and Ellipse from a third common base class, but
that's just a variant of option (3) above.

Another way to say this is that you have to either make the base class weaker
(in this case braindamage Ellipse to the point that you can't set its width and
height to different values), or make the derived class stronger (in this case
empower a Circle with the ability to be both symmetric and, ahem, asymmetric).
When neither of these is very satisfying (such as in the Circle/Ellipse case),
one normally simply removes the inheritance relationship.  If the inheritance
relationship simply has to exist, you may need to remove the mutator member
functions (setHeight(y), setWidth(x), and setSize(x,y)) from the base class.

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

[21.9] But my problem doesn't have anything to do with circles and ellipses, so
       what good is that silly example to me?

Ahhh, there's the rub.  You think the Circle/Ellipse example is just a silly
example.  But in reality, your problem is isomorphic to that example.

I don't care what your inheritance problem is; all (yes, all) bad
inheritances boil down to the Circle-is-not-a-kind-of-Ellipse example.

Here's why: Bad inheritances always have a base class with an extra capability
(often an extra member function or two; sometimes an extra promise made by one
or a combination of member functions) that a derived class can't satisfy.
You've either got to make the base class weaker, make the derived class
stronger, or eliminate the proposed inheritance relationship.  I've seen lots
and lots and lots of these bad inheritance proposals, and believe me, they all
boil down to the Circle/Ellipse example.

Therefore, if you truly understand the Circle/Ellipse example, you'll be able
to recognize bad inheritance everywhere.  If you don't understand what's going
on with the Circle/Ellipse problem, the chances are high that you'll make some
very serious and very expensive inheritance mistakes.

Sad but true.

(Note: this FAQ has to do with public inheritance; private and protected
inheritance[24] are different.)

==============================================================================

SECTION [22]: Inheritance -- abstract base classes (ABCs)


[22.1] What's the big deal of separating interface from implementation?

Interfaces are a company's most valuable resources.  Designing an interface
takes longer than whipping together a concrete class which fulfills that
interface.  Furthermore interfaces require the time of more expensive people.

Since interfaces are so valuable, they should be protected from being tarnished
by data structures and other implementation artifacts.  Thus you should
separate interface from implementation.

==============================================================================

[22.2] How do I separate interface from implementation in C++ (like Modula-2)?

Use an ABC[22.3].

==============================================================================

[22.3] What is an ABC?

An abstract base class.

At the design level, an abstract base class (ABC) corresponds to an abstract
concept.  If you asked a mechanic if he repaired vehicles, he'd probably wonder
what kind-of vehicle you had in mind.  Chances are he doesn't repair space
shuttles, ocean liners, bicycles, or nuclear submarines.  The problem is that
the term "vehicle" is an abstract concept (e.g., you can't build a "vehicle"
unless you know what kind of vehicle to build).  In C++, class Vehicle would be
an ABC, with Bicycle, SpaceShuttle, etc., being subclasses (an OceanLiner
is-a-kind-of-a Vehicle).  In real-world OO, ABCs show up all over the place.

At the programming language level, an ABC is a class that has one or more pure
virtual[22.4] member functions.  You cannot make an object (instance) of an
ABC.

==============================================================================

[22.4] What is a "pure virtual" member function?

A member function declaration that turns a normal class into an abstract class
(i.e., an ABC).  You normally only implement it in a derived class.

Some member functions exist in concept; they don't have any reasonable
definition.  E.g., suppose I asked you to draw a Shape at location (x,y) that
has size 7.  You'd ask me "what kind of shape should I draw?" (circles,
squares, hexagons, etc., are drawn differently).  In C++, we must indicate the
existence of the draw() member function (so users can call it when they have a
Shape* or a Shape&), but we recognize it can (logically) be defined only in
subclasses:

    class Shape {
    public:
      virtual void draw() const = 0;  // = 0 means it is "pure virtual"
      // ...
    };

This pure virtual function makes Shape an ABC.  If you want, you can think of
the "= 0;" syntax as if the code were at the NULL pointer.  Thus Shape promises
a service to its users, yet Shape isn't able to provide any code to fulfill
that promise.  This forces any actual object created from a [concrete] class
derived from Shape to have the indicated member function, even though the base
class doesn't have enough information to actually define it yet.

Note that it is possible to provide a definition for a pure virtual function,
but this usually confuses novices and is best avoided until later.
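
A short sketch of an ABC together with two concrete classes.  To make the
result easy to check, draw() here returns a string describing what would be
drawn rather than returning void; that change, the Circle and Square bodies,
and the render() function are assumptions made for this example:

```cpp
#include <string>

class Shape {
public:
  virtual ~Shape() { }
  virtual std::string draw() const = 0;   // pure virtual: no code in Shape
};

class Circle : public Shape {
public:
  virtual std::string draw() const { return "circle"; }
};

class Square : public Shape {
public:
  virtual std::string draw() const { return "square"; }
};

// User code only knows it has some kind of Shape
std::string render(const Shape& s)
{
  return s.draw();
}

// Shape s;   // compile-time error: cannot create an object of an ABC
```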

==============================================================================

[22.5] How do you define a copy constructor or assignment operator for a class
       that contains a pointer to a (abstract) base class?

If the class "owns" the object pointed to by the (abstract) base class pointer,
use the Virtual Constructor Idiom[20.5] in the (abstract) base class.  As usual
with this idiom, we declare a pure virtual[22.4] clone() method in the base
class:

    class Shape {
    public:
      // ...
      virtual Shape* clone() const = 0;   // The Virtual (Copy) Constructor[20.5]
      // ...
    };

Then we implement this clone() method in each derived class:

    class Circle : public Shape {
    public:
      // ...
      virtual Shape* clone() const { return new Circle(*this); }
      // ...
    };

    class Square : public Shape {
    public:
      // ...
      virtual Shape* clone() const { return new Square(*this); }
      // ...
    };

Now suppose that each Fred object "has-a" Shape object.  Naturally the Fred
object doesn't know whether the Shape is a Circle or a Square or ...  Fred's copy
constructor and assignment operator will invoke Shape's clone() method to copy
the object:

    class Fred {
    public:
      Fred(Shape* p) : p_(p) { assert(p != NULL); }   // p must not be NULL
     ~Fred() { delete p_; }
      Fred(const Fred& f) : p_(f.p_->clone()) { }
      Fred& operator= (const Fred& f)
        {
          if (this != &f) {              // Check for self-assignment
            Shape* p2 = f.p_->clone();   // Create the new one FIRST...
            delete p_;                   // ...THEN delete the old one
            p_ = p2;
          }
          return *this;
        }
      // ...
    private:
      Shape* p_;
    };
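
If your compiler supports C++11 or later, one possible variation (not the
approach shown above, just a sketch) is to let std::unique_ptr own the Shape,
which removes the hand-written destructor and the explicit delete.  The
sides() and shape() member functions are hypothetical, added only so the copy
can be inspected; note also that Shape needs a virtual destructor because the
object is deleted through a Shape*:

```cpp
#include <memory>

class Shape {
public:
  virtual ~Shape() { }
  virtual Shape* clone() const = 0;   // The Virtual (Copy) Constructor
  virtual int sides() const = 0;      // hypothetical, for inspection only
};

class Square : public Shape {
public:
  virtual Shape* clone() const { return new Square(*this); }
  virtual int sides() const { return 4; }
};

class Fred {
public:
  explicit Fred(Shape* p) : p_(p) { }          // takes ownership; p != NULL
  Fred(const Fred& f) : p_(f.p_->clone()) { }
  Fred& operator= (const Fred& f)
    {
      if (this != &f)
        p_.reset(f.p_->clone());  // clone FIRST; reset() THEN deletes the old
      return *this;
    }
  const Shape* shape() const { return p_.get(); }  // hypothetical accessor
private:
  std::unique_ptr<Shape> p_;      // deletes the Shape when Fred dies
};
```

Copying a Fred still goes through clone(), so each Fred ends up with its own
Shape of the right dynamic type.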

==============================================================================

SECTION [23]: Inheritance -- what your mother never told you


[23.1] When my base class's constructor calls a virtual function, why doesn't
       my derived class's override of that virtual function get invoked?

During the class Base's constructor, the object isn't yet a Derived, so if
Base::Base() calls a virtual[20] function virt(), the Base::virt() will be
invoked, even if Derived::virt() exists.

Similarly, during Base's destructor, the object is no longer a Derived, so when
Base::~Base() calls virt(), Base::virt() gets control, not the Derived::virt()
override.

You'll quickly see the wisdom of this approach when you imagine the disaster if
Derived::virt() touched a member object from class Derived.
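
A minimal sketch of the rule.  The member names seenInCtor_ and name_ are made
up for this example.  Base's constructor records what virt() returned, and
Derived::virt() touches a Derived member that has not been constructed yet
while Base::Base() is running:

```cpp
#include <string>

class Base {
public:
  Base() { seenInCtor_ = virt(); }   // calls Base::virt(), even for a Derived
  virtual ~Base() { }
  virtual std::string virt() const { return "Base::virt"; }
  std::string seenInCtor() const { return seenInCtor_; }
private:
  std::string seenInCtor_;
};

class Derived : public Base {
public:
  Derived() : name_("Derived::virt") { }
  // Touches name_, which doesn't exist yet during Base::Base()
  virtual std::string virt() const { return name_; }
private:
  std::string name_;
};
```

After construction, calling virt() on a Derived dispatches to Derived::virt()
as usual; only the call made from inside Base's constructor went to
Base::virt().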

==============================================================================

[23.2] Should a derived class replace ("override") a non-virtual function from
       a base class?

It's legal, but it ain't moral.

Experienced C++ programmers will sometimes redefine a non-virtual function for
efficiency (e.g., the derived class implementation might make better use of the
derived class's resources) or to get around the hiding rule[23.3].
However the client-visible effects must be identical, since non-virtual
functions are dispatched based on the static type of the pointer/reference
rather than the dynamic type of the pointed-to/referenced object.
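
The static-type dispatch can be seen in a few lines.  This sketch uses a
hypothetical name() member function whose two versions deliberately differ, to
show exactly the situation you must avoid:

```cpp
#include <string>

class Base {
public:
  std::string name() const { return "Base"; }      // non-virtual
};

class Derived : public Base {
public:
  std::string name() const { return "Derived"; }   // redefines, not overrides
};

// The static type of b is Base, so Base::name() is called no matter what
// kind of object b actually refers to.
std::string viaBase(const Base& b)
{
  return b.name();
}
```

The same Derived object answers "Derived" when called directly and "Base" when
called through a Base&, which is why the client-visible effects of the two
versions must be identical.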

==============================================================================

[23.3] What's the meaning of, Warning: Derived::f(int) hides Base::f(float)?

It means you're going to die.

Here's the mess you're in: if Derived declares a member function named f(), and
Base declares a member function named f() with a different signature (e.g.,
different parameter types and/or constness), then the Base f() is "hidden"
rather than "overloaded" or "overridden" (even if the Base f() is virtual[20]).

Here's how you get out of the mess: Derived must redefine the Base member
function(s) that are hidden (even if they are non-virtual).  Normally this
re-definition merely calls the appropriate Base member function.  E.g.,

    class Base {
    public:
      void f(int);
    };

    class Derived : public Base {
    public:
      void f(double);
      void f(int i) { Base::f(i); }  // The redefinition merely calls Base::f(int)
    };
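
Another way out, on compilers that support using-declarations for member
functions (standard C++ does), is to bring the Base name into Derived's scope.
The sketch below uses hypothetical classes Hiding and Unhiding, with f()
returning an int so you can tell which version was chosen:

```cpp
class Base {
public:
  int f(int) { return 1; }
};

class Hiding : public Base {
public:
  int f(double) { return 2; }   // hides Base::f(int)
};

class Unhiding : public Base {
public:
  using Base::f;                // brings Base::f(int) into Unhiding's scope
  int f(double) { return 2; }   // now overloads alongside Base::f(int)
};
```

With Hiding, the call f(1) silently converts the int to a double and calls
f(double).  With Unhiding, overload resolution sees both functions and picks
Base::f(int).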

==============================================================================

[23.4] What does it mean that the "virtual table" is an unresolved external?

If you get a link error of the form
"Error: Unresolved or undefined symbols detected: virtual table for class Fred,"
you probably have an undefined virtual[20] member function in class Fred.

The compiler typically creates a magical data structure called the "virtual
table" for classes that have virtual functions (this is how it handles dynamic
binding[20.2]).  Normally you don't have to know about it at all.  But if you
forget to define a virtual function for class Fred, you will sometimes get this
linker error.

Here's the nitty gritty: Many compilers put this magical "virtual table" in the
compilation unit that defines the first non-inline virtual function in the
class.  Thus if the first non-inline virtual function in Fred is wilma(), the
compiler will put Fred's virtual table in the same compilation unit where it
sees Fred::wilma().  Unfortunately if you accidentally forget to define
Fred::wilma(), rather than getting a "Fred::wilma() is undefined", you may get
a "Fred's virtual table is undefined".  Sad but true.

==============================================================================

SECTION [24]: Inheritance -- private and protected inheritance


[24.1] How do you express "private inheritance"?

Use : private instead of : public.  E.g.,

    class Foo : private Bar {
    public:
      // ...
    };

==============================================================================

[24.2] How are "private inheritance" and "composition" similar?

private inheritance is a syntactic variant of composition (has-a).

E.g., the "Car has-a Engine" relationship can be expressed using composition:

    class Engine {
    public:
      Engine(int numCylinders);
      void start();                 // Starts this Engine
    };

    class Car {
    public:
      Car() : e_(8) { }             // Initializes this Car with 8 cylinders
      void start() { e_.start(); }  // Start this Car by starting its Engine
    private:
      Engine e_;                    // Car has-a Engine
    };

The same "has-a" relationship can also be expressed using private inheritance:

    class Car : private Engine {    // Car has-a Engine
    public:
      Car() : Engine(8) { }         // Initializes this Car with 8 cylinders
      using Engine::start;          // Start this Car by starting its Engine
    };

There are several similarities between these two forms of composition:
 * In both cases there is exactly one Engine member object contained in a Car
 * In neither case can users (outsiders) convert a Car* to an Engine*

There are also several distinctions:
 * The first form is needed if you want to contain several Engines per Car
 * The second form can introduce unnecessary multiple inheritance
 * The second form allows members of Car to convert a Car* to an Engine*
 * The second form allows access to the protected members of the base class
 * The second form allows Car to override Engine's virtual[20] functions

Note that private inheritance is usually used to gain access into the
protected: members of the base class, but this is usually a short-term solution
(translation: a band-aid[24.3]).
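
Two of the points above can be checked mechanically.  This sketch assumes a
C++11 (or later) compiler for static_assert and <type_traits>; the accessors
running(), numCylinders(), and cylinders() are hypothetical additions to the
Engine and Car classes:

```cpp
#include <type_traits>

class Engine {
public:
  explicit Engine(int numCylinders)
    : numCylinders_(numCylinders), running_(false) { }
  void start() { running_ = true; }
  bool running() const { return running_; }
  int numCylinders() const { return numCylinders_; }
private:
  int numCylinders_;
  bool running_;
};

class Car : private Engine {       // Car has-a Engine
public:
  Car() : Engine(8) { }
  using Engine::start;             // re-export start() to users of Car
  using Engine::running;
  int cylinders() const
    {
      const Engine* e = this;      // OK: members of Car may convert Car*
      return e->numCylinders();
    }
};

// Users (outsiders) cannot convert a Car* to an Engine*
static_assert(!std::is_convertible<Car*, Engine*>::value,
              "Car is not a kind-of Engine");
```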

==============================================================================

[24.3] Which should I prefer: composition or private inheritance?

Use composition when you can, private inheritance when you have to.

Normally you don't want to have access to the internals of too many other
classes, and private inheritance gives you some of this extra power (and
responsibility).  But private inheritance isn't evil; it's just more expensive
to maintain, since it increases the probability that someone will change
something that will break your code.

A legitimate, long-term use for private inheritance is when you want to build a
class Fred that uses code in a class Wilma, and the code from class Wilma needs
to invoke member functions from your new class, Fred.  In this case, Fred calls
non-virtuals in Wilma, and Wilma calls virtuals (usually pure virtuals[22.4])
in itself, which are overridden by Fred.  This would be much harder to do with
composition.

    #include <iostream>
    using std::cout;

    class Wilma {
    protected:
      void fredCallsWilma()
        {
          cout << "Wilma::fredCallsWilma()\n";
          wilmaCallsFred();
        }
      virtual void wilmaCallsFred() = 0;   // A pure virtual function[22.4]
    };

    class Fred : private Wilma {
    public:
      void barney()
        {
          cout << "Fred::barney()\n";
          Wilma::fredCallsWilma();
        }
    protected:
      virtual void wilmaCallsFred()
        {
          cout << "Fred::wilmaCallsFred()\n";
        }
    };

==============================================================================

[24.4] Should I pointer-cast from a private derived class to its base class?

Generally, no.

From a member function or friend[14] of a privately derived class, the
relationship to the base class is known, and the upward conversion from
PrivatelyDer* to Base* (or PrivatelyDer& to Base&) is safe; no cast is needed
or recommended.

However users of PrivatelyDer should avoid this unsafe conversion, since it is
based on a private decision of PrivatelyDer, and is subject to change without
notice.

==============================================================================

[24.5] How is protected inheritance related to private inheritance?

Similarities: both allow overriding virtual[20] functions in the
private/protected base class, neither claims the derived is a kind-of its base.

Dissimilarities: protected inheritance allows derived classes of derived
classes to know about the inheritance relationship.  Thus your grand kids are
effectively exposed to your implementation details.  This has both benefits (it
allows subclasses of the protected derived class to exploit the relationship to
the protected base class) and costs (the protected derived class can't change
the relationship without potentially breaking further derived classes).

Protected inheritance uses the : protected syntax:

    class Car : protected Engine {
    public:
      // ...
    };
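
A sketch of what "your grand kids know about the relationship" means in
practice.  SportsCar, launch(), isRunning(), and the body of Engine are
assumptions made for illustration:

```cpp
class Engine {
public:
  Engine() : running_(false) { }
  void start() { running_ = true; }
  bool running() const { return running_; }
private:
  bool running_;
};

class Car : protected Engine {
public:
  bool isRunning() const { return running(); }
};

class SportsCar : public Car {     // a "grand kid" of Engine
public:
  void launch()
    {
      Engine* e = this;            // OK: protected inheritance is visible here
      e->start();
    }
};
```

Had Car used private inheritance instead, the conversion inside
SportsCar::launch() would be a compile-time error, and Car would be free to
drop Engine later without breaking SportsCar.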

==============================================================================

[24.6] What are the access rules with private and protected inheritance?

Take these classes as examples:

    class B                    { /*...*/ };
    class D_priv : private   B { /*...*/ };
    class D_prot : protected B { /*...*/ };
    class D_publ : public    B { /*...*/ };
    class UserClass            { B b; /*...*/ };

None of the subclasses can access anything that is private in B.  In D_priv,
the public and protected parts of B are private.  In D_prot, the public and
protected parts of B are protected.  In D_publ, the public parts of B are
public and the protected parts of B are protected (D_publ is-a-kind-of-a B).
class UserClass can access only the public parts of B, which "seals off"
UserClass from B.

To make a public member of B public in D_priv or D_prot, name the member in a
using-declaration with a B:: prefix.  E.g., to make member B::f(int,float)
public in D_prot, you would say:

    class D_prot : protected B {
    public:
      using B::f;    // Note: Not B::f(int,float)
    };

==============================================================================

SECTION [25]: Coding standards


[25.1] What are some good C++ coding standards?

Thank you for reading this answer rather than just trying to set your own
coding standards.

But beware that some people on comp.lang.c++ are very sensitive on this issue.
Nearly every software engineer has, at some point, been exploited by someone
who used coding standards as a "power play." Furthermore some attempts to set
C++ coding standards have been made by those who didn't know what they were
talking about, so the standards end up being based on what was the
state-of-the-art when the standards setters were writing code.  Such
impositions generate an attitude of mistrust for coding standards.

Obviously anyone who asks this question wants to be trained so they don't run
off on their own ignorance, but nonetheless posting a question such as this one
to comp.lang.c++ tends to generate more heat than light.

==============================================================================

[25.2] Are coding standards necessary? Are they sufficient?

Coding standards do not make non-OO programmers into OO programmers; only
training and experience do that.  If coding standards have merit, it is that
they discourage the petty fragmentation that occurs when large organizations
coordinate the activities of diverse groups of programmers.

But you really want more than a coding standard.  The structure provided by
coding standards gives neophytes one less degree of freedom to worry about,
which is good.  However pragmatic guidelines should go well beyond
pretty-printing standards.  Organizations need a consistent philosophy of
design and implementation.  E.g., strong or weak typing? references or pointers
in interfaces? stream I/O or stdio? should C++ code call C code? vice versa?
how should ABCs[22.3] be used? should inheritance be used as an implementation
technique or as a specification technique? what testing strategy should be
employed? inspection strategy? should interfaces uniformly have a get() and/or
set() member function for each data member? should interfaces be designed from
the outside-in or the inside-out? should errors be handled by try/catch/throw
or by return codes? etc.

What is needed is a "pseudo standard" for detailed design. I recommend a
three-pronged approach to achieving this standardization: training,
mentoring[26.1], and libraries.  Training provides "intense instruction,"
mentoring allows OO to be caught rather than just taught, and high quality C++
class libraries provide "long term instruction." There is a thriving commercial
market for all three kinds of "training." Advice from organizations that have
been through the mill is consistent: Buy, Don't Build. Buy libraries, buy
training, buy tools, buy consulting.  Companies that have attempted to become a
self-taught tool-shop as well as an application/system shop have found success
difficult.

Few argue that coding standards are "ideal," or even "good"; however, they are
necessary in the kind of organizations/situations described above.

The following FAQs provide some basic guidance in conventions and styles.

==============================================================================

[25.3] Should our organization determine coding standards from our C
       experience?

No!

No matter how vast your C experience, no matter how advanced your C expertise,
being a good C programmer does not make you a good C++ programmer.  Converting
from C to C++ is more than just learning the syntax and semantics of the ++
part of C++.  Organizations who want the promise of OO, but who fail to put the
"OO" into "OO programming", are fooling themselves; the balance sheet will show
their folly.

C++ coding standards should be tempered by C++ experts.  Asking comp.lang.c++
is a start.  Seek out experts who can help guide you away from pitfalls.  Get
training.  Buy libraries and see if "good" libraries pass your coding
standards.  Do not set standards by yourself unless you have considerable
experience in C++.  Having no standard is better than having a bad standard,
since improper "official" positions "harden" bad brain traces.  There is a
thriving market for both C++ training and libraries from which to pool
expertise.

One more thing: whenever something is in demand, the potential for charlatans
increases.  Look before you leap.  Also ask for student-reviews from past
companies, since not even expertise makes someone a good communicator.
Finally, select a practitioner who can teach, not a full time teacher who has a
passing knowledge of the language/paradigm.

==============================================================================

[25.4] Should I declare locals in the middle of a function or at the top?

Declare near first use.

An object is initialized (constructed) the moment it is declared.  If you don't
have enough information to initialize an object until half way down the
function, you should create it half way down the function when it can be
initialized correctly.  Don't initialize it to an "empty" value at the top then
"assign" it later.  The reason for this is runtime performance.  Building an
object correctly is faster than building it incorrectly and remodeling it
later.  Simple examples show a factor of 350% speed hit for simple classes like
String.  Your mileage may vary; surely the overall system degradation will be
less than 350%, but there will be degradation.  Unnecessary degradation.
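
As a rough sketch of the two styles, the fragment below contrasts "build empty,
assign later" with "declare at the point of initialization."  The helper
lookupName() and both greeting functions are hypothetical names invented for
this illustration, and std::string stands in for the String class mentioned
above.  No timing claim is made for this particular example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: stands in for any value that only becomes known
// part way through a function.
std::string lookupName(int id)
{
    return id == 0 ? "root" : "guest";
}

// Discouraged: the object is built "empty" at the top, then assigned later.
std::string greetingAtTop(int id)
{
    std::string name;                    // constructed with no useful value
    // ... other work happens here ...
    name = lookupName(id);               // second step: overwrite the empty value
    return "hello, " + name;
}

// Preferred: the object is declared where it can be initialized correctly.
std::string greetingNearUse(int id)
{
    // ... other work happens here ...
    std::string name = lookupName(id);   // constructed once, with its real value
    return "hello, " + name;
}

// Both versions compute the same result; only the construction steps differ.
void checkGreetings()
{
    assert(greetingAtTop(0) == greetingNearUse(0));
}
```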

A common retort to the above is: "we'll provide set() member functions for
every datum in our objects so the cost of construction will be spread out."
This is worse than the performance overhead, since now you're introducing a
maintenance nightmare.  Providing a set() member function for every datum is
tantamount to public data: you've exposed your implementation technique to the
world.  The only thing you've hidden is the physical names of your member
objects, but the fact that you're using a List and a String and a float, for
example, is open for all to see.

Bottom line: Locals should be declared near their first use.  Sorry that this
isn't familiar to C experts, but new doesn't necessarily mean bad.

==============================================================================

[25.5] What source-file-name convention is best? foo.cpp? foo.C? foo.cc?

If you already have a convention, use it.  If not, consult your compiler to see
what the compiler expects.  Typical answers are: .C, .cc, .cpp, or .cxx
(naturally the .C extension assumes a case-sensitive file system to distinguish
.C from .c).

At Paradigm Shift, Inc., we have used .cpp for our C++ source files, and we
have also used .C.  In the latter case, when porting to case-insensitive file
systems, we supply the compiler option that forces .c files to be treated as
C++ source files (-Tdp for IBM CSet++, -cpp for Zortech C++, -P for Borland
C++, etc.).  None of these approaches has any striking technical superiority to
the others; we generally use whichever technique is preferred by our customer
(again, these issues are dominated by business considerations, not by technical
considerations).

==============================================================================

[25.6] What header-file-name convention is best? foo.H? foo.hh? foo.hpp?

If you already have a convention, use it.  If not, and if you don't need your
editor to distinguish between C and C++ files, simply use .h.  Otherwise use
whatever the editor wants, such as .H, .hh, or .hpp.

At Paradigm Shift, Inc., we tend to use either .hpp or .h for our C++ header
files.

==============================================================================

[25.7] Are there any lint-like guidelines for C++?

Yes, there are some practices which are generally considered dangerous.
However none of these are universally "bad," since situations arise when even
the worst of these is needed:
 * A class Fred's assignment operator should return *this as a Fred& (allows
   chaining of assignments)
 * A class with any virtual[20] functions ought to have a virtual
   destructor[20.4]
 * A class with any of {destructor, assignment operator, copy constructor}
   generally needs all 3
 * A class Fred's copy constructor and assignment operator should have const in
   the parameter: respectively Fred::Fred(const Fred&) and
   Fred& Fred::operator= (const Fred&)
 * When initializing an object's member objects in the constructor, always use
   initialization lists rather than assignment.  The performance difference for
   user-defined classes can be substantial (3x!)
 * Assignment operators should make sure that self assignment[12.1] does
   nothing, otherwise you may have a disaster[12.2].  In some cases, this may
   require you to add an explicit test to your assignment operators[12.3].
 * In classes that define both += and +, a += b and a = a + b should generally
   do the same thing; ditto for the other identities of built-in types (e.g.,
   a += 1 and ++a; p[i] and *(p+i); etc).  This can be enforced by writing the
   binary operations using the op= forms.  E.g.,

       Fred operator+ (const Fred& a, const Fred& b)
       {
         Fred ans = a;
         ans += b;
         return ans;
       }

   This way the "constructive" binary operators don't even need to be
   friends[14].  But it is sometimes possible to more efficiently implement
   common operations (e.g., if class Fred is actually String, and += has to
   reallocate/copy string memory, it may be better to know the eventual length
   from the beginning).
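
Several of the guidelines above can be seen together in one small class.  The
following is only a sketch under stated assumptions: this particular Fred owns
a character buffer (the members len_ and data_ and the accessors c_str() and
length() are invented for the example), which is why it needs a destructor, a
copy constructor and an assignment operator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class Fred {
public:
    // Members are set in the initialization list, not by assignment.
    // (len_ is declared before data_, so it is initialized first.)
    Fred(const char* s = "")
      : len_(std::strlen(s)), data_(new char[len_ + 1])
    { std::memcpy(data_, s, len_ + 1); }

    // The copy constructor takes its parameter by const reference.
    Fred(const Fred& f)
      : len_(f.len_), data_(new char[f.len_ + 1])
    { std::memcpy(data_, f.data_, len_ + 1); }

    ~Fred() { delete[] data_; }

    // Assignment takes a const reference, survives self assignment,
    // and returns *this as a Fred& so that assignments can be chained.
    Fred& operator= (const Fred& f)
    {
        if (this != &f) {
            char* copy = new char[f.len_ + 1];
            std::memcpy(copy, f.data_, f.len_ + 1);
            delete[] data_;
            data_ = copy;
            len_  = f.len_;
        }
        return *this;
    }

    Fred& operator+= (const Fred& f)
    {
        char* joined = new char[len_ + f.len_ + 1];
        std::memcpy(joined, data_, len_);
        std::memcpy(joined + len_, f.data_, f.len_ + 1);
        delete[] data_;
        data_ = joined;
        len_ += f.len_;
        return *this;
    }

    const char* c_str() const  { return data_; }
    std::size_t length() const { return len_; }

private:
    std::size_t len_;
    char*       data_;
};

// operator+ is written in terms of +=, so it does not need to be a friend.
Fred operator+ (const Fred& a, const Fred& b)
{
    Fred ans = a;
    ans += b;
    return ans;
}

// a = a + b and a += b should leave a in the same state.
void checkIdentity()
{
    Fred a("foo"), b("bar");
    Fred viaPlus = a + b;
    a += b;
    assert(std::strcmp(a.c_str(), viaPlus.c_str()) == 0);
}
```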

==============================================================================

[25.8] Which is better: identifier names that_look_like_this or identifier
       names thatLookLikeThis?

It's a precedent thing.  If you have a Pascal or Smalltalk background,
youProbablySquashNamesTogether like this.  If you have an Ada background,
You_Probably_Use_A_Large_Number_Of_Underscores like this.  If you have a
Microsoft Windows background, you probably prefer the "Hungarian" style which
means you jkuidsPrefix vndskaIdentifiers ncqWith ksldjfTheir nmdsadType.  And
then there are the folks with a Unix C background, who abbr evthng n use vry
srt idntfr nms.

So there is no universal standard.  If your organization has a particular
coding standard for identifier names, use it.  But starting another Jihad over
this will create a lot more heat than light.  From a business perspective,
there are only two things that matter: The code should be generally readable,
and everyone in the organization should use the same style.

Other than that, th difs r minr.

==============================================================================

[25.9] Are there any other sources of coding standards? [UPDATED!]

[Recently added URLs (on 10/96) and rewrote and added more URLs (on 11/96).]

Yep, there are several.

Here are a few sources that you might be able to use as starting points for
developing your organization's coding standards:
 * A bunch of coding standards are provided at
   http://fndaub.fnal.gov:8000/standards/standards.html
 * The Ellemtel coding guidelines are available at
   - http://web2.airmail.net/~rks/ellhome.htm
   - http://www.rhi.hi.is/~harri/cpprules.html
   - http://euagate.eua.ericsson.se/pub/eua/c++
   - http://nestor.ceid.upatras.gr/programming/ellemtel/ellhome.htm
   - http://www.doc.ic.ac.uk/lab/cplus/c++.rules/
 * Todd Hoff's coding standard guidelines are available at
   http://www.possibility.com/cpp.

Note: I do NOT warrant or endorse these URLs and/or their contents.  They are
listed as a public service only.  I haven't checked their details, so I don't
know if they'll help you or hurt you.  Caveat emptor.

==============================================================================

SECTION [26]: Learning OO/C++


[26.1] What is mentoring?

It's the most important tool in learning OO.

Object-oriented thinking is caught, not just taught.  Get cozy with someone who
really knows what they're talking about, and try to get inside their head and
watch them solve problems.  Listen.  Learn by emulating.

If you're working for a company, get them to bring someone in who can act as a
mentor and guide.  We've seen gobs and gobs of money wasted by companies who
"saved money" by simply buying their employees a book ("Here's a book; read it
over the weekend; on Monday you'll be an OO developer").

==============================================================================

[26.2] Should I learn C before I learn OO/C++?

Don't bother.

If your ultimate goal is to learn OO/C++ and you don't already know C, reading
books or taking courses in C will not only waste your time, but it will teach
you a bunch of things that you'll explicitly have to un-learn when you finally
get back on track and learn OO/C++ (e.g., malloc()[16.3], printf()[15.1],
unnecessary use of switch statements[20], error-code exception handling[17],
unnecessary use of #define macros[9.3], etc.).

If you want to learn OO/C++, learn OO/C++.  Taking time out to learn C will
waste your time and confuse you.

==============================================================================

[26.3] Should I learn Smalltalk before I learn OO/C++?

Don't bother.

If your ultimate goal is to learn OO/C++ and you don't already know Smalltalk,
reading books or taking courses in Smalltalk will not only waste your time, but
it will teach you a bunch of things that you'll explicitly have to un-learn
when you finally get back on track and learn OO/C++ (e.g., dynamic
typing[27.3], non-subtyping inheritance[27.5], error-code exception
handling[17], etc.).

Knowing a "pure" OO language doesn't make the transition to OO/C++ any easier.
This is not a theory; I speak from experience (Paradigm Shift, Inc.,
info@parashift.com, has trained and mentored literally thousands of software
professionals in OO).  In fact, Smalltalk experience can make it harder for
some people: they need to unlearn some rather deep notions about typing and
inheritance in addition to needing to learn new syntax and idioms.  This
unlearning process is especially painful and slow for those who cling to
Smalltalk with religious zeal ("C++ is not like Smalltalk, therefore C++ is
evil").

If you want to learn OO/C++, learn OO/C++.  Taking time out to learn Smalltalk
will waste your time and confuse you.

Note: I sit on both the ANSI C++ (X3J16) and ANSI Smalltalk (X3J20)
standardization committees[6.11].  I am not a language bigot[6.4].  I'm not
saying C++ is better or worse than Smalltalk; I'm simply saying that they are
different[27.1].

==============================================================================

[26.4] Should I buy one book, or several?

At least two.

There are two categories of insight and knowledge in OO programming using C++.
You're better off getting a "best of breed" book from each category rather than
trying to find a single book that does an OK job at everything.  The two OO/C++
programming categories are:
 * C++ legality guides -- what you can and can't do in C++[26.6].
 * C++ morality guides -- what you should and shouldn't do in C++[26.5].

Legality guides describe all language features with roughly the same level of
emphasis; morality guides focus on those language features that you will use
most often in typical programming tasks.  Legality guides tell you how to get a
given feature past the compiler; morality guides tell you whether or not to use
that feature in the first place.

Meta comments:
 * Neither of these categories is optional.  You must have a good grasp of
   both.
 * These categories do not trade off against each other.  You shouldn't argue
   in favor of one over the other.  They dove-tail.

==============================================================================

[26.5] What are some best-of-breed C++ morality guides?

Here's my personal (subjective and selective) short-list of must-read C++
morality guides, alphabetically by author:
 * Cline and Lomow, C++ FAQs, 461 pgs, Addison-Wesley, 1995, ISBN
   0-201-58958-3.  Covers 470 topics in a FAQ-like Q&A format.
 * Meyers, Effective C++, 224 pgs, Addison-Wesley, 1992, ISBN 0-201-56364-9.
   Covers 50 topics in a short essay format.
 * Meyers, More Effective C++, 336 pgs, Addison-Wesley, 1996, ISBN
   0-201-63371-X.  Covers 35 topics in a short essay format.

Similarities: All three books are extensively illustrated with code examples.
All three are excellent, insightful, useful, gold plated books.  All three have
excellent sales records.

Differences: Cline and Lomow's examples are complete, working programs rather
than code fragments or standalone classes.  Meyers contains numerous
line-drawings that illustrate the points.

==============================================================================

[26.6] What are some best-of-breed C++ legality guides?

Here's my personal (subjective and selective) short-list of must-read C++
legality guides, alphabetically by author:
 * Lippman, C++ Primer, Second Edition, 614 pgs, Addison-Wesley, 1991, ISBN
   0-201-54848-8.  Very readable/approachable.
 * Stroustrup, The C++ Programming Language, Second Edition, 646 pgs,
   Addison-Wesley, 1991, ISBN 0-201-53992-6.  Covers a lot of ground.

Similarities: Both books are excellent overviews of almost every language
feature.  I reviewed them for back-to-back issues of C++ Report, and I said
that they are both top notch, gold plated, excellent books.  Both have
excellent sales records.

Differences: If you don't know C, Lippman's book is better for you.  If you
know C and you want to cover a lot of ground quickly, Stroustrup's book is
better for you.

==============================================================================

[26.7] Are there other OO books that are relevant to OO/C++?

Yes! Tons!

The morality[26.5] and legality[26.6] categories listed above were for OO
programming.  The areas of OO analysis and OO design are also relevant, and
have their own best-of-breed books.

There are tons and tons of good books in these other areas.  I'll only mention
the book that's closest to the code on the grand spectrum from specification
down to code (this is, after all, the FAQ on C++, not the FAQ on OO analysis or
OO design).  Here's my personal (subjective and selective) short-list of
must-read books on OO design patterns:
 * Gamma et al., Design Patterns, 395 pgs, Addison-Wesley, 1995, ISBN
   0-201-63361-2.  Describes "patterns" that commonly show up in good OO
   designs.  You must read this book if you intend to do OO design work.

==============================================================================

SECTION [27]: Learning C++ if you already know Smalltalk


[27.1] What's the difference between C++ and Smalltalk?

Both fully support the OO paradigm.  Neither is categorically and universally
"better" than the other[6.4].  But there are differences.  The most important
differences are:
 * Static typing vs. dynamic typing[27.2]
 * Whether inheritance must be used only for subtyping[27.5]
 * Value vs. reference semantics[28]

Note: Many new C++ programmers come from a Smalltalk background.  If that's
you, this section will tell you the most important things you need to know to
make your transition.  Please don't get the notion that either language is
somehow "inferior" or "bad"[6.4], or that this section is promoting one
language over
the other (I am not a language bigot; I serve on both the ANSI C++ and ANSI
Smalltalk standardization committees[6.11]).  Instead, this section is designed
to help you understand (and embrace!) the differences.

==============================================================================

[27.2] What is "static typing," and how is it similar/dissimilar to Smalltalk?

Static typing says the compiler checks the type safety of every operation
statically (at compile-time), rather than generating code which will check
things at run-time.  For example, with static typing, the signature matching
for function arguments is checked at compile time, not at run-time.  An
improper match is flagged as an error by the compiler, not by the run-time
system.

In OO code, the most common "typing mismatch" is invoking a member function
against an object which isn't prepared to handle the operation.  E.g., if class
Fred has member function f() but not g(), and fred is an instance of class
Fred, then fred.f() is legal and fred.g() is illegal.  C++ (statically typed)
catches the error at compile time, and Smalltalk (dynamically typed) catches
the error at run-time.  (Technically speaking, C++ is like Pascal --pseudo
statically typed-- since pointer casts and unions can be used to violate the
typing system; which reminds me: only use pointer casts and unions as often as
you use gotos).
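
The fred.f() / fred.g() case can be sketched as follows.  The body of f() and
the wrapper function callF() are assumptions made only so the fragment
compiles.  The illegal call is left commented out, since a C++ compiler would
reject the whole file if it were present.

```cpp
#include <cassert>

class Fred {
public:
    int f() const { return 42; }
    // Note: no member function g() is declared.
};

int callF()
{
    Fred fred;
    int result = fred.f();   // legal: checked against Fred's interface by the compiler
    // fred.g();             // illegal: rejected at compile time, Fred has no g()
    assert(result == 42);
    return result;
}
```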

==============================================================================

[27.3] Which is a better fit for C++: "static typing" or "dynamic typing"?

[For context, please read the previous FAQ[27.2]].

If you want to use C++ most effectively, use it as a statically typed language.

C++ is flexible enough that you can (via pointer casts, unions, and #define
macros) make it "look" like Smalltalk.  But don't.  Which reminds me: try to
avoid #define[9.3].

There are places where pointer casts and unions are necessary and even
wholesome, but they should be used carefully and sparingly.  A pointer cast
tells the compiler to believe you.  An incorrect pointer cast might corrupt
your heap, scribble into memory owned by other objects, call nonexistent member
functions, and cause general failures.  It's not a pretty sight.  If you avoid
these and related constructs, you can make your C++ code both safer and faster,
since anything that can be checked at compile time is something that doesn't
have to be done at run-time.

If you're interested in using a pointer cast, use the new style pointer casts.
The most common example of these is to change old-style pointer casts such as
(X*)p into new-style dynamic casts such as dynamic_cast<X*>(p), where p is a
pointer and X is a type.  In addition to dynamic_cast, there are static_cast,
const_cast, and reinterpret_cast, but dynamic_cast is the one that simulates
most of the advantages of dynamic typing (the other is the typeid() construct;
for example, typeid(*p).name() will return an implementation-defined name for
the type of *p).
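
Here is a minimal sketch of dynamic_cast and typeid() at work.  The classes
Shape, Circle and Square and the two helper functions are hypothetical.  Note
that the base class needs at least one virtual function for a dynamic_cast
down the hierarchy to compile, and that the string returned by
typeid(...).name() differs from compiler to compiler, so the example compares
type_info objects rather than names.

```cpp
#include <cassert>
#include <typeinfo>

class Shape {
public:
    virtual ~Shape() { }
    virtual double area() const = 0;
};

class Circle : public Shape {
public:
    explicit Circle(double r) : r_(r) { }
    double area() const   { return 3.14159 * r_ * r_; }
    double radius() const { return r_; }
private:
    double r_;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { }
    double area() const { return side_ * side_; }
private:
    double side_;
};

// Returns the radius if *s really is a Circle, or -1.0 otherwise.
double radiusOrMinusOne(Shape* s)
{
    Circle* c = dynamic_cast<Circle*>(s);   // null pointer if *s is not a Circle
    return c ? c->radius() : -1.0;
}

// Compares the run-time types of the two objects, not their static types.
bool sameDynamicType(const Shape& a, const Shape& b)
{
    return typeid(a) == typeid(b);
}

void checkCasts()
{
    Circle c(2.0);
    Square q(3.0);
    assert(radiusOrMinusOne(&c) == 2.0);
    assert(radiusOrMinusOne(&q) == -1.0);
}
```

An old-style cast such as (Circle*)s would have compiled for the Square as
well, and the call to radius() would then have read the wrong object.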

==============================================================================

[27.4] How do you use inheritance in C++, and is that different from Smalltalk?

Some people believe that the purpose of inheritance is code reuse.  In C++,
this is wrong.  Stated plainly, "inheritance is not for code reuse."

The purpose of inheritance in C++ is to express interface compliance
(subtyping), not to get code reuse.  In C++, code reuse usually comes via
composition rather than via inheritance.  In other words, inheritance is mainly
a specification technique rather than an implementation technique.

This is a major difference with Smalltalk, where there is only one form of
inheritance (C++ provides private inheritance to mean "share the code but don't
conform to the interface", and public inheritance to mean "kind-of").  The
Smalltalk language proper (as opposed to coding practice) allows you to have
the effect of "hiding" an inherited method by providing an override that calls
the "does not understand" method.  Furthermore Smalltalk allows a conceptual
"is-a" relationship to exist apart from the subclassing hierarchy (subtypes
don't have to be subclasses; e.g., you can make something that is-a Stack yet
doesn't inherit from class Stack).

In contrast, C++ is more restrictive about inheritance: there's no way to make
a "conceptual is-a" relationship without using inheritance (the C++ work-around
is to separate interface from implementation via ABCs[22.3]).  The C++ compiler
exploits the added semantic information associated with public inheritance to
provide static typing.
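
The two ways of reusing code without claiming "kind-of" can be sketched like
this.  Both class names are invented for the illustration, and std::vector is
used merely as a convenient thing to reuse.  In neither case can a user treat
the stack as if it were a vector.

```cpp
#include <cassert>
#include <vector>

// Code reuse via composition: the stack has-a vector.
class StackByComposition {
public:
    void push(int x)     { items_.push_back(x); }
    int  pop()           { int top = items_.back(); items_.pop_back(); return top; }
    bool isEmpty() const { return items_.empty(); }
private:
    std::vector<int> items_;
};

// Code reuse via private inheritance: "share the code but don't conform to
// the interface."  The vector's member functions are not visible to users.
class StackByPrivateInheritance : private std::vector<int> {
public:
    void push(int x)     { push_back(x); }
    int  pop()           { int top = back(); pop_back(); return top; }
    bool isEmpty() const { return empty(); }
};

void checkStacks()
{
    StackByComposition a;
    StackByPrivateInheritance b;
    a.push(1); a.push(2);
    b.push(1); b.push(2);
    assert(a.pop() == 2 && b.pop() == 2);
    // std::vector<int>* v = &b;   // would not compile: the base is private
    // b.push_back(3);             // would not compile: inherited privately
}
```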

==============================================================================

[27.5] What are the practical consequences of differences in Smalltalk/C++
       inheritance?

[For context, please read the previous FAQ[27.4]].

Smalltalk lets you make a subtype that isn't a subclass, and allows you to make
a subclass that isn't a subtype.  This allows Smalltalk programmers to be very
carefree in putting data (bits, representation, data structure) into a class
(e.g., you might put a linked list into class Stack).  After all, if someone
wants an array-based-Stack, they don't have to inherit from Stack; they could
inherit such a class from Array if desired, even though an ArrayBasedStack is
not a kind-of Array!

In C++, you can't be nearly as carefree.  Only mechanism (member function
code), but not representation (data bits) can be overridden in subclasses.
Therefore you're usually better off not putting the data structure in a class.
This leads to a stronger reliance on abstract base classes[22.3].
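
One way to picture this, as a sketch only: the abstract base class Stack below
carries no data at all, and each derived class chooses its own representation.
The names ListBasedStack and drainSum() are invented for the example
(ArrayBasedStack is borrowed from the text above), and std::vector and
std::list stand in for whatever array and linked list classes you actually
use.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Abstract base class: interface only, no data structure.
class Stack {
public:
    virtual ~Stack() { }
    virtual void push(int x)   = 0;
    virtual int  pop()         = 0;
    virtual bool empty() const = 0;
};

// An array-based stack is a kind-of Stack, not a kind-of array.
class ArrayBasedStack : public Stack {
public:
    void push(int x)   { items_.push_back(x); }
    int  pop()         { int top = items_.back(); items_.pop_back(); return top; }
    bool empty() const { return items_.empty(); }
private:
    std::vector<int> items_;
};

// A different representation behind the same interface.
class ListBasedStack : public Stack {
public:
    void push(int x)   { items_.push_front(x); }
    int  pop()         { int top = items_.front(); items_.pop_front(); return top; }
    bool empty() const { return items_.empty(); }
private:
    std::list<int> items_;
};

// Written against the ABC, so it works with either representation.
int drainSum(Stack& s)
{
    int sum = 0;
    while (!s.empty())
        sum += s.pop();
    return sum;
}

void checkDrain()
{
    ArrayBasedStack a;
    ListBasedStack  l;
    for (int i = 1; i <= 3; ++i) { a.push(i); l.push(i); }
    assert(drainSum(a) == 6 && drainSum(l) == 6);
}
```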

I like to think of the difference between an ATV and a Maserati.  An ATV (all
terrain vehicle) is more fun, since you can "play around" by driving through
fields, streams, sidewalks, and the like.  A Maserati, on the other hand, gets
you there faster, but it forces you to stay on the road.  My advice to C++
programmers is simple: stay on the road.  Even if you're one of those people
who like the "expressive freedom" to drive through the bushes, don't do it in
C++; it's not a good fit.

==============================================================================

-- 
Marshall Cline, Ph.D., President, Paradigm Shift, Inc.
315-353-6100 (voice)
315-353-6110 (fax)
mailto:cline@parashift.com
