Subject: C++ FAQ (#6 of 7)
Date: 6 Nov 1996 23:24:14 GMT
Summary: Please read this before posting to comp.lang.c++

Archive-name: C++-faq/part6
Posting-Frequency: monthly
URL: http://www.cerfnet.com/~mpcline/C++-FAQs-Lite/

AUTHOR: Marshall Cline / cline@parashift.com / Paradigm Shift, Inc. /
One Park St. / Norwood, NY 13668 / 315-353-6100 (voice) / 315-353-6110 (fax)

COPYRIGHT: This posting is part of "C++ FAQs Lite."  The entire "C++ FAQs Lite"
document is Copyright(C) 1991-96 Marshall P. Cline, Ph.D., cline@parashift.com.
All rights reserved.  Copying is permitted only under designated situations.
For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS.  THE AUTHOR PROVIDES NO
WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING
WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR
PURPOSE.

C++-FAQs-Lite != C++-FAQs-Book: This document, C++ FAQs Lite, is not the same
as the C++ FAQs Book.  The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is
500% larger than this document, and is available in bookstores.  For details,
see section [3].

==============================================================================

SECTION [28]: Reference and value semantics


[28.1] What is value and/or reference semantics, and which is best in C++?

With reference semantics, assignment is a pointer-copy (i.e., a reference).
Value (or "copy") semantics mean assignment copies the value, not just the
pointer.  C++ gives you the choice: use the assignment operator to copy the
value (copy/value semantics), or use a pointer-copy to copy a pointer
(reference semantics).  C++ allows you to override the assignment operator to
do anything your heart desires; however, the default (and most common) choice
is to copy the value.
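
To make the distinction concrete, here is a minimal sketch.  Point and the two
demo functions are hypothetical names invented for this illustration; they are
not part of any library.

```cpp
// Hypothetical class used only for illustration
struct Point {
  int x;
  int y;
};

// Value semantics: assignment copies the object itself.
// Returns b.x after a has been modified.
int valueSemanticsDemo()
{
  Point a = { 1, 2 };
  Point b = a;       // b is an independent copy of a
  a.x = 99;          // changing a does not change b
  return b.x;        // still 1
}

// Reference semantics: assignment copies only the pointer.
// Returns q->x after the shared object has been modified through p.
int referenceSemanticsDemo()
{
  Point a = { 1, 2 };
  Point* p = &a;
  Point* q = p;      // p and q now refer to the same Point
  p->x = 99;         // the change is visible through q as well
  return q->x;       // 99
}
```

With the value copy, the two objects go their separate ways after the
assignment; with the pointer copy, there is only one object and two ways to
reach it.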

Pros of reference semantics: flexibility and dynamic binding (you get dynamic
binding in C++ only when you pass by pointer or pass by reference, not when you
pass by value).

Pros of value semantics: speed.  "Speed" seems like an odd benefit for a
feature that requires an object (vs. a pointer) to be copied, but the fact of
the matter is that one usually accesses an object more than one copies the
object, so the cost of the occasional copies is (usually) more than offset by
the benefit of having an actual object rather than a pointer to an object.

There are three cases when you have an actual object as opposed to a pointer to
an object: local objects, global/static objects, and fully contained member
objects in a class.  The most important of these is the last ("composition").

More info about copy-vs-reference semantics is given in the next FAQs.  Please
read them all to get a balanced perspective.  The first few have intentionally
been slanted toward value semantics, so if you only read the first few of the
following FAQs, you'll get a warped perspective.

Assignment has other issues (e.g., shallow vs. deep copy) which are not covered
here.

==============================================================================

[28.2] What is "virtual data," and how-can / why-would I use it in C++?

virtual data allows a derived class to change the exact class of a base class's
member object.  virtual data isn't strictly "supported" by C++, however it can
be simulated in C++.  It ain't pretty, but it works.

To simulate virtual data in C++, the base class must have a pointer to the
member object, and the derived class must provide a new object to be pointed to
by the base class's pointer.  The base class would also have one or more normal
constructors that provide their own referent (again via new), and the base
class's destructor would delete the referent.

For example, class Stack might have an Array member object (using a pointer),
and derived class StretchableStack might override the base class member data
from Array to StretchableArray.  For this to work, StretchableArray would have
to inherit from Array, so Stack would have an Array*.  Stack's normal
constructors would initialize this Array* with a new Array, but Stack would
also have a (possibly protected:) constructor that would accept an Array* from
a derived class.  StretchableStack's constructor would provide a
new StretchableArray to this special constructor.
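
The arrangement just described might be sketched roughly as follows.  This is
only an outline under simplifying assumptions: the canGrow() member function is
invented so the example has something observable, and copying is disabled to
keep the sketch short.  Real Array and Stack classes would do much more.

```cpp
// Hypothetical minimal classes; only the "virtual data" plumbing is shown.
class Array {
public:
  virtual ~Array() { }
  virtual bool canGrow() const { return false; }
};

class StretchableArray : public Array {
public:
  virtual bool canGrow() const { return true; }
};

class Stack {
public:
  Stack() : data_(new Array) { }        // normal constructor provides its own referent
  virtual ~Stack() { delete data_; }    // base class destructor deletes the referent
  bool canGrow() const { return data_->canGrow(); }
protected:
  Stack(Array* data) : data_(data) { }  // special constructor for derived classes
private:
  Array* data_;                         // the simulated "virtual data"
  Stack(const Stack&);                  // copying disabled in this sketch
  Stack& operator=(const Stack&);
};

class StretchableStack : public Stack {
public:
  // The derived class supplies the new object for the base class's pointer
  StretchableStack() : Stack(new StretchableArray) { }
};
```

A StretchableStack can be passed wherever a Stack& is expected, and the
inherited code then works on a StretchableArray without knowing it.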

Pros:
 * Easier implementation of StretchableStack (most of the code is inherited)
 * Users can pass a StretchableStack as a kind-of Stack

Cons:
 * Adds an extra layer of indirection to access the Array
 * Adds some extra freestore allocation overhead (both new and delete)
 * Adds some extra dynamic binding overhead (reason given in next FAQ)

In other words, we succeeded at making our job easier as the implementer of
StretchableStack, but all our users pay for it[28.5].  Unfortunately the extra
overhead was imposed on both users of StretchableStack and on users of Stack.

Please read the rest of this section.  (You will not get a balanced perspective
without the others.)

==============================================================================

[28.3] What's the difference between virtual data and dynamic data?

The easiest way to see the distinction is by an analogy with virtual
functions[20]: A virtual member function means the declaration (signature) must
stay the same in subclasses, but the definition (body) can be overridden.  The
overriddenness of an inherited member function is a static property of the
subclass; it doesn't change dynamically throughout the life of any particular
object, nor is it possible for distinct objects of the subclass to have
distinct definitions of the member function.

Now go back and re-read the previous paragraph, but make these substitutions:
 * "member function" --> "member object"
 * "signature" --> "type"
 * "body" --> "exact class"

After this, you'll have a working definition of virtual data.

Another way to look at this is to distinguish "per-object" member functions
from "dynamic" member functions.  A "per-object" member function is a member
function that is potentially different in any given instance of an object, and
could be implemented by burying a function pointer in the object; this pointer
could be const, since the pointer will never be changed throughout the object's
life.  A "dynamic" member function is a member function that will change
dynamically over time; this could also be implemented by a function pointer,
but the function pointer would not be const.

Extending the analogy, this gives us three distinct concepts for data members:
 * virtual data: the definition (class) of the member object is overridable in
   subclasses provided its declaration ("type") remains the same, and this
   overriddenness is a static property of the subclass
 * per-object-data: any given object of a class can instantiate a different
   conformal (same type) member object upon initialization (usually a "wrapper"
   object), and the exact class of the member object is a static property of
   the object that wraps it
 * dynamic-data: the member object's exact class can change dynamically over
   time

The reason they all look so much the same is that none of this is "supported"
in C++.  It's all merely "allowed," and in this case, the mechanism for faking
each of these is the same: a pointer to a (probably abstract) base class.  In a
language that made these "first class" abstraction mechanisms, the difference
would be more striking, since they'd each have a different syntactic variant.
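
As a rough illustration of how similar the faked versions look, here is a
sketch of per-object-data and dynamic-data side by side.  All the names (Shape,
Triangle, Square, PerObjectHolder, DynamicHolder) are hypothetical.  The only
difference is whether the pointer to the abstract base class is const.

```cpp
// Hypothetical abstract base class and two concrete classes
class Shape {
public:
  virtual ~Shape() { }
  virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
  virtual int sides() const { return 3; }
};

class Square : public Shape {
public:
  virtual int sides() const { return 4; }
};

// Per-object-data: each holder picks the member's exact class when it is
// initialized, and that choice never changes (the pointer is const).
class PerObjectHolder {
public:
  explicit PerObjectHolder(Shape* s) : shape_(s) { }
  ~PerObjectHolder() { delete shape_; }
  int sides() const { return shape_->sides(); }
private:
  Shape* const shape_;
  PerObjectHolder(const PerObjectHolder&);             // copying disabled
  PerObjectHolder& operator=(const PerObjectHolder&);
};

// Dynamic-data: the member's exact class can change over time
// (the pointer is not const).
class DynamicHolder {
public:
  explicit DynamicHolder(Shape* s) : shape_(s) { }
  ~DynamicHolder() { delete shape_; }
  void replace(Shape* s) { delete shape_; shape_ = s; }
  int sides() const { return shape_->sides(); }
private:
  Shape* shape_;
  DynamicHolder(const DynamicHolder&);                 // copying disabled
  DynamicHolder& operator=(const DynamicHolder&);
};
```

Both holders take ownership of the Shape they are given; that ownership rule is
an assumption of this sketch, not something the language imposes.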

==============================================================================

[28.4] Should I normally use pointers to freestore allocated objects for my
       data members, or should I use "composition"?

Composition.

Your member objects should normally be "contained" in the composite object (but
not always; "wrapper" objects are a good example of where you want a
pointer/reference; also the N-to-1-uses-a relationship needs something like a
pointer/reference).

There are three reasons why fully contained member objects ("composition") have
better performance than pointers to freestore-allocated member objects:
 * Extra layer of indirection every time you need to access the member object
 * Extra freestore allocations (new in constructor, delete in destructor)
 * Extra dynamic binding (reason given below)
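
The two alternatives can be compared side by side in a small sketch.  Engine,
CarByValue and CarByPointer are hypothetical names made up for this example.

```cpp
// Hypothetical member-object class
class Engine {
public:
  Engine() : rpm_(0) { }
  void start() { rpm_ = 800; }
  int rpm() const { return rpm_; }
private:
  int rpm_;
};

// Composition: the Engine lives inside the Car object itself
class CarByValue {
public:
  void start() { engine_.start(); }          // direct access, no indirection
  int rpm() const { return engine_.rpm(); }
private:
  Engine engine_;
};

// Pointer to a freestore-allocated member object
class CarByPointer {
public:
  CarByPointer() : engine_(new Engine) { }   // extra allocation in constructor
  ~CarByPointer() { delete engine_; }        // extra deallocation in destructor
  void start() { engine_->start(); }         // extra indirection on every access
  int rpm() const { return engine_->rpm(); }
private:
  Engine* engine_;
  CarByPointer(const CarByPointer&);         // copying disabled in this sketch
  CarByPointer& operator=(const CarByPointer&);
};
```

Both classes behave the same from the outside.  The pointer version simply has
more work to do (and more code to get right, such as copying, which this sketch
sidesteps by disabling it).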

==============================================================================

[28.5] What are the relative costs of the 3 performance hits associated with
       allocating member objects from the freestore?

The three performance hits are enumerated in the previous FAQ:
 * By itself, an extra layer of indirection is small potatoes
 * Freestore allocations can be a performance issue (the performance of the
   typical implementation of malloc() degrades when there are many allocations;
   OO software can easily become "freestore bound" unless you're careful)
 * The extra dynamic binding comes from having a pointer rather than an object.
   Whenever the C++ compiler can know an object's exact class, virtual[20]
   function calls can be statically bound, which allows inlining.  Inlining
   allows zillions (would you believe half a dozen :-) optimization
   opportunities such as procedural integration, register lifetime issues, etc.
   The C++ compiler can know an object's exact class in three circumstances:
   local variables, global/static variables, and fully-contained member objects

Thus fully-contained member objects allow significant optimizations that
wouldn't be possible under the "member objects-by-pointer" approach.  This is
the main reason that languages which enforce reference-semantics have
"inherent" performance challenges.

Note: Please read the next three FAQs to get a balanced perspective!

==============================================================================

[28.6] Are "inline virtual" member functions ever actually "inlined"?

Occasionally...

When the object is referenced via a pointer or a reference, a call to a
virtual[20] function cannot be inlined, since the call must be resolved
dynamically.  Reason: the compiler can't know which actual code to call until
run-time (i.e., dynamically), since the code may be from a derived class that
was created after the caller was compiled.

Therefore the only time an inline virtual call can be inlined is when the
compiler knows the "exact class" of the object which is the target of the
virtual function call.  This can happen only when the compiler has an actual
object rather than a pointer or reference to an object.  I.e., either with a
local object, a global/static object, or a fully contained object inside a
composite.
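
A small sketch of the two situations follows.  Base, Derived and the two
functions are hypothetical names.  Whether a given compiler actually inlines
the first call is up to that compiler; the sketch only shows where it has
enough information to do so.

```cpp
class Base {
public:
  virtual ~Base() { }
  virtual int id() const { return 1; }   // defined in the class body, so implicitly inline
};

class Derived : public Base {
public:
  virtual int id() const { return 2; }
};

// The compiler has an actual local object, so it knows the exact class is
// Derived and is free to bind the call statically and inline it.
int callOnLocalObject()
{
  Derived d;
  return d.id();
}

// The compiler has only a reference, so in general the call must be resolved
// at run-time: b might refer to a class that doesn't even exist yet.
int callThroughReference(const Base& b)
{
  return b.id();
}
```

Either way the right function is called; the difference is only in how much
the compiler is allowed to know at compile time.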

Note that the difference between inlining and non-inlining is normally much
more significant than the difference between a regular function call and a
virtual function call.  For example, the difference between a regular function
call and a virtual function call is often just two extra memory references, but
the difference between an inline function and a non-inline function can be as
much as an order of magnitude (for zillions of calls to insignificant member
functions, loss of inlining virtual functions can result in 25X speed
degradation! [Doug Lea, "Customization in C++," proc Usenix C++ 1990]).

A practical consequence of this insight: don't get bogged down in the endless
debates (or sales tactics!) of compiler/language vendors who compare the cost
of a virtual function call on their language/compiler with the same on another
language/compiler.  Such comparisons are largely meaningless when compared with
the ability of the language/compiler to "inline expand" member function calls.
I.e., many language implementation vendors make a big stink about how good
their dispatch strategy is, but if these implementations don't inline member
function calls, the overall system performance would be poor, since it is
inlining --not dispatching-- that has the greatest performance impact.

Note: Please read the next two FAQs to see the other side of this coin!

==============================================================================

[28.7] Sounds like I should never use reference semantics, right?

Wrong.

Reference semantics are A Good Thing.  We can't live without pointers.  We just
don't want our s/w to be One Gigantic Rats Nest Of Pointers.  In C++, you can
pick and choose where you want reference semantics (pointers/references) and
where you'd like value semantics (where objects physically contain other
objects etc).  In a large system, there should be a balance.  However if you
implement absolutely everything as a pointer, you'll get enormous speed hits.

Objects near the problem skin are larger than higher level objects.  The
identity of these "problem space" abstractions is usually more important than
their "value." Thus reference semantics should be used for problem-space
objects.

Note that these problem space objects are normally at a higher level of
abstraction than the solution space objects, so the problem space objects
normally have a relatively lower frequency of interaction.  Therefore C++ gives
us an ideal situation: we choose reference semantics for objects that need
unique identity or that are too large to copy, and we can choose value
semantics for the others.  Thus the highest frequency objects will end up with
value semantics, since we install flexibility where it doesn't hurt us (only),
and we install performance where we need it most!

These are some of the many issues that come into play with real OO design.
OO/C++ mastery takes time and high quality training.  If you want a powerful
tool, you've got to invest.

Don't stop now! Read the next FAQ too!!

==============================================================================

[28.8] Does the poor performance of reference semantics mean I should
       pass-by-value?

Nope.

The previous FAQs were talking about member objects, not parameters.  Generally,
objects that are part of an inheritance hierarchy should be passed by reference
or by pointer, not by value, since only then do you get the (desired) dynamic
binding (pass-by-value doesn't mix with inheritance, since larger subclass
objects get "sliced" when passed by value as a base class object).
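
Here is a minimal sketch of slicing, using the hypothetical classes Animal and
Dog.  The same Dog object is passed once by value and once by reference.

```cpp
#include <string>

class Animal {
public:
  virtual ~Animal() { }
  virtual std::string name() const { return "Animal"; }
};

class Dog : public Animal {
public:
  virtual std::string name() const { return "Dog"; }
};

// Pass-by-value: the parameter is a brand new Animal copied from the
// argument's Animal part, so the Dog-ness is "sliced" off.
std::string nameByValue(Animal a)
{
  return a.name();
}

// Pass-by-reference: the parameter refers to the original object, so the
// call is dynamically bound.
std::string nameByReference(const Animal& a)
{
  return a.name();
}
```

Given a Dog d, nameByValue(d) yields "Animal" and nameByReference(d) yields
"Dog".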

Unless compelling reasons are given to the contrary, member objects should be
by value and parameters should be by reference.  The discussion in the previous
few FAQs indicates some of the "compelling reasons" for when member objects
should be by reference.

==============================================================================

SECTION [29]: How to mix C and C++


[29.1] What do I need to know when mixing C and C++ code?

There are several caveats:
 * You must use your C++ compiler when compiling main() (e.g., for static
   initialization)
 * Your C++ compiler should direct the linking process (e.g., so it can get its
   special libraries)
 * Your C and C++ compilers probably need to come from the same vendor and
   have compatible versions (e.g., so they have the same calling conventions)

In addition, you'll need to read the rest of this section to find out how to
make your C functions callable by C++ and/or your C++ functions callable by C.

==============================================================================

[29.2] How can I include a standard C header file in my C++ code?

To #include a standard header file (such as <stdio.h>), you don't have to do
anything unusual.  E.g.,

    // This is C++ code

    #include <stdio.h>          // Note: nothing unusual in #include line

    int main()
    {
      printf("Hello world\n");  // Note: nothing unusual in the call
    }

Note: Somewhat different guidelines apply for non-system C headers.  There are
two cases: either you can't change the header[29.3], or you can change the
header[29.4].

==============================================================================

[29.3] How can I include a non-system C header file in my C++ code?

If you are including a C header file that isn't provided by the system, you may
need to wrap the #include line in an extern "C" { /*...*/ } construct.  This
tells the C++ compiler that the functions declared in the header file are C
functions.

    // This is C++ code

    extern "C" {
      // Get declaration for f(int i, char c, float x)
      #include "my-C-code.h"
    }

    int main()
    {
      f(7, 'x', 3.14);   // Note: nothing unusual in the call
    }

Note: Somewhat different guidelines apply for C headers provided by the system
(such as <stdio.h>)[29.2] and for C headers that you can change[29.4].

==============================================================================

[29.4] How can I modify my own C header files so it's easier to #include them
       in C++ code?

If you are including a C header file that isn't provided by the system, and if
you are able to change the C header, you should strongly consider adding the
extern "C" {...} logic inside the header to make it easier for C++ users to
#include it into their C++ code.  Since a C compiler won't understand the
extern "C" construct, you must wrap the extern "C" { and } lines in an #ifdef
so they won't be seen by normal C compilers.

Step #1: Put the following lines at the very top of your C header file (note:
the symbol __cplusplus is #defined if/only-if the compiler is a C++ compiler):

    #ifdef __cplusplus
    extern "C" {
    #endif

Step #2: Put the following lines at the very bottom of your C header file:

    #ifdef __cplusplus
    }
    #endif

Now you can #include your C header without any extern C nonsense in your C++
code:

    // This is C++ code

    // Get declaration for f(int i, char c, float x)
    #include "my-C-code.h"   // Note: nothing unusual in #include line

    int main()
    {
      f(7, 'x', 3.14);       // Note: nothing unusual in the call
    }

Note: Somewhat different guidelines apply for C headers provided by the system
(such as <stdio.h>)[29.2] and for C headers that you can't change[29.3].

==============================================================================

[29.5] How can I call a non-system C function f(int,char,float) from my C++
       code?

If you have an individual C function that you want to call, and for some reason
you don't have or don't want to #include a C header file in which that function
is declared, you can declare the individual C function in your C++ code using
the extern "C" syntax.  Naturally you need to use the full function prototype:

    extern "C" void f(int i, char c, float x);

A block of several C functions can be grouped via braces:

    extern "C" {
      void   f(int i, char c, float x);
      int    g(char* s, const char* s2);
      double sqrtOfSumOfSquares(double a, double b);
    }

After this you simply call the function just as if it were a C++ function:

    int main()
    {
      f(7, 'x', 3.14);   // Note: nothing unusual in the call
    }

==============================================================================

[29.6] How can I create a C++ function f(int,char,float) that is callable by my
       C code?

The C++ compiler must know that f(int,char,float) is to be called by a C
compiler using the extern C construct[29.3]:

    // This is C++ code

    // Declare f(int,char,float) using extern C:
    extern "C" void f(int i, char c, float x);

    // ...

    // Define f(int,char,float) in some C++ module:
    void f(int i, char c, float x)
    {
      // ...
    }

The extern C line tells the compiler that the external information sent to the
linker should use C calling conventions and name mangling (e.g., preceded by a
single underscore).  Since name overloading isn't supported by C, you can't
make several overloaded functions simultaneously callable by a C program.
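
One common way around the overloading restriction is to give each overload its
own uniquely named extern "C" wrapper.  The following is only a sketch; the
function names are hypothetical.

```cpp
// This is C++ code

// Overloaded C++ functions (fine within C++)
int area(int side)              { return side * side; }
int area(int width, int height) { return width * height; }

// C has no overloading, so each C-callable wrapper needs a distinct name
extern "C" int area_square(int side)
{
  return area(side);
}

extern "C" int area_rect(int width, int height)
{
  return area(width, height);
}
```

The C code would then declare and call area_square() and area_rect() as
ordinary C functions.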

==============================================================================

[29.7] Why is the linker giving errors for C/C++ functions being called from
       C++/C functions?

If you didn't get your extern C right, you'll sometimes get linker errors
rather than compiler errors.  This is due to the fact that C++ compilers
usually "mangle" function names (e.g., to support function overloading)
differently than C compilers.

See the previous two FAQs on how to use extern C.

==============================================================================

[29.8] How can I pass an object of a C++ class to/from a C function?

Here's an example (for info on extern C, see the previous two FAQs).

Fred.h:

    /* This header can be read by both C and C++ compilers */

    #ifdef __cplusplus
      class Fred {
      public:
        Fred();
        void wilma(int);
      private:
        int a_;
      };
    #else
      typedef
        struct Fred
          Fred;
    #endif

    #ifdef __cplusplus
    extern "C" {
    #endif

    #if defined(__STDC__) || defined(__cplusplus)
      extern void c_function(Fred*);   /* ANSI-C prototypes */
      extern Fred* cplusplus_callback_function(Fred*);
    #else
      extern void c_function();        /* K&R style */
      extern Fred* cplusplus_callback_function();
    #endif

    #ifdef __cplusplus
    }
    #endif

Fred.cpp:

    // This is C++ code

    #include "Fred.h"

    Fred::Fred() : a_(0) { }

    void Fred::wilma(int a) { }

    Fred* cplusplus_callback_function(Fred* fred)
    {
      fred->wilma(123);
      return fred;
    }

main.cpp:

    // This is C++ code

    #include "Fred.h"

    int main()
    {
      Fred fred;
      c_function(&fred);
      return 0;
    }

c-function.c:

    /* This is C code */

    #include "Fred.h"

    void c_function(Fred* fred)
    {
      cplusplus_callback_function(fred);
    }

Passing pointers to C++ objects to/from C functions will fail if you pass and
get back something that isn't exactly the same pointer.  For example, don't
pass a base class pointer and receive back a derived class pointer, since your
C compiler won't understand the pointer conversions necessary to handle
multiple and/or virtual inheritance.
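
The reason is easiest to see from the C++ side.  In the sketch below (Left,
Right and Both are hypothetical classes), converting a Both* to a Right* is not
necessarily a no-op: the compiler may have to adjust the address.  A C compiler
knows nothing about such adjustments.

```cpp
class Left {
public:
  virtual ~Left() { }
  int l;
};

class Right {
public:
  virtual ~Right() { }
  int r;
};

class Both : public Left, public Right { };

// Returns true if converting Both* to Right* changed the machine address.
// The language doesn't promise either answer; with typical layouts the Right
// part doesn't start at the beginning of the object, so the address differs.
bool conversionAdjustsAddress()
{
  Both b;
  Both*  pb = &b;
  Right* pr = pb;   // implicit derived-to-base conversion
  return static_cast<void*>(pr) != static_cast<void*>(pb);
}

// Within C++, static_cast undoes the conversion correctly, so the original
// pointer is recovered.
bool roundTripIsExact()
{
  Both b;
  Right* pr   = &b;
  Both*  back = static_cast<Both*>(pr);
  return back == &b;
}
```

If the pointer travels through C code, hand it over and take it back as the
very same type, and leave all such conversions to the C++ side.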

==============================================================================

[29.9] Can my C function directly access data in an object of a C++ class?

Sometimes.

(For basic info on passing C++ objects to/from C functions, read the previous
FAQ).

You can safely access a C++ object's data from a C function if the C++ class:
 * Has no virtual[20] functions (including inherited virtual functions)
 * Has all its data in the same access-level section (private/protected/public)
 * Has no fully-contained subobjects with virtual[20] functions

If the C++ class has any base classes at all (or if any fully contained
subobjects have base classes), accessing the data will technically be
non-portable, since class layout under inheritance isn't imposed by the
language.  However in practice, all C++ compilers do it the same way: the base
class object appears first (in left-to-right order in the event of multiple
inheritance), and member objects follow.

Furthermore, if the class (or any base class) contains any virtual functions,
almost all C++ compilers put a void* into the object either at the location of
the first virtual function or at the very beginning of the object.  Again, this
is not required by the language, but it is the way "everyone" does it.

If the class has any virtual base classes, it is even more complicated and less
portable.  One common implementation technique is for objects to contain an
object of the virtual base class (V) last (regardless of where V shows up as a
virtual base class in the inheritance hierarchy).  The rest of the object's
parts appear in the normal order.  Every derived class that has V as a virtual
base class actually has a pointer to the V part of the final object.

==============================================================================

[29.10] Why do I feel like I'm "further from the machine" in C++ as opposed to
        C?

Because you are.

As an OO programming language, C++ allows you to model the problem domain
itself, which allows you to program in the language of the problem domain
rather than in the language of the solution domain.

One of C's great strengths is the fact that it has "no hidden mechanism": what
you see is what you get.  You can read a C program and "see" every clock cycle.
This is not the case in C++; old line C programmers (such as many of us once
were) are often ambivalent (can you say, "hostile"?) about this feature.
However after they've made the transition to OO thinking, they often realize
that although C++ hides some mechanism from the programmer, it also provides a
level of abstraction and economy of expression which lowers maintenance costs
without destroying run-time performance.

Naturally you can write bad code in any language; C++ doesn't guarantee any
particular level of quality, reusability, abstraction, or any other measure of
"goodness."

C++ doesn't try to make it impossible for bad programmers to write bad
programs; it enables reasonable developers to create superior software.

==============================================================================

SECTION [30]: Pointers to member functions


[30.1] Is the type of "pointer-to-member-function" different from
       "pointer-to-function"?

Yep.

Consider the following function:

    int f(char a, float b);

The type of this function is different depending on whether it is an ordinary
function or a non-static member function of some class:
 * Its type is "int (*)(char,float)" if an ordinary function
 * Its type is "int (Fred::*)(char,float)" if a non-static member function of
   class Fred

Note: if it's a static member function of class Fred, its type is the same as
if it was an ordinary function: "int (*)(char,float)".
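
The three cases can be sketched as follows.  The typedef names and the member
function s() are hypothetical, added only for this illustration.

```cpp
class Fred {
public:
  int f(char, float)        { return 1; }   // non-static member function
  static int s(char, float) { return 2; }   // static member function
};

int f(char, float) { return 0; }            // ordinary function

typedef int (*FunctionPtr)(char, float);          // pointer-to-function
typedef int (Fred::*FredMemberPtr)(char, float);  // pointer-to-member-function

int callOrdinary()
{
  FunctionPtr p = &f;
  return p('x', 3.14f);
}

int callStatic()
{
  FunctionPtr p = &Fred::s;       // same type as an ordinary function pointer
  return p('x', 3.14f);
}

int callMember()
{
  FredMemberPtr p = &Fred::f;     // needs an object to be invoked on
  Fred fred;
  return (fred.*p)('x', 3.14f);
}
```

Trying to initialize a FunctionPtr with &Fred::f is a compile-time error, which
is the type mismatch this FAQ is about.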

==============================================================================

[30.2] How do I pass a pointer to member function to a signal handler, X event
       callback, etc?

Don't.

Because a member function is meaningless without an object to invoke it on, you
can't do this directly (if the X Window System were rewritten in C++, it would
probably pass references to objects around, not just pointers to functions;
naturally the objects would embody the required function and probably a whole
lot more).

As a patch for existing software, use a top-level (non-member) function as a
wrapper which takes an object obtained through some other technique (held in a
global, perhaps).  The top-level function would apply the desired member
function against the global object.

E.g., suppose you want to call Fred::memberFunction() on interrupt:

    #include <signal.h>

    class Fred {
    public:
      void memberFunction();
      static void staticMemberFunction(int);  // A static member function can handle it
      // ...
    };

    // Wrapper function uses a global to remember the object.
    // Signal handlers take the signal number as an int parameter.
    Fred* object_which_will_handle_signal;
    void Fred_memberFunction_wrapper(int /*signalNumber*/)
    {
      object_which_will_handle_signal->memberFunction();
    }

    int main()
    {
      /* signal(SIGINT, Fred::memberFunction); */   // Can NOT do this
      signal(SIGINT, Fred_memberFunction_wrapper);  // OK
      signal(SIGINT, Fred::staticMemberFunction);   // Also OK
    }

Note: static member functions do not require an actual object to be invoked, so
pointers-to-static-member-functions are type compatible with regular
pointers-to-functions.

==============================================================================

[30.3] Why do I keep getting compile errors (type mismatch) when I try to use a
       member function as an interrupt service routine?

This is a special case of the previous two questions, therefore read the
previous two answers first.

Non-static member functions have a hidden parameter that corresponds to the
this pointer.  The this pointer points to the instance data for the object.
The interrupt hardware/firmware in the system is not capable of providing the
this pointer argument.  You must use "normal" functions (non class members) or
static member functions as interrupt service routines.

One possible solution is to use a static member as the interrupt service
routine and have that function look somewhere to find the instance/member pair
that should be called on interrupt.  Thus the effect is that a member function
is invoked on an interrupt, but for technical reasons you need to call an
intermediate function first.
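
A sketch of that approach is shown below.  Device, registerHandler() and isr()
are hypothetical names.  Installing isr() as the real interrupt service routine
is platform-specific and is not shown; the sketch only covers the
instance/member lookup.

```cpp
class Device {
public:
  typedef void (Device::*Handler)();

  Device() : count_(0) { }

  void onInterrupt() { ++count_; }     // the member function we really want called
  int  count() const { return count_; }

  // Remember which instance/member pair should be called on interrupt
  static void registerHandler(Device* object, Handler handler)
  {
    object_  = object;
    handler_ = handler;
  }

  // Static member function: has no this pointer, so it has the same type
  // as an ordinary function and can serve as the intermediate function.
  static void isr()
  {
    if (object_ != 0 && handler_ != 0)
      (object_->*handler_)();
  }

private:
  int count_;
  static Device* object_;
  static Handler handler_;
};

Device*         Device::object_  = 0;
Device::Handler Device::handler_ = 0;
```

After Device::registerHandler(&d, &Device::onInterrupt), each call of
Device::isr() ends up calling d.onInterrupt().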

==============================================================================

[30.4] Why am I having trouble taking the address of a C++ function?

This is a corollary to the previous FAQ.

Long answer: In C++, member functions have an implicit parameter which points
to the object (the this pointer inside the member function).  Normal C
functions can be thought of as having a different calling convention from
member functions, so the types of their pointers (pointer-to-member-function
vs. pointer-to-function) are different and incompatible.  C++ introduces a new
type of pointer, called a pointer-to-member, which can be invoked only by
providing an object.

NOTE: do not attempt to "cast" a pointer-to-member-function into a
pointer-to-function; the result is undefined and probably disastrous.  E.g., a
pointer-to-member-function is not required to contain the machine address of
the appropriate function.  As was said in the last example, if you need a
pointer to a regular C function, use either a top-level (non-member) function,
or a static (class) member function.

==============================================================================

[30.5] How can I avoid syntax errors when calling a member function using a
       pointer-to-member-function?

Two things: (1) use a typedef, and (2) use a #define macro.

Here's the way you create the typedef:

    class Fred {
    public:
      int f(char x, float y);
      int g(char x, float y);
      int h(char x, float y);
      int i(char x, float y);
      // ...
    };

    // FredMemberFn points to a member of Fred that takes (char,float)
    typedef  int (Fred::*FredMemberFn)(char x, float y);

Here's the way you create the #define macro (normally I dislike #define
macros[9.3], but this is one of those rare cases where they actually improve
the readability and writability of your code):

    #define callMemberFunction(object,ptrToMember)  ((object).*(ptrToMember))

Here's how you use these features:

    void userCode(Fred& fred, FredMemberFn memFn)
    {
      callMemberFunction(fred,memFn)('x', 3.14);
      // Would normally be: (fred.*memFn)('x', 3.14);
    }

I strongly recommend these features.  In the real world, member function
invocations are a lot more complex than the simple example just given, and the
difference in readability and writability is significant.  comp.lang.c++ has
had to endure hundreds and hundreds of postings from confused programmers who
couldn't quite get the syntax right.  Almost all these errors would have
vanished had they used these features.

==============================================================================

[30.6] How do I create and use an array of pointers to member functions?

Use the usual typedef and #define macro[30.5] and you're 90% done.

First, use a typedef:

    class Fred {
    public:
      int f(char x, float y);
      int g(char x, float y);
      int h(char x, float y);
      int i(char x, float y);
      // ...
    };

    // FredMemberFn points to a member of Fred that takes (char,float)
    typedef  int (Fred::*FredMemberFn)(char x, float y);

That makes the array of pointers-to-member-functions straightforward:

    FredMemberFn a[4] = { &Fred::f, &Fred::g, &Fred::h, &Fred::i };

Second, use the callMemberFunction macro:

    #define callMemberFunction(object,ptrToMember)  ((object).*(ptrToMember))

That makes calling one of the member functions on object "fred"
straightforward:

    void userCode(Fred& fred, int memberFunctionNum)
    {
      // Assume memberFunctionNum is between 0 and 3 inclusive:
      callMemberFunction(fred, a[memberFunctionNum]) ('x', 3.14);
    }

==============================================================================

SECTION [31]: Container classes and templates


[31.1] How can I make a perl-like associative array in C++?

Use the standard class template map<Key,Val>:

    #include <string>
    #include <map>
    #include <iostream>
    using namespace std;

    bool todayIsFredsBirthday();            // assumed to be defined elsewhere

    int main()
    {
      map<string,int,less<string> >  age;   // age is a map from string to int

      age["Fred"] = 42;                     // Fred is 42 years old
      age["Barney"] = 37;                   // Barney is 37

      if (todayIsFredsBirthday())           // On Fred's birthday,
        ++ age["Fred"];                     // increment Fred's age

      cout << "Fred is " << age["Fred"] << " years old\n";
    }

==============================================================================

[31.2] How can I build a <favorite container> of objects of different types?

You can't, but you can fake it pretty well.  In C/C++ all arrays are
homogeneous (i.e., the elements are all the same type).  However, with an extra
layer of indirection you can give the appearance of a heterogeneous container
(a heterogeneous container is a container where the contained objects are of
different types).

There are two cases with heterogeneous containers.

The first case occurs when all objects you want to store in a container are
publicly derived from a common base class.  You can then declare/define your
container to hold pointers to the base class.  You indirectly store a derived
class object in a container by storing the object's address as an element in
the container.  You can then access objects in the container indirectly through
the pointers (enjoying polymorphic behavior).  If you need to know the exact
type of the object in the container you can use dynamic_cast<> or typeid().
You'll probably need the Virtual Constructor Idiom[20.5] to copy a container of
disparate object types.  The downside of this approach is that it makes memory
management a little more problematic (who "owns" the pointed-to objects? if you
delete these pointed-to objects when you destroy the container, how can you
guarantee that no one else has a copy of one of these pointers? if you don't
delete these pointed-to objects when you destroy the container, how can you be
sure that someone else will eventually do the deleting?).  It also makes
copying the container more complex (may actually break the container's copying
functions since you don't want to copy the pointers, at least not when the
container "owns" the pointed-to objects).
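
As a rough sketch of this first case, the following container holds pointers
to a common base class, owns the pointed-to objects, and uses the Virtual
Constructor Idiom to copy itself.  The names Shape, Square, Rect and ShapeList
are hypothetical, and making the container the sole owner is only one of the
possible answers to the ownership questions raised above.

```cpp
#include <cstddef>
#include <vector>

class Shape {
public:
  virtual ~Shape() { }
  virtual double area() const = 0;
  virtual Shape* clone() const = 0;     // Virtual Constructor Idiom
};

class Square : public Shape {
public:
  explicit Square(double side) : side_(side) { }
  double area() const { return side_ * side_; }
  Shape* clone() const { return new Square(*this); }
private:
  double side_;
};

class Rect : public Shape {
public:
  Rect(double w, double h) : w_(w), h_(h) { }
  double area() const { return w_ * h_; }
  Shape* clone() const { return new Rect(*this); }
private:
  double w_, h_;
};

// A container that owns the objects its pointers refer to
class ShapeList {
public:
  ShapeList() { }
  ShapeList(const ShapeList& other)
  {
    try {
      for (std::size_t i = 0; i < other.items_.size(); ++i)
        add(*other.items_[i]);          // deep copy: clone each element
    } catch (...) {
      clear();                          // don't leak the clones made so far
      throw;
    }
  }
  ShapeList& operator= (const ShapeList& other)
  {
    ShapeList tmp(other);               // copy first; failure leaves *this alone
    items_.swap(tmp.items_);            // tmp's destructor frees the old elements
    return *this;
  }
  ~ShapeList() { clear(); }

  void add(const Shape& s)
  {
    items_.push_back(static_cast<Shape*>(0));   // make room before cloning
    try { items_.back() = s.clone(); }
    catch (...) { items_.pop_back(); throw; }
  }
  std::size_t size() const { return items_.size(); }
  double totalArea() const
  {
    double sum = 0;
    for (std::size_t i = 0; i < items_.size(); ++i)
      sum += items_[i]->area();         // dynamic binding through the pointer
    return sum;
  }
private:
  std::vector<Shape*> items_;
  void clear()
  {
    for (std::size_t i = 0; i < items_.size(); ++i)
      delete items_[i];
    items_.clear();
  }
};
```

Because add() stores a clone rather than the caller's pointer, nobody outside
the container holds a pointer to an element, which is what makes it safe for
the destructor to delete them.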

The second case occurs when the object types are disjoint -- they do not share
a common base class.  The approach here is to use a handle class.  The
container is a container of handle objects (by value or by pointer, your
choice; by value is easier).  Each handle object knows how to "hold on to"
(i.e., maintain a pointer to) one of the objects you want to put in the
container.  You can use either a single handle class with several different
types of pointers as instance data, or a hierarchy of handle classes that
shadow the various types you wish to contain (requires the container be of
handle base class pointers).  The downside of this approach is that it opens up
the handle class(es) to maintenance every time you change the set of types that
can be contained.  The benefit is that you can use the handle class(es) to
encapsulate most of the ugliness of memory management and object lifetime.
Thus using handle objects may be beneficial even in the first case.
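
Here is a minimal sketch of the "single handle class with several different
types of pointers" variant.  Apple and Bolt are hypothetical unrelated classes,
and the decision that each Handle owns a private copy of its object is an
assumption made to keep the memory management simple.

```cpp
#include <string>
#include <vector>

// Two unrelated (disjoint) types; neither shares a base class with the other
class Apple { public: std::string variety() const { return "apple"; } };
class Bolt  { public: std::string thread()  const { return "bolt";  } };

// Exactly one of the two pointers is non-null at any time
class Handle {
public:
  Handle(const Apple& a) : apple_(new Apple(a)), bolt_(0) { }
  Handle(const Bolt& b)  : apple_(0), bolt_(new Bolt(b)) { }
  Handle(const Handle& h)
    : apple_(h.apple_ ? new Apple(*h.apple_) : 0),
      bolt_ (h.bolt_  ? new Bolt(*h.bolt_)   : 0) { }
  Handle& operator= (const Handle& h)
  {
    Handle tmp(h);                      // copy, then swap pointers with the copy
    Apple* a = apple_;  apple_ = tmp.apple_;  tmp.apple_ = a;
    Bolt*  b = bolt_;   bolt_  = tmp.bolt_;   tmp.bolt_  = b;
    return *this;
  }
  ~Handle() { delete apple_; delete bolt_; }

  bool isApple() const { return apple_ != 0; }
  bool isBolt()  const { return bolt_  != 0; }
  std::string describe() const
    { return apple_ ? apple_->variety() : bolt_->thread(); }
private:
  Apple* apple_;
  Bolt*  bolt_;
};
```

A std::vector<Handle> then holds Apples and Bolts side by side, by value.  The
cost mentioned above is visible here: supporting a third type means touching
every member function of Handle.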

==============================================================================

[31.3] How can I insert/access/change elements from a linked
       list/hashtable/etc?

I'll use "inserting into a linked list" as a prototypical example.  It's
easy to allow insertion at the head and tail of the list, but limiting
ourselves to these would produce a library that is too weak (a weak library is
almost worse than no library).

This answer will be a lot to swallow for novice C++'ers, so I'll give a few
options.  The first option is easiest; the second and third are better.
 1. Empower the List with a "current location," and member functions such as
    advance(), backup(), atEnd(), atBegin(), getCurrElem(), setCurrElem(Elem),
    insertElem(Elem), and removeElem().  Although this works in small examples,
    the notion of a current position makes it difficult to access elements at
    two or more positions within the list (e.g., "for all pairs x,y do the
    following...").
 2. Remove the above member functions from List itself, and move them to a
    separate class, ListPosition.  ListPosition would act as a "current
    position" within a list.  This allows multiple positions within the same
    list.  ListPosition would be a friend[14] of class List, so List can hide
    its innards from the outside world (else the innards of List would have to
    be publicized via public member functions in List).  Note: ListPosition can
    use operator overloading for things like advance() and backup(), since
    operator overloading is syntactic sugar for normal member functions.
 3. Consider the entire iteration as an atomic event, and create a class
    template that embodies this event.  This enhances performance by allowing the
    public access member functions (which may be virtual[20] functions) to be
    avoided during the inner loop.  Unfortunately you get extra object code in
    the application, since templates gain speed by duplicating code.  For more,
    see [Koenig, "Templates as interfaces," JOOP, 4, 5 (Sept 91)], and
    [Stroustrup, "The C++ Programming Language Second Edition," under
    "Comparator"].

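Option 2 can be sketched roughly as follows.  This is a bare-bones singly
linked list of int, and the member names (append(), atEnd(), and the
overloaded operators) as well as the helper countPairsWithSum() are
assumptions chosen for illustration, not a recommended library interface.

```cpp
class ListPosition;

class List {
public:
  List() : head_(0), tail_(0) { }
  ~List()
  {
    while (head_ != 0) { Node* n = head_; head_ = head_->next; delete n; }
  }
  void append(int x)
  {
    Node* n = new Node(x);
    if (tail_ != 0) tail_->next = n; else head_ = n;
    tail_ = n;
  }
private:
  struct Node {
    int   elem;
    Node* next;
    explicit Node(int e) : elem(e), next(0) { }
  };
  Node* head_;
  Node* tail_;
  List(const List&);                  // copying is left out of this sketch
  List& operator= (const List&);
  friend class ListPosition;          // lets ListPosition see the innards
};

// A "current position" within a List; any number may exist at once
class ListPosition {
public:
  explicit ListPosition(List& list) : curr_(list.head_) { }
  bool atEnd() const            { return curr_ == 0; }
  int& operator* () const       { return curr_->elem; }               // get/set
  ListPosition& operator++ ()   { curr_ = curr_->next; return *this; } // advance
private:
  List::Node* curr_;
};

// "For all pairs x,y": needs two positions in the same list at the same time
int countPairsWithSum(List& list, int sum)
{
  int count = 0;
  for (ListPosition x(list); !x.atEnd(); ++x) {
    ListPosition y = x;
    for (++y; !y.atEnd(); ++y)
      if (*x + *y == sum)
        ++count;
  }
  return count;
}
```

The nested loop is exactly what option 1's single "current location" makes
awkward, since x and y must advance independently.
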
==============================================================================

[31.4] What's the idea behind templates?

A template is a cookie-cutter that specifies how to cut cookies that all look
pretty much the same (although the cookies can be made of various kinds of
dough, they'll all have the same basic shape).  In the same way, a class
template is a cookie cutter for a description of how to build a family of
classes that all look basically the same, and a function template describes how
to build a family of similar looking functions.

Class templates are often used to build type safe containers (although this
only scratches the surface for how they can be used).

==============================================================================

[31.5] What's the syntax / semantics for a "function template"?

Consider this function that swaps its two integer arguments:

    void swap(int& x, int& y)
    {
      int tmp = x;
      x = y;
      y = tmp;
    }

If we also had to swap floats, longs, Strings, Sets, and FileSystems, we'd get
pretty tired of coding lines that look almost identical except for the type.
Mindless repetition is an ideal job for a computer, hence a function template:

    template<class T>
    void swap(T& x, T& y)
    {
      T tmp = x;
      x = y;
      y = tmp;
    }

Every time we use swap() with arguments of a given type, the compiler will go
to the above definition and will create yet another "template function" as an
instantiation of the above.  E.g.,

    int main()
    {
      int    i,j;  /*...*/  swap(i,j);  // Instantiates a swap for int
      float  a,b;  /*...*/  swap(a,b);  // Instantiates a swap for float
      char   c,d;  /*...*/  swap(c,d);  // Instantiates a swap for char
      String s,t;  /*...*/  swap(s,t);  // Instantiates a swap for String
    }

Note: A "template function" is the instantiation of a "function template".

==============================================================================

[31.6] What's the syntax / semantics for a "class template"?

Consider a container class Array that acts like an array of integers:

    // This would go into a header file such as "Array.h"
    class Array {
    public:
      Array(int len=10)                  : len_(len), data_(new int[len]) { }
     ~Array()                            { delete [] data_; }
      int len() const                    { return len_;     }
      const int& operator[](int i) const { return data_[check(i)]; }
            int& operator[](int i)       { return data_[check(i)]; }
      Array(const Array&);
      Array& operator= (const Array&);
    private:
      int  len_;
      int* data_;
      int  check(int i) const
        { if (i < 0 || i >= len_) throw BoundsViol("Array", i, len_);
          return i; }
    };

Just as with swap() above, repeating the above over and over for Array of
float, of char, of String, of Array-of-String, etc, will become tedious.

    // This would go into a header file such as "Array.h"
    template<class T>
    class Array {
    public:
      Array(int len=10)                : len_(len), data_(new T[len]) { }
     ~Array()                          { delete [] data_; }
      int len() const                  { return len_;     }
      const T& operator[](int i) const { return data_[check(i)]; }
            T& operator[](int i)       { return data_[check(i)]; }
      Array(const Array<T>&);
      Array& operator= (const Array<T>&);
    private:
      int len_;
      T*  data_;
      int check(int i) const
        { if (i < 0 || i >= len_) throw BoundsViol("Array", i, len_);
          return i; }
    };

Unlike template functions, template classes (instantiations of class templates)
need to be explicit about the parameters over which they are instantiating:

    int main()
    {
      Array<int>           ai;
      Array<float>         af;
      Array<char*>         ac;
      Array<String>        as;
      Array< Array<int> >  aai;
    }

Note the space between the two >'s in the last example.  Without this space,
the compiler would see a >> (right-shift) token instead of two >'s.
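
The class template above declares a copy constructor and an assignment
operator without showing their bodies.  Here is one way they might be written
outside the class body.  In this sketch std::out_of_range stands in for
BoundsViol (which this FAQ never defines), and the copy-then-swap assignment
is just one reasonable choice.

```cpp
#include <stdexcept>

template<class T>
class Array {
public:
  Array(int len=10)                : len_(len), data_(new T[len]) { }
 ~Array()                          { delete [] data_; }
  int len() const                  { return len_;     }
  const T& operator[](int i) const { return data_[check(i)]; }
        T& operator[](int i)       { return data_[check(i)]; }
  Array(const Array<T>&);
  Array<T>& operator= (const Array<T>&);
private:
  int len_;
  T*  data_;
  int check(int i) const
    { if (i < 0 || i >= len_) throw std::out_of_range("Array index");
      return i; }
};

// Member functions defined outside the class need the template<class T>
// prefix and the fully qualified name Array<T>::
template<class T>
Array<T>::Array(const Array<T>& a)
  : len_(a.len_), data_(new T[a.len_])
{
  try {
    for (int i = 0; i < len_; ++i)
      data_[i] = a.data_[i];
  } catch (...) {
    delete [] data_;                  // T's assignment threw: don't leak
    throw;
  }
}

template<class T>
Array<T>& Array<T>::operator= (const Array<T>& a)
{
  if (this != &a) {
    Array<T> tmp(a);                  // copy first; failure leaves *this intact
    T*  d = data_;  data_ = tmp.data_;  tmp.data_ = d;
    int n = len_;   len_  = tmp.len_;   tmp.len_  = n;
  }                                   // tmp's destructor frees the old data
  return *this;
}
```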

==============================================================================

[31.7] What is a "parameterized type"?

Another way to say, "class templates."

A parameterized type is a type that is parameterized over another type or some
value.  List<int> is a type (List) parameterized over another type (int).
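
To illustrate the "or some value" part, here is a small sketch of a type
parameterized over both a type and an integer.  FixedArray is a hypothetical
name invented for this example.

```cpp
// FixedArray<T,N> is parameterized over a type (T) and a value (N)
template<class T, int N>
class FixedArray {
public:
  int len() const                  { return N; }
  const T& operator[](int i) const { return data_[i]; }
        T& operator[](int i)       { return data_[i]; }
private:
  T data_[N];                      // size fixed at compile time by N
};
```

FixedArray<int,4> and FixedArray<int,8> are two distinct types, just as
List<int> and List<float> are.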

==============================================================================

[31.8] What is "genericity"?

Yet another way to say, "class templates."

Not to be confused with "generality" (which just means avoiding solutions which
are overly specific), "genericity" means class templates.

==============================================================================

SECTION [32]: Class libraries


[32.1] Where can I get a copy of "STL"?

"STL" is the "Standard Template Library".  You can get a copy from:
 * An STL site: ftp://ftp.cs.rpi.edu/pub/stl
 * STL HP official site: ftp://butler.hpl.hp.com/stl/
 * Mirror site in Europe: http://www.maths.warwick.ac.uk/ftp/mirrors/c++/stl/
 * STL code alternate: ftp://ftp.cs.rpi.edu/stl
 * STL code + examples: http://www.cs.rpi.edu/~musser/stl.html

STL hacks for GCC-2.6.3 are part of the GNU libg++ package 2.6.2.1 or later
(and they may be in an earlier version as well).  Thanks to Mike Lindner.

==============================================================================

[32.2] How can I find a Fred object in an STL container of Fred* such as
       vector<Fred*>?

STL functions such as std::find_if() help you find a T element in a container
of T's.  But if you have a container of pointers such as vector<Fred*>, these
functions will enable you to find an element that matches a given Fred*
pointer, but they don't let you find an element that matches a given Fred
object.

The solution is to use an optional parameter that specifies the "match"
function.  The following class template lets you compare the objects on the
other end of the dereferenced pointers.

    template<class T>
    class DereferencedEqual {
    public:
      DereferencedEqual(const T* p) : p_(p) { }
      bool operator() (const T* p2) const { return *p_ == *p2; }
    private:
      const T* p_;
    };

Now you can use this template to find an appropriate Fred object:

    void userCode(const vector<Fred*>& v, const Fred& match)
    {
      find_if(v.begin(), v.end(), DereferencedEqual<Fred>(&match));
      // ...
    }
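
For a version that can be compiled and run as-is, here is a sketch.  The Fred
class shown (an id plus operator==) and the helper indexOf() are assumptions
for illustration; the predicate is a function object, so find_if() calls its
operator() once per element.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical Fred: two Freds are equal when their ids match
class Fred {
public:
  explicit Fred(int id) : id_(id) { }
  bool operator== (const Fred& other) const { return id_ == other.id_; }
private:
  int id_;
};

// Predicate: compares the objects on the far end of the pointers
template<class T>
class DereferencedEqual {
public:
  explicit DereferencedEqual(const T* p) : p_(p) { }
  bool operator() (const T* p2) const { return *p_ == *p2; }
private:
  const T* p_;
};

// Returns the index of the first element equal to match, or -1 if none
int indexOf(const std::vector<Fred*>& v, const Fred& match)
{
  std::vector<Fred*>::const_iterator it =
    std::find_if(v.begin(), v.end(), DereferencedEqual<Fred>(&match));
  return it == v.end() ? -1 : int(it - v.begin());
}
```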

==============================================================================

[32.3] Where can I get help on how to use STL?

Kenny Zalewski's STL guide: http://www.cs.rpi.edu/projects/STL/htdocs/stl.html

Dave Musser's STL guide: http://www.cs.rpi.edu/~musser/stl.html

Mumit's STL Newbie's guide:
http://www.xraylith.wisc.edu/~khan/software/stl/STL.newbie.html

==============================================================================

[32.4] How can you tell if you have a dynamically typed C++ class library?

 * Hint #1: when everything is derived from a single root class, usually
   Object.
 * Hint #2: when the container classes (List, Stack, Set, etc) are
   non-templates.
 * Hint #3: when the container classes (List, Stack, Set, etc) insert/extract
   elements as pointers to Object.  This lets you put an Apple into such a
   container, but when you get it out, the compiler knows only that it is
   derived from Object, so you have to use a pointer cast to convert it back to
   an Apple*; and you'd better pray a lot that it really is an Apple, 'cause
   your blood is on your own head.

You can make the pointer cast "safe" by using dynamic_cast, but this dynamic
testing is just that: dynamic.  This coding style is the essence of dynamic
typing in C++.  You call a function that says "convert this Object into an
Apple or give me NULL if it's not an Apple," and you've got dynamic typing: you
don't know what will happen until run-time.

When you use templates to implement your containers, the C++ compiler can
statically validate 90+% of an application's typing information (the figure
"90+%" is apocryphal; some claim they always get 100%, those who need
persistence[34.4] get something less than 100% static type checking).  The
point is: C++ gets genericity from templates, not from inheritance.
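
The contrast can be sketched as follows.  Object, Apple, Orange and the two
helper functions are hypothetical names; std::vector merely stands in for
whatever container the library provides.

```cpp
#include <cstddef>
#include <vector>

class Object { public: virtual ~Object() { } };
class Apple  : public Object { public: int seeds() const { return 5; } };
class Orange : public Object { };

// Dynamically typed style: the container only knows about Object*, so each
// element must be tested at run-time before it can be used as an Apple
int countApples(const std::vector<Object*>& bag)
{
  int n = 0;
  for (std::size_t i = 0; i < bag.size(); ++i)
    if (dynamic_cast<Apple*>(bag[i]) != 0)   // NULL if it's not an Apple
      ++n;
  return n;
}

// Statically typed style: the compiler guarantees every element is an Apple,
// so no cast and no run-time test are needed
int totalSeeds(const std::vector<Apple*>& apples)
{
  int sum = 0;
  for (std::size_t i = 0; i < apples.size(); ++i)
    sum += apples[i]->seeds();
  return sum;
}
```

Trying to push an Orange* into the vector<Apple*> is a compile-time error;
pushing it into the vector<Object*> compiles fine and is only discovered when
the dynamic_cast returns NULL.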

==============================================================================

[32.5] What is the NIHCL? Where can I get it?

NIHCL stands for "National-Institutes-of-Health's-class-library." It can be
acquired via ftp://128.231.128.7/pub/NIHCL/nihcl-3.0.tar.Z

NIHCL (some people pronounce it "N-I-H-C-L," others pronounce it like "nickel")
is a C++ translation of the Smalltalk class library[32.4].  There are some ways
where NIHCL's use of dynamic typing helps (e.g., persistent[34.4] objects).
There are also places where its use of dynamic typing creates tension[27.3]
with the static typing of the C++ language.

==============================================================================

[32.6] Where can I ftp the code that accompanies "Numerical Recipes"?

This software is sold and therefore it would be illegal to provide it on the
net.  However, it's only about $30.

==============================================================================

[32.7] Why is my executable so large?

Many people are surprised by how big executables are, especially if the source
code is trivial.  For example, a simple "hello world" program can generate an
executable that is larger than most people expect (40+K bytes).

One reason executables can be large is that portions of the C++ runtime library
get linked with your program. How much gets linked in depends on how much of
it you are using, and on how the implementer split up the library into pieces.
For example, the <iostream.h> library is quite large, and consists of numerous
classes and virtual[20] functions.  Using any part of it might pull in nearly
all of the <iostream.h> code as a result of the interdependencies.

You might be able to make your program smaller by using a dynamically-linked
version of the library instead of the static version.

You have to consult your compiler manuals or the vendor's technical support for
a more detailed answer.

==============================================================================

[32.8] Where can I get tons and tons of more information on C++ class
       libraries?

The C++ Libraries FAQ is maintained by Nikki Locke cpplibs@trmphrst.demon.co.uk
and is available at http://www.trmphrst.demon.co.uk/cpplibs1.html

==============================================================================

SECTION [33]: Compiler dependencies


[33.1] How do I display text in the status bar using MFC? [NEW!]

[Recently created with the help of Paul Ganney (on 11/96).]

Use the following code snippet:

    CString s = "Text";
    CStatusBar* p =
     (CStatusBar*)AfxGetApp()->m_pMainWnd->GetDescendantWindow(AFX_IDW_STATUS_BAR);
    p->SetPaneText(1, s);

This works with MFC v.1.00 which hopefully means it will work with other
versions as well.

==============================================================================

[33.2] How can I decompile an executable program back into C++ source code?

You gotta be kidding, right?

Here are a few of the many reasons this is not even remotely feasible:
 * What makes you think the program was written in C++ to begin with?
 * Even if you are sure it was originally written (at least partially) in C++,
   which one of the gazillion C++ compilers produced it?
 * Even if you know the compiler, which particular version of the compiler was
   used?
 * Even if you know the compiler's manufacturer and version number, what
   compile-time options were used?
 * Even if you know the compiler's manufacturer and version number and
   compile-time options, what third party libraries were linked-in, and what
   was their version?
 * Even if you know all that stuff, most executables have had their debugging
   information stripped out, so the resulting decompiled code will be totally
   unreadable.
 * Even if you know everything about the compiler, manufacturer, version
   number, compile-time options, third party libraries, and debugging
   information, writing a decompiler that works with even one particular
   compiler and has even a modest success rate at generating code would be a
   monumental effort -- on a par with writing the compiler itself from
   scratch.

But the biggest question is not how you can decompile someone's code, but why
do you want to do this? If you're trying to reverse-engineer someone else's
code, shame on you; go find honest work.  If you're trying to recover from
losing your own source, the best suggestion I have is to make better backups
next time.

==============================================================================

[33.3] Where can I get information about the C++ compiler from {Borland, IBM,
       Microsoft, Symantec, Sun, etc.}? [UPDATED!]

[Recently added URL for Metrowerks and Watcom compilers (on 10/96).]

In alphabetical order by vendor name:
 * Borland C++ 5.0 FAQs: http://www.mdex.net/~kentr/bc50faq.htm
 * IBM VisualAge C++: http://www.software.ibm.com/ad/cset/
 * Metrowerks C++: http://metrowerks.com or http://www.metrowerks.com
 * Microsoft Visual C++: a tutorial [please let me know about other links;
   thanks; (cline@parashift.com)]
 * Silicon Graphics C++:
   http://www.sgi.com/Products/DevMagic/products/cplusplus.html
 * Sun C++: [please let me know; thanks; (cline@parashift.com)]
 * Symantec C++: http://www.symantec.com/lit/dev/
 * Watcom C++: http://www.powersoft.com/products/languages/watccpl.html
 * [for those I forgot, please let me know; thanks; (cline@parashift.com)]

==============================================================================

[33.4] How do compilers use "over-allocation" to remember the number of
       elements in an allocated array?

Recall that when you delete[] an array, the runtime system magically knows how
many destructors to run[16.13].  This FAQ describes a technique used by some
C++ compilers to do this (the other common technique is to use an associative
array[33.5]).

If the compiler uses the "over-allocation" technique, the code for
p = new Fred[n] looks something like the following.  Note that WORDSIZE is an
imaginary machine-dependent constant that is at least sizeof(size_t), possibly
rounded up for any alignment constraints.  On many machines, this constant will
have a value of 4 or 8.  It is not a real C++ identifier that will be defined
for your compiler.

    // Original code: Fred* p = new Fred[n];
    char* tmp = (char*) operator new[] (WORDSIZE + n * sizeof(Fred));
    Fred* p = (Fred*) (tmp + WORDSIZE);
    *(size_t*)tmp = n;
    size_t i;
    try {
      for (i = 0; i < n; ++i)
        new(p + i) Fred();           // Placement new[11.10]
    } catch (...) {
      while (i-- != 0)
        (p + i)->~Fred();            // Explicit call to the destructor[11.10]
      operator delete[] ((char*)p - WORDSIZE);
      throw;
    }

Then the delete[] p statement becomes:

    // Original code: delete[] p;
    size_t n = * (size_t*) ((char*)p - WORDSIZE);
    while (n-- != 0)
      (p + n)->~Fred();
    operator delete[] ((char*)p - WORDSIZE);

Note that the address passed to operator delete[] is not the same as p.

Compared to the associative array technique[33.5], this technique is faster,
but more sensitive to the problem of programmers saying delete p rather than
delete[] p.  For example, if you make a programming error by saying delete p
where you should have said delete[] p, the address that is passed to
operator delete(void*) is not the address of any valid heap allocation.  This
will probably corrupt the heap.  Bang! You're dead!
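
The same idea can be tried out by hand.  The sketch below wraps the two code
fragments above in ordinary functions; the names overAllocNew(),
overAllocDelete() and storedCount(), the instance counter in Fred, and the
choice of alignof(std::max_align_t) for WORDSIZE are all assumptions made so
the example compiles on its own.  It imitates what a compiler might generate;
it is not how any particular compiler is known to do it.

```cpp
#include <cstddef>
#include <new>

struct Fred {
  static int live;                    // number of Fred objects currently alive
  Fred()  { ++live; }
  ~Fred() { --live; }
};
int Fred::live = 0;

// Stand-in for the imaginary WORDSIZE: big enough for a size_t and a
// multiple of the strictest fundamental alignment
const std::size_t WORDSIZE = alignof(std::max_align_t);
static_assert(WORDSIZE >= sizeof(std::size_t), "header must hold a size_t");

// Imitates: Fred* p = new Fred[n];
Fred* overAllocNew(std::size_t n)
{
  char* tmp = static_cast<char*>(operator new[] (WORDSIZE + n * sizeof(Fred)));
  Fred* p = reinterpret_cast<Fred*>(tmp + WORDSIZE);
  new(tmp) std::size_t(n);            // remember n just in front of the array
  std::size_t i = 0;
  try {
    for (; i < n; ++i)
      new(p + i) Fred();              // placement new
  } catch (...) {
    while (i-- != 0)
      (p + i)->~Fred();               // undo the ones already constructed
    operator delete[] (tmp);
    throw;
  }
  return p;
}

// Reads back the element count hidden in front of the array
std::size_t storedCount(const Fred* p)
{
  return *reinterpret_cast<const std::size_t*>(
           reinterpret_cast<const char*>(p) - WORDSIZE);
}

// Imitates: delete[] p;
void overAllocDelete(Fred* p)
{
  if (p == 0) return;
  std::size_t n = storedCount(p);
  while (n-- != 0)
    (p + n)->~Fred();                 // destruct in reverse order
  operator delete[] (reinterpret_cast<char*>(p) - WORDSIZE);
}
```

Note how the pointer handed back to the caller is WORDSIZE bytes past the
start of the real allocation, which is why a plain delete p on it is so
dangerous.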

==============================================================================

[33.5] How do compilers use an "associative array" to remember the number of
       elements in an allocated array?

Recall that when you delete[] an array, the runtime system magically knows how
many destructors to run[16.13].  This FAQ describes a technique used by some
C++ compilers to do this (the other common technique is to
over-allocate[33.4]).

If the compiler uses the associative array technique, the code for
p = new Fred[n] looks something like this (where arrayLengthAssociation is the
imaginary name of a hidden, global associative array that maps from void* to
"size_t"):

    // Original code: Fred* p = new Fred[n];
    Fred* p = (Fred*) operator new[] (n * sizeof(Fred));
    size_t i;
    try {
      for (i = 0; i < n; ++i)
        new(p + i) Fred();           // Placement new[11.10]
    } catch (...) {
      while (i-- != 0)
        (p + i)->~Fred();            // Explicit call to the destructor[11.10]
      operator delete[] (p);
      throw;
    }
    arrayLengthAssociation.insert(p, n);

Then the delete[] p statement becomes:

    // Original code: delete[] p;
    size_t n = arrayLengthAssociation.lookup(p);
    while (n-- != 0)
      (p + n)->~Fred();
    operator delete[] (p);

Cfront uses this technique (it uses an AVL tree to implement the associative
array).

Compared to the over-allocation technique[33.4], the associative array
technique is slower, but less sensitive to the problem of programmers saying
delete p rather than delete[] p.  For example, if you make a programming error
by saying delete p where you should have said delete[] p, only the first Fred
in the array gets destructed, but the heap may survive (unless you've replaced
operator delete[] with something that doesn't simply call operator delete, or
unless the destructors for the other Fred objects were necessary).
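
This technique can also be imitated by hand.  In the sketch below a std::map
plays the role of the hidden associative array (Cfront's AVL tree is not
reproduced), and the function names assocNew() and assocDelete() and the
instance counter in Fred are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <new>

struct Fred {
  static int live;                    // number of Fred objects currently alive
  Fred()  { ++live; }
  ~Fred() { --live; }
};
int Fred::live = 0;

// Stand-in for the hidden, global associative array from void* to size_t
static std::map<void*, std::size_t> arrayLengthAssociation;

// Imitates: Fred* p = new Fred[n];
Fred* assocNew(std::size_t n)
{
  Fred* p = static_cast<Fred*>(operator new[] (n * sizeof(Fred)));
  std::size_t i = 0;
  try {
    for (; i < n; ++i)
      new(p + i) Fred();              // placement new
    arrayLengthAssociation[p] = n;    // inserting into the map may throw too
  } catch (...) {
    while (i-- != 0)
      (p + i)->~Fred();               // undo the ones already constructed
    operator delete[] (p);
    throw;
  }
  return p;
}

// Imitates: delete[] p;
void assocDelete(Fred* p)
{
  if (p == 0) return;
  std::map<void*, std::size_t>::iterator it = arrayLengthAssociation.find(p);
  assert(it != arrayLengthAssociation.end());   // p must have come from assocNew
  std::size_t n = it->second;
  arrayLengthAssociation.erase(it);
  while (n-- != 0)
    (p + n)->~Fred();                 // destruct in reverse order
  operator delete[] (p);
}
```

Here the pointer given to the caller is the very pointer returned by
operator new[], at the price of a map lookup on every array allocation and
deallocation.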

==============================================================================

[33.6] If name mangling was standardized, could I link code compiled with
       compilers from different compiler vendors? [UPDATED!]

[Recently reworded and added v-table and v-pointer references[20.3] (on
11/96).]

Short answer: Probably not.

In other words, some people would like to see name mangling standards
incorporated into the proposed C++ ANSI standards in an attempt to avoid
having to purchase different versions of class libraries for different compiler
vendors.  However name mangling differences are one of the smallest differences
between implementations, even on the same platform.

Here is a partial list of other differences:
 * Number and type of hidden arguments to member functions.
   - is this handled specially?
   - where is the return-by-value pointer passed?
 * Assuming a v-table[20.3] is used:
   - what are its contents and layout?
   - where/how is the adjustment to this made for multiple and/or virtual
     inheritance?
 * How are classes laid out, including:
   - location of base classes?
   - handling of virtual base classes?
   - location of v-pointers[20.3], if they are used at all?
 * Calling convention for functions, including:
   - where are the actual parameters placed?
   - in what order are the actual parameters passed?
   - how are registers saved?
   - where does the return value go?
   - does caller or callee pop the stack after the call?
   - special rules for passing or returning structs or doubles?
   - special rules for saving registers when calling leaf functions?
 * How is the run-time-type-identification laid out?
 * How does the runtime exception handling system know which local objects need
   to be destructed during an exception throw?

==============================================================================

[33.7] GNU C++ (g++) produces big executables for tiny programs; Why?

libg++ (the library used by g++) was probably compiled with debug info (-g).
On some machines, recompiling libg++ without debugging can save lots of disk
space (approximately 1 MB; the down-side: you'll be unable to trace into libg++
calls).  Merely running strip on the executable doesn't reclaim as much as
recompiling without -g and then running strip on the resultant a.out's.

Use size a.out to see how big the program code and data segments really are,
rather than ls -s a.out which includes the symbol table.

==============================================================================

[33.8] Is there a yacc-able C++ grammar?

There used to be a yacc grammar that was pretty close to C++.  As far as I am
aware, it has not kept up with the evolving C++ standard.  For example, the
grammar doesn't handle templates, exceptions, or
run-time-type-identification, and it deviates from the rest of the language in
some subtle ways.

It is available at http://srawgw.sra.co.jp/.a/pub/cmd/c++grammar2.0.tar.gz

==============================================================================

[33.9] What is C++ 1.2? 2.0? 2.1? 3.0?

These are not versions of the language, but rather versions of cfront, which
was the original C++ translator implemented by AT&T.  It has become generally
accepted to use these version numbers as if they were versions of the language
itself.

Very roughly speaking, these are the major features:
 * 2.0 includes multiple/virtual inheritance and pure virtual[22.4] functions
 * 2.1 includes semi-nested classes and delete[] pointerToArray
 * 3.0 includes fully-nested classes, templates and i++ vs. ++i
 * 4.0 will include exceptions

==============================================================================

-- 
Marshall Cline, Ph.D., President, Paradigm Shift, Inc.
315-353-6100 (voice)
315-353-6110 (fax)
mailto:cline@parashift.com
