Thursday, May 29, 2008

C++ puzzles #3

Beware of Using setjmp and longjmp

Using a longjmp() in C++ is dangerous because it jumps out of a function, without unwinding the stack first. Consequently, destructors of local objects are not invoked. As a rule, long jumps should be used in pure C exclusively. In C++ code, you should use standard exception handling instead.
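The difference is easy to observe with exceptions. The following is a minimal sketch, not a definitive recipe; Guard, risky() and cleanups_after_throw() are hypothetical names made up for this illustration. It shows that a local object's destructor runs while the exception propagates, which is exactly the step a longjmp() would skip.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource wrapper: its destructor records that cleanup ran.
struct Guard {
    int &cleanups;
    explicit Guard(int &counter) : cleanups(counter) {}
    ~Guard() { ++cleanups; }
};

// Fails by throwing; stack unwinding destroys 'g' on the way out.
void risky(int &cleanups) {
    Guard g(cleanups);
    throw std::runtime_error("operation failed");
}

// Returns how many Guard destructors ran by the time the handler finished.
int cleanups_after_throw() {
    int cleanups = 0;
    try {
        risky(cleanups);
    } catch (const std::runtime_error &) {
        // When control arrives here, g has already been destroyed.
    }
    return cleanups;
}

// Usage: the single local Guard was cleaned up exactly once.
void demo_unwinding() {
    assert(cleanups_after_throw() == 1);
}
```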

Initializing const static Data Members

The C++ Standard now allows initialization of const static data members of an integral type inside their class.

 
#include <string>
class Buff
{
private:
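  static const int MAX = 512; //declaration with an in-class initializer
  static const char flag = 'a'; //also a declaration with an initializer
  static const std::string msg; //non-integral type; must be initialized outside the class body
public:
//..
};
const std::string Buff::msg = "hello";

The initializer inside the class body gives the data member its value, so the value must not be repeated outside the class. Strictly speaking, the in-class declaration is not a definition: if the program takes the member's address or binds a reference to it, a definition without an initializer (e.g., const int Buff::MAX;) is still required outside the class. Const static data members of non-integral types, by contrast, have to be both defined and initialized outside the class body.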

std::string and Reference Counting

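The Standard's specification of class std::string is formulated to allow a reference counted implementation. However, this is not a requirement. (The later C++11 revision tightened the iterator and reference invalidation rules in a way that effectively rules out reference counting.) A reference counted implementation must have the same semantics as a non-reference counted one. For example: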

 
string str1("xyz");
string::iterator i = str1.begin();
string str2 = str1;
*i = 'w';  //must modify only  str1

Correct syntax for automatic object instantiation

When you instantiate an automatic object (e.g., on the stack) using its default constructor, mind that the correct syntax is:

 String str;              //correct 

And not this:

 String str(); //entirely different meaning 

Can you see why? The second statement is parsed as a declaration: ‘str’ is a function that takes no arguments and returns a String object.

When dynamic_cast<> Fails

The dynamic_cast<> operator converts a pointer or a reference to an object into a pointer or a reference to a related (derived or base) type at run-time. When it fails to convert a pointer to the target pointer type, it returns NULL:

 
Date date; //assumes Date is a polymorphic class (it has at least one virtual function)
 
string * p = dynamic_cast<string *> (&date); //string and Date are not related; cast fails
if (p) { 
//...
}
else { //failure
//...
}

When it fails to convert an object reference to a reference of an object of the desired type, it throws an exception of type std::bad_cast:

 
try {
string & s = dynamic_cast<string &> (date); //will surely fail and throw an exception
}
 
catch (const std::bad_cast &)
{
 
}

So make sure to catch std::bad_cast whenever you cast to a reference, and always check the returned value when you cast to a pointer.
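Both failure modes can be seen side by side in a small program. This is only a sketch: the Shape, Circle and Square classes are hypothetical, chosen so that the source type is polymorphic, which dynamic_cast<> requires for this kind of cast.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic hierarchy used only for this illustration.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// Pointer form: a failed cast yields a null pointer.
bool pointer_cast_succeeds(Shape *s) {
    return dynamic_cast<Circle *>(s) != 0;
}

// Reference form: a failed cast throws std::bad_cast.
bool reference_cast_throws(Shape &s) {
    try {
        Circle &c = dynamic_cast<Circle &>(s);
        (void)c;  // the cast succeeded
        return false;
    } catch (const std::bad_cast &) {
        return true;
    }
}

// Usage: a Square is not a Circle, so both forms report failure for it.
bool casts_behave_as_described() {
    Square sq;
    Circle c;
    return !pointer_cast_succeeds(&sq) && pointer_cast_succeeds(&c)
        && reference_cast_throws(sq) && !reference_cast_throws(c);
}

void demo_dynamic_cast() {
    assert(casts_behave_as_described());
}
```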

Global objects construction and destruction

In object-oriented programming (OOP), global objects are considered harmful. In C++, there's even more reason to think twice before using them: a global object's construction occurs before the program's outset (that is, before main() begins), so an exception thrown from its constructor can never be caught by the program. The same is true of an exception thrown from a global object's destructor: the destruction of a global object takes place conceptually after the program's termination, so this exception can never be caught either.

Retrieving actual object type in run-time

C++ supports Run-time Type Identification (RTTI), which enables detection of actual object type during program execution using the built-in typeid() operator. Here is an example:

 
//file: classes.h
class base {
public:
  base();
  virtual ~base();
};
 
class derived : public base {
public:
  derived();
  virtual ~derived();
};
 
//file: RTTIdemo.cpp
#include <typeinfo>
#include <iostream>
#include "classes.h"
 
using namespace std;
 
void identify(base & b) { //a demonstration of RTTI
  if (typeid(b) == typeid(base))
    cout << "base object received" << endl;
  else //an object derived from base
    cout << typeid(b).name() << endl;
}

When using typeid(), and RTTI in general, please keep in mind the following:

  1. typeid() takes either an object or a type name as its argument and returns a reference to a const object of type std::type_info, which describes the type.
  2. For typeid() to report an object's actual (dynamic) type, the object's class must have at least one virtual member function.
  3. Compared to static (i.e., compile-time) type information, RTTI incurs a performance penalty, so take that into account when performance matters.
  4. In most cases, the use of typeid() is not required and not even recommended. Note that a user-declared virtual member function, say name(), could have achieved the same effect more easily.
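The alternative mentioned in the last item might look like the sketch below. The classes and the identify() helper are hypothetical stand-ins, not the base and derived classes from classes.h; the point is only that an ordinary virtual call reports the object's type without consulting typeid().

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy: each class reports its own name.
struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    virtual std::string name() const { return "Derived"; }
};

// Virtual dispatch selects the right name(); no typeid() is involved.
std::string identify(const Base &b) {
    return b.name();
}

// Usage: passes a Derived through a Base reference.
std::string identify_derived() {
    Derived d;
    return identify(d);
}

void demo_virtual_name() {
    assert(identify(Base()) == "Base");
    assert(identify_derived() == "Derived");
}
```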

Use reinterpret_cast<> operator for unsafe, non-portable casts

If your code contains unsafe type casts, such as casting an int to a pointer (which is not considered a standard conversion as opposed to casting an int to a float) or converting a pointer-to-function into void *, you should use the reinterpret_cast<> operator:

 
void *p = (void *) 0x00ff; //C-style cast; should not be used in C++ code
 

Operator reinterpret_cast<> has the following form:

 
reinterpret_cast<to> (from)

For example:

 
void *p = reinterpret_cast<void *> (0x00ff); //correct form

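You should choose the C++ cast operators over C-style casts for the following reasons:

  1. A C-style cast can silently act as a static_cast<>, a const_cast<>, a reinterpret_cast<>, or a combination of them, so it's easy to request a more dangerous conversion than intended, and such casts are hard to locate in source code.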
  2. When using reinterpret_cast<> for unsafe casts (for example, casts of one type to another non-related type), you make your intention more explicit to both the human reader and the compiler.
  3. When porting software to different platforms, the reinterpret_cast<> statements will most likely have to be modified, so marking them in advance will make the migration task easier.
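As a small illustration of how explicit such a cast looks in practice, the sketch below converts a pointer to an integer and back. It assumes the implementation provides std::uintptr_t from <cstdint> (C++11 or later, and formally optional); the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Converts a pointer to an integer and back again.
// Assumption: std::uintptr_t exists and is wide enough to hold a pointer.
bool round_trip_preserves_pointer() {
    int value = 42;
    int *original = &value;

    // Each unsafe step is spelled out and easy to search for.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(original);
    int *restored = reinterpret_cast<int *>(bits);

    return restored == original && *restored == 42;
}

void demo_reinterpret_cast() {
    assert(round_trip_preserves_pointer());
}
```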

When an explicit creation of a temporary object is useful

If you need an object serving only as a function argument, you can use an explicit constructor call to instantiate a nameless object rather than a named one:

 
void putPixel(const point& location);
int main()
{
  putPixel(point(100, 100)); //create a temporary point object as an argument
}

In the example above, a temporary object is preferred since it's automatically destroyed right after its use and there is no danger that it will be mistakenly used elsewhere in the program. One more benefit is a potential optimization: the compiler can suppress the construction and destruction of the temporary object by inline substitution.

Assert() is Dangerous

The standard assert() macro tests its argument. If the result evaluates to 0, the standard abort() routine is called. Back in C's heyday, it was a very useful debugging tool but it's less so in C++. However, if you are using it in C++ code, please mind the following:

1. Since the assert() macro is active only when the NDEBUG symbol is not defined (otherwise assert() collapses to nothing), it should never be used to test runtime errors such as failed connection, file not found, invalid input and the like.

2. When an assertion fails, abort() is called. This is dangerous, since destructors of local objects are not invoked in this case. Throwing an exception, on the other hand, ensures that local objects' destructors are appropriately invoked.
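For a runtime error such as invalid input, an exception does the job that assert() cannot. The following is a hedged sketch; parse_digit() and rejects() are hypothetical functions written for this example. The check stays in place in release builds, and the caller gets a chance to recover.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical parser: invalid input is a runtime error, so it is reported
// with an exception, which is still thrown when NDEBUG is defined.
int parse_digit(const std::string &text) {
    if (text.size() != 1 || text[0] < '0' || text[0] > '9')
        throw std::invalid_argument("expected a single digit: " + text);
    return text[0] - '0';
}

// Returns true if parse_digit() refuses the given text.
bool rejects(const std::string &text) {
    try {
        parse_digit(text);
        return false;
    } catch (const std::invalid_argument &) {
        return true;  // the caller can recover instead of aborting
    }
}

// assert() is used here only to state what this demo expects.
void demo_runtime_check() {
    assert(parse_digit("7") == 7);
    assert(rejects("x"));
    assert(rejects("12"));
}
```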

An Object Size May Never Be Zero

An empty class doesn't have any data members or member functions. You'd think that the size of an object of such a class would be zero. However, the Standard states that a complete object shall never have a size of zero. For example:

 
class Empty {};
Empty  e; // e occupies at least 1 byte of memory

There is a good reason for this restriction. If an object were allowed to occupy 0 bytes of memory, its address could overlap with the address of a different object. The most obvious case would be an array of empty objects. To avoid this, an object occupies at least one byte of memory, which guarantees that it also has a distinct memory address.
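The array case can be checked directly. This is a minimal sketch with helper names invented for the illustration; the exact size of Empty is implementation-defined, so the code only relies on it being nonzero.

```cpp
#include <cassert>
#include <cstddef>

class Empty {};

// The size is implementation-defined but never zero.
std::size_t empty_size() {
    return sizeof(Empty);
}

// Two elements of the same array must have different addresses.
bool array_elements_are_distinct() {
    Empty items[2];
    return &items[0] != &items[1];
}

void demo_empty_size() {
    assert(empty_size() >= 1);
    assert(array_elements_are_distinct());
}
```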

The Default Value Returned From main()

In this example, main() is missing an explicit return statement.

 
  #include <cstdio>

  int main()
  {
      printf("hello world");
  }

In C (before the C99 standard), when control reaches the end of main() without encountering a return statement, it returns an undefined value to the environment. In C++, however, main() implicitly executes a return 0; statement in this case; C99 adopted the same rule.

What's in a Signature?

A function's signature consists of the function's parameter list, that is, the types of its parameters and their order. A member function's signature also includes the const/volatile qualifiers (if any) of that function. Signatures provide the information needed to perform overload resolution.
A function's return type is not considered part of its signature, so it does not participate in overload resolution, and two functions can't be overloaded on their return type alone.
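Here is a short sketch of these rules; the Buffer class and the two helper functions are hypothetical. The two describe() members differ only in their const qualifier, which is enough to make them distinct overloads, while the commented-out declaration shows what the compiler would reject.

```cpp
#include <cassert>
#include <string>

class Buffer {
public:
    // Same name and parameter list; the const qualifier alone
    // gives the two members different signatures.
    std::string describe() { return "non-const"; }
    std::string describe() const { return "const"; }

    // int describe() const;  // error: differs only in its return type
};

// Called through a modifiable object: the non-const overload is chosen.
std::string call_on_mutable() {
    Buffer b;
    return b.describe();
}

// Called through a const reference: the const overload is chosen.
std::string call_on_const() {
    Buffer b;
    const Buffer &view = b;
    return view.describe();
}

void demo_signatures() {
    assert(call_on_mutable() == "non-const");
    assert(call_on_const() == "const");
}
```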

The this Pointer

In the body of a non-static member function, the keyword this is a non-lvalue expression whose value is the address of the object for which the function is called. However, in pre-standard C++, this was actually a pointer. Although the difference between a real pointer and a "non-lvalue expression whose value is the address of the object" may seem like hair-splitting, the distinction between the two is important because the new definition guarantees that programmers can't change the value of this because it's not an l-value. In other words, you can think of this as a function returning the address of its object, rather than the pointer itself. Assigning to this was a valid programming practice in the preliminary stages of C++; nowadays, it's both illegal and impossible due to the new definition.

The Difference Between Implicit and Explicit Copy Constructors


An implicit copy constructor of a derived class (one generated by the compiler) always calls the copy constructors of the base classes. On the other hand, an explicit copy constructor of the derived class (one defined by the programmer) calls the default constructors of the base classes, unless its initialization list explicitly invokes the base classes' copy constructors.
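The difference can be measured by counting constructor calls. In the sketch below, every class name and the counting helper are hypothetical, invented for this illustration. Note that some compilers warn about the ForgetfulCopy constructor at high warning levels, which is precisely the pitfall being shown.

```cpp
#include <cassert>

// Hypothetical base class that counts how it gets constructed.
struct Base {
    static int default_calls;
    static int copy_calls;
    Base() { ++default_calls; }
    Base(const Base &) { ++copy_calls; }
};
int Base::default_calls = 0;
int Base::copy_calls = 0;

// No user-declared copy constructor: the implicit one copies Base.
struct ImplicitCopy : Base {};

// User-defined copy constructor that doesn't mention Base:
// Base's DEFAULT constructor runs, so the base part isn't copied.
struct ForgetfulCopy : Base {
    ForgetfulCopy() {}
    ForgetfulCopy(const ForgetfulCopy &) {}
};

// User-defined copy constructor that copies Base explicitly.
struct CarefulCopy : Base {
    CarefulCopy() {}
    CarefulCopy(const CarefulCopy &other) : Base(other) {}
};

// Returns the number of Base copy-constructor calls made while copying a T.
template <typename T>
int base_copies_when_copying() {
    T original;
    Base::copy_calls = 0;
    T duplicate(original);
    (void)duplicate;
    return Base::copy_calls;
}

void demo_copy_constructors() {
    assert(base_copies_when_copying<ImplicitCopy>() == 1);
    assert(base_copies_when_copying<ForgetfulCopy>() == 0);
    assert(base_copies_when_copying<CarefulCopy>() == 1);
}
```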

DLLs and Dynamic Memory Allocation


Dynamic linking -- either in the form of shared libraries in Unix or Windows DLLs -- is not defined by standard C++. However, this is a widely used feature among C++ programmers. One of the peculiarities of dynamic linking is memory management. Unlike an ordinary program that has a single heap, a DLL-based application may have several distinct heaps: one for the main program and one for each DLL. The virtual memory system maps the DLLs' address space into the main application's address space. However, you should remember to delete dynamically allocated objects in the same executable section that created them. Thus, an object that was allocated inside a DLL must be destroyed within that DLL. Trying to destroy it in the main application is likely to raise a runtime exception. The most reasonable solution is to define allocation and deallocation functions in the DLL that create and destroy the requested object within the DLL's address space.
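The paired allocation and deallocation functions might be sketched as follows. Everything here is an assumption made for illustration: Widget, create_widget(), destroy_widget() and live_widget_count() are hypothetical names, both halves are shown in one file for brevity, and the platform-specific export declarations a real DLL or shared library needs are omitted.

```cpp
#include <cassert>

// ---- This part would live inside the DLL / shared library ----

struct Widget { int id; };  // hypothetical object owned by the library

static int live_widgets = 0;  // the library's own count of live objects

// Allocates on the library's heap.
extern "C" Widget *create_widget(int id) {
    Widget *w = new Widget();
    w->id = id;
    ++live_widgets;
    return w;
}

// Releases on the same heap that create_widget() used.
extern "C" void destroy_widget(Widget *w) {
    if (w) --live_widgets;
    delete w;
}

extern "C" int live_widget_count() {
    return live_widgets;
}

// ---- This part would live in the main application ----

int use_widget() {
    Widget *w = create_widget(7);
    int id = w->id;
    destroy_widget(w);  // hand the object back instead of calling delete here
    return id;
}

void demo_library_owned_memory() {
    assert(use_widget() == 7);
    assert(live_widget_count() == 0);
}
```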

How Do 'New' and 'Delete' Actually Work?


The C++ runtime library includes a heap manager (subroutines responsible for maintaining the heap). The heap manager does bookkeeping to keep track of which parts of the heap have been loaned out for usage (allocated), and which parts are still free (deallocated).

Every time a call to 'new' is made, the heap manager does the following things:

(1) It searches the free part of the heap for a chunk of memory that is big enough.

(2) It makes a record of the fact that the chunk of memory is now allocated.

(3) It returns the starting address of the allocated chunk as a result.

When you call delete, the heap manager updates its records to note that the block is free.
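The bookkeeping can be imitated with a class-specific operator new and operator delete. This is only a sketch and not how any particular runtime library is implemented: the Tracked class is hypothetical, the real search for a free chunk is delegated to std::malloc(), and a simple counter stands in for the heap manager's records.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class whose allocations are tracked by a counter.
struct Tracked {
    static int live_blocks;  // stands in for the heap manager's records
    int payload;

    static void *operator new(std::size_t size) {
        void *block = std::malloc(size);     // (1) find a big-enough chunk
        if (!block) throw std::bad_alloc();
        ++live_blocks;                       // (2) record that it's allocated
        return block;                        // (3) return its starting address
    }

    static void operator delete(void *block) {
        if (!block) return;
        --live_blocks;                       // note that the block is free again
        std::free(block);
    }
};
int Tracked::live_blocks = 0;

// Returns true if the record grows by one on new and shrinks back on delete.
bool bookkeeping_balances() {
    int before = Tracked::live_blocks;
    Tracked *t = new Tracked();
    int during = Tracked::live_blocks;
    delete t;
    int after = Tracked::live_blocks;
    return during == before + 1 && after == before;
}

void demo_new_delete() {
    assert(bookkeeping_balances());
}
```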

The std::unexpected() Function


The Standard Library defines a function called std::unexpected(), which is invoked when a function throws an exception not listed in its exception specification. std::unexpected() invokes a user-defined function that was registered by std::set_unexpected(). If the user hasn't registered such a function, unexpected() will invoke std::terminate(), which in turn calls abort() by default and terminates the program. All these functions are declared in the standard header <exception>. Note that exception specifications of this kind were deprecated in C++11 and removed, together with std::unexpected(), in C++17.

Detecting the vptr's Location


The precise location of the vptr (the pointer to the class's table of virtual functions' addresses) is implementation-dependent. Some compilers, e.g., Visual C++ and C++ Builder, place it at offset 0, before the user-declared data members. Other compilers, such as older versions of GCC and DEC CXX, place the vptr at the end of the class, after all the user-declared data members. Normally, you wouldn't care about the vptr's position. However, under certain conditions, for example, in applications that dump an object's content to a file so that it can be retrieved afterwards, the vptr's position matters. To detect it, first take the address of an object of a class that has a virtual member function. Then compare it to the address of the first data member of that object. If the two addresses are identical, it's likely that the vptr is located at the end. If, however, the member's address is higher than the object's address, this means that the vptr is located at the object's beginning. To detect where your compiler places the vptr, run the following program:

 
#include <iostream>
using namespace std;

class A
{
public:
 virtual void f() {}
 int n;
};
 
int main()
{
 A a;
 char *p1=reinterpret_cast <char *> (&a);
 char *p2=reinterpret_cast <char *> (&a.n);
 if (p1==p2)
  cout<<"vptr is located at the object's end"<<endl;
 else
  cout<<"vptr is located at the object's beginning"<<endl;
}

Setting a New Handler


The function std::set_new_handler() installs a function to be called when the global operator new or the global operator new[] fails. By default, operator new throws an exception of type std::bad_alloc in the event of a failure (note that some older versions of Visual C++ retain the traditional version of new, which returns a null pointer instead of throwing an exception in this case). set_new_handler() and its associated typedefs are declared in the header <new> as follows:

 
typedef void (*new_handler) ();
new_handler set_new_handler( new_handler new_p );


set_new_handler() installs new_p as the new handler and returns the address of the previously installed handler.

The following program overrides the default behavior of the global operators new and new[]. Instead of throwing an std::bad_alloc exception, it invokes the user-defined function my_handler():

 
#include <new>
#include <iostream>
using namespace std;
class my_exception{};
void my_handler()
{ 
 cerr << "allocation failure!" << endl;
 throw my_exception();
}
 
int main()
{
 set_new_handler(my_handler);
 try
 {
  int *p = new int[2000000000]; // will probably fail 
  delete[] p;//we get here only if the allocation succeeded
 }
 catch (my_exception & e)
 {
 //..deal with the exception
 }
}

 
