Automatic Storage

One of the things I like about C++ is the ability to have the compiler create code for me that does actual work.

What do I mean? I am thinking about implicit conversions (wrapping) of data types and constructing/destructing data types when they go in/out of scope.

In this blog post I will focus on the latter, show how it can be used with Objective-C, and show how it can track leaks in C++ code.

RAII

A popular idiom in C++ is resource acquisition is initialization (RAII). For example, if we find ourselves using pthread_mutex_t a lot, we may want to create this C++ type:

struct mutex_t
{
    mutex_t ()
    {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        pthread_mutex_init(&mutex, &attr);
        pthread_mutexattr_destroy(&attr);
    }

    ~mutex_t ()
    {
        pthread_mutex_destroy(&mutex);
    }

    operator pthread_mutex_t* ()
    {
        return &mutex;
    }

private:
    pthread_mutex_t mutex;
};

Whenever we need a mutex we then declare it like this:

mutex_t mutex;

Should we need to lock/unlock it, we can do:

pthread_mutex_lock(mutex);
⋮
pthread_mutex_unlock(mutex);

Notice that because our custom data type implements operator pthread_mutex_t* it can be given directly to the pthread calls which expect pthread_mutex_t*.

The obvious benefit of having this data type is of course that we don’t need to write five lines of code each time we need a recursive pthread_mutex_t, but another benefit is that we now leave cleanup entirely to the compiler.

The advantage is more apparent when we create types that use other types. For example, imagine we have the following:

struct my_type_t
{
    my_type_t ();
    ~my_type_t ();
    ⋮

private:
    thread_t                thread;
    mutex_t                 mutex;
    std::vector<packet_t>   packets;
    ⋮
};

When constructing an instance of my_type_t, the mutex is automatically set up, and when my_type_t is destroyed, so are the mutex, the thread, the vector containing packets, all the packets the vector contained, etc.

A cool addition here is that this construction/destruction when going in/out of scope works with exceptions as well, and it even works for partially constructed types. Imagine we instantiate my_type_t: the compiler generates code to first construct each member of the type, in declaration order, and then run the constructor of the type itself. If any of the constructors throws an exception, only those members which were already fully constructed are destructed, in reverse order.
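To make the partially constructed case concrete, here is a minimal sketch. All names in it (events, part_t, failing_part_t, whole_t, run_demo) are made up for this illustration; it simply records which constructors and destructors run when the third member throws.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log of what ran, in order.
static std::vector<std::string> events;

struct part_t
{
    explicit part_t (std::string const& n) : name(n) { events.push_back("construct " + name); }
    ~part_t ()                                       { events.push_back("destruct " + name); }
    std::string name;
};

// A member whose constructor always fails.
struct failing_part_t
{
    failing_part_t () { throw std::runtime_error("failed"); }
};

struct whole_t
{
    whole_t () : first("first"), second("second"), broken(), last("last")
    {
        events.push_back("construct whole_t");   // never reached
    }
    ~whole_t () { events.push_back("destruct whole_t"); }   // never runs

    part_t         first;
    part_t         second;
    failing_part_t broken;   // throws during construction
    part_t         last;     // never constructed, so never destructed
};

std::vector<std::string> run_demo ()
{
    events.clear();
    try
    {
        whole_t whole;
    }
    catch(std::runtime_error const&)
    {
        events.push_back("caught");
    }
    return events;
}
```

Calling run_demo() should record the construction of first and second, then their destruction in reverse order, and only then the catch handler. Neither last nor whole_t itself shows up, since they were never fully constructed.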

It’s a very simple system: construct when entering the scope, destruct when going out of scope. Almost all other languages, however, require you to explicitly call a function to create the type, and if the language has garbage collection, you don’t know when the type will be destroyed, so there are limits to what you can have it do.

For example, above we created our own mutex type. We can also create a lock type which locks the mutex when constructed and unlocks it when destroyed. Declaring such a lock as a local variable in a block of code ensures the mutex is locked in that scope, but only in that scope. This can be very useful in code where the scope may have multiple exit points, like the following:

void my_type_t::do_work ()
{
    lock_t lock(mutex); // lock the mutex
    if(should_abort())
        return;         // we return prematurely
    ⋮
}
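The lock_t type itself is not shown above, so here is a minimal sketch of what it could look like. The constructor taking a pthread_mutex_t* is an assumption (a mutex_t would convert to it through its conversion operator), and demo_mutex, do_work, held_inside_scope and is_unlocked are hypothetical helpers. The demo uses a plain, non-recursive mutex so that pthread_mutex_trylock can tell whether it is held.

```cpp
#include <pthread.h>

// Hypothetical lock_t: locks the mutex on construction, unlocks it on destruction.
struct lock_t
{
    explicit lock_t (pthread_mutex_t* m) : mutex(m) { pthread_mutex_lock(mutex); }
    ~lock_t ()                                      { pthread_mutex_unlock(mutex); }

private:
    lock_t (lock_t const&);              // declared but not defined: locks are not copyable
    lock_t& operator= (lock_t const&);

    pthread_mutex_t* mutex;
};

// A plain (non-recursive) mutex used only by this demo.
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

// Takes the lock and may return early; either way the destructor of `lock` unlocks.
bool do_work (bool should_abort)
{
    lock_t lock(&demo_mutex);
    if(should_abort)
        return false;   // early exit, lock released here
    return true;        // normal exit, lock released here
}

// Returns true if demo_mutex is reported busy while a lock_t holds it.
bool held_inside_scope ()
{
    lock_t lock(&demo_mutex);
    return pthread_mutex_trylock(&demo_mutex) != 0;
}

// Returns true if demo_mutex is currently free; leaves its state unchanged.
bool is_unlocked ()
{
    if(pthread_mutex_trylock(&demo_mutex) != 0)
        return false;
    pthread_mutex_unlock(&demo_mutex);
    return true;
}
```

After do_work returns, by either path, is_unlocked() should report the mutex as free again, without any explicit unlock call in do_work.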

Objective-C

How does this fit into Objective-C? Well, it doesn’t. But as you may know, it is possible to give your source files the mm extension to enable C++ in Objective-C. This hybrid is referred to as Objective-C++.

You can say what you want, but C++ has its merits even in Objective-C. For example, suppose we create a custom view that has to manage tracking rectangles.

When you add such a tracking rectangle you get back an NSTrackingRectTag, which is a primitive type (i.e. you can’t store it directly in an NSMutableArray) that you have to keep somewhere if you plan to ever remove that tracking rectangle.

So a convenient way is to make a std::vector<NSTrackingRectTag> part of the class instance data. But can you do that? Well, starting with 10.4 you actually can declare non-POD types as part of your instance data if you add -fobjc-call-cxx-cdtors to the compiler options.

Here is what the GCC manual says about how the option works:

For each Objective-C class, check if any of its instance variables is a C++ object with a non-trivial default constructor. If so, synthesize a special "- (id) .cxx_construct" instance method that will run non-trivial default constructors on any such instance variables, in order, and then return "self". Similarly, check if any instance variable is a C++ object with a non-trivial destructor, and if so, synthesize a special "- (void) .cxx_destruct" method that will run all such default destructors, in reverse order.

The "- (id) .cxx_construct" and/or "- (void) .cxx_destruct" methods thusly generated will only operate on instance variables declared in the current Objective-C class, and not those inherited from superclasses. It is the responsibility of the Objective-C runtime to invoke all such methods in an object’s inheritance hierarchy. The "- (id) .cxx_construct" methods will be invoked by the runtime immediately after a new object instance is allocated; the "- (void) .cxx_destruct" methods will be invoked immediately before the runtime deallocates an object instance.

As of this writing, only the NeXT runtime on Mac OS X 10.4 and later has support for invoking the "- (id) .cxx_construct" and "- (void) .cxx_destruct" methods.

Tracking Leaks

As shown above, we can execute code when a type goes in/out of scope. By adding a dummy member to all our objects/structures, we can have the constructor/destructor of this dummy type called whenever the type it is part of goes in/out of scope.

This is the key to how we can track leaks. If we let the dummy type increase/decrease a counter, we will know that something was leaked if that counter is not zero at exit. Unfortunately, we will not know what exactly leaked.

By giving the dummy type an argument, we can have it use that as the key for which counter to increase/decrease. But then the leak tracking becomes rather intrusive, since all constructors need to pass such an argument to the dummy type. Ideally, this should be at most one line added to each object for which we want to track the leak count (so we can easily #define it to the empty string when compiling with NDEBUG).

Another approach is to make our dummy type a template type and give it a different template argument for each object it is part of. We can achieve this by templating it on the object it is part of.

This effectively creates a new type per object we add it to, and we can then use its type id or similar as key for which counter it should manage.
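For comparison, if RTTI is available, the type id variant could look roughly like the sketch below. The names rtti_count, rtti_dummy_t, widget_t, gadget_t and live_instances are hypothetical. Note that typeid(T).name() returns an implementation-defined string (often a mangled name), so it works as a map key but is not necessarily pleasant to print.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <typeinfo>

// Hypothetical counter map, keyed by the implementation-defined type name.
static std::map<std::string, std::size_t> rtti_count;

template <typename T>
struct rtti_dummy_t
{
    rtti_dummy_t ()                    { ++rtti_count[typeid(T).name()]; }
    rtti_dummy_t (rtti_dummy_t const&) { ++rtti_count[typeid(T).name()]; }  // a copy is a new instance
    ~rtti_dummy_t ()                   { --rtti_count[typeid(T).name()]; }
};

// Two example types; each gets its own counter because the template argument differs.
struct widget_t { rtti_dummy_t<widget_t> dummy; };
struct gadget_t { rtti_dummy_t<gadget_t> dummy; };

// Number of currently live instances of T.
template <typename T>
std::size_t live_instances ()
{
    return rtti_count[typeid(T).name()];
}
```

Here the tracked type only needs the one dummy member, at the cost of requiring RTTI.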

Since I don’t use RTTI I don’t have a type id, so I constructed my dummy type like this:

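#include <map>
#include <string>

// Live instance count per type name.
std::map<std::string, size_t> count;

template <typename T>
struct dummy_t
{
    dummy_t ()                   { ++count[T::name()]; }
    dummy_t (dummy_t const& rhs) { ++count[T::name()]; }  // a copy is a new instance too
    ~dummy_t ()                  { --count[T::name()]; }
};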

Now to add tracking for my_type_t we need to add both an instance of this type and a static name member function, e.g.:

struct my_type_t
{
    static std::string name () { return "my_type_t"; }
    dummy_t<my_type_t> dummy;
    ⋮
};

We can wrap this in a macro, e.g.:

#define WATCH_LEAKS(type) dummy_t<type> dummy; \
        static std::string name () { return #type; }

So that it just becomes:

struct my_type_t
{
    WATCH_LEAKS(my_type_t);
    ⋮
};

Of course if you actually do use this, change the names to something which is less likely to clash with actual members, and add a mutex to the instance counts map if you are using multi-threaded code.
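As a rough sketch of the mutex suggestion, the counters could be guarded like this. The names count_mutex, guarded_count, count_lock_t, guarded_dummy_t, job_t and live_count are all hypothetical. The statically initialized pthread mutex is one possible choice, picked here because it needs no constructor to run before the first tracked object is created.

```cpp
#include <pthread.h>
#include <cstddef>
#include <map>
#include <string>

// Statically initialized, so it is usable before any constructor has run.
static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static std::map<std::string, std::size_t> guarded_count;

// Scoped guard for count_mutex: the map is only touched while one of these is alive.
struct count_lock_t
{
    count_lock_t ()  { pthread_mutex_lock(&count_mutex); }
    ~count_lock_t () { pthread_mutex_unlock(&count_mutex); }

private:
    count_lock_t (count_lock_t const&);             // not copyable
    count_lock_t& operator= (count_lock_t const&);
};

template <typename T>
struct guarded_dummy_t
{
    guarded_dummy_t ()                       { count_lock_t lock; ++guarded_count[T::name()]; }
    guarded_dummy_t (guarded_dummy_t const&) { count_lock_t lock; ++guarded_count[T::name()]; }
    ~guarded_dummy_t ()                      { count_lock_t lock; --guarded_count[T::name()]; }
};

// Example tracked type.
struct job_t
{
    static std::string name () { return "job_t"; }
    guarded_dummy_t<job_t> dummy;
};

// Reads a counter under the same lock; unknown keys count as zero.
std::size_t live_count (std::string const& key)
{
    count_lock_t lock;
    std::map<std::string, std::size_t>::const_iterator it = guarded_count.find(key);
    return it == guarded_count.end() ? 0 : it->second;
}
```

Using a scoped guard rather than explicit lock/unlock calls means the mutex is released even if the map insertion throws.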

One More Thing

Now that we have a map of instance counts, we need to actually check it at exit.

Remember how a type gets its destructor called when it goes out of scope? Well, that also applies to global variables. So adding something like this (at the global scope) will print the names of all types with an instance count above zero, when the program terminates:

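#include <cstdio>

struct check_counts_t
{
    ~check_counts_t ()
    {
        // Report every type that still has live instances at exit.
        for(std::map<std::string, size_t>::const_iterator it = count.begin(); it != count.end(); ++it)
        {
            if(it->second != 0)
                fprintf(stderr, "%s has instance count of %zu\n", it->first.c_str(), it->second);
        }
    }

} check_counts;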
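Putting the pieces together, a self-contained sketch could look like the following. The map is renamed to instance_count, and leak_report, packet_t and thread_t are hypothetical additions so the report can be inspected without waiting for program exit. The map is declared before check_counts in the same file, so it is constructed first and destroyed last, which the exit-time check relies on.

```cpp
#include <cstddef>
#include <cstdio>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Declared before check_counts (below), so it outlives it.
static std::map<std::string, std::size_t> instance_count;

template <typename T>
struct dummy_t
{
    dummy_t ()               { ++instance_count[T::name()]; }
    dummy_t (dummy_t const&) { ++instance_count[T::name()]; }
    ~dummy_t ()              { --instance_count[T::name()]; }
};

#define WATCH_LEAKS(type) dummy_t<type> dummy; \
        static std::string name () { return #type; }

// Two example tracked types.
struct packet_t { WATCH_LEAKS(packet_t) };
struct thread_t { WATCH_LEAKS(thread_t) };

// One line per type that still has live instances.
std::vector<std::string> leak_report ()
{
    std::vector<std::string> lines;
    std::map<std::string, std::size_t>::const_iterator it;
    for(it = instance_count.begin(); it != instance_count.end(); ++it)
    {
        if(it->second != 0)
        {
            std::ostringstream line;
            line << it->first << " has instance count of " << it->second;
            lines.push_back(line.str());
        }
    }
    return lines;
}

// Prints the report to stderr when globals are destroyed at program exit.
struct check_counts_t
{
    ~check_counts_t ()
    {
        std::vector<std::string> lines = leak_report();
        for(std::size_t i = 0; i < lines.size(); ++i)
            fprintf(stderr, "%s\n", lines[i].c_str());
    }

} check_counts;
```

With this in place, a packet_t allocated with new and never deleted would show up in leak_report(), and on stderr at exit, as a non-zero instance count for packet_t.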

SIGPIPE 13