teaching machines

CS 330 Lecture 17 – Heap of Trouble

March 6, 2017 by . Filed under cs330, lectures, spring 2017.

Dear students,

Let’s start with a What Does This Do! Absorb this code into your brain, and predict its output. After a minute of silent reflection, we’ll discuss.

void mkfoo(char *s) {
  s = malloc(sizeof(char) * 4);
  strcpy(s, "foo");
}

int main(int argc, char **argv) {
  char *text = "slar";
  mkfoo(text);
  printf("%s", text);
  return 0;
}

Why doesn’t this code behave as we think it should? Because C and C++ are call-by-value languages. A better term would be pass-a-copy. mkfoo‘s s is different memory than main‘s text. They are initially both keys to the same door ("slar"), but mkfoo mutilates one to open a different door ("foo"). These pass-a-copy semantics will resurface later in our discussion.

Let’s talk about another one of the many choices that C++ gives us. In C++, we get to choose where to place our data: on the stack or on the heap.

When should we use stack-allocated memory? Most of the time. It is cheaper to allocate, is automatically reclaimed at the end of the function, and promotes good cache hygiene.

When should we use heap-allocated memory? When a chunk of non-primitive memory has to outlive the function that creates it. For example, if we create a ReallyBigImage inside a function, what will happen in the following code?

class ReallyBigImage {
  public:
    ReallyBigImage() {
      std::cout << "ctor" << std::endl; 
    }

    ~ReallyBigImage() {
      std::cout << "dtor" << std::endl; 
    }

  private:
    int pixels[100 * 200 * 3]; 
};

ReallyBigImage generate() {
  ReallyBigImage img;
  return img;
}

int main(int argc, char **argv) {
  ReallyBigImage img = generate();
}

Under the hood, C++ may invoke a special constructor called the copy constructor that creates a second instance of ReallyBigImage and shallowly copies over all the instance variables. Then it calls the deconstructor on generate‘s image. That’s unnecessarily expensive. We just want generate‘s image to live beyond the end of generate. That’s where the heap comes in. The heap is not bound up in the lifecycle of functions like the stack is.

ReallyBigImage *generate() {
  ReallyBigImage *img = new ReallyBigImage();
  return img;
}

int main(int argc, char **argv) {
  ReallyBigImage *img = generate();
  delete img;
}

The copy that happens in this code is a mere 8 bytes—the address.

In reality, a good compiler will recognize that the copy is unnecessary and will just have generate construct the ReallyBigImage in the memory of the caller. But you shouldn’t rely on compiler optimizations. Also, C++11 introduce a notion of move semantics that turns this compiler optimization into standardized behavior. The advances of C++ have greatly diminished the need for actively thinking about the heap, but it has not gone away entirely.

What about needing memory addresses of our data? Maybe a library function needs a pointer to our data. Is that a reason to use the heap? Not by itself, it isn’t. Working with memory doesn’t mean that the memory needs to be on the heap. Stack is memory too. Suppose for some reason you need to get both the quotient and remainder of a division operation. We might use pointers to get around the fact that C and C++ only allow a single return value:

void divmod(int numerator, int denominator, int *quotient, int *remainder) {
  *quotient = numerator / denominator;
  *remainder = numerator % denominator;
}
The caller of this code can just as easily use stack variables to hold the out parameters:
int main(int argc, char **argv) {
  int row, col;
  int width = 3;
  divmod(7, width, &row, &col);
  return 0;
}

However, there’s a better way to write this code: references. These give you the cheap sharing that pointers afford you, without the indirection:

void divmod(int numerator, int denominator, int &quotient, int &remainder) {
  quotient = numerator / denominator;
  remainder = numerator % denominator;
}

int main(int argc, char **argv) {
  int row, col;
  int width = 3;
  divmod(7, width, row, col);
  return 0;
}

So, really the only reason to use the heap is when we want to decouple the lifetime of our data from the function that created it. It’s unlikely that we write clean code to solve interesting problems without putting some data on the heap. So, we should at least make using the heap as pleasant as possible. In languages like Java, Ruby, Python, Javascript, and so on, we use the heap all the time. What makes our heap use so pleasant in these languages? We don’t have to worry about managing the lifecycle of heap data. An automatic garbage collector will do that for us.

The general idea behind all garbage collection algorithms is this: when heap memory becomes inaccessible, when no pointers to it are in still in scope anywhere, the memory may be reclaimed.

So, can we implement such an algorithm in C++ to automatically reclaim our heap memory. Yes. We. Can. And we will use templates to do it! We’ll create a class that wraps around a pointer to the heap:

template<typename T>
class Autofree {
  public:
    Autofree(T *pointer) :
      pointer(pointer) {
    }

  private:
    T *pointer;
};

As soon as we put some data on the heap, we will wrap it up in an Autofree instance. We formerly did this:

string *war_and_peace = new string("Well, Prince, Genoa and Lucca are now...");

But now we do this:

Autofree<string> war_and_peace = Autofree(new string("Well, Prince, Genoa and Lucca are now..."));

At this point, we could just add a destructor:

~Autofree() {
  delete pointer;
}

What does valgrind report? Success! But what happens with this code?

Autofree<string> war_and_peace = Autofree(new string("Well, Prince, Genoa and Lucca are now..."));
Autofree<string> copy = war_and_peace;

Autofree must be careful not to free the memory until all copies of it have gone out of circulation. Therefore, it must actively count how many copies of this pointer are in circulation. At the time an Autofree is born, exactly one copy exists:

template<typename T>
class Autofree {
  public:
    Autofree(T *pointer) :
      pointer(pointer),
      nreferences(new int(1)) {
    }

  private:
    T *pointer;
    int *nreferences;
};

The counter has to be dynamically allocated so that—just like the heap memory itself—all copies can share it.

But how do we grow the number of references? We have to find hooks for all the ways copies of an Autofree can be made. Consider this code again:

Autofree<string> war_and_peace = Autofree(new string("Well, Prince, Genoa and Lucca are now..."));
Autofree<string> copy = war_and_peace;

The compiler turns the second line into a call to a copy constructor, which is kind of like clone from the world of Java. A default copy constructor is implicitly provided by the compiler. It does a shallow copy. That’s almost what we want here, but not quite. Let’s define it explicitly to also increment the number of references:

Autofree(const Autofree<T> &that) :
  pointer(that.pointer),
  nreferences(that.nreferences) {
  ++*nreferences;
}

Now we can add in our destructor:

template<typename T>
class Autofree {
  public:
    Autofree(T *pointer) :
      pointer(pointer),
      nreferences(new int(1)) {
    }

    ~Autofree() {
      --*nreferences;
      if (*nreferences == 0) {
        delete pointer;
        delete nreferences;
      }
    }

  private:
    T *pointer;
    int *nreferences;
}

What does valgrind say now?

Okay, this is nice as far as memory management goes. But is it still usable? Does this code work?

std::cout << war_and_peace->length() << std::endl;
std::cout << *war_and_peace << std::endl;

No. Autofree has locked up its pointer, and these operations don’t make sense. However, both the arrow and dereference operators can be overloaded for custom classes. To provide consistent semantics, the arrow operator should return a pointer and the dereference operator should return the value to which the pointer points:

T *operator->() const {
  return pointer;
}

T &operator*() const {
  return *pointer;
}

Okay, are we done? No. We need to think about other ways that copies can go into or out of circulation. How about this sort of thing?

Autofree<int> a = Autofree<int>(new int(1));
Autofree<int> b = Autofree<int>(new int(2));
b = a;

We get leaks, which we’ll have to fix.

That third line isn’t going to generate a constructor call. The memory of b has already been initialized. Instead, this turns into a call to the operator= method, which we must overload to do the right thing. What is the right thing? Well, we gain a reference to whatever a pointed at, but we lose a reference to whatever b originally pointed at. So, we’ll have to reclaim memory sometimes even though an object hasn’t gone of out scope. Let’s factor out the work of our destructor into a private method:

~Autofree() {
  Decirculate();
}

void Decirculate() {
  --*nreferences;
  if (*nreferences == 0) {
    delete pointer;
    delete nreferences;
  }
}

Now we can write operator=:

void operator=(Autofree<T> &that) {
  Decirculate(); 
  this->pointer = that.pointer;
  this->nreferences = that.nreferences;
  ++*nreferences;
}

This takes care of the memory leaks. But what about this?

Autofree<int> a = Autofree<int>(new int(1));
Autofree<int> b = Autofree<int>(new int(2));
Autofree<int> c = Autofree<int>(new int(3));
c = b = a;

C++ generally allows the = operator to chain together. We should support this too:

Autofree<T> &operator=(Autofree<T> &that) {
  Decirculate(); 
  this->pointer = that.pointer;
  this->nreferences = that.nreferences;
  ++*nreferences;
  return *this;
}

Now how about this code?

Autofree<int> a = Autofree<int>(new int(1));
a = a;

The left-hand side gets decirculated so it can take on the right-hand side. But they’re the same! Whoops. We should prevent this by asserting that this and that are different:

Autofree<T> &operator=(Autofree<T> &that) {
  if (this != &that) {
    Decirculate(); 
    this->pointer = that.pointer;
    this->nreferences = that.nreferences;
    ++*nreferences;
  }
  return *this;
}

Whew. What if we throw some Autofree objects into a vector?

vector<Autofree<int>> nums;
for (int i = 0; i < 10; ++i) {
  Autofree<int> num = Autofree<int>(new int(i));
  nums.push_back(num);
}

It works great! No memory leaks. How about passing an Autofree to a method?

void foo(Autofree<int> param) {
  std::cout << *param << std::endl;
}

int main(int argc, char **argv) {
  Autofree<int> n(new int(17));
  foo(n);
  return 0;
}

C++ is a call-by-value language, so foo gets a copy of n, which is built by invoking the copy constructor. So, there are no memory leaks here either.

This implementation looks pretty robust, but writing your own pointer manager is not recommended by most professionals. The C++ standard library provides a professional implementation that makes better use of the cache and that avoids issues that we have almost certainly overlooked. Like exceptions being thrown. Or multithreading. We implement it ourselves to get a feel for how the library implementations work and to better understand the lifecycle of our data.

I will not be here the rest of the week. I will be at SIGCSE 2017 to learn about how to better grow your brains. When I return, we will explore a radically different view of programming through the language Haskell.

Here’s your TODO list:

See you next time!

Sincerely,