teaching machines

CS 330 Lecture 16 – Parametric Polymorphism

March 3, 2017 by . Filed under cs330, lectures, spring 2017.

Dear students,

Don’t get me started on arrays. What sort of world is this where an array is fixed in size, but yet it doesn’t know what that size is? We must fix this. Let’s create a growable array:

class Flexray {
  public:
    Flexray() :
      capacity(10),
      nitems(0),
      elements(new int[capacity]) {
    }

    ~Flexray() {
      delete[] elements;
    }

    void operator<<(int item) {
      if (nitems == capacity) {
        grow();
      }
      elements[nitems] = item;
      ++nitems;
    }

    int size() const {
      return nitems;
    }

    int &operator[](int i) {
      // TODO bounds check
      return elements[i];
    }

    const int &operator[](int i) const {
      // TODO bounds check
      return elements[i];
    }

  private:
    void grow() {
      int new_capacity = 2 * capacity;
      int *new_elements = new int[new_capacity];
      memcpy(new_elements, elements, sizeof(int) * capacity); 
      delete[] elements;
      capacity = new_capacity;
      elements = new_elements;
    }

    int capacity;
    int nitems;
    int *elements;
}

Okay, great. Now the boss tells us we need a growable array of strings. Then a growable array of stringstreams. Then a growable array of std::out_of_range. What are we going to do? Copy and paste and tweak types? No.

There is actually more than one answer to this question. We’ve already seen one vehicle for reusing code: subtype polymorphism. We could just have our collection appeal to the supertype. This is how Java’s collections worked when I first learned it. Collections were written in terms of Object, like this:

public class Flexray {
  private int capacity;
  private int nitems;
  private Object[] elements;

  public Flexray() {
    capacity = 10;
    nitems = 0;
    elements = new Object[capacity];
  }

  public void add(Integer item) {
    if (nitems == capacity) {
      grow();
    }
    elements[nitems] = item;
    ++nitems;
  }

  public int size() {
    return nitems;
  }

  public Object get(int i) {
    return elements[i];
  }

  private void grow() {
    int newCapacity = 2 * capacity;
    Object[] newElements = new Object[newCapacity];
    System.arraycopy(elements, 0, newElements, 0, nitems);
    capacity = newCapacity;
    elements = newElements;
  }
}

Now we can make a list that works for any type, right? Because everything’s an Object. Sadly, a generalizing polymorphic solution suffers from a couple of drawbacks:

A better fix is to define Flexray once but let Flexray‘s type come in as a parameter. This rounds out our four forms of polymorphism:

In C++, we decorate our class with a template and replace all references to int elements to T elements:

template<typename T>
class Flexray {
  public:
    Flexray() :
      capacity(10),
      nitems(0),
      elements(new T[capacity]) {
    }

    ~Flexray() {
      delete[] elements;
    }

    void operator<<(T item) {
      if (nitems == capacity) {
        grow();
      }
      elements[nitems] = item;
      ++nitems;
    }

    int size() const {
      return nitems;
    }

    T &operator[](int i) {
      // TODO bounds check
      return elements[i];
    }

    const T &operator[](int i) const {
      // TODO bounds check
      return elements[i];
    }

  private:
    void grow() {
      int new_capacity = 2 * capacity;
      T *new_elements = new T[new_capacity];
      memcpy(new_elements, elements, sizeof(T) * capacity); 
      delete[] elements;
      capacity = new_capacity;
      elements = new_elements;
    }

    int capacity;
    int nitems;
    T *elements;
}

A templated class or function isn’t actually compilable code. It’s just a pattern. Only when the compiler sees a reference to the templated construct with the type value does it go and create a specialization of that pattern. If it sees another reference with a different type, it generates a completely different specialization for just that type. So, we might have 20 different versions of Flexray, all catering to elements of different types. The source code will have just one definition, but the machine code will have 20 different representations.

Java handles templates differently. The source code looks mostly the same:

public class Flexray<T> {
  private int capacity;
  private int nitems;
  private T[] elements;

  public Flexray() {
    capacity = 10;
    nitems = 0;
    elements = (T[]) new Object[capacity];
  }

  public void add(T item) {
    if (nitems == capacity) {
      grow();
    }
    elements[nitems] = item;
    ++nitems;
  }

  public int size() {
    return nitems;
  }

  public T get(int i) {
    return elements[i];
  }

  private void grow() {
    int newCapacity = 2 * capacity;
    T[] newElements = (T[]) new Object[newCapacity];
    System.arraycopy(elements, 0, newElements, 0, nitems);
    capacity = newCapacity;
    elements = newElements;
  }
}

The designers decided that they didn’t want 20 different machine code representations. They decided that they would use the generic parameters only for type checking and not for code generation. The generic type is erased completely from the single machine code representation and is replaced with a general supertype—either Object or the generic type’s upper bound. Their design suffers from a couple of issues:

Hey, at least we fixed the type checking issues!

The most common application of templates is to create reusable collections. But in C++ we can also pass the compiler a value to help it specialize the templated pattern. For example, suppose we want lightning fast multiplication. Let’s take a first stab, which is anything but lightning fast:

int multiply(int a, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += a;
  }
  return sum;
}

Compare this to a templated version of multiply. Here one of the coefficients comes in as a template parameter:

template<int n>
int multiply(int a) {
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += a;
  }
  return sum;
}

Now consider this function that invokes the two multiplys:

void foo() {
  multiply(2, 7);
  multiply<2>(7);
}

What will the compiler do in the first case? It will generate very general machine code that invokes the multiplication hardware through a multiplication instruction, which tends to consume more cycles than addition—though they are close on modern architectures.

In the second case, the compiler is equipped with more information. It knows one of the operands. That means it can tailor the machine code it generates to the situation at hand. It may be able to avoid the multiplication unit! Let’s verify this using Matt Godbolt’s Compiler Explorer.

What else can we do with templates? We can eliminate memory leaks. But that’s a story for another day.

Sincerely,

P.S. It’s Haiku Friday!

First it was New York
Now you can ♥ anything
‘Cuz they are T-shirts