teaching machines

CS 330: Lecture 18 – Memory

March 14, 2018 by . Filed under cs330, lectures, spring 2018.

Dear students,

We’ve been talking about memory a fair bit lately, and I think it’s helpful that we have a clear picture of what’s happening when we run an executable. So, let’s have a quick discussion. When we double-click on a program or invoke it in the shell, the machine code for that program gets pulled from disk and turned into a process. Each process is given a chunk of RAM to work with. From the process’s perspective, its RAM footprint looks like this:

address content
high addresses stack
heap
uninitialized global data (bss)
initialized global data
low addresses code/text segment
0 NULL

To execute the program, the processor’s instruction pointer register is set to wherever main appears in the text segment.

Different chunks of memory are managed differently. The text segment tends not to be writable. The heap must be managed explicitly by the programmer in languages like C and C++. A service called the garbage collector will manage the heap in other languages. Doing anything to address 0 will result in a segmentation fault.

Static variables, those that are owned by a class and not a particular instance, go in the global data sections. Instance variables go in the heap or stack, depending on where the object is instantiated.

When a function is invoked, its parameters get pushed onto the stack automatically. The current instruction pointer—the return address—is also pushed onto the stack. The instruction pointer jumps to the function’s code in the text segment. When the function completes, the stack memory is automatically reclaimed when a function returns. The return address is popped off the stack and is used to restore the instruction pointer.

Stack overflow exploits often put malicious code into memory somewhere. Then they overflow fixed size buffers on the stack and alter the return address that was pushed on earlier. They set the return address to the location of their malicious code and carry out their nefarious deed.

Okay, with that background, we can have a better discussion about memory management in C++.

Last time we started exploring a polymorphic hierarchy for functions. Here was a supertype:

class QuadraticFunction {
  public:
    QuadraticFunction(double a,
                      double b,
                      double c) :
      a(a),
      b(b),
      c(c) {
    }

    double operator()(double x) const {
      return a * x * x + b * x + c;
    }

    string toString() const {
      stringstream ss;
      ss << a << " * x^2 + " << b << " * x + " << c;
      return ss.str();
    }

  protected:
    double a, b, c; 
};

And a subtype:

class LinearFunction : public QuadraticFunction {
  public:
    LinearFunction(double a,
                   double b) :
      QuadraticFunction(0, a, b) {
    }

    double operator()(double x) const {
      return b * x + c;
    }

    string toString() const {
      stringstream ss;
      ss << b << " * x + " << c;
      return ss.str();
    }
};

Let’s extend our polymorphic hierarchy by adding one more subtype, a DerivativeFunction:

class DerivativeFunction : public Function {
  public:
    DerivativeFunction(const Function &f) :
      f(f) {
    }

    double operator()(double x) const {
      const double DELTA = 0.001;
      return (f(x + DELTA) - f(x - DELTA)) / (2 * DELTA);
    }

  private:
    const Function &f;
};

A real derivative would examine the change in x as y gets infinitesimally small. Let’s pretend we’re happy with finite differencing. The DerivativeFunction class becomes a second conductor. It doesn’t know what sort of Function for which it’s computing the derivative.

In our main, we can get the derivative plotted without modifying the calculator in any way (or shape or form!):

DerivativeFunction fprime(f);
...
calc.add(&fprime);

We also started a round of What Does This Do:

In the last of these, the static type is QuadraticFunction even though the actual, underlying, runtime type is LinearFunction. What will happen when we call toString? In Java, we’ll invoke the subtype’s method. In C++, by default, we will get the static type’s method. To make C++ behave like Java, we must explicitly mark overrideable methods as virtual.

Why on earth should we have to do this? Why shouldn’t virtualness be the default behavior? Well, C++ is a language concerned with performance. When the compiler sees a call to a non-virtual method, it can figure out at compile time what the program counter should jump to. But in a polymorphic hierarchy with virtual functions, we’re not exactly certain what kind of object we have.

Consider method conduct:

void conduct(const Instrument &instrument) {
  instance.emlouden();
}

Where will this jump to? Tuba‘s emlouden? Viola‘s? Kazoo‘s? We just don’t know at compile time. This decision has to be made at runtime, and is therefore more expensive. The C++ designers believe that you shouldn’t have to pay for what you don’t need, so virtual is not the default. You have to sign up to get punched in the face.

For subtype polymorphism to work in languages like Java and C++, we need some for way for each object to carry around some information to help us figure out what method to call at runtime. What exactly is that information and how does the runtime decision—the dynamic dispatch—get made? Computer graphics pioneer Jim Blinn wrote a column in 1990s when C++ was gaining in popularity, and he dismissed polymorphism, saying he’d doing something like that in Fortran for the longest time. Here’s his polymorphism in Fortran:

Subroutine DRAW is the conductor code. It walks through a conditional ladder, inspects a type tag, and dispatches accordingly. This looks a lot like our dynamic typing system we wrote in C.

If this was how dynamic dispatch was implemented, its designer would have been eaten alive on Reddit and Stack Overflow. Why? For one, the conductor code is full of conditional statements. We saw in computer architecture that conditional statements thwart pipelining. For two, every time we add a new type to our hierarchy, we have to go back to the conductor code and update it. We’d really like to avoid touching code once it’s working. We tend to mess stuff up. There’s that saying, “separate that which changes from that which doesn’t.” This is why we make battery compartments. This is why we use domain names instead of IP addresses. This is why oil filters are always in locations that are easy to access. (Right?)

Thankfully, dynamic dispatch is implemented differently.

Here’s your TODO list for next time:

See you then!

Sincerely,

P.S. Here’s the code we wrote together:

dennis.cpp

#include <iostream>
#include <cmath>
#include <ncurses.h>
#include <vector>
#include <sstream>
#include <string>

using std::vector;
using std::stringstream;
using std::string;

class Function {
  public:
    virtual double evaluate(double x) const = 0;
};

class QuadraticFunction : public Function {
  public:
    QuadraticFunction(double a,
                      double b,
                      double c) :
      a(a),
      b(b),
      c(c) {
    }

    double evaluate(double x) const {
      return a * x * x + b * x + c;
    }

    virtual string toString() const {
      stringstream ss;
      ss << a << " * x^2 + " << b << " * x + " << c;
      return ss.str();
    }

  protected:
    double a, b, c; 
};

class LinearFunction : public QuadraticFunction {
  public:
    LinearFunction(double slope,
                   double intercept) :
      QuadraticFunction(0, slope, intercept) {
    }

    double evaluate(double x) const {
      return b * x + c;
    }

    string toString() const {
      stringstream ss;
      ss << b << " * x + " << c;
      return ss.str();
    }

  private:
};

class DerivativeFunction : public Function {
  public:
    DerivativeFunction(Function *f) :
      f(f) {
    }

    double evaluate(double x) const {
      const double DELTA = 1;
      double y2 = f->evaluate(x + DELTA);
      double y1 = f->evaluate(x - DELTA);
      double rise = y2 - y1;
      std::cerr << "rise: " << rise << std::endl;
      double run = DELTA * 2;
      std::cerr << "run: " << run << std::endl;
      return rise / run;
    }

  private:
    Function *f;
};

class SineFunction : public Function {
  public:
    SineFunction(double amplitude,
                 double frequency) :
      amplitude(amplitude),
      frequency(frequency) {
    }

    double evaluate(double x) const {
      return amplitude * sin(frequency * x);
    }

  private:
    double amplitude;
    double frequency;
};

class Calc {
  public:
    Calc() {
      initscr();
      clear();
      getmaxyx(stdscr, height, width);
    }

    void plot() {

      for (int x = 0; x < width; ++x) {
        mvprintw(height / 2, x, "-");
      }

      for (int y = 0; y < height; ++y) {
        mvprintw(y, width / 2, "|");
      }

      mvprintw(height / 2, width / 2, "+");

      for (auto f : fs) {
        for (int x = -width / 2; x < width / 2; ++x) {
          double y = f->evaluate(x);
          mvprintw((int) (-y + height / 2), (int) (x + width / 2), "*");
        }
      }

      getch();
      endwin();
    }

    void add(Function *f) {
      fs.push_back(f);
    }

  private:
    vector<Function *> fs;

    int width;
    int height;
};

int main(int argc, char **argv) {
  LinearFunction f(1, 4);
  DerivativeFunction g(&f);
  /* SineFunction g(5, 0.1); */

  QuadraticFunction x_squared(1, 0, 0);
  std::cout << x_squared.toString() << std::endl;
  
  LinearFunction x_plus_5(1, 5);
  std::cout << x_plus_5.toString() << std::endl;

  QuadraticFunction &ref = x_plus_5;
  std::cout << ref.toString() << std::endl;

  Calc c; 
  c.add(&f);
  c.add(&g);
  c.plot();

  return 0;
}