teaching machines

CS 488: Lecture 10 – Camera

March 1, 2021. Filed under graphics-3d, lectures, spring-2021.

Dear students:

All the renderers we’ve written so far have put a single object at the origin. The only thing we’ve done is spin that object around. It’s time to build up a larger world that the user can move around in. What new features of our graphics API do we need to construct an immersive world? None. We will use transforms to move the portion of the world that we want to see into the unit cube that WebGL expects. To make those transformations feel a little more natural, we will introduce the notion of a camera into our scenes.

Updated Pipeline

As we introduce the notion of a mobile viewer in our 3D scene, we will find some benefit in adding one more stage to our transformation pipeline. That stage is called eye space or camera space or view space. All told, our models will go through this sequence of spaces:

- model space
- world space
- eye space
- clip space
- normalized device coordinate (NDC) space

When we talk about perspective, clip space and NDC space will be distinguished. With orthographic projections, they are the same thing.

In eye space, the eye is at the center and the rest of the world is situated around it. Some of the calculations we’ll need for advanced lighting are easier and cheaper to do in eye space. The benefit we’ll see today is purely logical. It’s convenient to think about the camera being an entity in our scene.

In adding a space, we also need another matrix. To take a vertex from model coordinates all the way to clip/NDC space in the vertex shader, we’ll write:

gl_Position = eyeToClip * worldToEye * modelToWorld * vec4(position, 1.0);

In the future, I think I’ll swap the order of names in my matrices. This is easier to spot check:

gl_Position = clipFromEye * eyeFromWorld * worldFromModel * vec4(position, 1.0);

The modelToWorld matrix situates a model within the larger 3D world. Each model in our scene might have its own modelToWorld matrix. The worldToEye matrix transforms world coordinates into a coordinate system where the eye is at the origin and looking down the negative z-axis. The eyeToClip matrix maps eye space coordinates into the unit cube that WebGL expects.
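The whole chain is just a matrix product. Here’s a sketch of that composition on the CPU using plain row-major arrays of 16 numbers; the helper functions and the particular matrices are throwaway illustrations, not part of our library:

```javascript
// Multiply two 4x4 matrices stored as row-major arrays of 16 numbers.
function multiply(a, b) {
  const c = new Array(16).fill(0);
  for (let r = 0; r < 4; r++) {
    for (let col = 0; col < 4; col++) {
      for (let k = 0; k < 4; k++) {
        c[r * 4 + col] += a[r * 4 + k] * b[k * 4 + col];
      }
    }
  }
  return c;
}

// Transform a point [x, y, z, 1] by a 4x4 matrix.
function transform(m, v) {
  return [0, 1, 2, 3].map(r =>
    m[r * 4] * v[0] + m[r * 4 + 1] * v[1] + m[r * 4 + 2] * v[2] + m[r * 4 + 3] * v[3]
  );
}

const identity = [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1];
// A modelToWorld that translates the model 5 units along x.
const modelToWorld = [1,0,0,5, 0,1,0,0, 0,0,1,0, 0,0,0,1];
const worldToEye = identity;  // camera at the origin, for illustration
const eyeToClip = identity;   // orthographic identity, for illustration

// The same product the vertex shader computes, applied to vertex (1, 0, 0).
const modelToClip = multiply(eyeToClip, multiply(worldToEye, modelToWorld));
const clipPosition = transform(modelToClip, [1, 0, 0, 1]);
// clipPosition is [6, 0, 0, 1]
```

In a real renderer we’d upload the three matrices as uniforms and let the GPU do this multiplication per vertex; premultiplying them once per frame on the CPU is a common optimization.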


To ease the generation of the worldToEye matrix, we will create an abstraction representing a camera or viewer in our scene. It will turn properties of the viewer into a transformation matrix that orients the world around the camera. What properties do we need to know about a viewer in our scenes? At least these:

- the camera’s location
- the direction the camera is facing

But these are not enough to uniquely identify a camera’s view of the world. There are an infinite number of possible viewers at some location looking in some direction: the camera could still roll about its focal direction. The extra piece of information we need is the world’s up vector.

With these three pieces of information, we can form our matrix. The construction process relies on these amazing and not obvious properties of rotation matrices:

- The rows of a rotation matrix are orthonormal: they are unit length and mutually perpendicular.
- The inverse of a rotation matrix is its transpose, so a matrix whose rows are the camera’s axes rotates world coordinates into eye coordinates.
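We can see the transpose-is-inverse property concretely with a quick numeric check. This sketch uses a plain 3×3 rotation and throwaway helpers, not our Matrix4 class:

```javascript
// A rotation of 90 degrees about the z-axis, as a 3x3 row-major array.
const R = [
  [0, -1, 0],
  [1,  0, 0],
  [0,  0, 1],
];

// Transpose of a 3x3 matrix: swap rows and columns.
const transpose = m => m[0].map((_, j) => m.map(row => row[j]));

// Multiply a 3x3 matrix by a 3-vector.
const apply = (m, v) => m.map(row => row[0] * v[0] + row[1] * v[1] + row[2] * v[2]);

const v = [1, 2, 3];
const rotated = apply(R, v);            // [-2, 1, 3]
const restored = apply(transpose(R), rotated);
// restored equals v: the transpose undid the rotation, so it is the inverse
```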

Let’s work through the interface of our Camera class.


When a client creates a camera, we’ll ask them to provide these parameters, which we’ll use to determine our three axes:

- from: the camera’s location
- to: the point in the world at which the camera is looking
- worldUp: the world’s up vector

Right off the bat we can figure out one of our axes. Surprisingly, it’s not the up axis. The world’s up vector is not always the same as the camera’s. As we walk up and down slopes, we tilt our view. Rather, it’s the z-axis we can figure out.

Recall that in WebGL the positive z-axis points out of the screen. What vector in world space points out at us? This one: (from - to).normalize(). That’s the vector that starts at the focal point and points at the camera’s location.
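For instance, with a camera at (0, 0, 5) looking at the origin, that difference normalizes to the positive z-axis, pointing straight out of the screen. A sketch with a throwaway normalize helper:

```javascript
// Normalize a 3-vector to unit length.
function normalize(v) {
  const length = Math.hypot(v[0], v[1], v[2]);
  return v.map(component => component / length);
}

const from = [0, 0, 5];  // camera location
const to = [0, 0, 0];    // focal point
const backward = normalize(from.map((x, i) => x - to[i]));
// backward is [0, 0, 1]: the positive z-axis, pointing out at the viewer
```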

Later we’re going to find this vector’s negation more useful to have lying around, so we are going to store a flipped version of this direction:

this.forward = (to - from).normalize()

The forward vector is our focal direction. It tells us which way the camera is pointing. Its negation, -this.forward, is the third row of our matrix. There are still two more rows to figure out. We push the remaining logic into the orient helper method, which we’ll call from the constructor and from some of our other methods that move the camera around the world.


In the orient helper method, we know the camera’s location and forward direction and the world’s up axis. This information is enough to figure out the other axes for our matrix.

What axis in the world points right and should become the x-axis in eye space? We cross our two vectors to figure that out:

this.right = this.forward.cross(this.worldUp).normalize();

What axis in the world do we want to become the y-axis? Well, we know two of the axes. And we know that all of the axes should be mutually perpendicular. We cross the two known axes to get the camera’s up vector:

this.up = this.right.cross(this.forward);
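As a numeric sanity check, suppose the camera looks down the negative z-axis and the world’s up vector is the y-axis. The two cross products recover the frame we expect. A sketch with a throwaway cross helper:

```javascript
// Cross product of two 3-vectors.
function cross(a, b) {
  return [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0],
  ];
}

const forward = [0, 0, -1];  // looking into the screen
const worldUp = [0, 1, 0];

const right = cross(forward, worldUp);
// right is [1, 0, 0]; it happens to be unit length already

const up = cross(right, forward);
// up is [0, 1, 0]: it matches worldUp only because this camera isn't tilted
```

When the camera pitches up or down a slope, up and worldUp diverge, which is exactly why we compute the camera’s own up vector instead of reusing the world’s.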

With all three axes figured out, we can construct a rotation matrix:

rotation = |this.right.x     this.right.y     this.right.z     0|
           |this.up.x        this.up.y        this.up.z        0|
           |-this.forward.x  -this.forward.y  -this.forward.z  0|
           |0                0                0                1|

Orienting the camera is not just a matter of rotation. We must also put the camera at the origin, which we achieve by subtracting off the camera’s location. Our complete orienting matrix is the product of a rotation and translation:

this.matrix = rotation * Matrix4.translate(-this.from.x, -this.from.y, -this.from.z)

The renderer will access this matrix and use it to convert world coordinates into eye space coordinates.
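Putting the pieces together, here is a sketch of the whole world-to-eye transformation as a function on plain arrays. Our real Camera stores a Matrix4 rather than a closure, and the helper names here are illustrative:

```javascript
// Build a function that maps world coordinates into eye coordinates,
// given the camera's location, focal point, and the world's up vector.
function makeEyeFromWorld(from, to, worldUp) {
  const subtract = (a, b) => a.map((x, i) => x - b[i]);
  const normalize = v => { const d = Math.hypot(...v); return v.map(x => x / d); };
  const cross = (a, b) => [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0],
  ];
  const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

  const forward = normalize(subtract(to, from));
  const right = normalize(cross(forward, worldUp));
  const up = cross(right, forward);

  // rotation * translate(-from): translate first, then dot each axis row.
  return point => {
    const p = subtract(point, from);
    return [dot(right, p), dot(up, p), -dot(forward, p)];
  };
}

const toEye = makeEyeFromWorld([0, 0, 5], [0, 0, 0], [0, 1, 0]);
const eyeCoords = toEye([0, 0, 0]);
// eyeCoords is [0, 0, -5]: the focal point lands on the negative z-axis, in view
```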


Once we have a camera, we might as well give it some operations for moving around in the world. A common action is to strafe left or right, which we achieve by moving the camera’s location along its right axis:

this.from = this.from.add(this.right.scalarMultiply(distance))

After updating the camera’s location, we reorient the camera by calling this.orient again.


Another common action is to move the camera forward or backward, which we achieve by moving the camera’s location along the forward axis:

this.from = this.from.add(this.forward.scalarMultiply(distance))

After updating the camera’s location, we reorient the camera.
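Strafing and advancing are the same pattern: scale a camera axis by a signed distance and add it to the location. A sketch with plain arrays, written as a free function rather than the Camera methods our class will have:

```javascript
// Move a location along an axis by a signed distance.
function moveAlong(from, axis, distance) {
  return from.map((x, i) => x + axis[i] * distance);
}

const right = [1, 0, 0];
const forward = [0, 0, -1];
let from = [0, 0, 5];

from = moveAlong(from, right, 2);    // strafe right to [2, 0, 5]
from = moveAlong(from, forward, 5);  // advance to [2, 0, 0]
```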


Besides moving the camera, we also turn it. In an airplane, a turn left or right is called a yaw. We yaw the camera by rotating the forward direction around the world’s up vector:

this.forward = Matrix4.rotateAroundAxis(this.worldUp, degrees) * this.forward;

In your implementation, make sure that your types match. If rotateAroundAxis expects a Vector4, you’ll need to convert this.forward.

After updating the camera’s forward vector, we reorient the camera.


In an airplane, a turn up or down is called a pitch. We pitch the camera by rotating the forward direction around the right vector:

this.forward = Matrix4.rotateAroundAxis(this.right, degrees) * this.forward;

After updating the camera’s forward vector, we reorient the camera.
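Both yaw and pitch rotate the forward vector about some axis. Here is a sketch of that rotation on plain arrays using Rodrigues’ formula; this rotateAroundAxis is a stand-in for our Matrix4 method, which builds a matrix instead:

```javascript
// Rotate vector v around unit axis k by the given angle, via Rodrigues' formula:
// v' = v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t))
function rotateAroundAxis(k, degrees, v) {
  const t = degrees * Math.PI / 180;
  const kCrossV = [
    k[1] * v[2] - k[2] * v[1],
    k[2] * v[0] - k[0] * v[2],
    k[0] * v[1] - k[1] * v[0],
  ];
  const kDotV = k[0] * v[0] + k[1] * v[1] + k[2] * v[2];
  return v.map((x, i) =>
    x * Math.cos(t) + kCrossV[i] * Math.sin(t) + k[i] * kDotV * (1 - Math.cos(t))
  );
}

// Yaw 90 degrees: forward swings from into-the-screen to the left,
// landing at [-1, 0, 0] up to floating-point rounding.
const forward = rotateAroundAxis([0, 1, 0], 90, [0, 0, -1]);
```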


Inside our renderer, we make an instance of Camera and initialize it. We need to update our vertex shader and uniforms to reflect our new matrix. To get the camera moving around, we’ll need event handlers for both the keyboard and mouse.


In our renderer, we implement WASD inputs to strafe and advance the camera:

window.addEventListener('keydown', event => {
  if (event.key === 'a') {
    camera.strafe(-0.1);  // method names and step sizes are placeholders
  } else if (event.key === 'd') {
    camera.strafe(0.1);
  } else if (event.key === 'w') {
    camera.advance(0.1);
  } else if (event.key === 's') {
    camera.advance(-0.1);
  }
});
Some games use Q and E to yaw the camera, but we’ll use the mouse for that.


First-person games that use the mouse typically hide the mouse. Otherwise the mouse runs into the walls of the viewport or monitor. JavaScript provides the Pointer Lock API to turn the mouse cursor off. When the cursor is hidden, the mouse events report the relative movement between events rather than the mouse’s absolute location. We use the x-delta to yaw the camera and the y-delta to pitch.

window.addEventListener('mousedown', () => {
  // lock the pointer to the canvas so the cursor disappears
  canvas.requestPointerLock();
});

window.addEventListener('mousemove', event => {
  if (document.pointerLockElement) {
    camera.yaw(-event.movementX * 0.1);
    camera.pitch(-event.movementY * 0.1);
  }
});


Here’s your TODO list:

See you next time.


P.S. It’s time for a haiku!

Adam and Eve gamed
Guess their favorite genre
First-person shooters