CS 488: Lecture 10 – Camera
Dear students:
All the renderers we’ve written so far have put a single object at the origin. The only thing we’ve done is spin that object around. It’s time to build up a larger world that the user can move around in. What new features of our graphics API do we need to construct an immersive world? None. We will use transforms to move the portion of the world that we want to see into the unit cube that WebGL expects. To make those transformations feel a little more natural, we will introduce the notion of a camera into our scenes.
Updated Pipeline
As we introduce the notion of a mobile viewer in our 3D scene, we will find some benefit in adding one more stage to our transformation pipeline. That stage is called eye space or camera space or view space. All told, our models will go through this sequence of spaces:
- model space: the space in which the model is designed
- world space: the space in which all the models are arranged
- eye space: the space in which the camera is at the origin
- clip/NDC space: the space inside the unit cube that WebGL projects to the viewport
- pixel space: the space inside the viewport
When we talk about perspective, clip space and NDC space will be distinguished. With orthographic projections, they are the same thing.
In eye space, the eye is at the center and the rest of the world is situated around it. Some of the calculations we’ll need for advanced lighting are easier and cheaper to do in eye space. The benefit we’ll see today is purely logical. It’s convenient to think about the camera being an entity in our scene.
In adding a space, we also need another matrix. To take a vertex from model coordinates all the way to clip/NDC space in the vertex shader, we’ll write:
gl_Position = eyeToClip * worldToEye * modelToWorld * vec4(position, 1.0);
In the future, I think I’ll swap the order of names in my matrices. This is easier to spot check:
gl_Position = clipFromEye * eyeFromWorld * worldFromModel * vec4(position, 1.0);
The modelToWorld matrix situates a model within the larger 3D world. Each model in our scene might have its own modelToWorld matrix. The worldToEye matrix transforms world coordinates into a coordinate system where the eye is at the origin and looking along the negative z-axis. The eyeToClip matrix transforms eye space coordinates into coordinates relative to the unit cube that WebGL expects.
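To see the first two stages in action, here is a minimal sketch in JavaScript. It assumes matrices are stored as plain arrays of rows, and the helpers transform and translate are hypothetical stand-ins for whatever matrix class you have. The projection stage is left out. A model placed 2 units to the right is viewed by a camera sitting at (0, 0, 5) and looking down the negative z-axis:

```javascript
// Multiply a 4x4 matrix (an array of rows) by a 4-vector.
const transform = (m, v) =>
  m.map(row => row.reduce((sum, x, i) => sum + x * v[i], 0));

// Build a translation matrix.
const translate = (x, y, z) => [
  [1, 0, 0, x],
  [0, 1, 0, y],
  [0, 0, 1, z],
  [0, 0, 0, 1],
];

// Place the model 2 units to the right of the world's origin.
const modelToWorld = translate(2, 0, 0);

// The camera sits at (0, 0, 5) looking down the negative z-axis, so moving
// it to the origin means shifting the whole world by -5 along z.
const worldToEye = translate(0, 0, -5);

const modelPosition = [0, 0, 0, 1];
const worldPosition = transform(modelToWorld, modelPosition); // [2, 0, 0, 1]
const eyePosition = transform(worldToEye, worldPosition);     // [2, 0, -5, 1]
console.log(worldPosition, eyePosition);
```

The vertex ends up 5 units in front of the eye and 2 units to its right, which is what we would expect to see through the camera.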
Camera
To ease the generation of the worldToEye matrix, we will create an abstraction representing a camera or viewer in our scene. It will turn properties of the viewer into a transformation matrix that orients the world around the camera. What properties do we need to know about a viewer in our scenes? At least these:
- The location of the viewer.
- The direction the viewer is facing.
But these are not enough to uniquely identify a camera’s view of the world. There are an infinite number of possible viewers at some location looking in some direction. What more information do we need to know?
- Which way is up.
With these three pieces of information, we can form our matrix. The construction process relies on these amazing and not obvious properties of rotation matrices:
- The first row of a rotation matrix corresponds to the axis of incoming space that will become the x-axis of the outgoing space.
- The second row of a rotation matrix corresponds to the axis of incoming space that will become the y-axis of the outgoing space.
- The third row of a rotation matrix corresponds to the axis of incoming space that will become the z-axis of the outgoing space.
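We can check these properties numerically. The following sketch uses plain arrays rather than a matrix class, and the helpers dot and applyRows are hypothetical names. It builds a rotation whose rows are three mutually perpendicular unit vectors and then sends each row through the matrix:

```javascript
// Dot product of two 3-vectors stored as arrays.
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

// Multiply a 3x3 matrix, given as an array of rows, by a vector.
const applyRows = (rows, v) => rows.map(row => dot(row, v));

// Three mutually perpendicular unit vectors in the incoming space.
const s = Math.SQRT1_2;
const rows = [
  [s, 0, -s], // will become the x-axis
  [0, 1, 0],  // will become the y-axis
  [s, 0, s],  // will become the z-axis
];

// Transform each row vector by the matrix built from those same rows.
const images = rows.map(row => applyRows(rows, row));
console.log(images);
```

Up to floating-point rounding, the three results are (1, 0, 0), (0, 1, 0), and (0, 0, 1). Each row dots to 1 with itself and to 0 with the other two, which is why it lands on its own axis.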
Let’s work through the interface of our Camera class.
Constructor
When a client creates a camera, we’ll ask them to provide these parameters, which we’ll use to determine our three axes:
- The location of the camera, which we’ll call from. We hang on to this in an instance variable.
- The focal point, which we’ll call to. We don’t need to hang on to this.
- The world’s up direction, which we’ll call worldUp. We hang on to this in an instance variable.
Right off the bat we can figure out one of our axes. Surprisingly, it’s not the up axis. The world’s up vector is not always the same as the camera’s. As we walk up and down slopes, we tilt our view. Rather, it’s the z-axis we can figure out.
Recall that in WebGL the positive z-axis points out of the screen. What vector in world space points out at us? This one: (from - to).normalize(). That’s the vector that starts at the focal point and points at the camera’s location.
Later we’re going to find it useful to have the opposite of this vector lying around, so we store the flipped direction, which points from the camera toward the focal point:
this.forward = (to - from).normalize()
The forward vector is our focal direction. It tells us which way the camera is pointing. Its negation -this.forward is the third row of our matrix. There are still two more rows to figure out. We push the remaining logic into the orient helper method, which we’ll call from the constructor and from some of our other methods that move the camera around the world.
Orient
In the orient helper method, we know the camera’s location and forward direction and the world’s up axis. This information is enough to figure out the other axes for our matrix, provided the forward direction is not parallel to the world’s up axis.
What axis in the world points right and should become the x-axis in eye space? We cross our two vectors to figure that out:
this.right = this.forward.cross(this.worldUp).normalize();
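What axis in the world do we want to become the y-axis? Well, we know two of the axes, and we know that all three axes should be mutually perpendicular. We cross the two known axes to get the camera’s up vector. Because right and forward are already perpendicular unit vectors, the result has unit length and needs no normalizing: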
this.up = this.right.cross(this.forward);
With all three axes figured out, we can construct a rotation matrix:
rotation = | this.right.x     this.right.y     this.right.z     0 |
           | this.up.x        this.up.y        this.up.z        0 |
           | -this.forward.x  -this.forward.y  -this.forward.z  0 |
           | 0                0                0                1 |
Orienting the camera is not just a matter of rotation. We must also put the camera at the origin, which we achieve by subtracting off the camera’s location. Our complete orienting matrix is the product of a rotation and translation:
this.matrix = rotation * Matrix4.translate(-this.from.x, -this.from.y, -this.from.z)
The renderer will access this matrix and use it to convert world coordinates into eye space coordinates.
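Here is one way the constructor and orient might fit together. This is a sketch under assumptions, not the course’s reference implementation: vectors are plain arrays, the matrix is an array of rows, and the helpers sub, dot, cross, and normalize and the toEye method are hypothetical names invented for the example. Rather than multiplying two matrices, it folds the translation directly into the last column, which gives the same product:

```javascript
// Minimal vector helpers on plain arrays; stand-ins for a Vector3 class.
const sub = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];
const normalize = v => {
  const length = Math.hypot(...v);
  return v.map(x => x / length);
};

class Camera {
  constructor(from, to, worldUp) {
    this.from = from;
    this.worldUp = worldUp;
    this.forward = normalize(sub(to, from));
    this.orient();
  }

  orient() {
    this.right = normalize(cross(this.forward, this.worldUp));
    this.up = cross(this.right, this.forward);
    const back = this.forward.map(x => -x);
    // rotation * translate(-from) has the rotation's rows on the left and
    // -(row . from) in the last column.
    this.matrix = [
      [...this.right, -dot(this.right, this.from)],
      [...this.up, -dot(this.up, this.from)],
      [...back, -dot(back, this.from)],
      [0, 0, 0, 1],
    ];
  }

  // Transform a world-space point into eye space.
  toEye(p) {
    return this.matrix.slice(0, 3).map(row => dot(row, p) + row[3]);
  }
}

// A camera 5 units out on the z-axis, looking at the world's origin.
const camera = new Camera([0, 0, 5], [0, 0, 0], [0, 1, 0]);
console.log(camera.toEye([0, 0, 0])); // the origin lands 5 units down -z
console.log(camera.toEye([0, 0, 5])); // the camera's location lands at the origin
```

Checking that the camera’s own location maps to the origin is a quick sanity test for any implementation of orient.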
Strafe
Once we have a camera, we might as well give it some operations for moving around in the world. A common action is to strafe left or right, which we achieve by moving the camera’s location along its right axis:
this.from = this.from.add(this.right.scalarMultiply(distance))
After updating the camera’s location, we reorient the camera by calling this.orient again.
Advance
Another common action is to move the camera forward or backward, which we achieve by moving the camera’s location along the forward axis:
this.from = this.from.add(this.forward.scalarMultiply(distance))
After updating the camera’s location, we reorient the camera.
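Both moves have the same shape: scale one of the camera’s axes and add it to the location. The sketch below shows them as free functions on a plain object with array vectors; the helpers add and scale are hypothetical stand-ins for add and scalarMultiply, and the call to orient is left as a comment because the matrix is not part of this example:

```javascript
// Stand-ins for Vector3's add and scalarMultiply, on plain arrays.
const add = (a, b) => a.map((x, i) => x + b[i]);
const scale = (v, s) => v.map(x => x * s);

// A camera at (0, 0, 5) looking down the negative z-axis, axes already oriented.
const camera = {
  from: [0, 0, 5],
  forward: [0, 0, -1],
  right: [1, 0, 0],
};

// Slide the camera along its right axis. Negative distances move left.
function strafe(camera, distance) {
  camera.from = add(camera.from, scale(camera.right, distance));
  // A full implementation would call camera.orient() here.
}

// Slide the camera along its forward axis. Negative distances back up.
function advance(camera, distance) {
  camera.from = add(camera.from, scale(camera.forward, distance));
  // A full implementation would call camera.orient() here.
}

strafe(camera, 0.5);
advance(camera, 2);
console.log(camera.from); // [0.5, 0, 3]
```

Neither move changes the forward, right, or up vectors. Reorienting only refreshes the translation part of the matrix.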
Yaw
Besides moving the camera, we also turn it. In an airplane, a turn left or right is called a yaw. We yaw the camera by rotating the forward direction around the world’s up vector:
this.forward = Matrix4.rotateAroundAxis(this.worldUp, degrees) * this.forward;
In your implementation, make sure that your types match. If rotateAroundAxis expects a Vector4, you’ll need to convert this.forward.
After updating the camera’s forward vector, we reorient the camera.
Pitch
In an airplane, a turn up or down is called a pitch. We pitch the camera by rotating the forward direction around the right vector:
this.forward = Matrix4.rotateAroundAxis(this.right, degrees) * this.forward;
After updating the camera’s forward vector, we reorient the camera.
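To see which way positive angles turn, we can rotate a vector directly. The sketch below does not build a matrix. Instead it applies Rodrigues’ rotation formula, which produces the same vector as multiplying by an axis-angle rotation matrix. It assumes the right-handed convention, a unit-length axis, and plain array vectors; the three-argument rotateAroundAxis here is a hypothetical helper, not the Matrix4 method:

```javascript
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];

// Rotate vector v around the unit-length axis k by the given degrees using
// Rodrigues' formula: v cos(t) + (k x v) sin(t) + k (k . v) (1 - cos(t)).
function rotateAroundAxis(k, degrees, v) {
  const t = degrees * Math.PI / 180;
  const c = Math.cos(t);
  const s = Math.sin(t);
  const kCrossV = cross(k, v);
  const kDotV = dot(k, v);
  return v.map((x, i) => x * c + kCrossV[i] * s + k[i] * kDotV * (1 - c));
}

const worldUp = [0, 1, 0];
const right = [1, 0, 0];
const forward = [0, 0, -1];

// A positive yaw turns the camera toward its left.
const yawed = rotateAroundAxis(worldUp, 90, forward);  // approximately [-1, 0, 0]

// A positive pitch tilts the camera upward.
const pitched = rotateAroundAxis(right, 90, forward);  // approximately [0, 1, 0]

console.log(yawed, pitched);
```

Under this convention, a mouse handler that wants rightward motion to turn the camera right has to negate the x-delta before yawing.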
Renderer
Inside our renderer, we make an instance of Camera and initialize it. We need to update our vertex shader and uniforms to reflect our new matrix. To get the camera moving around, we’ll need event handlers for both the keyboard and mouse.
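One detail to watch when uploading the new uniform: WebGL reads matrix uniforms in column-major order. If your matrix class stores rows, it needs flattening first. The sketch below assumes an array-of-rows layout, and toColumnMajor, gl, and program are hypothetical names. Your Matrix4 class may already store its elements column-major, in which case no conversion is needed:

```javascript
// Flatten a 4x4 matrix stored as an array of rows into the column-major
// order that WebGL's uniformMatrix4fv expects.
function toColumnMajor(rows) {
  const data = new Float32Array(16);
  for (let r = 0; r < 4; ++r) {
    for (let c = 0; c < 4; ++c) {
      data[c * 4 + r] = rows[r][c];
    }
  }
  return data;
}

// A worldToEye matrix for a camera at (0, 0, 5) looking down the negative z-axis.
const worldToEye = [
  [1, 0, 0, 0],
  [0, 1, 0, 0],
  [0, 0, 1, -5],
  [0, 0, 0, 1],
];
const data = toColumnMajor(worldToEye);

// In the renderer, with a WebGL context gl and a linked shader program:
// const location = gl.getUniformLocation(program, 'worldToEye');
// gl.uniformMatrix4fv(location, false, data);
console.log(data);
```

The translation ends up in elements 12 through 14 of the flattened array, which is a handy thing to check when the scene renders blank.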
Keyboard
In our renderer, we implement WASD inputs to strafe and advance the camera:
window.addEventListener('keydown', event => {
  if (event.key === 'a') {
    camera.strafe(-0.1);
    render();
  } else if (event.key === 'd') {
    camera.strafe(0.1);
    render();
  } else if (event.key === 'w') {
    camera.advance(0.1);
    render();
  } else if (event.key === 's') {
    camera.advance(-0.1);
    render();
  }
});
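The if-else chain gets long as we bind more keys. One alternative, offered only as a sketch, is a lookup table from keys to camera actions. The names keyActions and handleKey are hypothetical, and a stub camera stands in for the real one so the logic can run outside the browser:

```javascript
// Map each key to the camera action it triggers, using the same 0.1 step
// as the handler above.
const keyActions = new Map([
  ['a', camera => camera.strafe(-0.1)],
  ['d', camera => camera.strafe(0.1)],
  ['w', camera => camera.advance(0.1)],
  ['s', camera => camera.advance(-0.1)],
]);

// Run the action bound to a key. Returns true if the camera moved and the
// scene needs to be redrawn, false if the key is unbound.
function handleKey(camera, key) {
  const action = keyActions.get(key);
  if (!action) {
    return false;
  }
  action(camera);
  return true;
}

// A stub camera that just logs what it was asked to do.
const stubCamera = {
  strafe: distance => console.log('strafe', distance),
  advance: distance => console.log('advance', distance),
};
handleKey(stubCamera, 'w'); // logs: advance 0.1

// In the browser, the listener shrinks to:
// window.addEventListener('keydown', event => {
//   if (handleKey(camera, event.key)) render();
// });
```

Adding Q and E for yaw would then be two more entries in the table rather than two more branches.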
Some games use Q and E to yaw the camera, but we’ll use the mouse for that.
Mouse
First-person games that use the mouse typically hide the mouse cursor. Otherwise the cursor runs into the walls of the viewport or monitor. Browsers provide the Pointer Lock API to turn the mouse cursor off. When the pointer is locked, the mouse events report the relative movement between events in their movementX and movementY properties rather than the mouse’s absolute location. We use the x-delta to yaw the camera and the y-delta to pitch.
window.addEventListener('mousedown', () => {
  document.body.requestPointerLock();
});
window.addEventListener('mousemove', event => {
  if (document.pointerLockElement) {
    camera.yaw(-event.movementX * 0.1);
    camera.pitch(-event.movementY * 0.1);
    render();
  }
});
TODO
Here’s your TODO list:
- Start working on the Gyromesh project. The only part we haven’t talked about yet is specular lighting, which we’ll hit up on Wednesday. There’s a fair bit of work involved in this project, as you are integrating file parsing, normal generation, a trackball, and lighting. Give yourself plenty of time.
See you next time.
P.S. It’s time for a haiku!
Adam and Eve gamed
Guess their favorite genre
First-person shooters