Last time we introduced the idea of pipelining, which allowed multiple instructions to progress through the processor at a time. But we saw some dangers that crept in. If this were politics, we would tell pipelining to apologize and resign. But we’re computer scientists, and we fix things rather than give up on them. Let’s walk through the issues we saw last time and talk about how to avoid them or lessen their impact.
First we saw data hazards. We looked at read-after-write, in which a later instruction relies on the result of an earlier instruction. If the earlier instruction hasn’t written back the result to the register file, the later instruction will read the wrong value. What can we do about it?
We can delay the later instruction, either by inserting nop (no operation) instructions that pad out the stages or by letting the later instruction dwell a bit longer in some stages. Such delays are often called stalls or bubbles. Though they preserve the correctness of our computation, they undermine the performance advantages we hoped pipelining would provide.
We can fill the wait not with nops but with other, independent instructions. There might be useful work that can be done while we are waiting. For this to work, the hardware will need to collect a handful of instructions so it has a few on hand when a data hazard appears. Consider this example:

    add r1, r2, #1
    lsl r3, r1, #2
    mov r5, #57
    mvn r4, r4
    and r6, r6, r7

That lsl depends on r1, but the three instructions after it don’t have any interdependencies with it, so we might as well execute the program like this:

    add r1, r2, #1
    mov r5, #57
    mvn r4, r4
    and r6, r6, r7
    lsl r3, r1, #2

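To put rough numbers on the reordering above, here is a minimal Python sketch that counts bubbles for both orderings. It is not a model of any real processor. The MIN_DISTANCE of 4 issue slots between a producer and its consumer is an assumed value chosen for illustration, and the helper names (parse, count_stalls, independent) are made up for this example. It also ignores memory and condition flags.

```python
import re

# Assumption (not from the lecture): an instruction that reads a register must
# issue at least this many slots after the instruction that writes it.
MIN_DISTANCE = 4


def parse(line):
    """Split 'add r1, r2, #1' into (destination, set of source registers)."""
    regs = re.findall(r"\br\d+\b", line)
    return regs[0], set(regs[1:])


def count_stalls(program):
    """Count the bubbles needed to run the program in the order given."""
    ready_at = {}  # register -> earliest slot at which it may be read
    slot = 0
    stalls = 0
    for line in program:
        dest, sources = parse(line)
        earliest = max((ready_at.get(r, 0) for r in sources), default=0)
        if earliest > slot:
            # Read-after-write hazard: wait until the operand is ready.
            stalls += earliest - slot
            slot = earliest
        ready_at[dest] = slot + MIN_DISTANCE
        slot += 1
    return stalls


def independent(a, b):
    """True if a and b touch registers in a way that lets them swap places."""
    a_dest, a_src = parse(a)
    b_dest, b_src = parse(b)
    return a_dest not in b_src and b_dest not in a_src and a_dest != b_dest


original = [
    "add r1, r2, #1",
    "lsl r3, r1, #2",
    "mov r5, #57",
    "mvn r4, r4",
    "and r6, r6, r7",
]
reordered = [original[0], original[2], original[3], original[4], original[1]]

print(count_stalls(original))   # 3
print(count_stalls(reordered))  # 0
```

Under that assumption the original order needs three bubbles and the reordered one needs none. The independent helper confirms that lsl may slide past mov, mvn, and and, but not past add.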
We also saw control hazards. When we have a branch instruction, there are two possible next instructions. (If we have mov pc, ?, there are a lot more than two choices!) Since we don’t know exactly what the program counter is going to be, we can’t schedule the next instruction in the pipeline. Is this a big deal? Yes, given that about 20% of instructions are branches! What do we do? Well, we have a few choices:

    for y to 240
      for x to 320
        puts (x, y)

On the inner loop, how many times is the prediction wrong? It’s wrong on the first iteration, because the last time that loop was run, the branch was not taken. It’s also wrong on the last iteration, because on the previous iteration, the branch was taken.
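The counting argument above can be checked with a short simulation. This is a sketch under two assumptions that the pseudocode does not state: the inner loop ends with a conditional branch that jumps back to the top on every iteration except the last, and the predictor simply guesses that the branch will do whatever it did last time. The function name mispredictions and its parameters are invented for this example.

```python
def mispredictions(outer=240, inner=320, last_taken=False):
    """Count wrong guesses for the inner loop's closing branch.

    last_taken is the predictor's starting state, which is an assumption.
    Returns (wrong guesses, total branches executed).
    """
    wrong = 0
    total = 0
    for _ in range(outer):
        for x in range(inner):
            # The branch jumps back to the loop top unless this was the
            # final iteration of the inner loop.
            taken = x < inner - 1
            if taken != last_taken:
                wrong += 1
            last_taken = taken  # remember the outcome for the next guess
            total += 1
    return wrong, total


print(mispredictions())  # (480, 76800)
```

So the guess is wrong twice per run of the inner loop: 480 times out of 76,800 branches, which is well under 1%.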
Next class we have a couple folks from Superion visiting to talk about FPGAs. So, this will be my last time at the front of the room. I just want to revisit what we’ve done this semester:
Why did we talk about all this stuff, which seems fairly removed from the activities of our internships and other classes? After all, critics of the computer-science-for-all initiatives argue that one does not need to understand how an engine works in order to drive across town. They may be right, but you are not just a driver, a user of a computer. You are making it do things it didn’t know how to do before you came along. For that to happen to its fullest, we have to crack open the hood.
Here’s your TODO list for next time:
See you next class!
P.S. Here’s the code we wrote together…