V22.0436 - Prof. Grishman
Lecture 16: Pipelining -- Some details
The pipeline registers have to hold all the information required
to complete execution of the instruction. This includes both data
and control information. Fig. 4.46 shows the processor with the
control signals it requires; Fig. 4.50 shows how these control signals
are passed along from pipeline register to pipeline register. Fig. 4.51
puts these together.
Pipeline hazards were introduced in the overview (Lecture 15).
Data Hazards (Text 4.7)
Consider the code sequence on p. 363 of the text.
Figure 4.52 shows the pipeline dependencies
in this sequence, and Fig. 4.53 shows how
some of them can be addressed by sending the output of the ALU from one
instruction directly to a following instruction, bypassing the register
file ("forwarding"). Fig. 4.54 shows
the additions to the data path to do this.
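As a rough illustration of when forwarding applies, the sketch below scans an instruction sequence and reports which source operands would have to be forwarded. The sequence is a made-up example (not the code on p. 363), and the tuple layout and the function name forwarding_needed are assumptions of this sketch. It assumes the 5-stage pipeline, with the register file written in the first half of a cycle and read in the second half, so that only producers one or two instructions ahead need forwarding. Loads are not modeled here.

```python
# Each instruction: (text, destination register or None, source registers).
# Hypothetical sequence chosen for illustration only.
program = [
    ("add $t0, $t1, $t2", "$t0", ["$t1", "$t2"]),
    ("sub $t3, $t0, $t4", "$t3", ["$t0", "$t4"]),
    ("or  $t5, $t6, $t0", "$t5", ["$t6", "$t0"]),
    ("and $t7, $t0, $t3", "$t7", ["$t0", "$t3"]),
]

def forwarding_needed(program):
    """Return (consumer index, producer index, register, pipeline register)
    for every source operand that must be forwarded."""
    results = []
    for i, (_, _, sources) in enumerate(program):
        for src in sources:
            # Check the nearest producer first: distance 1, then distance 2.
            for dist in (1, 2):
                j = i - dist
                if j >= 0 and program[j][1] == src:
                    # Distance 1: value comes from the EX/MEM register.
                    # Distance 2: value comes from the MEM/WB register.
                    path = "EX/MEM" if dist == 1 else "MEM/WB"
                    results.append((i, j, src, path))
                    break
    return results

for consumer, producer, reg, path in forwarding_needed(program):
    print(f"{program[consumer][0]!r} needs {reg} from "
          f"{program[producer][0]!r} via {path}")
```

Note that the last instruction reads $t0 three instructions after it is produced; under the stated register-file assumption that value is read normally and needs no forwarding.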
Not all data hazards can be addressed by forwarding. Sometimes a
result is needed before it is available; for example, when a
load is immediately followed by an R-type instruction that uses the
loaded data (p. 372, Figure 4.58).
In that case the only thing we can do is wait ("stall"). We do
this by changing the second instruction going down the pipeline to a
"no operation" and re-issuing it in the next cycle (Fig. 4.59).
As the length of the pipeline increases, this problem gets worse ... more
stalls are required and the CPI (which in ideal conditions is 1) goes
up. The compiler can reduce this effect by judicious scheduling
of instructions: in particular, for our pipelined MIPS, by moving
loads earlier so that there is at least one unrelated instruction
between a load and the instruction that uses the result of the load.
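The effect of scheduling can be checked with a small count of load-use stalls. The sketch below is only illustrative: both instruction orderings are made-up examples, the tuple layout and function names are assumptions, and it assumes a 5-stage pipeline with forwarding in which a load followed immediately by any instruction reading the loaded register costs exactly one stall cycle.

```python
# Each instruction: (text, opcode, destination register or None, source registers).
# Hypothetical code, chosen for illustration only.
unscheduled = [
    ("lw  $t1, 0($t0)",   "lw",  "$t1", ["$t0"]),
    ("lw  $t2, 4($t0)",   "lw",  "$t2", ["$t0"]),
    ("add $t3, $t1, $t2", "add", "$t3", ["$t1", "$t2"]),
    ("sw  $t3, 12($t0)",  "sw",  None,  ["$t3", "$t0"]),
    ("lw  $t4, 8($t0)",   "lw",  "$t4", ["$t0"]),
    ("add $t5, $t1, $t4", "add", "$t5", ["$t1", "$t4"]),
    ("sw  $t5, 16($t0)",  "sw",  None,  ["$t5", "$t0"]),
]
# Same work, with the third load moved up ahead of the first add.
scheduled = [unscheduled[i] for i in (0, 1, 4, 2, 3, 5, 6)]

def load_use_stalls(program):
    """Count adjacent pairs where a load feeds the very next instruction."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev[1] == "lw" and prev[2] in cur[3]:
            stalls += 1
    return stalls

def total_cycles(program, stages=5):
    """Cycles to run the sequence: pipeline fill, one cycle per remaining
    instruction, plus one cycle per load-use stall."""
    return stages + (len(program) - 1) + load_use_stalls(program)

for name, prog in (("unscheduled", unscheduled), ("scheduled", scheduled)):
    print(name, load_use_stalls(prog), "stalls,", total_cycles(prog), "cycles")
```

Under these assumptions the original order takes 13 cycles (2 stalls) and the reordered version takes 11 cycles (no stalls).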
Control Hazards (Text 4.8)
Conceptually, the simplest solution is to stall after a branch, waiting
until the branch has been resolved (decided). That exacts a heavy
speed penalty. Almost as simple is to assume that a branch is not
taken and to continue issuing instructions along that path. We
will know whether the branch is taken before any subsequent
instructions store results into registers or memory. If it turns
out that the branch is taken, we flush
the pipeline (turn all operations in the pipeline into no-operations),
reset the PC, and continue.
With the standard pipeline, we discard 3 instructions (lose 3 cycles)
if a branch is taken. P&H show how this can be reduced to one
cycle by making the branch decision earlier. But again as the
pipeline becomes longer the problem gets worse.
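To see what the penalty does to performance, the following sketch estimates CPI under the assume-not-taken policy. The branch frequency (20%) and taken fraction (60%) are hypothetical numbers picked for illustration, the function name is an assumption, and data-hazard stalls are ignored.

```python
def cpi_predict_not_taken(branch_fraction, taken_fraction, taken_penalty,
                          base_cpi=1.0):
    """CPI when every taken branch costs `taken_penalty` lost cycles
    and not-taken branches cost nothing."""
    return base_cpi + branch_fraction * taken_fraction * taken_penalty

# Hypothetical workload: 20% of instructions are branches, 60% of them taken.
for penalty in (3, 1):
    cpi = cpi_predict_not_taken(0.20, 0.60, penalty)
    print(f"penalty {penalty} cycle(s): CPI = {cpi:.2f}")
```

With these assumed numbers, cutting the taken-branch penalty from 3 cycles to 1 lowers the CPI from 1.36 to 1.12.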
The better we can predict whether a branch is taken, the smaller the
penalty. This requires dynamic
branch prediction ... keeping track of which branches were
previously taken with a branch
history table. Keeping 1 bit for each recent branch
instruction already helps a lot; a 2-bit history (Fig. 4.63) does even better.
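The difference between 1-bit and 2-bit history can be seen by simulating a single branch. The sketch below is a simplified model, not the MIPS implementation: it tracks one branch only (no table indexing), the 2-bit scheme is assumed to be a saturating counter (one common choice, which may differ in detail from Fig. 4.63), and the outcome pattern is a hypothetical loop branch that is taken 9 times and then falls through, run 10 times.

```python
def mispredictions_1bit(outcomes, last_taken=False):
    """1-bit history: predict whatever the branch did last time."""
    misses = 0
    for taken in outcomes:
        if last_taken != taken:
            misses += 1
        last_taken = taken
    return misses

def mispredictions_2bit(outcomes, counter=0):
    """2-bit saturating counter (0..3): predict taken when counter >= 2.
    The prediction only flips after two wrong guesses in a row."""
    misses = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses

# Hypothetical loop branch: taken 9 times, then not taken; loop run 10 times.
outcomes = ([True] * 9 + [False]) * 10

print("1-bit mispredictions:", mispredictions_1bit(outcomes))
print("2-bit mispredictions:", mispredictions_2bit(outcomes))
```

In this model the 1-bit predictor misses twice per loop execution (on loop exit and again on re-entry), 20 misses in all, while the 2-bit counter misses only on the exit once it has warmed up, 12 misses in all.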
Note: for the final exam, you will be expected to examine
sequences of instructions and answer questions about the effect on
performance: when data forwarding is sufficient, when a stall is
required, how many cycles are lost, whether a branch is correctly
predicted. You will not be asked about the details of how
forwarding, stalling, and branch prediction are implemented in MIPS.