V22.0436 - Prof. Grishman

Lecture 16: Pipelining -- Some details

Pipeline Control

The pipeline registers have to hold all the information required to complete execution of the instruction.  This includes both data and control information.  Fig. 4.46 shows the processor with the control signals it requires;  Fig. 4.50 shows how these control signals are passed along from pipeline register to pipeline register.  Fig. 4.51 puts these together.

Structural Hazards

Structural hazards were discussed in the overview (Lecture 15).

Data Hazards (Text 4.7)

Consider the code from p. 363 of the text:

sub   $2,$1,$3
and   $12,$2,$5
or    $13,$6,$2
add   $14,$2,$2
sw    $15,100($2)

Figure 4.52 shows the pipeline dependencies in this sequence, and Fig. 4.53 shows how some of them can be addressed by sending the output of the ALU from one instruction directly to a following instruction, bypassing the register file ("forwarding").  Fig. 4.54 shows the additions to the data path to do this.
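The dependences on $2 in the sequence above can be classified with a short Python sketch. It is only an illustration, not the hardware logic of Fig. 4.54. It assumes the standard five-stage pipeline, that every producer is an ALU instruction (no loads), and that the register file is written in the first half of a cycle and read in the second half, so a consumer three or more instructions behind the producer needs no forwarding. The tuple layout and the function name are made up for this example.

```python
# Each instruction: (text, destination register or None, list of source registers).
# This encoding is an assumption made for the sketch.
PROGRAM = [
    ("sub $2,$1,$3",    2,    [1, 3]),
    ("and $12,$2,$5",   12,   [2, 5]),
    ("or  $13,$6,$2",   13,   [6, 2]),
    ("add $14,$2,$2",   14,   [2, 2]),
    ("sw  $15,100($2)", None, [15, 2]),
]


def classify_dependences(program):
    """Return (producer index, consumer index, register, how it is resolved)."""
    results = []
    for j, (_, _, sources) in enumerate(program):
        for reg in sorted(set(sources)):
            # Find the most recent earlier instruction that writes reg.
            for i in range(j - 1, -1, -1):
                if program[i][1] == reg:
                    distance = j - i
                    if distance == 1:
                        how = "forward from EX/MEM"
                    elif distance == 2:
                        how = "forward from MEM/WB"
                    else:
                        # Assumed: write in first half-cycle, read in second.
                        how = "register file"
                    results.append((i, j, reg, how))
                    break
    return results


for producer, consumer, reg, how in classify_dependences(PROGRAM):
    print(f"{PROGRAM[consumer][0]:<18} needs ${reg} from "
          f"'{PROGRAM[producer][0]}': {how}")
```

Under these assumptions only the and and the or need forwarded values; the add and the sw can read $2 from the register file.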

Not all data hazards can be addressed by forwarding.  Sometimes a result is really needed before it is available;  for example, if a load is immediately followed by an R-type instruction which uses the loaded data (p. 372, Figure 4.58):

lw    $2,20($1)
and   $4,$2,$5

In that case the only thing we can do is wait ("stall").  We do this by changing the second instruction going down the pipeline to a "no operation" and re-issuing it in the next cycle (Fig. 4.59).
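The effect of the stall on the instruction stream can be sketched in Python as follows. This is a simplified model, not the hazard detection unit of the text: it assumes the five-stage pipeline with forwarding, treats every instruction that reads a just-loaded register as needing a one-cycle stall, and uses a hypothetical tuple encoding (text, opcode, destination, sources).

```python
def insert_bubbles(program):
    """Return the instruction stream with a bubble after each load-use pair.

    program is a list of (text, opcode, dest, sources) tuples; this layout
    is an assumption made for the sketch.
    """
    issued = []
    prev_op, prev_dest = None, None
    for text, op, dest, sources in program:
        # A load followed immediately by a reader of its result: the loaded
        # value is not available in time, even with forwarding.
        if prev_op == "lw" and prev_dest in sources:
            issued.append("nop (bubble)")
        issued.append(text)
        prev_op, prev_dest = op, dest
    return issued


PROGRAM = [
    ("lw  $2,20($1)", "lw",  2, [1]),
    ("and $4,$2,$5",  "and", 4, [2, 5]),
]

for line in insert_bubbles(PROGRAM):
    print(line)
```

For the two-instruction example, the sketch places one bubble between the lw and the and, which corresponds to one lost cycle.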

As the length of the pipeline increases, this problem gets worse:  more stalls are required and the CPI (which under ideal conditions is 1) goes up.  The compiler can reduce this effect by judicious scheduling of instructions.  In particular, for our pipelined MIPS, it can move loads earlier so that there is at least one unrelated instruction between a load and the instruction which uses the result of the load.
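To see what scheduling buys, the sketch below counts load-use stalls for a short instruction sequence before and after moving a load earlier. The sequence, the register names, and the tuple encoding (opcode, destination, sources) are illustrative assumptions, and the CPI figure ignores the cycles needed to fill the pipeline.

```python
def count_load_use_stalls(program):
    """Count one-cycle stalls: a load immediately followed by a user of its result."""
    stalls = 0
    for (op, dest, _), (_, _, sources) in zip(program, program[1:]):
        if op == "lw" and dest in sources:
            stalls += 1
    return stalls


def effective_cpi(program):
    """Cycles per instruction, counting only load-use stalls (fill time ignored)."""
    return (len(program) + count_load_use_stalls(program)) / len(program)


# Hypothetical sequence: two sums, each using a value loaded just before it.
original = [
    ("lw",  "t1", ["t0"]),
    ("lw",  "t2", ["t0"]),
    ("add", "t3", ["t1", "t2"]),   # uses t2 right after its load: stall
    ("sw",  None, ["t3", "t0"]),
    ("lw",  "t4", ["t0"]),
    ("add", "t5", ["t1", "t4"]),   # uses t4 right after its load: stall
    ("sw",  None, ["t5", "t0"]),
]

# Same work with the third load moved up, so no load is followed by its user.
scheduled = [
    ("lw",  "t1", ["t0"]),
    ("lw",  "t2", ["t0"]),
    ("lw",  "t4", ["t0"]),
    ("add", "t3", ["t1", "t2"]),
    ("sw",  None, ["t3", "t0"]),
    ("add", "t5", ["t1", "t4"]),
    ("sw",  None, ["t5", "t0"]),
]

print(count_load_use_stalls(original), effective_cpi(original))
print(count_load_use_stalls(scheduled), effective_cpi(scheduled))
```

In this example the original order incurs two stalls in seven instructions (CPI 9/7), while the scheduled order incurs none (CPI 1).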

Control Hazards (Text 4.8)

Conceptually the simplest solution is to stall after a branch, waiting until the branch has been resolved (decided).  That exacts a heavy speed penalty.  Almost as simple is to assume that a branch is not taken and to continue issuing instructions along that path.  We will know whether the branch is taken before any subsequent instructions store results into registers or memory.  If it turns out that the branch is taken, we flush the pipeline (turn all operations in the pipeline into no-operations), reset the PC, and continue.

With the standard pipeline, we discard 3 instructions (lose 3 cycles) if a branch is taken.  P&H show how this can be reduced to one cycle by making the branch decision earlier.  But again as the pipeline becomes longer the problem gets worse.
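The cost of taken branches under the assume-not-taken scheme can be estimated with a one-line model. The sketch below assumes an otherwise ideal CPI of 1 and no other stalls; the branch frequency and taken fraction are made-up numbers chosen only to illustrate the arithmetic.

```python
def cpi_with_branches(branch_fraction, taken_fraction, taken_penalty):
    """Ideal CPI of 1 plus the average cycles lost to taken branches.

    branch_fraction: fraction of instructions that are branches
    taken_fraction:  fraction of those branches that are taken
    taken_penalty:   cycles lost each time a branch is taken
    """
    return 1.0 + branch_fraction * taken_fraction * taken_penalty


# Hypothetical workload: 20% branches, half of them taken.
BRANCHES, TAKEN = 0.20, 0.50

print(round(cpi_with_branches(BRANCHES, TAKEN, 3), 2))  # 3-cycle penalty
print(round(cpi_with_branches(BRANCHES, TAKEN, 1), 2))  # 1-cycle penalty
```

With these assumed numbers the 3-cycle penalty gives a CPI of about 1.3 and the 1-cycle penalty about 1.1.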

The better we can predict whether a branch is taken, the smaller the penalty.  This requires dynamic branch prediction ... keeping track of which branches were previously taken with a branch history table.  Keeping 1 bit for each recent branch instruction already helps a lot;  a 2-bit history (Fig. 4.63) does even better.
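The difference between 1-bit and 2-bit history can be tried out with a small simulation. The sketch below uses a saturating counter, which is one common 2-bit scheme and is not necessarily the exact state machine of Fig. 4.63. It tracks a single branch, starts in the strongest "taken" state, and uses a made-up outcome pattern (a loop branch taken 9 times and then not taken, repeated 3 times).

```python
def mispredictions(outcomes, bits):
    """Count mispredictions of one branch using a bits-wide saturating counter."""
    max_state = (1 << bits) - 1
    threshold = 1 << (bits - 1)    # states >= threshold predict "taken"
    state = max_state              # assumption: start in the strongest taken state
    wrong = 0
    for taken in outcomes:
        predicted_taken = state >= threshold
        if predicted_taken != taken:
            wrong += 1
        # Move toward the actual outcome, saturating at both ends.
        if taken:
            state = min(state + 1, max_state)
        else:
            state = max(state - 1, 0)
    return wrong


# Hypothetical loop branch: taken 9 times, then falls through; loop run 3 times.
outcomes = ([True] * 9 + [False]) * 3

print(mispredictions(outcomes, bits=1))  # 1-bit history
print(mispredictions(outcomes, bits=2))  # 2-bit history
```

With one bit, each loop exit causes two mispredictions (the exit itself and the first iteration of the next run), for 5 in total here; with two bits only the exit is mispredicted, for 3 in total.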

Note:  for the final exam, you will be expected to examine sequences of instructions and answer questions about the effect on performance:  when data forwarding is sufficient, when a stall is required, how many cycles are lost, whether a branch is correctly predicted.  You will not be asked about the details of how forwarding, stalling, and branch prediction are implemented in MIPS.