How do we compute the net effect on performance? We need to compute
cycle time * average CPI, the average time per instruction. Average CPI depends in turn on the
frequency of instructions: in computing the average, we need to weight
the CPI of each instruction by its relative frequency. These relative
frequencies are determined by simulating a variety of programs and counting the
occurrences of each instruction. The frequencies for SPECint and SPECfp for MIPS
are shown in P&H, Figure 3.26.
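The weighted-average calculation can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the instruction mix is hypothetical (it is not the data from Figure 3.26), the per-class cycle counts assume a five-step multicycle design (loads 5, stores 4, ALU operations 4, branches and jumps 3), and the two cycle times are invented round numbers.

```python
def average_cpi(mix, cycles):
    """Weight each instruction class's cycle count by its relative frequency."""
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("relative frequencies must sum to 1")
    return sum(freq * cycles[cls] for cls, freq in mix.items())


def time_per_instruction_ps(cycle_time_ps, cpi):
    """Average time per instruction = cycle time * average CPI."""
    return cycle_time_ps * cpi


# Hypothetical instruction mix (NOT taken from P&H Figure 3.26).
MIX = {"load": 0.25, "store": 0.10, "alu": 0.52, "branch": 0.11, "jump": 0.02}

# Assumed cycle counts per instruction class for a multicycle design.
MULTICYCLE_CYCLES = {"load": 5, "store": 4, "alu": 4, "branch": 3, "jump": 3}

if __name__ == "__main__":
    cpi = average_cpi(MIX, MULTICYCLE_CYCLES)
    # Hypothetical cycle times: 200 ps multicycle, 800 ps single-cycle.
    multi = time_per_instruction_ps(200, cpi)
    single = time_per_instruction_ps(800, 1.0)
    print(f"average CPI: {cpi:.2f}")
    print(f"multicycle:   {multi:.0f} ps per instruction")
    print(f"single-cycle: {single:.0f} ps per instruction")
```

With these assumed numbers the average CPI comes to about 4.12, so the multicycle design needs about 824 ps per instruction against 800 ps for the single-cycle design. Changing the mix or the cycle times moves the comparison in either direction.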
The net gains from this design are small or negative for this
instruction set (p. 331). They would be greater if we used a
faster clock (say, 100 ps) and allowed 2 clock cycles for memory
operations, or if the instruction set included more complex
instructions such as a multiply.
However, the main significance of the multicycle design is as a step
towards a pipelined design.
The control unit must be more complex to handle the sequential execution: we create a finite-state machine in which the transition between steps is determined by the opcode. The state transition diagram for this machine is shown in Fig. 5.38, page 339. It is implemented using a state register and combinational logic that both determines the next state and sets the control lines (Fig. 5.37, p. 338). This control unit can be optimized down to individual gates, as was done for the combinational control unit of the single-cycle design. Alternatively, we can employ a microprogrammed design, in which the tables for the control unit (the state transition table and the output table) are stored directly in a microprogram memory. This provides a more uniform structure and a design which is easier to change.
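The table-driven view of the control unit can be illustrated with a small Python sketch. It is a simplified model, not a transcription of Fig. 5.38: the state names are hypothetical, the control-signal names are abbreviated and illustrative, and only five opcode classes are handled. The two dictionaries play the role of the state transition table and the output table.

```python
FETCH, DECODE = "fetch", "decode"

# State transition table: (current state, opcode class) -> next state.
# An entry whose opcode is None applies to every opcode.
# State names are hypothetical, chosen for readability.
NEXT_STATE = {
    (FETCH, None): DECODE,
    (DECODE, "lw"): "mem_addr",
    (DECODE, "sw"): "mem_addr",
    (DECODE, "rtype"): "execute",
    (DECODE, "beq"): "branch",
    (DECODE, "j"): "jump",
    ("mem_addr", "lw"): "mem_read",
    ("mem_addr", "sw"): "mem_write",
    ("mem_read", None): "write_back",
    ("execute", None): "alu_write",
    # Final step of each instruction: return to fetch.
    ("write_back", None): FETCH,
    ("mem_write", None): FETCH,
    ("alu_write", None): FETCH,
    ("branch", None): FETCH,
    ("jump", None): FETCH,
}

# Output table: state -> control lines asserted in that state.
# Signal names are illustrative and the sets are abbreviated.
OUTPUTS = {
    FETCH: {"MemRead", "IRWrite", "PCWrite"},
    DECODE: set(),
    "mem_addr": {"ALUSrcA"},
    "mem_read": {"MemRead", "IorD"},
    "write_back": {"RegWrite", "MemtoReg"},
    "mem_write": {"MemWrite", "IorD"},
    "execute": {"ALUSrcA"},
    "alu_write": {"RegWrite", "RegDst"},
    "branch": {"ALUSrcA", "PCWriteCond"},
    "jump": {"PCWrite"},
}


def next_state(state, opcode):
    """Look up the next state, preferring an opcode-specific entry."""
    key = (state, opcode)
    if key in NEXT_STATE:
        return NEXT_STATE[key]
    return NEXT_STATE[(state, None)]


def run_instruction(opcode):
    """Return the list of states visited while executing one instruction."""
    state = FETCH
    trace = []
    while True:
        trace.append(state)
        state = next_state(state, opcode)
        if state == FETCH:
            return trace


if __name__ == "__main__":
    for op in ("lw", "sw", "rtype", "beq", "j"):
        steps = run_instruction(op)
        print(op, len(steps), "cycles:", " -> ".join(steps))
        for s in steps:
            print("   ", s, sorted(OUTPUTS[s]))
```

Because the behaviour lives entirely in the two tables, adding or changing an instruction means editing table entries rather than rewriting logic, which is the flexibility argument made for microprogramming above.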
The simplest such overlap is instruction fetch overlap: fetch the next instruction while executing the current instruction. Even relatively simple processors employed such overlap.
Greater gain can be achieved by overlapping the execution (register fetch, ALU operation, ...) of successive instructions. A full pipelining scheme overlaps such operations completely, resulting ideally in a CPI (cycles per instruction) of 1. However, machines that employ such overlap must deal with data and branch hazards: cases in which an instruction influences later instructions that are already in the pipeline. This makes the design of pipelined machines much more complex.
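A rough way to see why the ideal CPI is 1 is to count cycles. In the sketch below, the five-stage depth and the stall count are assumptions chosen for illustration, and hazards are modelled crudely as a lump sum of extra stall cycles rather than being detected instruction by instruction.

```python
def pipeline_cycles(n_instructions, stages, stall_cycles=0):
    """Cycles to run n instructions on a pipeline with the given depth.

    The first instruction takes `stages` cycles to complete; after that,
    one instruction finishes per cycle unless stalls are inserted.
    """
    if n_instructions <= 0:
        return 0
    return stages + (n_instructions - 1) + stall_cycles


def pipelined_cpi(n_instructions, stages, stall_cycles=0):
    """Average cycles per instruction for the whole run."""
    return pipeline_cycles(n_instructions, stages, stall_cycles) / n_instructions


if __name__ == "__main__":
    # Assumed: 5 pipeline stages, 1000 instructions.
    print(pipelined_cpi(1000, 5))        # no hazards: close to 1
    # Assumed: 200 stall cycles caused by data and branch hazards.
    print(pipelined_cpi(1000, 5, 200))   # hazards push the CPI above 1
```

For a long instruction stream the fill time of the pipeline becomes negligible, so the CPI approaches 1; any cycles lost to hazards add directly to the total.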