V22.0436 - Prof. Grishman

Lecture 19: Pushing CPU Performance

Pipelining, cont'd (chapter 6)

Pipeline hazards: overlapping instruction execution can give rise to problems.

Pipelining is more complex for CISC machines, because instructions may take different lengths of time to execute. However, RISC-style pipelining is now incorporated into high-performance CISC processors (such as the Pentium) by translating most instructions into a series of RISC-like operations.
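To make one kind of hazard concrete, here is a minimal Python sketch that flags read-after-write (data) dependences between nearby instructions. The instruction encoding, the register names, and the `window` parameter (how many following instructions are still close enough to conflict) are assumptions made for illustration; they do not describe any particular machine's pipeline.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                 # mnemonic, for readability only
    dest: Optional[str]     # register written, or None
    srcs: Tuple[str, ...]   # registers read


def raw_hazards(program: List[Instr], window: int = 2) -> List[Tuple[int, int, str]]:
    """Return (producer, consumer, register) triples for every
    read-after-write dependence in which the consumer follows the
    producer by at most `window` instructions (an assumed distance)."""
    hazards = []
    for i, producer in enumerate(program):
        if producer.dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            consumer = program[j]
            if producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
            if consumer.dest == producer.dest:
                break  # register overwritten; later readers depend on j
    return hazards


# Hypothetical four-instruction program: the add writes r1,
# and the next three instructions all read r1.
program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r1", "r5")),
    Instr("and", "r6", ("r1", "r7")),
    Instr("or",  "r8", ("r1", "r9")),
]
print(raw_hazards(program))
```

With `window=2`, only the `sub` and `and` instructions are reported. The `or` is assumed to be far enough behind the `add` that the result is already available.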

Superscalar (text, section 6.8)

Some machines now try to go beyond pipelining and execute more than one instruction per clock cycle, producing an effective CPI < 1. This is possible if we duplicate some of the functional units of the processor (e.g., have two ALUs or a register file with 4 read ports and 2 write ports) and add logic to issue several instructions concurrently. However, this requires even more complex logic to guard against hazards. Such designs are called superscalar.
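The effect of multiple issue on CPI can be sketched with a toy in-order issue model in Python. It assumes every instruction completes in one cycle, considers only register dependences within an issue group, and ignores structural hazards and branches. The function name `issue_cycles` and the `(dest, srcs)` instruction encoding are hypothetical.

```python
def issue_cycles(program, width=2):
    """Count cycles needed to issue `program` in order, at most `width`
    instructions per cycle. An instruction may not join the current
    group if it reads or writes a register written earlier in that group."""
    cycles = 0
    i = 0
    while i < len(program):
        written = set()   # registers written by the current issue group
        issued = 0
        while i < len(program) and issued < width:
            dest, srcs = program[i]
            if written & set(srcs) or dest in written:
                break     # dependence on the group: wait for next cycle
            if dest is not None:
                written.add(dest)
            i += 1
            issued += 1
        cycles += 1
    return cycles


# Four independent instructions: pairs can issue together.
independent = [
    ("r1", ("r2", "r3")),
    ("r4", ("r5", "r6")),
    ("r7", ("r8", "r9")),
    ("r10", ("r11", "r12")),
]
# A dependence chain: each instruction reads the previous result.
chain = [
    ("r1", ("r2", "r3")),
    ("r4", ("r1", "r5")),
    ("r6", ("r4", "r7")),
    ("r8", ("r6", "r9")),
]
for name, prog in (("independent", independent), ("chain", chain)):
    print(name, "CPI =", issue_cycles(prog) / len(prog))
```

Under these assumptions the independent sequence reaches a CPI of 0.5, while the dependence chain stays at a CPI of 1.0. Duplicated hardware only helps when the instruction stream contains independent work.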

Taking advantage of technology improvements

How has the steady progress in integrated circuit technology been translated into improvements in processor performance?

Technology improvements yield faster and smaller transistors. Faster transistors allow shorter clock cycle times (higher clock rates). Smaller transistors mean that we can put more transistors on a chip (the Pentium III is approaching 10M transistors). What can we do with the increasing number of transistors to improve performance?
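The two levers can be compared using the standard relation CPU time = instruction count x CPI / clock rate. The short Python sketch below uses made-up numbers, chosen only to make the arithmetic easy; they are not measurements of any real processor.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """Execution time in seconds: instructions x cycles per instruction
    divided by cycles per second."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical baseline: 1e9 instructions, CPI 1.5, 500 MHz clock.
baseline = cpu_time(1e9, 1.5, 500e6)
# Faster transistors: double the clock rate, same CPI.
faster_clock = cpu_time(1e9, 1.5, 1000e6)
# More transistors spent on parallel hardware: halve the CPI, same clock.
lower_cpi = cpu_time(1e9, 0.75, 500e6)

print(baseline, faster_clock, lower_cpi)
```

In this idealized model, doubling the clock rate and halving the CPI give the same speedup.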

All of these techniques can be observed in the progress of x86 implementations:

Architectural Approaches

At some point most of these methods have diminishing returns; it is very hard to squeeze out additional parallelism from a serial architecture. New architectural approaches are needed:

Spring 1999