V22.0436 - Prof. Grishman
Lecture 16: Measuring Performance (cont'd)
Some Popular Metrics
- MHz (clock speed). Only a useful measure when comparing processors
of the same architecture.
- MIPS (millions of instructions per second). Only meaningful when comparing
machines with the same architecture, since some architectures may require
substantially more instructions than others for the same program. MIPS also
can be very dependent on the mix of instructions and hence on the program
used to measure MIPS. (This is particularly true for machines with specialized
instructions for graphics or video, such as the MMX instructions
on Intel processors such as the Pentium III.) Some manufacturers report "peak MIPS" on carefully
designed but useless programs.
- MFLOPS (millions of floating point operations per second). Of
interest for scientific applications. Again, deceptive "peak MFLOPS" figures are
often reported (particularly for the fastest machines: see Top500).
- SPEC benchmarks: SPECint, SPECfp, ...
- These are maintained by the Standard
Performance Evaluation Corporation.
- SPEC started with CPU metrics but now provides a wide range of
metrics, including graphics tasks, Java support, and Web servers.
- CPU metrics are based on the execution time for standard sets
of programs. More realistic than other metrics, but must make sure that the
benchmarks reasonably reflect the actual application.
- SPECint for integer applications, SPECfp for floating point
applications. Score is elapsed time on a standard machine divided
by elapsed time on the machine being evaluated.
- Elapsed time is not the best measure for multiple-CPU or
multiple-core machines, so for such machines a SPECrate is quoted, which
effectively measures the number of times the benchmark can be executed
in a given period.
- Several metrics have been developed specifically for Intel PCs
(Intel's iCOMP, Ziff-Davis' CPUmark). Some labs report
a variety of benchmarks to better match the range of possible applications.
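To make the contrast between MIPS and a SPEC-style score concrete, here is a minimal Python sketch. All of the numbers and the function names (mips, spec_ratio) are hypothetical, chosen only for illustration; the sketch handles a single program, whereas published SPEC figures combine the results for a whole suite of programs.

```python
def mips(instruction_count, seconds):
    """Millions of instructions executed per second."""
    return instruction_count / (seconds * 1e6)


def spec_ratio(reference_seconds, measured_seconds):
    """SPEC-style score: elapsed time on the standard (reference)
    machine divided by elapsed time on the machine being evaluated."""
    return reference_seconds / measured_seconds


# Hypothetical measurements: the same program compiled for two
# architectures. Machine A needs twice as many instructions as B.
a_instructions, a_seconds = 12_000_000_000, 10.0
b_instructions, b_seconds = 6_000_000_000, 8.0
reference_seconds = 20.0  # assumed time on the standard machine

mips_a = mips(a_instructions, a_seconds)
mips_b = mips(b_instructions, b_seconds)
score_a = spec_ratio(reference_seconds, a_seconds)
score_b = spec_ratio(reference_seconds, b_seconds)

print(f"A: {mips_a:.0f} MIPS, score {score_a:.2f}")
print(f"B: {mips_b:.0f} MIPS, score {score_b:.2f}")
```

With these assumed numbers, machine A has the higher MIPS rating but machine B finishes the program sooner and so earns the higher score, which is why MIPS is only meaningful within one architecture.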
How Architecture Affects Performance
Our goal is to minimize the product of the three factors (number of instructions
executed, average CPI, clock cycle time). Whenever we consider a change
to the architecture, we must evaluate its effect on each of these factors.
In particular, when we add an instruction to the instruction set, we
must consider whether it can significantly reduce the number of instructions
to execute (the first factor) without affecting the time per instruction
(the last two factors). A specialized instruction may be used only rarely
by a compiler (most of the execution time is spent on a small number of
instructions; see Figure 3.26 for the distribution of
instructions for the SPEC benchmarks). On the other hand, if the
instruction requires a longer
data path, it may require a longer clock cycle. The net effect would be
a slower machine. [We ignore the issue of code size, which is much less
important than it used to be because memory is so much cheaper.]
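The trade-off described above can be sketched numerically. The following Python fragment is only an illustration under assumed values: the instruction counts, the CPI, the cycle times, and the function name execution_time are all hypothetical and do not come from any measured machine.

```python
def execution_time(instructions, cpi, cycle_time_ns):
    """Execution time in seconds as the product of the three factors:
    instructions executed x average CPI x clock cycle time."""
    return instructions * cpi * cycle_time_ns * 1e-9


# Baseline machine (assumed values).
base = execution_time(instructions=1_000_000_000, cpi=1.5, cycle_time_ns=1.0)

# Hypothetical new instruction: it is used rarely, so the program
# executes only 2% fewer instructions, but the longer data path
# stretches the clock cycle by 5%. CPI is assumed unchanged.
extended = execution_time(instructions=980_000_000, cpi=1.5, cycle_time_ns=1.05)

slowdown = extended / base
print(f"baseline: {base:.4f} s, with new instruction: {extended:.4f} s")
print(f"relative execution time: {slowdown:.3f}")
```

Under these assumptions the relative execution time comes out at about 1.029, so the machine with the extra instruction is roughly 3% slower even though it executes fewer instructions.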
Good candidates for instructions are those which would be used frequently
and would take much longer if performed by a sequence of other instructions,
for example, floating point operations for scientific applications.
The increased use of RISC machines, starting in the mid-80's, led to
a more careful assessment of the benefits and costs of adding instructions
to the instruction set.
However, the issue of binary code compatibility remains very important,
if not overwhelming. The development of entirely new machine architectures
has decreased since the 90's, with the Intel PC architecture, and its
successors, increasingly dominant. As we shall discuss later, current
microprocessors achieve both speed and code compatibility by
translating Intel instructions into a RISC-like instruction set