Computer Architecture

Start Lecture #20

7.2: The Basics of Caches

We start with a very simple cache organization, one that was used on the DECstation 3100, a 1980s workstation.

Accessing a Cache

On the right is a pictorial example for a direct mapped cache with 4 blocks and a memory with 16 blocks.

How can we tell if a memory block is in the cache?

Along with the data, we store the tag, i.e., the high-order address bits that were not used to select the cache block. A reference is a hit if the stored tag matches the tag of the requested address.

Also stored is a valid bit per cache block so that we can tell whether any memory block is stored in this cache block.

For example, when the system is powered on, all the cache blocks are invalid.

Addr (base 10)   Addr (base 2)   hit/miss   block#
      22             10110         miss       110
      26             11010         miss       010
      22             10110         hit        110
      26             11010         hit        010
      16             10000         miss       000
       3             00011         miss       011
      16             10000         hit        000
      18             10010         miss       010
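The table above can be reproduced with a minimal simulation (a sketch, not from the text): an 8-block direct-mapped cache in which the block index is the low 3 address bits, the tag is the remaining high bits, and each cache block has one valid bit.

```python
# Sketch: simulate the 8-block direct-mapped cache from the table above.
# Block index = address mod 8 (low 3 bits); tag = remaining high bits.

def simulate(refs, num_blocks=8):
    valid = [False] * num_blocks
    tags = [None] * num_blocks
    results = []
    for addr in refs:
        index = addr % num_blocks      # low-order bits select the cache block
        tag = addr // num_blocks       # high-order bits form the tag
        if valid[index] and tags[index] == tag:
            results.append((addr, "hit", index))
        else:
            results.append((addr, "miss", index))
            valid[index] = True        # install the block on a miss
            tags[index] = tag
    return results

for addr, outcome, index in simulate([22, 26, 22, 26, 16, 3, 16, 18]):
    print(f"{addr:2d}  {addr:05b}  {outcome:4s}  {index:03b}")
```

Note the last reference: 18 maps to block 010, which holds the block for address 26 (a different tag), so it misses even though the cache block is valid.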

Consider the example on page 476.


The circuitry needed for this simple cache (direct mapped, block size 1, all references to 1 word) to determine if we have a hit or a miss, and to return the data in case of a hit is quite easy. We are showing a 1024 word (= 4KB) direct mapped cache with block size = reference size = 1 word.

Make sure you understand the division of the 32-bit address into 20, 10, and 2 bits (the tag, the cache index, and the byte offset within the word, respectively).
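The 20/10/2 split can be sketched in a few lines (an illustration, not from the text; the example address is arbitrary): the low 2 bits give the byte offset within the word, the next 10 bits index one of the 1024 cache blocks, and the high 20 bits are the tag.

```python
# Sketch: decompose a 32-bit byte address for the 1024-word cache.
# bits 1..0  -> byte offset within the word
# bits 11..2 -> cache index (10 bits, 0..1023)
# bits 31..12 -> tag (20 bits)

def split_address(addr):
    byte_offset = addr & 0x3
    index = (addr >> 2) & 0x3FF
    tag = addr >> 12
    return tag, index, byte_offset

tag, index, offset = split_address(0x12345678)
print(hex(tag), index, offset)   # 0x12345 414 0
```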

Calculate on the board the total number of bits in this cache.
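The calculation is simple arithmetic: each of the 1024 entries holds a 32-bit data word, a 20-bit tag, and 1 valid bit.

```python
# Each cache entry: 32 data bits + 20 tag bits + 1 valid bit = 53 bits.
entries = 1024
bits_per_entry = 32 + 20 + 1
total_bits = entries * bits_per_entry
print(total_bits)   # 54272 bits for a cache nominally called "4KB"
```

So the cache actually stores 54,272 bits, noticeably more than the 32,768 bits of data that give it its nominal 4KB size.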

Homework: 7.2 7.3 7.4

Processing a Read for this Simple Cache

The action required for a hit is obvious, namely return the data found to the processor.

For a miss, the best action is fairly clear, but requires some thought.

Handling Cache Misses

We can skip much of this section as it discusses the multicycle and pipelined implementations of chapter 6, which we skipped. For the single cycle processor implementation we just need to note a few points.

Handling Writes

Processing a write for our simple cache (direct mapped with block size = reference size = 1 word).

We have 4 possibilities: for a write hit we must choose between write-through and write-back; for a write miss we must choose between write-allocate and write-no-allocate (also called store-allocate and store-no-allocate, among other names).

Write through: Write the data to memory as well as to the cache.

Write back: Don't write to memory now, do it later when this cache block is evicted.
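The two write-hit policies can be contrasted in a small sketch (the class and its methods are hypothetical, for illustration only): both update the cache, but write-through also sends the word to memory immediately, while write-back only marks the block dirty and defers the memory write until eviction.

```python
# Sketch (not the DECstation design): write-hit handling under the two
# policies, for a direct-mapped cache with block size = 1 word.

class Cache:
    def __init__(self, num_blocks, write_back):
        self.write_back = write_back
        self.data = [0] * num_blocks
        self.dirty = [False] * num_blocks
        self.memory = {}                    # stands in for main memory

    def write_hit(self, index, value, addr):
        self.data[index] = value            # always update the cache
        if self.write_back:
            self.dirty[index] = True        # defer the memory write
        else:
            self.memory[addr] = value       # write through to memory now

    def evict(self, index, addr):
        # addr is the memory address of the block currently in this slot
        if self.write_back and self.dirty[index]:
            self.memory[addr] = self.data[index]   # write back on eviction
            self.dirty[index] = False
```

With write-through, memory is always up to date; with write-back, memory is stale until the dirty block is evicted, which is why eviction requires extra work.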

The fact that an eviction must trigger a write to memory for write-back caches explains the comment above that the write hit policy affects the read miss policy.

Write-allocate: Allocate a slot and write the new data into the cache (recall we have a write miss). The handling of the eviction that this allocation (probably) causes depends on the write hit policy.

  1. If the cache is write through, discard the old data (since it is in memory) and write the new data to memory (as well as in the cache).
  2. If the cache is write back, the old data must now be written back to memory, but the new data is not written to memory.

Write-no-allocate: Leave the cache alone and just write the new data to memory.

Write-no-allocate is not normally as effective as write-allocate due to temporal locality: a location just written is likely to be referenced again soon, and under write-no-allocate that reference will miss.

The simplest policy is write-through, write-allocate. The DECstation 3100 discussed above adopted this policy and performed the following actions for every write, hit or miss (recall that, for the 3100, block size = reference size = 1 word and the cache is direct mapped).

  1. Index the cache using the correct low-order bits (i.e., not the very lowest-order bits, since those give the byte offset).
  2. Write the data and the tag into the cache.
  3. Set Valid to true.
  4. Send request to main memory.
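The four steps above can be sketched as follows (an illustration, assuming the 1024-block cache and 20/10/2 address split discussed earlier; the names are mine, not the text's).

```python
# Sketch of the write-through, write-allocate procedure above, for a
# 1024-block direct-mapped cache with one-word blocks and 32-bit
# byte addresses.

NUM_BLOCKS = 1024
valid = [False] * NUM_BLOCKS
tags = [0] * NUM_BLOCKS
data = [0] * NUM_BLOCKS
memory = {}                            # stands in for main memory

def write(addr, value):
    index = (addr >> 2) % NUM_BLOCKS   # step 1: index with the correct
                                       # low-order bits, skipping the
                                       # 2 byte-offset bits
    data[index] = value                # step 2: write the data ...
    tags[index] = addr >> 12           # ... and the tag into the cache
    valid[index] = True                # step 3: set Valid to true
    memory[addr] = value               # step 4: send request to main memory

write(0x1000, 42)
```

Note that the same four steps run whether the write was a hit or a miss; on a miss the old contents of the slot are simply overwritten, which is safe because write-through guarantees memory already holds them.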

Although the above policy has the advantage of simplicity, it is out of favor due to its poor performance.

Improvement: Use a Write Buffer

Unified vs Split I and D (Instruction and Data) Caches

Given a fixed total size (in bytes) for the cache, is it better to have two caches, one for instructions and one for data; or is it better to have a single unified cache?