======== START LECTURE #23 ========

  • How does the cache find a memory block?

    Tag size and division of the address bits

    We continue to assume a byte-addressed machine, but all references are to 4-byte words (lw and sw).

    The 2 LOBs are not used (they specify the byte within the word but all our references are for a complete word). We show these two bits in dark blue. We continue to assume 32-bit addresses, so there are 2**30 words in the address space.

    Let's review various possible cache organizations and determine for each how large the tag is and how the various address bits are used. We will always use a 16KB cache. That is, the size of the data portion of the cache is 16KB = 4 kilowords = 2**12 words.

    1. Direct mapped, blocksize 1 (word).
      • Since the blocksize is one word, there are 2**30 memory blocks and all the address bits (except the 2 LOBs that specify the byte within the word) are used for the memory block number. Specifically 30 bits are so used.
      • The cache has 2**12 words, which is 2**12 blocks.
      • So the low order 12 bits of the memory block number give the index in the cache (the cache block number), shown in cyan.
      • The remaining 18 (30-12) bits are the tag, shown in red.
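
    Here is a minimal C sketch of this breakdown (the function and variable names are illustrative, not from the text): it splits a 32-bit byte address into the 18-bit tag, 12-bit index, and 2-bit byte-within-word fields of the direct mapped, blocksize 1 organization above.

        #include <stdio.h>

        /* Direct mapped, blocksize 1 word, 16KB of data:
           2-bit byte offset, 12-bit cache index, 18-bit tag. */
        static void split_address(unsigned addr)
        {
            unsigned byte_in_word = addr & 0x3;           /* the 2 LOBs (unused for lw/sw)   */
            unsigned index        = (addr >> 2) & 0xFFF;  /* low 12 bits of the block number */
            unsigned tag          = addr >> 14;           /* remaining 18 bits               */
            printf("addr=0x%08x  tag=0x%05x  index=0x%03x  byte=%u\n",
                   addr, tag, index, byte_in_word);
        }

        int main(void)
        {
            split_address(0x12345678u);   /* an arbitrary example address */
            return 0;
        }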


    2. Direct mapped, blocksize 8
      • Three bits of the address give the word within the 8-word block. These are drawn in magenta.
      • The remaining 27 HOBs of the memory address give the memory block number.
      • The cache has 2**12 words, which is 2**9 blocks.
      • So the low order 9 bits of the memory block number give the index in the cache.
      • The remaining 18 bits are the tag.

    3. 4-way set associative, blocksize 1
      • Blocksize is 1 so there are 2**30 memory blocks and 30 bits are used for the memory block number.
      • The cache has 2**12 blocks, which is 2**10 sets (each set has 4=2**2 blocks).
      • So the low order 10 bits of the memory block number give the index in the cache.
      • The remaining 20 bits are the tag. As the associativity grows, the tag gets bigger. Why?
        Growing associativity (with the cache size fixed) reduces the total number of sets, so fewer index bits are needed. This increases the number of memory blocks that can map to a given set. Hence more tag bits are needed to see if the desired block is there.

    4. 4-way set associative, blocksize 8
      • Three bits of the address give the word within the block.
      • The remaining 27 HOBs of the memory address give the memory block number.
      • The cache has 2**12 words = 2**9 blocks = 2**7 sets.
      • So the low order 7 bits of the memory block number give the index in the cache.
      • The remaining 20 bits are the tag.
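
    The same arithmetic works for all four organizations. Here is a small C sketch (the helper names are mine, not from the text) that, for our fixed 16KB = 2**12-word data cache and 32-bit byte addresses, computes how the address bits divide for a given blocksize (in words) and associativity.

        #include <stdio.h>

        /* Returns log base 2 of x; x is assumed to be a power of 2. */
        static int log2int(int x)
        {
            int n = 0;
            while (x > 1) { x >>= 1; n++; }
            return n;
        }

        /* 16KB data cache = 2**12 words, 32-bit byte addresses. */
        static void show(int blocksize_words, int assoc)
        {
            int byte_bits  = 2;                              /* byte within word  */
            int word_bits  = log2int(blocksize_words);       /* word within block */
            int num_blocks = (1 << 12) / blocksize_words;
            int num_sets   = num_blocks / assoc;
            int index_bits = log2int(num_sets);
            int tag_bits   = 32 - byte_bits - word_bits - index_bits;
            printf("blocksize=%d assoc=%d: tag=%2d index=%2d word=%d byte=%d\n",
                   blocksize_words, assoc, tag_bits, index_bits, word_bits, byte_bits);
        }

        int main(void)
        {
            show(1, 1);   /* case 1: tag=18 index=12 */
            show(8, 1);   /* case 2: tag=18 index=9  */
            show(1, 4);   /* case 3: tag=20 index=10 */
            show(8, 4);   /* case 4: tag=20 index=7  */
            return 0;
        }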

    Improvement: Multilevel caches

    Modern high end PCs and workstations all have at least two levels of cache: a very fast, and hence not too big, first level (L1) cache together with a larger but slower L2 cache.

    When a miss occurs in L1, L2 is examined; main memory is referenced only if the access misses in L2 as well.

    So the average miss penalty for an L1 miss is

    (L2 hit rate)*(L2 time) + (L2 miss rate)*(L2 time + memory time)
    
    We are assuming the L2 time is the same for an L2 hit or an L2 miss. We are also assuming that the access doesn't begin to go to memory until the L2 miss has occurred. Since the L2 time is paid either way, this simplifies to (L2 time) + (L2 miss rate)*(memory time).

    Let's do an example.
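
    With assumed (purely illustrative) numbers: suppose hitting or missing in L2 takes 10 cycles, a memory reference takes 100 cycles, and the L2 hit rate (for references that missed in L1) is 90%. Then the average miss penalty for an L1 miss is

    (0.90)*(10) + (0.10)*(10 + 100) = 9 + 11 = 20 cycles,

    which agrees with the simplified form 10 + (0.10)*(100) = 20 cycles.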

    7.4: Virtual Memory

    I realize this material was covered in operating systems class (V22.0202). I am just reviewing it here. The goal is to show the similarity to caching, which we just studied. Indeed, (the demand part of) demand paging is caching: In demand paging the memory serves as a cache for the disk, just as in caching the cache serves as a cache for the memory.

    The names used are different and there are other differences as well.

    Cache concept           Demand paging analogue
    ---------------------   ------------------------
    Memory block            Page
    Cache block             Page Frame (frame)
    Blocksize               Pagesize
    Tag                     None (table lookup)
    Word in block           Page offset
    Valid bit               Valid bit
    Associativity           None (fully associative)
    Miss                    Page fault
    Hit                     Not a page fault
    Miss rate               Page fault rate
    Hit rate                1 - Page fault rate
    Placement question      Placement question
    Replacement question    Replacement question