Computer Architecture

Start Lecture #26

Remark: I expect the final exam to be on the 7th floor like the midterm. A practice final is on the web.

Remark: Covered Tag Size and Division of Address Bits which was inadvertently omitted.

Controller Time

Not much to say. It is typically small. We will use 0 ms (i.e., ignore this time).

Queuing Delays

This can be the largest component, but we will ignore it since it is not a function of the architecture, but rather of the load and OS.

Dependability, Reliability, and Availability

Reliability measures the length of time during which service is continuously delivered as expected.

An example reliability measure is mean time to failure (MTTF), which measures the average length of time that the system is delivering service as expected. Bigger values are better.

Another important measure is mean time to repair (MTTR), which measures how long the system is not delivering service as expected. Smaller values are better.

Finally we have mean time between failures (MTBF).
MTBF = MTTF + MTTR

One might think that having a large MTBF is good, but that is not necessarily correct. Consider a system with a certain MTBF, and have the repair center deliberately add an extra hour to every repair. Poof, the MTBF goes up by one hour, even though the system now delivers service for a smaller fraction of the time!
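The point about padding the repair time can be checked with a little arithmetic. The sketch below uses made-up numbers (an MTTF of 1000 hours and an MTTR of 2 hours are assumptions, not measurements). It also computes availability as MTTF / (MTTF + MTTR), which is the usual textbook definition; these notes do not state it explicitly.

```python
def mtbf(mttf_hours, mttr_hours):
    """Mean time between failures, using MTBF = MTTF + MTTR."""
    return mttf_hours + mttr_hours


def availability(mttf_hours, mttr_hours):
    """Fraction of time service is delivered (assumed textbook definition)."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical system: runs 1000 hours between failures, 2 hours to repair.
honest = (mtbf(1000, 2), availability(1000, 2))

# Same system, but every repair is deliberately padded by one hour.
padded = (mtbf(1000, 3), availability(1000, 3))

print("honest repair: MTBF =", honest[0], "availability =", honest[1])
print("padded repair: MTBF =", padded[0], "availability =", padded[1])
```

The padded system has the larger MTBF but the smaller availability, which is why MTBF alone is a poor figure of merit.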

RAID

The acronym was coined by Patterson and his students. It stands for Redundant Array of Inexpensive Disks. Now it is often redefined as Redundant Array of Independent Disks.

RAID comes in several flavors often called levels.

No Redundancy (RAID 0)

The base, non-RAID, case from which the others are built.

Mirroring (RAID 1)

Two disks containing the same content.

You read from either; in particular concurrent reads of different blocks are possible.
You write a block to both disks, possibly concurrently.
The storage overhead is 100%, i.e. the required storage is doubled.

Error Detecting and Correcting Code (RAID 2)

Often called ECC (error correcting code or error checking and correcting code). Widely used in RAM, not used in RAID.

Bit-Interleaved Parity (RAID 3)

Normally byte-interleaved or several-byte-interleaved. For most applications, RAID 4 is better.

Block-Interleaved Parity (RAID 4)

Striping a.k.a. Interleaving

To increase performance, rather than reliability and availability, it is a good idea to stripe or interleave blocks across several disks. In this scheme block n is stored on disk n mod k, where k is the number of disks. The integer quotient of n divided by k (i.e., n/k rounded down) is called the stripe number. For example, if there are 4 disks, stripe number 0 (the first stripe) consists of block 0, which is stored on disk 0, block 1 stored on disk 1, block 2 stored on disk 2, and block 3 stored on disk 3. Stripe 1 (like all stripes in this example) also contains 4 blocks. The first one is block 4, which is stored on disk 0.

Striping is especially good if one is accessing full stripes in which case all the blocks in the stripe can be read concurrently.
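The block-to-disk mapping just described is only a remainder and an integer quotient. Here is a minimal sketch; the function name locate is made up for illustration.

```python
def locate(n, k):
    """Return (disk, stripe) for block n striped across k disks."""
    return n % k, n // k


k = 4  # number of disks, as in the example in the text
for n in range(8):
    disk, stripe = locate(n, k)
    print(f"block {n}: disk {disk}, stripe {stripe}")
```

With k = 4 this places blocks 0-3 on disks 0-3 in stripe 0 and block 4 on disk 0 in stripe 1, matching the example.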

RAID 4

RAID 4 combines striping and parity. In addition to the k so-called data disks used in striping, one has a single parity disk that contains the parity of the stripe.

Consider all k data blocks in one stripe. Extend this stripe to k+1 blocks by including the corresponding block on the parity disk. The block on the parity disk is calculated as the bitwise exclusive OR of the k data blocks.

Thus a stripe contains k data blocks and one parity block, which is the exclusive OR of the data blocks.

The great news is that any block in the stripe, parity or data, is the exclusive OR of the other k. This means we can survive the failure of any one disk.

For example, let k=4 and let the data blocks be A, B, C, and D.

If the parity disk fails, we can easily recreate it since, by definition, the parity block for this stripe is
A ⊕ B ⊕ C ⊕ D
which is the exclusive OR of the other blocks.
If a data disk fails (say the one holding D), we can again recreate it since, by the commutative and associative properties of XOR and the fact that X ⊕ X = 0,
A ⊕ B ⊕ C ⊕ the parity block = A ⊕ B ⊕ C ⊕ (A ⊕ B ⊕ C ⊕ D) = D
and again the missing block is the exclusive OR of the remaining blocks.
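The recovery argument can be tried out on small byte strings. This is only a sketch: the two-byte blocks are arbitrary made-up values and xor_blocks is a hypothetical helper, not part of any RAID software.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Bitwise XOR of several equal-length byte blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Four made-up data blocks (k = 4), two bytes each.
A, B, C, D = b"\x0f\xf0", b"\x33\x33", b"\x55\xaa", b"\xff\x00"

# The parity block is the XOR of the data blocks.
P = xor_blocks([A, B, C, D])

# Lose the disk holding D: XOR the k surviving blocks to rebuild it.
recovered_D = xor_blocks([A, B, C, P])
print(recovered_D == D)

# The same works for any other single missing block, e.g. A.
recovered_A = xor_blocks([B, C, D, P])
print(recovered_A == A)
```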

Properties of RAID 4.

The storage overhead is 1/k.
ONE failure can be tolerated.
Failure means detectable failure (disks are good at this).
A bad disk can be recreated from the remaining disks (and RAID controllers do this online, i.e., while the array stays in service).
In normal operation (i.e., all disks good), reads are not affected by the redundancy. In this respect RAID 4 is just like striping.
When writing a full stripe, the system calculates the new parity block from the new data blocks and writes the entire stripe.
When writing a single block, RAID 4 performs poorly since it must read additional blocks to determine the new parity.
- One way would be to read the remaining k-1 data blocks and calculate the parity as usual.
- There is a better (slightly clever) way.
- If block A is to be changed to A', just read (the old) A and the old parity P. The new parity is
  A ⊕ A' ⊕ P
Every write must update the parity disk (and a single-block write must first read it), so the parity disk becomes a bottleneck. This motivates rotating the parity across the disks, as in RAID 5.
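The single-block write shortcut in the list above (new parity = A ⊕ A' ⊕ P) can be checked against the straightforward recomputation. This is a sketch with made-up two-byte blocks; xor2 and small_write_parity are hypothetical names.

```python
def xor2(x, y):
    """Bitwise XOR of two equal-length byte blocks."""
    return bytes(a ^ b for a, b in zip(x, y))


def small_write_parity(old_block, new_block, old_parity):
    """New parity from only the old block, the new block, and the old parity."""
    return xor2(xor2(old_block, new_block), old_parity)


# Made-up stripe with k = 4 data blocks and its parity.
A, B, C, D = b"\x0f\xf0", b"\x33\x33", b"\x55\xaa", b"\xff\x00"
P = xor2(xor2(A, B), xor2(C, D))

# Overwrite A with A_new, touching only A and P (2 reads, 2 writes).
A_new = b"\xde\xad"
P_shortcut = small_write_parity(A, A_new, P)

# The slow way: read B, C, D and recompute the parity from scratch.
P_full = xor2(xor2(A_new, B), xor2(C, D))

print(P_shortcut == P_full)
```

The shortcut works because XORing the old A into P cancels A's contribution, leaving B ⊕ C ⊕ D, to which the new A is then added.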

Distributed Block-Interleaved Parity (RAID 5)

Rotate the disk used for parity.

Again using our 4 data-disk example, we continue to put the parity for blocks 0-3 on disk 4 (the fifth disk) but rotate the assignment of which disk holds the parity block of different stripes. In more detail:

For the first stripe block 0 is on 0, 1 is on 1, 2 is on 2, 3 is on 3, and parity is on 4.
For the next stripe (data blocks 4-7), we put the parity on the fourth disk so block 4 is on disk 0, 5 is on 1, 6 is on 2, parity is on 3, and 7 is on 4.
For the next stripe, we put the parity on the third disk so 8 is on 0, 9 is on 1, parity is on 2, 10 is on 3, and 11 is on 4.
And so on for the remaining stripes.
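The rotation in the list above can be written as a small mapping function. This is a sketch under an assumption: the text gives only the first three stripes, and the code assumes the pattern keeps moving the parity one disk to the left and wraps around after k+1 stripes. The function names are made up.

```python
def raid5_parity_disk(stripe, k):
    """Disk holding the parity of a stripe, with k data blocks per stripe.

    Assumed rotation: disk k for stripe 0, then k-1, ..., 0, then wrap.
    """
    return k - (stripe % (k + 1))


def raid5_locate(n, k):
    """Return (disk, stripe) for data block n on a (k+1)-disk RAID 5 array."""
    stripe, j = divmod(n, k)       # j = position of the block in its stripe
    p = raid5_parity_disk(stripe, k)
    disk = j if j < p else j + 1   # skip over the parity disk
    return disk, stripe


k = 4
for stripe in range(3):
    data = [(n, raid5_locate(n, k)[0]) for n in range(stripe * k, (stripe + 1) * k)]
    print(f"stripe {stripe}: parity on disk {raid5_parity_disk(stripe, k)}, "
          f"(block, disk) pairs {data}")
```

For the first three stripes this reproduces the placement listed in the text, e.g. block 7 on disk 4 and block 10 on disk 3.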

RAID 1 and RAID 5 are widely used.

P + Q Redundancy (RAID 6)

Gives more than single-error correction, at the cost of a higher storage overhead.