int a[1024][1024], b[1024][1024], c[1024][1024]; multiply() { unsigned i, j, k; for(i = 0; i < 1024; i++) for(j = 0; j < 1024; j++) for(k = 0; k < 1024; k++) c[i][j] += a[i,k] * b[k,j]; }Assume that the binary for executing this function fits in one page, and the stack also fits in one page. Assume further that an integer requires 4 bytes for storage. Compute the number of TLB misses if the page size is 4096 and the TLB has 8 entries with a replacement policy consisting of LRU.
Solution:
1024*(2+1024*1024) = 1073743872
The binary and the stack
each fit in one page, thus each takes one entry in the TLB. While the function
is running, it is accessing the binary page and the stack page all the time. So
the two TLB entries for these two pages would reside in the TLB all the time
and the data can only take the remaining 6 TLB entries.
We assume the two entries are already in TLB when the function begins to run. Then we need only consider those data pages.
Since an integer requires 4 bytes for storage and the page size is 4096 bytes, each array requires 1024 pages. Suppose each row of an array is stored in one page. Then these pages can be represented as a[0..1023], b[0..1023], c[0..1023]: Page a[0] contains the elements a[0][0..1023], page a[1] contains the elements a[1][0..1023], etc.
For a fixed value of i, say 0, the function loops over j and k, we have the following reference string:
a[0], b[0], c[0], a[0], b[1], c[0], ¡ a[0], b[1023], c[0]For the reference string (1024 rows in total), a[0], c[0] will contribute two TLB misses. Since a[0] and b[0] each will be accessed every four memory references, the two pages will not be replaced by the LRU algorithm. For each page in b[0..1023], it will incur one TLB miss every time it is accessed. So the number of TLB misses for the second inner loop is
¡
a[0], b[0], c[0], a[0], b[1], c[0], ¡ a[0], b[1023], c[0]
Solution:
Need solution.
Solution: