x86-64 multilevel page tables
Consider the x86-64 architecture. Below we are asking about the physical pages consumed by a process, including the page tables themselves. As you answer the question, assume that any allocated memory consumes physical pages in RAM; that is, there is no swapping or demand paging. Note that it may be helpful for you to draw pictures (but you don’t have to).
As a reminder, the x86-64 imposes a multi-level page table structure: pages are 4 KB, each page table entry is 8 bytes, and each individual page table (a node in the “tree”) occupies one page. Thus, each page table holds 4 KB / 8 B = 512 = 2^9 entries. Recall that the structure is four levels; each level is indexed by 9 bits of the virtual address.
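The indexing described above can be sketched in C. This is a minimal illustration, not how any particular kernel does it: the names pt_index and page_offset are hypothetical, and the level numbering (3 for the root table, 0 for the last level) is an assumption made only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT        12                  /* 4 KB pages: 2^12 bytes */
#define INDEX_BITS        9                   /* 9 index bits per level */
#define ENTRIES_PER_TABLE (1u << INDEX_BITS)  /* 2^9 = 512 entries */

/* One page of 8-byte entries holds exactly 512 of them. */
static_assert(ENTRIES_PER_TABLE == 4096 / 8, "4 KB / 8 B = 512 entries");

/* Hypothetical helper: the 9-bit index used at a given level.
   Assumed numbering: level 3 is the root, level 0 is the last level. */
static unsigned pt_index(uint64_t vaddr, int level) {
    return (unsigned)((vaddr >> (PAGE_SHIFT + INDEX_BITS * level))
                      & (ENTRIES_PER_TABLE - 1));
}

/* Hypothetical helper: the low 12 bits select a byte within the 4 KB page. */
static unsigned page_offset(uint64_t vaddr) {
    return (unsigned)(vaddr & ((1u << PAGE_SHIFT) - 1));
}
```

Two virtual addresses land in the same last-level page table exactly when their indices at levels 3, 2, and 1 all match, which is the fact that drives the page counts asked about below.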
What is the minimum number of physical pages consumed by a process that allocates 12KB (for example, 1 page each for code, stack, and data)?
What is the minimum number of physical pages consumed by a process that makes 2^9 + 1 allocations of size 4 KB each? You can leave your answer in terms of powers of 2, and sums thereof.
What is the minimum number of physical pages consumed by a process that makes 2^18 + 1 allocations of size 4 KB each? You can leave your answer in terms of powers of 2, and sums thereof.
TLBs
Consider a TLB that can store 4 mappings (the TLB is fully associative, meaning that any entry can store any mapping; if this parenthetical confuses you, you can ignore it). Below you will write C code to compute the sum of all integers in an array a, which is 6 pages in length; you will do this in a way that maximizes the number of TLB misses (equivalently, minimizes the number of TLB hits).
A few things to note:
- The array is allocated to be page aligned, meaning that the first element in the array is at the beginning of a page.
- Your program can assume that the constant PAGE_SIZE is the size of a page in bytes and that sizeof(int) is the size of an integer.
- You can ignore the effect on the TLB from fetching code; in other words, you can assume that the only memory references that affect the TLB are loads from array a. (In real systems, there are separate TLBs for instructions and data; this question is focusing on the data TLB.)
- You can further assume that the processor does nothing else while your code is running; that is, you don’t need to worry about TLB flushes from context switches.
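As a small aid for reasoning about the notes above, the sketch below relates an array index to the page it falls on. It is only an illustration under stated assumptions: PAGE_SIZE is defined here as 4096 solely so the fragment compiles on its own, and the helper names ints_per_page and page_of_index are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#ifndef PAGE_SIZE
#define PAGE_SIZE 4096  /* assumption for this sketch: 4 KB pages */
#endif

/* A page-aligned int array never has an element straddling two pages. */
static_assert(PAGE_SIZE % sizeof(int) == 0, "a page holds a whole number of ints");

/* Hypothetical helper: how many ints fit in one page. */
static size_t ints_per_page(void) {
    return PAGE_SIZE / sizeof(int);
}

/* Hypothetical helper: which page of the array (0-based) holds a[i],
   given that a[0] sits at the start of a page. */
static size_t page_of_index(size_t i) {
    return i / ints_per_page();
}
```

Because the array is page aligned, a load from a[i] needs the translation for page page_of_index(i), so the order in which indices are visited determines the order in which translations are requested from the TLB.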
uint64_t tlb_unfriendly() {
    int *a = page_alloc(6 * PAGE_SIZE);
    populate_array(a); // sets the integers in the array
    uint64_t sum = 0;
    /* YOUR CODE HERE: compute sum in the most TLB-unfriendly way possible */
    return sum;
}
Polling vs. interrupts
As discussed in class, two ways for an operating system to become aware of external events associated with a device are interrupts and polling. We observed that if a computer were receiving many interrupts, it might spend all of its time processing them and not get other work done; in that case, the operating system should switch to polling the device. Now consider the following:
A computer has an attached keyboard. The keyboard has a 1024-byte internal memory buffer to hold the codes of recently-pressed keys, each of which consumes 2 bytes of buffer space. (The buffer is a FIFO, which for our purposes means that the OS simply reads from it and doesn’t manage the memory; if this parenthetical confuses you, you can ignore it.)
This computer and its OS take 1 microsecond (10^-6 seconds) to handle an interrupt from the keyboard. That duration includes everything: reading from the keyboard’s buffer, determining which key was pressed, and painting the appropriate letter to the screen.
Assume that polling requires a fixed cost of 1 microsecond per poll. Further assume that, per poll, the operating system can read an arbitrary amount of the keyboard’s internal memory buffer, up to the entire size of that buffer.
Assume that, if polling, the operating system checks the device in question every 200 milliseconds.
Assume that humans are sensitive to lags of 100 milliseconds or greater. Specifically, if a human types a letter, that letter must appear on the screen less than 100 milliseconds after the human types it, to avoid annoyance.
You type exceptionally quickly: 200 words per minute. Assume that the average word has 7 letters, including the space at the end of the word.
Each key code (each letter, in other words) generates a separate interrupt.
- How many interrupts per second would your typing generate on average? Show your work.
- Should the computer use polling or interrupts to handle your fast typing? Explain why your choice is acceptable and the other choice is not. Do not use more than three sentences.
Handing in the homework
Use Gradescope; you can enroll in our course with entry code 4J462V.