Remark: Lab 4 (the last lab is assigned).
For a two level page table the virtual address is divided into three pieces
+-----+-----+-------+ | P#1 | P#2 | Offset| +-----+-----+-------+
Do an example on the board
The VAX used a 2-level page table structure, but with some wrinkles (see Tanenbaum for details).
Naturally, there is no need to stop at 2 levels. In fact the SPARC has 3 levels and the Motorola 68030 has 4 (and the number of bits of Virtual Address used for P#1, P#2, P#3, and P#4 can be varied).
Note: Tanenbaum suggests that “associative memory” and “translation lookaside buffer” are synonyms. This is wrong. Associative memory is a general concept of which translation lookaside buffer is a specific example. case.
An associative memory is a content addressable memory. That is you access the memory by giving the value of some (index) field and the hardware searches all the records and returns the record whose field contains the requested value.
For example
Name | Animal | Mood | Color ======+========+==========+====== Moris | Cat | Finicky | Grey Fido | Dog | Friendly | Black Izzy | Iguana | Quiet | Brown Bud | Frog | Smashed | Green
If the index field is Animal and Iguana is given, the associative memory returns
Izzy | Iguana | Quiet | Brown
A Translation Lookaside Buffer or TLB is an associate memory where the index field is the page number. The other fields include the frame number, dirty bit, valid bit, etc.
Homework: 17.
Keep a table indexed by frame number. The content of entry f contains the number of the page currently loaded in frame f. This is often called a frame table as well as an inverted page table.
These are solutions to the replacement question.
Good solutions take advantage of locality.
Pages belonging to processes that have terminated are of course perfect choices for victims.
Pages belonging to processes that have been blocked for a long time are good choices as well.
A lower bound on performance. Any decent scheme should do better.
Replace the page whose next reference will be furthest in the future.
Divide the frames into four classes and make a random selection from the lowest nonempty class.
Assumes that in each PTE there are two extra flags R (sometimes called U, for used) and M (often called D, for dirty).
Also assumes that a page in a lower priority class is cheaper to evict.
We again have the prisoner problem, we do a good job of making little ones out of big ones, but not the reverse. Need more resets.
Every k clock ticks, reset all R bits
What if the hardware doesn't set these bits?
Simple but poor since usage of the page is ignored.
Belady's Anomaly: Can have more frames yet generate more faults. Example given later.
The natural implementation is to have a queue of nodes each pointing to a page.
Similar to the FIFO PRA, but altered so that a page recently referenced is given a second chance.
Same algorithm as 2nd chance, but a better implementation for the nodes: Use a circular list with a single pointer serving as both head and tail.
Let us begin by assuming that the number of pages loaded is constant.
What if the number of pages is not constant?
This is terrible! Why?
Ans: All but the last frame are frozen once loaded so you can replace
only one frame. This is especially bad after a phase shift in the
program when it is using all new pages.
When a page fault occurs, choose as victim that page that has been unused for the longest time, i.e. that has been least recently used.
LRU is definitely
Page | Loaded | Last ref. | R | M |
---|---|---|---|---|
0 | 126 | 280 | 1 | 0 |
1 | 230 | 265 | 0 | 1 |
2 | 140 | 270 | 0 | 0 |
3 | 110 | 285 | 1 | 1 |
R | counter | |
---|---|---|
1 | 10000000 | |
0 | 01000000 | |
1 | 10100000 | |
1 | 11010000 | |
0 | 01101000 | |
0 | 00110100 | |
1 | 10011010 | |
1 | 11001101 | |
0 | 01100110 |
NFU doesn't distinguish between old references and recent ones. The following modification does distinguish.
Homework: 25, 34
The goal is to specify which pages a given process needs to have memory resident in order for the process to run without too many page faults.
The idea of the working set policy is to ensure that each process keeps its working set in memory.
Homework: Describe a process (i.e., a program) that runs for a long time (say hours) and always has w<10 Assume ω=100,000, the page size is 4KB. The program need not be practical or useful.
Homework: Describe a process that runs for a long time and (except for the very beginning of execution) always has w>1000. Assume ω=100,000, the page size is 4KB. The program need not be practical or useful.
The definition of Working Set is local to a process. That is, each process has a working set; there is no system wide working set other than the union of all the working sets of each process.
However, the working set of a single process has effects on the demand paging behavior and victim selection of other processes. If a process's working set is growing in size, i.e. w(t,ω) is increasing as t increases, then we need to obtain new frames from other processes. A process with a working set decreasing in size is a source of free frames. We will see below that this is an interesting amalgam of local and global replacement policies.
Interesting questions concerning the working set include:
... Various approximations to the working set, have been devised. We will study two: using virtual time instead of memory references (immediately below) and Page Fault Frequency (section 4.6). In 4.4.9 we will see the popular WSClock algorithm that includes an approximation of the working set as well as several other ideas.
This treatment is based on one by Prof. Ernie Davis.
Tannenbaum suggests that the WSClock Page Replacement Algorithm is a natural outgrowth of the idea of a working set. However, reality is less clear cut. WSClock is actually embodies several ideas, one of which is connected to the idea of a working set. As the name suggests another of the ideas is the clock implementation of 2nd chance.
The actual implemented algorithm is somewhat complicated and not a clean elegant concept. It is important because
Since the algorithm is complicated we present it in stages. As stated above this is an important algorithm since it works well and is used in practice. However, I certainly do not assume you remember all the details.
We start by associating a node with every page loaded in memory (i.e., with every frame given to this process). In the node are stored R and M bits that we assume are set by the hardware. (Of course we don't design the hardware so really the R and M bits are set in a hardware defined table and the nodes reference the entries in that table.) Every k clock ticks the R bit is reset. So far this looks like NRU.
To ease the explanation we will assume k=1, i.e., actions are done each clock tick.
We now introduce an LRU aspect (with the virtual time approximation described above for working set): At each clock tick we examine all the nodes for the running process and store the current virtual time in all nodes for which R is 1.
Thus, the time field is an approximation to the time of the most recent reference, accurate to the clock period. Note that this is done every clock tick (really every m ticks) and not every memory reference. That is why it is feasible.
If we chose as victim the page with the smallest time field, we would be implementing a virtual time approximation to LRU. But in fact we do more.
We now introduce some working set aspects into the algorithm by first defining a time constant τ (analogous to ω in the working set algorithm) and consider all pages older than τ (i.e., their stored time is smaller than the current time minus τ) as candidate victims. The idea is that these pages are not in the working set.
The OS designer needs to tune τ just as one would need to tune ω and, like ω, τ is quite robust (the same value works well for a variety of job mixes).
The advantage of introducing τ is that a victim search can stop as soon as a page older than τ is found.
If no pages have a reference time older than Tau, then the page with the earliest time is the victim.
Next we introduce the other aspect of NRU, preferring clean to dirty victims. We search until we find a clean page older than τ, if there is one; if not, we use a dirty page older than τ.
Now we introduce an optimization similar to prefetching (i.e., speculatively fetching some data before it is known to be needed). Specifically, when we encounter a dirty page older than τ (while looking for a clean old page), we write the dirty page back to disk (and clear the M bit, which Tanenbaum forgot to mention) without evicting the page, on the presumption that, since the page is not in (our approximation to) the working set, this I/O will be needed eventually. The down side is that the page could become dirty again, rendering our speculative I/O redundant.
Suppose we've decided to write out old dirty pages D1 through Dd and to replace old clean page C with new page N.
We must block the current process P until N is completely read in, but P can run while D1 through Dd are being written. Hence we would desire the I/O read to be done before the writes, but we shall see later, when we study I/O, that there are other considerations for choosing the order to perform I/O operations.
Similarly, suppose we can not find an old clean page and have decided to replace old dirty page D0 with new page N, and have detected additional old dirty pages D1 through Dd (recall that we were searching for an old clean page). Then P must block until D0 has been written and N has been read, but can run while D1 through Dd are being written.
We throttle the previous optimization to prevent overloading the I/O subsystem. Specifically we set a limit on the number of dirty pages the previous optimization can request be written.
Finally, as in the clock algorithm, we keep the data structure (nodes associated with pages) organized as a circular list with a single pointer (the hand of the clock). Hence we start each victim search where the previous one left off.
Algorithm | Comment |
---|---|
Random | Poor, used for comparison |
Optimal | Unimplementable, used for comparison |
LIFO | Horrible, useless |
NRU | Crude |
FIFO | Not good ignores frequency of use |
Second Chance | Improvement over FIFO |
Clock | Better implementation of Second Chance |
LRU | Great but impractical |
NFU | Crude LRU approximation |
Aging | Better LRU approximation |
Working Set | Good, but expensive |
WSClock | Good approximation to working set |