Notes:
The goal is to specify which pages a given process needs to have memory resident in order for the process to run without too many page faults.
The idea of the working set policy is to ensure that each process keeps its working set in memory.
Homework: Describe a process (i.e., a program) that runs for a long time (say hours) and always has w<10. Assume ω=100,000 and that the page size is 4KB. The program need not be practical or useful.
Homework: Describe a process that runs for a long time and (except for the very beginning of execution) always has w>1000. Assume ω=100,000, the page size is 4KB. The program need not be practical or useful.
The definition of Working Set is local to a process. That is, each process has a working set; there is no system-wide working set other than the union of the working sets of all the processes.
However, the working set of a single process has effects on the demand paging behavior and victim selection of other processes. If a process's working set is growing in size, i.e. w(t,ω) is increasing as t increases, then we need to obtain new frames from other processes. A process with a working set decreasing in size is a source of free frames. We will see below that this is an interesting amalgam of local and global replacement policies.
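The growing and shrinking of w(t,ω) can be seen on a toy reference string. The following is only a sketch: it assumes the usual definition (the working set at time t is the set of pages touched by the ω most recent references), measures t as an index into the reference string, and uses an illustrative string and a tiny ω chosen just for the example.

```python
def working_set(refs, t, omega):
    """Pages touched by the omega most recent references ending at index t."""
    start = max(0, t - omega + 1)
    return set(refs[start:t + 1])


def w(refs, t, omega):
    """Size of the working set at index t with window omega."""
    return len(working_set(refs, t, omega))


# Illustrative reference string: a loop over pages 0..2, then only page 7.
refs = [0, 1, 2, 0, 1, 2, 7, 7, 7, 7]
sizes = [w(refs, t, 4) for t in range(len(refs))]
print(sizes)  # [1, 2, 3, 3, 3, 3, 4, 3, 2, 1]
```

While the sizes are increasing the process needs new frames; once the program settles on page 7 the sizes decrease and the process becomes a source of free frames.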
Interesting questions concerning the working set include:
... Various approximations to the working set have been devised. We will study two: using virtual time instead of memory references (immediately below) and Page Fault Frequency (section 4.6). In 4.4.9 we will see the popular WSClock algorithm that includes an approximation of the working set as well as several other ideas.
This treatment is based on one by Prof. Ernie Davis.
Tanenbaum suggests that the WSClock Page Replacement Algorithm is a natural outgrowth of the idea of a working set. However, reality is less clear cut. WSClock actually embodies several ideas, only one of which is connected to the idea of a working set. As the name suggests, another of the ideas is the clock implementation of 2nd chance.
The actual implemented algorithm is somewhat complicated and not a clean elegant concept. It is important because
Since the algorithm is complicated we present it in stages. As stated above this is an important algorithm since it works well and is used in practice. However, I certainly do not assume you remember all the details.
To ease the explanation we will assume k=1, i.e., actions are done each clock tick.
Thus, the time field is an approximation to the time of the most recent reference, accurate to the clock period. Note that this is done every clock tick (really every k ticks) and not every memory reference. That is why it is feasible.
If we chose as victim the page with the smallest time field, we would be implementing a virtual time approximation to LRU. But in fact we do more.
The OS designer needs to tune τ just as one would need to tune ω and, like ω, τ is quite robust (the same value works well for a variety of job mixes).
The advantage of introducing τ is that a victim search can stop as soon as a page older than τ is found.
If no page has a reference time older than τ, then the page with the earliest time is the victim.
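The stage described so far (time field updated each clock tick, victim search that stops at the first page older than τ) can be sketched as follows. This is not the full WSClock algorithm: it ignores dirty pages, and the names `Frame`, `on_clock_tick`, `choose_victim`, and `hand` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    page: int
    referenced: bool = False  # R bit, set by hardware on each reference
    time: int = 0             # virtual time of the last observed reference


def on_clock_tick(frames, now):
    """Done every tick (k=1): record the time for referenced pages, clear R."""
    for f in frames:
        if f.referenced:
            f.time = now
            f.referenced = False


def choose_victim(frames, now, tau, hand=0):
    """Return the index of the victim frame, scanning circularly from hand."""
    n = len(frames)
    for i in range(n):
        idx = (hand + i) % n
        if now - frames[idx].time > tau:  # older than tau: stop searching
            return idx
    # No page is older than tau: fall back to the earliest time field.
    return min(range(n), key=lambda i: frames[i].time)


frames = [Frame(10, time=95), Frame(11, time=40), Frame(12, time=20)]
print(choose_victim(frames, now=100, tau=50))  # 1: first page older than tau
print(choose_victim(frames, now=100, tau=90))  # 2: none older, earliest time
```

Note that with τ=50 the search stops at frame 1 even though frame 2 is older; that early exit is the advantage of introducing τ.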
Suppose we've decided to write out old dirty pages D1 through Dd and to replace old clean page C with new page N.
We must block the current process P until N is completely read in, but P can run while D1 through Dd are being written. Hence we would desire the I/O read to be done before the writes, but we shall see later, when we study I/O, that there are other considerations for choosing the order to perform I/O operations.
Similarly, suppose we cannot find an old clean page and have decided to replace old dirty page D0 with new page N, and have detected additional old dirty pages D1 through Dd (recall that we were searching for an old clean page). Then P must block until D0 has been written and N has been read, but can run while D1 through Dd are being written.
Algorithm | Comment |
---|---|
Random | Poor, used for comparison |
Optimal | Unimplementable, used for comparison |
LIFO | Horrible, useless |
NRU | Crude |
FIFO | Not good; ignores frequency of use |
Second Chance | Improvement over FIFO |
Clock | Better implementation of Second Chance |
LRU | Great but impractical |
NFU | Crude LRU approximation |
Aging | Better LRU approximation |
Working Set | Good, but expensive |
WSClock | Good approximation to working set |
Consider a system that has no pages loaded and that uses the FIFO PRA.
Consider the following “reference string” (sequence of pages referenced).
0 1 2 3 0 1 4 0 1 2 3 4
If we have 3 frames this generates 9 page faults (do it).
If we have 4 frames this generates 10 page faults (do it).
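The two counts above can be checked mechanically. Here is a minimal FIFO simulator, assuming memory starts empty as stated; the function name `fifo_faults` is just an illustrative choice.

```python
from collections import deque


def fifo_faults(refs, num_frames):
    """Count page faults for FIFO replacement starting with empty memory."""
    resident = set()   # pages currently in memory
    queue = deque()    # load order, oldest page on the left
    faults = 0
    for page in refs:
        if page in resident:
            continue   # hit: FIFO does not reorder on a reference
        faults += 1
        if len(resident) == num_frames:
            resident.remove(queue.popleft())  # evict the oldest loaded page
        resident.add(page)
        queue.append(page)
    return faults


refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10
```

Adding a frame made FIFO fault more often on this string, which is exactly the anomaly.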
Theory has been developed showing that certain PRAs (so-called “stack algorithms”) cannot suffer this anomaly for any reference string. FIFO is clearly not a stack algorithm. LRU is. Tanenbaum has a few details, but we are skipping them.
Repeat the above calculations for LRU.
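After doing the calculation by hand, you can check it with a small LRU simulator. This is a sketch that keeps resident pages in recency order using an `OrderedDict`; the name `lru_faults` is illustrative, and the results are deliberately not printed here.

```python
from collections import OrderedDict


def lru_faults(refs, num_frames):
    """Count page faults for LRU replacement starting with empty memory."""
    resident = OrderedDict()  # least recently used page is first
    faults = 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)    # hit: now the most recently used
            continue
        faults += 1
        if len(resident) == num_frames:
            resident.popitem(last=False)  # evict the least recently used page
        resident[page] = True
    return faults


refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
three, four = lru_faults(refs, 3), lru_faults(refs, 4)
```

Since LRU is a stack algorithm, `four` should not exceed `three` for this or any other reference string.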