Operating Systems
================ Start Lecture #10 ================
Chapter 4: Memory Management
4.4.5: Clock PRA
Same algorithm as 2nd chance, but a better
implementation for the nodes: Use a circular list with a single
pointer serving as both head and tail.
Let us begin by assuming that the number of pages loaded is
constant.

So the size of the node list in 2nd chance is constant.

Use a circular list for the nodes and have a pointer pointing to
the head entry. Think of the list as the hours on a clock and the
pointer as the hour hand.

Since the number of nodes is constant, the operation we need to
support is replacing the “oldest” page with a new page.

Examine the node pointed to by the (hour) hand.
If the R bit of the corresponding page is set, we give the
page a second chance: clear the R bit, move the hour hand (now the page
looks freshly loaded), and examine the next node.

Eventually we will reach a node whose corresponding R bit is
clear. The corresponding page is the victim.

Replace the victim with the new page (may involve 2 I/Os as
always).

Update the node to refer to this new page.

Move the hand forward another hour so that the new page is at the
rear.
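
The constant-size case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name ClockReplacer, the list-of-pairs node layout, and the reference() helper (standing in for the hardware that sets R) are all hypothetical, and the 2 I/Os are not modeled.

```python
class ClockReplacer:
    """Clock page replacement over a fixed set of frames (illustrative sketch)."""

    def __init__(self, pages):
        # One node per loaded page: [page, R bit]. The list is treated as circular.
        self.nodes = [[page, 0] for page in pages]
        self.hand = 0  # index of the node the hour hand points to

    def reference(self, page):
        """Stand-in for the hardware setting the R bit on a reference."""
        for node in self.nodes:
            if node[0] == page:
                node[1] = 1
                return
        raise KeyError(page)

    def replace(self, new_page):
        """Evict a victim, reuse its node for new_page, and return the victim."""
        # Give a second chance to every page whose R bit is set.
        while self.nodes[self.hand][1] == 1:
            self.nodes[self.hand][1] = 0
            self.hand = (self.hand + 1) % len(self.nodes)
        victim = self.nodes[self.hand][0]
        # Assumption: R starts at 0; the restarted instruction would set it.
        self.nodes[self.hand] = [new_page, 0]
        # Advance the hand so the new page is at the rear of the queue.
        self.hand = (self.hand + 1) % len(self.nodes)
        return victim
```

With pages A, B, C loaded and A and B referenced, replace("D") clears the R bits of A and B and evicts C; a following replace("E") then evicts A.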
What if the number of pages is not constant?

We now have to support inserting a node right before
the hour hand (the rear of the queue) and removing the node
pointed to by the hour hand.

The natural solution is to doubly link the circular list.

In this case insertion and deletion are a little slower than for
the primitive 2nd chance (doubly linked lists have more pointer
updates for insert and delete).

So the tradeoff is that if there are mostly inserts and deletes
and granting 2nd chances is not too common, use the original 2nd
chance implementation.
If there are mostly replacements and you
often give nodes a 2nd chance, use clock.
LIFO PRA
This is terrible! Why?
Ans: All but the last frame are frozen once loaded so you can replace
only one frame. This is especially bad after a phase shift in the
program when it is using all new pages.
4.4.6: Least Recently Used (LRU) PRA
When a page fault occurs, choose as victim that page that has been
unused for the longest time, i.e. that has been least recently used.
LRU is definitely:

Implementable: The past is knowable.

Good: Simulation studies have shown this.

Difficult. Essentially need to either:

Keep a time stamp in each PTE, updated on each reference
and scan all the PTEs when choosing a victim to find the PTE
with the oldest timestamp.

Keep the PTEs in a linked list in usage order, which means
on each reference moving the PTE to the end of the list
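
The second option (a list kept in usage order) can be sketched in software as follows. The sketch assumes Python's collections.OrderedDict as the linked list; the class name LRUFrames and its reference() method are hypothetical. It also shows why LRU is difficult in practice: the reordering runs on every single reference.

```python
from collections import OrderedDict


class LRUFrames:
    """Exact LRU over a fixed number of frames (illustrative sketch)."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.pages = OrderedDict()  # least recently used page comes first

    def reference(self, page):
        """Reference a page; return the evicted page, or None if none was evicted."""
        if page in self.pages:
            self.pages.move_to_end(page)  # hit: move to the end of the list
            return None
        victim = None
        if len(self.pages) >= self.num_frames:
            # Fault with all frames in use: the front of the list is the victim.
            victim, _ = self.pages.popitem(last=False)
        self.pages[page] = True
        return victim
```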
Homework: 29, 23.
Note: there is a typo in 29; the table should be as follows.

Page  Loaded  Last ref.  R  M
0     126     280        1  0
1     230     265        0  1
2     140     270        0  0
3     110     285        1  1
A hardware cutsie in Tanenbaum

For n pages, keep an nxn bit matrix.

On a reference to page i, first set row i to all 1s and then set column i to all 0s.

At any time the 1 bits in the rows are ordered by inclusion.
I.e. one row's 1s are a subset of another row's 1s, which is a
subset of a third. (Tanenbaum forgets to mention this.)

So the row with the fewest 1s is a subset of all the others and is
hence least recently used.

This row also has the smallest value, when treated as an unsigned
binary number. So the hardware can do a comparison of the rows
rather than counting the number of 1 bits.

Cute, but still impractical.
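
A small software simulation of the matrix makes the inclusion property easy to check. This is a sketch only: the real scheme is hardware, and the function names below (make_matrix, reference, row_value, lru_page) are hypothetical.

```python
def make_matrix(n):
    """n x n bit matrix, initially all zeros."""
    return [[0] * n for _ in range(n)]


def reference(matrix, i):
    """On a reference to page i: set row i to all 1s, then column i to all 0s."""
    n = len(matrix)
    matrix[i] = [1] * n
    for row in matrix:
        row[i] = 0


def row_value(row):
    """Treat a row as an unsigned binary number."""
    return int("".join(str(bit) for bit in row), 2)


def lru_page(matrix):
    """The row with the smallest value belongs to the least recently used page."""
    return min(range(len(matrix)), key=lambda i: row_value(matrix[i]))
```

After references to pages 0, 1, 2, 3 in that order, the rows are 0000, 1000, 1100, 1110, so page 0 is least recently used. Pages that have never been referenced all have value 0; this sketch simply picks the first of them.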
4.4.7: Simulating (Approximating) LRU in Software
The Not Frequently Used (NFU) PRA

Include a counter in each PTE (and have R in each PTE).

Set counter to zero when page is brought into memory.

For each PTE, every k clock ticks:

  Add R to counter.

  Clear R.

Choose as victim the PTE with lowest count.
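
The NFU steps can be sketched as below. The page table is modeled as a dictionary from page name to a small record with fields R and counter; this layout and the function names are assumptions made for illustration.

```python
def nfu_tick(table):
    """Run every k clock ticks: add R to the counter, then clear R."""
    for entry in table.values():
        entry["counter"] += entry["R"]
        entry["R"] = 0


def nfu_victim(table):
    """Choose as victim the page whose PTE has the lowest count."""
    return min(table, key=lambda page: table[page]["counter"])
```

Note that the counter never decreases, so a page that was heavily used long ago keeps a high count.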
R  counter
1  10000000
0  01000000
1  10100000
1  11010000
0  01101000
0  00110100
1  10011010
1  11001101
0  01100110

The Aging PRA
NFU doesn't distinguish between old references and recent ones. The
following modification does distinguish.
 Include a counter in each PTE (and have R in each PTE).
 Set counter to zero when page is brought into memory.
 For each PTE, every k clock ticks:
   Shift counter right one bit.
   Insert R as new high order bit (HOB).
   Clear R.
 Choose as victim the PTE with lowest count.
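
The R/counter table shown earlier can be reproduced with a short sketch of the aging update for a single page. The 8-bit counter width is an assumption taken from that table, and the function names are hypothetical.

```python
COUNTER_BITS = 8  # assumed counter width, matching the table


def age(counter, r_bit, bits=COUNTER_BITS):
    """Shift the counter right one bit and insert R as the new high order bit."""
    return (counter >> 1) | (r_bit << (bits - 1))


def counter_history(r_bits, bits=COUNTER_BITS):
    """Counter value (as a bit string) after each period, starting from zero."""
    counter = 0
    history = []
    for r in r_bits:
        counter = age(counter, r, bits)
        history.append(format(counter, f"0{bits}b"))
    return history
```

Calling counter_history([1, 0, 1, 1, 0, 0, 1, 1, 0]) yields the nine counter values in the table, from 10000000 down to 01100110. Because recent references land in the high order bits, a recent reference outweighs any number of older ones.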
Homework: 25, 34
4.4.8: The Working Set Page Replacement Problem (Peter Denning)
The working set policy (Peter Denning)
The goal is to specify which pages a given process needs to have
memory resident in order for the process to run without too many
page faults.

But this is impossible since it requires predicting the future.

So we make the assumption that the immediate future is well
approximated by the immediate past.

We measure time in units of memory references, so t=1045 means the
time when the 1045th memory reference is issued.

In fact we measure time separately for each process, so t=1045
really means the time when this process made its 1045th memory
reference.

W(t,ω) is the set of pages referenced (by the given process) from
time t-ω to time t.

That is, W(t,ω) is the set of pages referenced during
the window of size ω ending at time t.

That is, W(t,ω) is the set of pages referenced by the last
ω memory references ending at reference t.

W(t,ω) is called the working set at time t
(with window ω).

w(t,ω) is the size of the set W(t,ω), i.e. is the
number of distinct pages referenced in the window.
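
For a recorded reference string, the definition can be evaluated directly. The sketch below assumes the reference string is a Python list whose k-th element (counting from 1) is the page touched by the k-th memory reference; the function names are hypothetical. A real system cannot afford this bookkeeping on every reference.

```python
def working_set(refs, t, omega):
    """W(t, omega): pages touched by the last omega references ending at reference t."""
    start = max(0, t - omega)  # the window is shorter near the start of execution
    return set(refs[start:t])


def working_set_size(refs, t, omega):
    """w(t, omega): the number of distinct pages in the window."""
    return len(working_set(refs, t, omega))
```

For the reference string 1 2 1 3 4 4 4 2 and ω=3, the working set at t=5 is {1, 3, 4} and at t=7 it is {4}.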
The idea of the working set policy is to ensure that each process
keeps its working set in memory.

Allocate w(t,ω) frames to each process.
This number differs for each process and changes with time.

On a fault, one evicts a page not in the working set. But it is
not easy to find such a page quickly.

Indeed determining W(t,ω) precisely is quite time consuming
and difficult. It is never done in real systems.

If a process is suspended, it is often swapped out; the working
set then can be used to say which pages should be brought back
when the process is resumed.
Homework: Describe a process (i.e., a program)
that runs for a long time (say hours) and always has w<10.
Assume ω=100,000, the page size is 4KB. The program need not be
practical or useful.
Homework: Describe a process that runs for a long
time and (except for the very beginning of execution) always has
w>1000. Assume ω=100,000, the page size is 4KB. The program
need not be practical or useful.
The definition of Working Set is local to a process. That is, each
process has a working set; there is no system wide working set other
than the union of all the working sets of each process.
However, the working set of a single process has effects on the
demand paging behavior and victim selection of other processes.
If a process's working set is growing in size, i.e. w(t,ω) is
increasing as t increases, then we need to obtain new frames from
other processes. A process with a working set decreasing in size is a
source of free frames. We will see below that this is an interesting
amalgam of
local and global replacement policies.
Interesting questions concerning the working set include:

What value should be used for ω?
Experiments have been done and ω is surprisingly robust (i.e.,
for a given system, a fixed value works reasonably for a wide variety
of job mixes).

How should we calculate W(t,ω)?
Hard to do exactly, so ...
... various approximations to the working set have been devised.
We will study two: using virtual time instead of memory references
(immediately below) and Page Fault Frequency (section 4.6).
In 4.4.9 we will see the popular WSClock algorithm that includes an
approximation of the working set as well as several other ideas.
Using virtual time

Approximate the working set by those pages referenced during the
last m milliseconds rather than the last ω memory references.
Note that the time is measured only while this process is running,
i.e., we are using virtual time.

Clear the reference bit every m milliseconds and set it on every
reference.

To choose a victim, we need to find a page with the R bit
clear.

Essentially we have reduced the working set policy to NRU.
4.4.9: The WSClock Page Replacement Algorithm
This treatment is based on one by Prof. Ernie Davis.
Tanenbaum suggests that the WSClock Page
Replacement Algorithm is a natural outgrowth of the idea of a working set.
However, reality is less clear cut.
WSClock actually embodies several ideas,
one of which is connected to the idea of a working set.
As the name suggests another of the ideas is the clock implementation
of 2nd chance.
The actual implemented algorithm is somewhat complicated and not a
clean elegant concept.
It is important because

It works well and is in common use.

The embodied ideas are themselves interesting.

Inelegant amalgamations of ideas are more commonly used in real
systems than clean, elegant, one-idea algorithms.
Since the algorithm is complicated we present it in stages.
As stated above this is an important algorithm since it works well and
is used in practice. However, I certainly do not assume you remember
all the details.

We start by associating a node with every page loaded in memory
(i.e., with every frame given to this process).
In the node are stored R and M bits that we assume are set by the
hardware.
(Of course we don't design the hardware so really the R and M bits
are set in a hardware defined table and the nodes reference the
entries in that table.)
Every k clock ticks the R bit is reset.
So far this looks like NRU.
To ease the explanation we will assume k=1, i.e., actions
are done each clock tick.

We now introduce an LRU aspect (with the virtual time
approximation described above for working set): At each clock
tick we examine all the nodes for the running process and store
the current virtual time in all nodes for which R is 1.
Thus, the time field is an approximation to the time of the
most recent reference, accurate to the clock period. Note that
this is done every clock tick (really every k ticks) and
not every memory reference. That is why it is feasible.
If we chose as victim the page with the smallest time field, we
would be implementing a virtual time approximation to LRU.
But in fact we do more.

We now introduce some working set aspects into the algorithm by
first defining a time constant τ (analogous to ω in the
working set algorithm) and consider all pages older than τ
(i.e., their stored time is smaller than the current time minus
τ) as candidate victims.
The idea is that these pages are not in the working set.
The OS designer needs to tune τ just as one would need to
tune ω and, like ω, τ is quite robust (the same
value works well for a variety of job mixes).
The advantage of introducing τ is that a victim search can
stop as soon as a page older than τ is found.
If no pages have a reference time older than τ, then the page
with the earliest time is the victim.

Next we introduce the other aspect of NRU, preferring clean to
dirty victims.
We search until we find a clean page older than τ, if
there is one; if not, we use a dirty page older than τ.

Now we introduce an optimization similar to prefetching (i.e.,
speculatively fetching some data before it is known to be needed).
Specifically, when we encounter a dirty page older than τ
(while looking for a clean old page), we write the dirty page back
to disk (and clear the M bit, which Tanenbaum forgot to mention)
without evicting the page, on the
presumption that, since the page is not in (our approximation to)
the working set, this I/O will be needed eventually.
The down side is that the page could become dirty again, rendering
our speculative I/O redundant.
Suppose we've decided to write out old dirty pages
D_{1} through D_{d} and to replace old clean page
C with new page N.
We must block the current process P until N is completely read
in, but P can run while D_{1} through D_{d} are
being written. Hence we would desire the I/O read to be done
before the writes, but we shall see later, when we study I/O, that
there are other considerations for choosing the order to perform
I/O operations.
Similarly, suppose we cannot find an old clean page and have
decided to replace old dirty page D_{0} with new page N,
and have detected additional old dirty pages D_{1} through
D_{d} (recall that we were searching for an old clean
page). Then P must block until D_{0} has been written
and N has been read, but can run while D_{1} through
D_{d} are being written.

We throttle the previous optimization to prevent overloading the
I/O subsystem.
Specifically we set a limit on the number of dirty pages the
previous optimization can request be written.

Finally, as in the clock algorithm, we keep the data structure
(nodes associated with pages) organized as a circular list with a
single pointer (the hand of the clock).
Hence we start each victim search where the previous one left
off.
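
The stages above can be put together in a rough sketch. It is not the algorithm as implemented in any real system: the names Node, WSClock, tick, and find_victim are hypothetical, k=1 is assumed, write-backs are modeled as completing instantly, and a set R bit found during the search is treated as a reference at the current virtual time. If the returned victim is dirty, the caller must still write it out before reusing the frame.

```python
from dataclasses import dataclass


@dataclass
class Node:
    page: str
    R: int = 0     # referenced bit
    M: int = 0     # modified (dirty) bit
    time: int = 0  # approximate virtual time of the most recent reference


class WSClock:
    def __init__(self, nodes, tau, max_writes):
        self.nodes = nodes            # circular list, one node per frame
        self.tau = tau
        self.max_writes = max_writes  # throttle on speculative write-backs
        self.hand = 0
        self.now = 0                  # virtual time of this process, in ticks

    def _stamp(self, node):
        # A set R bit means "referenced since the last check": record the time.
        if node.R:
            node.time = self.now
            node.R = 0

    def tick(self):
        self.now += 1
        for node in self.nodes:
            self._stamp(node)

    def find_victim(self):
        """Return (index of the victim node, pages written back speculatively)."""
        n = len(self.nodes)
        old_dirty = []
        victim = None
        for step in range(n):
            i = (self.hand + step) % n
            node = self.nodes[i]
            self._stamp(node)
            if node.time >= self.now - self.tau:
                continue               # still in the approximate working set
            if not node.M:
                victim = i             # old and clean: stop searching
                break
            old_dirty.append(i)
        if victim is None and old_dirty:
            victim = old_dirty.pop(0)  # no old clean page: use an old dirty one
        if victim is None:
            # Nothing is older than tau: take the earliest time stamp.
            victim = min(range(n), key=lambda i: self.nodes[i].time)
        written = []
        for i in old_dirty[: self.max_writes]:
            self.nodes[i].M = 0        # write-back modeled as instantaneous
            written.append(self.nodes[i].page)
        self.hand = (victim + 1) % n   # next search resumes where this one stopped
        return victim, written
```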
4.4.10: Summary of Page Replacement Algorithms
Algorithm      Comment
Random         Poor, used for comparison
Optimal        Unimplementable, used for comparison
LIFO           Horrible, useless
NRU            Crude
FIFO           Not good; ignores frequency of use
Second Chance  Improvement over FIFO
Clock          Better implementation of Second Chance
LRU            Great but impractical
NFU            Crude LRU approximation
Aging          Better LRU approximation
Working Set    Good, but expensive
WSClock        Good approximation to working set

4.5: Modeling Paging Algorithms
4.5.1: Belady's anomaly
Consider a system that has no pages loaded and that uses the FIFO
PRA.
Consider the following “reference string” (sequence of
pages referenced).
0 1 2 3 0 1 4 0 1 2 3 4
If we have 3 frames this generates 9 page faults (do it).
If we have 4 frames this generates 10 page faults (do it).
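
The two fault counts can be checked with a short FIFO simulation. This is a sketch; the function name fifo_faults and the constant REFS are hypothetical.

```python
from collections import deque


def fifo_faults(refs, num_frames):
    """Count page faults for FIFO replacement, starting with no pages loaded."""
    frames = deque()  # oldest loaded page at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue  # hit: FIFO ignores references to resident pages
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()  # evict the page that was loaded first
        frames.append(page)
    return faults


REFS = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
```

Here fifo_faults(REFS, 3) returns 9 and fifo_faults(REFS, 4) returns 10, so adding a frame increases the number of faults for this string.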
Theory has been developed and certain PRA (so called “stack
algorithms”) cannot suffer this anomaly for any reference string.
FIFO is clearly not a stack algorithm. LRU is. Tanenbaum has a few
details, but we are skipping them.
Repeat the above calculations for LRU.