Released Monday, March 23, 2015
Due Friday, April 3, 2015, 9:00 PM
This lab will introduce you to another WeensyOS operating system. (Recall that WeensyOS is a series of coding exercises, written by Eddie Kohler. All of this lab, including the description, is due to Eddie.) In this WeensyOS, you will implement process memory isolation, virtual memory, and some system calls.
Some notes:
We recommend doing this lab inside the development platform that you set up in the previous labs (the virtual devbox or the CIMS machines). From within this platform, change to your lab directory, use Git to commit changes you've made since handing in lab 4 (if any), obtain the latest version of the course repository, and create a local branch called lab5 based on origin/lab5 (which is our lab5 branch). The commands are as follows, but you may not need all of them (for example, if git status indicates that there are no changes to commit):
$ cd ~/cs202
$ git commit -am 'my solution to lab4'
Created commit 254dac5: my solution to lab4
 3 files changed, 31 insertions(+), 6 deletions(-)
$ git pull
Already up-to-date.
$ git checkout -b lab5 origin/lab5
Branch lab5 set up to track remote branch refs/remotes/origin/lab5.
Switched to a new branch "lab5"
$ cd lab5
You are now in the lab5 directory.
Run make run. You should see something like this, which shows four versions of the p-allocator process running in parallel:
This image loops forever; in an actual run, the bars will move to the right and stay there. Don't worry if your image has different numbers of K's or otherwise has different details.
If your bars run painfully slowly, edit the p-allocator.c file and reduce the ALLOC_SLOWDOWN constant.
Here’s what’s going on in the physical memory display.
The four processes run the same program (p-allocator.c), but each is linked to use a different region of memory.
Here are two labeled memory diagrams, showing what the characters mean and how memory is arranged.
The virtual memory display is similar.
You will implement complete and correct memory isolation for WeensyOS processes. Then you'll implement full virtual memory, which will improve utilization. You'll implement fork: creating new processes at runtime. Finally, for extra credit, you'll implement exit.
As usual, you will not need to write much code: our solutions contain fewer than 200 lines. All your code goes in kernel.c (except for part of Exercise 6).
Read the README-OS.md file for information on how to run WeensyOS. If QEMU’s default display causes accessibility problems, you will want to run make run-console. To make run-console the default, run export QEMUCONSOLE=1 in your shell.
For debugging convenience, we have provided two printf functions:
debug_printf(format, ...)
log_printf(format, ...)
debug_printf will print one line of output at the bottom of the animation screen. Each time you invoke this function, it will overwrite the previous output.
log_printf will print the output into a file, log.txt.
We recommend using make run-gdb for debugging, as well as adding log_printf statements to your code.
WeensyOS's memory system layout is described by several constants.
KERNEL_START_ADDR | Start of kernel code.
KERNEL_STACK_TOP | Top of kernel stack. The kernel stack is one page long.
console | CGA console memory.
PROC_START_ADDR | Start of application code. Applications should not be able to access memory below PROC_START_ADDR, except for the single page at console.
MEMSIZE_PHYSICAL | Size of physical memory in bytes. WeensyOS does not support physical addresses ≥ MEMSIZE_PHYSICAL. Equals 0x200000 (2MB).
MEMSIZE_VIRTUAL | Size of virtual memory. WeensyOS does not support virtual addresses ≥ MEMSIZE_VIRTUAL. Equals 0x300000 (3MB).
WeensyOS uses several macros to handle addresses. They are defined at the top of x86.h. The most important include:
PAGESIZE | Size of a memory page. Equals 4096 (or, equivalently, 1 << 12).
PAGENUMBER(addr) | The page number for the page containing addr. Expands to something like addr / PAGESIZE.
PAGEADDRESS(pn) | The initial address in page number pn. Expands to something like pn * PAGESIZE.
PTE_ADDR(entry) | The physical address within a page table entry. Remember that a page table entry contains 32 bits, with the bottom 12 bits being configuration and the top 20 bits identifying the page number. This returns the top 20 bits as an address (by masking out, or clearing, the bottom 12 bits).
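To make the macro descriptions concrete, here is a minimal sketch of how such macros could be written. The definitions below are stand-ins written for illustration, not the actual contents of x86.h, and the helpers page_offset and recompose are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins; the real definitions live in x86.h and may differ. */
#define PAGESIZE        4096u
#define PAGENUMBER(a)   ((uintptr_t) (a) / PAGESIZE)
#define PAGEADDRESS(pn) ((uintptr_t) (pn) * PAGESIZE)
#define PTE_ADDR(e)     ((uintptr_t) (e) & ~(uintptr_t) (PAGESIZE - 1))

/* Hypothetical helper: offset of addr within its page (the bottom 12 bits). */
static inline uintptr_t page_offset(uintptr_t addr) {
    return addr & (PAGESIZE - 1);
}

/* Hypothetical helper: split an address into page number and offset,
   then put it back together. */
static inline uintptr_t recompose(uintptr_t addr) {
    uintptr_t rebuilt = PAGEADDRESS(PAGENUMBER(addr)) + page_offset(addr);
    assert(rebuilt == addr);   /* the split loses no information */
    return rebuilt;
}
```

For example, the console page at 0xB8000 has page number 0xB8, and an entry value of 0xB8007 (address 0xB8000 with the three low flag bits set) yields 0xB8000 when passed through PTE_ADDR.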
WeensyOS begins with the kernel and all processes sharing a single address space. This is defined by the kernel_pagetable page table. kernel_pagetable is initialized to the identity mapping: virtual address X maps to physical address X.
As you work through the lab, you will shift processes to using their own independent address spaces, where each process can access only a subset of physical memory.
The kernel, though, still needs the ability to access any location in physical memory. Therefore, all kernel functions run using the kernel_pagetable page table. Thus, in kernel functions, each virtual address maps to the physical address with the same number. The exception function explicitly installs kernel_pagetable when it begins.
WeensyOS system calls are more expensive than they need to be, since every system call switches address spaces twice (once to kernel_pagetable and once back to the process’s page table). Real operating systems avoid this overhead: their kernels access memory using process page tables, rather than a kernel-specific kernel_pagetable. This makes the kernel code more complicated, since kernels can’t always access all of physical memory directly.
WeensyOS processes could stamp all over the kernel’s memory if they wanted. Better stop that. Change kernel, the kernel initialization function, so that kernel memory is inaccessible to applications, except for the memory holding the CGA console (the single page at (uintptr_t) console == 0xB8000).
When you are done, WeensyOS should look like this. In the virtual map, kernel memory is no longer reverse-video, since the user can’t access it. Note the lonely CGA console memory block.
Hints:
Use virtual_memory_map. A description of this function is in kernel.h. You will benefit from reading all the function descriptions in kernel.h.
If you want to look at the code for virtual_memory_map, it is in k-hardware.c, along with many other grody hardware functions.
The perm argument to virtual_memory_map is a bitwise-or of zero or more PTE flags: PTE_P, PTE_W, and PTE_U. PTE_P marks Present pages (pages that are mapped). PTE_W marks Writable pages. PTE_U marks User-accessible pages, that is, pages accessible to applications. You want kernel memory to be mapped with permissions PTE_P|PTE_W, which will prevent applications from reading or writing the memory, while allowing the kernel to both read and write.
Make sure that the sys_page_alloc system call is safe. Applications shouldn’t be able to use sys_page_alloc to screw up the kernel.
Similarly, if the application requests to allocate an address that is the last page in virtual memory space, you should arrange for the system call to return an error (this page will be used as the stack page later).
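The hints above can be sketched in a small user-space model. This is an illustration under stated assumptions, not kernel code: toy_perm is a flat array standing in for a page table, toy_map is a simplified analogue of virtual_memory_map (the real function takes different arguments; see kernel.h), and the value used for PROC_START_ADDR is assumed for the example. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE         4096u
#define MEMSIZE_VIRTUAL  0x300000u
#define PROC_START_ADDR  0x100000u   /* assumed value, for illustration only */
#define CONSOLE_ADDR     0xB8000u    /* (uintptr_t) console */
#define NPAGES           (MEMSIZE_VIRTUAL / PAGESIZE)

#define PTE_P 1u   /* present */
#define PTE_W 2u   /* writable */
#define PTE_U 4u   /* user-accessible */

/* Toy stand-in for a page table: one permission word per virtual page. */
static uint32_t toy_perm[NPAGES];

/* Toy analogue of virtual_memory_map: set perm on [va, va + sz). */
static void toy_map(uintptr_t va, uintptr_t sz, uint32_t perm) {
    assert(va % PAGESIZE == 0 && sz % PAGESIZE == 0);
    for (uintptr_t a = va; a < va + sz; a += PAGESIZE) {
        toy_perm[a / PAGESIZE] = perm;
    }
}

/* Starting point: applications can read and write everything. */
static void toy_reset(void) {
    toy_map(0, MEMSIZE_VIRTUAL, PTE_P | PTE_W | PTE_U);
}

/* Make kernel memory kernel-only, then re-open the single console page. */
static void toy_protect_kernel(void) {
    toy_map(0, PROC_START_ADDR, PTE_P | PTE_W);
    toy_map(CONSOLE_ADDR, PAGESIZE, PTE_P | PTE_W | PTE_U);
}

/* Returns 1 if an application may ask sys_page_alloc for addr, else 0. */
static int toy_page_alloc_addr_ok(uintptr_t addr) {
    return addr % PAGESIZE == 0
        && addr >= PROC_START_ADDR
        && addr < MEMSIZE_VIRTUAL - PAGESIZE;   /* last page is kept for the stack */
}
```

In this model, the check rejects unaligned addresses, anything in kernel memory, and the last virtual page, which are the cases an application could otherwise use to cause trouble.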
Implement process isolation by giving each process its own independent page table. Your OS should look like this:
Thus, each process only has permission to access its own pages. You can tell this because only its own pages are shown in reverse video.
What goes in per-process page tables:
kernel_pagetable.
pageinfo[PN].owner == processid).
Because of how program_load works, you must use addresses in kernel address space (i.e., below PROC_START_ADDR) for the initial processes’ page tables.
pagetable->entry[0] should equal (x86_pageentry_t) address_of_new_l2_pagetable | PTE_P | PTE_W | PTE_U. You need to set this up yourself.
The mappings for addresses below PROC_START_ADDR should be copied from those in kernel_pagetable. You can use a loop with virtual_memory_lookup and virtual_memory_map to copy them. Alternately, you can copy the mappings from the kernel’s page table into the new page table using memcpy. This is faster, but make sure you copy the right data!
The initial mappings for addresses at or above PROC_START_ADDR should be inaccessible to user processes (the PTE_U bit should be cleared in the corresponding page table entries). In our solution (shown above), these addresses are totally inaccessible (they are not mapped, so they show as blank). However, you can implement this differently (for example, by having the inaccessible addresses be mapped but not with PTE_U privilege); if you do that, your display will look different from the animated gifs throughout this lab description.
How to implement per-process page tables:
Change process_setup to create per-process page tables.
We suggest writing a copy_pagetable(x86_pagetable* pagetable, int8_t owner) function that allocates and returns a new page table, initialized as a copy of pagetable. This function will be useful in Exercise 5. In process_setup you can modify the page table returned by copy_pagetable according to the requirements above. Your function can use pageinfo to find free pages to use for page tables. Read about pageinfo at the top of kernel.c.
The macros in x86.h will be handy, particularly PTE_ADDR (see the section on "Address composition" above).
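The copy step can be sketched in a user-space model of two-level page tables. This is not the lab solution: toy_mem plays the role of physical memory (a "physical address" is a byte offset into it), the toy_pageinfo layout and the value of PROC_START_ADDR are assumptions, every toy_ name is hypothetical, and the model ignores the requirement that the real page tables live in kernel address space. One level-2 table covers 1024 × 4KB = 4MB, which is why a single entry[0] is enough for a 3MB virtual address space.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGESIZE        4096u
#define NPAGES          512u        /* 2MB of toy physical memory */
#define NENTRIES        1024u       /* 32-bit entries in one page-table page */
#define PROC_START_ADDR 0x100000u   /* assumed value, for illustration only */
#define PTE_P 1u
#define PTE_W 2u
#define PTE_U 4u
#define PTE_ADDR(e) ((e) & ~(PAGESIZE - 1))

/* Assumed layout; refcount == 0 means the page is free. */
typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;

static uint32_t toy_mem[NPAGES * NENTRIES];
static toy_pageinfo toy_info[NPAGES];

static uint32_t* toy_page(uint32_t pa) {
    assert(pa % PAGESIZE == 0 && pa / PAGESIZE < NPAGES);
    return &toy_mem[pa / sizeof(uint32_t)];
}

/* Claim and zero the first free page; returns its address, or 0 on failure
   (page 0 is never handed out in this toy). */
static uint32_t toy_alloc_page(int8_t owner) {
    for (uint32_t pn = 1; pn < NPAGES; ++pn) {
        if (toy_info[pn].refcount == 0) {
            toy_info[pn].owner = owner;
            toy_info[pn].refcount = 1;
            memset(toy_page(pn * PAGESIZE), 0, PAGESIZE);
            return pn * PAGESIZE;
        }
    }
    return 0;
}

/* Build a two-level table that identity-maps the first 3MB. */
static uint32_t toy_make_identity_table(int8_t owner) {
    uint32_t l1 = toy_alloc_page(owner), l2 = toy_alloc_page(owner);
    assert(l1 && l2);
    toy_page(l1)[0] = l2 | PTE_P | PTE_W | PTE_U;
    for (uint32_t vpn = 0; vpn < 768; ++vpn) {
        toy_page(l2)[vpn] = vpn * PAGESIZE | PTE_P | PTE_W | PTE_U;
    }
    return l1;
}

/* Toy analogue of copy_pagetable: new level-1 and level-2 pages, with the
   level-2 entries copied from the source. Returns 0 if memory runs out. */
static uint32_t toy_copy_pagetable(uint32_t src_l1, int8_t owner) {
    uint32_t l1 = toy_alloc_page(owner);
    if (!l1) return 0;
    uint32_t l2 = toy_alloc_page(owner);
    if (!l2) {
        toy_info[l1 / PAGESIZE].refcount = 0;   /* give the first page back */
        return 0;
    }
    toy_page(l1)[0] = l2 | PTE_P | PTE_W | PTE_U;
    memcpy(toy_page(l2), toy_page(PTE_ADDR(toy_page(src_l1)[0])), PAGESIZE);
    return l1;
}

/* Unmap everything at or above PROC_START_ADDR in a copied table. */
static void toy_clear_user_area(uint32_t l1) {
    uint32_t* l2 = toy_page(PTE_ADDR(toy_page(l1)[0]));
    for (uint32_t vpn = PROC_START_ADDR / PAGESIZE; vpn < NENTRIES; ++vpn) {
        l2[vpn] = 0;
    }
}
```

Note that the copy allocates a fresh level-2 page rather than reusing the source's; sharing the level-2 page would let a change in one address space leak into the other.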
So far, WeensyOS processes use physical page allocation: the page with physical address X is used to satisfy the sys_page_alloc(X) allocation request for virtual address X. This is inflexible and limits utilization. Change the implementation of the INT_SYS_PAGE_ALLOC system call so that it can use any free physical page to satisfy a sys_page_alloc(X) request.
Your new INT_SYS_PAGE_ALLOC code must perform the following tasks.
Find a free physical page using the pageinfo array. Return -1 to the application if you can’t find one. Use any algorithm you like to find a free physical page; we just return the first one we find.
Record the page’s allocation in pageinfo.
Don’t modify the physical_page_alloc helper function, which is also used by the program loader. You can write a new function if you want.
Here’s how our OS looks after this step.
Hints:
Look at the other code in kernel.c for some hints on how to examine the pageinfo array.
A physical page is free if pageinfo[PAGENUMBER].refcount == 0.
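A first-fit search over a pageinfo-like array might look like the sketch below. The toy_pageinfo layout is an assumption (check the real definition at the top of kernel.c), all toy_ names are hypothetical, and mapping the chosen page at the requested virtual address is left out.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE 4096u
#define NPAGES   512u     /* MEMSIZE_PHYSICAL / PAGESIZE = 0x200000 / 4096 */

/* Assumed layout; refcount == 0 means the page is free. */
typedef struct { int8_t owner; int8_t refcount; } toy_pageinfo;
static toy_pageinfo toy_info[NPAGES];

/* Return the page number of the first free physical page, or -1 if none. */
static int toy_find_free_page(void) {
    for (unsigned pn = 0; pn < NPAGES; ++pn) {
        if (toy_info[pn].refcount == 0) {
            return (int) pn;
        }
    }
    return -1;
}

/* Claim a free page for owner; returns its physical address, or -1. */
static long toy_claim_page(int8_t owner) {
    int pn = toy_find_free_page();
    if (pn < 0) {
        return -1;                       /* out of physical memory */
    }
    assert(toy_info[pn].refcount == 0);
    toy_info[pn].owner = owner;          /* record the allocation */
    toy_info[pn].refcount = 1;
    return (long) pn * PAGESIZE;
}

/* Test helper: mark pages [first, first + count) as in use by owner. */
static void toy_mark_used(unsigned first, unsigned count, int8_t owner) {
    assert(first + count <= NPAGES);
    for (unsigned pn = first; pn < first + count; ++pn) {
        toy_info[pn].owner = owner;
        toy_info[pn].refcount = 1;
    }
}
```

The search is linear in the number of physical pages, which is fine for 512 pages; nothing in the lab text requires anything smarter.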
Now the processes are isolated, which is awesome. But they’re still not taking full advantage of virtual memory. Isolated address spaces can use the same virtual addresses for different physical memory. There’s no need to keep the four process address spaces disjoint.
In this step, change each process’s stack to start from address 0x300000 == MEMSIZE_VIRTUAL. Now the processes have enough heap room to use up all of physical memory!
If there’s no physical memory available, sys_page_alloc should return an error to the caller (by returning -1). (Our solution additionally prints “Out of physical memory!” to the console when this happens; you don’t need to.)
We return in this exercise to a topic that we saw earlier in the semester: the fork() system call. In the first WeensyOS lab, you implemented a primitive version of fork. In this exercise, you will implement a more realistic fork; this one will actually give the new process a separate memory address space.
Backing up a bit, recall that fork is one of Unix’s great ideas. It starts a new process as a copy of an existing process. (fork returns in each process, the original and the copy.) To the child process, it returns 0. To the parent process, it returns the child’s process ID.
Now, run WeensyOS with make run or make run-console. At any time, press the ‘f’ key. This will soft-reboot WeensyOS and ask it to run a single p-fork process, rather than the gang of allocators. You should see something like this:
Your job now is to implement (most of) fork.
We have provided part of fork. Specifically, we include the code that initializes the child process’s registers as a copy of the parent process’s registers (and sets reg_eax to 0). We also include the code that looks for a free process slot in the processes[] array. (You implemented these things in the first WeensyOS lab.) If no slot exists, fork returns -1 to the caller.
Make a copy of current->p_pagetable, the forking process’s page table, using your copy_pagetable function from earlier.
fork must examine every virtual address in the old page table. Whenever the parent process has an application-writable page at virtual address V, then fork must allocate a new physical page P; copy the data from the parent’s page into P, using memcpy; and finally map page P at address V in the child process’s page table.
Use virtual_memory_lookup to query the mapping between virtual and physical addresses in a page table.
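The page-copying loop can be sketched in a user-space model. The flat toy_pagetable below stands in for a real two-level table, toy_mem stands in for physical memory, and all toy_ names are hypothetical. The sketch also leaves the console page shared rather than copying it; that special case is an assumption of this example, based on the console being the one page every application may access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGESIZE     4096u
#define NPAGES       512u     /* toy physical pages (2MB) */
#define NVPAGES      768u     /* toy virtual pages (3MB) */
#define CONSOLE_ADDR 0xB8000u
#define PTE_P 1u
#define PTE_W 2u
#define PTE_U 4u
#define PTE_ADDR(e) ((e) & ~(PAGESIZE - 1))

static uint8_t toy_mem[NPAGES][PAGESIZE];
static int8_t  toy_refcount[NPAGES];

/* Flat stand-in for a page table: one entry per virtual page. */
typedef struct { uint32_t entry[NVPAGES]; } toy_pagetable;

/* Claim the first free page; returns its page number, or -1 if none. */
static int toy_alloc(void) {
    for (unsigned pn = 1; pn < NPAGES; ++pn) {
        if (toy_refcount[pn] == 0) {
            toy_refcount[pn] = 1;
            return (int) pn;
        }
    }
    return -1;
}

/* Give child a private copy of every application-writable page in parent.
   Other entries (and the console page) are copied unchanged.
   Returns 0 on success, -1 if physical memory runs out. */
static int toy_fork_copy(const toy_pagetable* parent, toy_pagetable* child) {
    const uint32_t uw = PTE_P | PTE_W | PTE_U;
    for (unsigned vpn = 0; vpn < NVPAGES; ++vpn) {
        uint32_t e = parent->entry[vpn];
        if ((e & uw) != uw || vpn == CONSOLE_ADDR / PAGESIZE) {
            child->entry[vpn] = e;
            continue;
        }
        unsigned src = PTE_ADDR(e) / PAGESIZE;
        assert(src < NPAGES);
        int pn = toy_alloc();
        if (pn < 0) {
            return -1;   /* a full fork would also free what it allocated */
        }
        memcpy(toy_mem[pn], toy_mem[src], PAGESIZE);
        child->entry[vpn] = (uint32_t) pn * PAGESIZE | uw;
    }
    return 0;
}
```

After the copy, the child's entry for V points at a different physical page with the same contents, so a write by either process is invisible to the other.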
When you’re done, you should see something like this after pressing ‘f’.
An image like this means you forgot to copy the data for some pages, so the processes are actually sharing stack and/or data pages:
This exercise is an honors supplement (meaning required for the students in section 001, and does not "count" for the students in section 002). However, we encourage everyone to do it, to see how fork would really work.
It’s wasteful for fork() to copy all of a process’s memory. For example, most processes, including p-fork, never change their code. So what if we shared the memory containing the code? That’d be fine for process isolation, as long as neither process could write the code.
Change the process loader in k-loader.c to detect read-only program segments and map them as read-only for applications (PTE_P|PTE_U). A program segment ph is read-only iff (ph->p_flags & ELF_PFLAG_WRITE) == 0.
Your fork() code shouldn’t copy shareable pages, but it should keep track of the number of active references to each user page. Specifically, if pageinfo[pn].refcount > 0 and pageinfo[pn].owner > 0, then pageinfo[pn].refcount should equal the number of times pn is mapped in process page tables. Note: the code does not have "Your code here" for this piece because there are several options. Depending on your implementation, you may need to modify any or all of copy_pagetable, fork, or helper functions called by these. Other possibilities are conceivable too.
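One of the several possible approaches to the reference-count bookkeeping is sketched below in a user-space model. The flat toy_pagetable and the toy_refcount array are stand-ins for the real page tables and pageinfo, and all toy_ names are hypothetical. The last function counts mappings so that the invariant stated above can be checked directly.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE 4096u
#define NPAGES   512u     /* toy physical pages */
#define NVPAGES  768u     /* toy virtual pages */
#define PTE_P 1u
#define PTE_W 2u
#define PTE_U 4u
#define PTE_ADDR(e) ((e) & ~(PAGESIZE - 1))

static int8_t toy_refcount[NPAGES];

/* Flat stand-in for a page table: one entry per virtual page. */
typedef struct { uint32_t entry[NVPAGES]; } toy_pagetable;

/* A page can be shared when applications can read it but not write it. */
static int toy_is_shareable(uint32_t e) {
    return (e & (PTE_P | PTE_U)) == (PTE_P | PTE_U) && !(e & PTE_W);
}

/* Map every shareable parent page into child and bump its refcount.
   Returns the number of pages shared. */
static unsigned toy_share_pages(const toy_pagetable* parent, toy_pagetable* child) {
    unsigned shared = 0;
    for (unsigned vpn = 0; vpn < NVPAGES; ++vpn) {
        uint32_t e = parent->entry[vpn];
        if (toy_is_shareable(e)) {
            unsigned pn = PTE_ADDR(e) / PAGESIZE;
            assert(pn < NPAGES && toy_refcount[pn] > 0);   /* parent holds a reference */
            child->entry[vpn] = e;
            ++toy_refcount[pn];
            ++shared;
        }
    }
    return shared;
}

/* Count how many user-accessible entries in pt map physical page pn. */
static unsigned toy_count_mappings(const toy_pagetable* pt, unsigned pn) {
    unsigned n = 0;
    for (unsigned vpn = 0; vpn < NVPAGES; ++vpn) {
        uint32_t e = pt->entry[vpn];
        if ((e & (PTE_P | PTE_U)) == (PTE_P | PTE_U) && PTE_ADDR(e) / PAGESIZE == pn) {
            ++n;
        }
    }
    return n;
}
```

Summing toy_count_mappings over all process tables and comparing with the stored reference count is a cheap consistency check to run while debugging.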
When you’re done, running p-fork should look like this:
Each process’s virtual address space begins with a darker-colored “1”. The dark color indicates that the corresponding physical page has reference count (refcount) greater than 1. (The color difference is only visible on graphical QEMU; the console version doesn’t distinguish between light reverse-video and dark reverse-video.)
Hint:
In k-loader.c, mark a program segment read-only after the memcpy and memset operations that add data to the segment. Otherwise you’ll get a fault.
So far none of your test programs have ever freed memory or exited. Memory allocation’s pretty easy until you add free! So let’s do that, by allowing applications to exit. In this exercise you’ll implement the sys_exit system call, which exits the current process.
We hope everyone tries this exercise, but it is optional, and definitely harder than the others. Freeing memory will tend to expose weaknesses and problems in your other code.
To test your work, use make run and then type ‘e’. This reboots WeensyOS to run the p-forkexit program. (Initially it’ll crash because sys_exit() isn’t implemented yet.) p-forkexit combines two types of behavior:
The result is that once your code is correct, p-forkexit makes crazy patterns forever. An example:
Your picture might look a little different; for example, thanks to Exercise 6, your processes should share a code page, which would appear as a darker-colored “1”.
Here’s your task.
sys_exit should mark a process as free and free all of its memory. This includes the process’s code, data, heap, and stack pages, as well as the pages used for its page directory and page table pages.
In p-forkexit, unlike in previous parts of the lab, sys_fork can run when there isn’t quite enough memory to create a new process. Your code should handle this case. If there isn’t enough free memory to allocate a process, fork() should clean up after itself (i.e., free any memory that was allocated for the new process before memory ran out), and then return -1 to the caller.
The virtual_memory_check function, which runs periodically, should help catch some errors. Feel free to add checks of your own.
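The memory-release half of sys_exit can be sketched in a user-space model. The flat toy_pagetable has no page directory or page table pages of its own, so freeing those is not shown. The toy_owner and toy_refcount arrays stand in for pageinfo, the value 0 for a free owner is an assumption, and all toy_ names are hypothetical. Only pages with owner > 0 are treated as process pages, following the invariant from the previous exercise.

```c
#include <assert.h>
#include <stdint.h>

#define PAGESIZE 4096u
#define NPAGES   512u     /* toy physical pages */
#define NVPAGES  768u     /* toy virtual pages */
#define TOY_FREE 0        /* assumed owner value for a free page */
#define PTE_P 1u
#define PTE_W 2u
#define PTE_U 4u
#define PTE_ADDR(e) ((e) & ~(PAGESIZE - 1))

static int8_t toy_owner[NPAGES];
static int8_t toy_refcount[NPAGES];

/* Flat stand-in for a page table: one entry per virtual page. */
typedef struct { uint32_t entry[NVPAGES]; } toy_pagetable;

/* Drop one reference to page pn. Returns 1 if the page became free. */
static int toy_page_unref(unsigned pn) {
    assert(pn < NPAGES && toy_refcount[pn] > 0);
    if (--toy_refcount[pn] == 0) {
        toy_owner[pn] = TOY_FREE;
        return 1;
    }
    return 0;
}

/* Remove every user mapping from pt, dropping a reference on each
   process-owned page. Returns the number of pages that became free. */
static unsigned toy_release_user_pages(toy_pagetable* pt) {
    unsigned freed = 0;
    for (unsigned vpn = 0; vpn < NVPAGES; ++vpn) {
        uint32_t e = pt->entry[vpn];
        if ((e & (PTE_P | PTE_U)) != (PTE_P | PTE_U)) {
            continue;                      /* not a user mapping */
        }
        unsigned pn = PTE_ADDR(e) / PAGESIZE;
        if (pn < NPAGES && toy_owner[pn] > 0) {
            freed += (unsigned) toy_page_unref(pn);
        }
        pt->entry[vpn] = 0;
    }
    return freed;
}
```

A routine of this shape could plausibly also serve as the cleanup path when fork runs out of memory partway through, since both cases need to undo a partially or fully built address space.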
sysenter and sysexit instructions!
If you are working in a pair, only one of the team members should submit.
If you have not done so yet, commit your code.
$ git status  # see what files you changed
$ git diff    # examine your changes
$ git commit -am "My solutions to lab5"
Then, follow these steps:
$ make handin
This command removes binaries from your directory, archives your directory in a tarball called lab5-handin.tgz, and invokes a script that submits this tarball to our submission server.
Assuming that you performed the submission setup step in lab1, you should see something like this:
CLEAN
CREATE lab5-handin.tgz
SUBMITTING ...
... progress ...
Submission successful
After a successful submission, you should receive email confirmation (at your NYU email address).
If you submit multiple times, we will take the latest submission and count slack hours accordingly.
This completes the lab.
Last updated: Mon May 04 11:24:46 -0400 2015