CS202: HW 9: mmap(), context switches, I/O, disk performance

CS202: HW 9: mmap(), context switches, I/O, disk performance

These problems should be done on your own. They are not to be turned in. Getting help from AI (besides being ruled out by the course policies) will actually hurt you, since the point of these questions isn’t so you can deliver an answer somewhere and check a box; it’s for you to gain the practice and experience of working through the problems. You will need that general skill (of absorbing something by practicing it) in life, and in this semester you will need the specific problem-solving skills that are emphasized in these homeworks.

mmap()

Consider the following code excerpt that uses mmap():

int main(int argc, char* argv[]) {
    // ...
    // readme.txt is a file that is 5KB in length.
    int fd = open("readme.txt", O_RDWR, 0);

    char *map1 = (char*)mmap(NULL, 5120, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *map2 = (char*)mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    // assume that neither mmap call fails

    char x = 0;
    char y = 0;
    // Line 1
    for (int i = 0; i < 5120; i++) {
        x = map1[i];
        y = map2[i % 4096]; 
    }
    // Line 2
    // ...
}

// The signature of mmap is:
//
//   void* mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset);
//   
// If addr is NULL, the kernel chooses the start virtual address.
//
// If not already in memory, disk blocks are fetched into physical pages on access.
//
// Each separate call to mmap() with the same fd maps the file in a separate
// location and does not undo prior mappings. This means that in the example
// above, the file can be read and written from multiple virtual addresses
// within the same address space.

Assume the operating system minimizes the number of virtual and physical pages required in order to implement mmap().

  1. How many last-level (level-4) page table entries are created or modified as a result of the two mmap() calls?

  2. At line 2, how many physical pages are mapped into the process as a result of the mmap() calls?

Context switches

In this question, you will implement swtch(), which switches between two user-level threads. You will do so for a user-level threading package, running on the TeensyArch processor. TeensyArch has 4 general registers, %r0-%r3, a stack pointer, a base (or frame) pointer, and an instruction pointer %rip. Assume the same stack frame structure as the architecture we’ve been covering in class (x86); further, all registers need to be saved by a function’s callee (that is, registers are callee-saved, also known as call-preserved).

Fill out swtch(). Below are definitions, declarations, and utility functions that you can use.

struct thread {
    int thread_id;
    uint64_t stack;
    /* ... */
};

enum register {
    R0,
    R1,
    R2,
    R3,
    RBP,
    RSP
};

// Push CPU's register r to the stack
void push_register(register r);

// Pop from the stack and into the CPU's register r
void pop_register(register r);

// Returns the CPU's current value of register r. 
uint64_t read_register(register r);

// Update the CPU's register r so it holds value `value`.
void write_register(register r, uint64_t value);


// Context switch from thread t1 to thread t2.
void swtch(struct thread *t1, struct thread *t2) {
    // On entry this function is run by thread t1.

    // Your code here. We have started it for you.
    push_register(RBP);
    push_register(R0);

    // YOUR CODE HERE

















    return; // The function should return to the
            // point where thread t2 called swtch().
}

Polling vs. interrupts

As discussed in class, two ways for an operating system to become aware of external events associated with a device are interrupts and polling. We observed that if a computer were receiving many interrupts, it might spend all of its time processing them and not get other work done; in that case, the operating system should switch to polling the device. Now consider the following:

  1. How many interrupts per second would your typing generate on average? Show your work.
  2. Should the computer use polling or interrupts to handle your fast typing? Explain why your choice is acceptable and the other choice is not. Do not use more than three sentences.

Disk performance

Consider a disk with the following characteristics:

  1. What is the storage capacity of the disk in bytes or gigabytes? (Explain briefly.)
  2. What is the sequential transfer bandwidth, expressed in bytes/second or megabytes/second? (Explain briefly.)
  3. Now assume that the disk with the above characteristics is given a never-ending stream of requests to read one sector at a time, with each request chosen randomly from all possible sectors on the disk. Assume that these read requests are scheduled in FIFO order. State the effective long-term transfer rate that the disk can sustain, expressed in bytes/second or kilobytes/second, and explain briefly.

In doing the third question, the following may be useful:

  • You can (and probably should) make several percentage point approximations, for example 4096 can be represented as 4000, and 13 × 7.5 is approximately 100.
  • The term "long-term transfer rate" refers to R/X, where R is the number of bytes to transfer in each read, and X is the average length of time that a read takes.

Disk scheduling

Disk requests come into the disk driver for tracks 10, 22, 20, 2, 40, 6, and 38, in that order. A seek takes 6 msec per track moved (note that, among other simplifications, we are not taking into account the length of the seek). How much seek time is needed for the following scheduling algorithms?

(a) SSTF

(b) LOOK (SCAN, but doesn't move to the end)

In all cases, the arm is initially at track 20. (Note that SCAN is a synonym for the elevator algorithm.)

Adapted from Tanenbaum Chapter 5 Number 24.