Class 17
CS 202
03 November 2025

On the board
------------

1. Last time

2. Software architecture: device drivers

3. Synchronous vs asynchronous I/O

4. User-level threading, intro

5. Context switches (user-level threading)
	- swtch()
	- yield()
	- I/O

6. Cooperative multithreading

7. Preemptive user-level multithreading

---------------------------------------------------------------------------

1. Last time

concurrency is hard; real-world example from a few weeks ago:

    https://aws.amazon.com/message/101925/

    DNS Planner/Enactor:
        fast enactor interleaved with slow enactor
        fast enactor's cleanup deleted the "active" (but stale) plan

    the architectural problem is that the versioning happened on a read
    from shared storage, as opposed to at the instant of application at
    the endpoint. the precise bug was a concurrency issue.

2. Software architecture: device drivers

The examples on the handout are simple device drivers.

Device drivers in general solve a software engineering problem ...

    [draw a picture]

    they expose a well-defined interface to the kernel, so that the
    kernel can make comparatively simple read/write calls or whatever.
    For example: reset, ioctl, output, read, write, handle_interrupt().

    this abstracts away nasty hardware details so that the kernel
    doesn't have to understand them.

    When you write a driver, you are implementing this interface, and
    also calling functions that the kernel itself exposes.

... but device drivers also *create* software engineering problems.

    Fundamental issues:

    --Each device driver is per-OS and per-device (often can't reuse
      the "hard parts").

    --They are often written by the device manufacturer (the core
      competence of device manufacturers is hardware development, not
      software development).

    --Under conventional kernel architectures, bugs in device drivers
      -- and there are many, many of them -- bring down the entire
      machine.

So we have to worry about potentially sketchy drivers ...

... but we also have to worry about potentially sketchy devices.
    --a buggy network card can scribble all over memory (solution: use
      an IOMMU; advanced topic)

    --plug in your USB stick: it claims to be a keyboard and starts
      issuing commands. (An IOMMU doesn't help you with this one.)

    --plug in a USB stick: if it's carrying a virus (aka malware), your
      computer can now be infected. (Iranian nuclear centrifuges are
      thought to have been attacked this way -- the Stuxnet example.
      Unfortunately for us, the same attacks could work against our
      power plants, etc.)

3. Synchronous vs asynchronous I/O

- A question of interface.

- NOTE: the kernel never blocks when issuing I/O. We're discussing the
  interface presented to user-level processes.

- Synchronous I/O: system calls block until they're handled.

- Asynchronous I/O: I/O doesn't block. For example, if a call like
  read() or write() _would_ block, then it instead returns immediately
  but sets a flag indicating that it _would_ have blocked. The process
  discovers that data is ready either by making another query or by
  registering to be notified by a signal (we discuss signals later).

- Annoyingly, the standard POSIX interface for files is blocking,
  always. You need platform-specific extensions to POSIX to get async
  I/O for files. (Although in the user-level threading discussion, we
  will assume a non-blocking read(). This isn't a total abuse, because
  read() can be set to be non-blocking if the fd represents a device,
  pipe, or socket.)

- Pros/cons:

    - a blocking interface leads to more readable code, when
      considering the code that invokes that interface.

    - but blocking interfaces BLOCK, which means that the code _above_
      the interface cannot suddenly switch to doing something else. If
      we want concurrency, it has to be handled by a layer _underneath_
      the blocking interface.

    - We'll see an example of this later.

4. User-level threading

Setting: there's a _threading package_.

--Review: what *is* a kernel-managed thread?
(We refer to that as "kernel-level threading.")

    --basically the same as a process, except that two threads in the
      same process have the same value for %cr3
    --recall: kernel threads are always preemptive

--We can also have *user*-level threading, in which the kernel is
  completely ignorant of the existence of threading.

    [draw picture]

        T1   T2   T3
        thr package
        OS
        H/W

    --in this case, the threading package is the layer of software that
      maintains the array of TCBs (thread control blocks)

    --the threading package has other responsibilities as well:
        --make a new stack for each new thread
        --scheduling!

    --user-level threading can be non-preemptive (cooperative) or
      preemptive. we'll look at both.

    --but first, revisit context switches, this time for user space.

5. Context switches in user space

Note a point of confusion: a "context switch" is really two things:

    - switching the view of memory (%cr3)
    - switching the registers

The first one isn't relevant for user-level threading.

Workhorse: swtch() switches the registers. [draw pictures; see handout]

swtch() is called by yield(). [see handout]

yield() is called by any thread that couldn't make further progress.

Good example of simultaneous use of a synchronous and an asynchronous
interface. [see handout]

(Kernels also need swtch(), to switch the *kernel* stack.

    [for those who want to learn more: here is an example in an
     instructional OS; see MIT's xv6:

        x86 version:
        https://pdos.csail.mit.edu/6.828/2018/xv6.html
        https://pdos.csail.mit.edu/6.828/2018/xv6/xv6-rev7.pdf

        RISC-V version:
        https://pdos.csail.mit.edu/6.828/2019/xv6.html

     search for uses of swtch().]

 WeensyOS doesn't use this mechanism because there is only a single
 kernel stack.)

6. Cooperative multithreading

--This is also called *non-preemptive multithreading*.

--It means that a context switch takes place only at well-defined
  points: when the thread calls yield() and when the thread would block
  on I/O.

7.
Preemptive multithreading at user level

How can we build a user-level threading package that does context
switches at any time? We need to arrange for the package to get
interrupted. How? Signals!

    --Deliver a periodic timer signal to the thread scheduler
      [setitimer()]. When the scheduler gets its signal, it swaps out
      the current thread and runs another one.

--This makes programming with user-level threads more complex: it has
  all the complexity of programming with kernel-level threads, but few
  of the advantages (except perhaps performance, from fewer system
  calls).

--In practice, systems aren't usually built this way, but sometimes it
  is what you want (for example, if you're simulating some OS-like
  thing inside a process, and you want to simulate the non-determinism
  that arises from hardware timer interrupts).

A larger point: signals are instructive, and are used for many things.
What a signal is really doing is abstracting a key hardware feature:
interrupts. So this is another example of the fact that the OS's job is
to give a user-space process the illusion that it's running on
something like a machine, by creating abstractions. In this example,
the abstraction is the signal, and the thing that it's abstracting is
an interrupt.