Class 1
CS 202
29 January 2015

On the board
------------

CS202: Operating Systems
Instructor: Michael Walfish
TAs: Yang Cui, Ye Ji, Ryan Keavney, Cheng Tan
http://www.cs.nyu.edu/~mwalfish/classes/15sp

1. Introduction and goals
2. What is an operating system?
3. Course mechanics
4. Quick poll
5. Intro to processes
6. Questionnaire

---------------------------------------------------------------------------

1. Introduction and goals

    --Hello

    --Introduce TAs

    --Class goals

        a. learn how systems, especially operating systems, work

        b. learn abstractions and concepts in operating systems...

        c. ... which will be useful beyond the kernel (other large-scale
           systems, etc.)

2. What is an operating system?

    [draw picture of hardware (memory, CPU, disk, I/O), OS, user-level
    programs]

    [OS includes OS services: processes, virtual memory, file system
    (file contents, directories and file names), security, networking,
    inter-process communication, time, terminals, etc.]

    --definition: An operating system implements a virtual machine that
    is (ideally) easier to program than the raw hardware

    --layer of software; usually has to deal with the messy details of
    the hardware

    --In some sense: OS is just a software engineering problem: how do
    you convert what the hardware gives you into something application
    programmers want?

    --Job of OS: provide services to user-level programs

    --hardware consists of resources:
        CPU, memory, disk, I/O devices
        did I leave anything out?

      hardware is nasty to program directly!
        --consider what is involved in getting things to the disk.....

    --this software is classically described as doing two things:

        * managing the resources of the machine

            --example: scheduling -- give every process some of the CPU
            --example: virtual memory: give every process some memory

        * abstracting the hardware

            --hide details of hardware (OS gets dirty; programmers don't)

    --what are examples of managing resources and providing abstractions?
        scheduling, file systems, memory, I/O

        * scheduling:

            --abstraction: process has the illusion that it is running
            continuously; it is not

            --isolation: user program that is hogging CPU gets switched
            out in favor of another user's program

        * file systems:

            --abstraction: illusion that a file is a continuous array of
            bytes; it's not

                fd = open("/tmp/foo", O_WRONLY)
                rc = write(fd, "abc...z", 26)

              abstractions here:
                --files
                --file descriptors
              applications see these abstractions through system calls

            --isolation: user program can't write to a file unless it
            has permission

        * memory:

                movl 0x1248, %edx

              [means "get contents of 0x1248 and put in %edx"]

            --abstraction: user program thinks it is reading from
            0x1248; it is not

            --more generally, user program thinks it has a linear,
            contiguous address space; it does not

            --isolation: user program can't write to another user's
            memory

        * I/O:

            --abstracting the hardware:

              example:

                # fd = open("/dev/audio", O_WRONLY);
                # write(fd, , ....);

              versus

                outb(0x74, 123)
                outb(0x83, ....)
                .....

              --note:

                --operating system makes the device look like a file!
                (obvious in hindsight. not at the time this design idea
                was introduced).

                advantages:

                --files and device I/O use the same programming
                interface

                --names have same syntax and meaning, so programs can
                take either as a parameter. that is, the **same**
                program works, whether it's being asked to read or
                write a file or an I/O device.

                --above, we could have had open("/dev/tty", O_WRONLY)
                or open("/tmp/foo", O_WRONLY) and the rest of the code
                would have been unaffected

            --managing resources: what happens if every process tries
            to write to the screen at once?

    result of abstraction and resource management:

        --applications are easier to write
        --impact of bugs reduced
        --machine's resources shared, which is an efficiency gain

3. Course mechanics and admin

    [going through this quickly]

    [write on the board]

    course Web page: please check it every day.
    we will communicate three ways:
        Web page
        Piazza
        email

    components of the course:
        --lectures
        --review sessions: 1 hr; reinforce material; lab help
        --labs
        --exams
        --reading
        --homeworks
        --honors section: supplementary problems

        grading
        policies

    --lectures:
        --please ask questions
        --will write on board; paces lecture
        --will put lecture notes online
        --no laptops

    --review sessions (not required but highly recommended)
        --review material
        --work problems
        --discuss labs

    --labs:
        --key piece of the course
        --some of our labs will be miniOSes that boot on an x86!
        --this should actually be fun
        --acknowledgment: Eddie Kohler
        --often: not much code to write (but lots to learn!)
        --I used to tell people, "Start early". Now I say, "Start on
        time".
        --Regardless, you need to allocate time.
        --I'm expecting you to feel challenged by the labs. The concept
        of "no pain no gain" applies to learning.

    --exams: see course Web page

    --reading: see course page

    --homeworks: see course page

    --honors section: see course page

    --grading: see course page

    --policies: please see the Web page, and let me say here:

        --help on the labs: please ask for help, but please make sure
        that you've really thought through your question on your own.

        --the collaboration and academic integrity policy is real

4. Quick poll

    Free slots...?

5. Intro to processes

    * what is a process?

        --answer: instance of a running program
            browser, text editor, etc.

        --the program "sees" an abstraction of a virtual machine
        (virtual memory, virtual CPU, including registers, etc.).

    * where do they come from?

        they create each other!

            fork()

        [
                                                 ?????
            HUMAN --> SOURCE CODE --> EXECUTABLE -----------> PROCESS

                   vi          gcc          as           ld
            HUMAN ---> foo.c  ---> foo.s  ----> foo.o  ---> executable

            How does the ???? work?

            NOTE: 'ld' is the name of the linker. it stands for
            'linkage editor'.
        ]

    * can ask OS for help, via system calls. talk about those more
    next time.
    * state diagram

        [draw running/ready/blocked diagram]

        NOTE: "waiting"=="blocked"

        NOTE: preemption is part of this picture. We will come back to
        this.

    * implementation

        [draw picture of process array. NPROC elements. each holds one
        of these:

                PCB
            -----------------
            | process id    |
            | state         |   (running, ready, blocked, etc.)
            | user id       |
            | IP            |
            | open file     |
            | VM structures |
            | registers     |
            | .....         |   (signal mask, terminal, priority, ...)
            -----------------

        ]

        what is labeled "PCB" (process control block) above is called
        "proc" in Unix, "task_struct" in Linux, and "process_t" in lab1.

        some important details:

            each process has at any given time:

                stack pointer
                frame pointer
                other registers
                view of memory, which includes:
                    program code (aka "text")
                    constants
                    zeroed-out area for variables
                    stack
                    heap
                state of OS resources

            very little else is actually needed, but a modern process
            does have a lot of associated information:

                --signal state
                --UID, signal mask, controlling terminal, priority,
                whether being debugged, etc., etc.