On the board
------------

CS372H: Introduction to Operating Systems: Honors
Instructor: Michael Walfish
TA: Namrata Baliga
http://www.cs.utexas.edu/~mwalfish/classes/s10-cs372h

1. Introduction and goals
2. What is an operating system? Why study them? How will we study them?
3. Course mechanics and admin
4. Examples and history

---------------------------------------------------------------------------

Welcome!

1. Introduction and goals

    --Hello

    --Class goals

        a. you'll learn about operating systems concepts
            --useful beyond the kernel

        b. you'll learn about operating systems abstractions

        c. you'll build one, which will both enhance your understanding
           of the above two and tell you what's really going on. more to
           say about this in a bit.......

2. What is an operating system? Why study them? How will we study them?

    --What is an OS?
    --Why study them?
    --How will we study them?

    A. What is an OS?

        [draw picture of hardware (memory, CPU, disk, I/O), OS,
         user-level programs]

        --definition: An operating system implements a virtual machine
          that is (ideally) easier to program than the raw hardware

        --layer of software; usually has to deal with the messy details
          of the hardware

        --In some sense: OS is just a software engineering problem: how
          do you convert what the hardware gives you into something
          application programmers want?

        --Job of OS: provide services to user-level programs

        --hardware consists of resources: CPU, memory, disk, I/O devices
            did I leave anything out?
            hardware is nasty! consider what is involved in getting
            things to the disk.....

        --this software is classically described as doing two things:

            * managing the resources of the machine (often means
              protection)
                --example: scheduling -- give every process some of the CPU

            * abstracting the hardware
                --hide details of hardware (OS gets dirty; programmers
                  don't)

        --so who cares about these things?

            --well, resource management means that one bad program
              doesn't screw up another: isolation, protection, etc.
            --and abstracting the hardware is about convenience
                --you really don't want applications to have to program
                  the bare hardware
                --would lead to lots of repeated code
                --and unclear how to run more than one application at
                  once, particularly applications that are mutually
                  distrustful.

        --what are examples of managing resources and providing
          abstractions? memory, scheduling, file systems, I/O

            * memory:

                movl 0x1247, %edx
                [means "get contents of 0x1247 and put in %edx"]

                --user program thinks it is reading from 0x1247; it is not
                --more generally, user program thinks it has a linear,
                  contiguous address space; it does not
                --user program can't write to another user's memory

                ***these are different points***

            * scheduling:

                --user program has the illusion that it is running
                  continuously; it is not.
                --user program gets switched out by the kernel in favor
                  of another user's program
                    --while (1) {}

                ***these are different points***

                --you could imagine telling the user program that it was
                  getting switched out

            * file systems:

                --user program can't write to a file unless it has
                  permission
                --user program gets illusion that file is contiguous on
                  the disk

            * I/O:

                --managing resources: what happens if every process
                  tries to write to the screen at once?

                --abstracting the hardware: example:

                    # fd = open("/dev/foo", O_WRONLY);
                    # write(fd, , ....);

                    --[foo can be audio or tty]
                    --operating system makes the device look like a
                      file! (obvious in hindsight. not at the time.)

                  advantages:
                    --files and device I/O use the same programming
                      interface
                    --names have same syntax and meaning, so programs
                      can take either as a parameter.

                  that is, the **same** program works, whether it's
                  being asked to read or write a file or an I/O device.

                  another example:

                    fd = open("/tmp/foo", O_WRONLY)
                    rc = write(fd, "abc...z", 26)

                *** turns out that file descriptors are an inspired
                    choice ***

    B. Why study them?

        a. ideas are everywhere: resource management and abstraction,
           for example.

        b. understand fundamental design trade-offs
            --between performance and simplicity
            --between putting functions in hardware or software
            --etc.

        c. useful to know "how things work". peek under the hood.

        d. at the same time, there are still unsolved problems!
            --OS security is sort of a joke. people defend their
              operating systems by deploying firewalls.
            --multicore: yeah, the operating systems run, but there's an
              argument that perhaps the abstractions should be different
              (interaction between caches and address space abstractions)

    C. What level will we study at?

        a. learn how to approach problems
            --fundamental issues
            --concept of a design space
            --case studies: sometimes, state-of-the-art techniques

            goal: when you're faced with a similar problem, you will be
            able to devise a good solution

            timescale: big long-term payoff

        b. specific techniques you should be able to apply
            --time-tested solutions to hard problems
            --avoid "hacks"
            --examples: concurrent programming, transactions, etc.

        c. details of modern OSes
            --file systems, network stacks, ...
            --lots of material, changes relatively quickly
            --not a priority of this class
            --but we will sometimes use "real" examples to motivate the
              fundamental ideas

3. Course mechanics and admin

    --write on the board:

        course Web page: please check it every day. primary way that we
        will communicate

        components of the course:
            --reading
            --labs
            --homeworks
            --exams
            --lectures
        grading
        policies

    --reading: see Web page. will try not to make this too much of a
      burden.

    --labs:
        --centerpiece of the course. very awesome set of labs: JOS!!
            --exposes hardware to user-level programs (few abstractions
              provided by the kernel, but you will build them in user
              space.)
            --but does implement protection and resource management
            --unlike much OS courseware, this one will boot on an x86!!!
        --**PLEASE** start these early
        --can guarantee you that if you start them early and leave
          enough time, you will get a lot out of this class.
        --lab 1: lots to learn; not much code to write
        --want to see how first assignment goes. may use pair
          programming after that.

    --homework:
        --mainly self assessments. you should do them, but we don't
          collect them

    --exams:
        --ad-hoc quizzes
        --in-class midterm
        --final during finals week

    --lectures: will try to encourage discussion
        --please ask questions!!
        --will tend to write on board; paces lecture
        --will put lecture notes online but they will be unpolished
        --am reserving the right to say "no laptops"

    --grading:
        --40%: labs
        --40%: final exam
        --15-20%: midterm

    --policies: please see the Web page

4. Examples and history

    A. technology changes --> techniques need to change.

        --typical academic computer in 1981, 1996, 2005:

          [MIPS: sort of meaningless. but perhaps better than clock
           speed.]

                            1981          1996          2005
          MIPS              1-3           500           15,000       [factor: 100]
          DRAM capacity     128 KB        128 MB        1024 MB      [factor: 1,000]
          Disk capacity     10 MB         4 GB          400 GB       [factor: 40,000]
          Net b/w           9600 bits/s   100 Mbit/s    1000 Mbit/s  [factor: 100,000]
          # users/machine   100           < 1           < 1          [factor: 100]

        --phase 1: hardware expensive, humans cheap
            --computers cost millions of $ --> optimize to make most
              efficient use of hardware

        --phase 2: hardware cheap, humans expensive
            --terminals to let multiple users interact with system at
              same time
            --sacrifice CPU time to get better response time

        --phase 3: hardware very cheap, humans very expensive
            --personal computing
            --computers are cheap, so give everyone a computer
            --initially, don't worry about multiprogramming and
              protection (one person per computer)
            --but humans are expensive, don't want to waste their time
              by letting programs crash each other, so add back in
              memory protection, multiprogramming

        --phase 4: distributed systems: computers so cheap, everyone
          gets a bunch of them
            --now have to ask how to share resources and data easily
            --how to coordinate them?
            --networks cheap too, so we can coordinate geographically
              distributed machines

    B. OSes

        --1960s: batch systems
            --IBM 7094, from early 1960s
            --you've all heard of punchcards
            --kind of a long edit/run/debug cycle

        --1960s: CTSS: first time-sharing system
            --now multiple programs could run at once!
            --opens the door to interactivity
            --everyone gets a terminal!!

        --mid 1960s: project MAC at MIT builds Multics
            --time-sharing but expensive
            --the model was like an electric power plant: all of
              Massachusetts gets a Multics machine, etc.
            --Multics a collaboration between MIT and Bell Labs
            --Issues: complex, expensive, slow.
            --announced in 1963; ran in 1969

        --1969 to early 1970s: Unix (from Thompson and Ritchie at Bell
          Labs)
            --goal was to run on cheap hardware. PDP-7
            --eventually got a PDP-11: only $40k! (which was $173k in
              2008 dollars)
            --small system: 9000 LOC
            --Eventually, Thompson and Ritchie decide they need a new
              programming language to write Unix in
                --So C is born
            --Unix took over the world.
                --Its abstractions are still in use everywhere
                --(Which is arguably depressing)
            --great example of a small number of mechanisms going very
              far (high ratio of capabilities to mechanism)

                a. stuff was added to Unix in the 1980s, at Berkeley

                b. Andy Tanenbaum: "System 7 was a dramatic improvement
                   over its predecessors, and over its successors as
                   well".

        --meanwhile, in the 1970s, at Xerox PARC, people are trying to
          build the personal computer....
            --keep in mind that at the beginning of this era, telling
              someone that you were going to build a personal computer
              was like telling them that you would have a personal
              nuclear reactor
            --computers were totally slow when they started but those
              designers knew that computers would be fast, so start
              building the thing people want and worry later about
              performance
            --eventually leads to the graphical interfaces we have today
              as well as modern technology
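
---------------------------------------------------------------------------

Aside, going back to the I/O example in 2.A (the OS makes a device look like a file). This is a minimal sketch, not something shown in lecture. It uses Python's os module only because os.open and os.write are thin wrappers over the same open/write system calls. The helper name write_alphabet and the temporary file are assumptions made for illustration, and the null device (os.devnull, i.e. /dev/null on Unix) stands in for an audio or tty device.

```python
import os
import string
import tempfile


def write_alphabet(path):
    """Open `path` write-only and write the 26 lowercase letters to it.

    The code is identical whether `path` names a regular file or a device.
    Returns the number of bytes the kernel reports as written.
    """
    data = string.ascii_lowercase.encode("ascii")  # the 26 bytes a..z
    fd = os.open(path, os.O_WRONLY)  # a file descriptor is a small integer
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # A regular file: hypothetical temporary path created just for this demo.
    tmp_fd, tmp_path = tempfile.mkstemp()
    os.close(tmp_fd)
    print("regular file:", write_alphabet(tmp_path))

    # A device: the null device discards whatever is written to it.
    print("device:", write_alphabet(os.devnull))

    os.remove(tmp_path)
```

Note that write_alphabet takes only a name. Nothing in it knows or cares whether the name refers to a file or a device, which is the point of the "names have same syntax and meaning" bullet.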