Class 2
CS 372H
21 January, 2010

On the board
------------

http://www.cs.utexas.edu/~mwalfish/classes/s10-cs372h

1. review last class, today, preview

2. PC architecture

3. x86 instructions

4. gcc calling conventions

5. PC emulation

---------------------------------------------------------------------------

1. Context

    a. Last time.......

        * discussed the essential function of an OS

        * reviewed some of the history

            --correction: the human operator of the batch processing
            system *is* part of the "operating system"

            --which part?
                --the hardware abstraction was in two things:
                    --a program already loaded into memory
                    --subroutines in your punched-card deck (but I'm
                    not sure how this worked)

            --batch processing, IBM 7094
                --Fortran Monitor System
                --job + libraries + monitor routine that killed jobs
                that ran past their time estimate

                - "IBM had been very generous to MIT in the fifties
                  and sixties, donating its biggest scientific
                  computers. When a new top of the line 36-bit
                  scientific machine came out, MIT expected to get
                  one.

                  "In the early sixties, the deal was that MIT got
                  one 8-hour shift, all the other New England colleges
                  and universities got a shift, and the third shift
                  was available to IBM for its own use. One use IBM
                  made of it was yacht handicapping: the president of
                  IBM raced big yachts on Long Island Sound, and these
                  boats were assigned handicap points by a complicated
                  formula. There was a special job deck kept at the
                  MIT Computation Center, and if a request came in to
                  run it, operators were to stop whatever was running
                  on the machine and do the yacht handicapping job
                  immediately."
                  - http://www.multicians.org/thvv/7094.html

            --clarify the history
                - early PCs didn't do a bunch of things that JOS
                will do!
                --point is that when computers stopped having tons
                of users, there was less need for protection and
                isolation.
                - now of course, we need those features, and modern
                OSes have them.
                - seems to be true that as a new device gets an OS,
                it gradually recapitulates the history of the others:
                starts with no OS, then a primitive one, etc., etc.
                - this is the reason why studying "obsolete" concepts
                could be useful: because they won't be obsolete on
                some new platform, e.g., phones, sensors, smart cards

    b. today

        x86 architecture and assembly

        need this because we're building an OS on top of the chip.
            --OS needs to understand the hardware architecture
            --which means OS programmer needs to understand it
            --OS programmer needs to understand assembly language.....

    c. where are we going?

        --OS structure, process abstraction
        --memory protection, virtual memory, paging hardware
        --concurrency

2. PC architecture

    --components
    --CPU (registers, execution unit, memory management)
    --I/O
    --memory map (physical address space)

    a. components

        A full PC has:
            * an x86 CPU with registers, execution unit, and memory
              management
            * CPU chip pins include address and data signals
            * memory
            * disk
            * keyboard
            * display
            * other resources: BIOS ROM, clock, ...

    b. CPU

        --runs instructions:

            for (;;) {
                run next instruction
            }

        --needs work space; registers:
            --8086 started with 4 16-bit registers: AX, BX, CX, DX
            --each in two 8-bit halves, e.g., AH and AL

        --more work space: memory
            --address lines and data lines

        --need to be able to point into memory
            --SP: stack pointer
            --BP: frame base pointer
            [more on these in a bit]
            --SI, DI: source index, dest index

        --instructions are in memory:
            --IP (instruction pointer)
            --everywhere else, this thing is called the PC, for
            program counter
            --incremented after running an instruction
            --modified by CALL, RET, JMP, conditional jumps

        --for conditional jumps, there are:
            --FLAGS -- various condition codes
                --whether last op overflowed
                -- ... was positive/negative
                -- ... was [not] zero
                -- ... carry/borrow on add/subtract
                -- ... etc.
                -- whether interrupts are enabled
                -- direction of data copy instructions
            --J[N]S, J[N]Z, J[N]C, J[N]O

    c. I/O

        * Original PC architecture: use dedicated I/O space
            --Works same as memory accesses but sets I/O signal
            --Only 1024 I/O addresses
            --Accessed with special instructions (IN, OUT)
            --Example: write a byte to line printer:

                #define DATA_PORT    0x378
                #define STATUS_PORT  0x379
                #define   BUSY       0x80
                #define CONTROL_PORT 0x37A
                #define   STROBE     0x01

                void
                lpt_putc(int c)
                {
                    /* wait for printer to consume previous byte;
                       on the PC parallel port this status bit
                       reads 0 while the printer is busy */
                    while ((inb(STATUS_PORT) & BUSY) == 0)
                        ;

                    /* put the byte on the parallel lines */
                    outb(DATA_PORT, c);

                    /* tell the printer to look at the data */
                    outb(CONTROL_PORT, STROBE);
                    outb(CONTROL_PORT, 0);
                }

        * Memory-Mapped I/O
            o Use normal physical memory addresses
                + Gets around limited size of I/O address space
                + No need for special instructions
                + System controller routes to appropriate device
            o Works like "magic" memory:
                + Addressed and accessed like memory, but ...
                + ... does not behave like memory!
                + Reads and writes can have "side effects"
                + Read results can change due to external events

    d. physical memory map

        [draw picture]

        --two points here. (1) physical address space is mostly
        ordinary RAM (2) low-memory addresses actually refer to other
        things.

        --example: writing to VGA memory makes things appear on the
        screen

        --reset or power-on jumps to ROM at 0xffff0
            --so what is the first instruction going to have to do?
            [answer: probably jump]

            +------------------+  <- 0xFFFFFFFF (4GB)
            |      32-bit      |
            |  memory mapped   |
            |     devices      |
            |                  |
            /\/\/\/\/\/\/\/\/\/\

            /\/\/\/\/\/\/\/\/\/\
            |                  |
            |      Unused      |
            |                  |
            +------------------+  <- depends on amount of RAM
            |                  |
            |                  |
            | Extended Memory  |
            |                  |
            |                  |
            +------------------+  <- 0x00100000 (1MB)
            |     BIOS ROM     |
            +------------------+  <- 0x000F0000 (960KB)
            |  16-bit devices, |
            |  expansion ROMs  |
            +------------------+  <- 0x000C0000 (768KB)
            |   VGA Display    |
            +------------------+  <- 0x000A0000 (640KB)
            |                  |
            |    Low Memory    |
            |                  |
            +------------------+  <- 0x00000000

        --is this an abstraction that the OS provides to others or an
        abstraction that the hardware is providing to the OS?
        [the latter]

        --job of hardware to turn request for address 0x00100004 into
        a request that goes to the appropriate place in the actual
        RAM, perhaps at 0x000A0004 (but we don't know).

3. x86 instructions

    --OS programmer needs to understand assembly language?

        can get surprisingly far in OS work without being an assembly
        hacker. the reason is that in general you want to try to get
        the compiler to write as much of the assembly as possible.
        but of course there are some instructions you can't express
        in C, such as clearing interrupts, loading segment registers,
        etc. and sometimes you need good performance

        --but there is debate about this
            --"if OS is controlling the CPU, it should work directly
            with the CPU's instructions"
            --these people are probably annoyed by the fact that even
            assembly instructions are basically macros that the x86
            translates into internal microcode.

    --transition to ....

    --x86: CISC architecture

    --unfortunately, two syntax conventions
        Intel:    op dst, src
        AT&T/gcc: op src, dst  (used in the labs)
            --uses b,w,l suffix on instructions to specify size

    --examples:

        movl %eax, %edx             [edx = eax]                 register
        movl $0x12c, %edx           [edx = 0x12c]               immediate
        movl 0x12c, %edx            [edx = *(0x12c)]            direct
        movl (%ebx), %edx           [edx = *(ebx)]              indirect
        movl 4(%ebx), %edx          [edx = *(ebx + 4)]          displaced
        movl 4(%ebx,%eax,8), %edx   [edx = *(ebx + eax*8 + 4)]
        xor %eax, %eax              [eax = 0]

    --instruction classes

        data movement: MOV, PUSH, POP
        arithmetic: TEST, SHL, ADD, AND...
        i/o: IN, OUT, ....
        control: JMP, JZ, JNZ, CALL, RET
        string: REP MOVSB
        system: IRET, INT

        Intel architecture manual Volume 2 is the reference

    --what do push and pop actually do?

        --stack grows down.........
        [draw picture]

        --examples:

            pushl %eax      [ subl $4, %esp
                              movl %eax, (%esp) ]

            popl %eax       [ movl (%esp), %eax
                              addl $4, %esp ]

            call 0x12345    [ pseudo: pushl %eip
                                      movl $0x12345, %eip ]
                            (the pushed %eip is the address of the
                            instruction after the call)

            ret             [ pseudo: popl %eip ]

----------------------------------------------------------------------------

admin details

    --Reading or assignment listed next to a lecture should be done
    *before* the lecture.

    --Namrata is the TA. The TA is your friend.

    --Reminder: start the labs early.

----------------------------------------------------------------------------

NOT REACHED (but might be useful to lab 1)

----------------------------------------------------------------------------

4. gcc calling conventions

    --okay we saw how call and ret interact with the stack

    --but what happens to a function's state, that is, the registers,
    when a function is called? they might need to be saved, or not.

    --purely a matter of convention

    --here's what gcc does:

        --at entry of a function:

            %eip points at first instruction of function
            %esp+4 points at first argument
            %esp points at return address

                        arg
                -->esp  [ret_addr]

        --after ret instruction:

            %eip contains return address
            %esp points at arguments pushed by caller

                -->esp  arg

            %eax contains return value (or trash if function is void)
            %ecx, %edx may be trashed
            %ebp, %ebx, %esi, %edi need to look the way that they did
            at the time of the call

        --in other words:

            %eax, %ecx, %edx are "caller save": caller's job to push
            them on the stack before the call if it wants to keep
            their values

            %ebp, %ebx, %esi, %edi are "callee save": callee's job to
            push them on the stack on entry to the function (if it
            uses them), and pop them just before returning

    --gcc does even more

                       +------------+   |
                       | arg 2      |   \
                       +------------+    >- previous function's stack frame
                       | arg 1      |   /
                       +------------+   |
                       | ret %eip   |   /
                       +============+
                       | saved %ebp |   \
                %ebp-> +------------+   |
                       |            |   |
                       |   local    |   \
                       | variables, |    >- current function's stack frame
                       |    etc.    |   /
                       |            |   |
                       |            |   |
                %esp-> +------------+   /

        --%esp moves to make stack bigger/smaller
        --%ebp points at saved %ebp from previous function

        --function prologue:

            pushl %ebp
            movl %esp, %ebp

        --function epilogue:

            movl %ebp, %esp
            popl %ebp
            ret

    --example [see handout]

5. PC emulation

    --Bochs does exactly what a real PC would

    --But it is implemented in software, not hardware

    [This is a **virtual machine**. We'll see this topic, possibly on
    steroids, later in the course.]

    --Uses normal programmatic constructs (if statements, memory,
    etc.) to emulate processor logic and state

    --Stores emulated CPU registers in global variables

        int32_t regs[8];
        #define REG_EAX 1
        #define REG_EBX 2
        #define REG_ECX 3

    --Stores emulated physical memory in Bochs's memory

        char mem[256*1024*1024];

    --See handout

    --Simulates I/O devices, etc., by detecting accesses to "special"
    memory and I/O space and emulating the correct behavior: e.g.,

        --Reads/writes to emulated hard disk transformed into
        reads/writes of a file on the host system

        --Writes to emulated VGA display hardware transformed into
        drawing into an X window

        --Reads from emulated PC keyboard transformed into reads from
        X input event queue

---------------------------------------------------------------------------

[Credit to Frans Kaashoek, Robert Morris, and Nickolai Zeldovich for
much of this content.]
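
[Appendix sketch, tying section 3 to section 5. This is NOT Bochs's
actual code: every name below (R_EAX, MEM_SIZE, emu_push, emu_demo,
...) is hypothetical, and the 64KB memory size and the 5-byte call
length are assumptions made for illustration. It shows the push, pop,
call, and ret semantics from section 3 written as ordinary C acting on
an emulated register array and an emulated memory array, which is
roughly the idea behind a software PC emulator.]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices, for illustration only. */
enum { R_EAX, R_ECX, R_EDX, R_EBX, R_ESP, R_EBP, R_ESI, R_EDI, NREGS };

#define MEM_SIZE (64 * 1024)   /* assumption: tiny emulated physical memory */

uint32_t regs[NREGS];          /* emulated general-purpose registers */
uint32_t eip;                  /* emulated instruction pointer */
uint8_t  mem[MEM_SIZE];        /* emulated physical memory */

/* Read a 32-bit little-endian value from emulated memory. */
uint32_t mem_read32(uint32_t addr)
{
    assert(addr <= MEM_SIZE - 4);
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}

/* Write a 32-bit little-endian value to emulated memory. */
void mem_write32(uint32_t addr, uint32_t v)
{
    assert(addr <= MEM_SIZE - 4);
    for (int i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(v >> (8 * i));
}

/* pushl value:  subl $4, %esp ; movl value, (%esp) */
void emu_push(uint32_t value)
{
    regs[R_ESP] -= 4;                 /* stack grows down */
    mem_write32(regs[R_ESP], value);
}

/* popl:  movl (%esp), result ; addl $4, %esp */
uint32_t emu_pop(void)
{
    uint32_t v = mem_read32(regs[R_ESP]);
    regs[R_ESP] += 4;
    return v;
}

/* call target: push the address of the next instruction, then jump.
   insn_len is the length in bytes of the call instruction itself. */
void emu_call(uint32_t target, uint32_t insn_len)
{
    emu_push(eip + insn_len);
    eip = target;
}

/* ret: pop the return address into the instruction pointer. */
void emu_ret(void)
{
    eip = emu_pop();
}

/* Small scenario: push %eax, call, ret, pop into %edx. */
uint32_t emu_demo(void)
{
    regs[R_ESP] = MEM_SIZE;           /* empty stack at top of memory */
    eip = 0x1000;                     /* arbitrary starting address */
    regs[R_EAX] = 0xdeadbeef;

    emu_push(regs[R_EAX]);            /* pushl %eax */
    emu_call(0x2345, 5);              /* call 0x2345 (assumed 5 bytes long) */
    emu_ret();                        /* ret: back to 0x1000 + 5 */
    regs[R_EDX] = emu_pop();          /* popl %edx */
    return regs[R_EDX];
}
```

[Calling emu_demo() from a main() should leave the emulated %edx
holding the value that was pushed from %eax, with the emulated %esp
back where it started. A real emulator would also decode instructions
from mem and route "special" addresses to device code, which this
sketch leaves out.]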