Class 22
CS 202
21 April 2020

On the board
------------

1. Last time

2. Debugging, II
    A. [last time] intro
    B. [last time] attaching to, and controlling, a process
    C. [last time] reading and writing a program's memory and registers
    D. resolving addresses to program names/state
    E. single-stepping
    F. breakpoints
    G. watchpoints
    H. multithreading

---------------------------------------------------------------------------

1. Last time

    - finished NFS (focusing on transparency)

    - debugging, part I.

    - question we had last time, including a correction

    **how does the debugger print a stack trace?**

        a. debugger needs all of the addresses in the call stack
        b. debugger needs all of the argument values
        c. debugger needs to know what names those functions and
           arguments have.

    look at the handout

    [draw picture of stack frames]

    For a., first read current rip. That's the address of the code we
    stopped at. Then read the value stored at rbp + 8. That's the
    address of the code that called us (technically, one instruction
    past the calling instruction). Now, dereference rbp; that yields
    the previous function's frame pointer. Read the value stored 8
    bytes past it. That's the address of the code that called _that_
    function. Keep going with this process.

    Next question is b., where the arguments come from. I said
    something wrong last time. Let's fix it. First thing to say is:
    this is not handled perfectly by debuggers. But here's the rundown:

    - in -O0 (no optimization), x86-64, the arguments are passed in
      registers but then immediately placed in the stack in the called
      function (so in addresses below the return address).

    - the information about where variables in each function are
      stored is in the symbol table, computed as offsets from the
      frame pointer.
    - this information got into the symbol table because the compiler
      knows exactly where each variable's "home in the stack" is

    - during the stack unwinding, the debugger knows that a given
      function has certain arguments, and reads those variables from
      the location in the stack, per the symbol table.

    c. all of the above gives addresses, not names. This brings us to
    the next part of debugging.

2. Today, debugging part II

D. Getting function names, variable names, values, lines of code, etc.

    There are debuggers, e.g., Solaris mdb, that are deliberately
    designed to offer only this kind of raw, address-level
    information. However, this is not very human friendly.

    Symbol tables and symbol files are how we generally translate
    addresses into human readable form.

    What do symbol files usually contain:

    + Mappings from address to global variable names.

    + Mappings from address to function names.
        * This is usually provided as a set of extents. For instance

            0xa0000 - 0xa2200: main

      For each function, symbol tables also include information
      mapping stack offset (from the frame pointer) to local variable
      name. For instance:

            0xa0000 - 0xa2200: main
                offset 0: argc
                offset 1: *argv
                ...

      [for another example, take a look at our original class 2:
      handout01.pdf, as.txt. At compile time, the compiler "knows"
      that (for example) a given variable's address is %rbp - 8.]

      Can use this information to convert addresses in the backtrace
      to function names, print values for variables, etc.

    + Mappings from addresses to source file names and line numbers.

    Of course compiler optimizations can make this harder. For example:

    + optimized code might never write a variable's value to a stack
      location.
    + a single stack slot might be used for multiple variables.
    + ...

    Symbols are best efforts, and in practice debuggers cannot always
    resolve names to values due to these problems.
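    To tie steps a. and c. of the stack-trace question together, here
    is a minimal sketch in C. It is not how gdb is implemented: the
    "stack" is a small simulated array (standing in for memory that a
    real debugger would read with ptrace), and the symbol table, the
    addresses in it, and the names `resolve`, `peek`, `poke`,
    `build_example_stack`, and `backtrace_names` are all made up for
    illustration. Extents are assumed to be half-open, [start, end).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One symbol-table extent: addresses in [start, end) belong to `name`. */
struct extent {
    uint64_t start;
    uint64_t end;
    const char *name;
};

/* Hypothetical symbol table, sorted by start address. */
static const struct extent symtab[] = {
    { 0xa0000, 0xa2200, "main" },
    { 0xa2200, 0xa2300, "f" },
    { 0xa2300, 0xa2400, "g" },
};

/* Binary-search the extents for the function containing addr. */
static const char *resolve(uint64_t addr)
{
    size_t lo = 0, hi = sizeof(symtab) / sizeof(symtab[0]);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < symtab[mid].start)
            hi = mid;
        else if (addr >= symtab[mid].end)
            lo = mid + 1;
        else
            return symtab[mid].name;
    }
    return "??";  /* no symbol covers this address */
}

/* Simulated target stack: 16 eight-byte words starting at STACK_BASE. */
#define STACK_BASE UINT64_C(0x7000)
#define STACK_WORDS 16
static uint64_t stack_mem[STACK_WORDS];

/* Stand-in for reading one word of the target's memory. */
static uint64_t peek(uint64_t addr)
{
    assert(addr >= STACK_BASE && addr % 8 == 0);
    assert((addr - STACK_BASE) / 8 < STACK_WORDS);
    return stack_mem[(addr - STACK_BASE) / 8];
}

static void poke(uint64_t addr, uint64_t value)
{
    assert(addr >= STACK_BASE && addr % 8 == 0);
    assert((addr - STACK_BASE) / 8 < STACK_WORDS);
    stack_mem[(addr - STACK_BASE) / 8] = value;
}

/* Lay out three frames: main called f, which called g.
 * At each frame, [rbp] holds the caller's rbp and [rbp + 8] holds
 * the return address. */
static void build_example_stack(void)
{
    memset(stack_mem, 0, sizeof stack_mem);
    poke(0x7020, 0x7040);  poke(0x7028, 0xa2250);  /* g's frame, returns into f */
    poke(0x7040, 0x7060);  poke(0x7048, 0xa0100);  /* f's frame, returns into main */
    poke(0x7060, 0);       poke(0x7068, 0x90000);  /* main's frame, returns into
                                                      startup code with no symbol */
}

/* Walk the frame-pointer chain, resolving each code address to a name.
 * Stops when the saved rbp is 0 or `max` names have been written. */
static size_t backtrace_names(uint64_t rip, uint64_t rbp,
                              const char **out, size_t max)
{
    size_t n = 0;
    if (n < max)
        out[n++] = resolve(rip);            /* where we stopped */
    while (rbp != 0 && n < max) {
        out[n++] = resolve(peek(rbp + 8));  /* who called this frame */
        rbp = peek(rbp);                    /* move to the caller's frame */
    }
    return n;
}
```

    With the example stack, starting from rip = 0xa2310 and
    rbp = 0x7020, the walk yields g, f, main, and then "??" for the
    return address that falls outside every extent, which is the
    "best effort" behavior described above.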
    For context: when you compile your code with the "-g" flag, the
    compiler embeds in the binary the information above: symbol
    table, offsets, function names, and so forth. Then, when gdb
    actually runs on the binary, it reads in this information.

    [This is partly why debuggers like mdb are preferred by some.]

E. Single-stepping

    * Single step: In gdb the `stepi` command executes a single
      machine instruction in the program being debugged. (The `step`
      command, by contrast, advances by one source line.)

    The PTRACE_SINGLESTEP command (line 2 in handout section 3) single
    steps the process being debugged.

    What happens under the hood is the following: the OS sets the TF
    bit in the RFLAGS/EFLAGS register (bit 8), which has the processor
    generate a debug interrupt (INT 1) after a single instruction is
    executed. The TF bit (along with several other bits in there) is
    reserved for system software, and should be manipulated in ring 0
    (kernel mode).

    One can of course emulate single-stepping using the breakpoint
    mechanism described below, but using hardware is easier.

F. Breakpoints

    * Breakpoints set with the `break` command in gdb pause execution
      **before** a process being debugged is about to execute an
      instruction at a particular address.

        + In reality you often use `break` with function names,
          filename : line number, etc. These rely on symbols to
          translate human readable values to an address in the
          program's text section.

    So: how do breakpoints actually work??

    * A naive way for a debugger to implement breakpoints would be to
      **single-step** through the program, examine RIP after each
      step and stop whenever the desired address is reached.

        + This is very slow: each single step and read from a register
          involves a system call, which means about 1 microsecond per
          step. This is equivalent to executing your program on a
          1MHz processor.

    * Sketch of a better performing solution:

        + Debugger sets it up so program generates an interrupt
          whenever it arrives at an address with a breakpoint.
        + Kernel translates interrupt to a signal, delivers signal to
          the _target_ process. (Recall, from user mode threading:
          signals are just like user mode interrupts.)

        + ptrace semantics mean that the target process is stopped,
          and waitpid at the debugger receives the signal number as a
          part of the status.

    Several questions need to be addressed in order to use this
    technique.

    * Adding a breakpoint given an address

        + The debugger uses PTRACE_PEEKDATA to read the current
          instruction at the address. The debugger needs to save the
          original instruction in order to continue from a breakpoint.

        + The debugger then uses PTRACE_POKEDATA to change the code at
          the address so it generates an interrupt.

        + The breakpoint is now set; the debugger can resume the
          process when desired.

    * How to generate an interrupt, and what interrupt.

        + Processor provides an instruction `int` that generates an
          interrupt.

        + By convention on Intel x86 and AMD64, interrupt 3 is
          reserved for breakpoints. (This is a convention dictated by
          Intel.)

        + The `int 3` instruction can be encoded in 1 byte: `int 3`
          can be encoded in two ways on x86/AMD64: `0xcc` and
          `0xcd 0x03`. The latter is how all other `int` instructions
          are encoded, e.g., `int 5` is `0xcd 0x05`. `0xcc` is a
          special encoding Intel provides just for `int 3`.

          Being one byte long is important since x86 and AMD64
          instructions are variable length. The shortest instruction
          is 1 byte, the longest is 15 bytes. A one-byte instruction
          can substitute for any of these without overwriting the
          instruction that follows.

    * Hitting a breakpoint: What happens when the program executes
      `int 3`

        + When the program executes `int 3`, control gets transferred
          to the kernel, which calls the aptly named `do_int3`
          function.

        + Kernel marks the process to deliver a SIGTRAP signal.

        + Since a debugger is attached, the kernel blocks the process
          as a result of SIGTRAP, and the debugger can use waitpid to
          observe that a breakpoint has been hit.

    * Determining which breakpoint was hit

        + A program might have many active breakpoints at a time.
          To determine which breakpoint was hit, the debugger reads
          RIP to determine where the program is and then searches
          through the list of breakpoints. (After `int 3` executes,
          RIP points one byte past the `0xcc`, so the breakpoint's
          address is RIP - 1.)

    * Continuing from a breakpoint

      Usually when one uses the `continue` command in GDB, the
      intention is to continue execution **without** disabling the
      breakpoint that has been hit.

      Continuing is however complicated by the fact that when setting
      a breakpoint the debugger modified the program. To continue, the
      debugger must undo the change, execute the original instruction,
      and then change the program back in order to restore the
      breakpoint. This is done as follows:

        + First, the debugger uses PTRACE_POKEDATA to restore the
          original instruction to the address, and sets RIP back to
          that address. The original instruction is recorded as a
          part of the breakpoint information.

        + Second, the debugger single steps the process, executing the
          restored instruction and returning control back to the
          debugger.

        + Finally, the debugger changes the program code by writing
          `int 3` (0xcc) to the address, re-enabling the breakpoint.

      [Note: This technique is a bit complicated for multithreaded
      applications since all threads share the same text (code)
      region.]

    * Other kinds of breakpoints: Sometimes it is useful to break
      whenever a process is about to make a syscall. This is a common
      enough occurrence that `ptrace` provides a special
      `PTRACE_SYSCALL` command that breaks on syscalls.

G. Watchpoints

    * Watchpoints set with the `rwatch` and `awatch` commands in GDB
      stop execution whenever the specified memory address is read
      (rwatch) or accessed (awatch: either read or written).

        + Invaluable when debugging memory corruption.

    * How to implement watchpoints?
        + Single step implementation
        + Page fault based implementation
        + Hardware assisted.

    * Single-step: This looks much like the naive single-stepping
      approach to breakpoints, except the debugger also needs a way to
      decode instructions (to see whether a given instruction, such as
      a MOV, accesses the watched address, as the address is either
      explicitly or implicitly part of the instruction). GDB and
      others use single-stepping when necessary.

    * Page-fault based: The idea here is simple: the debugger asks the
      kernel to mark a page in the process as inaccessible.

        + Done with a combination of mprotect() and other calls. We
          won't go into details.

        + The processor raises a page fault whenever the target
          process accesses the page containing the watched address;
          the kernel turns this into a SIGSEGV, which stops the
          process. The debugger can then retrieve the signal using
          waitpid and the signal information using PTRACE_GETSIGINFO.

        + When continuing, debugger uses the same trick as for
          breakpoints: it restores access to the page, single steps,
          and then removes access to the page again.

    * Problem with page-fault based approaches: rwatch and awatch can
      be used to watch addresses at byte, 2 byte, 4 byte or 8 byte
      granularity. Page faults occur at page granularity.

        + Potentially many needless page faults.
        + Slows things down.

    * Hardware assisted:

        + On Intel (also ARM, Power-PC, etc.) hardware provides
          support for watchpoints.

        + On Intel there are:

            - Four watchpoint registers DR0 -- DR3: each register
              contains a virtual address. The processor will generate
              an interrupt whenever the program accesses one of these
              addresses. These registers specify the start of the
              memory area that the processor is watching.

            - One control register DR7, which controls a few things:
                * the size of each memory area being watched.
                  This can be 1, 2, 4 or 8 bytes, and any access to a
                  virtual memory address within
                  [start, start+length) will result in an interrupt.

            - Another control register DR6 that records why a debug
              interrupt occurred.
                * Both which register's address triggered the
                  interrupt and the type of access performed.

            - DR0 to DR3, DR6 and DR7 can only be read or written
              from kernel mode.

        + Using Intel's hardware watches

            - GDB can manipulate these registers using
              PTRACE_PEEKUSER and PTRACE_POKEUSER.
                * Since ptrace is a syscall, the kernel performs the
                  access on the debugger's behalf, so there is no
                  permission issue with using ptrace to read or write
                  these registers.

            - When the program accesses a watched address, the
              processor generates *Interrupt 1* (by convention the
              debug interrupt).

            - The kernel turns interrupt 1 into a SIGTRAP (this is the
              same signal that was used for breakpoints).
                * The value of DR6 is included in siginfo_t, and the
                  debugger can retrieve this using PTRACE_GETSIGINFO.

        + Hardware assisted watchpoints are very efficient, but there
          are very few of them, which limits their use.

    * Reality: GDB uses a combination of all three techniques, with
      single stepping serving as the ultimate fallback, and virtual
      memory techniques used only when a lot of adjacent memory
      addresses are monitored.

      Implication: Recommended to not have more than a few (< 4)
      watchpoints at a time.

H. Considerations for multithreading

    [didn't cover this]

    * So far we have not considered multiple threads. It gets a bit
      more complicated. Why?

        + In Linux every pthread is backed by a kernel thread. For
          historic reasons each kernel thread has its own PID. So
          really ptrace() can target individual threads, not just
          processes. But this also means that the debugger needs to
          attach to all threads individually.

        + A signal sent to a process can be delivered to *any* thread
          belonging to the process, and the kernel chooses the
          signal-receiving thread non-deterministically. This means
          that only one of many threads in the application stops.
      Consequences:

        + Any memory access from the debugger runs the risk of causing
          races, since the other threads are still running. [However,
          register accesses to the stopped thread are safe: why?]

        + Solution: whenever any thread is paused, debugger uses
          kill(2) to pause all other threads in the process before it
          does anything. (The signal can be delivered to a chosen
          thread; on Linux, tgkill(2) directs a signal at one specific
          thread.)

[thanks to Aurojit Panda for this content]
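
    Returning to the breakpoint mechanics of section F, here is a
    small sketch in C of the bookkeeping a debugger might do. It is
    only an illustration under stated assumptions: `struct breakpoint`,
    `word_with_int3`, and `find_hit` are hypothetical names, the code
    works on plain integers rather than calling ptrace, and it assumes
    x86-64 Linux, where ptrace transfers memory one 8-byte word at a
    time and the machine is little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE UINT64_C(0xcc)  /* one-byte encoding of `int 3` */

/* Hypothetical per-breakpoint record kept by the debugger. */
struct breakpoint {
    uint64_t addr;   /* address whose first byte was replaced by 0xcc */
    uint64_t saved;  /* original word read from addr, kept for restoring */
};

/* Compute the word to write back to the target: the original word with
 * only its lowest byte replaced by 0xcc. On a little-endian machine the
 * lowest byte of the word is the byte stored at the breakpoint address,
 * so the remaining seven bytes of code are left untouched. */
static uint64_t word_with_int3(uint64_t original)
{
    return (original & ~UINT64_C(0xff)) | INT3_OPCODE;
}

/* After `int 3` executes, RIP points one byte past the 0xcc, so the
 * breakpoint that fired is the one at rip - 1. Returns NULL if no
 * recorded breakpoint matches. */
static const struct breakpoint *find_hit(const struct breakpoint *bps,
                                         size_t n, uint64_t rip)
{
    assert(bps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (bps[i].addr == rip - 1)
            return &bps[i];
    }
    return NULL;
}
```

    In a real debugger the `original` argument would come from
    PTRACE_PEEKDATA and the result would be written with
    PTRACE_POKEDATA; to continue, the debugger would write `saved`
    back, set RIP to `addr`, single-step, and then write the patched
    word again.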