CS202 Review Session 3

Notes from [Xiangyu Gao](https://xiangyug.github.io/), TA from fall 2021
Edited by Khanh Nguyen, TA Spring 2022
Edited by Charlie Chen, TA Spring 2023
Edited by Abigail Zhou, TA Spring 2024

1. C review
    1.1. Stack and Heap memory allocation
    1.2. String concatenation
    1.3. Struct
2. Lab 2 overview
    2.1. Motivation
    2.2. File definition
    2.3. File permissions
    2.4. Flags
    2.5. Functions
    2.6. Helper functions
    2.7. Test output
3. Q&A

---------------------------------------------------------------------

1. C review

A large part of this course is programming (primarily in C), so having a solid understanding of C can really help with the labs.
**If you are unsure about your C skills, I recommend looking at K&R: The C Programming Language.

1.1. Stack and Heap memory allocation

When we program in C, where are the variables stored?
**Hint: three different memory locations.

Stack – local variables
Heap – dynamically allocated memory (anything obtained with malloc, or with new in C++)
Data – global variables

These memory locations have no relevant relationship with the data structures "stack" and "heap" (although the stack for a CPU supports push and pop, as the data structure does).

When do we use stack?
For variables local to a function: they are temporary and are freed as soon as the function returns.
After a function returns, the stack is popped back to the state it was in before the call.
Relatively safe and ideal choice for most scenarios.

Alternatively, heap?
Allocated dynamically; the memory persists even after the function that allocated it has returned.
More flexible because it is not temporary like the stack.
Risk of memory leaks. Use with caution, and always remember to free the memory after you use it.
e.g. a linked list: the user is entering names but you don’t know when they will stop, so you keep allocating memory for a new node as the user continues to type.

In general:
If you can use a local variable, aka stack, use that because it will free itself and you do not have to worry about memory leaks.
e.g.
Suppose you allocate memory and free it later, but while your code is running an exception (or, in C, an early return on an error path) happens in between. Instead of continuing on to the free, execution goes to the error handling and that memory is leaked for the rest of the program's run.

Examples:
```
#include <stdio.h>
#include <stdlib.h>

int main() {
    // Stack allocation
    int a = 0;
    char b[10];

    // Heap allocation
    int *p = malloc(sizeof(int));
    if (p == 0) {
        // 0 is the null pointer. What happens if you try to dereference p here?
        // Access violation, segmentation fault, core dumped.
        exit(1);
    }
    *p = 10;

    char *q = malloc(5 * sizeof(char)); // contents are uninitialized, unpredictable
    if (q == 0) {
        exit(1);
    }
    // How many characters can you store in this array?
    // 4, last character is '\0', termination.
    // can use calloc to initialize to 0 or for loop.
    q[0] = 'a';  // brackets represent offset, how does it know how many bytes to offset to?
                 // (from the pointer's type: q[i] is i * sizeof(char) bytes past q)
    q[1] = '\0'; // what happens if I forget this line? printf keeps reading past the
                 // end until it finds a '\0': garbage output, or an access violation
                 // (segmentation fault, core dump).
    printf("%s\n", q);

    // In heap, we need to free the memory manually.
    free(p);
    free(q);
    return 0;
}
```
[Whiteboard drawing for memory layout. Check the scribbles]

1.2. String concatenation

What functions do you use to concatenate two strings?
strcat – can lead to buffer overflow, use strncat
snprintf – suggested by Lab 2, do `man snprintf` to learn more. ** log into docker.

Function signature:
    int snprintf(char *str, size_t size, const char *format, ...)

char *str: buffer to write INTO (or in other words, the output after concatenation)
size_t size: maximum number of bytes to write into the buffer, including the terminating '\0'
const char *format: format for the concatenation, similar to how you would use it in printf
...: optional arguments. Variables to be formatted in the "format" string, also similar to the syntax for "printf".

[See example in the whiteboard]

    const char* name = "Abigail";
    printf("Hello, my name is %s.\n", name);

    // bad example
    char* buffer;
    snprintf(buffer, 200, "Hello, my name is %s.\n", name);
    // buffer does not point to any allocated memory.
    // correct example
    char buffer[200];
    snprintf(buffer, sizeof(buffer), "Hello, my name is %s.\n", name);

Is the 200-byte buffer here on the stack or in the data segment? It depends on where it is declared: on the stack if it is inside a function, in the data segment if it is a global or static variable.

1.3. Struct

- C doesn't have a concept of class, like in Java, but it has `struct`. A struct is a collection of data items grouped into a single thing. You cannot have a method inside of a struct.
- You can define a struct as:
```
struct student {
    int age;
    char* name;
};
```
- To initialize:
```
struct student alice;
alice.age = 22;
alice.name = "Alice";
```
- A struct can contain any number of variables, including pointers. In lab 2, you will deal with struct pointers. ** show long info
- For example:
```
struct student *palice = &alice;
```
- To access members of the struct, you can do:
```
alice.age;
// OR
palice->age; // -> operator = dereference and then access the member, same as (*palice).age
```

2. Lab 2 overview

2.1. Motivation

- You will implement ls, which is a command, in lab 2.
- It will help you practice reading man pages, working with system calls, toying with the APIs, and designing as well as re-designing/refactoring your code.

2.2. File definition

- Normal files can be: "test.txt", "main.c", ...
- Normal directories can be: "foo/", "bar/", "foo/bar/", ...
- Some files and directories start with ".". If you are curious, you can inspect ".git" in your repository. This is where git stores your information and commit history. These are usually hidden from users when they invoke `ls`. To see them, you need `ls -a`.
- In addition, "." also means the current directory and ".." means the parent directory. Every directory you create will have these 2 "pseudo-directories".
- With that, there are also relative and absolute paths:
    + A relative path is a path starting from the current directory. It often begins with "." or "..". For instance: `./my/relative/path/to/file`.
    + An absolute path is the full path starting from the root "/", so it means the same thing no matter what the current directory is. For instance: `/my/absolute/path`. A path like `~/my/absolute/path` also ends up absolute, because the shell expands "~" to the absolute path of your home directory.
/ is root
~ is the home directory
    /home/alz4471/cs202-labs/lab2/main.c
    ~/cs202-labs/lab2/main.c
example with pwd on docker
../ --> ? (~/ which contains cs202-labs)

2.3. File permissions

- Lab 2 will have you print the permissions of a file, which have the form: -rwx------. There are 10 chars, but we are concerned with the last 9 (the first one is the file type).
- In the 9 chars, the first group of 3 chars refers to the owner's permissions, the second group of 3 chars refers to the group's permissions, and the last 3 chars refer to everybody else.
- Sometimes you would see people do: `chmod 700 file.txt`.
- The way to read it is to split 700 into 3 digits: 7, 0, 0. Then convert each digit (which is in octal, or base 8) to 3 binary bits. The resulting 9 bits line up with the 9 permission characters we saw earlier, in the order r, w, x within each group. A 1 in binary means the bit is set. A 0 means otherwise.

Example: 700 -> 7,0,0 -> 111,000,000 -> rwx------

Exercise: I want the owner to read and write. I want everyone else (group and others) to read only. What should I set?
Answer: 644 (110,100,100 -> rw-r--r--)
What does 750 mean? 755?

/*
In a directory
r - can do ls and list all files, but does not mean you can read all the files in the directory
w - can create and delete files
x - can cd into that directory
*/

2.4. Flags

    ./ls      -alR    foo/ bar/
    program   flags   args

- Flags refer to -a, -l, -R. You will need to support combinations of them as well.
- To make it more clear, let's look at getopt_long() (learn more with `man 3 getopt_long`).

    int getopt_long(int argc, char * const argv[],
                    const char *optstring,
                    const struct option *longopts, int *longindex);

- argc: number of arguments supplied (from main)
- argv: array of arguments supplied (from main)
- optstring: the short option characters we want to parse
- longopts: pointer to an array of option structures which contains your long options
- longindex: if not NULL, points to a variable that is set to the index of the matched long option within longopts.
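To tie the parameters above to the `./ls -alR foo/ bar/` example, here is a minimal sketch of how a combined flag like `-alR` could be parsed. `struct ls_flags` and `parse_ls_flags` are hypothetical names made up for illustration; they are not part of the lab's starter code, and the lab may want the flags handled differently. The sketch assumes no long options, so `longopts` holds only the all-zero terminator entry, and it relies on getopt_long() returning -1 once there are no more options.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical container for the three flags discussed above. */
struct ls_flags {
    int all;          /* -a */
    int long_format;  /* -l */
    int recursive;    /* -R */
};

/*
 * Parses -a, -l and -R (in any combination, e.g. "-alR" or "-a -l -R").
 * Returns the index in argv of the first non-flag argument, or -1 if an
 * unknown flag was seen. getopt keeps global state (optind), so this
 * sketch is meant to be called once per program run.
 */
int parse_ls_flags(int argc, char *argv[], struct ls_flags *out) {
    assert(argv != NULL && out != NULL);

    /* No long options: the array holds only the all-zero terminator. */
    static const struct option no_long_opts[] = { {0, 0, 0, 0} };
    int opt;

    out->all = 0;
    out->long_format = 0;
    out->recursive = 0;

    /* No colons in "alR": none of these flags takes an argument. */
    while ((opt = getopt_long(argc, argv, "alR", no_long_opts, NULL)) != -1) {
        switch (opt) {
        case 'a': out->all = 1; break;
        case 'l': out->long_format = 1; break;
        case 'R': out->recursive = 1; break;
        default:  return -1; /* getopt_long returned '?' for an unknown flag */
        }
    }
    return optind; /* argv[optind] .. argv[argc - 1] are the args (foo/, bar/) */
}
```

Calling `parse_ls_flags(argc, argv, &flags)` from main with `./ls -alR foo/ bar/` would set all three fields and return the index of `foo/`.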
If getopt() (and likewise getopt_long()) finds another option character, from the optstring you provided, in the arguments supplied, it returns that character, updating the variable optind so the next call can resume the scan. For a long option whose .flag is NULL, getopt_long() returns that option's .val. It returns -1 when there are no more options.

Example:

    #include <getopt.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* -n: take 1 arg, how many seconds to sleep
     * -v: no arg, print out how long you slept
     * --help: no arg, print out usage
     */
    int main(int argc, char *argv[]) {
        int verbose_flag, opt, nsecs;

        nsecs = 0;
        verbose_flag = 0;
        struct option opts[] = {
            {.name = "help", .has_arg = 0, .flag = NULL, .val = '\a'},
            {0, 0, 0, 0} // the last element of the array must be all zeros
        };

        // This loop is used for argument parsing. Refer to `man 3 getopt_long` to
        // better understand what is going on here.
        while ((opt = getopt_long(argc, argv, "n:v", opts, NULL)) != -1) {
            switch (opt) {
            case '\a':
                fprintf(stderr, "Usage: %s [-n nsecs] [-v]\n", argv[0]);
                exit(EXIT_FAILURE);
            case 'n':
                nsecs = atoi(optarg);
                break;
            case 'v':
                verbose_flag = 1;
                break;
            default:
                printf("Unimplemented flag %d\n", opt);
                break;
            }
        }

        printf("verbose_flag=%d; nsecs=%d\n", verbose_flag, nsecs);
        if (nsecs != 0) {
            sleep(nsecs);
        }
        if (verbose_flag == 1) {
            printf("We slept for %d seconds.\n", nsecs);
        }

        /* Other code omitted */

        exit(EXIT_SUCCESS);
    }

- Will parse -n and -v (and the long option --help).
- Expects exactly 1 argument after -n
- Try to compile, run and observe the output: `example.exe -v -n 5`, `example.exe -v -n`, `example.exe --help`, `example.exe --hello`
- 1 colon for a required argument, 2 colons for an optional argument
- getopt is quite similar, except that it has no longopts parameter, so it only parses short options.

2.5. Functions:

- The main building blocks of your program are: opendir, readdir and closedir.
- Read through their man pages carefully.
- Don't forget to call closedir when you are finished with the directory. We don't want resources to be leaked.

2.6. Helper functions:

- We provide lots of handy helper functions and macros such as:
    - PRINT_PERM_CHAR: print out the corresponding permission character
    - uname_for_uid: convert a user id to a human-readable user name
    - group_for_uid: convert a group id to a human-readable group name
    - date_string: convert to formatted date string
    - ftype_to_string: print out the file type
- Please read through them carefully. Lots of students try to reinvent their own way of doing this, which is more painful than necessary.

2.7. Test output:

- Part of the lab is figuring out what it is testing. Don't be intimidated if your test output is really big and strange.
- We run diff between your output (./ls) and system output (ls).
- diff command: `diff your_output.txt system_output.txt`

When there is an error, you will see the below format.
Example:

    2,3c1,3
    < ...
    < ...
    ---
    > ...
    > ...
    > ...

- In the example above, everything from the 2nd line down to `---` (the lines starting with `<`) is your output. The rest (the lines starting with `>`) is the system output. Or in other words, everything before `---` comes from the file on the left of the diff command and everything after comes from the file on the right. It's showing the difference between your output and the system output.
- The first line, `2,3c1,3`, refers to what needs to be changed so that the differences are resolved.
- `2,3c1,3`: Your file, lines 2-3, needs to be changed to match the system output file, lines 1-3.
    - c: change
    - a: add
    - d: delete
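
To make the `2,3c1,3` notation concrete, here is a small sketch that splits such a header into its parts. It assumes the default (non-unified) diff format described above; `struct hunk` and `parse_hunk_header` are hypothetical names used only for illustration and are not part of the lab or of diff itself.

```c
#include <assert.h>
#include <stdlib.h>

/* One parsed diff header such as "2,3c1,3". */
struct hunk {
    long l_start, l_end; /* line range in the left file (your output) */
    char op;             /* 'a' = add, 'c' = change, 'd' = delete */
    long r_start, r_end; /* line range in the right file (system output) */
};

/* Parses "N" or "N,M" at *s and advances *s past it. Returns 0 on success. */
static int parse_range(const char **s, long *start, long *end) {
    char *rest;

    *start = strtol(*s, &rest, 10);
    if (rest == *s) {
        return -1; /* no digits found */
    }
    *end = *start; /* a single number means a one-line range */
    if (*rest == ',') {
        const char *p = rest + 1;
        *end = strtol(p, &rest, 10);
        if (rest == p) {
            return -1;
        }
    }
    *s = rest;
    return 0;
}

/* Returns 0 and fills *out on success, -1 if the header is malformed. */
int parse_hunk_header(const char *header, struct hunk *out) {
    assert(header != NULL && out != NULL);

    const char *s = header;
    if (parse_range(&s, &out->l_start, &out->l_end) != 0) {
        return -1;
    }
    if (*s != 'a' && *s != 'c' && *s != 'd') {
        return -1;
    }
    out->op = *s++;
    if (parse_range(&s, &out->r_start, &out->r_end) != 0) {
        return -1;
    }
    return (*s == '\0' || *s == '\n') ? 0 : -1;
}
```

For `"2,3c1,3"` this yields left lines 2-3, operation `c`, right lines 1-3, which matches the reading given above: lines 2-3 of your output need to change to match lines 1-3 of the system output.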