CS202 Review Session 1

Notes from [Xiangyu Gao](https://xiangyug.github.io/), TA Fall 2021
Edited by Khanh Nguyen, TA Spring 2022
Edited by Jinli Xiao, TA Spring 2023
Edited by Sophia Watts, TA Spring 2024
Edited by Sam Frank, TA Fall 2024

1. Introduction
2. Logistics
3. Motivation & tips for success
4. Lab infrastructure
    4.1. Git and GitHub
    4.2. Docker
    4.3. scripts and Makefile
5. Lab 1 overview
    5.1. C Basics
    5.2. Debugging and gdb
6. Q&A

--------------------------------------------------

1. Introduction

2. Logistics
    - 8 optional review sessions in total, 1 hour each.
    - Will post the notes on the course website, and recordings on Brightspace.
    - The sessions provide an overview of the labs, a good starting point, and help with important background knowledge.
    - Ask me any questions. If I can't answer them, please post on Campuswire (CW).
    - PLEASE don't ask me if your code is correct. Labs are an opportunity for you to learn, and you should test your code.

3. Motivation & tips for success
    - Personally, I find the material very useful day to day in industry. You will hear terms like processes, threads, and memory thrown around in technical docs as well as in codebases. So I think the course lays a solid foundation.
    - I found this class extremely interesting and helpful for understanding more about how computers actually work.
    - Useful for all kinds of computer science careers.

    To succeed in cs202:
    - Start on time. Read through the code and lab write-up before coding. Understand the overall design first.
        - There are about 2 weeks between the prior lab being due and the next one being due (note lab 4 has 3 weeks).
        - For me, "on time" meant reading the labs in their entirety when they are released.
    - PUT IN THE WORK. I don't mean only doing the lab. It means keeping up with the reading, reviewing notes and handouts, and doing the homeworks and labs.
        - Most of the time when I was confused, it was because I hadn't done the reading.
    - Do the work yourself!
        - DON'T CHEAT.
    - The exam is one of many data points. Don't get discouraged if you don't do too hot on the midterm. Pull your weight and be consistent throughout the course.
    - Ask questions, and don't be afraid to. I will be happy to answer them. If I don't know the answer, I will find someone who does.

4. Lab infrastructure
    - You will need to set up Git, GitHub, and Docker.
    - The tools may be overwhelming at first. But for the purpose of this course, you don't need to dive too deep into them. Remember the basic commands and you will be fine.
    - Instructions on the course website should be detailed enough. If you have any questions, please post on CW or come to office hours.
    - Here I will give you a higher-level overview of how these tools work together.

4.1. Git and GitHub
    - Git
        - Git is a version control system that allows developers to track and manage changes to their code.
        - Git also allows developers to easily revert to previous versions of their code if necessary.
        - It is widely used in the software development industry and is a key tool for collaboration and organization.
    - GitHub
        - Git != GitHub
        - GitHub is a hosting website that stores Git repositories. Alternatives are GitLab, BitBucket, etc.
        - GitHub hosts remote repositories that allow you to download lab starter code and upload your code for grading, with appropriate permissions.
    - In cs202, you need to work with three Git repositories:
        - nyu-cs202/labs
            - remote repository that contains the starter code for labs.
            - the course staff will release labs here.
            - you should not push your work to this repo; push to your private repo instead.
        - nyu-cs202/labs-23sp-
            - remote repository that contains your code/submission for labs.
            - this repo is private and visible only to you and the course staff. DO NOT share code in this repo with anyone else.
        - your local repo
            - this is a copy of your private repo that lives on your computer.
            - the lab setup lets you fetch the starter code (upstream) to your local machine and work on it there.
            - you then push your changes to the remote private repo (origin) on GitHub.

    Common Git and GitHub terms and commands (with whiteboard drawing):
    - Repository: "container" which holds files
    - Working copy: copy of code/files which you are changing
    - Version repository: history of files in repository
        - Think of Google Docs history
    - Remote: repositories stored elsewhere (in this case, on GitHub)
    - Index: "staging area between workspace and repository"
    - clone: makes a copy of a remote on your local machine
    - fetch: downloads new commits from the remote into your local repo (your working copy is not changed)
    - checkout -b: creates a new local branch and switches to it
    - push: updates remote refs
    - merge: joins changes from commits into the current branch
    - commit: records changes to the repository
        - saves a "snapshot" of the working copy
    - add: tells git to start paying attention to a file
        - adds the file to the index

4.2. Docker
    - Standardizes the operating system for everyone in the class
    - We will all appear to be using and running programs on the Linux distribution Ubuntu 22.04 "Jammy Jellyfish"

4.3. scripts and Makefile
    - scripts are a set of commands that you can run in the terminal. By putting them in a file, you can run them all at once with a single command. For example, ./cs202-run-docker is a script that runs the Docker container for you. This saves you from remembering the long Docker commands and typing them manually each time.
    - make is a tool that helps you automate the compilation process. We set up rules in a file called 'Makefile' and you can run each of them with a single, simple command. E.g. make, make test, etc.

5. Lab 1 overview
    - Lab 1 should be easier than other labs. Still, you will do a lot of setting up, review your C skills, and familiarize yourself with the lab workflow. Many things can go wrong. So remember to start on time.
    - The lab will help you review C and teach you to use gdb (if you haven't already used it).
    - If you are unsure about your C skills, I recommend looking at K&R: The C Programming Language. (see course homepage for more details)
    - Try to learn gdb on the go. It is really powerful and somewhat crucial in later labs. Your skill will improve the most when you are using it.
    - Part 1: implementation of functions, writing a basic C program from scratch
    - Part 2: debugging using gdb
    - Part 3: navigating around Linux in the terminal
        - if you want to learn more about Linux commands, check out https://overthewire.org/wargames/bandit/
            - puzzles that teach basic Linux commands
        - Also check out https://missing.csail.mit.edu (look at class webpage for more resources)

5.1. C Basics
    a. Declaration & initialization
    - Telling the computer to expect/make space in memory for a variable

        /*
         * Declaration. Dereferencing x will lead to undefined behavior.
         * Remember to initialize your variables.
         */
        int * x;

    b. Pointer (variable)
    - Some C programming tasks are performed more easily with pointers, and other tasks, such as dynamic memory allocation, cannot be performed without pointers.
    - Definition: An address within memory OR a variable whose value is the address of another variable.
    - Format: type * variable_name.
        - '*' denotes the pointer type. Don't confuse this with accessing/dereferencing pointers.
        - '*' can also be used to access contents.
    E.g1:

        int six = 6;
        int * myFavNum = &six;        // pointer to an int holding 6
        printf("%d\n", *myFavNum);    // prints 6

    - *myFavNum is equivalent to accessing the value stored at the address held in myFavNum

    Fun exercise: Try copying the code onto your machine. Before running it, read through the code and think about what would happen. Then run it to double-check your understanding. Change it around and redo the process.

    c. Mental Model
    - Everyone has a different mental model when it comes to pointers. I will present my way of thinking about pointers.
    - I like to think about variables or code as boxes.
      So a pointer, to me, is a box that contains a memory address:
    E.g:

        int a = 1;
        int* b = &a;
        int** c = &b;

    - In my brain, with the example, this is what's going on (the addresses on the right are made up for illustration):

        a [   1   ]  0x100
        b [ 0x100 ]  0x108
        c [ 0x108 ]  0x110

    - So as you see, b refers to the memory address that stores a's value. c refers to the address of b. And if there's a dereference, you just follow the trail of addresses.

        int* x = *c;    // x gets b's value, so x [ 0x100 ]
        int y = *x;     // y will be 1.

    Try to work this out and understand it on paper.

    d. Array and string
    - There is no built-in string type in C :O
    - 2 ways to access an element in an array: subscript and pointer arithmetic.
    - A character array used as a string is terminated with a null byte ('\0'), indicating the end of the string.
    E.g (three separate alternatives, not one program):

        // Compiles in C, but is not correct: there is no room for the null byte.
        char name[5] = "Alice";

        // Correct way to initialize a string with length 5. Need 1 more for the null byte.
        char name[6] = "Alice";

        /*
         * "Alice" is a string constant in a read-only region. Writing to or
         * modifying it can lead to undefined behavior.
         */
        const char * name = "Alice";

5.2. Debugging and gdb
    - gdb is a debugger that allows you to step through your program and inspect its state at each step. Personally, it is my favorite debugger for C.
    - You currently might use a lot of printf statements. I also use a lot of printf statements, but that’s not always the best approach, or even possible.
    - We assume you are already familiar with C. You should have spent a decent amount of time in CSO learning C.
    - [live gdb demo]
    - Suggestions for debugging:
        - Don't try changing your code around without thinking about it first. It will waste lots of your time and is not cost-effective. More importantly, sometimes you won't even know why your code failed or how you ACTUALLY fixed it.
        - You can also use printf or a combination of printf + gdb. But printf is not the greatest strategy. If you notice you rely on printf too much, try switching to gdb.
        - Don't be scared of compiler warnings and errors.
          The compiler, most of the time, is your friend.

6. Q&A

References:
    https://www.geeksforgeeks.org/git-index/
    https://cs61.seas.harvard.edu/site/ref/git/#gsc.tab=0