CS202-001: Lab setup and tools

We are going to use GitHub and git for distributing and collecting assignments. Please make sure you have git installed on your machine.

GitHub

If you don’t have a GitHub account, sign up for one here. Probably you want a Free plan.

Get the Class VM

[NOTE: If you have a Mac computer with an M1 CPU (as opposed to an Intel CPU), these instructions will not work. Either use an Intel CPU, or else run vagrant and ssh from a user account on the Courant machines (CIMS). We give some documentation for this setup below, but probably you should ask the teaching staff for help.]

One way to ensure that we are all using a uniform development environment (short of us all using the same machine) is for us to use identically configured virtual machines (VMs). You can think of a virtual machine as a way to run a particular operating system (in our case, an instance of Debian Linux) on top of another operating system (the one that controls your laptop or desktop).

The software that “runs” the virtual machine is known as a hypervisor. The job of the hypervisor is to pretend to be “real” hardware (in our case, an x86-64 CPU) to the operating system running on top of it (in our case, the aforementioned Debian Linux instance). The Debian Linux system thinks it’s running on physical hardware but in reality is running on an illusory x86-64 CPU.

We will use Vagrant to distribute the VM (rather than having you download a large image) and invoke the hypervisor.

The hypervisor that we support is VirtualBox.

VirtualBox Installation

Install VirtualBox by following the instructions for your own computer’s operating system.

If you have a recent Mac and get “Installation failed”, try this.

Vagrant Installation

Now install Vagrant: download from https://www.vagrantup.com/downloads.html, again selecting your own system’s OS.

Next, install the scp plugin. (This allows you to copy files to and from the VM.) Open a terminal on your machine and run:

> vagrant plugin install vagrant-scp

Creating the VM

To create the VM, start by cloning the class’s image repository at https://github.com/nyu-cs202/base-image by running:

> git clone https://github.com/nyu-cs202/base-image.git

Continue to the next section.

Running the VM

Go into the directory where you cloned the image repository. Usually that directory is base-image, and you go into a directory with the cd command ($ cd base-image). Once inside that directory, run

> vagrant up

Note that (1) you need to be connected to the Internet when running vagrant up, and (2) the first time that you do this, it will take a few minutes to execute. During this time, it is downloading some large files, so probably you shouldn’t run this from a coffee shop or when tethered to your mobile phone.

Once vagrant up is done, your VM is ready to go. You can enter it by running

> vagrant ssh

Note: vagrant commands must be run from the directory where you cloned the base-image repository. This is because Vagrant associates a VM with a Vagrantfile (and a possible tag).
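If you are unsure whether you are in the right place, one simple check is to look for the Vagrantfile. The sketch below is only an illustration: has_vagrantfile is a made-up helper name (it is not part of Vagrant), and the temporary directory stands in for base-image.

```shell
# has_vagrantfile DIR: succeed if DIR contains a Vagrantfile.
# (Hypothetical helper for illustration; not part of Vagrant.)
has_vagrantfile() {
    [ -f "${1:-.}/Vagrantfile" ]
}

# Demonstrate on a throwaway directory that stands in for base-image.
demo_dir=$(mktemp -d)
if has_vagrantfile "$demo_dir"; then before="yes"; else before="no"; fi
touch "$demo_dir/Vagrantfile"
if has_vagrantfile "$demo_dir"; then after="yes"; else after="no"; fi
echo "before: $before, after: $after"
rm -rf "$demo_dir"
```

From your clone, you could then run something like has_vagrantfile . && vagrant ssh.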

The VM is a standard Debian VM (Buster), so you can install programs the same way you would on Debian (or Ubuntu). vim is already installed in the VM, but if you want to use emacs or another editor, you can install it by running apt install <editor name>.

Finally, here is a list of commands that you can use with Vagrant:

vagrant up: Start the VM, provisioning it if necessary. Once a VM is provisioned Vagrant will not try to reprovision it, and thus after the first time you can run vagrant up without worrying about resource usage.
vagrant suspend: Suspend the VM, useful if you want to regain memory or cores on your machine and are not working on the class.
vagrant resume: Resume a previously suspended VM. vagrant up will do the right thing if your VM is suspended (that is, resume it rather than booting it again), but this is more explicit.
vagrant halt: Shut down the VM, for the same reasons as above.
vagrant scp :<src path> <dst path>: Copy file(s) at src path in the VM to dst path on your machine. Note that vagrant scp uses rsync rather than scp, which means that it will avoid copying files that are already present. Note the colon (:).
vagrant scp <src path> :<dst path>: Copy file(s) at src path on your machine to dst path within the VM. Note the colon (:).
vagrant ssh-config: Print the ssh host, username, port and key that you need to use when connecting to the VM. One way this might come in handy is if you want to use an editor which supports editing files on a remote host using SFTP or ssh (for example, Sublime Text).
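If your editor asks for these connection details one at a time, you can pull individual fields out of the vagrant ssh-config output. The following is a sketch under assumptions: ssh_config_field is a hypothetical helper, and the sample text is an illustrative stand-in in the usual ssh-config layout, not output captured from the class VM.

```shell
# ssh_config_field KEY: read ssh-config text on stdin and print the value of KEY.
# (Hypothetical helper for illustration; not a Vagrant command.)
ssh_config_field() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# Illustrative sample only; the real text comes from running: vagrant ssh-config
sample='Host default
  HostName 127.0.0.1
  User vagrant
  Port 2222'

port=$(printf '%s\n' "$sample" | ssh_config_field Port)
user=$(printf '%s\n' "$sample" | ssh_config_field User)
echo "connect as $user on port $port"
```

With the VM running, the equivalent would be vagrant ssh-config | ssh_config_field Port, run from the base-image directory.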

Git and GitHub

What is git?

Git was developed by Linus Torvalds for development of the Linux kernel. It is a distributed version control system, which means it supports many local repositories that each track changes and can synchronize with each other in a peer-to-peer fashion. It’s the best widely-available version control system, and certainly the most widely used. For information on how to use git, see:

For the workflow in GitHub:

GitHub Guides: Hello World

Cloning the labs repository

Labs will be released using the nyu-cs202/labs repository.

Please click this link to create your own private clone of the labs repository; this clone lives on (is hosted by) GitHub. Once that clone exists, you will perform a further clone to get that private repository onto your devbox. You’ll do your work on your devbox, and then push your work to the GitHub-hosted private repository for us to grade.

Here’s how it should work.

Click the link.
Log in to GitHub.
Provide a name.
The link should automatically clone the repository. For instance, if your username was foomoo67, you should now have a repository on GitHub called nyu-cs202/labs-21sp-foomoo67.

Note: GitHub Classroom may tell you to create a new branch and make your first commit with a README file. Do not do this. It will introduce merge conflicts later on.

Teaching GitHub about your identity

The easiest way to access GitHub repositories is using an SSH key, a secret key stored on your CS202 VM that defines your identity. This handy tutorial may be useful to teach you about SSH; or just follow the steps below to create a key for your virtual machine.

Enter your VM: vagrant ssh
Run ssh-keygen -t rsa -b 2048 and follow the instructions.
- Press enter to use the default file path and key name (should be ~/.ssh/id_rsa).
- Choose a password or leave it empty.
This creates your ssh keys, which live in the directory ~/.ssh. Your public key is in the file ~/.ssh/id_rsa.pub.
Run cat ~/.ssh/id_rsa.pub to display your public key.
Copy your public key (that is, select the text on the screen, and copy it to the clipboard).
In GitHub, go to your profile settings page (accessible via the upper-rightmost link—this looks like a bunch of pixels for new accounts). Select “SSH and GPG keys” and hit the “New SSH key” button. Then copy and paste the contents of your ~/.ssh/id_rsa.pub (from the VM) into the “Key” section. Give the key a sensible title, hit the “Add SSH key” button, and you’re good to go.
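A common slip in the steps above is to paste the private key (~/.ssh/id_rsa) instead of the public one. As a rough sanity check, you can confirm that the file begins with the key type. This is only a sketch: looks_like_rsa_public_key is a hypothetical helper, and the two files it inspects here are stand-ins with fake contents, so no real key is touched.

```shell
# looks_like_rsa_public_key FILE: succeed if FILE starts the way an RSA public key does.
# (Hypothetical helper; it checks only the leading key type, not the key itself.)
looks_like_rsa_public_key() {
    head -n 1 "$1" | grep -q '^ssh-rsa '
}

# Stand-in files with fake contents.
demo=$(mktemp -d)
echo 'ssh-rsa AAAAB3NzaC1yc2EFAKEKEYDATA vagrant@cs202' > "$demo/id_rsa.pub"
echo '-----BEGIN OPENSSH PRIVATE KEY-----' > "$demo/id_rsa"

if looks_like_rsa_public_key "$demo/id_rsa.pub"; then pub_ok="yes"; else pub_ok="no"; fi
if looks_like_rsa_public_key "$demo/id_rsa"; then priv_ok="yes"; else priv_ok="no"; fi
echo "id_rsa.pub: $pub_ok, id_rsa: $priv_ok"
rm -rf "$demo"
```

Inside the VM you would point the helper at ~/.ssh/id_rsa.pub before copying its contents to GitHub.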

Creating a local clone

Once GitHub knows your SSH identity, you’re ready to clone your lab repository and start doing work! Here’s how to get a local clone of your private repo on your machine:

Launch your VM and open up a terminal (again, vagrant ssh)

Configure your git “identity” as it shows up in commits:

  $ git config --global user.name "FIRST_NAME LAST_NAME"
  $ git config --global user.email "YOUR_@COLLEGE_EMAIL"

Cloning “your” lab repo:
```
  $ cd ~
  $ git clone git@github.com:nyu-cs202/labs-21sp-<Your-GitHub-Username>.git cs202
  Cloning into ....
  warning: You appear to have cloned an empty repository.
```
Note that the git@github.com:.... can be obtained on GitHub by clicking the “Clone or download” button. You want to clone using SSH, not HTTPS, so you might need to click “Use SSH”.

At this point, all you have is an empty repository, so you need to get the lab files. We’ll do that next.

Setting up the upstream repo: The lab skeleton code is kept in the repo https://github.com/nyu-cs202/labs, managed by the course staff. Therefore, the first thing you need to do is to set up your own lab repo to track the changes made in the labs repo. In the git world, nyu-cs202/labs would be the “upstream” repo from which changes should “flow” into your own lab repo.

Go into your new clone, then type git remote add to add the upstream repo, and git remote -v to check that the right repo is indeed an upstream for your own lab repo.
```
  $ cd ~/cs202
  $ git remote add upstream https://github.com/nyu-cs202/labs.git
  $ git remote -v
  origin  git@github.com:nyu-cs202/labs-21sp-<YourGithubUsername>.git (fetch)
  origin  git@github.com:nyu-cs202/labs-21sp-<YourGithubUsername>.git (push)
  upstream    https://github.com/nyu-cs202/labs.git (fetch)
  upstream    https://github.com/nyu-cs202/labs.git (push)
```
Now fetch the commits from upstream:
```
  $ git fetch upstream
```
The commits on the upstream’s main branch are now on your machine. Next, create a local branch to track the upstream’s branch:
```
  $ git checkout -b main upstream/main
  Branch 'main' set up to track remote branch 'main' from 'upstream'.
  Switched to a new branch 'main'
```
Now you can browse your local copy of the repo:
```
  $ ls
```
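If you would like to rehearse these steps before touching your real repository, the whole sequence can be tried on throwaway local repositories. This is only a sketch: the directories under the temporary sandbox are stand-ins for nyu-cs202/labs and for your private GitHub repo, and the names seed, labs.git, and labs-student.git are made up for the demo.

```shell
# Throwaway sandbox; every path below is a local stand-in, not a real GitHub repo.
sandbox=$(mktemp -d)

# 1. Stand-in for nyu-cs202/labs: a bare repo holding one commit on main.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/main
git config user.name "Course Staff"
git config user.email "staff@example.com"
echo "lab skeleton" > README
git add README
git commit -q -m "Add lab skeleton"
git clone -q --bare "$sandbox/seed" "$sandbox/labs.git"

# 2. Stand-in for your private repo on GitHub: bare and empty.
git init -q --bare "$sandbox/labs-student.git"

# 3. The steps from the text: clone the empty repo, add upstream, fetch, track.
cd "$sandbox"
git clone -q "$sandbox/labs-student.git" cs202 2>/dev/null
cd cs202
git remote add upstream "$sandbox/labs.git"
git fetch -q upstream
git checkout -q -b main upstream/main

# Record what the local branch tracks and which files arrived.
tracked=$(git rev-parse --abbrev-ref 'main@{upstream}')
files=$(ls)
echo "main tracks: $tracked; files: $files"

cd /
rm -rf "$sandbox"
```

The last two checks mirror what you should see in your real clone: main tracks upstream/main, and the skeleton files are present in the working directory.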
Obtaining future labs and changes: You can check for and merge in changes upstream by typing:
```
  $ git fetch upstream
  $ git diff <commit_name> upstream/main
  $ git merge upstream/main
```
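If you did not note down a commit name, one alternative is to compare against HEAD, your current commit. The sketch below rehearses that on throwaway local repositories. It is an illustration under assumptions: the labs and cs202 directories inside the sandbox are stand-ins, and lab1.txt and lab2.txt are made-up file names.

```shell
sandbox=$(mktemp -d)

# Stand-in for the course's labs repo, with one commit on main.
git init -q "$sandbox/labs"
cd "$sandbox/labs"
git symbolic-ref HEAD refs/heads/main
git config user.name "Course Staff"
git config user.email "staff@example.com"
echo "lab1" > lab1.txt
git add lab1.txt
git commit -q -m "Release lab1"

# Stand-in for your working clone, with the labs repo as the remote named upstream.
git clone -q -o upstream "$sandbox/labs" "$sandbox/cs202"

# The staff release another lab.
echo "lab2" > lab2.txt
git add lab2.txt
git commit -q -m "Release lab2"

# Back in your clone: fetch, preview what is new, then merge.
cd "$sandbox/cs202"
git fetch -q upstream
incoming=$(git rev-list --count HEAD..upstream/main)   # commits you do not have yet
changed=$(git diff --name-only HEAD upstream/main)     # files that differ
git merge -q upstream/main
remaining=$(git rev-list --count HEAD..upstream/main)  # nothing left after the merge
echo "incoming: $incoming ($changed); after merge: $remaining"

cd /
rm -rf "$sandbox"
```

Previewing with git rev-list and git diff before merging lets you see how much is about to change, which is handy if you have local edits in the same files.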
You should fetch and merge from upstream periodically. We will also remind you on Campuswire to fetch upstream whenever we make changes or bug fixes to the labs.

(In the git diff command above, commit_name is the name of the former head of upstream/main. It can be read out after you type git fetch.)

Visualizing git history: It’s often helpful to view/browse git history visually. GitHub can help with this, but of course it can only display commits that are pushed to the server. To look at your local repository’s state, you can use gitk. Assuming that you’re inside your git repo:
```
  $ gitk --all
```
Note that gitk is an X Window System (X11) client, and assumes that your terminal presents an X server. For most of you, this should just work (because the vagrant ssh command forwards X connections), but please post to Campuswire or visit office hours if you run into trouble here.

Saving changes while you are working on labs

As you modify the skeleton files to complete the labs, you should frequently save your work to protect against laptop failures and other unforeseen troubles, and to create “known good” states. You save the changes by first “committing” them to your local lab repo and then “pushing” those changes to the repo stored on github.com:

$ git commit -am "saving my changes"
$ git push origin

Note that whenever you add a new file, you need to manually tell git to “track it”. Otherwise, the file will not be committed by git commit. Make git track a new file by typing:

$ git add <my-new-file>
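To see this behavior for yourself, you can try it in a scratch repository. The following is a small sketch, not part of the lab workflow: the sandbox directory and the file names tracked.c and helper.c are made up for the demo.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Student"
git config user.email "student@example.com"

# Start with one tracked file.
echo "v1" > tracked.c
git add tracked.c
git commit -q -m "Initial commit"

# Edit the tracked file and create a brand-new one.
echo "v2" > tracked.c
echo "new" > helper.c

# -a stages changes to tracked files only, so helper.c is left out.
git commit -q -am "saving my changes"
before=$(git status --porcelain)     # "?? helper.c" marks an untracked file

# Tell git to track the new file, then commit it.
git add helper.c
git commit -q -m "Add helper.c"
after=$(git status --porcelain)      # empty: nothing left to commit
echo "before: '$before' after: '$after'"

cd /
rm -rf "$sandbox"
```

Running git status before you push is a cheap way to catch new files that were never added.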

After you’ve pushed your changes by typing git push origin, they are safely stored on github.com. Even if your laptop catches on fire in the future, those pushed changes can still be retrieved. However, you must remember that doing git commit by itself does not save your changes on github.com (it only saves your changes locally). So, don’t forget to type git push origin.

To see if your local repo is up-to-date with your origin repo on github.com and vice versa, type git status.
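Keep in mind that git status compares your branch with the branch it tracks, as of your last fetch or push. If you want to count commits that exist locally but have not reached origin, one option is git rev-list. The sketch below uses a local bare repository as a stand-in for github.com; origin.git and notes.txt are made-up names, and the explicit branch name in git push is a choice made for the demo.

```shell
sandbox=$(mktemp -d)

# Local stand-in for your repo on github.com.
git init -q --bare "$sandbox/origin.git"
git clone -q "$sandbox/origin.git" "$sandbox/cs202" 2>/dev/null
cd "$sandbox/cs202"
git symbolic-ref HEAD refs/heads/main
git config user.name "Student"
git config user.email "student@example.com"

# First commit, pushed so that origin/main exists.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git push -q origin main

# Second commit, not pushed yet.
echo "v2" > notes.txt
git commit -q -am "saving my changes"
unpushed_before=$(git rev-list --count origin/main..HEAD)

# After pushing, nothing is left behind on your machine only.
git push -q origin main
unpushed_after=$(git rev-list --count origin/main..HEAD)
echo "unpushed before push: $unpushed_before, after push: $unpushed_after"

cd /
rm -rf "$sandbox"
```

A count of zero means every commit on your current branch is also on origin, at least as far as your machine knows from its last push or fetch.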

Git FAQ

What message should I fill in for git commit -am “message”?

The “message” can be any string. But we ask you to leave something descriptive. In the future, when you check your git logs, this message helps you recall what you did for this commit.

How can I change a message if it’s already pushed to GitHub?

You can’t do this safely. If you want to put another message on top of a previous commit, create an empty commit with your new message:
```
  $ git commit --allow-empty -m "<new msg>"
```

I got an error message Fatal: Not a git repository.

This means you are typing git commands outside the directory containing your git repository. You need to type cd ~/cs202.

Can I edit files through GitHub.com?

Do not do that this semester. Super dangerous. Please only use GitHub.com for read-only access, i.e. checking if all your changes have been pushed to your remote repository.

When I do git pull, I got an error Repository not found

Check the repository address: it should contain no quotes (") or angle brackets (< >). The lab instructions use quotes or angle brackets to mark a placeholder for your GitHub username. If git pull upstream main fails, then check the upstream address by typing git remote -v. To edit your upstream address, remove it first by typing git remote remove upstream, and then add it back with git remote add.

“The connection timed out” (or problems cloning, or problems with SSH keys).

Check if your firewall is blocking port 22, and open port 22 if it is blocked. You can use your favorite search engine to figure out how to do this.

Courant (CIMS) machines

Most of you will never need to read this section. However, some of you may end up using an account on the Courant machines in place of your personal laptop or desktop. In that case, the instructions for installing and using the VM differ slightly. We separate the description into a one-time setup and then what you do each time after.

One-time setup

First, you need to make sure you have an account on the Courant machines (you will have an account if you registered on or before January 28). Check here. If you do have an account but can’t login, follow the steps on that page. If you do not have an account, then please email our course’s staff email alias (the address is on our home page). In this email, explain that you registered after January 28; in that case, we will close the loop between you and the CIMS sysadmins.

Second, you need to login to a compute server. Select one from this list. We’ll use crunchy5 as our running example, but please pick one yourself for load-balancing purposes. Then login to access.cims.nyu.edu and onward from there to crunchy5:

[your-machine ~]$ ssh -AX access.cims.nyu.edu  
[name@access2 ~]$ ssh -X crunchy5

Once on crunchy5, set up by doing:

[name@crunchy5 ~]$ module load vagrant  
[name@crunchy5 ~]$ vagrant plugin install vagrant-scp   
[name@crunchy5 ~]$ git clone https://github.com/nyu-cs202/base-image.git
[name@crunchy5 ~]$ cd ~/base-image
[name@crunchy5 ~]$ chmod 0700 .

From there, you can do:

[name@crunchy5 ~]$ vagrant up
[name@crunchy5 ~]$ vagrant ssh

as described in the VM section above.

Before you logout, do:

$ vagrant halt

Ongoing

Here is the command sequence you’ll probably use, starting from your personal machine

[your-machine ~]$ ssh -AX access.cims.nyu.edu  
[name@access2 ~]$ ssh -X crunchy5
[name@crunchy5 ~]$ module load vagrant  
[name@crunchy5 ~]$ cd base-image
[name@crunchy5 ~]$ vagrant up ; vagrant ssh
......

[name@crunchy5 ~]$ vagrant halt

Acknowledgments

Portions of this writeup were borrowed from Harvard’s CS61, Jinyang Li’s CS201, and Aurojit Panda’s 3033.