Back to Computer Science courses page for Spring 2001

**Course home page for:**

A first graduate course in computer algorithms, covering the basic algorithms and data structures together with techniques of analysis. For specific topics, see the tentative syllabus below. The course is intended to prepare the student for the Computer Science Department written comprehensive examination in algorithms. It is strongly recommended that the student be familiar with data structures such as queues and linked lists, have a working knowledge of C/C++ or Java, and understand mathematics through calculus.

The course grade will be based on in class midterm and final examinations as well as weekly homework assignments. The assignments will mostly be theoretical, but some will require programming.

Thomas Cormen, Charles Leiserson, and Ronald Rivest, **Introduction
to Algorithms, **MIT Press, Cambridge, Massachesetts, and McGraw-Hill,
New York, 1990.

The first thing you will notice about this book, after weighing it, is that the introduction is endless. We sill skip most of it. Also, we will cover nothing from Chapter 27 on.

You can communicate with me be sending email to goodman@cims.nyu.edu. You can communicate with the class using the bulletin board. If you have a question or are looking for homework partners or whateber, start with a post. I will post announcements to there, so you should check it often. In particular, I will post corrections to homework questions or hints to answers. If you have a question on the homework, it is better if you post it to the bboard than if you email it to me.

These items roughly correspond to weekly lectures, but some are too
short and others are too long. Please make every effort to **do the reading
before class**. We will have the problem that the students have varied
backgrounds. Some will find the mathematical parts in the beginning more
challenging. If you are one of these, please work hard to hang on in the
early weeks. Others will be familiar with much of the material, I will
try to add enough spice to stave off boredom.

- Three sorting algorithms: insertion sort, bubble sort, merge sort, and their mathematical analysis. Orders of magnitude, big "O" notation. Recursive algorithms and recurrence relations.
- Heaps, priority queues, heapsort. Application: event driven simulation
- Quicksort: worst case and randomized analysis. Quickselect. Bucket, radix, and counting sort.
- Review of data structures: stacks, queues, pointers, linked lists, binary and general trees.
- Hashing and hash tables, collision resolution strategies, choice fo hash functions, dynamic hash tables and amortized analysis. Application: symbol tables.
- Binary search trees: maintenence and traversal. Balance, rotations, balancing.
- Advanced trees: RB trees and B trees.
- Set algorithms: union and find, and path compression
- Graphs and graph searching. Breadth first and depth first search. Connected components and topoligical sort. Application: code dependency.
- Minimum spanning trees, the algorithms of Kruskal and Prim.
- Shortest path algorithms and variants, Dijkstra and Bellman - Ford angorithms.
- Dynamic programming.
- Greedy algorithms, uses and pitfalls of heuristic algorithms. Huffman coding.

**Read**: Chapter 1, paying attention to: the definition of insertion
sort and merge sort and the running times for these algorithms as a function
of N, the number of items to be sorted. Read and learn Section 2.1. Read
over Section 2.2. This is mostly for notation (lg and ln, for example).
Don't worry if you don't already know Sterling's approximation (usually
called Sterling's formula). One of our goals is to develop so much intuition
and experience that problem 2-3 (p. 38) is child's play. Skim Chapter 3;
we will do a simpler form of this. Read and learn Sections 4.1 and 4.2.
This is the most mathematical part of the course and usually gives people
trouble. Skim the rest of Chapter 4. Don't worry if this seems like too
much reading for one class, the writing is verbose and the chapters are
short.

Remind yourself about binary trees (Section 5.5,and earlier parts of
Chapter 5 if necessary). **Read** and learn Chapter 7.

Quicksort is discussed in chapter 8. Section 8.1 describes the basic
idea: divide and conquer using pivots. Section 8.2 presents intuition on
the importance of balance. One should distinguish between randomized analysis
(the numbers to be sorted are assumed to be random)and the randomized algorithm
(the pivots are chosen at random). In the real world, the objects to be
sorted are often far from random. Nevertheless, the programmer is free
to choose random pivots. Without random pivots, quicksort is bad. The important
concept in Section 8.2 is *balance*, learn it. Section 8.3 contains
hard but very important theory. Hit it hard. You can review probability
in Sections 6.2 and 6.3. Section 6.6 gives some easy and worderful examples
of elementary probability theory at work. You may find it empowering. Skim
Section 8.4.

A close relative of quicksort is quickselect (called "randomized select" here), an algorithm for finding the median of N numbers in order N time. Read about it in Section 10.2. Section 10.3 has a clever version of this that avoids randomization but is worse in practice almost all the time. The rest of Chapter 10 is easy; read it too.

Chapter 9 is a grab bag of topics related to sorting. First, in Section 9.1, is the famous order N*log(N) lower bound for comparison based sorting. You will be responsible for this fact and its proof. The remaining sections discuss sorting algorithms that are faster in special cases, such as sorting words with a finite alphabet and of a fixed length (e.g telephone numbers). This will be easy by comparison to what came before.

For hash tables, read sections 11.1 and 11.2. You should understand stacks and queues and linked lists. In particular, some hashing algorithms use a linked list.

Read section 12.1: the operations hash tables do: insert, delete, find. Read Section 12.2: what is a hash function? Suppose you were trying to store about 1000 words (character strings) in a hash table, what would be the argument of the hash function and what would it return? Why must you provide for collisions? What is the "chaining" resolution of collisions? Read section 12.3: Why is it good for hash functions to seem random? What are some simple hash functions you can use in practice? Read section 12.4: open addressing and probing. What are the various probing strategies and the pros and cons of each?

Sorting assumes that you get all the numbers at once. Once the list is sorted, it is too late to add new numbers. Binary search trees remove that difficulty. You can insert, find, and delete elements from a sorted list with all operations taking O(h), where h is the height of the tree. If the tree is balanced, then n ~ 2^h (equivalently, h = lg(n)) where n is the number of elements in the list. Read Chapter 13 sections 13.1 through 13.3. Learn what a binary search tree is, how to traverse it, and how to find elements in the tree. Learn the simplest insert and delete methods. Section 13.4 is more technical and less important. You can think of it as motivation for balanced tree algorithms that follow.

The algorithms in Chapter 13 can produce wildly unbalanced trees. More complex tree algorithms and data structures guarantee that the tree is nearly balanced at all times. It is surprising that this has little impact on the time required for the basic insert/find/delete algorithms. Read Chapter 14 on RB trees. Understand what an RB tree is and why it must be almost balanced. Section 14.2 describes rotations, the main extra operation that allows us to rebalance a tree that has become unbalanced. Study the insert and delete algorithms in sections 14.3 and 14.4. These are the most complicated algorithms we've seen so far. The only more complicated ones in the course are for B-trees. The algorithms are quite efficient despite their complexity.

Homework and other downloads

Homework 1: postscript (.ps) format PDF (.pdf) format

Homework 2: postscript (.ps) format PDF (.pdf) format

Homework 3: postscript (.ps) format PDF (.pdf) format

Programming assignment 1: postscript (.ps) format PDF (.pdf) format

Data for programming assignment 1

Hand written answers to homework 1.

Homework 1a. If you did poorly on homework 1 and want to raise your grade on this, doing homework 1a will help. I don't know exactly how much, but some: postscript (.ps) format PDF (.pdf) format

Homework 4: postscript (.ps) format PDF (.pdf) format

Some extra problems for practice: postscript (.ps) format PDF (.pdf) format

Homework 5: postscript (.ps) format PDF (.pdf) format

Homework 2a. If you did poorly on homework 2 and want to raise your grade on this, doing homework 2a will help. I don't know exactly how much, but some: postscript (.ps) format PDF (.pdf) format

Homework 6: postscript (.ps) format PDF (.pdf) format

Sorting code fragments:

Homework 3a. If you did poorly on homework 3 and want to raise your grade on this, doing homework 3a will help. I don't know exactly how much, but some: postscript (.ps) format PDF (.pdf) format

Homework 7: postscript (.ps) format PDF (.pdf) format

Homework 8: postscript (.ps) format PDF (.pdf) format

Homework 9: postscript (.ps) format PDF (.pdf) format

Homework 10: postscript (.ps) format PDF (.pdf) format

Sample final exam. This is the exam I gave last time: postscript (.ps) format PDF (.pdf) format

Quick exam review guide: postscript (.ps) format PDF (.pdf) format

The final exam, including typos: postscript (.ps) format PDF (.pdf) format

Answers to the final: postscript (.ps) format PDF (.pdf) format