
# New York University

### Course description

A first graduate course in computer algorithms, covering the basic algorithms and data structures together with techniques of analysis. For specific topics, see the tentative syllabus below. The course is intended to prepare the student for the Computer Science Department written comprehensive examination in algorithms. It is strongly recommended that the student be familiar with data structures such as queues and linked lists, have a working knowledge of C/C++ or Java, and understand mathematics through calculus.

The course grade will be based on in-class midterm and final examinations as well as weekly homework assignments. The assignments will mostly be theoretical, but some will require programming.

### Text

Thomas Cormen, Charles Leiserson, and Ronald Rivest, Introduction to Algorithms, MIT Press, Cambridge, Massachusetts, and McGraw-Hill, New York, 1990.

The first thing you will notice about this book, after weighing it, is that the introduction is endless. We will skip most of it. Also, we will cover nothing from Chapter 27 on.

### Communication

You can communicate with me by sending email to goodman@cims.nyu.edu. You can communicate with the class using the bulletin board. If you have a question or are looking for homework partners or whatever, start with a post. I will post announcements there, so you should check it often. In particular, I will post corrections to homework questions or hints to answers. If you have a question on the homework, it is better if you post it to the bboard than if you email it to me.

### Syllabus (tentative)

These items roughly correspond to weekly lectures, but some are too short and others are too long. Please make every effort to do the reading before class. Students in this course have varied backgrounds. Some will find the mathematical parts in the beginning more challenging; if you are one of these, please work hard to hang on in the early weeks. Others will be familiar with much of the material; I will try to add enough spice to stave off boredom.

1. Three sorting algorithms: insertion sort, bubble sort, merge sort, and their mathematical analysis. Orders of magnitude, big "O" notation. Recursive algorithms and recurrence relations.
2. Read: Chapter 1, paying attention to the definitions of insertion sort and merge sort and the running times of these algorithms as a function of N, the number of items to be sorted. Read and learn Section 2.1. Read over Section 2.2; this is mostly for notation (lg and ln, for example). Don't worry if you don't already know Stirling's approximation (usually called Stirling's formula). One of our goals is to develop so much intuition and experience that problem 2-3 (p. 38) is child's play. Skim Chapter 3; we will do a simpler form of this. Read and learn Sections 4.1 and 4.2. This is the most mathematical part of the course and usually gives people trouble. Skim the rest of Chapter 4. Don't worry if this seems like too much reading for one class; the writing is verbose and the chapters are short.
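
To make the first week's reading concrete, here is a minimal Java sketch of both sorting algorithms (the class and method names are mine, not the textbook's). Insertion sort runs in O(N^2) time in the worst case; merge sort runs in O(N lg N):

```java
import java.util.Arrays;

public class SortDemo {
    // Insertion sort: grow a sorted prefix, shifting larger elements right.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i], j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Merge sort: split in half, sort each half recursively, merge.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a;
        int mid = a.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        int[] out = new int[a.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length)
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 4, 6, 1, 3};
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 5, 6]
        System.out.println(Arrays.toString(mergeSort(new int[]{9, 7, 8, 3}))); // [3, 7, 8, 9]
    }
}
```

Try timing both on large random arrays; the difference between N^2 and N lg N becomes obvious quickly.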

3. Heaps, priority queues, heapsort. Application: event-driven simulation.
4. Remind yourself about binary trees (Section 5.5, and earlier parts of Chapter 5 if necessary). Read and learn Chapter 7.
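
As a preview of Chapter 7, here is a compact Java sketch of heapsort (illustrative code, not the textbook's pseudocode): build a max-heap bottom-up, then repeatedly swap the maximum to the end and restore the heap.

```java
import java.util.Arrays;

public class HeapSortDemo {
    // Restore the max-heap property at index i, treating a[0..n-1] as the heap.
    static void siftDown(int[] a, int i, int n) {
        while (true) {
            int l = 2 * i + 1, r = 2 * i + 2, big = i;
            if (l < n && a[l] > a[big]) big = l;
            if (r < n && a[r] > a[big]) big = r;
            if (big == i) return;
            int t = a[i]; a[i] = a[big]; a[big] = t;
            i = big;
        }
    }

    // Heapsort: build a max-heap in O(N), then extract the max N times.
    static void heapSort(int[] a) {
        int n = a.length;
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, i, n);
        for (int end = n - 1; end > 0; end--) {
            int t = a[0]; a[0] = a[end]; a[end] = t; // move current max to its place
            siftDown(a, 0, end);
        }
    }

    public static void main(String[] args) {
        int[] a = {4, 10, 3, 5, 1};
        heapSort(a);
        System.out.println(Arrays.toString(a)); // [1, 3, 4, 5, 10]
    }
}
```

The same sift-down heap, keyed by event time instead of value, is exactly the priority queue that drives an event-driven simulation.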

5. Quicksort: worst case and randomized analysis. Quickselect. Bucket, radix, and counting sort.
6. Quicksort is discussed in Chapter 8. Section 8.1 describes the basic idea: divide and conquer using pivots. Section 8.2 presents intuition on the importance of balance. One should distinguish between randomized analysis (the numbers to be sorted are assumed to be random) and the randomized algorithm (the pivots are chosen at random). In the real world, the objects to be sorted are often far from random. Nevertheless, the programmer is free to choose random pivots. Without random pivots, quicksort is bad. The important concept in Section 8.2 is balance; learn it. Section 8.3 contains hard but very important theory. Hit it hard. You can review probability in Sections 6.2 and 6.3. Section 6.6 gives some easy and wonderful examples of elementary probability theory at work. You may find it empowering. Skim Section 8.4.

A close relative of quicksort is quickselect (called "randomized select" here), an algorithm for finding the median of N numbers in order N time. Read about it in Section 10.2. Section 10.3 has a clever version of this that avoids randomization but is worse in practice almost all the time. The rest of Chapter 10 is easy; read it too.
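
A minimal Java sketch of quickselect with random pivots (my own illustrative code, using the Lomuto-style partition; the seed is fixed only so runs are repeatable). Each round partitions, then recurses into only the side containing the k-th element, giving expected O(N) time:

```java
import java.util.Random;

public class QuickSelectDemo {
    static final Random rng = new Random(1);

    // Return the k-th smallest element of a[lo..hi] (k = 0 for the minimum).
    static int select(int[] a, int lo, int hi, int k) {
        while (true) {
            if (lo == hi) return a[lo];
            int p = partition(a, lo, hi);
            if (k == p) return a[p];
            else if (k < p) hi = p - 1;  // answer is in the left part
            else lo = p + 1;             // answer is in the right part
        }
    }

    // Partition around a randomly chosen pivot; return the pivot's final index.
    static int partition(int[] a, int lo, int hi) {
        int r = lo + rng.nextInt(hi - lo + 1);
        int t = a[r]; a[r] = a[hi]; a[hi] = t;   // move pivot to the end
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) { t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        t = a[i]; a[i] = a[hi]; a[hi] = t;       // pivot into its final slot
        return i;
    }

    public static void main(String[] args) {
        int[] a = {7, 1, 5, 9, 3};
        System.out.println(select(a, 0, a.length - 1, 2)); // median of 5 elements: 5
    }
}
```

Note that unlike sorting, the array ends up only partially ordered; that is why the whole thing can run in linear expected time.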

Chapter 9 is a grab bag of topics related to sorting. First, in Section 9.1, is the famous order N*log(N) lower bound for comparison-based sorting. You will be responsible for this fact and its proof. The remaining sections discuss sorting algorithms that are faster in special cases, such as sorting words with a finite alphabet and of a fixed length (e.g., telephone numbers). This will be easy by comparison to what came before.
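
Counting sort is the simplest of these special-case algorithms. A minimal Java sketch (my own, for keys in a small known range 0..maxKey): it beats the N*log(N) lower bound by never comparing elements at all, running in O(N + maxKey) time.

```java
import java.util.Arrays;

public class CountingSortDemo {
    // Sort keys in the range 0..maxKey by tallying occurrences of each value.
    static int[] countingSort(int[] a, int maxKey) {
        int[] count = new int[maxKey + 1];
        for (int x : a) count[x]++;          // count each key
        int[] out = new int[a.length];
        int k = 0;
        for (int v = 0; v <= maxKey; v++)    // emit each value count[v] times
            for (int c = 0; c < count[v]; c++) out[k++] = v;
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countingSort(new int[]{3, 0, 2, 3, 1}, 3)));
        // [0, 1, 2, 3, 3]
    }
}
```

The textbook's version uses prefix sums to make the sort stable, which matters when counting sort is used as the per-digit subroutine inside radix sort; this tally version is the bare idea only.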

7. Review of data structures: stacks, queues, pointers, linked lists, binary and general trees.
8. For hash tables, read sections 11.1 and 11.2. You should understand stacks and queues and linked lists. In particular, some hashing algorithms use a linked list.

9. Hashing and hash tables, collision resolution strategies, choice of hash functions, dynamic hash tables and amortized analysis. Application: symbol tables.
10. Read Section 12.1: the operations hash tables support: insert, delete, find. Read Section 12.2: what is a hash function? Suppose you were trying to store about 1000 words (character strings) in a hash table: what would be the argument of the hash function and what would it return? Why must you provide for collisions? What is the "chaining" resolution of collisions? Read Section 12.3: why is it good for hash functions to seem random? What are some simple hash functions you can use in practice? Read Section 12.4: open addressing and probing. What are the various probing strategies and the pros and cons of each?
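
A toy Java sketch of chaining, to fix the reading questions in your mind (the table size, hash function, and class names are illustrative choices, not prescriptions): the hash function maps a string to one of M slots, and each slot holds a linked list of all keys that landed there, so collisions are handled by appending.

```java
import java.util.LinkedList;

public class ChainedHashDemo {
    static final int M = 8;  // deliberately tiny so collisions actually occur

    // One linked list ("chain") per slot.
    @SuppressWarnings("unchecked")
    static LinkedList<String>[] table = new LinkedList[M];

    // A simple polynomial hash of the characters, reduced mod the table size.
    static int hash(String key) {
        int h = 0;
        for (char c : key.toCharArray()) h = (31 * h + c) % M;
        return h;
    }

    static void insert(String key) {
        int h = hash(key);
        if (table[h] == null) table[h] = new LinkedList<>();
        table[h].add(key);
    }

    static boolean find(String key) {
        int h = hash(key);
        return table[h] != null && table[h].contains(key);
    }

    public static void main(String[] args) {
        insert("heap"); insert("stack"); insert("queue");
        System.out.println(find("stack")); // true
        System.out.println(find("deque")); // false
    }
}
```

Note that the argument of the hash function is the whole string and the return value is just a slot index; everything hashing to the same slot is distinguished afterward by ordinary equality tests on the chain.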

11. Binary search trees: maintenance and traversal. Balance, rotations, and rebalancing.
12. Sorting assumes that you get all the numbers at once; once the list is sorted, it is too late to add new numbers. Binary search trees remove that difficulty. You can insert, find, and delete elements from a sorted list with all operations taking O(h), where h is the height of the tree. If the tree is balanced, then n ~ 2^h (equivalently, h ~ lg(n)), where n is the number of elements in the list. Read Sections 13.1 through 13.3 of Chapter 13. Learn what a binary search tree is, how to traverse it, and how to find elements in the tree. Learn the simplest insert and delete methods. Section 13.4 is more technical and less important; you can think of it as motivation for the balanced-tree algorithms that follow.
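
The simplest insert and find operations fit in a few lines of Java (an illustrative sketch of an unbalanced tree; deletion, the fiddlier operation, is left to the reading). The invariant is that everything in a node's left subtree is smaller than its key and everything in the right subtree is larger, so an in-order traversal recovers sorted order:

```java
public class BstDemo {
    static class Node {
        int key; Node left, right;
        Node(int k) { key = k; }
    }

    // Walk down comparing keys; attach the new node at the null we fall off.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Each comparison discards one subtree, so this takes O(h) steps.
    static boolean find(Node root, int key) {
        while (root != null) {
            if (key == root.key) return true;
            root = (key < root.key) ? root.left : root.right;
        }
        return false;
    }

    // In-order traversal: left subtree, node, right subtree -> sorted order.
    static void inorder(Node root, StringBuilder out) {
        if (root == null) return;
        inorder(root.left, out);
        out.append(root.key).append(' ');
        inorder(root.right, out);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[]{5, 2, 8, 1, 9}) root = insert(root, k);
        StringBuilder sb = new StringBuilder();
        inorder(root, sb);
        System.out.println(sb.toString().trim()); // 1 2 5 8 9
        System.out.println(find(root, 8));        // true
    }
}
```

Insert the keys in already-sorted order instead and the "tree" degenerates into a linked list with h = n, which is exactly the problem the balanced-tree algorithms below solve.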

The algorithms in Chapter 13 can produce wildly unbalanced trees. More complex tree algorithms and data structures guarantee that the tree is nearly balanced at all times. It is surprising that this has little impact on the time required for the basic insert/find/delete algorithms. Read Chapter 14 on RB trees. Understand what an RB tree is and why it must be almost balanced. Section 14.2 describes rotations, the main extra operation that allows us to rebalance a tree that has become unbalanced. Study the insert and delete algorithms in sections 14.3 and 14.4. These are the most complicated algorithms we've seen so far. The only more complicated ones in the course are for B-trees. The algorithms are quite efficient despite their complexity.
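
The rotation itself is tiny; all the complexity of the RB-tree algorithms is in deciding when to apply it. A minimal Java sketch of a left rotation (illustrative code; parent pointers and colors, which the textbook's version carries, are omitted):

```java
public class RotateDemo {
    static class Node {
        int key; Node left, right;
        Node(int k, Node l, Node r) { key = k; left = l; right = r; }
    }

    // Left rotation at x: x's right child y becomes the subtree root,
    // x becomes y's left child, and y's old left subtree moves under x.
    // The BST ordering of the keys is preserved; only the shape changes.
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        return y;
    }

    public static void main(String[] args) {
        //   2                 4
        //    \               / \
        //     4     ->      2   5
        //    / \             \
        //   3   5             3
        Node x = new Node(2, null,
                 new Node(4, new Node(3, null, null), new Node(5, null, null)));
        Node root = rotateLeft(x);
        System.out.println(root.key + " " + root.left.key + " " + root.left.right.key);
        // 4 2 3
    }
}
```

Three pointer assignments, constant time; that is why rebalancing after an insert or delete costs only O(lg n) overall.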

13. Advanced trees: RB trees and B trees.
14. Set algorithms: union and find, and path compression
15. Graphs and graph searching. Breadth-first and depth-first search. Connected components and topological sort. Application: code dependency.
16. Minimum spanning trees, the algorithms of Kruskal and Prim.
17. Shortest-path algorithms and variants; the Dijkstra and Bellman-Ford algorithms.
18. Dynamic programming.
19. Greedy algorithms, uses and pitfalls of heuristic algorithms. Huffman coding.
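
As a taste of the set algorithms in item 14, here is a minimal Java sketch of union-find with path compression (my own illustrative code; the textbook's version adds union by rank): each set is a tree of parent pointers, and find flattens the path it walks so later queries are nearly constant time.

```java
public class UnionFindDemo {
    // parent[x] == x means x is the representative (root) of its set.
    static int[] parent;

    // Follow parent pointers to the root, and point every node on the
    // path directly at the root on the way back (path compression).
    static int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    // Merge the two sets by linking one root under the other.
    static void union(int x, int y) {
        parent[find(x)] = find(y);
    }

    public static void main(String[] args) {
        parent = new int[6];
        for (int i = 0; i < 6; i++) parent[i] = i;  // six singleton sets
        union(0, 1); union(1, 2); union(3, 4);
        System.out.println(find(0) == find(2)); // true: same set
        System.out.println(find(0) == find(4)); // false: different sets
    }
}
```

This same structure is what makes Kruskal's minimum-spanning-tree algorithm (item 16) efficient: it answers "are these two vertices already connected?" in nearly constant amortized time.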

Homework 1: postscript (.ps) format            PDF (.pdf) format

Homework 2: postscript (.ps) format            PDF (.pdf) format

Homework 3: postscript (.ps) format            PDF (.pdf) format

Programming assignment 1: postscript (.ps) format            PDF (.pdf) format

Hand written answers to homework 1.

Homework 1a. If you did poorly on homework 1 and want to raise your grade on this, doing homework 1a will help. I don't know exactly how much, but some: postscript (.ps) format            PDF (.pdf) format

Homework 4: postscript (.ps) format            PDF (.pdf) format

Some extra problems for practice: postscript (.ps) format            PDF (.pdf) format

Homework 5: postscript (.ps) format            PDF (.pdf) format

Homework 2a. If you did poorly on homework 2 and want to raise your grade on this, doing homework 2a will help. I don't know exactly how much, but some: postscript (.ps) format            PDF (.pdf) format

Homework 6: postscript (.ps) format            PDF (.pdf) format

Sorting code fragments:

Homework 3a. If you did poorly on homework 3 and want to raise your grade on this, doing homework 3a will help. I don't know exactly how much, but some: postscript (.ps) format            PDF (.pdf) format

Homework 7: postscript (.ps) format            PDF (.pdf) format

Homework 8: postscript (.ps) format            PDF (.pdf) format

Homework 9: postscript (.ps) format            PDF (.pdf) format

Homework 10: postscript (.ps) format            PDF (.pdf) format

Sample final exam. This is the exam I gave last time: postscript (.ps) format            PDF (.pdf) format

Quick exam review guide: postscript (.ps) format            PDF (.pdf) format

The final exam, including typos: postscript (.ps) format            PDF (.pdf) format

Answers to the final: postscript (.ps) format            PDF (.pdf) format