G22.3033-007

Special Topics in Problem Solving, concentrating on problems drawn from distributed computing and computational biology (Tuesdays 7-9)

Dennis Shasha

Graduate Division

Computer Science


 

GOAL

Interesting problems usually can't be solved in closed form or using polynomial algorithms. Ambulances can't be scheduled based on rough hunches in a castrophe. Shipping companies must exploit non-linear constraints involving profit margins, demand, and volume, and still keep the boats moving. The same goes for optimal substring search for microarray design or optimal portfolio management. The people who program the practical solutions to these problems comprise the elite of their technical organization. They are sought after throughout their careers, because of their versatality, imagination, and programming talent. I call them omniheurists -- solvers of all problems.

This course aims to develop your skills as an omniheurist. We will study problems that require heuristics and approximation algorithms. The heuristics include branch and bound, simulated annealing, tabu searching, evolutionary algorithms, and adaptive gradient methods. Approximation algorithms to NP-complete problems will help for subproblems. The problems take minutes to explain but the challenge they pose will keep you busy in a creative way for a few hours. I don't promise an easy course, but I believe you will enjoy this class and find it useful.

The goal of this course, then, is to train you to face a new problem, integrate techniques you have learned, arrive at a solution rapidly, and be able to demonstrate (experimentally and sometimes competitively) that the solution is both efficient and close to the best possible.

Prerequsities: this is a course for students who can think creatively about algorithms and are willing to prototype them. I will spend an hour teaching the prototyping language I use (for all of my research) but you are free to use any you like.

In the fall of 2002, the course emphasis will be on problem-solving in two areas: distributed computing and biological computing. The distributed computing projects will involve adversarial games in which distributed computing goals (e.g. ambulance pickup; competitive consumption) must be achieved in the context of an adversary who can cause failures or otherwise try to thwart you. The biological computing projects will involve combinatorial problems having to do with verifying DNA sequences using Crick-Watson pairing, the reconstruction of sequences using restriction enzymes, the use of combinatorial design for experiments, and and so on. The course will present all the necessary domain knowledge. At the end, you will be able to face a difficult problem in an area where the domain is not familiar to you, but still be able to contribute a computational solution.

Depending on the size of the class, you will work alone or in a team of two. In some cases the problem will be an as yet unanalyzed game and your team's task will be to compete with other teams. The games are games of strategy and full information. A side effect of each effort will be an evolving "Omniheurist" website that we expect will get many visitors over the years. Sometimes, you will have a design problem, but in all cases you will work as a member of your team.

This is hands-on computer science. I will lecture only on a few topics -- game-playing in AI, a prototyping language K , a description of the puzzles, heuristic techniques, and approximation algorithms for NP-complete problems. I will also serve mostly as a moderator for class discussion. You will be graded on how your program works, on your insights in class, and on your verbal explanation of the strategies you use. Attendance is therefore mandatory unless you have a written excuse from a physician or an employer.

Similar courses have been given by Donald Knuth at Stanford and Ken Ross at Columbia.

Class Prerequisites

To take the class, you should love challenging puzzles, enjoy rapid prototyping, and have some sense of concurrent programming. You will like the class if you think algorithmically and enjoy translating algorithms into programs. Formally, the only requirement is Fundamental Algorithms or its equivalent. You should have done well however.

It helps if you know some rapid prototyping language that allows you to generate test data easily, test different approaches, and has built in set and sequence and matrix operators. For some, this will be C++ with the Standard Template Library, but for others in may be Python, Perl, or Mathematica.

Basic Class Structure

For each problem, there will be three relevant classes: Presentation, Discussion, and Competition. So, on class i, I will present problem k, we will discuss k in class i+1, and there will be a competition for k in class i+2 (when problem k+1 will be presented).

For each problem, there will be an Architecture team and Competing Teams. These will be assigned on Presentation night.

Grading: since your teams will be generated at random, there is a danger that someone will not do the work. This will come out in the class discussion.

In addition to his or her other roles, one student will act as the scribe for the presentation and discussion of a game.

Class schedule

Books

Class Materials -- a growing list

Game playing in Artificial Intelligence from the paper by Pantel: "Advanced Adversary Search Algorithms"

Heuristic search algorithm from AI A* description by Justin Heyes-Jones

Genetic Algorithms Tutorial and an example of a genetic algorithm for the knapsack problem implemented in K.

Chris Poultney's Voronoi game implementation

Here is how to call C from K: Don Orth's description of how to call C from K.

Here is a powerpoint overview of the game framework in php by Alberto Lerner . Here is a text description of php and here are the php pages themselves for tic-tac-toe.

Here is Tyler Neylon's code for the knapsack that performs evolutionary algorithms on general chromosome structures.