Graduate Special Topics in Computer Science

NOTE: for descriptions of standard graduate computer science courses, see Graduate Course Descriptions.

G22.3033-001 Statistical and Computational Learning Theory

This course is intended for highly motivated students who are actively engaged in research. As such, it has two main goals. The first goal is to familiarize students with the theory and practice of two popular approaches to machine learning. Students will be encouraged to apply what they learn to their own research as part of their course project.

The second goal is to improve students' presentation skills. After all, these skills are as important to researchers as any other. Therefore, the course will be run like a reading group. Participants will take turns presenting papers or chapters from relevant books. The presentations will be graded not just on the presenter's knowledge of the subject, but also on how well the presenter helped their classmates to understand the material.

For more information, see

G22.3033-002 Logic in Computer Science

A beginning graduate-level class in mathematical logic with motivation provided by applications in Computer Science. There are no formal prerequisites, but the pace of the class will require that the students can cope with a significant level of mathematical sophistication. Topics include: propositional and first-order logic; soundness, completeness, and compactness of first-order logic; first-order theories; undecidability and Godel's incompleteness theorem; and an introduction to other logics such as second-order and temporal logic.

G22.3033-003 Topics in Computer Graphics

Prerequisites: undergraduate-level mechanics and numerical methods

This course discusses the basics and state of the art of physically-based modeling for computer graphics and related fields. The topics covered will include a review of the necessary background in mechanics, relevant numerical methods and discussions of recent research papers in the area. The focus will be on simulation of dynamics and interaction of rigid objects and deformable objects.

G22.3033-004 Bioinformatics

This course explores the topics in biology (particularly, genomics, proteomics, transcriptomics, etc.) where mathematics, statistics and computer science meet life sciences to rigorously explore biological data and information. Syllabus:

  1. Genome Structure & Grammar
  2. Mapping
  3. Sequencing & Resequencing
  4. Transcription Maps: Gene Finding, Regulatory Sequences
  5. Structural Genomics & Proteomics
  6. Functional Genomics:
    • Genetic Networks
    • Gene Expression Arrays
    • Clustering Algorithms
    • Ideas from Learning Theory
  7. Population Genomics: SNPs, Linkage Analysis
  8. Comparative Genomics
  9. DNA Computers.

G22.3033-005 Heuristic Problem Solving

This course revolves around several new problems to computer science (derived from games or puzzles in columns for Dr. Dobb's Journal, Scientific American, and elsewhere). The idea is to train students to face a new problem, read relevant literature and come up with a solution. The solution entails winning a contest against other solutions. The winner receives candy. The best solutions will be part of an evolving "Omniheurist" website that we expect will get many visitors over the years.

The course is for highly motivated, mathematically adept students. It is open to supported PhD students and master's students who have passed the Core exam. Class size has been around 10 students in the past and we have all gotten to know one another very well. Algorithmic and programming knowledge are the main prerequisites. It also helps if you are familiar with a rapid prototyping language such as MatLab, Mathematica, K, Python, or some other language in which you are completely fluent.

Please see for more information.

G22.3033-006 Producing Production Quality Software

Arthur P. Goldberg

How to produce high-quality code, a practical course.
Obtaining good requirements and writing accurate specifications.
Inspecting requirements, specifications and code.
Using pseudocode to describe algorithms.

Small to medium sized programs:
Accurately implementing the specifications. Writing simple code:
function and class decomposition;
O-O class hierarchies (deciding whether to use composition or
inheritance). Using libraries (data structure selection).
Practical code correctness: using invariants.
Unit testing. Using assertions(). JUnit (or something better). Writing
tests in advance (Extreme programming).
Coverage testing.

Large programs:
Module decomposition and interfaces.
File & class organization.

We will look at a lot of code in class.
Assignments: a significant programming project, built from 4 or 5 assignments.

Prerequisite: the ability to program in Java, including understanding of classes, methods, Java typing, collections, iterators, exception handling, and I/O. If you do not like programming, do not take this class.

The professor (or trained TAs) will read and review all student code.


Steve C McConnell, Code Complete: A Practical Handbook of Software Construction, 857 pages (May 1993), Microsoft Press; ISBN: 1556154844
Tom Gilb, Dorothy Graham, Susannah Finzi (Editor), Software Inspection, Publisher: Addison-Wesley Pub Co; ISBN: 0201631814; 1st edition (December 1993)

Recommended or excerpted:
Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, Grady Booch (Designer), Design Patterns: Elements of Reusable Object-Oriented Software, 395 pages 1 edition (October 1995), Addison-Wesley Pub Co; ISBN: 0201633612

Karl E. Wiegers, Peer Reviews in Software: A Practical Guide,
Publisher: Addison Wesley Professional; ISBN: 0201734850; 1st edition
(December 15, 2001)

Capers Jones, Software development best practices,

gries 'the science of programming'

hetzen 'complete guide to software testing' (or a better testing book)

G22.3033-007 Models/Analysis of Real-Time/Hybrid Systems

Course Description

The class of computer systems whose modeling, analysis, and development are most challenging is the class of reactive systems. Reactive systems are systems whose role is to maintain an ongoing interaction with their environment rather than produce some final value upon termination. Typical examples of reactive systems are Network protocols, Air traffic control system, Programs controlling mechanical devices such as a train, a plane, or ongoing processes such as a nuclear reactor. Recently, there is an awakened interest in reactive systems due to the emergence of embedded systems as a major research field.

The Formal Methods approach to software analysis and construction is based on viewing a program and its execution as mathematical objects and applying mathematical and logical techniques to specify and analyze the properties and behavior of these objects. The main advantages of the formal approach to software construction is that, whenever applicable, it can lead to an increase of the reliability and correctness of the resulting programs by several orders of magnitude.

Several level of abstraction can be used in the study of reactive and embedded systems. At the highest level of abstraction, we only care about the temporal precedence of events and actions, and can guarantee that a system satisfies a requirement such as "every request is eventually followed by a response". However, at this level of modeling we cannot bound the response time by actual numbers. In the course, we consider two successive refinements of our modeling and specification abilities. The first refinement considers the model of real-time systems . In this model, we represent, specify, and analyze the temporal distance between events, thus being able to require and verify a property such as "every request is eventually followed by a response, which must occur within the time interval [3,5] from the request".

The next level of refinement considers the model of hybrid systems. In this model, we consider a system consisting of a combination of components some of which are discrete, such as programs, while the other are continuous, such as a motor driving a railroad gate barrier. This model employs discrete as well as continuous analysis tools in order to analyze and verify the correctness of hybrid systems, typically corresponding to a digital program controlling a continuous environment.

Course topics:

  • The computational model of Real-Time Systems (RTS).
  • A simple programming language (SPL),extended by timing information and its translation into RTS.
  • The specification language of Timed Temporal Logic (TTL).
  • Discrete and dense-time Timed Automata.
  • Model checking timed properties of timed automata.
  • Deductive verification of timed properties.
  • The computational model of Hybrid Systems.
  • Hybrid automata.
  • Model checking properties of hybrid automata.
  • Verification Diagrams.
  • Deductive verification of Hybrid Systems.

G22.3033-008 Web Services and Applications

Web Services and Applications explores how to build a web where content is dynamic and different services interact directly with each other. This course is motivated by computing industry's current interest in web services (witness Microsoft's .NET). However, while exploring the relevant standards (SOAP, WSDL, and UDDI), the course is not limited to them and tries to be a more general exploration of web technologies.

More specifically, the goals for the course are:

  • To understand current web technologies, including the underlying protocols and how to build services for a global audience.
  • To hatch ideas for future research on web services and applications.
  • To develop a methodology for building complex systems.

To meet these goals, the course has three components:
  • Reading assignments to introduce topics.
  • Lectures and in-class discussions to deepen students' understanding.
  • Programming assignments to provide students with hands-on experience.

G22.3033-009 WWW Programming

A programming course that focuses on covering many topics that relate to the WWW.

Topics covered include:

  • Object Oriented Concepts
  • The structure and protocols of the WWW: IP, TCP, UDP, SMTP, DNS, PING, HTTP
  • Programming in Java using J2SE
  • XML, XSL, XSLT, and XPath
  • WebServices
We will do five or more homework assignments in Java. These will include building applications that demonstrate:
  • A Swing UI
  • Sockets
  • Multithreading and RMI
  • Servlets
  • JSPs
  • WebServices
The class does NOT require any previous Java programming expertise, but programming in some language is required. You will be writing code and expected to get up to speed quickly on Java and the IDE we will use. You should have a windows 2000/XP or Linux machine with at least 256 megs (512 meg better) and a 750 mHz processor or better.

G22.3033-010 Application Servers

An application server is a rich, highly portable software program that runs on a middle tier server and handles all application operations between client applications running on pervasive devices and back-end databases and business applications. Application servers provide a platform independent programming interface for developing portable applications in a variety of programming languages. Application servers also facilitate the integration of legacy applications via on-the-fly transformation of XML messages, and support a wide variety of XML-enabled client channels that include traditional web clients and a growing set of smart devices. Application servers enable programmers to implement distributed applications using a variety of architectural patterns including traditional Model-View-Controller (MVC) architectures, Service Oriented Architectures (SOAs), Point-to-Point (P2P) Architectures, Grid-Computing Architectures, etc. Using an SOA architectural pattern, and emerging standards such as SOAP, application servers enable a new generation of "web services" that allow systems to make remote procedure calls to other systems over the Internet. Today, J2EE, .Net, and CORBA 3 application servers set the stage as enabling platforms for Web Services initiatives, Web appliances, and wireless applications.

This course concentrates on architecting, designing, and developing persistent software applications using application server technology. Throughout the course, students are exposed to the evolution of application server architectures that started in the mid 1990s, and experiment with corresponding approaches based on traditional client-server technology, CGI frameworks, page-based extended HTML environments, distributed object computing platforms, object management architectures, component-based computing environments, and frameworks based on MVC, SOA, P2P, etc. The course conveys the necessary skills to select the proper application server architecture based on projects' business and technical requirements. The course also explains how to integrate an application server into an existing Web site, as well as how to implement an application server-based Web application from the ground up. Students will learn how to configure and operate application servers in production environments, taking advantage of features available in mainstream commercial frameworks such as scalability, concurrency, security, fault tolerance, auto-deployment, communications support, development environment, and monitoring tools.

As they design and implement applications using commercial and open source application server technologies (e.g., IBM WebSphere 5.0, BEA WebLogic 8.1, .Net, OpenCCM 0.2, etc.), students will learn how to identify application patterns that lead to the most effective use of the various services provided within application server frameworks. The design and implementation of the persistence and legacy application integration layers, using related application server technology, will be particularly emphasized. Case studies, provided as part of the course, will demonstrate how medium- to large-size sites manage the complexities inherent in these endeavors. These case studies will help students get a firm understanding of how application servers work and how to best deploy complex applications in a real-world situation. Although, the course will strive to provide a comprehensive coverage and classification of application server technologies, attempts will be made whenever possible to select open source technologies for experimentation purpose. As part of the course, students will be exposed to next generation application server technologies including Model Driven Architectures (MDAs), as well as reflective, multimedia and agent-enabled frameworks

G22.3033-011 Systems Biology

Presently, there is no clear way to determine if the current body of biological facts is sufficient to explain phenomenology. In the biological community, it is not uncommon to assume certain biological problems to have achieved a cognitive finality without rigorous justification. In these particular cases, rigorous mathematical models with automated tools for reasoning, simulation, and computation can be of enormous help to uncover cognitive flaws, qualitative simplification or overly generalized assumptions. Some ideal candidates for such study would include: prion hypothesis, cell cycle machinery (DNA replication and repair, chromosome segregation, cell-cycle period control, spindle pole duplication, etc.), muscle contractility, processes involved in cancer (cell cycle regulation, angiogenesis, DNA repair, apoptosis, cellular senescence, tissue space modeling enzymes, etc.), signal transduction pathways, circadian rhythms (especially the effect of small molecular concentration on its robustness), and many others. We believe that the difficulty of biological modeling will become acute as biologists prepare to understand even more complex systems.

Fortunately, in the past, similar issues had been faced by other disciplines: for instance, design of complex microprocessors involving many millions of transistors, building and controlling a configurable robots involving very high degree-of-freedom actuators, implementing hybrid controllers for high-way traffic or air-traffic, or even reasoning about data traffic on a computer network. The approaches developed by control theorists analyzing stability of a system with feedback, physicists studying asymptotic properties of dynamical systems, computer scientists reasoning about a discrete or hybrid (combining discrete events with continuous events) reactive systems---all have tried to address some aspects of the same problem in a very concrete manner. We believe that biological processes could be studied in a similar manner, once the appropriate tools are made available.

The goal of this course is to understand, design and create a large-scale computational system centered on the biology of individual cells, population of cells, intra-cellular processes, and realistic simulation and visualization of these processes at multiple spatio-temporal scales. Such a reasoning system, in the hands of a working biologist, can then be used to gain insight into the underlying biology, design refutable biological experiments, and ultimately, discover intervention schemes to suitably modify the biological processes for therapeutic purposes. The course will focus primarily on two biological processes: genome-evolution and cell-to-cell communication.

G22.3033-012 Introduction to Computational Number Theory & Algebra

This course is an introduction to computational aspects of number theory and algebra. The topics to be covered (tentatively) include: long integer arithemtic, Euclid's algorithm and its applications, primality testing, factorization of polynomials and integers, algorithms for discrete logarithms, and lattice basis reduction and its applications. The selection of topics is geared towards those that are especially useful in modern cryptography, but are anyway interesting in their own right.

There are no formal prerequisites, although some exposure to probability theory and abstract algebra (at the undergraduate level) would be helpful. The text will be a manuscript that is currently in preparation, and it includes all the necessary mathematical background.

Grades will be based on problem sets.

top | contact