Graduate Special Topics in Computer Science

NOTE: for descriptions of standard graduate computer science courses, see Graduate Course Descriptions.

G22.3033-001 Computational Photography

Computational Photography is an exciting new area at the intersection of Computer Graphics and Computer Vision. Through the use of computation, its goal is to move beyond the limitations of conventional photography to produce enhanced and novel imagery of the world around us. The main focus of the course will be on software-based methods for producing visually compelling pictures. However, it will also cover novel camera designs in which computation is integral to operation. The course will explain the principles behind many of the advanced tools that can be found in Adobe Photoshop, although the use of this package itself is outside the scope of the course. The course will be suitable for advanced undergraduates, master's students and PhD students. A reasonable knowledge of linear algebra is required and familiarity with Matlab is desirable. Assessment will be through coursework and a course project.

See the course homepage for more information.

G22.3033-002 Data Mining

We live in the age of information and knowledge management. The importance of collecting data that reflects business or scientific activities to achieve competitive advantage is widely recognized today. Advanced systems for collecting data and managing it in large databases are in place in most large and mid-range companies. However, the bottleneck in turning this data into a competitive advantage is the difficulty of extracting knowledge about the system from the collected data.

Below are some of the questions that can be answered if information hidden in a database can be found explicitly and utilized:

What goods should be promoted to this customer?
What is the probability that a certain customer will respond to a planned promotion?
Can one predict the most profitable securities to buy/sell during the next trading session?
Will this customer default on a loan or pay back on schedule?
What medical diagnosis should be assigned to this patient?
How large are the peak loads of a telephone or energy network going to be?
Why does the manufacturing facility suddenly start to produce defective goods?

Modeling the investigated system and discovering relations that connect variables are the subject of data mining.

The course introduces the concepts and techniques of data mining and data warehousing, including their principles, architecture, design, implementation, and applications.

Topics covered include the following:
Data warehousing and OLAP technology for data mining
Data preprocessing
Descriptive data mining: characterization and comparison
Association analysis
Classification and prediction
Cluster analysis
Mining complex types of data
Applications and trends in data mining

See the course homepage for more information.

G22.3033-003 Computational Systems Biology

The course focuses on statistically determining the relations between genotypes and phenotypes. We now know that the human genome contains millions of SNPs (single-nucleotide polymorphisms), and thousands more variations in the number of copies of large and small segments of the genome (CNVs: copy number variations), which may either directly cause changes in phenotype (e.g., TAS) or tag nearby mutations containing the key differences that influence individual variation (e.g., TASPs) and susceptibility to disease.

GWA (Genome-Wide Association) studies allow one to sample a large number of SNPs from many patients, thus capturing variation uniformly across the genome. Recently, there has been enormous interest in such studies, as they have succeeded in identifying risk and protective factors for asthma, cancer, diabetes, heart disease, mental illness and other human differences. For instance, in 2005, it was learned through a small-scale GWAS that age-related macular degeneration is associated with variation in the gene for complement factor H, which produces a protein that regulates inflammation. One expects GWAS to play a significant role in drug discovery and personalized medicine, and to be important in modern models of health care (e.g., evidence-based medicine). For instance, it was found that different genetic variants are associated with different responses to various anti-hepatitis C virus treatments: for genotype 1 hepatitis C treated with Pegasys combined with ribavirin, genetic polymorphisms near the human IL28B gene are strongly associated with response to the treatment. One expects to find and catalogue many such facts.

This course will focus on the algorithmic, statistical and genetic aspects of this problem. Thus, we will develop specialized methods for Machine Learning (supervised and unsupervised), Classification, Model Selection, Multiple Hypotheses Testing and Experiment Design (pooling and group-testing).

Required Textbooks:

Applied Statistical Genetics with R: For Population-based Association Studies (Use R);
Author: Andrea S. Foulkes
Publisher/Edition (Yr. or No.): Springer; 1st edition (April 17, 2009).

Recommended textbooks:

Mathematical and Statistical Methods for Genetic Analysis;
Author: Kenneth Lange;
Publisher/Edition (Yr. or No.): Springer; 2nd edition (June 3, 2003).

Statistical Genetics of Quantitative Traits: Linkage, Maps and QTL
Authors: Rongling Wu, Changxing Ma, George Casella;
Publisher/Edition (Yr. or No.): Springer; 1st edition (July 31, 2007).

Essentials of Genomic and Personalized Medicine;
Authors: Geoffrey S. Ginsburg and Huntington Willard;
Publisher/Edition (Yr. or No.): Academic Press; 1st edition (October 8, 2009).

Genetics: Analysis of Genes and Genomes (Hardcover)
Authors: Daniel Hartl and Elizabeth Jones;
Publisher/Edition (Yr. or No.): Jones & Bartlett Publishers; 7th edition (August 1, 2008).

See the course homepage for more information.

G22.3033-004 Financial Computing I

G22.3033-005 Web Development with Ruby on Rails

This course begins with an in-depth examination of the Ruby language and moves on to web development within the Ruby on Rails framework. An emphasis is placed on understanding the particular features of the Ruby language, how the language compares to others like Java and Python, and how it facilitates the creation of frameworks such as Ruby on Rails. This course is recommended for students with a strong interest in programming languages, web development frameworks, and software engineering. No experience with Ruby or Ruby on Rails is assumed.

See the course homepage for more information.

G22.3033-006 Visualization

Large amounts of data are produced every day in a variety of domains such as engineering, medicine, natural sciences, or meteorology. Due to the ever increasing size and complexity of simulated and measured data, its analysis becomes more challenging. A successful approach to this is data visualization, i.e., the creation of images or videos which allow for a fast and intuitive identification of the most important properties inherent to the data.

The course gives an overview of the most important approaches to data visualization and discusses their advantages and limits. The necessary mathematical tools will be presented along the way (these include topics in numerical mathematics and topology). A state-of-the-art visualization system will be used to examine real-world data sets coming from the fields of medicine and fluid dynamics.

Covered topics:

* Data Description and Selection
* Mapping of Data to Graphics
* Visualization of Multiparameter Data
* Volume Visualization
* Flow Visualization
* Tensor Visualization
* Information Visualization
* Topological Data Analysis

The course is suitable for advanced MS students and PhD students. Familiarity with basic computer graphics (or motivation to learn this fast) is desirable. Assessment will be based on homework assignments and a course project.

See the course homepage for more information.

G22.3033-007 Software Management Systems *cancelled*

G22.3033-008 Geometric Modeling

Digital 3D content creation is in high demand in the film and gaming industry, product design and manufacturing, architecture, surgical simulation and planning, medical prosthesis design and more, and it is backed up by affordable 3D acquisition technologies. Yet, shape modeling tasks, such as creation, editing, deformation and animation, remain extremely laborious, requiring artistic skills and high technical expertise. This course will survey state-of-the-art shape modeling research that aims at broadening our knowledge and understanding of shapes to create better digital modeling tools, and explores ways to communicate the human intentions of shape manipulation to the computer in a natural and effective manner.

The course will begin by covering some introductory topics in geometric modeling, with an emphasis on discrete geometry processing: digital shape representations and related data structures, shape acquisition and reconstruction, smoothing and denoising, parameterization, remeshing. The course will then concentrate on recent shape creation and manipulation research, touching on variational modeling techniques, space deformations, sketch-based modeling interfaces, shape interpolation and skeleton-skin animation of articulated bodies. The necessary mathematical tools will be presented along the way (these include topics in linear algebra, differential geometry, optimization).

The course is suitable for PhD students (advanced MS students are also welcome). Programming knowledge is required (preferably C++) and familiarity with basic computer graphics and GUI programming (or motivation to learn those fast) is desirable. Assessment will be based on two small-scale homework assignments and a course project.

See the course homepage for more information.

G22.3033-009 Optimization in Machine Learning

This course introduces a range of machine learning models and the optimization tools used to apply these models in practice. For students with some ML background, this course will explain what lies behind the optimization tools often used as black boxes and provide an understanding of the trade-offs between numerical accuracy and theoretical and empirical complexity. For students with some optimization background, this course will introduce a variety of applications arising in machine learning and statistics, as well as novel optimization methods targeting these applications.

The main topics covered are:
1. Algorithms for support vector machines: interior point, active set, coordinate descent and cutting planes.
2. Kernel selection in SVMs - SDP and SOCP formulations, possible approaches.
3. Low dimensional embedding - SDP formulations and approaches.
4. Matrix completion - SDP formulations and approaches.
5. Sparse optimization: Lasso and sparse logistic regression - convex programming formulations, algorithmic approaches, connection to compressed sensing.
6. Computing regularization paths for Lasso, SVMs and other settings.
7. Sparse PCA and sparse inverse covariance selection - SDP formulations and approaches.

Prerequisites: MS students who have not taken G22.1180-001 require permission of instructor.

See the course homepage for more information.

G22.3033-010 Values Embodied in Information and Communications Technology *cancelled*

G22.3033-011 Random Graphs

Prerequisites: "Mathematical Maturity." This topic draws from several areas, but the material will be developed in the course. An acquaintance with, say, variance (in probability) and/or chromatic number (in graph theory) will be helpful but not mandatory.

Description: Equally appropriate titles would have been "Probabilistic Combinatorics" or "The Probabilistic Method" or (personal favorite) "Erdos Magic." The Probabilistic Method is a lasting legacy of the late Paul Erdos. For "Uncle Paul" the purpose was to prove the existence of a graph, coloring, tournament, or other combinatorial object. A random object would be described, and then one would show that that object had the desired properties with positive probability.
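As one illustration of the method just described (the example is chosen here and is not taken from the course materials), the classical lower bound on the diagonal Ramsey number R(k,k) can be sketched as follows. Assume each edge of the complete graph K_n is colored red or blue independently with probability 1/2, and let S range over the k-vertex subsets.

```latex
% A fixed k-set S has \binom{k}{2} edges; all are red or all are blue:
\Pr[\,S \text{ is monochromatic}\,] = 2 \cdot 2^{-\binom{k}{2}} = 2^{1-\binom{k}{2}}

% Union bound over all \binom{n}{k} choices of S:
\Pr[\,\text{some } S \text{ is monochromatic}\,] \le \binom{n}{k}\, 2^{1-\binom{k}{2}}

% If the right-hand side is below 1, some coloring has no monochromatic K_k:
\binom{n}{k}\, 2^{1-\binom{k}{2}} < 1 \;\Longrightarrow\; R(k,k) > n
```

The random coloring has the desired property with positive probability, so such a coloring must exist, even though the argument does not construct one.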

Today we are very interested in algorithmic implementation, both deterministically and with random algorithms. There is further great interest (the official title) in the study of random discrete structures (not just graphs, though that is the main one) for their own sake. The course involves probability, Discrete Math, and algorithms. Probability results include Chernoff Bounds, Martingales, the Lovasz Local Lemma and the Janson Inequalities and will be derived from scratch. Topics include: Ramsey Numbers, Continuous Time Greedy Algorithms, Graph Coloring, Discrepancy, the Liar Game and the Tenure Game. Of particular pragmatic interest: asymptotic calculations permeate the course and approaches to finding asymptotics of various sums and products will be emphasized throughout.

Text: Noga Alon, Joel Spencer, The Probabilistic Method, third edition
publisher: John Wiley, 2009

More Info: Contact Prof. Spencer directly or check out his website.

G22.3033-012 Nonlinear Dimensionality Reduction and Manifold Learning

This course will review computational methods for reducing the dimensionality of high-dimensional data that lie on or near a manifold of low intrinsic dimensionality. Topics will include: linear methods (such as principal components analysis, factor analysis, singular value decomposition); classic visualization methods (such as multidimensional scaling and its non-metric variants); and more recent methods based on eigenvectors of Laplacians and convex optimization (such as Kernel PCA, Locally Linear Embedding, Isomap and Maximum Variance Unfolding). Both theoretical and algorithmic properties of the methods will be discussed. Coursework will include small-scale computational experiments and readings of primary source research papers.

See the course homepage for more information.
