Computer Science Colloquium

Rational Kernels -- A General Machine Learning Framework for the Analysis of Biological Sequences, Text, and Speech

Mehryar Mohri
AT&T Labs Research

Friday, March 26, 2004 11:30 A.M.
Room 1302 Warren Weaver Hall
251 Mercer Street
New York, NY 10012-1185

Colloquium Information:
Richard Cole, (212) 998-3119


Most classification algorithms were originally designed for fixed-size vectors. However, important machine learning problems in computational biology and in text and speech processing require the analysis of variable-length sequences and, more generally, of distributions over variable-length sequences.

Rational kernels are a new family of similarity measures over variable-length sequences and their distributions. Many similarity measures commonly used in computational biology, such as the edit distance, the convolution kernels of Haussler, and other string kernels, are shown to be special cases of rational kernels.

This talk will describe general and efficient methods for computing rational kernels, and discuss some important convergence and closure properties. It will also report the results of experiments illustrating the successful use of rational kernels for some difficult prediction problems.

[Joint work with Corinna Cortes and Patrick Haffner]
