We propose a novel framework for supervised learning of discrete
concepts. Since the 1970s, the standard computational primitive has
been to find a hypothesis in a given complexity class that is most
consistent with the data. In contrast, we propose a new basic operation: for
each pair of input instances, count how many concepts of bounded
complexity contain both of them.
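As a purely illustrative toy instance of this primitive (the interval
class and the closed-form count below are didactic choices, not the
paper's construction), take the instance space {0, ..., N-1} and the
concept class of all intervals [a, b]; the primitive then counts the
intervals containing both points:

    # Toy kernel: K(x, y) = number of intervals [a, b] in {0..N-1}
    # containing both x and y. Illustrative only; N is arbitrary.
    N = 100

    def interval_kernel(x: int, y: int) -> int:
        lo, hi = min(x, y), max(x, y)
        # An interval [a, b] contains both points iff a <= lo and hi <= b,
        # giving (lo + 1) choices for a and (N - hi) choices for b.
        return (lo + 1) * (N - hi)

    # Sanity check against brute-force enumeration of all intervals.
    assert interval_kernel(3, 7) == sum(
        1 for a in range(N) for b in range(a, N) if a <= 3 and 7 <= b)

Note that points lying close together are contained in more common
intervals, so the count behaves as a similarity measure.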
Our approach maps instances into a Hilbert space whose metric is induced
by a universal kernel that coincides with our computational primitive, and
identifies concepts with half-spaces. We prove that all concepts are
linearly separable under this mapping. Hence, given a labeled sample and
an oracle for evaluating the universal kernel, we can efficiently
compute a linear classifier (via SVM, for example) and use margin bounds
to control its generalization error. Even though exact evaluation of the
universal kernel may be infeasible, in various natural situations it is
efficiently approximable.
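A minimal sketch of this pipeline, assuming only a callable oracle
kernel(x, y) for (approximately) evaluating the universal kernel (the
oracle, data, and hyperparameters here are placeholders):

    import numpy as np
    from sklearn.svm import SVC

    def fit_kernel_svm(X_train, y_train, kernel):
        # Gram matrix G[i][j] = K(x_i, x_j): one oracle call per pair.
        G = np.array([[kernel(xi, xj) for xj in X_train] for xi in X_train])
        clf = SVC(kernel="precomputed", C=1.0)
        clf.fit(G, y_train)
        return clf

    def predict(clf, X_train, X_new, kernel):
        # Prediction needs kernel values between new and training instances.
        K = np.array([[kernel(x, xt) for xt in X_train] for x in X_new])
        return clf.predict(K)

Margin bounds for the resulting large-margin classifier then control the
generalization error, as stated above.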
Though our approach is general, our main application is to regular
languages. It represents a substantial departure from current learning
paradigms and in particular yields a novel method for learning this
fundamental concept class. Unlike existing techniques, we make no
structural assumptions about the unknown automata, the string
distribution, or the completeness of the training set. Instead, given a
labeled sample, our algorithm outputs a classifier with guaranteed
distribution-free generalization bounds; to our knowledge, the proposed
framework is the only one capable of providing such guarantees. Along the
way, we touch upon several fundamental questions in complexity,
automata, and machine learning.
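To make the regular-language setting concrete, here is one plausible way
to approximate such a kernel by sampling. This is a hedged sketch under
assumptions of our own (uniformly random DFAs with a fixed state bound
and alphabet), not necessarily the paper's approximation scheme:

    import random

    def random_dfa(n_states, alphabet, rng):
        # Uniformly random transition function and accepting set; start state 0.
        delta = {(q, a): rng.randrange(n_states)
                 for q in range(n_states) for a in alphabet}
        accepting = {q for q in range(n_states) if rng.random() < 0.5}
        return delta, accepting

    def accepts(dfa, word):
        delta, accepting = dfa
        state = 0
        for a in word:
            state = delta[(state, a)]
        return state in accepting

    def kernel_estimate(u, v, n_states=4, alphabet="ab", samples=10000):
        # Fraction of sampled DFAs accepting both strings: a Monte Carlo
        # proxy for counting bounded-size automata whose language contains both.
        rng = random.Random(0)
        hits = 0
        for _ in range(samples):
            dfa = random_dfa(n_states, alphabet, rng)
            if accepts(dfa, u) and accepts(dfa, v):
                hits += 1
        return hits / samples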
Joint work with Boaz Nadler.