Title: N-Way Composition of Weighted Finite-State Transducers

(NYU-CS-TR902)

Authors: Cyril Allauzen and Mehryar Mohri

Abstract:
Composition of weighted transducers is a fundamental algorithm used in
many applications, including for computing complex edit-distances
between automata, or string kernels in machine learning, or to combine
different components of a speech recognition, speech synthesis, or
information extraction system. We present a generalization of the
composition of weighted transducers, \emph{$n$-way composition}, which
is dramatically faster in practice than the standard composition
algorithm when combining more than two transducers. The expected
worst-case complexity of our algorithm for composing three transducers
$T_1$, $T_2$, and $T_3$\ignore{ depending on the strategy used, is
$O(|T_1|_E|T_2|_Q|T_3|_E + |T|)$ or $(|T_1|_Q|T_2|_E|T_3|_Q + |T|)$, }
is $O(\min(|T_1|_E|T_2|_Q|T_3|_E, |T_1|_Q|T_2|_E|T_3|_Q) + |T|)$,
where $T$ is the result of that composition and $|T_i| = |T_i|_Q + |T_i|_E$ with $|T_i|_Q$ the number of states and $|T_i|_E$ the number
of transitions of $T_i$, $i = 1, 2, 3$. In many cases, this
significantly improves on the complexity of standard composition. Our
algorithm also leads to a dramatically faster composition in
practice. Furthermore, standard composition can be obtained as a
special case of our algorithm. We report the results of several
experiments demonstrating this improvement. These theoretical and
empirical improvements significantly enhance performance in the