Music Identification at NYU


We are working on audio fingerprinting techniques that enable computers to recognize songs. Our algorithm represents songs in terms of "music phonemes", elementary units of music sound that are learned from data, and uses weighted finite-state transducers to construct a compact and efficient index of a large database of songs. The image above depicts an example of such a transducer. As a result, songs can be recognized quickly and accurately when only a recording of a short "audio snippet" is available and even when the recording is distorted.
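To make the lookup step concrete, here is a minimal sketch of identifying a song from a short, partly corrupted snippet. It is not our transducer-based index: it swaps in a much simpler inverted index over fixed-length runs of phoneme labels, with majority voting. The integer labels, the song titles, and the functions `build_index` and `identify` are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_index(songs, n=3):
    """Map each length-n run of phoneme labels to the songs containing it."""
    index = defaultdict(set)
    for title, phonemes in songs.items():
        for i in range(len(phonemes) - n + 1):
            index[tuple(phonemes[i:i + n])].add(title)
    return index


def identify(index, snippet, n=3):
    """Return the song sharing the most n-grams with the snippet, or None."""
    votes = Counter()
    for i in range(len(snippet) - n + 1):
        for title in index.get(tuple(snippet[i:i + n]), ()):
            votes[title] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical data: integer labels stand in for learned music phonemes.
songs = {
    "song_a": [1, 4, 4, 2, 7, 3, 5, 1],
    "song_b": [6, 2, 7, 9, 9, 0, 3, 8],
}
index = build_index(songs)

# An excerpt of song_a in which one label was corrupted (3 became 8).
snippet = [4, 2, 7, 8, 5]
result = identify(index, snippet)
```

Because a snippet votes through every n-gram it contains, a few corrupted labels do not prevent a match. A finite-state index plays the same role while sharing structure between overlapping subsequences, which is where the compactness comes from.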

We have built a working system with a database of 15,000 songs. We have also proven new bounds on the size of the finite automata used for indexing. These bounds guarantee that the representation remains compact as the number of indexed songs increases, and they suggest that our techniques scale to much larger song data sets.
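For intuition about why automaton size matters, the sketch below builds the classical suffix automaton of a single label sequence and counts its states. It illustrates the long-known single-sequence result (at most 2n - 1 states for a sequence of length n >= 2), not the new bounds mentioned above, which concern indexing many songs at once. The function name `suffix_automaton_size` and the sample sequence are hypothetical.

```python
def suffix_automaton_size(seq):
    """Build the suffix automaton of seq online and return its state count."""
    transitions = [{}]   # transitions[s][symbol] -> target state
    link = [-1]          # suffix links; state 0 is the initial state
    length = [0]         # length of the longest string reaching each state
    last = 0
    for sym in seq:
        cur = len(transitions)
        transitions.append({})
        link.append(-1)
        length.append(length[last] + 1)
        p = last
        # Add the new symbol along the suffix-link chain where it is missing.
        while p != -1 and sym not in transitions[p]:
            transitions[p][sym] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = transitions[p][sym]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone keeps q's transitions but a shorter length.
                clone = len(transitions)
                transitions.append(dict(transitions[q]))
                link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and transitions[p].get(sym) == q:
                    transitions[p][sym] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    return len(transitions)


# Hypothetical phoneme-label sequence; the state count grows linearly with it.
sequence = [1, 4, 4, 2, 7, 3, 5, 1, 4, 2]
size = suffix_automaton_size(sequence)
```

A sequence of length n has on the order of n^2 distinct subsequences of consecutive labels, yet the automaton accepting all of them needs only a linear number of states. That gap is what makes automaton-based indexing attractive for large collections.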