Wednesday, March 24, 2010

Eigenanalysis for Lossy Compression

Eigenanalysis is a method for reducing a set of data to the principle dimensions along which that data varies. In the context of imaging data, it has been applied very successfully to Eigenfaces:

Where a set of faces is broken down into a smaller set of face "prototypes" that can recombined in varying portions to recreate the original data set with a limited accuracy.

In the context of music, I can imagine that the spectral characteristics of songs have some self-similarity: portions repeat, chords are repeated in different voices and different octaves, rhythms repeat, etc. I can imagine a lossy compression algorithm that takes the frequency domain representation of a song, does Eigenanalysis on these vectors, and stores the song simply as the collection of N eigenvectors and the reduced representation of each frequency-domain chunk.

Quantization methods may be employed for further reducing bit usage due to similarity between adjacent chunks. Or different portions of the spectrum can be analyzed separately, which allows for better representation of lower frequencies and less information dedicated to higher frequencies. This unfortunately does not account for the obvious relationship between the lower and higher frequencies.

A more advanced implementation may involve doing eigenanalysis on mutiple chunks simultaneously in a moving window, or at different scales, which will help with rhythmic repetition.

The octave or overtone relationship is a little more complicated, and would require something like a constant-Q transforms to get a logarithmic frequency domain.

No comments: