Sunday, August 05, 2007

Audio Respatialization

With a few video cameras you can capture a full spherical panorama of a space. This gives you enough information to recreate the view in any orientation from that fixed position. We can do the same for audio with four microphones.
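To make "any orientation" concrete for audio, here is a minimal sketch. It assumes the four microphone signals have already been converted to first-order B-format: an omnidirectional channel W plus three figure-of-eight channels X, Y and Z. Under that assumption, a virtual microphone pointed in any direction is just a weighted sum of the four channels. The function names, the `pattern` parameter and the choice to leave W unscaled (some conventions scale it by 1/sqrt(2)) are mine, for illustration only.

```python
import math

def direction(azimuth, elevation):
    """Unit vector for an azimuth (counterclockwise from +X) and an
    elevation (up from the horizontal plane), both in radians."""
    return (math.cos(azimuth) * math.cos(elevation),
            math.sin(azimuth) * math.cos(elevation),
            math.sin(elevation))

def encode_plane_wave(sample, azimuth, elevation):
    """Encode one sample of a distant source into (W, X, Y, Z).
    Assumption: W is left unscaled."""
    dx, dy, dz = direction(azimuth, elevation)
    return (sample, sample * dx, sample * dy, sample * dz)

def virtual_mic(w, x, y, z, azimuth, elevation, pattern=0.5):
    """One output sample of a virtual first-order microphone aimed at
    (azimuth, elevation). pattern: 1.0 = omni, 0.5 = cardioid,
    0.0 = figure-of-eight."""
    dx, dy, dz = direction(azimuth, elevation)
    return pattern * w + (1.0 - pattern) * (x * dx + y * dy + z * dz)

# A source straight ahead, heard by cardioids facing toward and away from it.
w, x, y, z = encode_plane_wave(1.0, 0.0, 0.0)
front = virtual_mic(w, x, y, z, 0.0, 0.0)
back = virtual_mic(w, x, y, z, math.pi, 0.0)
```

The cardioid aimed at the source passes it at full level, and the one aimed away rejects it, without moving any physical microphone.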

This only allows reorientation, though; repositioning is harder. Techniques for determining the geometry of a space from a limited number of viewpoints using computer vision are still in the experimental stages. As far as I know, we don't have any analogous computer hearing algorithms for determining the structure of the "soundscape". (Anyone who's practiced Pauline Oliveros' Deep Listening will understand experientially our ability to derive sonic structure.)

I imagine we could triangulate the position of every sound given the right number of microphones. It takes four microphones to make one spherical module. Each module will give us an angle for every sound (or rather, the sound at every angle, up to some resolution). With two modules a known distance apart, we can then reconstruct the distance of any sound from the two angles. That is, we would have a complete recording of a space from which we can derive the sound received by a virtual directional microphone at any location and any orientation.
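The triangulation step can be sketched as follows, assuming each module reports a direction vector toward the sound and that we know where both modules sit. In practice the two measured rays will rarely intersect exactly, so this sketch takes the midpoint of the shortest segment between them. The function name and the tuple-based vectors are my own choices, not part of any existing library.

```python
def triangulate(p1, d1, p2, d2, eps=1e-12):
    """Estimate a sound's position from two modules.

    p1, p2: module positions as (x, y, z).
    d1, d2: direction from each module toward the sound (any length).
    Returns the midpoint of the closest points on the two rays, or
    None when the rays are parallel and the distance is undetermined.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w0 = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None  # sound lies along the line through both modules, or is very far away

    t1 = (b * e - c * d) / denom  # distance along ray 1, in units of |d1|
    t2 = (a * e - b * d) / denom  # distance along ray 2, in units of |d2|
    q1 = tuple(p + t1 * k for p, k in zip(p1, d1))
    q2 = tuple(p + t2 * k for p, k in zip(p2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(q1, q2))

# Two modules two units apart, both hearing a sound at (1, 1, 0).
position = triangulate((0, 0, 0), (1, 1, 0), (2, 0, 0), (-1, 1, 0))
```

Note the degenerate case: a sound in line with both modules gives parallel rays, so two modules alone cannot place it. That is one more reason the "right number of microphones" may be more than eight.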

The problems will be similar to those encountered by the visual correlate: boundaries, shadows (back faces) and reflections (crucial for enclosed spaces) will be poorly represented.

By pairing two spherical camera modules with the audio modules we could record, for example, a parade, which we could later walk through, seeing everything in 3D and hearing an accurate binaural representation from our current position and orientation.
