Event2Audio: Event-based Optical Vibration Sensing

1 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley
2 Herbert Wertheim School of Optometry and Vision Science, University of California, Berkeley
* Equal contribution
International Conference on Computational Photography (ICCP), 2025

Framework Workflow

Figure 1. (a) Imaging defocused speckle. A coherent laser illuminates a surface that vibrates in response to audio, generating a defocused speckle pattern on the sensor plane. The pattern's 2D movements are captured by the event sensor. (b) The captured motion is encoded into a stream of asynchronous events. (c) Audio signal extraction from events. (d) Recovered audio waveform.
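
To make steps (b) through (d) of Figure 1 concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: it assumes a single pixel whose brightness follows the vibration directly, an idealized event model that fires one event per fixed brightness step, and a naive reconstruction that bins signed polarities and integrates them. The function names `simulate_pixel_events` and `events_to_waveform`, and all parameter values, are hypothetical and chosen only for illustration. A real system would need to track the 2D shift of the whole speckle pattern.

```python
import numpy as np


def simulate_pixel_events(freq_hz, duration, threshold, sim_rate=100_000):
    """Toy event model (assumption): one pixel whose brightness is a sine
    wave emits a +1/-1 event each time brightness moves by `threshold`."""
    times = np.arange(int(round(duration * sim_rate))) / sim_rate
    brightness = np.sin(2 * np.pi * freq_hz * times)
    level = brightness[0]  # brightness at the last emitted event
    t_out, p_out = [], []
    for t, b in zip(times, brightness):
        while b - level >= threshold:   # brightness rose by one step
            level += threshold
            t_out.append(t)
            p_out.append(1)
        while level - b >= threshold:   # brightness fell by one step
            level -= threshold
            t_out.append(t)
            p_out.append(-1)
    return np.array(t_out), np.array(p_out, dtype=int)


def events_to_waveform(t, p, fs, duration):
    """Naive reconstruction (assumption): sum signed polarities in bins of
    1/fs seconds, integrate over time, remove the mean, normalize the peak."""
    n = int(round(duration * fs))
    idx = np.floor(np.asarray(t) * fs).astype(int)
    keep = (idx >= 0) & (idx < n)
    signed = np.where(np.asarray(p) > 0, 1.0, -1.0)
    # Net polarity per bin approximates the brightness change in that bin.
    net = np.bincount(idx[keep], weights=signed[keep], minlength=n)
    sig = np.cumsum(net)                # integrate changes into a level
    sig = sig - sig.mean()
    peak = np.max(np.abs(sig))
    return sig / peak if peak > 0 else sig


if __name__ == "__main__":
    t, p = simulate_pixel_events(freq_hz=200.0, duration=0.05, threshold=0.05)
    audio = events_to_waveform(t, p, fs=8000, duration=0.05)
    print(len(t), "events ->", audio.shape[0], "audio samples")
```

In this toy setting the integrated polarity signal follows the simulated tone up to the quantization set by the event threshold. With many pixels, net polarities can cancel across the speckle pattern, which is one reason a practical method has to estimate the pattern's motion rather than sum events directly.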

Abstract

Small vibrations observed in video can unveil information beyond what is visual, such as sound and material properties. It is possible to passively record these vibrations when they are visually perceptible, or actively amplify their visual contribution with a laser beam when they are not perceptible. In this paper, we improve upon the active sensing approach by leveraging event-based cameras, which are designed to efficiently capture fast motion. We demonstrate our method experimentally by recovering audio from vibrations, even for multiple simultaneous sources, and in the presence of environmental distortions. Our approach matches the state-of-the-art reconstruction quality at much faster speeds, approaching real-time processing.

Experimental Results

Remote Recording of Multiple Audio Sources

microphone recording
input left audio / input right audio
recovered left audio / recovered right audio

Speech Recovery from Speaker Membrane

input speech
passive event camera (real-time) / ours (real-time)
high-speed camera (offline) / ours (offline)

Speech Recovery from a Chip Bag

Howard et al. (real-time) / ours (real-time) / input speech
Davis et al. (offline) / Sheinin et al. (offline) / ours (offline)

Vibrometry Against Environmental Distortion

demix audio from noisy scenes (microphone recording / our method / input speech)
vibrometry eliminates echoes (microphone recording / our method / input speech)
underwater vibrometry (microphone recording / our method / input speech)

BibTeX

@inproceedings{cai2025event2audio,
  author    = {Cai, Mingxuan and Galor, Dekel and Kohli, Amit Pal Singh and Yates, Jacob L. and Waller, Laura},
  title     = {Event2Audio: Event-based Optical Vibration Sensing},
  booktitle = {IEEE International Conference on Computational Photography (ICCP)},
  year      = {2025},
}