The Visual Microphone: Passive Recovery of Sound from Video




  • Paper topic: Images
  • Software type: Code
  • Able to run a replicability test: True
  • Replicability score: 3
  • Software language: Matlab / Mathematica / ..
  • License: unspecified
  • Build mechanism: Not applicable (python, Matlab..)
  • Dependencies: matlab / pyrTools
  • Documentation score {0,1,2}: 1
  • Reviewer: Nicolas Bonneel
  • Time spent for the test (build->first run, timeout at 100min): 40min

Source code information


The code only partially implements the paper: it does not support recovering sound from low-framerate videos by exploiting the rolling shutter.
Among the remaining high-fps videos, some did not work at all: they failed with errors (randomly either "Unable to read the file." or "Dot indexing is not supported for variables of this type", l. 275 of VideoReader/read) that I could not debug, perhaps due to a codec issue. This was the case for Chips2-2200Hz-Mary_MIDI-input.avi, Chips1-2200Hz-Mary_Had-input.avi and Plant-2200Hz-Mary_MIDI-input.avi.
I successfully ran the code on Chips1-20000Hz-Mary_Had-input.avi. The script (which loads a file 'crabchipsRamp.avi' that I could not find) needs to be adapted: set dsamplefactor = 1 instead of 0.1, otherwise the result is almost pure noise, and of course set samplingrate = 20000. **Beware** as well that the default is nscales = 1, whereas the paper's results were produced with nscales = 4 (page 4 of the paper), although I did not hear much difference in the result.
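For reference, the changes described above amount to a few assignments near the top of the example script. This is only a sketch of the values I used: dsamplefactor, samplingrate and nscales are the names mentioned above, while videofile is a hypothetical name standing in for wherever the script sets its input file, and the rest of the script is assumed to stay unchanged.

```matlab
% Settings used for the successful run (a sketch, not the full script).
% 'videofile' is a hypothetical variable name; adapt it to the script.
videofile     = 'Chips1-20000Hz-Mary_Had-input.avi';  % replaces 'crabchipsRamp.avi'
dsamplefactor = 1;      % the default of 0.1 gave almost pure noise
samplingrate  = 20000;  % frames per second of this capture
nscales       = 4;      % value reported in the paper; the script default is 1
```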
With these settings, I managed to recover a sound in about 1.5 hours on a good laptop, but the sound is much noisier (though still impressive!) than the result shown in the accompanying webpage. The resulting spectrogram can be found here:
and the corresponding sound here:
The webpage states that the outputs were further processed with "speech enhancement audio denoising" (the paper cites [Loizou 2005]), though I could not find code for that algorithm.
Since MATLAB R2015, wavwrite has been replaced by audiowrite, so calls to wavwrite in the code need to be updated.
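A minimal sketch of that replacement, assuming the recovered signal is a column vector of doubles. The tone below is only a stand-in for the recovered sound, and 'recovered.wav' is a hypothetical file name. Note that the argument order differs between the two functions, and that audiowrite clips floating-point samples outside [-1, 1].

```matlab
% Stand-in signal: one second of a 440 Hz tone (not output of the reviewed code).
Fs = 20000;
t  = (0:Fs-1)' / Fs;
y  = 0.5 * sin(2*pi*440*t);

% Older call, unavailable in recent MATLAB releases:
%   wavwrite(y, Fs, 'recovered.wav');

% Replacement: file name first, then the samples, then the sampling rate.
% Rescale so the peak stays just inside [-1, 1] and is not clipped.
y = 0.99 * y / max(abs(y));
audiowrite('recovered.wav', y, Fs);

% Read the file back to check the sampling rate and the number of samples.
[y2, Fs2] = audioread('recovered.wav');
```

If the code already normalizes the signal before saving, the rescaling line can be dropped.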
