VisemeNet: Audio-Driven Animator-Centric Speech Animation

SIGGRAPH 2018


Reviews

Information

  • Paper topic: Animation and Simulation
  • Software type: Code
  • Able to run a replicability test: True
  • Replicability score: 4
  • Software language: Python
  • License: unspecified
  • Build mechanism: Not applicable (Python, Matlab, ...)
  • Dependencies: tensorflow / cudnn / numpy / scipy / matplotlib / python_speech_features / Maya
  • Documentation score {0,1,2}: 1
  • Reviewer: Nicolas Bonneel <nicolas.bonneel@liris.cnrs.fr>
  • Time spent for the test (build->first run, timeout at 100min): 40min

Source code information

Comments

Note that the VisemeNet code has strict requirements on software and library versions. It runs on Python 3.5 but not on Python 3.6.5. Also, the scipy 1.4.0 that is now installed by default under Python 3.5 is not suitable; the older scipy 1.1.0 works.
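
To make the constraint explicit, here is a minimal version guard one could place at the top of the prediction script. The guard is my addition, not part of the VisemeNet code; it only encodes the versions reported above.

```python
# Hypothetical version guard encoding the constraints observed in this test;
# not part of the VisemeNet code itself.
import sys
import scipy

if sys.version_info[:2] != (3, 5):
    raise RuntimeError("VisemeNet was only observed to run on Python 3.5 "
                       "(it failed on 3.6.5); found %d.%d" % sys.version_info[:2])

if scipy.__version__ != "1.1.0":
    raise RuntimeError("scipy 1.1.0 works here, while the 1.4.0 installed by "
                       "default is not suitable; found " + scipy.__version__)
```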
I could test the prediction with the provided trained network on the single provided audio file, as well as the Maya script that applies the results to a public face rig. Both worked nicely.
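
For context, the prediction stage consumes audio features extracted with the python_speech_features dependency listed above. The following is only an illustrative sketch of that kind of preprocessing; the input file name is a placeholder and VisemeNet's actual feature parameters may differ.

```python
# Illustrative sketch of MFCC extraction with the listed dependencies.
# The input file name and the MFCC parameters are assumptions, not
# necessarily those used by VisemeNet.
import scipy.io.wavfile as wav
from python_speech_features import mfcc

rate, signal = wav.read("test_audio.wav")  # hypothetical audio file
features = mfcc(signal, samplerate=rate, numcep=13)  # shape: (n_frames, 13)
print("extracted %d frames of MFCC features" % features.shape[0])
```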
I did not test the training: the data are only accessible upon (non-anonymous) request, and no training instructions or scripts are provided (a train_visemenet.py file is present, but it does not run on its own).

--alternative test on Linux--
I failed to get the tensorflow package installed through Anaconda to work with libcudnn.so.8.0.
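
A quick way to diagnose this kind of failure is to ask TensorFlow which devices it can actually use; if libcudnn cannot be loaded, no GPU device is listed (or the import itself fails). This snippet uses the TensorFlow 1.x API matching the code's era and is my addition, not part of the repository.

```python
# Hypothetical diagnostic for the cuDNN issue: list the devices TensorFlow
# can see (TensorFlow 1.x API).
from tensorflow.python.client import device_lib

devices = device_lib.list_local_devices()
print([d.name for d in devices])  # a working GPU setup shows '/device:GPU:0'
```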
