Disney Research Helps Take Automated Animation Lip-Syncing to Another Level


Disney Research is exploring how to take automated lip-syncing to another level by adding “deep learning” to the equation. The approach could help Disney and other animation studios speed up a famously tedious process.

Automated lip-syncing is not a new technology, and it is becoming increasingly common outside the animation industry as people create their own virtual avatars in apps and movies. But this new approach produces more realistic lip-syncing by teaching the system, from hours of footage, which mouth movements should accompany which sounds.
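To make the idea concrete, here is a minimal toy sketch, not the researchers' actual model: it maps a phoneme sequence to mouth-shape (“viseme”) values and smooths each frame over a small window of surrounding context, mimicking how a learned system predicts a mouth pose from the sounds around it. The phoneme-to-viseme table, the openness values, and the window size are all illustrative assumptions.

```python
# Toy illustration only: a real deep-learning system would *learn* this
# mapping from hours of footage rather than use a hand-written table.
PHONEME_TO_VISEME = {
    # hypothetical mouth-openness targets in [0, 1]
    "AA": 0.9, "IY": 0.4, "M": 0.0, "B": 0.0, "F": 0.2, "S": 0.3, "sil": 0.0,
}

def animate(phonemes, window=3):
    """Return one smoothed mouth-openness value per input phoneme.

    Each output frame averages the raw viseme targets over a centered
    window -- a crude stand-in for a model that looks at phonetic context.
    """
    raw = [PHONEME_TO_VISEME.get(p, 0.5) for p in phonemes]
    half = window // 2
    frames = []
    for i in range(len(raw)):
        ctx = raw[max(0, i - half): i + half + 1]
        frames.append(sum(ctx) / len(ctx))
    return frames

print(animate(["sil", "M", "AA", "M", "AA", "sil"]))
```

Even this crude averaging shows why context matters: the mouth doesn't snap shut between vowels, it glides, which is what the windowed prediction approximates.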

From Road to VR:


Creating speech animation which matches an audio recording for a CGI character is typically done by hand by a skilled animator. And while this system falls short of the sort of high fidelity speech animation you’d expect from major CGI productions, it could certainly be used as an automated first-pass in such productions or used to add passable speech animation in places where it might otherwise be impractical, such as NPC dialogue in a large RPG, or for low budget projects that would benefit from speech animation but don’t have the means to hire an animator (instructional/training videos, academic projects, etc).

But its uses aren’t limited to animated films. With virtual reality experiences becoming more and more realistic (including those coming to Disney Parks), the need for better, faster lip-syncing to more fully immerse users is clear.

In the case of VR, the system could be used to make social VR avatars more realistic by animating the avatar’s mouth in real-time as the user speaks. True mouth tracking (optical or otherwise) would be the most accurate method for animating an avatar’s speech, but a procedural speech animation system like this one could be a practical stopgap if / until mouth tracking hardware becomes widespread.

Poor lip-syncing in animation can be distracting and can even contribute to a box office flop, according to Dr. Sarah Taylor of UEA’s School of Computing Sciences. “Doing it well however is both time consuming and costly as it has to be manually produced by a skilled animator. Our goal is to automatically generate production-quality animated speech for any style of character, given only audio speech as an input,” she said in an interview with Cartoon Brew.

Disney Research is only one of the organizations involved in the study; the University of East Anglia, the California Institute of Technology, and Carnegie Mellon University also took part. Together they authored a paper titled “A Deep Learning Approach for Generalized Speech Animation,” which was presented at the latest SIGGRAPH, a conference dedicated to computer graphics, and can be downloaded here.


Someday, Disney Parks attractions like Turtle Talk with Crush may feature lifelike characters with flawless lip-syncing thanks to this research.

[Source: Road to VR]


I've loved Disney as long as I can remember. As a former newspaper editor, web developer, and Disney comics freelancer, I'm able to combine that experience into writing about Disney online. I'm also the co-host of a Disney fan podcast called 'Pirates & Princesses.' Opinions mine.

