dc.contributor.author | Mc Donnell, Rachel | |
dc.date.accessioned | 2024-02-21T06:40:26Z | |
dc.date.available | 2024-02-21T06:40:26Z | |
dc.date.issued | 2022 | |
dc.date.submitted | 2022 | en |
dc.identifier.citation | Bigioi, D. and Jordan, H. and Jain, R. and Mcdonnell, R. and Corcoran, P., Pose-Aware Speech Driven Facial Landmark Animation Pipeline for Automated Dubbing, IEEE Access, 10, 2022, 133357-133369 | en |
dc.identifier.other | Y | |
dc.description.abstract | A novel neural pipeline that generates pose-aware 3D animated facial landmarks synchronised to a target speech signal is proposed for the task of automatic dubbing. The goal is to automatically synchronise a target actor's lips and facial motion to an unseen speech sequence while maintaining the quality of the original performance. Given a 3D facial key point sequence extracted from any reference video and a target audio clip, the neural pipeline learns to generate head-pose-aware, identity-aware landmarks and outputs accurate 3D lip motion directly at the inference stage. These generated landmarks can be used to render a photo-realistic video via an additional image-to-image conversion stage. In this paper, a novel data augmentation technique is introduced that increases the size of the training dataset from N audio/visual pairs up to N×N unique pairs for the task of automatic dubbing. The trained inference pipeline employs an LSTM-based network that takes Mel-coefficients from an unseen speech sequence as input, combined with head pose and identity parameters extracted from a reference video, to generate a new set of pose-aware 3D landmarks that are synchronised with the unseen speech. | en |
dc.format.extent | 133357-133369 | en |
dc.language.iso | en | en |
dc.relation.ispartofseries | IEEE Access; | |
dc.relation.ispartofseries | 10; | |
dc.rights | Y | en |
dc.subject | speech | en |
dc.subject | novel neural pipeline | en |
dc.subject | automatic dubbing | en |
dc.subject.lcsh | speech | en |
dc.subject.lcsh | novel neural pipeline | en |
dc.subject.lcsh | automatic dubbing | en |
dc.title | Pose-Aware Speech Driven Facial Landmark Animation Pipeline for Automated Dubbing | en |
dc.type | Journal Article | en |
dc.type.supercollection | scholarly_publications | en |
dc.type.supercollection | refereed_publications | en |
dc.identifier.peoplefinderurl | http://people.tcd.ie/ramcdonn | |
dc.identifier.rssinternalid | 251191 | |
dc.identifier.doi | http://dx.doi.org/10.1109/ACCESS.2022.3231137 | |
dc.rights.ecaccessrights | openAccess | |
dc.identifier.orcid_id | 0000-0002-1957-2506 | |
dc.identifier.uri | http://hdl.handle.net/2262/105583 | |