Enhancing human-machine interaction: a novel approach to emotion-controlled speech-to-animation

Rebecca Mobbs (Contributor), Dimitrios Makris (Contributor), Demetris Lappas (Contributor), Vasileios Argyriou (Contributor)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a novel framework for emotion-controlled speech-to-animation, addressing the issue of emotional mismatches between speech and facial expressions in existing methods. Our approach synchronises emotional expression across audio and facial animations using State-of-the-Art (SOTA) pretrained models, eliminating the need for costly custom training while ensuring adaptability. A key contribution of our framework is the creation of a novel Speech-to-Speech (S2S) pipeline for emotional control over generated speech. In addition, we introduce a novel evaluation metric, the Emotion Distribution Divergence (EDD), to assess our model's ability to modify the emotions in the original videos. Experimental results demonstrate significant improvements in emotional expressiveness and realism over existing methods, establishing our approach as a major advancement in human-machine interaction, virtual assistants, and emotion-aware IoT applications.
Original language: English
Title of host publication: Sixth International Conference on Computer Vision and Information Technology (CVIT 2025)
Editors: Jixin Ma
Place of Publication: Washington, U.S.
Publisher: SPIE
Number of pages: 9
Volume: 13796
ISBN (Electronic): 9781510694736
ISBN (Print): 9781510694729
Publication status: Published - 19 Sept 2025
Event: 2025 6th International Conference on Computer Vision and Information Technology (CVIT 2025) - Florence, Italy
Duration: 20 Jun 2025 to 22 Jun 2025

Publication series

Name: International Conference on Computer Vision and Information Technology
Publisher: SPIE
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 2025 6th International Conference on Computer Vision and Information Technology (CVIT 2025)
Period: 20/06/25 to 22/06/25

Keywords

  • Computer science and informatics
