Abstract
Facial Expression Recognition (FER) is crucial for assistive technologies, particularly for individuals with social communication impairments such as Autism Spectrum Disorder (ASD). This thesis introduces a novel pipeline combining image augmentation, dimensionality reduction, and facial re-enactment techniques to enhance FER accuracy, especially for subtle or ambiguous expressions.
The proposed pipeline integrates classical Machine Learning (ML) techniques, including Principal Component Analysis (PCA), t-distributed Stochastic Neighbour Embedding (t-SNE), and Non-Negative Matrix Factorisation (NMF), to augment facial expression images without retraining classifiers. While t-SNE yielded the largest accuracy improvement, its high computational cost limits real-time application. A key contribution is the enhancement of a facial re-enactment model that adjusts facial landmarks to produce clearer expressions, significantly improving classification accuracy for challenging emotions such as anger, sadness, and contempt. Additionally, a custom-trained classifier achieved State-of-the-Art (SotA) results on the AffectNet Database (AFD), surpassing previous benchmarks by 0.21% (7-class) and 0.40% (8-class).
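To make the dimensionality-reduction step concrete, the sketch below shows one plausible reading of PCA-based image augmentation: projecting face images into a low-dimensional subspace and reconstructing them yields smoothed variants that can be fed to an existing classifier without retraining it. All names, image sizes, and parameters here are illustrative assumptions, not the thesis implementation.

```python
# Minimal sketch (assumed, not the thesis code): PCA projection and
# reconstruction as a classifier-agnostic image "enhancement" step.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-in for 100 flattened 48x48 grayscale face images.
faces = rng.random((100, 48 * 48))

pca = PCA(n_components=20)
codes = pca.fit_transform(faces)          # project into a 20-D subspace
augmented = pca.inverse_transform(codes)  # reconstruct a smoothed variant

print(augmented.shape)  # same shape as the input batch
```

An NMF variant would follow the same fit/transform/inverse_transform pattern; t-SNE, by contrast, has no inverse mapping in scikit-learn, which is consistent with the computational-cost caveat noted above.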
A clinical study with 52 participants, both autistic and non-autistic, validated the pipeline’s effectiveness, showing significant FER improvements, particularly for individuals with ASD. The research also introduces the Mid-Processing Unit (MPU), enhancing classification without retraining models, and a new database containing participant responses and enhanced images, providing a valuable resource for future studies.
The implications of this research extend beyond FER into broader applications in human-computer interaction, affective computing, and mental health assessment. By improving expression recognition, this work supports the development of more adaptive assistive technologies, potentially benefiting education, therapy, and social skills training for neurodivergent individuals. Future research will focus on optimising real-time performance, increasing dataset diversity through synthetic data, and integrating these advancements into practical applications such as mobile and wearable systems.
Together, these contributions advance FER, laying the groundwork for real-time optimisation, broader expression recognition, and synthetic data generation via Generative Adversarial Networks (GANs), ultimately enhancing assistive technologies for social communication.
| Original language | English |
|---|---|
| Qualification | Doctor of Philosophy (PhD) |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 23 Apr 2025 |
| Place of Publication | Kingston upon Thames, U.K. |
| Publisher | |
| Publication status | Accepted/In press - 23 Apr 2025 |
PhD type
- Standard route