Abstract
This thesis presents novel methodologies for anomaly detection in both image and video domains, addressing open challenges and advancing the state of the art in computer vision anomaly detection. The research is divided into two main areas, image anomaly detection and video anomaly detection, each contributing its own architectures and loss functions to improve detection accuracy and robustness.
For image anomaly detection, we first explore Fourier Transform AutoEncoders and Variational AutoEncoders as a means of improving denoising and latent space representation. We then introduce the Enforced Isolation Deep Network, which uses a modified Triplet Loss function to create distinct subspaces for normal and anomalous samples, keeping the two clearly separated even when only a few anomalous samples are available.
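As a point of reference for the loss mentioned above, the following is a minimal sketch of the standard triplet loss, not the modified version used by the Enforced Isolation Deep Network, whose details are not given in this abstract. The function names, the Euclidean distance, and the margin value are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings: anchor and positive stand for normal samples,
# negative stands for an anomalous sample.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
negative = [3.0, 4.0]

# Anomalous sample already far from the normal pair: nothing to penalise.
well_separated = triplet_loss(anchor, positive, negative)

# Roles swapped, so the "positive" is far away and the "negative" is close:
# the loss becomes large.
poorly_separated = triplet_loss(anchor, negative, positive)
```

Minimising this quantity over many triplets pulls samples of the same class together and pushes the other class at least a margin away, which is the general mechanism behind separating normal and anomalous subspaces.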
For video anomaly detection, we propose the MaskedSkipUNet architecture, which incorporates MaskedConv3D layers within the skip connections of a UNet. This design prevents anomalies from being reconstructed directly and instead relies on the surrounding normal spatiotemporal context to expose them. Extensive experiments demonstrate that MaskedSkipUNet outperforms traditional methods in video anomaly detection accuracy. Additionally, we introduce Dynamic Distinction Learning, which uses a novel loss function to adaptively adjust the threshold of pseudo-anomalies and trains a reconstruction model to map anomalies back to normality, further improving the detection of subtle anomalies in video data.
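The idea of reconstructing a location from its surroundings can be illustrated with a generic centre-masked 3D convolution. This is a sketch under stated assumptions: the abstract does not specify the mask used by MaskedConv3D, so masking only the central kernel weight is an assumption, and `masked_conv3d` is a hypothetical helper written in plain Python rather than the thesis implementation.

```python
import copy
import random


def masked_conv3d(volume, kernel):
    """Valid 3D cross-correlation that skips the central kernel weight.

    `volume` and `kernel` are nested lists indexed [time][row][col].
    Each output value therefore depends only on the neighbours of the
    corresponding input voxel, never on that voxel itself.
    """
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    nt, nh, nw = len(volume), len(volume[0]), len(volume[0][0])
    centre = (kt // 2, kh // 2, kw // 2)
    out = []
    for t in range(nt - kt + 1):
        plane = []
        for y in range(nh - kh + 1):
            row = []
            for x in range(nw - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            if (dt, dy, dx) == centre:
                                continue  # masked weight: ignore the centre voxel
                            acc += kernel[dt][dy][dx] * volume[t + dt][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


random.seed(0)
kernel = [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(3)]
clip = [[[random.uniform(0, 1) for _ in range(4)] for _ in range(4)] for _ in range(3)]

# Inject a hypothetical "anomaly" at the voxel that is the window centre
# for output location (0, 0, 0).
altered = copy.deepcopy(clip)
altered[1][1][1] = 100.0

out_clean = masked_conv3d(clip, kernel)
out_altered = masked_conv3d(altered, kernel)

# The output at the anomalous location is unchanged, because its own value
# is masked out; neighbouring outputs still see it as context.
centre_unchanged = out_clean[0][0][0] == out_altered[0][0][0]
neighbour_changed = out_clean[0][0][1] != out_altered[0][0][1]
```

In this toy setting the masked layer cannot copy a voxel to its own output position, which is the property that discourages passing anomalies straight through a skip connection.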
Across both image and video domains, this thesis moves beyond traditional approaches by encouraging models to actively infer what is normal, rather than passively failing to reconstruct anomalies. By designing systems that explicitly separate normal from abnormal patterns, this work establishes a more intentional and reliable foundation for detecting anomalies in complex visual data. The contributions have potential practical applications in industries such as industrial quality control, public safety surveillance, and healthcare monitoring.
| Original language | English |
|---|---|
| Qualification | Doctor of Philosophy (PhD) |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 22 Sept 2025 |
| Place of Publication | Kingston upon Thames, U.K. |
| Publisher | |
| Publication status | Published - 28 Jan 2026 |
Keywords
- anomaly detection
- Computer Vision Anomaly Detection (CVAD)
- image anomaly detection
- video anomaly detection
- deep learning
- AutoEncoders
- Variational AutoEncoders
- Fourier Transformation AutoEncoder (FAE)
- Enforced Isolation Deep Network (EIDN)
- triplet loss
- masked convolutions
- MaskedSkipUNet
- Dynamic Distinction Learning (DDL)
- pseudo-anomalies
- spatio-temporal context
- surveillance
PhD type
- Standard route