Abstract
The early diagnosis of Alzheimer’s Disease (AD) through non-invasive methods remains a significant healthcare challenge. We present NeuroXVocal, the first end-to-end explainable AD classification system that achieves state-of-the-art performance while providing clinically interpretable explanations. Our novel dual-component architecture consists of: (1) Neuro, a multimodal classifier implementing a unique transformer-based fusion strategy that projects acoustic, textual, and speech embeddings into a common dimensional space for complex cross-modal interactions; and (2) XVocal, a specialized RAG-based explainer that retrieves relevant clinical literature to generate evidence-based explanations. Unlike previous approaches using late fusion or simple concatenation, our architecture enables both robust classification and meaningful clinical insights. On the IS2021 ADReSSo Challenge benchmark dataset, NeuroXVocal achieved 95.77% accuracy, significantly outperforming the previous state of the art. Medical professionals validated the clinical relevance of XVocal’s explanations through a structured evaluation. This work advances beyond pure classification to bridge the gap between machine learning predictions and clinical decision-making.
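The fusion idea described in the abstract (projecting each modality into a common space, then letting the modalities attend to one another) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `FusionSketch`, the embedding sizes, the single attention head, the mean pooling, and the random untrained weights are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class FusionSketch:
    """Hypothetical cross-modal fusion: one projection per modality,
    one single-head self-attention layer over the modality tokens,
    mean pooling, and a two-class output layer (weights are random)."""

    def __init__(self, dims, d_model=16, seed=0):
        rng = np.random.default_rng(seed)
        self.d_model = d_model
        # One linear projection per modality into the shared d_model space.
        self.proj = {
            name: rng.normal(0.0, 0.1, (d, d_model)) for name, d in dims.items()
        }
        self.wq = rng.normal(0.0, 0.1, (d_model, d_model))
        self.wk = rng.normal(0.0, 0.1, (d_model, d_model))
        self.wv = rng.normal(0.0, 0.1, (d_model, d_model))
        self.w_out = rng.normal(0.0, 0.1, (d_model, 2))

    def forward(self, inputs):
        # Stack projected modalities as tokens: (batch, n_modalities, d_model).
        tokens = np.stack(
            [inputs[name] @ self.proj[name] for name in self.proj], axis=1
        )
        q, k, v = tokens @ self.wq, tokens @ self.wk, tokens @ self.wv
        # Scaled dot-product attention across modality tokens.
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(self.d_model)
        attn = softmax(scores, axis=-1)
        fused = (attn @ v).mean(axis=1)  # pool over modalities
        probs = softmax(fused @ self.w_out, axis=-1)
        return probs, attn


# Assumed (illustrative) embedding sizes for the three modalities.
dims = {"acoustic": 40, "text": 768, "speech": 512}
rng = np.random.default_rng(1)
batch = {name: rng.normal(size=(4, d)) for name, d in dims.items()}

model = FusionSketch(dims)
probs, attn = model.forward(batch)
print(probs.shape, attn.shape)
```

In contrast to late fusion or plain concatenation, every modality token here can weight information from the other two through the attention matrix before pooling; a trained model would learn the projection and attention weights rather than sampling them.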
| Original language | English |
|---|---|
| Number of pages | 10 |
| Publication status | Accepted/In press - 17 Jun 2025 |
| Event | International Conference on Medical Image Computing and Computer Assisted Intervention - Daejeon, Korea, Republic of; Duration: 23 Sept 2025 → 27 Sept 2025; Conference number: 28; https://conferences.miccai.org/2025/en/ |
Conference
| Conference | International Conference on Medical Image Computing and Computer Assisted Intervention |
|---|---|
| Abbreviated title | MICCAI 2025 |
| Country/Territory | Korea, Republic of |
| City | Daejeon |
| Period | 23/09/25 → 27/09/25 |
| Internet address | https://conferences.miccai.org/2025/en/ |