ZENODO
Other literature type . 2026
License: CC BY
Data sources: Datacite

Sensitivity-Prioritized Comparative Evaluation of VGG16 and ResNet50 Deep Learning Architectures for Early-Stage Alzheimer's Disease Detection from Brain MRI

Authors: Ghimire, Shankar; Thapa Magar, Bikash; Adhikari, Nikesh; Neupane, Sagar; Maharjan, Sachin

Abstract

Background: Alzheimer's Disease (AD) is a progressive neurodegenerative disorder affecting approximately 55 million individuals globally, with projections exceeding 139 million cases by 2050. Clinical diagnosis currently depends on manual radiologist interpretation of structural Magnetic Resonance Imaging (MRI), a workflow subject to substantial inter-observer variability, particularly at the Very Mild Demented stage, where delayed detection directly translates into missed therapeutic windows.

Objective: This study addresses a critical methodological gap in the deep learning literature: prior comparative studies of convolutional neural network (CNN) architectures for AD detection have overwhelmingly optimized for aggregate classification accuracy rather than sensitivity (recall), the metric of primary clinical concern in medical screening. We present a rigorously controlled Champion vs. Challenger comparative evaluation of two widely adopted CNN architectures, VGG16 (sequential) and ResNet50 (residual), with explicit priority on minimizing false negatives, particularly in the clinically decisive Very Mild Demented stage.

Methods: Both architectures were trained on the publicly available Augmented Alzheimer MRI Dataset (33,984 images, four severity classes) using a stratified 80/20 train–validation split. A two-stage transfer learning protocol, consisting of ImageNet-based feature extraction (5 epochs at learning rate 1.26×10⁻³) followed by full-network fine-tuning (5 epochs at learning rate 1×10⁻⁵), was applied uniformly to both models. The Adam optimizer and categorical cross-entropy loss were used throughout. Performance was evaluated using accuracy, macro-averaged precision, macro-averaged recall (primary criterion), macro-averaged F1-score, and class-level confusion matrices.

Results: Contrary to theoretical expectations favoring deeper residual architectures, VGG16 outperformed ResNet50 on every evaluated metric. VGG16 achieved a final validation accuracy of 97.76%, macro-averaged recall of 97.92%, and validation loss of 0.0653, compared with ResNet50's 94.25%, 94.68%, and 0.1694, respectively. Critically, in the Very Mild Demented class, VGG16 attained 94.05% recall versus ResNet50's 88.64%, a 5.41-percentage-point improvement that reduced false negatives from 108 to 74 patients per validation cohort of 1,814 early-stage cases.

Conclusions: These findings challenge the assumption that deeper residual architectures universally outperform shallower sequential networks in constrained medical imaging domains. In applications where training data volume is moderate and target features are spatially coherent (as with cortical atrophy), VGG16's hierarchical sequential learning, combined with careful fine-tuning, produces superior clinical sensitivity. We recommend VGG16 as a foundational architecture for Alzheimer's clinical decision support systems and identify explainability (Grad-CAM), 3D volumetric modelling, and multi-site prospective validation as the highest-priority directions for subsequent research.

Index Terms: Alzheimer's Disease, Convolutional Neural Networks, VGG16, ResNet50, Transfer Learning, MRI Classification, Medical Image Analysis, Dementia Staging, Sensitivity, False Negative Reduction, Computer-Aided Diagnosis, Explainable AI.
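The evaluation criteria named in the abstract (accuracy, macro-averaged precision, recall, and F1, all derived from a class-level confusion matrix) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names `macro_metrics` and `implied_false_negatives` are hypothetical, the 4×4 matrix `toy_cm` is invented purely for illustration, and the false-negative helper assumes the simple relation FN ≈ (1 − recall) × number of true cases in the class.

```python
from typing import Dict, List


def macro_metrics(cm: List[List[int]]) -> Dict[str, object]:
    """Accuracy and macro-averaged metrics from a confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    recalls, precisions, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true class k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true class otherwise
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f1s.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,   # the abstract's primary criterion
        "macro_f1": sum(f1s) / n,
        "per_class_recall": recalls,
    }


def implied_false_negatives(recall: float, n_true_cases: int) -> int:
    """Missed cases implied by a class recall (assumes FN = (1 - recall) * n)."""
    return round((1.0 - recall) * n_true_cases)


# Invented toy matrix (4 classes, 10 samples each); NOT data from the study.
toy_cm = [
    [8, 2, 0, 0],
    [1, 9, 0, 0],
    [0, 0, 10, 0],
    [0, 1, 0, 9],
]
toy_result = macro_metrics(toy_cm)

# Recall figures and cohort size quoted in the abstract.
fn_at_9405 = implied_false_negatives(0.9405, 1814)
fn_at_8864 = implied_false_negatives(0.8864, 1814)
```

Macro averaging weights each severity class equally, so a drop in recall on one class (for example, Very Mild Demented) lowers the score even when overall accuracy stays high. Under the stated assumption, 94.05% recall on 1,814 cases corresponds to roughly 108 missed cases and 88.64% to roughly 206; the paper's own confusion matrices remain the authoritative source for the exact per-model counts.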
