Sensitivity-Prioritized Comparative Evaluation of VGG16 and ResNet50 Deep Learning Architectures for Early-Stage Alzheimer's Disease Detection from Brain MRI

Ghimire, Shankar; Thapa Magar, Bikash; Adhikari, Nikesh; Neupane, Sagar; Maharjan, Sachin

ZENODO

Other literature type . 2026

License: CC BY

Data sources: Datacite

Publication . Other literature type . 15 May 2026 . Publisher: Zenodo

doi: 10.5281/zenodo.20199049, 10.5281/zenodo.20199048

Abstract

Background: Alzheimer's Disease (AD) is a progressive neurodegenerative disorder affecting approximately 55 million individuals globally, with projections exceeding 139 million cases by 2050. Clinical diagnosis currently depends on manual radiologist interpretation of structural Magnetic Resonance Imaging (MRI), a workflow subject to substantial inter-observer variability, particularly at the Very Mild Demented stage, where delayed detection directly translates into missed therapeutic windows.

Objective: This study addresses a critical methodological gap in the deep learning literature: prior comparative studies of convolutional neural network (CNN) architectures for AD detection have overwhelmingly optimized for aggregate classification accuracy rather than Sensitivity (Recall), the metric of primary clinical concern in medical screening. We present a rigorously controlled Champion vs. Challenger comparative evaluation of two widely adopted CNN architectures, VGG16 (sequential) and ResNet50 (residual), with explicit priority on minimizing false negatives, particularly in the clinically decisive Very Mild Demented stage.

Methods: Both architectures were trained on the publicly available Augmented Alzheimer MRI Dataset (33,984 images, four severity classes) using a stratified 80/20 train–validation split. A two-stage Transfer Learning protocol, consisting of ImageNet-based feature extraction (5 epochs at learning rate 1.26×10⁻³) followed by full-network fine-tuning (5 epochs at learning rate 1×10⁻⁵), was applied uniformly to both models. The Adam optimizer and categorical cross-entropy loss were used throughout. Performance was evaluated using accuracy, macro-averaged precision, macro-averaged recall (primary criterion), macro-averaged F1-score, and class-level confusion matrices.

Results: Contrary to theoretical expectations favoring deeper residual architectures, VGG16 outperformed ResNet50 on every evaluated metric. VGG16 achieved a final validation accuracy of 97.76%, macro-averaged recall of 97.92%, and validation loss of 0.0653, compared to ResNet50's 94.25%, 94.68%, and 0.1694, respectively. Critically, in the Very Mild Demented class, VGG16 attained 94.05% recall versus ResNet50's 88.64%, a 5.41-percentage-point improvement that reduced false negatives from 108 to 74 patients per validation cohort of 1,814 early-stage cases.

Conclusions: These findings challenge the assumption that deeper residual architectures universally outperform shallower sequential networks in constrained medical imaging domains. In applications where training data volume is moderate and target features are spatially coherent (as with cortical atrophy), VGG16's hierarchical sequential learning, combined with careful fine-tuning, produces superior clinical sensitivity. We recommend VGG16 as a foundational architecture for Alzheimer's clinical decision support systems and identify explainability (Grad-CAM), 3D volumetric modelling, and multi-site prospective validation as the highest-priority directions for subsequent research.

Index Terms: Alzheimer's Disease, Convolutional Neural Networks, VGG16, ResNet50, Transfer Learning, MRI Classification, Medical Image Analysis, Dementia Staging, Sensitivity, False Negative Reduction, Computer-Aided Diagnosis, Explainable AI.
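
The abstract's primary criterion, macro-averaged recall, and its link to false-negative counts can be illustrated with a minimal sketch. It assumes the standard definition recall = TP / (TP + FN), computed per class from a confusion matrix whose rows are true classes and whose columns are predicted classes. The function names and the 2×2 toy matrix are hypothetical and are not taken from the study. The false-negative counts it derives are only what the quoted recall values would imply for a class of 1,814 cases under that definition; they are not the authors' confusion matrices and need not match the counts quoted in the abstract.

```python
from typing import List, Sequence


def per_class_recall(confusion: Sequence[Sequence[int]]) -> List[float]:
    """Recall for each class: diagonal entry divided by its row total.

    Rows are true classes, columns are predicted classes (an assumption
    of this sketch, not a layout stated in the abstract).
    """
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


def macro_recall(confusion: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(confusion)
    return sum(recalls) / len(recalls)


def implied_false_negatives(recall: float, n_positive: int) -> int:
    """False negatives implied by a recall value over n_positive true cases."""
    return round(n_positive * (1.0 - recall))


if __name__ == "__main__":
    # Toy two-class matrix with made-up numbers, for illustration only.
    toy = [[90, 10],
           [5, 95]]
    print(per_class_recall(toy))  # [0.9, 0.95]

    # Recall values and cohort size quoted in the abstract.
    print(implied_false_negatives(0.9405, 1814))  # 108
    print(implied_false_negatives(0.8864, 1814))  # 206
```

Macro averaging weights every severity class equally, so a weak result on a single class such as Very Mild Demented lowers the headline figure regardless of how many images that class contains.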
