
Safe perception in autonomous vehicles (AVs) requires models that are accurate, interpretable, and robust to environmental noise. This technical note describes the development of a custom three-block Convolutional Neural Network (CNN) trained from scratch on a dataset of 26,378 vehicle images. The model reaches a test accuracy of 78.54% with a generalization gap of only 0.06%. We then use Gradient-weighted Class Activation Mapping (Grad-CAM) to provide evidence that the model's decisions are grounded in structural vehicle geometry rather than spurious background correlations.
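
For readers unfamiliar with Grad-CAM, the core computation can be sketched in a few lines. The example below is a minimal, dependency-free illustration and not the implementation used in this note: the function name `grad_cam` and the nested-list inputs are assumptions made for readability. In practice, the feature maps and the gradients of the class score would typically be taken from the final convolutional block through a deep learning framework.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one convolutional layer.

    activations: list of K feature maps, each an H x W nested list.
    gradients:   gradients of the class score with respect to those
                 feature maps, in the same K x H x W layout.
    Returns an H x W nested list scaled to [0, 1].
    (Hypothetical helper for illustration only.)
    """
    # Channel weights: global average of each gradient map.
    weights = []
    for grad_map in gradients:
        count = sum(len(row) for row in grad_map)
        weights.append(sum(sum(row) for row in grad_map) / count)

    height = len(activations[0])
    width = len(activations[0][0])

    # Weighted sum of feature maps, followed by ReLU so that only
    # regions with a positive influence on the class score remain.
    cam = [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for display; skip if the map is all zeros.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy input: two 2x2 feature maps with opposite-sign gradients.
toy_activations = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
toy_gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(toy_activations, toy_gradients)
```

In this toy input, the first channel has a positive average gradient and the second a negative one, so only the location activated by the first channel survives the ReLU. Overlaying such a heatmap on the input image is what allows a reader to check whether high-importance regions fall on the vehicle or on the background.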
