Mediapipe based Preprocessed VGGFace2 Dataset

Authors: Shah, Syed Taimoor Hussain; Shah, Syed Adil Hussain; Zamir, Ammara; Qayyum, Kainat; Shah, Syed Baqir Hussain; Fatima, Syeda Maryam; Deriu, Marco Agostino

DOI: 10.5281/zenodo.15078556, 10.5281/zenodo.15078557


Abstract

VGGFace2 Dataset and Face Mesh Preprocessing

Introduction

The VGGFace2 dataset is a large-scale face recognition dataset containing over 3.31 million images of 9,131 identities, with an average of 362 images per identity. The dataset is designed to include extensive variations in pose, age, illumination, ethnicity, and profession, making it one of the most diverse and challenging face recognition datasets available. For more details, please refer to the original publication:

VGGFace2: A dataset for recognizing faces across pose and age - DOI: 10.48550/arXiv.1710.08092

Preprocessing Using MediaPipe 3D Face Mesh

We applied the MediaPipe 3D face mesh algorithm to this dataset to detect each face and remove all background elements, including hair. The preprocessing retained only the region covered by the facial landmarks, so that only the essential facial features were preserved. This approach significantly enhanced the accuracy and generalization of our model, as the model was trained exclusively on landmark-based facial data.

Training and Performance

The preprocessed data was used to train an Xception model, which produced remarkably accurate results thanks to the strictly landmark-based facial representation. The model demonstrated robust performance, including under explainable-AI analysis, indicating that eliminating unnecessary background elements contributed positively to its efficiency and reliability.
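The record does not describe the exact masking procedure, so the following is only a minimal sketch of one plausible way to carry out the background-removal step. It assumes that the face mesh landmarks have already been obtained (for example from MediaPipe Face Mesh) and converted to pixel coordinates, and it assumes that the retained region is the convex hull of those landmarks. The names `convex_hull`, `inside_convex`, and `mask_outside_face` are hypothetical helpers written for this illustration and are not part of MediaPipe or of the authors' code.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in a consistent orientation."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain repeats the first point of the other.
    return lower[:-1] + upper[:-1]


def inside_convex(hull: Sequence[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of the hull polygon."""
    n = len(hull)
    if n < 3:
        return False
    # p must not fall on the outer side of any edge.
    return all(_cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def mask_outside_face(image, landmarks: Sequence[Point], fill=0):
    """Replace every pixel outside the landmark hull with `fill`.

    `image` is a list of rows of pixel values; `landmarks` are (x, y)
    pixel coordinates (assumed to come from a face mesh detector).
    """
    hull = convex_hull(landmarks)
    return [
        [px if inside_convex(hull, (x, y)) else fill for x, px in enumerate(row)]
        for y, row in enumerate(image)
    ]
```

For example, calling `mask_outside_face(image, landmarks)` on a grayscale image stored as nested lists returns a copy in which pixels outside the landmark region are set to 0. A production pipeline would more likely rasterize the polygon with an image library for speed; the pure-Python version is used here only to keep the sketch self-contained.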
Citation

If you use this dataset or the preprocessed version in your work, please cite both of the following:

VGGFace2 Dataset:

@article{Cao2018VGGFace2,
  title={VGGFace2: A dataset for recognizing faces across pose and age},
  author={Cao, Qiong and Shen, Li and Xie, Weidi and Parkhi, Omkar M and Zisserman, Andrew},
  journal={arXiv preprint arXiv:1710.08092},
  year={2018}
}

DOI: 10.48550/arXiv.1710.08092 (https://doi.org/10.48550/arXiv.1710.08092)

Preprocessed Dataset using MediaPipe:

@dataset{Shah2025_MediaPipe_FaceMesh,
  title={MediaPipe-based 3D Face Mesh Preprocessed VGGFace2 Dataset},
  author={Shah, Syed Taimoor Hussain and Shah, Syed Adil Hussain and Zamir, Ammara and Qayyum, Kainat and Shah, Syed Baqir Hussain and Fatima, Syeda Maryam and Deriu, Marco Agostino},
  year={2025},
  doi={10.5281/zenodo.15078557}
}

DOI: 10.5281/zenodo.15078557 (https://doi.org/10.5281/zenodo.15078557)

Contact

For any questions or further details, please feel free to contact us.

Syed Taimoor Hussain Shah
PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin, Italy
Email: taimoor.shah@polito.it
ORCID: 0000-0002-6010-6777

Related Organizations

University of the Punjab, Pakistan
Ca' Foscari University of Venice, Italy
Tianjin Polytechnic University, China (People's Republic of)
COMSATS University Islamabad, Pakistan
Polytechnic University of Turin, Italy

Impact by BIP!

Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Funded by

EC| PARENT

Related to Research communities

EUTOPIA Open Research Portal