arXiv: 2201.02017
handle: 10261/339453, 2117/387921
In this paper, we propose a novel approach to enhance 3D body pose estimation of a person computed from videos captured from a single wearable camera. The key idea is to leverage high-level features that link first- and third-person views in a joint embedding space. To learn such an embedding space, we introduce First2Third-Pose, a new paired, synchronized dataset of nearly 2,000 videos depicting human activities captured from both first- and third-person perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedding space learned with our dataset is useful for extracting discriminative features from arbitrary single-view egocentric videos, without requiring domain adaptation or knowledge of camera parameters. We achieve a significant improvement in egocentric 3D body pose estimation on two unconstrained datasets over three supervised state-of-the-art approaches. Our dataset and code will be available for research purposes.
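The abstract's semi-Siamese idea can be sketched as follows: each view keeps its own encoder while both branches share a projection head, so first- and third-person features land in one embedding space where synchronized pairs should be nearest neighbors. This is only an illustrative sketch with made-up dimensions and random weights, not the authors' implementation; all names here (`W_ego`, `W_exo`, `W_shared`, `embed`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Random linear-layer weights (a stand-in for trained parameters)."""
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)

DIM_FEAT, DIM_HID, DIM_EMB = 512, 256, 128  # illustrative sizes

# View-specific encoders (the "semi" part: these weights are NOT shared).
W_ego = linear(DIM_FEAT, DIM_HID)   # first-person branch
W_exo = linear(DIM_FEAT, DIM_HID)   # third-person branch
# Shared projection head (these weights ARE shared between branches).
W_shared = linear(DIM_HID, DIM_EMB)

def embed(x, W_branch):
    """Branch encoder followed by the shared head, L2-normalized."""
    h = np.maximum(x @ W_branch, 0.0)  # ReLU hidden layer
    z = h @ W_shared
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# Stand-ins for spatial/motion features from synchronized video pairs.
feat_first = rng.standard_normal((4, DIM_FEAT))  # 4 first-person clips
feat_third = rng.standard_normal((4, DIM_FEAT))  # their third-person pairs

z_first = embed(feat_first, W_ego)
z_third = embed(feat_third, W_exo)

# Self-supervised objective (conceptually): paired clips — the diagonal
# of the cosine-similarity matrix — should score highest in each row.
sim = z_first @ z_third.T
print(sim.shape)  # (4, 4)
```

In a real training loop the diagonal of `sim` would be pulled up and the off-diagonal entries pushed down (e.g. with a contrastive loss); here only the shared-head forward pass is shown.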
Self-supervised learning, FOS: Computer and information sciences, Classificació INSPEC::Pattern recognition::Computer vision, Computer Vision and Pattern Recognition (cs.CV), Àrees temàtiques de la UPC::Informàtica::Automàtica i control, 3D pose estimation, Computer Science - Computer Vision and Pattern Recognition, Egocentric vision
| Indicator | Description | Value |
|---|---|---|
| Selected citations | Citations derived from selected sources; an alternative to "Influence", which reflects the overall/total impact of the article based on the underlying citation network (diachronically). | 15 |
| Popularity | The "current" impact/attention (the "hype") of the article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | The overall/total impact of the article in the research community at large, based on the underlying citation network (diachronically). | Top 10% |
| Impulse | The initial momentum of the article directly after its publication, based on the underlying citation network. | Top 10% |
| Views | | 110 |
| Downloads | | 91 |

Views provided by UsageCounts
Downloads provided by UsageCounts