Self-Supervised Learning of Pose Embeddings from Spatiotemporal Relations in Videos

Publication: Article, Preprint, Conference object
Date: 01 Oct 2017
Country: Germany
Publisher: IEEE
Journal: 2017 IEEE International Conference on Computer Vision (ICCV)

Authors: Sümer, Ömer; Dencker, Tobias; Ommer, Björn

doi: 10.1109/iccv.2017.461

arXiv: 1708.02179

Abstract

Human pose analysis is presently dominated by deep convolutional networks trained with extensive manual annotations of joint locations and beyond. To avoid the need for expensive labeling, we exploit spatiotemporal relations in training videos for self-supervised learning of pose embeddings. The key idea is to combine temporal ordering and spatial placement estimation as auxiliary tasks for learning pose similarities in a Siamese convolutional network. Since the self-supervised sampling of both tasks from natural videos can result in ambiguous and incorrect training labels, our method employs a curriculum learning idea that starts training with the most reliable data samples and gradually increases the difficulty. To further refine the training process, we mine repetitive poses in individual videos, which provide reliable labels while removing inconsistencies. Our pose embeddings capture visual characteristics of human pose that can boost existing supervised representations in human pose estimation and retrieval. We report quantitative and qualitative results on these tasks on the Olympic Sports, Leeds Sports Pose, and MPII Human Pose datasets.
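
The sampling and curriculum ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified proxy for the temporal task (frame pairs labeled "close" or "far" by frame distance) and a hypothetical per-sample reliability score used to order the curriculum. The function names, the `near` and `far` thresholds, and the stage fractions are all illustrative assumptions.

```python
import random


def sample_temporal_pairs(num_frames, near=2, far=10, n_pairs=8, seed=0):
    """Sample (anchor, other, label) triples from frame indices of one video.

    Assumption: label 1 means the frames are at most `near` frames apart,
    label 0 means they are at least `far` frames apart. The gap between the
    two thresholds is skipped because such pairs would be ambiguous.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        anchor = rng.randrange(num_frames)
        if rng.random() < 0.5:
            # Positive pair: a frame within the `near` window around the anchor.
            lo = max(0, anchor - near)
            hi = min(num_frames - 1, anchor + near)
            pairs.append((anchor, rng.randint(lo, hi), 1))
        else:
            # Negative pair: a frame at least `far` frames away from the anchor.
            candidates = [t for t in range(num_frames) if abs(t - anchor) >= far]
            if candidates:  # short videos may have no valid negative
                pairs.append((anchor, rng.choice(candidates), 0))
    return pairs


def curriculum_subsets(samples, scores, fractions=(0.25, 0.5, 1.0)):
    """Yield growing training subsets, most reliable samples first.

    `scores` is a hypothetical reliability measure (higher = more reliable).
    Each stage keeps the top fraction of samples, so later stages add the
    harder, less reliable ones.
    """
    order = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    for frac in fractions:
        k = max(1, int(round(frac * len(samples))))
        yield [samples[i] for i in order[:k]]


if __name__ == "__main__":
    pairs = sample_temporal_pairs(num_frames=50, n_pairs=8)
    # Toy reliability: random scores stand in for a real measure.
    rng = random.Random(1)
    scores = [rng.random() for _ in pairs]
    for stage, subset in enumerate(curriculum_subsets(pairs, scores), start=1):
        print(f"stage {stage}: {len(subset)} pairs")
```

In a real pipeline the sampled pairs would feed the two branches of a Siamese network, and the reliability score would come from the data itself rather than from random numbers.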

Comment: To appear in ICCV 2017

Country

Germany

Related Organizations

Heidelberg University, Germany
University of Tübingen, Germany
Interdisciplinary Center for Scientific Computing
Universität Augsburg, Germany

Keywords

ddc:004, Computer Science - Computer Vision and Pattern Recognition

Related research

JigsawPuzzleSolver software on GitHub (relation: IsRelatedTo)

Impact by BIP!

Selected citations (derived from selected sources): 19
Popularity (current impact or attention, based on the underlying citation network): Top 10%
Influence (overall impact over time, based on the underlying citation network): Top 10%
Impulse (initial momentum directly after publication, based on the underlying citation network): Top 10%

Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
