Location-Free Human Pose Estimation

descriptionPublicationkeyboard_double_arrow_right Article , Preprint , Conference object 01 Jun 2022Embargo end date: 01 Jan 2022Publisher:IEEEJournal:2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Xixia Xu; Yingguo Gao; Ke Yan; Xue Lin 0003; Qi Zou 0001;

doi: 10.1109/cvpr52688.2022.01279 , 10.48550/arxiv.2205.12619

arXiv: 2205.12619

Location-Free Human Pose Estimation

- Summary
- Subjects
- Metrics

Abstract

Human pose estimation (HPE) usually requires large-scale training data to reach high performance. However, it is rather time-consuming to collect high-quality and fine-grained annotations for human body. To alleviate this issue, we revisit HPE and propose a location-free framework without supervision of keypoint locations. We reformulate the regression-based HPE from the perspective of classification. Inspired by the CAM-based weakly-supervised object localization, we observe that the coarse keypoint locations can be acquired through the part-aware CAMs but unsatisfactory due to the gap between the fine-grained HPE and the object-level localization. To this end, we propose a customized transformer framework to mine the fine-grained representation of human context, equipped with the structural relation to capture subtle differences among keypoints. Concretely, we design a Multi-scale Spatial-guided Context Encoder to fully capture the global human context while focusing on the part-aware regions and a Relation-encoded Pose Prototype Generation module to encode the structural relations. All these works together for strengthening the weak supervision from image-level category labels on locations. Our model achieves competitive performance on three datasets when only supervised at a category-level and importantly, it can achieve comparable results with fully-supervised methods with only 25\% location labels on MS-COCO and MPII.

Beijing Jiaotong University, Tencent Toutu Lab

Related Organizations

Beijing Jiaotong University
China (People's Republic of)

Keywords

FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	6
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Top 10%

Found an issue? Give us feedback

6

Top 10%

Average

Top 10%

Green

Fields of Science (4) View all

engineering and technology

electrical engineering, electronic engineering, information engineering

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

View all