
pmid: 28880190
A fundamental bottleneck for achieving highly discriminative action representation is that local motion/appearance features are usually not semantically aligned. That is, a local feature, such as a motion vector or motion trajectory, carries no attribute indicating which moving body part or operated object it is associated with. As a result, most methods fall back on global feature pooling or representation learning, which is often too coarse. Inspired by the recent success of end-to-end (pixel-to-pixel) deep convolutional neural networks (DCNNs), in this paper we first propose a DCNN architecture that maps a human-centric image region onto human body part response maps. Based on these response maps, we propose a second DCNN that learns a semantically aligned feature representation. The prior knowledge that only a few parts are responsible for a given action is also exploited by introducing a group (part) sparseness prior during feature learning. The learned semantically aligned feature not only boosts the discriminative capability of the action representation, but is also robust to pose variations and occlusions. Finally, an iterative mining method is employed to learn discriminative action primitive detectors. Extensive experiments on action recognition benchmarks demonstrate the superior recognition performance of the proposed framework.
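
The abstract does not spell out the form of the group (part) sparseness prior. As a rough illustration only, one common way to encourage "only a few parts are active" is a group-lasso (L2,1) penalty, in which the weights tied to each body part form one group. The sketch below assumes that form; the function names, the three-part layout, and the value of `lam` are hypothetical and are not taken from the paper.

```python
import numpy as np

def group_sparsity_penalty(weights, groups):
    """Sum of per-group L2 norms (an assumed group-lasso / L2,1 penalty)."""
    return sum(np.linalg.norm(weights[idx]) for idx in groups)

def group_soft_threshold(weights, groups, lam):
    """Proximal step for the penalty above.

    Each group's norm is shrunk by lam; groups whose norm is below lam
    are set to zero, which is what switches whole parts off.
    """
    out = weights.copy()
    for idx in groups:
        norm = np.linalg.norm(weights[idx])
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out[idx] = scale * weights[idx]
    return out

# Hypothetical layout: 3 body parts, 2 feature weights per part.
weights = np.array([3.0, 4.0, 0.1, 0.1, 0.0, 2.0])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]

penalty = group_sparsity_penalty(weights, groups)
shrunk = group_soft_threshold(weights, groups, lam=1.0)
print(penalty)
print(shrunk)
```

In this toy setup the weakly weighted second part is zeroed out as a whole while the other two parts are only shrunk, which is the behaviour a part-level sparseness prior is meant to produce.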
| selected citations | 9 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
