Video Person Re-Identification by Temporal Residual Learning

Publication: Article, Preprint, Other literature type, 01 Mar 2019

Embargo end date: 01 Jan 2018

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Journal: IEEE Transactions on Image Processing, volume 28, pages 1366-1377 (ISSN: 1057-7149, eISSN: 1941-0042)

Authors: Ju Dai; Pingping Zhang; Dong Wang; Huchuan Lu; Hongyu Wang

doi: 10.1109/tip.2018.2878505 , 10.48550/arxiv.1802.07918

pmid: 30371373

arXiv: 1802.07918


Abstract

In this paper, we propose a novel feature learning framework for video person re-identification (re-ID). The proposed framework mainly aims to exploit the adequate temporal information of video sequences and to tackle the poor spatial alignment of moving pedestrians. More specifically, to exploit the temporal information, we design a temporal residual learning (TRL) module that simultaneously extracts the generic and specific features of consecutive frames. The TRL module is equipped with two bi-directional LSTMs (BiLSTMs), which are respectively responsible for describing a moving person in different aspects, providing complementary information for better feature representations. To deal with the poor spatial alignment in video re-ID datasets, we propose a spatial-temporal transformer network (ST^2N) module. Transformation parameters in the ST^2N module are learned by leveraging the high-level semantic information of the current frame as well as the temporal context knowledge from other frames. The proposed ST^2N module, with fewer learnable parameters, allows effective person alignment under significant appearance changes. Extensive experimental results on the large-scale MARS, PRID2011, iLIDS-VID and SDU-VID datasets demonstrate that the proposed method achieves consistently superior performance and outperforms most of the very recent state-of-the-art methods.

Submitted to IEEE Transactions on Image Processing, including 5 figures and 4 tables. The first two authors contributed equally to this work.
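The alignment idea in the abstract can be illustrated with the generic warping step used by spatial transformer networks. The sketch below is not the ST^2N module: the abstract does not give its architecture or transformation family, so an affine transformation is assumed here, and the parameters `theta` are supplied by hand rather than learned from the current frame and its temporal context. The function names `affine_grid`, `bilinear_sample`, and `align_frame` are hypothetical and exist only for this example.

```python
import math


def affine_grid(theta, height, width):
    """Map every output pixel to a source location in normalized [-1, 1] coordinates.

    theta is a 2x3 affine matrix [[a, b, tx], [c, d, ty]] (assumed form).
    """
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sample a 2D image (list of rows) at the grid locations, with zero padding."""
    h, w = len(image), len(image[0])

    def pixel(r, c):
        # Locations outside the image contribute zero.
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            dx, dy = px - x0, py - y0
            # Weighted average of the four neighbouring pixels.
            value = (pixel(y0, x0) * (1 - dx) * (1 - dy)
                     + pixel(y0, x0 + 1) * dx * (1 - dy)
                     + pixel(y0 + 1, x0) * (1 - dx) * dy
                     + pixel(y0 + 1, x0 + 1) * dx * dy)
            out_row.append(value)
        out.append(out_row)
    return out


def align_frame(image, theta):
    """Warp one frame with a hand-specified affine transform."""
    grid = affine_grid(theta, len(image), len(image[0]))
    return bilinear_sample(image, grid)


frame = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
shift_right = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
aligned_identity = align_frame(frame, identity)
aligned_shifted = align_frame(frame, shift_right)
```

With the identity matrix the frame is returned unchanged. With `shift_right`, each output pixel samples one pixel further right in this 3x3 frame, and locations outside the frame are filled with zeros. In a learned module of the kind the abstract describes, `theta` would be predicted by a network rather than fixed.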

Related Organizations

Dalian University of Technology, China (People's Republic of)
Dalian Polytechnic University, China (People's Republic of)

Keywords

FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition

Impact by BIP!

- Selected citations: 86. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 1%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 1%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


Open access: Green, bronze

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
