Attention Mechanisms, Signal Encodings and Fusion Strategies for Improved Ad-hoc Video Search with Dual Encoding Networks

Publication type: Article, Conference object
Date: 08 Jun 2020
Publisher: ACM
Journal: Proceedings of the 2020 International Conference on Multimedia Retrieval
Funded by: EC | ReTV

Authors: Galanopoulos, Damianos; Mezaris, Vasileios

doi: 10.1145/3372278.3390737

Abstract

This paper addresses the problem of retrieving unlabeled videos using textual queries. We present an extended dual encoding network that uses multiple encodings of the visual and textual content, as well as two different attention mechanisms. The attention mechanisms highlight temporal locations in each modality that can contribute more to effective retrieval. The different encodings of the visual and textual inputs, along with early and late fusion strategies, are examined to further improve performance. Experimental evaluations and comparisons with state-of-the-art methods document the merit of the proposed network.
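
To make the terms in the abstract concrete, the following is a minimal generic sketch of attention-weighted temporal pooling and of early versus late fusion. It is not the authors' network: the abstract does not specify the form of its attention mechanisms or fusion rules, so the function names (attention_pool, early_fusion, late_fusion), the dot-product scoring vector w, and the toy feature values are all assumptions made for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(steps, w):
    # Assumed scoring rule: each temporal step (frame or word vector)
    # is scored by its dot product with a vector w; in a real model
    # w would be learned. The softmax weights say how much each
    # temporal location contributes to the pooled representation.
    scores = [sum(x * y for x, y in zip(step, w)) for step in steps]
    weights = softmax(scores)
    dim = len(steps[0])
    pooled = [sum(a * step[d] for a, step in zip(weights, steps))
              for d in range(dim)]
    return pooled, weights

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def early_fusion(encodings):
    # Early fusion (illustrative): concatenate several encodings of
    # the same item into one vector before any comparison.
    return [x for enc in encodings for x in enc]

def late_fusion(similarities):
    # Late fusion (illustrative): average the similarity scores
    # obtained separately from each encoding.
    return sum(similarities) / len(similarities)

# Toy example: two frames with 2-dimensional features (made-up values).
frames = [[1.0, 0.0], [3.0, 2.0]]
pooled, weights = attention_pool(frames, w=[0.0, 1.0])
query = [1.0, 1.0]
fused_score = late_fusion([cosine(pooled, query), cosine(frames[0], query)])
```

In this sketch the second frame receives the larger attention weight because its score under w is higher. The choice between the two fusion styles is a design trade-off: early fusion lets one model see all encodings jointly, while late fusion keeps per-encoding comparisons separate and combines only their scores.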

Related Organizations

Centre for Research and Technology Hellas
Greece

Keywords

Dual encoding network, Video search, Attention mechanism, Deep learning, Video retrieval, Ad-hoc video search

Impact by BIP!

	selected citations: 20. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
	influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.