Concept Detection Scores For The Med16Train Dataset (Trecvid Med Task)

Keywords: TRECVID MED task, video analysis, concept detection, Multimedia event detection, event detection

Galanopoulos, Damianos; Markatopoulou, Foteini; Mezaris, Vasileios


Dataset . 2017

License: CC BY

Data sources: Datacite

https://doi.org/10.5281/zenodo...

Data sources: Sygma

Type: Research data (Dataset)
Published: 28 Mar 2017
Publisher: Zenodo
Funded by: EC | MOVING

doi: 10.5281/zenodo.438641

Abstract

We provide concept detection scores for the MED16train dataset, which is used in the TRECVID Multimedia Event Detection (MED) task [1]. First, each video is decoded into a set of keyframes at fixed temporal intervals (2 keyframes per second). Then, concept detection scores are calculated for the following two concept sets: i) 487 sport-related concepts from the YouTube Sports-1M Dataset [1] and ii) 345 TRECVID SIN concepts [3].

The scores were generated as follows:

1) For the 487 concepts of the Sports-1M Dataset, a GoogLeNet network [4] originally trained on 5055 ImageNet concepts was fine-tuned, following the extension strategy of [2] with one extension layer of dimension 128.
2) For the 345 TRECVID SIN concepts, a GoogLeNet network [4] pre-trained on 5055 ImageNet concepts was fine-tuned on these concepts, again following the extension strategy of [2] with one extension layer of dimension 1024.

After unpacking the compressed file, two folders can be found, "Prob_sports_MED16train" and "Prob_SIN_MED16train", one for each concept set. For each concept set, we provide one file for every video of the MED16train dataset. Each file consists of N columns (N = 345 for TRECVID SIN and N = 487 for the Sports-1M Dataset) and M rows (where M is the number of keyframes extracted from the corresponding video). Each column corresponds to a different concept, and all concept scores lie in the range [0,1]. The higher the score, the more likely it is that the corresponding concept appears in the keyframe. Two additional files, "sports_487_Classes.txt" and "SIN_345_Classes.txt", give the order of the concepts used in the concept score files.

[1] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar and L. Fei-Fei, "Large-scale video classification with convolutional neural networks", In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725-1732, 2014.
[2] N. Pittaras, F. Markatopoulou, V. Mezaris and I. Patras, "Comparison of Fine-tuning and Extension Strategies for Deep Convolutional Neural Networks", Proc. 23rd Int. Conf. on MultiMedia Modeling (MMM'17), Reykjavik, Iceland, Springer LNCS vol. 10132, pp. 102-114, Jan. 2017.
[3] G. Awad, C. Snoek, A. Smeaton and G. Quénot, "TRECVid semantic indexing of video: a 6-year retrospective", ITE Transactions on Media Technology and Applications, 4 (3), pp. 187-208, 2016.
[4] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, "Going deeper with convolutions", In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.
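The file layout described in the abstract (M keyframe rows by N concept columns, with a separate class-list file giving the column order) could be read along the following lines. This is only a sketch under assumptions the record does not confirm: the score files are assumed to be plain text with whitespace- or comma-separated values, the class files are assumed to hold one concept name per line, and rows are assumed to be in temporal order starting at time zero. The function names and the toy concept names are hypothetical, and mean pooling over keyframes is just one simple way to obtain a video-level score, not necessarily what the authors used.

```python
KEYFRAMES_PER_SECOND = 2  # sampling rate stated in the abstract


def parse_scores(text):
    """Parse one score file: one row per keyframe, one column per concept.

    Assumption: values are separated by whitespace or commas.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append([float(v) for v in line.replace(",", " ").split()])
    return rows


def parse_class_names(text):
    """Parse a class-list file. Assumption: one concept name per line."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def video_level_scores(rows):
    """Average each concept column over all keyframes (mean pooling)."""
    if not rows:
        raise ValueError("no keyframe rows")
    return [sum(col) / len(rows) for col in zip(*rows)]


def top_concepts(rows, names, k=3):
    """Return the k (name, mean score) pairs with the highest mean score."""
    if any(len(r) != len(names) for r in rows):
        raise ValueError("column count does not match the class list")
    ranked = sorted(zip(names, video_level_scores(rows)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def keyframe_time(row_index):
    """Approximate timestamp in seconds of a row (assumes row 0 is t = 0)."""
    return row_index / KEYFRAMES_PER_SECOND


if __name__ == "__main__":
    # Toy stand-ins for a real score file and class list (3 concepts, 2 keyframes).
    toy_scores = "0.1 0.8 0.3\n0.2 0.6 0.1\n"
    toy_names = "concept_a\nconcept_b\nconcept_c\n"
    rows = parse_scores(toy_scores)
    names = parse_class_names(toy_names)
    for name, score in top_concepts(rows, names, k=2):
        print(name, round(score, 3))
```

For real data, the text passed to `parse_scores` would come from a file in "Prob_sports_MED16train" or "Prob_SIN_MED16train", and the class list from "sports_487_Classes.txt" or "SIN_345_Classes.txt" respectively; the column-count check guards against pairing a score file with the wrong class list.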

Linked publications:

(1) N. Pittaras, F. Markatopoulou, V. Mezaris, I. Patras, "Comparison of Fine-tuning and Extension Strategies for Deep Convolutional Neural Networks", Proc. 23rd Int. Conf. on MultiMedia Modeling (MMM'17), Reykjavik, Iceland, Jan. 2017.
(2) F. Markatopoulou, A. Moumtzidou, D. Galanopoulos, T. Mironidis, V. Kaltsa, A. Ioannidou, S. Symeonidis, K. Avgerinakis, S. Andreadis, I. Gialampoukidis, S. Vrochidis, A. Briassouli, V. Mezaris, I. Kompatsiaris, I. Patras, "ITI-CERTH participation to TRECVID 2016", In TRECVID 2016 Workshop, Gaithersburg, MD, USA, 2016.


Impact by BIP!

Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.