Part of book or chapter of book, 2019
License: CC BY
Data sources: ZENODO
http://dx.doi.org/10.1002/9781...

Concept-Based and Event-Based Video Search in Large Video Collections

Authors: Foteini Markatopoulou; Damianos Galanopoulos; Christos Tzelepis; Vasileios Mezaris; Ioannis Patras

Abstract

Video content can be annotated with semantic information such as simple concept labels, which may refer to objects (e.g., "car" and "chair"), activities (e.g., "running" and "dancing"), scenes (e.g., "hills" and "beach"), etc., or with more complex (or high-level) events that describe the main action taking place in the complete video. An event may refer to complex activities, occurring at specific places and times, which involve people interacting with other people and/or objects, such as "changing a vehicle tire", "making a cake", or "attempting a bike trick". Concept-based and event-based video search refer to the retrieval, from large-scale video collections, of videos or video fragments (e.g., keyframes) that present specific simple concept labels or more complex events, respectively.

To deal with concept-based video search, video concept detection methods have been developed that automatically annotate video fragments with semantic labels (concepts). Then, given a specific concept, a ranking component retrieves the top related video fragments for this concept. While significant progress has been made in video concept detection in recent years, it remains a difficult and challenging task, due to the diversity in form and appearance exhibited by most semantic concepts and the difficulty of expressing them using a finite number of representations. A recent trend is to learn features directly from the raw keyframe pixels using deep convolutional neural networks (DCNNs). Other studies focus on combining many different video representations in order to capture different perspectives of the visual information. Finally, there are studies that focus on multi-task learning in order to exploit concept model sharing, and methods that look for existing semantic relations, e.g., concept correlations.

In contrast to concept detection, where annotated training data for learning the detectors are most often available, in video event detection we can distinguish two different but equally important cases: when a number of positive examples are available for training, and when no positive examples at all are available (the "zero-example" case). In the first case, a typical video event detection framework includes a feature extraction and a classification stage, where an event detector is learned by training one or more classifiers for each event class using the available features (sometimes similarly to the learning of concept detectors), usually followed by a fusion step that combines different modalities. In the latter case, where only a textual description is available for each event class, the research community has directed its efforts towards effectively combining textual and visual analysis techniques, e.g., using text analysis techniques, exploiting large sets of DCNN-based concept detectors, and applying various re-ranking methods such as pseudo-relevance feedback or self-paced re-ranking.

In this chapter, we survey the literature and present our research efforts towards improving concept-based and event-based video search. For concept-based video search, we focus on i) feature extraction using hand-crafted and DCNN-based descriptors, ii) dimensionality reduction using accelerated generalised subclass discriminant analysis (AGSDA), iii) cascades of hand-crafted and DCNN-based descriptors, iv) multi-task learning (MTL) to exploit model sharing, and v) stacking architectures to exploit concept relations.
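To make the concept detection and ranking pipeline described above concrete, the following minimal Python sketch trains one binary detector per concept on keyframe descriptors and then ranks indexed keyframes by detector confidence for a queried concept. The random feature vectors, the three-concept vocabulary and the logistic-regression detectors are illustrative placeholders standing in for the DCNN-based features, the AGSDA reduction and the trained classifiers discussed in the chapter, not the chapter's actual implementation.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_train, n_index, dim = 200, 1000, 128      # placeholder dataset sizes
concepts = ["car", "beach", "dancing"]      # simple concept labels

# Placeholder keyframe descriptors; in practice these would be DCNN-based
# features, possibly dimensionality-reduced (e.g., with AGSDA) beforehand.
X_train = rng.normal(size=(n_train, dim))
X_index = rng.normal(size=(n_index, dim))

# One binary detector per concept, trained here on synthetic annotations.
detectors = {}
for concept in concepts:
    y = rng.integers(0, 2, size=n_train)    # placeholder ground-truth labels
    detectors[concept] = LogisticRegression(max_iter=1000).fit(X_train, y)

def search(concept, top_k=5):
    # Rank all indexed keyframes by the detector's confidence for the concept.
    scores = detectors[concept].predict_proba(X_index)[:, 1]
    ranked = np.argsort(scores)[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in ranked]

print(search("beach"))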
For video event detection, we focus on methods that exploit positive examples, when available, again using DCNN-based features and AGSDA, and we also develop a framework for zero-example event detection that associates the textual description of an event class with the available visual concepts in order to identify the concepts most relevant to that event class. Additionally, we present a pseudo-relevance feedback mechanism that relies on AGSDA.
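As a rough illustration of the zero-example setting described above, the sketch below associates an event's textual description with the labels of an assumed pool of concept detectors via TF-IDF cosine similarity, and then ranks videos by a similarity-weighted aggregation of their concept-detection scores. The concept labels, the event description, the random per-video scores and the TF-IDF matching itself are assumptions made for the sake of the example; the chapter's framework relies on its own text analysis techniques and DCNN-based concept detectors.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)

# Labels of an assumed concept-detector pool and a textual event description.
concept_labels = ["bicycle", "wheel", "cake", "oven", "kitchen",
                  "road", "helmet", "crowd", "person jumping", "hand mixing"]
event_description = ("attempting a bike trick: a person performs jumps "
                     "and stunts on a bicycle")

# Associate the event description with the most relevant concepts
# (plain TF-IDF cosine similarity here; richer text analysis in practice).
vec = TfidfVectorizer().fit(concept_labels + [event_description])
sims = cosine_similarity(vec.transform([event_description]),
                         vec.transform(concept_labels))[0]
top = np.argsort(sims)[::-1][:3]            # indices of the most relevant concepts

# Placeholder per-video concept-detection scores (n_videos x n_concepts).
n_videos = 50
concept_scores = rng.random((n_videos, len(concept_labels)))

# Score each video by a similarity-weighted sum of its relevant concept scores
# and return a ranked list, i.e., zero-example event-based retrieval.
event_scores = concept_scores[:, top] @ sims[top]
ranking = np.argsort(event_scores)[::-1][:5]
print("selected concepts:", [concept_labels[i] for i in top])
print("top-ranked videos:", ranking.tolist())

In practice such an initial ranking could then be refined with a re-ranking step, for example a pseudo-relevance feedback mechanism like the one mentioned above.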

Keywords

video concept detection, concept-based search, event-based search, event detection

  • BIP! impact indicators: citations 0; popularity Average; influence Average; impulse Average
  • OpenAIRE UsageCounts: 5 views, 9 downloads