Speech Communication
Article . 2016 . Peer-reviewed
License: Elsevier TDM
Data sources: Crossref
DBLP
Article . 2021
Data sources: DBLP

Finding relevant features for zero-resource query-by-example search on speech

Authors: Paula Lopez-Otero; Laura Docío Fernández; Carmen García-Mateo

Abstract

Zero-resource strategies for query-by-example search on speech have attracted the interest of the research community, as they require neither training (and therefore no large amounts of training data) nor any knowledge of the language being processed or of any other language. These systems usually rely on Mel-frequency cepstral coefficients (MFCCs) for speech representation and on dynamic time warping (DTW) or one of its variants for performing the search. Nevertheless, which features to use in this task is still an open research problem, and the use of large feature sets combined with feature selection approaches has not yet been addressed in the query-by-example search on speech scenario. In this paper, we present two methods to select the most relevant features among a large set of acoustic features, estimating the relevance of each feature from the costs of the best alignment path (obtained when performing DTW) and its neighbouring region. To prove the validity of these methods, experiments were carried out in four different search on speech scenarios that were used in international benchmarks, namely the Albayzin 2014 search on speech evaluation, the MediaEval spoken web search SWS 2013, and the MediaEval query-by-example search on speech QUESST2014 and QUESST2015. Experimental results showed a dramatic improvement when the feature set was reduced using the proposed techniques, especially in the case of the relevance-based approaches. A comparison between the proposed methods and other representations, such as MFCCs, phonetic posteriorgrams and dimensionality reduction based on principal component analysis, showed that the zero-resource approaches presented in this paper are promising, as they outperformed more widespread approaches in all the experimental scenarios. Besides improving search on speech results, the feature relevance estimation approaches also revealed features other than MFCCs that appear to add value in query-by-example tasks.
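
To make the idea of reading per-feature costs off a DTW alignment more concrete, the following is a minimal sketch in plain Python. It is not the authors' method: the abstract does not give the relevance formula, so the score used here (mean squared difference of each feature along the best path) is a hypothetical stand-in, and the neighbouring region of the path is ignored. The function names `dtw_path` and `per_feature_path_cost` and the toy data are assumptions made for illustration only.

```python
import math


def dtw_path(query, doc):
    """Classic DTW over two sequences of feature vectors.

    Returns (accumulated cost, best path as 0-based (i, j) pairs).
    Local cost is the squared Euclidean distance (an assumption).
    """
    n, m = len(query), len(doc)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # acc[i][j]: cheapest alignment of query[:i] with doc[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = dist(query[i - 1], doc[j - 1]) + step

    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path


def per_feature_path_cost(query, doc, path):
    """Mean squared difference of each feature along the alignment path.

    Hypothetical relevance proxy: a lower value means the feature
    agrees better between query and document along the best path.
    """
    dims = len(query[0])
    costs = [0.0] * dims
    for i, j in path:
        for d in range(dims):
            costs[d] += (query[i][d] - doc[j][d]) ** 2
    return [c / len(path) for c in costs]


# Toy data: feature 0 follows the same trajectory in both sequences,
# feature 1 is unrelated between them.
query = [[0.0, 0.3], [1.0, -0.8], [2.0, 0.5]]
doc = [[0.0, -0.9], [1.0, 0.7], [1.0, -0.2], [2.0, -0.6]]

total, path = dtw_path(query, doc)
costs = per_feature_path_cost(query, doc, path)
print("path:", path)
print("per-feature mean cost:", costs)
```

In this toy case the feature that follows the same trajectory in both sequences accumulates a lower cost along the path than the unrelated one. A feature selection step could rank features by such a score and keep a subset, but how the paper actually ranks and selects them is not specified in this abstract.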

  • BIP! impact indicators
    selected citations: 9
    popularity: Average
    influence: Top 10%
    impulse: Top 10%