Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ UNSWorksarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
image/svg+xml Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao Closed Access logo, derived from PLoS Open Access logo. This version with transparent background. http://commons.wikimedia.org/wiki/File:Closed_Access_logo_transparent.svg Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao
IEEE Journal of Selected Topics in Signal Processing
Article . 2020 . Peer-reviewed
License: IEEE Copyright
Data sources: Crossref
DBLP
Article
Data sources: DBLP
versions View all 2 versions
addClaim

Natural Language Processing Methods for Acoustic and Landmark Event-Based Features in Speech-Based Depression Detection

Authors: Huang, Z; Epps, J; Joachim, D; Sethu, V;

Natural Language Processing Methods for Acoustic and Landmark Event-Based Features in Speech-Based Depression Detection

Abstract

The processing of speech as an explicit sequence of events is common in automatic speech recognition (linguistic events), but has received relatively little attention in paralinguistic speech classification despite its potential for characterizing broad acoustic event sequences. This paper proposes a framework for analyzing speech as a sequence of acoustic events, and investigates its application to depression detection. In this framework, acoustic space regions are tokenized to ‘words’ representing speech events at fixed or irregular intervals. This tokenization allows the exploitation of acoustic word features using proven natural language processing methods. A key advantage of this framework is its ability to accommodate heterogeneous event types: herein we combine acoustic words and speech landmarks, which are articulation-related speech events. Another advantage is the option to fuse such heterogeneous events at various levels, including the embedding level. Evaluation of the proposed framework on both controlled laboratory-grade supervised audio recordings as well as unsupervised self-administered smartphone recordings highlight the merits of the proposed framework across both datasets, with the proposed landmark-dependent acoustic words achieving improvements in F1(depressed) of up to 15% and 13% for SH2-FS and DAIC-WOZ respectively, relative to acoustic speech baseline approaches.

Country
Australia
Related Organizations
Keywords

anzsrc-for: 1005 Communications Technologies, 4608 Human-Centred Computing, Depression, anzsrc-for: 46 Information and Computing Sciences, anzsrc-for: 4603 Computer vision and multimedia computation, Mental Illness, Brain Disorders, 004, anzsrc-for: 4608 Human-Centred Computing, Mental Health, 46 Information and Computing Sciences, Clinical Research, Behavioral and Social Science, anzsrc-for: 0801 Artificial Intelligence and Image Processing, anzsrc-for: 4006 Communications engineering, anzsrc-for: 0906 Electrical and Electronic Engineering

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    31
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Top 10%
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Top 10%
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Top 10%
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
31
Top 10%
Top 10%
Top 10%
Green