Powered by OpenAIRE graph
Imperial College Lon...
https://dx.doi.org/10.25560/11...
Other literature type . 2023
Data sources: Datacite

Audiovisual speech comprehension of degraded and synthetic visual signals

Authors: Varano, Enrico

Abstract

Seeing a speaker's face helps comprehension, in particular in challenging listening conditions or for those living with hearing loss, an effect thought to arise from the integration of temporal and categorical features carried by the visual stream. Recent studies into the neurobiological mechanisms of speech perception have employed continuous stimuli, an important milestone towards understanding such processes in ecological paradigms. However, efforts to extend this principle to audiovisual speech are impeded by a lack of high-quality recordings. We seek to close this gap by presenting the AVbook corpus, which includes methods designed to enable synchronised delivery of the streams, is presented alongside validation data, and is publicly available to support research in neurobiology and speech recognition. We then employ this corpus to investigate how the cortical tracking of the speech envelope is affected by degraded visual signals. We find that visual signals need to contain information beyond the speech envelope to convey a benefit and, employing electroencephalography, that this benefit is linked to the gain in delta-band activity, evidencing a role of the cortical tracking of words in audiovisual speech comprehension. Recent advances in speech-driven models have made it possible to synthesise photo-realistic talkers from a still portrait. We demonstrate the suitability of such signals for improving speech-in-noise comprehension by showing that humans cannot distinguish between the natural and the synthesised videos, that the latter aid humans in understanding speech, and that audiovisual speech recognisers benefit from these animations too. The work discussed in this thesis sheds light on some of the poorly understood neural and behavioural processes underlying audiovisual speech perception, and characterises the effectiveness of synthesised videos as listening aids.
The AVbook corpus substantially lowers the barrier to further work on the matter and to a range of experimental paradigms beyond those considered here.

Country
United Kingdom
Related Organizations
Keywords

150, 400

  • BIP!
    Impact by BIP!
    selected citations: 0
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
Green