The contribution of visual information to the perception of speech in noise with and without informative temporal fine structure

Stacey, Paula C. ; Kitterick, Pádraig T. ; Morris, Saffron D. ; Sumner, Christian J. (2016)
  • Publisher: Elsevier
  • Journal: Hearing research, volume 336, pages 17-28 (issn: 0378-5955, eissn: 1878-5891)
  • Related identifiers: doi: 10.1016/j.heares.2016.04.002, pmc: PMC5706637
  • Subject: Temporal fine structure | Visual speech | Sine-wave vocoding | Sensory Systems | Audio-visual | Cochlear implants | Article

Understanding what is said in demanding listening situations is assisted greatly by looking at the face of a talker. Previous studies have observed that normal-hearing listeners can benefit from this visual information when a talker’s voice is presented in background noise. These benefits have also been observed in quiet listening conditions in cochlear-implant users, whose device does not convey the informative temporal fine structure cues in speech, and when normal-hearing individuals listen to speech processed to remove these informative temporal fine structure cues. The current study (1) characterised the benefits of visual information when listening in background noise; and (2) used sine-wave vocoding to compare the size of the visual benefit when speech is presented with or without informative temporal fine structure. The accuracy with which normal-hearing individuals reported words in spoken sentences was assessed across three experiments. The availability of visual information and informative temporal fine structure cues was varied within and across the experiments. The results showed that visual benefit was observed using open- and closed-set tests of speech perception. The size of the benefit increased when informative temporal fine structure cues were removed. This finding suggests that visual information may play an important role in the ability of cochlear-implant users to understand speech in many everyday situations. Models of audio-visual integration were able to account for the additional benefit of visual information when speech was degraded and suggested that auditory and visual information was being integrated in a similar way in all conditions. The modelling results were consistent with the notion that audio-visual benefit is derived from the optimal combination of auditory and visual sensory cues.
