Speech Recognition for Medical Conversations

Publication: Article, Preprint, Conference object. 02 Sep 2018. Embargo end date: 01 Jan 2017. Publisher: ISCA. Journal: Interspeech 2018

Authors: Chung-Cheng Chiu; Anshuman Tripathi; Katherine Chou; Chris Co; Navdeep Jaitly; Diana Jaunzeikare; Anjuli Kannan; +7 Authors

doi: 10.21437/interspeech.2018-40, 10.48550/arxiv.1711.07274

arXiv: 1711.07274

Abstract

In this work we explored building automatic speech recognition models for transcribing doctor-patient conversations. We collected a large-scale dataset of clinical conversations (14,000 hr), designed the task to represent the real-world scenario, and explored several alignment approaches to iteratively improve data quality. We explored both CTC and LAS systems for building speech recognition models. The LAS system was more resilient to noisy data, while CTC required more data cleanup. A detailed analysis is provided for understanding the performance on clinical tasks. Our analysis showed that the speech recognition models performed well on important medical utterances, while errors occurred in casual conversations. Overall, we believe the resulting models can provide reasonable quality in practice.

Interspeech 2018 camera ready

Related Organizations

Google (United States)
United States

Keywords

FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Computation and Language, Machine Learning (stat.ML), Computer Science - Sound, Statistics - Machine Learning, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Computation and Language (cs.CL), Electrical Engineering and Systems Science - Audio and Speech Processing

Impact by BIP!

- Selected citations: 39. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Green

Fields of Science (4)

engineering and technology

electrical engineering, electronic engineering, information engineering
