publication . Preprint . 2020

VoiceCoach: Interactive Evidence-based Training for Voice Modulation Skills in Public Speaking

Wang, Xingbo; Zeng, Haipeng; Wang, Yong; Wu, Aoyu; Sun, Zhida; Ma, Xiaojuan; Qu, Huamin
Open Access English
  • Published: 21 Jan 2020
Abstract
The modulation of voice properties, such as pitch, volume, and speed, is crucial for delivering a successful public speech. However, it is challenging to master different voice modulation skills. Though many guidelines are available, they are often not practical enough to be applied in different public speaking situations, especially for novice speakers. We present VoiceCoach, an interactive evidence-based approach to facilitate the effective training of voice modulation skills. Specifically, we have analyzed the voice modulation skills from 2623 high-quality speeches (i.e., TED Talks) and use them as the benchmark dataset. Given a voice input, VoiceCoach automa...
Subjects
free text keywords: Computer Science - Human-Computer Interaction, Computer Science - Computation and Language, Computer Science - Information Retrieval