Estimating speech from lip dynamics

Preprint English OPEN
George, Jithin Donny; Keane, Ronan; Zellmer, Conor
  • Subject: Computer Science - Computer Vision and Pattern Recognition | Statistics - Machine Learning | Statistics - Computation
    acm: ComputingMethodologies_COMPUTERGRAPHICS | ComputingMethodologies_PATTERNRECOGNITION | ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION

The goal of this project is to develop a limited lip reading algorithm for a subset of the English language. We consider a scenario in which no audio information is available. The raw video is processed and the position of the lips in each frame is extracted. We then pr...