
As a first step towards a complete computational model of speech learning involving perception-production loops, we investigate the forward mapping between pseudo-motor commands and articulatory trajectories. Two phonological feature sets, based respectively on generative and articulatory phonology, are used to encode a phonetic target sequence. Different interpolation techniques are compared to generate smooth trajectories in these feature spaces, with a potential optimisation of the target value and timing to capture co-articulation effects. We report the Pearson correlation between a linear projection of the generated trajectories and articulatory data derived from a multi-speaker dataset of electromagnetic articulography (EMA) recordings. A correlation of 0.67 is obtained with an extended feature set based on generative phonology and a linear interpolation technique. We discuss the implications of our results for our understanding of the dynamics of biological motion.
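The pipeline summarised above (interpolate between phonological targets, project linearly onto an articulatory dimension, score with Pearson correlation) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy targets, and the hand-picked projection weights are all hypothetical; in practice the weights would presumably be estimated from EMA data. The sketch assumes target times are strictly increasing.

```python
from math import sqrt

def interpolate_linear(targets, times):
    """Piecewise-linear trajectory through (time, value) targets.

    Assumes target times are strictly increasing; the value is held
    constant before the first and after the last target.
    """
    out = []
    for t in times:
        if t <= targets[0][0]:
            out.append(targets[0][1])
        elif t >= targets[-1][0]:
            out.append(targets[-1][1])
        else:
            for (t0, v0), (t1, v1) in zip(targets, targets[1:]):
                if t0 <= t <= t1:
                    out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
                    break
    return out

def project(trajectories, weights):
    """Weighted sum of feature trajectories, frame by frame (a linear projection)."""
    n_frames = len(trajectories[0])
    return [sum(w * traj[i] for w, traj in zip(weights, trajectories))
            for i in range(n_frames)]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy example: two hypothetical feature dimensions sampled at five frames.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
feature_a = interpolate_linear([(0.0, 0.0), (0.5, 1.0), (1.0, 0.0)], times)
feature_b = interpolate_linear([(0.0, 1.0), (1.0, 0.0)], times)

# Hypothetical projection weights (hand-picked, not fitted to data).
projected = project([feature_a, feature_b], [1.0, -0.5])

# Stand-in for a measured articulatory trajectory (made up for illustration).
measured = [2.0 * p + 1.0 for p in projected]
score = pearson(projected, measured)
```

Because the stand-in measurement is an affine function of the projected trajectory, the score here is 1 up to rounding; real articulatory data would give a lower value.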
Accepted at Interspeech 2024.
FOS: Computer and information sciences, Computer Science - Computation and Language, speech production, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], Audio and Speech Processing (eess.AS), computational modelling, FOS: Electrical engineering, electronic engineering, information engineering, articulatory-to-acoustic mapping, phonological features, Computation and Language (cs.CL), [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD], Electrical Engineering and Systems Science - Audio and Speech Processing
