
PB2007 acoustic-articulatory speech dataset
Badin P., Bailly G., Ben Youssef A., Elisei F., Savariaux C., Hueber T.
Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France

LICENSE:
========
This dataset is made available under the Creative Commons Attribution Share-Alike (CC-BY-SA) license.

CREDITS - ATTRIBUTION:
======================
If you use this dataset, please cite one of the following studies (all of them exploit this dataset):
- Ben Youssef, A., Badin, P., Bailly, G. & Heracleous, P. (2009). Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models. In Interspeech 2009, pp. 2255-2258. Brighton, UK.
- Ben Youssef, A., Badin, P. & Bailly, G. (2010). Can tongue be recovered from face? The answer of data-driven statistical models. In Interspeech 2010 (11th Annual Conference of the International Speech Communication Association) (T. Kobayashi, K. Hirose & S. Nakamura, editors), pp. 2002-2005. Makuhari, Japan.
- Hueber, T., Bailly, G., Badin, P. & Elisei, F. (2013). Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions. In Interspeech 2013, pp. 2753-2757. Lyon, France.
DATA FILES DESCRIPTION:
=======================
/_seq/: Electromagnetic articulography (EMA) data, recorded at 100 Hz.
Sensors:
PAR01 : LT_x (lower incisor, x coordinate)
PAR02 : tip_x (tongue tip, x coordinate)
PAR03 : mid_x (tongue dorsum, x coordinate)
PAR04 : bck_x (tongue back, x coordinate)
PAR05 : LL_vis_x (lower lip, x coordinate)
PAR06 : UL_vis_x (upper lip, x coordinate)
PAR07 : LT_z (lower incisor, z coordinate)
PAR08 : tip_z (tongue tip, z coordinate)
PAR09 : mid_z (tongue dorsum, z coordinate)
PAR10 : bck_z (tongue back, z coordinate)
PAR11 : LL_vis_z (lower lip, z coordinate)
PAR12 : UL_vis_z (upper lip, z coordinate)

/_wav16/: subject audio signal, synchronized with the EMA data.
Format: PCM wav, 16 kHz, 16 bits.

/_lab/: phonetic segmentation, using the following label set:
__ (long pause), _ (short pause), a, e^ (as in "lait"), e (as in "blé"), i, y (as in "voiture"), u (as in "loup"), o^ (as in "pomme"), x (as in "pneu"), x^ (as in "coeur"), a~ (as in "flan"), e~ (as in "in"), x~ (as in "un"), o~ (as in "mon"), p, t, k, f, s, s^ (as in "CHat"), b, d, g, v, z, z^ (as in "les Gens"), m, n, r, l, w, h, j, o, q (schwa)
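The channel layout and sampling rates above can be captured in code. The sketch below is a minimal helper, not part of the dataset: the on-disk format of the /_seq/ files is not specified in this README, so obtaining the raw 12-value frames is left open, and the names and functions here (EMA_CHANNELS, frame_to_dict, frame_time) are illustrative only.

```python
# Channel layout of the EMA data in /_seq/ (12 channels, 100 Hz),
# following the sensor list above. How the raw frames are read from
# disk is not described in this README, so a frame is assumed here to
# be a sequence of 12 floats in PAR01..PAR12 order.

EMA_CHANNELS = [
    "LT_x", "tip_x", "mid_x", "bck_x", "LL_vis_x", "UL_vis_x",
    "LT_z", "tip_z", "mid_z", "bck_z", "LL_vis_z", "UL_vis_z",
]

EMA_RATE_HZ = 100      # EMA sampling rate (from the README)
AUDIO_RATE_HZ = 16000  # audio sampling rate in /_wav16 (from the README)

# Each EMA frame spans a fixed number of synchronized audio samples.
SAMPLES_PER_FRAME = AUDIO_RATE_HZ // EMA_RATE_HZ  # 160

def frame_to_dict(frame):
    """Map one 12-value EMA frame to named sensor coordinates."""
    if len(frame) != len(EMA_CHANNELS):
        raise ValueError(
            "expected %d channels, got %d" % (len(EMA_CHANNELS), len(frame))
        )
    return dict(zip(EMA_CHANNELS, frame))

def frame_time(frame_index):
    """Time in seconds of a given EMA frame index."""
    return frame_index / EMA_RATE_HZ
```

For example, frame index 100 corresponds to t = 1.0 s, and aligns with audio samples 16000..16159 in the synchronized wav file.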
KEYWORDS:
=========
speech, articulatory, EMA