
The EMO dataset is a high-quality, paired audio corpus developed to support research in vocal timbral technique conversion, with a primary focus on the vocal fry scream. It was created to address the scarcity of paired data for extreme vocalizations in the research community.

Key dataset features:

- **Content:** 1040 high-quality clips, comprising 520 modal-voice and 520 vocal-fry-scream pairs.
- **Duration:** Approximately 42 minutes in total.
- **Source:** Recorded by a single professional metal singer.
- **Languages:** Vocalizations in both Chinese and English.
- **Alignment:** All clips were manually aligned in a Digital Audio Workstation to ensure precise temporal consistency between the modal and scream pairs.
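Because the corpus is organized as modal/scream pairs, a typical first step is to match each modal clip with its scream counterpart. The sketch below assumes a hypothetical directory layout (`modal/` and `scream/` subfolders with shared clip IDs as filenames); the dataset's actual on-disk layout is not specified here, so adapt the paths and naming to the real distribution.

```python
from pathlib import Path


def pair_clips(root):
    """Pair modal-voice clips with their vocal-fry-scream counterparts.

    Assumes a hypothetical layout (not specified by the dataset page):
        root/modal/<clip_id>.wav
        root/scream/<clip_id>.wav
    where a shared <clip_id> denotes one modal/scream pair.
    """
    modal = {p.stem: p for p in Path(root, "modal").glob("*.wav")}
    scream = {p.stem: p for p in Path(root, "scream").glob("*.wav")}
    # Keep only IDs present in both subsets, in a stable order.
    shared = sorted(modal.keys() & scream.keys())
    return [(modal[k], scream[k]) for k in shared]
```

Returning `(modal, scream)` path tuples keyed by a shared clip ID keeps the pairing explicit, which matters for conversion tasks where source and target must stay temporally aligned.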
voice conversion, vocal technique
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the Influence indicator. | 0 |
| Popularity | Reflects the current impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
