The model expects a raw audio signal as input and outputs predictions for arousal, dominance, and valence, each in a range of approximately 0...1. It also provides the pooled states of the last transformer layer. The model was created by fine-tuning a pre-trained wav2vec 2.0 model on MSP-Podcast (v1.7). As foundation we use wav2vec2-large-robust, released by Facebook under Apache 2.0, which we pruned from 24 to 12 transformer layers before fine-tuning. The model was afterwards exported to ONNX format. Further details are given in the associated paper. For an introduction to using the model, please visit our tutorial project. The original [Torch](https://pytorch.org/docs/stable/torch.html) model is hosted on Hugging Face.
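As a rough illustration of the input/output contract described above, the sketch below prepares a raw mono signal for the exported ONNX model. The tensor names, model path, and normalization step are assumptions, not taken from the release; inspect the exported model (e.g. via `session.get_inputs()`) and the tutorial project for the authoritative details.

```python
import numpy as np

SAMPLING_RATE = 16000  # wav2vec 2.0 models expect 16 kHz mono audio


def preprocess(signal: np.ndarray) -> np.ndarray:
    """Cast a raw mono signal to float32, peak-normalize it (a placeholder
    for whatever preprocessing the released processor defines), and add a
    batch dimension so the shape is (1, num_samples)."""
    signal = signal.astype(np.float32)
    peak = np.abs(signal).max()
    if peak > 0:
        signal = signal / peak
    return signal[np.newaxis, :]


# One second of silence as a stand-in for real speech.
batch = preprocess(np.zeros(SAMPLING_RATE))
print(batch.shape)  # (1, 16000)

# Running the exported ONNX model might then look like this
# (path and tensor names are illustrative only):
#
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx")
# hidden_states, logits = session.run(None, {"signal": batch})
# arousal, dominance, valence = logits[0]  # each roughly in 0...1
```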
Valence, Speech emotion recognition, wav2vec 2.0, ONNX, Transformer model, Deep learning, MSP-Podcast, Arousal, Dominance
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 2 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| Views | | 921 |
| Downloads | | 3K |

Views provided by UsageCounts
Downloads provided by UsageCounts