Multimodal Deep Learning Architecture for Hindustani Raga Classification

This paper presents a deep learning architecture for classifying Hindustani (North Indian classical music) ragas (musical modes). We propose a modular architecture that processes data from two modalities: audio recordings and metadata. Our bimodal classifier combines convolutional and feed-forward neural networks and incorporates spectral information from the audio together with metadata descriptors tailored to the particular melodic characteristics of Hindustani music. Specifically, we use audio recordings, along with manually annotated and automatically extracted metadata, for samples of both Hindustani improvisations and compositions available in the Saraga open dataset of Indian art music. Experiments are conducted on two Hindustani ragas, Yaman and Bhairavi. The results indicate that integrating multimodal data improves classification accuracy compared with using audio features alone. Additionally, for the specific task of raga classification, the swaragram feature, which is customized for Hindustani music, outperforms audio features that are commonly used for Eurocentric music genres.
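To make the two-branch idea concrete, the following is a minimal, dependency-free sketch of a forward pass through a bimodal classifier: a small convolutional branch over a spectral input, a feed-forward branch over a metadata vector, and a joint output layer over the two ragas. It is an illustration under stated assumptions, not the architecture evaluated in the paper. The class name `BimodalClassifier`, the layer sizes, the 3x3 kernels, fusion by concatenation, the input shapes, and the random untrained weights are all hypothetical; the abstract does not specify any of them. A practical implementation would use a deep learning framework and trained parameters.

```python
import math
import random

RAGAS = ["Yaman", "Bhairavi"]  # the two classes named in the abstract


def conv2d_valid(image, kernel):
    """Single-channel 2-D cross-correlation with 'valid' padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(x):
    return max(0.0, x)


def dense(vector, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class BimodalClassifier:
    """Hypothetical two-branch model with untrained, random weights."""

    def __init__(self, n_kernels=4, n_meta=3, n_hidden=5, seed=0):
        rng = random.Random(seed)
        self.kernels = [random_matrix(rng, 3, 3) for _ in range(n_kernels)]
        self.meta_w = random_matrix(rng, n_hidden, n_meta)
        self.meta_b = [0.0] * n_hidden
        # The output layer sees both branches concatenated (assumed fusion).
        self.out_w = random_matrix(rng, len(RAGAS), n_kernels + n_hidden)
        self.out_b = [0.0] * len(RAGAS)

    def audio_branch(self, spectral):
        # Convolution -> ReLU -> global average pooling, one value per kernel.
        features = []
        for kernel in self.kernels:
            fmap = conv2d_valid(spectral, kernel)
            activations = [relu(v) for row in fmap for v in row]
            features.append(sum(activations) / len(activations))
        return features

    def metadata_branch(self, metadata):
        # One hidden feed-forward layer over the metadata descriptors.
        return [relu(v) for v in dense(metadata, self.meta_w, self.meta_b)]

    def predict_proba(self, spectral, metadata):
        fused = self.audio_branch(spectral) + self.metadata_branch(metadata)
        return softmax(dense(fused, self.out_w, self.out_b))


# Placeholder inputs: a 12 x 20 spectral matrix and three metadata values.
rng = random.Random(1)
spectral = random_matrix(rng, 12, 20)
metadata = [0.2, 0.7, 0.1]

model = BimodalClassifier()
probs = model.predict_proba(spectral, metadata)
print(dict(zip(RAGAS, probs)))
```

Because the weights are random, the printed probabilities carry no musical meaning; the sketch only shows how features from the two modalities can be combined before a shared classification layer.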

Related Organizations

Hellenic Mediterranean University
Greece
Ionian University
Greece

Keywords

Multimodal, Deep learning, Convolutional neural networks, Hindustani raga identification
