Efficient Speech Separation with Differencing

Given an input audio signal in which multiple speakers talk over each other, the goal of speech separation is to recover the original signal of each speaker. In this paper, we propose a novel sequence modelling method called relative context, which is based on differencing, and use it in a speech separation architecture called RCSep. The main advantages of relative context are that it requires no trainable parameters, is very lightweight, and is highly parallelizable. The RCSep model, which relies heavily on relative context, is an extremely efficient source separation model. It has fewer than 500k trainable parameters, uses less memory, and is significantly faster than all previous source separation methods, while still maintaining reasonably high separation accuracy.
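
The abstract does not specify how relative context is computed, so the following is only an illustrative sketch of the general idea of differencing a sequence without trainable parameters. It is not the RCSep implementation. The function name `lagged_differences`, the choice of lags, and the zero padding at the start of the sequence are all assumptions made for this example.

```python
def lagged_differences(frames, lags=(1, 2, 4)):
    """Parameter-free differencing sketch (hypothetical, not the paper's method).

    frames: list of feature frames, each a list of floats of equal length.
    lags:   positive time offsets (assumed values, chosen for illustration).

    For every frame t and every lag k, the output holds frames[t] - frames[t - k].
    Frames before the start of the sequence are treated as zeros (an assumption).
    """
    if any(lag < 1 for lag in lags):
        raise ValueError("lags must be positive integers")

    out = []
    for t, frame in enumerate(frames):
        row = []
        for lag in lags:
            # Use an all-zero frame when t - lag falls before the sequence start.
            prev = frames[t - lag] if t >= lag else [0.0] * len(frame)
            row.extend(a - b for a, b in zip(frame, prev))
        out.append(row)
    return out


if __name__ == "__main__":
    sequence = [[0.0], [1.0], [3.0], [6.0]]
    for row in lagged_differences(sequence, lags=(1, 2)):
        print(row)
```

Each output row depends only on fixed offsets into the input, so every time step can be computed independently of the others and nothing is learned. This is in the spirit of the "no trainable parameters" and "highly parallelizable" properties stated above, under the assumptions listed.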

Related Organizations

Kiel University
Germany
