
Generative speech enhancement methods based on generative adversarial networks (GANs) and diffusion models have shown promising results across a range of speech enhancement tasks. However, their performance at very low signal-to-noise ratios (SNRs) remains under-explored and limited, as these conditions challenge both discriminative and generative state-of-the-art methods. To address this, we propose a method that uses latent features extracted from discriminative speech enhancement models as generic conditioning features to improve GAN-based speech enhancement.

Research goal: How does latent discriminative conditioning in GAN-based speech enhancement compare to unconditional diffusion models in terms of convergence speed and inference throughput under low-SNR conditions?

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 8.2/10.
