ZENODO
Preprint
Data sources: ZENODO

Continuous Audio Denoising: Zero Shot Scale Invariance Via Fourier Neural Operators

Authors: Mirpuri, Suraj

Abstract

Convolutional Neural Networks (CNNs) for audio processing typically learn discrete, fixed-grid representations, which makes them brittle across varying sample rates. We present a 1D Fourier Neural Operator (FNO) designed for continuous, scale-invariant audio denoising. By formulating the signal as a continuous function in the frequency domain and using a curriculum fine-tuning approach with alternating multi-rate batches, we achieve state-of-the-art zero-shot generalization. Evaluated on 44.1 kHz audio, our model achieves a Scale-Invariant Signal-to-Distortion Ratio (SI-SDR) of 14.51 dB, outperforming the 16 kHz-trained Wave-U-Net baseline (12.74 dB) without requiring full retraining. Ablation studies point to a potential architectural simplification: multi-rate fine-tuning curricula, rather than explicit coordinate mapping, are the primary drivers of scale invariance. This finding allows for a simpler architecture that maintains O(N log N) efficiency while offering a robust, resolution-agnostic pathway for audio processing.
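
To illustrate why a Fourier-domain layer can be resolution agnostic, the following is a minimal NumPy sketch of a single-channel 1D spectral convolution, the core operation of an FNO layer. It is not the author's architecture: the function name `spectral_conv_1d`, the choice of 8 retained modes, the random weights, and the test tone are all assumptions made for illustration, and the sketch omits channel mixing, the pointwise linear path, and the nonlinearity of a full FNO block. The FFT is what gives the O(N log N) cost mentioned in the abstract.

```python
import numpy as np


def spectral_conv_1d(x, weights):
    """Single-channel spectral convolution (hypothetical helper, illustration only).

    x       : real samples of a signal over a fixed time window
    weights : complex multipliers, one per retained low-frequency mode
    """
    n = x.shape[-1]
    modes = weights.shape[0]
    if modes > n // 2 + 1:
        raise ValueError("more modes requested than the signal resolves")

    # norm="forward" scales by 1/n, so the coefficients approximate the
    # Fourier-series coefficients of the underlying continuous signal and
    # do not depend on how many samples were taken.
    x_hat = np.fft.rfft(x, norm="forward")

    # Multiply the lowest `modes` coefficients by the weights; zero the rest.
    out_hat = np.zeros_like(x_hat)
    out_hat[:modes] = x_hat[:modes] * weights

    # Return to the time domain on the same grid the input came from.
    return np.fft.irfft(out_hat, n=n, norm="forward")


def tone(n, cycles=3):
    """Sample a cosine with `cycles` periods over a unit-length window."""
    t = np.arange(n) / n
    return t, np.cos(2 * np.pi * cycles * t)


# One set of weights, shared across both sampling grids (assumed values).
rng = np.random.default_rng(0)
weights = rng.standard_normal(8) + 1j * rng.standard_normal(8)

# 160 and 441 samples correspond to a 10 ms window at 16 kHz and 44.1 kHz.
t_lo, x_lo = tone(160)
t_hi, x_hi = tone(441)
y_lo = spectral_conv_1d(x_lo, weights)
y_hi = spectral_conv_1d(x_hi, weights)
```

Because the weights are indexed by mode number rather than by sample position, `y_lo` and `y_hi` are samples of the same continuous output, namely the input tone scaled by `abs(weights[3])` and phase-shifted by `np.angle(weights[3])`, even though the two grids differ in length. A time-domain convolution kernel with a fixed number of taps would instead cover a different duration at each sample rate.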
