Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks
Subject: Computer Science - Computer Vision and Pattern Recognition
Recent successful applications of convolutional neural networks (CNNs) to audio classification and speech recognition have motivated the search for better input representations for more efficient training. Visual displays of an audio signal, through various time-frequen...