With the widespread success of deep learning-based models in classification problems, researchers have increasingly applied these models to environmental sound classification (ESC) tasks in recent years. However, existing models that train deep neural networks on acoustic features such as the log-scaled mel spectrogram (Log mel), mel-frequency cepstral coefficients (MFCC), or raw waveforms achieve unsatisfactory performance on ESC. This paper first proposes a fusion of multiple features, the Log mel, the log-scaled cochleagram, and the log-scaled constant-Q transform, which are combined into a feature set called LMCC. It then presents CNN-GRUNN, a network that runs a convolutional neural network and a gated recurrent unit neural network in parallel, to improve ESC performance with the aggregated features. Experiments were conducted on the ESC-10, ESC-50, and UrbanSound8K datasets. The results indicate that LMCC fed into CNN-GRUNN is well suited to ESC problems: the model achieves good classification accuracy on all three datasets, namely ESC-10 (92.30%), ESC-50 (87.43%), and UrbanSound8K (96.10%).
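The parallel CNN/GRU arrangement described in the abstract can be sketched as follows. This is a minimal illustrative model, not the paper's configuration: the layer counts, channel widths, hidden sizes, and the assumption that the three fused features (Log mel, log-cochleagram, log-CQT) are resampled to a common shape and stacked as three input channels are all guesses for the sake of the example.

```python
import torch
import torch.nn as nn

class CNNGRUNN(nn.Module):
    """Sketch of a parallel CNN + GRU classifier for ESC.

    Input: a stacked feature map (batch, channels=3, bins, frames), where
    the 3 channels are Log mel, log-cochleagram, and log-CQT resampled to
    a common shape (the "LMCC" fusion). All layer sizes are illustrative,
    not taken from the paper.
    """
    def __init__(self, n_bins=128, n_classes=10):
        super().__init__()
        # CNN branch: learns local time-frequency patterns.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),            # -> (batch, 64, 1, 1)
        )
        # GRU branch: treats each time frame (all bins x 3 channels)
        # as one step of a sequence.
        self.gru = nn.GRU(input_size=3 * n_bins, hidden_size=64,
                          batch_first=True)
        # Classifier over the concatenated branch outputs.
        self.fc = nn.Linear(64 + 64, n_classes)

    def forward(self, x):                        # x: (B, 3, bins, frames)
        c = self.cnn(x).flatten(1)               # (B, 64)
        seq = x.permute(0, 3, 1, 2).flatten(2)   # (B, frames, 3*bins)
        _, h = self.gru(seq)                     # h: (1, B, 64)
        return self.fc(torch.cat([c, h[-1]], dim=1))

model = CNNGRUNN(n_bins=128, n_classes=10)
logits = model(torch.randn(4, 3, 128, 128))      # 4 fused feature maps
print(logits.shape)                              # torch.Size([4, 10])
```

Running the two branches in parallel, rather than stacking the GRU on top of the CNN, lets the recurrent branch see the full frame-level spectral detail while the CNN branch summarizes local time-frequency structure; the concatenated embedding is then classified jointly.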
Automatic Control and Computer Sciences – Springer Journals
Published: Jul 1, 2021
Keywords: environmental sound classification; feature fusion; convolutional neural network-gated recurrent unit neural network