Research Article
A Systematic Approach of Advanced Dilated Convolution Network for Speaker Identification
Author(s): Hema Kumar Pentapati1 and Sridevi K2
Published In: International Journal of Electrical and Electronics Research (IJEER), Volume 11, Issue 1
Publisher: FOREX Publication
Published: 05 February 2023
e-ISSN: 2347-470X
Page(s): 25-30
Abstract
Over the years, the speaker recognition field has faced persistent challenges in identifying speakers accurately. The advent of deep learning algorithms brought remarkable changes and has strongly influenced speaker recognition approaches. This paper introduces a simple, novel architecture based on an advanced dilated convolution network. The key idea is to feed a well-structured log-Mel spectrum into the proposed dilated convolutional neural network while reducing the number of layers to 11. The network uses global average pooling to accumulate the outputs of all layers into the feature-vector representation used for classification. Only 13 coefficients are extracted per frame of each speech sample. The proposed dilated convolutional neural network achieves an accuracy of 90.97%, an Equal Error Rate (EER) of 3.75%, and a training time of 207 seconds, outperforming existing systems on the LibriSpeech corpus.
Keywords: Log-Mel Spectrum, MFCC, Dilated Convolutional Neural Networks, Speaker Identification, Deep Learning.
Hema Kumar Pentapati*, Research Scholar, Department of EECE, GITAM School of Technology, Visakhapatnam, India; Email: hpentapa@gitam.in
Sridevi K, Associate Professor, Department of EECE, GITAM School of Technology, Visakhapatnam, India; Email: skataman@gitam.edu
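
The pipeline the abstract describes, 13 log-Mel/MFCC coefficients per frame fed into an 11-layer dilated convolutional network whose layer outputs are collapsed by global average pooling into a classification feature vector, can be illustrated with a minimal PyTorch sketch. The abstract does not give the paper's exact layer widths or dilation schedule, so the channel counts, dilation values, and speaker count below are assumptions for illustration only; the 11-layer depth, the 13-coefficient input, and the global-average-pooling head come from the abstract.

```python
import torch
import torch.nn as nn

class DilatedSpeakerNet(nn.Module):
    """Illustrative dilated CNN for speaker identification.

    A sketch of the abstract's architecture, not the paper's exact
    specification: 11 dilated conv layers, global average pooling,
    and a linear classifier over speaker labels.
    """
    def __init__(self, n_speakers, n_mfcc=13):
        super().__init__()
        layers = []
        in_ch = 1
        # 11 conv layers; the doubling dilation schedule and channel
        # widths are assumptions, only the depth of 11 is from the paper.
        dilations = [1, 1, 2, 2, 4, 4, 8, 8, 16, 16, 32]
        for i, d in enumerate(dilations):
            out_ch = 32 if i < 4 else 64
            layers += [
                # padding=d keeps the time-frequency map size constant
                nn.Conv2d(in_ch, out_ch, kernel_size=3,
                          dilation=d, padding=d),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(),
            ]
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        self.gap = nn.AdaptiveAvgPool2d(1)        # global average pooling
        self.classifier = nn.Linear(in_ch, n_speakers)

    def forward(self, x):
        # x: (batch, 1, n_mfcc, frames), e.g. 13 coefficients per frame
        h = self.features(x)
        v = self.gap(h).flatten(1)                # fixed-length feature vector
        return self.classifier(v)

# Usage: a batch of 8 utterances, 13 coefficients x 300 frames each,
# classified over a hypothetical set of 40 speakers.
model = DilatedSpeakerNet(n_speakers=40)
logits = model(torch.randn(8, 1, 13, 300))        # -> shape (8, 40)
```

Because the padded dilated convolutions preserve the input resolution, global average pooling yields a fixed-length embedding regardless of utterance length, which is what allows variable-duration speech samples to share one classifier.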