Journal of Acoustical Engineering Society of Iran, Year 10, Issue 2 (2023), 10(2): 11-20

Asgari M, Akbari N, Aghagolzade M, Mehrabikia M. Telephone robustness speaker verification using time delay neural network (Research Article). Journal of Acoustical Engineering Society of Iran 2023; 10(2): 11-20
URL: http://joasi.ir/article-1-231-en.html
Abstract:
In this research, a time delay neural network (TDNN) model with x-vector embeddings is presented to make speaker verification robust to the noise and frequency filtering introduced by telephone communication. MFCCs are used as the speaker-related audio features fed to the model. The output of the neural network is taken as the x-vector, which is then used in the decision stage, where PLDA performs scoring and comparison. To increase accuracy and reduce the equal error rate (EER), the training data combine the relatively clean VoxCeleb1 and VoxCeleb2 datasets with the Callhome telephone dataset, together with noisy and telephone-like data obtained through data augmentation. In the clean condition, the method achieves an EER of 3.09%. Compared with the baseline models, this is an improvement of about 0.15% in the worst case (previous works reported 3.24%) and 6.93% in the best case (previous works reported 10.2%). When training on the VoxCeleb1, VoxCeleb2, and Callhome datasets was used as adaptation, the EER was 4.95%. In the worst case, when only the VoxCeleb1 data are converted to telephone speech, the EER is 14.34%.
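The abstract reports all results as EER, so a minimal sketch of how that metric can be computed from verification scores may help. This is not the authors' evaluation code: the function name `equal_error_rate` is hypothetical, the scores below are toy values rather than PLDA scores from the paper, and the sketch assumes that a higher score means the two utterances are more likely to come from the same speaker.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) for verification scores.

    Assumption: higher score = more likely the same speaker.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Candidate thresholds: every observed score, in ascending order.
    thresholds = np.sort(np.concatenate([target, nontarget]))

    # False rejection rate: same-speaker trials scoring below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: different-speaker trials at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])

    # The EER is the operating point where both error rates are equal;
    # with finite data we take the threshold where they are closest.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Toy scores for illustration only (not results from the paper).
    same_speaker = [0.9, 0.8, 0.3, 0.7]
    different_speaker = [0.1, 0.2, 0.75, 0.4]
    eer, thr = equal_error_rate(same_speaker, different_speaker)
    print(f"EER = {eer:.2%} at threshold {thr}")
```

Reading the reported numbers through this definition, an EER of 3.09% means that at the chosen threshold roughly 3.09% of same-speaker trials are rejected and the same fraction of different-speaker trials are accepted.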
Type of Study: Research | Subject: Sonophysics
Received: 2021/12/15 | Accepted: 2022/12/28 | Published: 2023/03/19


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.