Improving the average precision of Persian vowel classification from speech signal by using convolutional neural network (Research Article)

Asgari, M.; Akbari, N.

Year 8, Issue 2, Journal of Acoustical Engineering Society of Iran (formerly Acoustical Engineering), 2021, 8(2): 51-59

‎ 20.1001.1.23455748.1399.8.2.3.0

Asgari M, Akbari N. Improving the average precision of Persian vowel classification from speech signal by using convolutional neural network (Research Article). Journal of Acoustical Engineering Society of Iran (formerly Acoustical Engineering). 2021; 8(2): 51-59
URL: http://joasi.ir/article-1-173-en.html

M. Asgari*, N. Akbari

Abstract:

One approach to speech recognition is to model speech based on a number of phonetic units. Because the frequency and temporal characteristics of vowels are more stable than those of other phonemes, recognizing vowels is important for distinguishing speech. The aim of this research is to present a model that uses modern methods, such as deep neural networks, to improve the accuracy of vowel recognition and broaden its applications. Thirty speakers (15 female and 15 male) read all combinations of consonants with the 6 Persian vowels. After preprocessing, the speech data is segmented into frames containing only the vowels, and their spectrograms are extracted. These spectrograms are given as input to a neural network with two hidden layers. Speech from 25 speakers was used for training and speech from 5 speakers was used for testing. The average accuracy over the 6 Persian vowels for the proposed model was 93.17% (an average vowel detection error of 6.83%). In previous work, the average vowel detection error ranged from 9.7% to 19.6%; the proposed model reduces this error by 2.87 to 12.77 percentage points.
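
The evaluation protocol and the reported error figures can be checked with a short sketch. It makes two assumptions that the abstract does not state: the speaker IDs (`F01` to `M15`) are hypothetical placeholders, and the 5 test speakers are drawn at random with a fixed seed, whereas the authors' actual selection is not described. The only point illustrated is that whole speakers are held out, so no test voice appears in training, and that the stated improvement follows from the stated accuracy.

```python
import random

# Hypothetical speaker IDs: the abstract reports 15 female and 15 male speakers.
speakers = [f"F{i:02d}" for i in range(1, 16)] + [f"M{i:02d}" for i in range(1, 16)]


def split_by_speaker(speaker_ids, n_test=5, seed=0):
    """Hold out whole speakers so that no test voice is seen in training.

    The random selection is an assumption; the paper does not say how
    the 5 test speakers were chosen.
    """
    rng = random.Random(seed)
    shuffled = list(speaker_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


def error_reduction(accuracy_pct, baseline_error_pct):
    """Absolute reduction in error rate, in percentage points."""
    proposed_error = 100.0 - accuracy_pct
    return round(baseline_error_pct - proposed_error, 2)


train_speakers, test_speakers = split_by_speaker(speakers)

# Figures taken from the abstract: 93.17% accuracy, prior errors of 9.7% and 19.6%.
proposed_error = round(100.0 - 93.17, 2)
reductions = [error_reduction(93.17, e) for e in (9.7, 19.6)]

print(len(train_speakers), len(test_speakers))
print(proposed_error, reductions)
```

Splitting by speaker rather than by utterance is what makes the reported accuracy a speaker-independent figure; a per-utterance split would let the model see each test voice during training.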

Keywords: Persian vowel recognition, Classification, Convolutional neural network, Persian vowel dataset.

Type of Study: Applicable | Subject: Signal Processing
Received: 2020/02/16 | Accepted: 2021/03/02 | Published: 2021/03/10

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.