International publications
Abstract: The increased need for foreign language learning, along with advances in speech technology, has heightened interest in computer-assisted pronunciation teaching (CAPT) applications. Herein, the automatic diagnosis of pronunciation errors is essential: it allows language learners to identify their mispronunciations and thus improve their oral skills. Meanwhile, the emergence of deep learning algorithms for speech processing has led to the use of deep neural networks at several stages of the mispronunciation detection and diagnosis (MDD) process. An overview of the state of the art in deep learning algorithms for mispronunciation diagnosis is therefore needed, and we performed a systematic literature review to provide one. This study aims to provide an overview of the recent use of deep neural networks for MDD. The review includes a thorough statistical analysis based on specific information extracted from 53 papers published between 2015 and 2023. It indicates that the diagnosis of pronunciation errors is a highly active area of research. Quite a few deep learning models and approaches have been proposed in this area, but important open issues and limitations remain to be addressed in future work.
Abstract: Computer-assisted language learning (CALL) systems are attracting growing interest and establishing a presence in automated foreign language learning. They enhance traditional learning methods by providing access to various accents and spoken language styles through websites, mobile applications, and social media. Herein, mispronunciation detection is a key component, mainly addressed as a classification problem. Meanwhile, deep learning (DL) advances have promoted these systems by training deep neural networks (DNN) to classify a pronunciation as correct or incorrect. However, the effectiveness of the DL models is hindered by many shortcomings, such as the scarcity of labeled data. To address this issue, the paper adopts an anomaly detection-based mispronunciation detection approach. It utilizes a variational autoencoder (VAE) relying on a density-based method to model the “normal data.” The VAE is a generative model trained in a self-supervised way to learn the distribution of the correct pronunciations, standing for “normal data,” and is expected to detect mispronunciations, standing for “abnormal data,” during the test stage. Our proposition was evaluated in the context of Arabic pronunciation learning through the ASMDD Arabic dataset. The obtained results are promising, with an accuracy of about 98%. The proposed VAE outperformed the standard autoencoder as well as the state-of-the-art convolutional neural networks used for Arabic mispronunciation detection.
Abstract: The world has become a global village, and mastering multiple languages has become necessary; therefore, the need for computer-assisted language learning (CALL) applications is growing. Speech being the most spontaneous and widespread mode of communication, learning pronunciation occupies a considerable place, and computer-assisted pronunciation teaching (CAPT) is increasingly integrated into CALL systems. Herein, pronunciation assessment is a critical component that aims to detect and diagnose mispronunciation to provide informative feedback for the learners. On the other hand, Arabic is among the top five languages to learn, but it is sorely lacking in resources for CAPT. This paper aims to bridge the gap between Arabic and high-resource languages in CAPT. As neither a review nor a survey on pronunciation assessment for Arabic exists, this paper provides an overview of the existing …
International conference papers
Abstract: In today’s interconnected world, humans live in a closely connected community, where it is essential to acquire proficiency in multiple languages to interact with others effectively. Therefore, research in computer-assisted language learning (CALL) is a dynamic study area, focusing on pronunciation mastery as the most challenging aspect. Pronunciation assessment is a keystone in computer-assisted pronunciation teaching (CAPT) systems. It aims to detect mispronunciations and provide informative feedback to learners. This task has been approached as a classification problem. Herein, the amount of available data is of great importance during the training stage. However, such datasets are likely to be imbalanced. This paper tackles the issue of imbalanced datasets by suggesting a semi-supervised approach using the one-class classification (OCC) method. We trained a convolutional neural network (CNN) to detect mispronounced words, with the CNN trained exclusively on well-pronounced ones. The experiments were conducted on the freely accessible Arabic speech mispronunciation detection dataset (ASMDD). The obtained results show an accuracy of about 84% on unseen pronunciations.