AI-Based Speech Analysis Framework

Abstract

The AI-Based Speech Analysis Framework presented in this paper was designed to help people overcome linguistic obstacles, particularly in English-language communication. Through thorough speech analysis, the framework's multimodal approach enables real-time evaluation of emotional state, fluency, stress levels, and speaker identity. By applying modern artificial-intelligence techniques, it delivers a nuanced and perceptive interpretation of spoken language, thereby supporting a more effective communication experience. The study pursues four sub-objectives, each advancing the main goal of greater self-awareness and improved communication. First, the framework performs emotional assessment by detecting subtle emotional indicators embedded in the voice. Its AI models identify emotional patterns such as enthusiasm, apprehension, or calm; this real-time analysis enables tailored communication strategies and a deeper understanding of the speaker's feelings. Second, the framework introduces a novel method for assessing fluency through voice analysis. It examines facets of speech such as pace, intonation, and word choice, giving language learners immediate feedback on their linguistic proficiency and enabling focused improvement toward effective communication. Third, the framework addresses the complex relationship between stress and effective communication. It measures stress levels through vocal-pattern analysis, shedding light on moments of heightened tension or anxiety while speaking; this knowledge helps people overcome stress-related barriers. Fourth, at the heart of the framework's innovation is its capacity to identify individuals accurately from their distinctive voice traits.
This identity-recognition capability is unaffected by language barriers and provides an effective, secure method of identification across diverse settings. Voice-based identification accelerates procedures and promotes inclusion in contexts ranging from workplaces to public services. The culmination of this research is an AI-Based Speech Analysis Framework that opens fresh perspectives on language evaluation and communication improvement. By merging emotional analysis, fluency assessment, stress analysis, and identity recognition through voice, it not only encourages self-improvement but also highlights the transformative potential of AI in redefining language landscapes and fostering genuine connections.
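The paper does not include an implementation, but the kind of signal the fluency- and stress-analysis modules described above would consume can be illustrated with a minimal sketch. The framing scheme, frame sizes, and autocorrelation pitch tracker below are illustrative assumptions (not the authors' method), applied to a synthetic 150 Hz tone standing in for voiced speech; real systems would add voicing detection and richer features such as MFCCs.

```python
# Hypothetical sketch: short-time energy and pitch, two basic prosodic
# features underlying fluency and stress analysis. NumPy only.
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a waveform into overlapping analysis frames."""
    n = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def pitch_autocorr(frame, sr, fmin=80.0, fmax=400.0):
    """Estimate fundamental frequency of one frame via autocorrelation."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)      # search plausible voice lags
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

sr = 16000
t = np.arange(sr) / sr                            # 1 second of "audio"
x = np.sin(2 * np.pi * 150 * t)                   # synthetic 150 Hz voicing

frames = frame_signal(x, frame_len=400, hop=160)  # 25 ms frames, 10 ms hop
energy = (frames ** 2).mean(axis=1)               # short-time energy contour
f0 = np.array([pitch_autocorr(f, sr) for f in frames])

print(np.median(f0))                              # close to the true 150 Hz
print(energy.shape[0])                            # one value per frame
```

Energy variability over time is one crude proxy for vocal tension, and frame-level pitch contours feed intonation analysis; both are standard low-level descriptors, not claims about the paper's pipeline.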
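The identity-recognition objective can likewise be sketched in its simplest form: reduce each utterance to a fixed-length "voiceprint" vector and compare vectors by cosine similarity. Everything here is a hypothetical illustration — the averaged log-spectrum embedding, the synthetic "speakers", and the function names are assumptions; production systems use learned speaker embeddings rather than raw spectra.

```python
# Hypothetical sketch: speaker comparison via cosine similarity of crude
# spectral voiceprints, using synthetic signals with distinct harmonics.
import numpy as np

def voiceprint(x, n_fft=512):
    """Average log-magnitude spectrum over frames -- a crude embedding."""
    frames = x[: len(x) // n_fft * n_fft].reshape(-1, n_fft)
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
    return np.log1p(spec).mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sr = 16000
t = np.arange(2 * sr) / sr

def speak(f0, seed):
    """Two seconds of a toy 'speaker': two harmonics plus noise."""
    noise = np.random.default_rng(seed).normal(0, 0.05, t.size)
    return np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t) + noise

alice_1, alice_2, bob = speak(120, 1), speak(120, 2), speak(210, 3)

same = cosine(voiceprint(alice_1), voiceprint(alice_2))
diff = cosine(voiceprint(alice_1), voiceprint(bob))
print(same > diff)   # same-speaker similarity exceeds cross-speaker
```

The design point this illustrates is that identification reduces to a distance comparison in embedding space, which is why it is language-independent: the embedding captures vocal characteristics, not words.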

Country: Sri Lanka

¹W.A.D. Perera, ²Mr. Jeewaka Perera, ³Mr. Tharaniyawarma K.

  1. Faculty of Computing, Sri Lanka Institute of Information Technology, Sri Lanka
  2. Faculty of Computing, Sri Lanka Institute of Information Technology, Sri Lanka
  3. Faculty of Computing, Sri Lanka Institute of Information Technology, Sri Lanka

IRJIET, Volume 8, Issue 1, January 2024 pp. 94-104

DOI: https://doi.org/10.47001/IRJIET/2024.801013
