EMOSENSE – Multi-Modal Emotion Recognition to Identify Emotions

Abstract

Extensive research over the past years has sought to better understand human emotions, and technology that recognizes and reacts to them has become an essential component of society. In this study we present a fully functional multi-modal emotion recognition system that integrates data from text, voice, facial expressions, and body language. The automatic classification of anger, fear, joy, sadness, surprise, disgust, and neutral emotions from these four modalities has been studied on the TESS, MELD, FER2013, and EDNLP datasets. A Random Forest classifier is used to classify emotions from body language, a pre-trained VGG16 model for facial emotion classification, logistic regression for text emotion classification, and a CNN for voice emotion classification. The logistic regression model leverages natural language processing (NLP) techniques to extract emotions from textual data. The CNN-based voice model applies speech recognition and emotion recognition algorithms to analyze audio signals and detect emotional cues in the speaker's voice. The facial expression model combines the pre-trained VGG16 network with modified convolutional layers to detect emotions. Meanwhile, the Random Forest classifier captures and interprets non-verbal cues, such as gestures, posture, and overall body movement, to enrich the emotion detection process. The real strength of the proposed system lies in its ability to synergistically combine information from multiple modalities.
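The abstract states that the system's strength is combining information from multiple modalities, but does not detail the fusion mechanism. A minimal sketch of one common approach, late fusion by weighted averaging of each modality's class-probability vector, is shown below; the emotion ordering, function name, and equal default weights are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: late fusion of per-modality emotion probabilities.
# Each modality model (text, voice, face, body) is assumed to output a
# probability vector over the seven emotion classes from the paper.
EMOTIONS = ["anger", "fear", "joy", "sadness", "surprise", "disgust", "neutral"]

def fuse(modality_probs, weights=None):
    """Weighted average of probability vectors from each modality.

    modality_probs: list of length-7 probability vectors, one per modality.
    weights: optional per-modality reliability weights (default: equal).
    Returns the predicted emotion label and the fused probability vector.
    """
    if weights is None:
        weights = [1.0] * len(modality_probs)
    total = sum(weights)
    fused = [
        sum(w * p[i] for w, p in zip(weights, modality_probs)) / total
        for i in range(len(EMOTIONS))
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused

# Example: text and voice models both lean toward "joy".
label, fused = fuse([[0.1, 0, 0.9, 0, 0, 0, 0],
                     [0.2, 0, 0.8, 0, 0, 0, 0]])
```

With equal weights this reduces to a simple average; per-modality weights could instead be tuned on validation accuracy so that stronger modalities dominate the fused decision.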

Country: Sri Lanka

De Silva J.A.D.P.R.¹, Lanka P.A.C.², Jayawardena R.D.T.M.³, Nandakumara K.S.S.⁴, Lakmini Abeywardhana⁵, Dilshan De Silva⁶

  1–6. Faculty of Computing (FoC), Sri Lanka Institute of Information Technology (SLIIT), Malabe, Sri Lanka

IRJIET, Volume 7, Issue 10, October 2023 pp. 428-436

https://doi.org/10.47001/IRJIET/2023.710057
