Recognition of Facial and Voice Emotional States Using Deep-BEL Model

Document Type: Computer Article

Authors

Department of Computer Engineering, Fouman and Shaft Branch, Islamic Azad University, Fouman, Iran

Abstract

In recent years, emotion recognition has attracted considerable research attention as a means of natural human-computer interaction. Because automatic emotion recognition from speech or facial expressions alone is subject to uncertainty, fusing audio-visual information is expected to yield higher accuracy. This article presents an effective hybrid method for recognizing emotion from emotional speech, visible-light facial-expression images, and infrared facial images. In the proposed model, a deep learning model represents the audio-visual features, and a brain emotional learning (BEL) model, inspired by the limbic system of the brain, fuses the information from the three modalities. Experiments were conducted on eNTERFACE'05, an established audio-visual database for multimodal emotion recognition. The best-case recognition accuracy of the presented model on this database is 94.20%, the highest among the compared fusion methods.
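
To make the fusion stage concrete, the following is a minimal Python sketch of one brain emotional learning unit in the standard Moren-Balkenius formulation, driven by the concatenation of per-modality deep features. Everything here is an illustrative assumption rather than the paper's exact Deep-BEL design: the BELFusion name, the learning rates, the thalamic max input, and the one-unit-per-emotion-class setup.

    import numpy as np

    class BELFusion:
        """One BEL unit: an amygdala path that learns stimulus-reward
        associations and an orbitofrontal path that inhibits over-responding
        (standard Moren-Balkenius update rules; a sketch, not the paper's model)."""

        def __init__(self, n_inputs, alpha=0.1, beta=0.05):
            self.v = np.zeros(n_inputs + 1)  # amygdala weights (+1 thalamic input)
            self.w = np.zeros(n_inputs)      # orbitofrontal (inhibitory) weights
            self.alpha = alpha               # amygdala learning rate
            self.beta = beta                 # orbitofrontal learning rate

        def _paths(self, s):
            s_amy = np.append(s, s.max())    # thalamic shortcut carries max stimulus
            return s_amy, s_amy * self.v, s * self.w  # A_i = s_i*v_i, O_i = s_i*w_i

        def predict(self, s):
            _, a, o = self._paths(s)
            return a.sum() - o.sum()         # model output E = sum(A) - sum(O)

        def update(self, s, reward):
            s_amy, a, o = self._paths(s)
            e = a.sum() - o.sum()
            # Amygdala weights only grow: acquired associations are not unlearned.
            self.v += self.alpha * s_amy * max(0.0, reward - a.sum())
            # Orbitofrontal weights track the output-reward mismatch and correct it.
            self.w += self.beta * s * (e - reward)
            return e

    # Hypothetical usage: one unit per emotion class over concatenated
    # audio, visible-face, and infrared-face feature vectors.
    rng = np.random.default_rng(0)
    f_audio, f_face, f_ir = rng.random(8), rng.random(8), rng.random(8)
    s = np.concatenate([f_audio, f_face, f_ir])
    units = [BELFusion(s.size) for _ in range(6)]          # six basic emotions
    for _ in range(30):
        for c, unit in enumerate(units):
            unit.update(s, reward=1.0 if c == 3 else 0.0)  # pretend true class is 3
    print(int(np.argmax([u.predict(s) for u in units])))   # prints 3 after training

The design point this sketch illustrates is that BEL fusion is a lightweight, reward-driven combiner: per-modality features enter as stimuli, and the amygdala/orbitofrontal interplay weighs them against a class-specific reward signal rather than through a learned dense layer.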




Articles in Press, Accepted Manuscript
Available Online from 06 December 2025
  • Receive Date: 09 December 2023
  • Revise Date: 22 September 2025
  • Accept Date: 06 December 2025