Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition

Wenliang Dai

Zihan Liu

Tiezheng Yu

Pascale Fung

Despite the recent achievements made in the multimodal emotion recognition task, two problems still exist and have not been well investigated: 1) the relationships between different emotion categories are not utilized, which leads to sub-optimal performance; and 2) current models fail to cope well with low-resource emotions, especially unseen emotions. In this paper, we propose a modality-transferable model with emotion embeddings to tackle the aforementioned issues. We use pre-trained word embeddings to represent emotion categories for textual data. Then, two mapping functions are learned to transfer these embeddings into the visual and acoustic spaces. For each modality, the model calculates the representation distance between the input sequence and the target emotions and makes predictions based on these distances. By doing so, our model can directly adapt to unseen emotions in any modality, since we have their pre-trained embeddings and the modality mapping functions. Experiments show that our model achieves state-of-the-art performance on most of the emotion categories. In addition, our model also outperforms existing baselines in the zero-shot and few-shot scenarios for unseen emotions.
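The sketch below illustrates the distance-based prediction scheme described above: emotion categories are represented by pre-trained word vectors, two learned mapping functions project them into the visual and acoustic spaces, and each modality is scored by its similarity to every emotion vector. This is a minimal illustration, not the authors' released code; the class name, the LSTM encoders, the feature dimensions, cosine similarity as the distance measure, and score averaging as the fusion step are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DistanceBasedEmotionModel(nn.Module):
    """Hypothetical sketch of distance-based multimodal emotion prediction."""

    def __init__(self, emo_word_emb, text_dim=300, visual_dim=35, acoustic_dim=74):
        super().__init__()
        # Pre-trained (e.g. GloVe) vectors of the emotion-category words, kept frozen.
        self.register_buffer("emo_emb", emo_word_emb)          # (num_emotions, text_dim)
        # Two learned mapping functions: textual emotion space -> visual / acoustic spaces.
        self.to_visual = nn.Linear(text_dim, visual_dim)
        self.to_acoustic = nn.Linear(text_dim, acoustic_dim)
        # Simple per-modality sequence encoders (LSTMs used here as stand-ins).
        self.text_enc = nn.LSTM(text_dim, text_dim, batch_first=True)
        self.visual_enc = nn.LSTM(visual_dim, visual_dim, batch_first=True)
        self.acoustic_enc = nn.LSTM(acoustic_dim, acoustic_dim, batch_first=True)

    @staticmethod
    def _scores(encoder, seq, emo_vectors):
        _, (h, _) = encoder(seq)                                # h: (1, batch, dim)
        rep = h.squeeze(0)                                      # (batch, dim)
        # Cosine similarity between the sequence representation and every emotion
        # vector; a higher similarity means a smaller representation distance.
        return F.cosine_similarity(rep.unsqueeze(1), emo_vectors.unsqueeze(0), dim=-1)

    def forward(self, text, visual, acoustic):
        s_t = self._scores(self.text_enc, text, self.emo_emb)
        s_v = self._scores(self.visual_enc, visual, self.to_visual(self.emo_emb))
        s_a = self._scores(self.acoustic_enc, acoustic, self.to_acoustic(self.emo_emb))
        # Average the per-modality similarity scores. An unseen emotion can be
        # scored zero-shot by appending its word embedding to `emo_emb`.
        return (s_t + s_v + s_a) / 3.0
```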

We use the CMU-Multimodal SDK (https://github.com/A2Zadeh/CMU-MultimodalSDK) to download and pre-process the datasets. It performs data alignment and early-stage feature extraction for each modality. The textual data are tokenized at the word level and represented with GloVe embeddings (Pennington et al., 2014). Facial action units, a commonly used type of feature for facial expression recognition, are extracted by Facet (iMotions, 2017) to capture muscle movements and expressions. For acoustic data, COVAREP is used to extract fundamental features such as mel-frequency cepstral coefficients (MFCCs), pitch tracking, and glottal source parameters.
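A hedged sketch of this preprocessing step with the CMU-Multimodal SDK is shown below: the per-modality computational sequences are loaded and then aligned to the word-level text stream. The local .csd file paths and the recipe keys are illustrative assumptions; the released download recipes and exact call signatures should be checked against the SDK repository linked above.

```python
from mmsdk import mmdatasdk

# Recipe mapping each computational-sequence name to a local .csd file
# (file names here are assumptions, not the SDK's canonical recipe).
recipe = {
    "glove_vectors": "data/CMU_MOSEI_TimestampedWordVectors.csd",  # GloVe word embeddings
    "FACET": "data/CMU_MOSEI_VisualFacet.csd",                     # facial action units
    "COVAREP": "data/CMU_MOSEI_COVAREP.csd",                       # acoustic features
}

dataset = mmdatasdk.mmdataset(recipe)

# Word-level alignment: visual and acoustic frames are grouped to the time
# span of each word in the transcript.
dataset.align("glove_vectors")
```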
