Multimodal Language Analysis with Recurrent Multistage Fusion

Abstract: Computational modeling of human multimodal language is an emerging research area in natural language processing spanning the language, visual, and acoustic modalities. Comprehending multimodal language requires modeling not only the interactions within each modality (intra-modal interactions) but, more importantly, the interactions between modalities (cross-modal interactions). In this paper, we propose the Recurrent Multistage Fusion Network (RMFN), which decomposes the fusion problem into multiple stages, each focused on a subset of multimodal signals for specialized, effective fusion. Cross-modal interactions are modeled using this multistage fusion approach, which builds upon the intermediate representations of previous stages. Temporal and intra-modal interactions are modeled by integrating our proposed fusion approach with a system of recurrent neural networks. The RMFN displays state-of-the-art performance in modeling human multimodal language across three public datasets relating to multimodal sentiment analysis, emotion recognition, and speaker trait recognition. We provide visualizations to show that each stage of fusion focuses on a different subset of multimodal signals, learning increasingly discriminative multimodal representations.
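The multistage idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function `multistage_fusion`, the softmax-based highlighting, the weights shared across stages, the random untrained parameters, and all dimensions are assumptions chosen only for illustration. It covers a single time step and leaves out the recurrent networks that the paper uses for temporal and intra-modal modeling.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(x):
    """Numerically stable softmax over a list of floats."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols, scale=0.5):
    """Untrained placeholder weights (hypothetical; a real model learns these)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def multistage_fusion(language, visual, acoustic, num_stages=3, fused_dim=4, seed=0):
    """Fuse three per-modality feature vectors over several stages.

    Each stage (1) highlights a subset of the concatenated signals with a
    soft attention vector and (2) fuses the highlighted signals with the
    intermediate representation produced by the previous stage.
    """
    rng = random.Random(seed)
    h = list(language) + list(visual) + list(acoustic)
    n = len(h)

    # Assumption: the same weights are reused at every stage.
    W_highlight = random_matrix(rng, n, n + fused_dim)
    W_fuse = random_matrix(rng, fused_dim, n + fused_dim)
    W_summary = random_matrix(rng, fused_dim, fused_dim)

    fused = [0.0] * fused_dim  # no earlier stage exists before stage 1
    attentions = []
    for _ in range(num_stages):
        # Highlight: attention over signals, conditioned on the previous stage.
        attention = softmax(matvec(W_highlight, h + fused))
        highlighted = [a * v for a, v in zip(attention, h)]
        # Fuse: combine highlighted signals with the previous representation.
        fused = [math.tanh(v) for v in matvec(W_fuse, highlighted + fused)]
        attentions.append(attention)

    # Summarize: map the final stage output to a cross-modal representation.
    summary = [math.tanh(v) for v in matvec(W_summary, fused)]
    return summary, attentions


summary, attentions = multistage_fusion([0.2, -0.1], [0.5, 0.3], [-0.4, 0.1])
```

Inspecting `attentions` stage by stage gives a rough analogue of the visualizations mentioned in the abstract, although with untrained weights the values carry no meaning.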



