MECG: modality-enhanced convolutional graph for unbalanced multimodal representations


Research field: Databases



Venue: The Journal of Supercomputing


Abstract

In multimodal sentiment analysis, modeling the relationships between modalities and fusing them is challenging. A central problem is the imbalance of sentiment representation and distribution across modalities, which causes the fusion process to deviate from the multimodal sentiment-semantic space. We propose MECG, a novel fusion framework based on graph convolutional neural networks that provides an efficient approach to fusing unaligned multimodal sequences. First, a multimodal enhancement module uses the text modality to enhance the visual and acoustic modalities, yielding more discriminative representations that assist the subsequent aggregation process. Second, we construct text-driven multimodal feature graphs for modality fusion, which effectively handle the imbalance among modalities during graph convolution aggregation. Finally, we integrate the fused information extracted by MECG into the verbal representation, dynamically shifting the original word representations toward the most accurate multimodal sentiment-semantic space. Our model demonstrates its effectiveness and superiority on two publicly available datasets: CMU-MOSI and CMU-MOSEI.
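To make the graph-convolution aggregation step more concrete, the following is a minimal sketch of one generic GCN layer applied to a three-node modality graph. It is not the authors' implementation: the abstract does not specify the graph construction, so the star-shaped "text-driven" adjacency, the toy feature vectors, the identity weight matrix, and all function names here are assumptions chosen purely for illustration.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} used by a standard GCN layer."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    aggregated = matmul(normalize_adjacency(adj), features)
    projected = matmul(aggregated, weight)
    return [[max(0.0, v) for v in row] for row in projected]


# Hypothetical toy graph: node 0 = text, node 1 = visual, node 2 = acoustic.
# Assumption: "text-driven" is modeled as a star graph centered on the text node.
ADJ = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
FEATURES = [
    [1.0, 0.0],  # text
    [0.0, 1.0],  # visual
    [0.5, 0.5],  # acoustic
]
WEIGHT = [[1.0, 0.0], [0.0, 1.0]]  # identity, so only the aggregation is visible

fused = gcn_layer(ADJ, FEATURES, WEIGHT)
for name, row in zip(("text", "visual", "acoustic"), fused):
    print(name, [round(v, 3) for v in row])
```

In this sketch the visual and acoustic nodes exchange information only through the text node, which is one simple way to let the text modality steer the aggregation; the actual MECG graphs may be built differently.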


Authors

Jiajia Tang
College of Computer Science, Hangzhou Dianzi University, No.2 Street, Hangzhou 310000, China

Binbin Ni
College of Computer Science, Hangzhou Dianzi University, No.2 Street, Hangzhou 310000, China

Yutao Yang
College of Computer Science, Hangzhou Dianzi University, No.2 Street, Hangzhou 310000, China

Paper information

Publication year: 2024
Citations: 0
Publication country: Anguilla, China
Site: Springer
