ITRT (IT Research Trends)

Advancing multimodal emotion recognition in big data through prompt engineering and deep adaptive learning

Research field: Databases

Paper keywords: #computational #computing #affective #noisy #enhanced

Venue: Journal of Big Data

Abstract

Emotion recognition in dynamic, real-world environments is challenging because multimodal data are complex and variable. This paper introduces a Multimodal Emotion Recognition (MER) framework that integrates text, audio, video, and motion data. To address challenges such as class imbalance, the framework employs Generative Adversarial Networks (GANs) for synthetic sample generation and Dynamic Prompt Engineering (DPE) for enhanced feature extraction across modalities. Text features are processed with Mistral-7B, audio with HuBERT, video with TimeSformer and LLaVA, and motion with MediaPipe Pose. The system fuses these inputs using Hierarchical Attention-based Graph Neural Networks (HAN-GNN) and Cross-Modality Transformer Fusion (XMTF), and applies contrastive learning with Prototypical Networks to improve class separation. The framework reports training accuracies of 99.92% on IEMOCAP and 99.95% on MELD, with testing accuracies of 99.82% and 99.81%, respectively, along with high precision, recall, and specificity. Although trained on batch-processed datasets, the framework has been optimized for real-time use: training completes in 5 min and inference takes under 0.4 ms per sample, which makes it well suited to real-time emotion recognition tasks. It also generalizes to noisy and multilingual settings, achieving strong results on SAVEE and CMU-MOSEAS. This research offers a scalable and efficient MER solution for affective computing, and the findings emphasize the importance of refining such systems for real-world applications, particularly in complex, multimodal big data environments.
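
The abstract mentions Prototypical Networks as the mechanism for improving class separation but gives no implementation details. As a rough illustration of the general idea only (not the authors' code), the sketch below computes one prototype per emotion class as the mean of its embeddings and classifies a query by a softmax over negative squared Euclidean distances. The function names, the two-dimensional "fused" embeddings, and the labels are all hypothetical placeholders.

```python
import math


def class_prototypes(embeddings, labels):
    """Return {label: mean embedding} over a labelled support set."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(query, prototypes):
    """Softmax over negative squared distances from query to each prototype."""
    names = list(prototypes)
    logits = [
        -sum((q - p) ** 2 for q, p in zip(query, prototypes[name]))
        for name in names
    ]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = {name: e / total for name, e in zip(names, exps)}
    return max(probs, key=probs.get), probs


# Toy stand-ins for fused multimodal embeddings (assumed values, not real data).
support = [[0.9, 0.1], [1.1, -0.1], [-1.0, 0.2], [-0.8, -0.2]]
support_labels = ["happy", "happy", "sad", "sad"]

protos = class_prototypes(support, support_labels)
label, probs = classify([0.7, 0.0], protos)
print(label, probs)
```

In a real system the embeddings would come from the fusion stage, and the distances would typically feed a training loss rather than being used only at inference time; both of those parts are outside what the abstract specifies.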

📄 Paper information

Publication year	2025
Citations	0
Country of publication	Andorra
Site	Springer

Abeer A. Wafa

Mai M. Eldefrawi

Marwa S. Farhan

Related papers (89)
