Research field: Infrastructure
Venue: Multimedia Systems
Automated fact-checking is crucial for limiting the effect of misleading information in an era of rapidly growing social networks. However, fact-checking is challenging: a claim can have multiple pieces of evidence, and the information in that evidence is diverse and complex, so an efficient method is needed to combine and process it into useful insight for verifying the claim's truthfulness. This paper proposes MCVE, a fusion framework that combines a textual claim with evidence in both image and text form to solve the claim verification task and to generate a ruling sentence for the input claim. We also conduct experiments with different pre-trained text and image models to investigate their impact on extracting useful features. The experimental results show that MCVE, with multimodal fusion from different encoders, an attention mechanism, and convolutional modules, is effective for automated fact-checking. On claim verification, evaluated on the Mocheg and FACTIFY datasets, it improves by approximately 2% over the CLIP baseline on Mocheg and by 15% over the ResNet50+SBERT baseline on FACTIFY. On the explanation task, it improves the generation model's performance by approximately 2–3% over the BART-large baseline on Mocheg. In addition, MCVE achieves results comparable to LLMs such as LLaVA on the claim verification task with a smaller model size. Finally, ablation studies show that integrating text and image features boosts the performance of the MCVE framework on the fact-checking task.
| Publication year | 2025 |
|---|---|
| Citations | 0 |
| Country of publication | Vietnam, Andorra |
| Site | Springer |