ITRT(IT Research Trends)

Multi-level video captioning method based on semantic space

Research field: Strategies

Keywords: #graph #interactive #experiments #datasets #video

Venue: Multimedia Tools and Applications

Abstract

Video captioning aims to generate natural language descriptions of video content. Traditional methods extract visual features and features of the interactive relationships between objects, but they ignore the problems of video feature isolation and semantic hierarchy. This paper proposes a Multi-Level Video Captioning Method based on semantic space (S-MLM) to address these problems. S-MLM extracts visual elements and visual relationships at different levels and aggregates this visual information layer by layer, building high-level visual features from low-level ones. A multi-level structured semantic graph is constructed from a semantic perspective. The method does not rely on external knowledge bases; it uses its own information as guidance to enhance feature representation and improve semantic understanding. Experiments on the MSVD and MSR-VTT datasets show that the method further improves video captioning performance.

📄 Paper information

Publication year	2024
Citations	0
Country of publication	British Indian Ocean Territory, China
Site	Springer

Xiao Yao

Yuanlin Zeng

Min Gu

Ruxi Yuan

Jie Li

Junyi Ge

Related papers (47)