Dynamical semantic enhancement network for continuous sign language recognition


Research field: Strategies



Venue: Multimedia Systems


Abstract

In sign language recognition, semantic information is conveyed primarily through facial and hand gestures, and interpreting it effectively poses significant challenges. Previous methods often struggle to capture semantic areas while also accurately assessing the varying importance of different motion frames, which hampers recognition accuracy. We propose the Dynamical Semantic Enhancement (DSE) Network, which integrates the Long-Short Dependence Attention (LSDA) and Global Interaction Conv2d (GIConv2d) to address these challenges. LSDA forms long- and short-range spatial dependencies by coupling large-kernel convolutions with small-kernel convolutions, effectively capturing facial and hand semantic content simultaneously. Meanwhile, GIConv2d adaptively learns motion semantic content by dynamically generating calibrated weights, focusing on reasoning about the frame-level contributions of rapid-movement frames and static frames. Our DSE achieves competitive performance on three widely used datasets: PHOENIX14, PHOENIX14-T, and CSL-Daily. Additionally, visualization experiments confirm DSE’s superior capability in reinforcing semantic extraction both spatially and temporally.


Suyang Wang

School of Computer Science and Engineering, Tianjin University of Technology, 391 Binshui West Road, Tianjin 300384, Tianjin, China

Leming Guo

School of Computer Science and Engineering, Tianjin University of Technology, 391 Binshui West Road, Tianjin 300384, Tianjin, China

Wanli Xue

School of Computer Science and Engineering, Tianjin University of Technology, 391 Binshui West Road, Tianjin 300384, Tianjin, China


Paper information

Publication year: 2024
Citations: 0
Publication country: Andorra
Site: Springer
