Learning Common and Specific Visual Prompts for Domain Generalization


Research field: Verification



Conference: Asian Conference on Computer Vision


Abstract

Although fine-tuning a pre-trained large-scale model has become an effective method for domain generalization, domain shifts still pose a major challenge when transferring models to unseen test domains. In this paper, we study how to effectively adapt pre-trained vision Transformers to domain generalization problems in image classification. To this end, we propose a novel Common-Specific Visual Prompt Tuning (CSVPT) method for transferring large-scale vision Transformer models to unknown test domains. Unlike existing methods, which learn fixed visual prompts for each task, CSVPT jointly learns two kinds of prompts: domain-common prompts, which capture the task context, and sample-specific prompts, which capture information about the data distribution and are generated for each sample by a trainable prompt-generating module (PGM). Because they combine domain-common and sample-specific prompts, the visual prompts learned by CSVPT are conditioned on each input sample rather than fixed once learned, which helps out-of-distribution generalization. Extensive experiments demonstrate the effectiveness of CSVPT, and CSVPT with the ViT-L/14 backbone achieves state-of-the-art (SOTA) performance on five widely used benchmark datasets.
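
The difference between fixed and input-conditioned prompts can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not describe the PGM architecture, so the single linear map over mean-pooled patch embeddings, the token ordering, and every name and size below (`prompt_generating_module`, `build_input_sequence`, `D`, `N_COMMON`, `N_SPECIFIC`) are assumptions made purely for illustration.

```python
import random

random.seed(0)

D = 4            # embedding width (toy value, assumed)
N_COMMON = 2     # number of domain-common prompts (assumed)
N_SPECIFIC = 2   # number of sample-specific prompts (assumed)


def rand_matrix(rows, cols):
    """Small random initialisation standing in for trainable parameters."""
    return [[random.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


# Domain-common prompts: one parameter table shared by every input.
common_prompts = rand_matrix(N_COMMON, D)

# Hypothetical PGM parameters: a linear map from a D-dim summary of the
# sample to N_SPECIFIC * D values (the real PGM design is not given here).
pgm_weights = rand_matrix(D, N_SPECIFIC * D)


def prompt_generating_module(patch_tokens):
    """Generate sample-specific prompts from one sample's patch embeddings."""
    n = len(patch_tokens)
    # Mean-pool the patch embeddings into a single summary vector.
    summary = [sum(tok[d] for tok in patch_tokens) / n for d in range(D)]
    # Apply the linear map, then split the result into N_SPECIFIC prompts.
    flat = [
        sum(summary[i] * pgm_weights[i][j] for i in range(D))
        for j in range(N_SPECIFIC * D)
    ]
    return [flat[k * D:(k + 1) * D] for k in range(N_SPECIFIC)]


def build_input_sequence(cls_token, patch_tokens):
    """Assemble [CLS] + common prompts + specific prompts + patch tokens."""
    specific = prompt_generating_module(patch_tokens)
    return [cls_token] + common_prompts + specific + patch_tokens


# Two toy "images", each represented by three patch embeddings.
sample_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
sample_b = [[2.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
cls = [0.0] * D

seq_a = build_input_sequence(cls, sample_a)
seq_b = build_input_sequence(cls, sample_b)
```

In this sketch the common prompts in `seq_a` and `seq_b` are identical, while the specific prompts differ because they are computed from each sample. In an actual prompt-tuning setup the backbone would typically stay frozen and only the prompt parameters and the PGM would be trained; that training loop is omitted here.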


Author Profile
Aodi Li

University of Science and Technology of China Hefei 230026 China

Andorra
Author Profile
Liansheng Zhuang

University of Science and Technology of China Hefei 230026 China

Andorra
Author Profile
Shuo Fan

University of Science and Technology of China Hefei 230026 China

Andorra

📄 Paper Information

Publication year: 2023
Citations: 0
Publication country: Andorra, China
Site: Springer

Related papers (43)