Single-Image Driven 3D Viewpoint Training Data Augmentation for Effective Label Recognition


Research field: Artificial Intelligence



Conference: International Conference on Pattern Recognition


Abstract

To address the shortage of training data in complex image recognition, this paper introduces a novel 3D viewpoint transformation technique initially tailored for label recognition. The technique can be used not only to augment data by generating synthetic views from any perspective but also to transform photos taken from any angle into a frontal view, thereby reducing the complexity of the recognition task. Given the extensive use of wine-related applications, with over 20 million users, and the continuous publication of wine label datasets, we focus this study on wine labels. The method improves deep learning model performance by generating visually realistic training samples from a single real-world label image, overcoming the challenges posed by intricate combinations of text and logos. Unlike classical Generative Adversarial Network (GAN) methods, which fall short in synthesizing such intricate content and require a large amount of training data to become effective, our solution relies on time-tested computer vision and image processing strategies. From a single monocular wine label image, we can expand the training dataset and broaden the range of training samples available for deep learning, circumventing the constraints of limited training resources. We then train a Vision Transformer (ViT) on the augmented images to perform one-shot recognition both of wine labels that belong to the training classes and of newly collected labels unavailable during training. Experimental results show a significant increase in recognition accuracy over conventional 2D data augmentation techniques, indicating potential for broader application in other labeling scenarios.
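
The abstract does not spell out how the viewpoint transformation is computed, so the following is only a minimal sketch of the general idea under a strong simplifying assumption: the label is treated as a flat plane, so a single 3x3 homography relates the photographed view to the frontal view. Real wine labels wrap around a cylindrical bottle, which the paper's 3D method presumably accounts for and this sketch does not. The function names and the corner coordinates are hypothetical and chosen for illustration.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography mapping four src points onto four dst points.

    Uses the direct linear transform (DLT): each correspondence contributes
    two linear equations in the nine entries of H.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # H is the null vector of A: the right singular vector belonging to
    # the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Fix the arbitrary scale (assumes H[2, 2] is not zero, which holds
    # for the non-degenerate example below).
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical label corners in an oblique photo (pixels), listed as
# top-left, top-right, bottom-right, bottom-left.
corners_photo = [(12, 30), (210, 8), (225, 300), (5, 270)]
# Target corners of a 200x300 frontal view, in the same order.
corners_front = [(0, 0), (200, 0), (200, 300), (0, 300)]

# Photo -> frontal view (rectification before recognition).
H = estimate_homography(corners_photo, corners_front)
# Frontal view -> photo-like view (the direction used for augmentation).
H_inv = np.linalg.inv(H)
```

Under this planar assumption, the same matrix serves both uses named in the abstract: `H` rectifies an oblique photo toward a frontal view, while a randomly perturbed set of target corners (not shown) would yield new synthetic viewpoints from one frontal image.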


Authors

Yueh-Cheng Huang
Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Hsin-Yi Chen
Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Cheng-Jui Hung
Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Paper Information

Publication year: 2024
Citations: 0
Country of publication: Taiwan, Andorra
Site: Springer
