Semantic scene segmentation for indoor autonomous vision systems: leveraging an enhanced and efficient U-NET architecture


Research field: Artificial Intelligence



Venue: Multimedia Tools and Applications


Abstract

Advances in indoor autonomous vision systems (IAVSs) underscore the need to close the gap between their capabilities and human perception of real-world scenes. This paper introduces EADFL-UNet, a novel semantic segmentation framework based on the U-Net architecture. It uses EfficientNetB3 as the encoder for improved feature extraction and adds a super attention block, which combines an attention gate (AG) with spatial and channel squeeze-and-excitation (scSE) mechanisms, to refine segmentation by prioritizing relevant regions and features. A modified loss function that merges Dice loss (DL) with Class-Balanced Weights Focal loss (CBW-FL) addresses data imbalance, especially in liver segmentation and indoor environments. EADFL-UNet was evaluated on the NYUv2 dataset and on augmented datasets against U-Net variants with different encoder configurations, and it outperformed them. Further analysis of placing attention blocks at different stages of the U-Net architecture showed significant improvements in segmentation accuracy. Even without depth information, the proposed method outperforms conventional structures by 10% in mean Intersection over Union (mIoU), showing promise for applications in diverse IAVSs such as robotic vision, GPS, sports, and security.
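The abstract does not say how the Dice and class-balanced focal terms are combined, so the following is only a minimal sketch of one plausible form, not the authors' implementation. It assumes the widely used "effective number of samples" weighting, w_c = (1 - beta) / (1 - beta^n_c), and a convex mix controlled by a hypothetical parameter `alpha`. All function names and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions.

```python
import math


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights (1 - beta) / (1 - beta**n), assuming every count n >= 1."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    scale = len(raw) / sum(raw)  # normalise so the weights sum to the class count
    return [w * scale for w in raw]


def dice_loss(probs, labels, num_classes, eps=1e-6):
    """Soft Dice loss averaged over classes.

    probs:  one list of class probabilities per pixel
    labels: one ground-truth class index per pixel
    """
    total = 0.0
    for c in range(num_classes):
        inter = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_sum = sum(p[c] for p in probs)
        true_sum = sum(1 for y in labels if y == c)
        total += 1.0 - (2.0 * inter + eps) / (pred_sum + true_sum + eps)
    return total / num_classes


def cb_focal_loss(probs, labels, weights, gamma=2.0, eps=1e-12):
    """Focal loss with a per-class weight, averaged over pixels."""
    losses = []
    for p, y in zip(probs, labels):
        p_t = max(p[y], eps)  # probability assigned to the true class
        losses.append(-weights[y] * (1.0 - p_t) ** gamma * math.log(p_t))
    return sum(losses) / len(losses)


def combined_loss(probs, labels, class_counts, alpha=0.5, beta=0.999, gamma=2.0):
    """Assumed combination: alpha * Dice + (1 - alpha) * class-balanced focal."""
    weights = class_balanced_weights(class_counts, beta)
    dl = dice_loss(probs, labels, len(class_counts))
    fl = cb_focal_loss(probs, labels, weights, gamma)
    return alpha * dl + (1.0 - alpha) * fl


if __name__ == "__main__":
    # Two pixels, two classes; class 1 is much rarer in the (made-up) training counts.
    probs = [[0.9, 0.1], [0.2, 0.8]]
    labels = [0, 1]
    print(combined_loss(probs, labels, class_counts=[1000, 10]))
```

In this sketch the Dice term rewards region overlap per class, while the focal term down-weights easy pixels and up-weights rare classes, which is the general motivation for pairing the two on imbalanced segmentation data.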


Authors

Thu A. N. Le
Department of Software Engineering, FPT University, Can Tho 94000, Vietnam

Nghi V. Nguyen
Department of Software Engineering, FPT University, Can Tho 94000, Vietnam

Nguyen T. Nguyen
Department of Software Engineering, FPT University, Can Tho 94000, Vietnam

Paper information

Publication year: 2024
Citations: 2
Country of publication: Canada
Site: Springer