Research field: Artificial Intelligence
Venue: Multimedia Tools and Applications
Advances in indoor autonomous vision systems (IAVSs) underscore the need to bridge the gap between their capabilities and human perception of real-world scenes. This paper introduces EADFL-UNet, a novel semantic segmentation framework based on the U-Net architecture. It uses EfficientNetB3 as the encoder for improved feature extraction and employs a super attention block, which combines an attention gate (AG) with concurrent spatial and channel squeeze-and-excitation (scSE) mechanisms, to refine segmentation by prioritizing relevant areas and features. In addition, a modified loss function that merges Dice loss (DL) with Class-Balanced Weights Focal loss (CBW-FL) addresses data imbalance, especially in liver segmentation and indoor environments. Experiments on the NYUv2 dataset and augmented datasets compare EADFL-UNet with various U-Net encoder configurations and demonstrate its superiority. Further analysis of integrating attention blocks at different stages of the U-Net architecture reveals significant improvements in segmentation accuracy. Even without depth information, the proposed method outperforms conventional structures by 10% in mean Intersection over Union (mIoU), showing promise for applications in diverse IAVSs such as robotic vision, GPS, sports, and security.
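
The abstract does not give the exact form of the combined loss, so the following is only a minimal NumPy sketch of how a Dice term and a class-balanced focal term might be merged. The effective-number weighting `(1 - beta) / (1 - beta**n_c)`, the mixing factor `alpha`, the defaults for `beta` and `gamma`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights from the effective number of samples (assumed form)."""
    counts = np.asarray(class_counts, dtype=np.float64)
    effective_num = 1.0 - np.power(beta, counts)
    weights = (1.0 - beta) / np.maximum(effective_num, 1e-12)
    # Normalise so the weights sum to the number of classes.
    return weights / weights.sum() * len(counts)


def dice_loss(probs, onehot, eps=1e-6):
    """Soft Dice loss averaged over classes; inputs have shape (N, H, W, C)."""
    axes = (0, 1, 2)
    intersection = np.sum(probs * onehot, axis=axes)
    denom = np.sum(probs, axis=axes) + np.sum(onehot, axis=axes)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return float(1.0 - dice.mean())


def cb_focal_loss(probs, onehot, weights, gamma=2.0, eps=1e-7):
    """Focal loss with each pixel scaled by the weight of its true class."""
    p_t = np.clip(np.sum(probs * onehot, axis=-1), eps, 1.0)  # prob. of true class
    w_t = np.sum(weights * onehot, axis=-1)                   # weight of true class
    loss = -w_t * (1.0 - p_t) ** gamma * np.log(p_t)
    return float(loss.mean())


def combined_loss(probs, onehot, class_counts, alpha=0.5, beta=0.999, gamma=2.0):
    """Weighted sum of the Dice and class-balanced focal terms (alpha is hypothetical)."""
    weights = class_balanced_weights(class_counts, beta)
    dl = dice_loss(probs, onehot)
    fl = cb_focal_loss(probs, onehot, weights, gamma)
    return alpha * dl + (1.0 - alpha) * fl
```

Here `probs` would be the softmax output of the network and `onehot` the one-hot ground-truth mask, both of shape `(N, H, W, C)`, with `class_counts` holding the per-class pixel counts of the training set. Rare classes receive larger weights in the focal term, while the Dice term rewards region overlap independently of class frequency.
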
| Publication year | 2024 |
|---|---|
| Citations | 2 |
| Country of publication | Canada |
| Site | Springer |
| Likes | 0 |