Research field: Strategies
Venue: Journal of Real-Time Image Processing
One of the benefits of the videofluoroscopic swallow study (VFSS) is that it visualizes bolus transit during swallowing. This X-ray imaging technique allows clinicians to observe penetration and aspiration of a bolus into the airway and to characterize possible post-swallow residue. This study develops and analyzes encoder–decoder-based deep learning models that automatically segment the bolus in VFSS images. The models were developed with 6424 VFSS images from 270 swallow studies obtained from 28 patients (15 males, mean age: 59.87 ± 14.88 years; 13 females, mean age: 57.08 ± 17.21 years) suspected of dysphagia (swallowing difficulties). The data were split at the patient level in proportions of 80%, 10%, and 10% for training, validation, and testing, respectively. Model performance was evaluated mainly by the Dice score coefficient (DSC) and intersection-over-union (IoU). The UNet++ architecture with an InceptionResNetV2 encoder achieved the best performance, with a DSC of 81.16% and an IoU of 68.29%, at an inference speed of 49.34 ms per image on a designated device. The UNet++ with a MobileNetV2 encoder was considerably faster, at 10.08 ms per image, with slightly lower performance of 80.98% DSC and 68.04% IoU. Our study demonstrated effective and accurate methods for segmenting and tracking a bolus on all frames of VFSS exams in real time, indicating the potential to reduce human error and contribute objective analysis to early dysphagia diagnosis and management.
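The two reported metrics can be illustrated with a minimal sketch. It assumes binary masks given as nested lists of 0/1. The function names and the convention that two empty masks score 1.0 are assumptions for illustration, not details taken from the paper. The helper `iou_to_dice` uses the standard identity DSC = 2·IoU / (1 + IoU), which holds for a single pair of masks.

```python
def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks given as nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


def iou_to_dice(iou):
    """Convert IoU to DSC for a single mask pair: DSC = 2*IoU / (1 + IoU)."""
    return 2 * iou / (1 + iou)


# Hypothetical 2x3 masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"DSC={dice:.4f}, IoU={iou:.4f}")
```

Applying the identity to the reported IoU values (68.29% and 68.04%) gives roughly 81.16% and 80.98%, matching the reported DSC values. How the paper aggregates the metrics across images is not stated in the abstract, so this agreement is only a consistency check.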
| Publication year | 2024 |
|---|---|
| Citations | 0 |
| Publication country | Andorra, Canada |
| Site | Springer |