Research field: Software Development
Venue: Neural Computing and Applications
Handwritten documents produced in day-to-day office work, classrooms, and other sectors of society carry vital information. Processing these documents automatically requires a pipeline of many challenging steps. The first, crucial step is to separate text from non-text, because an OCR (optical character recognition) engine can process only textual content. This separation is a very complex task in unconstrained handwritten documents, and touching components are one of the major obstacles. Detecting text, non-text, and touching components together in such documents is an unexplored area of research. To address this issue, we develop a dilated spatial attention-based network for text, non-text, and touching component detection, and we also prepare a realistic dataset for the task. On the proposed dataset, the model obtains an overall accuracy of 87.85%. We compare its performance with seven feature-engineering-based methods and six deep learning-based methods; in most cases, the proposed model outperforms the compared methods on the proposed dataset. The code for our method is available at https://github.com/Showmik-Bhowmik/DSANet-Dilated-Spatial-Attention-.git.
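
To give a feel for what "dilated spatial attention" can mean, the following is a minimal pure-Python sketch, not the authors' DSANet architecture, which the abstract does not describe. It assumes one common formulation: average the feature channels, pass the result through a dilated convolution, squash it with a sigmoid, and use the resulting map to reweight every channel. The function names `dilated_conv2d` and `spatial_attention`, the averaging kernel, and the dilation rate of 2 are all illustrative assumptions.

```python
import math


def dilated_conv2d(fmap, kernel, dilation):
    """Same-size 2D convolution with a dilated square kernel and zero padding."""
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * fmap[yy][xx]
            out[y][x] = acc
    return out


def spatial_attention(channels, kernel, dilation):
    """Reweight each channel by a sigmoid map computed from the channel mean."""
    h, w = len(channels[0]), len(channels[0][0])
    n = len(channels)
    mean = [[sum(c[y][x] for c in channels) / n for x in range(w)]
            for y in range(h)]
    logits = dilated_conv2d(mean, kernel, dilation)
    attn = [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in logits]
    weighted = [[[c[y][x] * attn[y][x] for x in range(w)] for y in range(h)]
                for c in channels]
    return attn, weighted


# Hypothetical input: two constant 5x5 feature channels and a 3x3 averaging kernel.
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
features = [[[1.0] * 5 for _ in range(5)], [[3.0] * 5 for _ in range(5)]]
attn, weighted = spatial_attention(features, kernel, dilation=2)
```

With a dilation of 2, the 3x3 kernel covers a 5x5 neighbourhood using only nine taps, which is the usual motivation for dilation: a wider spatial context at no extra parameter cost. A real model would learn the kernel weights rather than fix them.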
| Publication year | 2024 |
|---|---|
| Citations | 0 |
| Country of publication | Andorra |
| Site | Springer |
| Likes | 0 |