Diffusion-Denoising Process with Gated U-Net for High-Quality Document Binarization
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Han, Sangkwon | - |
| dc.contributor.author | Ji, Seungbin | - |
| dc.contributor.author | Rhee, Jongtae | - |
| dc.date.accessioned | 2024-08-08T12:00:40Z | - |
| dc.date.available | 2024-08-08T12:00:40Z | - |
| dc.date.issued | 2023-10 | - |
| dc.identifier.issn | 2076-3417 | - |
| dc.identifier.uri | https://scholarworks.dongguk.edu/handle/sw.dongguk/21922 | - |
| dc.description.abstract | The binarization of degraded documents represents a crucial preprocessing task for various document analyses, including optical character recognition and historical document analysis. Various convolutional neural network models and generative models have been used for document binarization. However, these models often struggle to deliver generalized performance on noise types the model has not encountered during training and may have difficulty extracting intricate text strokes. We herein propose a novel approach to address these challenges by introducing the use of the latent diffusion model, a well-known high-quality image-generation model, into the realm of document binarization for the first time. By leveraging an iterative diffusion-denoising process within the latent space, our approach excels at producing high-quality, clean, binarized images and demonstrates excellent generalization using both data distribution and time steps during training. Furthermore, we enhance our model's ability to preserve text strokes by incorporating a gated U-Net into the backbone network. The gated convolution mechanism allows the model to focus on the text region by combining gating values and features, facilitating the extraction of intricate text strokes. To maximize the effectiveness of our proposed model, we use a combination of the latent diffusion model loss and pixel-level loss, which aligns with the model's structure. The experimental results on the Handwritten Document Image Binarization Contest and Document Image Binarization Contest benchmark datasets showcase the superior performance of our proposed model compared to existing methods. | - |
| dc.format.extent | 20 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | MDPI | - |
| dc.title | Diffusion-Denoising Process with Gated U-Net for High-Quality Document Binarization | - |
| dc.type | Article | - |
| dc.publisher.location | Switzerland | - |
| dc.identifier.doi | 10.3390/app132011141 | - |
| dc.identifier.scopusid | 2-s2.0-85192355240 | - |
| dc.identifier.wosid | 001090595700001 | - |
| dc.identifier.bibliographicCitation | Applied Sciences, v.13, no.20, pp 1 - 20 | - |
| dc.citation.title | Applied Sciences | - |
| dc.citation.volume | 13 | - |
| dc.citation.number | 20 | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 20 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Chemistry | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalResearchArea | Materials Science | - |
| dc.relation.journalResearchArea | Physics | - |
| dc.relation.journalWebOfScienceCategory | Chemistry, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Materials Science, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Physics, Applied | - |
| dc.subject.keywordPlus | COMPETITION | - |
| dc.subject.keywordAuthor | document binarization | - |
| dc.subject.keywordAuthor | deep learning | - |
| dc.subject.keywordAuthor | gated convolution | - |
| dc.subject.keywordAuthor | generative model | - |
| dc.subject.keywordAuthor | latent diffusion models | - |
| dc.subject.keywordAuthor | text stroke | - |
