Detailed Information

Cited 1 time in Web of Science; cited 1 time in Scopus

Diffusion-Denoising Process with Gated U-Net for High-Quality Document Binarization

Full metadata record
DC Field | Value | Language
dc.contributor.author | Han, Sangkwon | -
dc.contributor.author | Ji, Seungbin | -
dc.contributor.author | Rhee, Jongtae | -
dc.date.accessioned | 2024-08-08T12:00:40Z | -
dc.date.available | 2024-08-08T12:00:40Z | -
dc.date.issued | 2023-10 | -
dc.identifier.issn | 2076-3417 | -
dc.identifier.uri | https://scholarworks.dongguk.edu/handle/sw.dongguk/21922 | -
dc.description.abstract | The binarization of degraded documents represents a crucial preprocessing task for various document analyses, including optical character recognition and historical document analysis. Various convolutional neural network models and generative models have been used for document binarization. However, these models often struggle to deliver generalized performance on noise types the model has not encountered during training and may have difficulty extracting intricate text strokes. We herein propose a novel approach to address these challenges by introducing the use of the latent diffusion model, a well-known high-quality image-generation model, into the realm of document binarization for the first time. By leveraging an iterative diffusion-denoising process within the latent space, our approach excels at producing high-quality, clean, binarized images and demonstrates excellent generalization using both data distribution and time steps during training. Furthermore, we enhance our model's ability to preserve text strokes by incorporating a gated U-Net into the backbone network. The gated convolution mechanism allows the model to focus on the text region by combining gating values and features, facilitating the extraction of intricate text strokes. To maximize the effectiveness of our proposed model, we use a combination of the latent diffusion model loss and pixel-level loss, which aligns with the model's structure. The experimental results on the Handwritten Document Image Binarization Contest and Document Image Binarization Contest benchmark datasets showcase the superior performance of our proposed model compared to existing methods. | -
dc.format.extent | 20 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | MDPI | -
dc.title | Diffusion-Denoising Process with Gated U-Net for High-Quality Document Binarization | -
dc.type | Article | -
dc.publisher.location | Switzerland | -
dc.identifier.doi | 10.3390/app132011141 | -
dc.identifier.scopusid | 2-s2.0-85192355240 | -
dc.identifier.wosid | 001090595700001 | -
dc.identifier.bibliographicCitation | Applied Sciences, v.13, no.20, pp 1 - 20 | -
dc.citation.title | Applied Sciences | -
dc.citation.volume | 13 | -
dc.citation.number | 20 | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 20 | -
dc.type.docType | Article | -
dc.description.isOpenAccess | Y | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Chemistry | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalResearchArea | Materials Science | -
dc.relation.journalResearchArea | Physics | -
dc.relation.journalWebOfScienceCategory | Chemistry, Multidisciplinary | -
dc.relation.journalWebOfScienceCategory | Engineering, Multidisciplinary | -
dc.relation.journalWebOfScienceCategory | Materials Science, Multidisciplinary | -
dc.relation.journalWebOfScienceCategory | Physics, Applied | -
dc.subject.keywordPlus | COMPETITION | -
dc.subject.keywordAuthor | document binarization | -
dc.subject.keywordAuthor | deep learning | -
dc.subject.keywordAuthor | gated convolution | -
dc.subject.keywordAuthor | generative model | -
dc.subject.keywordAuthor | latent diffusion models | -
dc.subject.keywordAuthor | text stroke | -
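
Illustrative note on the abstract: the abstract says the gated convolution mechanism combines "gating values and features" so the model can focus on text regions. The record does not give the exact formulation, so the sketch below assumes the commonly used form in which a feature map is multiplied elementwise by a sigmoid-activated gate map computed from the same input. The function names (conv2d_valid, gated_conv2d), the kernels, and the tiny stroke image are hypothetical and are not taken from the paper.

```python
import math


def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation on nested lists (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_conv2d(image, feature_kernel, gate_kernel):
    """Assumed gated convolution: features scaled elementwise by sigmoid(gates).

    Two convolutions read the same input: one produces features, the other
    produces gate logits. A gate near 1 passes the feature through; a gate
    near 0 suppresses it.
    """
    features = conv2d_valid(image, feature_kernel)
    gates = conv2d_valid(image, gate_kernel)
    return [
        [f * sigmoid(g) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gates)
    ]


# Hypothetical 3x3 patch containing a one-pixel-wide vertical stroke.
stroke = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
ones = [[1, 1], [1, 1]]
zeros = [[0, 0], [0, 0]]

# With an all-zero gate kernel every gate logit is 0, so sigmoid gives 0.5
# and each feature value is halved.
half_gated = gated_conv2d(stroke, ones, zeros)
```

In a trained network both kernels would be learned, so the gate could open on stroke pixels and close on background noise; the fixed kernels here only show the arithmetic.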
Files in This Item
There are no files associated with this item.
Appears in Collections
College of Engineering > Department of Industrial and Systems Engineering > 1. Journal Articles
