A Survey of Training-free Diffusion-based Image Generation with Free-form Mask

Park, Yoonseo; Jo, Hyeongseob; Cho, Sung In

Full metadata record

DC Field	Value	Language
dc.contributor.author	Park, Yoonseo	-
dc.contributor.author	Jo, Hyeongseob	-
dc.contributor.author	Cho, Sung In	-
dc.date.accessioned	2025-09-25T06:30:13Z	-
dc.date.available	2025-09-25T06:30:13Z	-
dc.date.issued	2025	-
dc.identifier.issn	2997-7401	-
dc.identifier.issn	2997-741X	-
dc.identifier.uri	https://scholarworks.dongguk.edu/handle/sw.dongguk/61611	-
dc.description.abstract	Layout-to-image generation is a task that generates realistic images based on given layouts and corresponding textual descriptions. The layout provides structural information about the image, such as descriptions, positions, and sizes of objects. Traditional methods for layout-to-image generation relied on bounding boxes, which represent only fixed-form layouts. Recently, approaches using free-form masks have gained attention, as they enable more flexible control over the shapes and positions of objects. Among these, training-free methods have been proposed that leverage pre-trained diffusion models without additional training. These methods modify attention and guidance mechanisms to steer the image generation process during the inference phase of the diffusion model. In this paper, we review training-free diffusion-based image generation methods that utilize free-form masks. We focus on three representative methods: Paint-with-Words, MultiDiffusion, and Zero-Painter. We analyze their generation strategies and key mechanisms, as well as their limitations regarding spatial accuracy and consistency in object placement. © 2025 IEEE.	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	IEEE	-
dc.title	A Survey of Training-free Diffusion-based Image Generation with Free-form Mask	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1109/ITC-CSCC66376.2025.11137628	-
dc.identifier.scopusid	2-s2.0-105016391894	-
dc.identifier.bibliographicCitation	2025 International Technical Conference on Circuits/Systems, Computers, and Communications	-
dc.citation.title	2025 International Technical Conference on Circuits/Systems, Computers, and Communications	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	foreign	-
dc.subject.keywordAuthor	cross-attention	-
dc.subject.keywordAuthor	diffusion models	-
dc.subject.keywordAuthor	free-form mask	-
dc.subject.keywordAuthor	layout-to-image generation	-
dc.subject.keywordAuthor	training-free	-

Appears in Collections: ETC > 1. Journal Articles
