MelodyDiffusion: Chord-Conditioned Melody Generation Using a Transformer-Based Diffusion Model (open access)
- Authors
- Li, Shuyu; Sung, Yunsick
- Issue Date
- Apr-2023
- Publisher
- MDPI
- Keywords
- melody generation; conditional generation; diffusion model; transformer
- Citation
- Mathematics, v.11, no.8, pp 1 - 15
- Pages
- 15
- Indexed
- SCIE; SCOPUS
- Journal Title
- Mathematics
- Volume
- 11
- Number
- 8
- Start Page
- 1
- End Page
- 15
- URI
- https://scholarworks.dongguk.edu/handle/sw.dongguk/19857
- DOI
- 10.3390/math11081915
- ISSN
- 2227-7390
- Abstract
- Artificial intelligence, particularly machine learning, has begun to permeate various real-world applications and is continually being explored in automatic music generation. The approaches to music generation can be broadly divided into two categories: rule-based and data-driven methods. Rule-based approaches rely on substantial prior knowledge and may struggle to handle large datasets, whereas data-driven approaches can solve these problems and have become increasingly popular. However, data-driven approaches still face challenges such as the difficulty of considering long-distance dependencies when handling discrete-sequence data and convergence during model training. Although the diffusion model has been introduced as a generative model to solve the convergence problem in generative adversarial networks, it has not yet been applied to discrete-sequence data. This paper proposes a transformer-based diffusion model known as MelodyDiffusion to handle discrete musical data and realize chord-conditioned melody generation. MelodyDiffusion replaces the U-nets used in traditional diffusion models with transformers to consider the long-distance dependencies using attention and parallel mechanisms. Moreover, a transformer-based encoder is designed to extract contextual information from chords as a condition to guide melody generation. MelodyDiffusion can automatically generate diverse melodies based on the provided chords in practical applications. The evaluation experiments, in which Hits@k was used as a metric to evaluate the restored melodies, demonstrate that the large-scale version of MelodyDiffusion achieves an accuracy of 72.41% (k = 1).
- Appears in Collections
- College of Advanced Convergence Engineering > Department of Computer Science and Artificial Intelligence > 1. Journal Articles
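
The abstract reports accuracy with Hits@k but this record does not define the metric. As a rough illustration only, the sketch below assumes the common reading: a position counts as a hit when the true melody token is among the k highest-scoring candidates, and Hits@k is the fraction of hits. The function name `hits_at_k`, the score layout, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def hits_at_k(scores: Sequence[Sequence[float]],
              targets: Sequence[int],
              k: int = 1) -> float:
    """Fraction of positions whose target index is among the k top-scoring candidates.

    Assumption: scores[i][j] is the model's score for candidate token j at
    position i, and targets[i] is the index of the true token at position i.
    """
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not targets:
        return 0.0

    hits = 0
    for row, target in zip(scores, targets):
        # Candidate indices sorted from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if target in top_k:
            hits += 1
    return hits / len(targets)


# Toy data: three positions, three candidate tokens each (made-up numbers).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_targets = [1, 1, 2]

print(hits_at_k(toy_scores, toy_targets, k=1))
print(hits_at_k(toy_scores, toy_targets, k=2))
```

Under this reading, k = 1 reduces to plain top-1 accuracy, which is how the 72.41% figure in the abstract would be interpreted.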
