Enhancing MusicGen with Prompt Tuning

Research output: Contribution to journal › Article › peer-review

Abstract

Generative AI has been gaining attention across various creative domains. In particular, MusicGen stands out as a representative approach capable of generating music from text or audio inputs. However, it is limited in its ability to produce high-quality outputs for specific genres and to fully reflect user intentions. This paper proposes a prompt tuning technique that adjusts the output quality of MusicGen without modifying its original parameters and optimizes its ability to generate music tailored to specific genres and styles. Experiments compared the performance of the traditional MusicGen with that of the proposed method, evaluating the quality of the generated music with the Contrastive Language-Audio Pretraining (CLAP) and Kullback–Leibler Divergence (KLD) scoring approaches. The results demonstrated that the proposed method significantly improved output quality and musical coherence, particularly for specific genres and styles. Compared with the traditional model, the CLAP score increased by 0.1270 and the KLD score increased by 0.00403 on average. These results validate prompt tuning as a way to optimize the performance of MusicGen and highlight its potential for advancing generative AI-based music generation tools.
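
The abstract does not spell out how the two scores are computed, so the following is only a minimal sketch of the quantities that usually underlie them. It assumes that a CLAP-style score is a cosine similarity between a text embedding and an audio embedding, and that KLD compares two discrete label distributions (for example, classifier outputs for reference and generated audio). The function names and all numeric values are hypothetical illustrations, not the paper's pipeline or data.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0); terms with p_i == 0 contribute nothing.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical label probabilities for a reference clip and a generated clip.
ref_probs = [0.7, 0.2, 0.1]
gen_probs = [0.6, 0.3, 0.1]

# Hypothetical text and audio embeddings (real CLAP embeddings are much larger).
text_emb = [1.0, 0.0, 1.0]
audio_emb = [1.0, 1.0, 1.0]

print("KLD:", kl_divergence(ref_probs, gen_probs))
print("CLAP-style similarity:", cosine_similarity(text_emb, audio_emb))
```

In a real evaluation, the embeddings would come from a pretrained CLAP model and the probabilities from a pretrained audio classifier, and both scores would be averaged over many prompts.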

Original language: English
Article number: 8504
Journal: Applied Sciences (Switzerland)
Volume: 15
Issue number: 15
DOIs
State: Published - Aug 2025

Keywords

  • generative AI
  • music generation
  • MusicGen
  • prompt tuning
