ByteComposer: A Human-like Melody Composition Method Based On Language Model Agent

Liang Xia, Du Xingjian, Lin Jiaju, Zou Pei, Wan Yuan, Zhu Bilei. arXiv 2024

Agentic, GPT, Model Architecture, Multimodal Models, RAG, Tools

Large Language Models (LLMs) have shown encouraging progress in multimodal understanding and generation tasks. However, how to design a human-aligned and interpretable melody composition system remains under-explored. To address this problem, we propose ByteComposer, an agent framework that emulates a human’s creative pipeline in four separate steps: “Conception Analysis - Draft Composition - Self-Evaluation and Modification - Aesthetic Selection”. This framework blends the interactive and knowledge-understanding capabilities of LLMs with existing symbolic music generation models, thereby achieving a melody composition agent comparable to human creators. We conduct extensive experiments on GPT4 and several open-source large language models, which substantiate our framework’s effectiveness. Furthermore, professional music composers were engaged in multi-dimensional evaluations; the final results demonstrate that, across various facets of music composition, the ByteComposer agent attains the level of a novice melody composer.
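
The four-step pipeline named in the abstract can be pictured with a small Python sketch. This is only an illustration under loud assumptions, not the authors' implementation: every function name here (`conception_analysis`, `draft_composition`, `self_evaluate_and_modify`, `aesthetic_selection`, `compose`) is hypothetical, the LLM and the symbolic music generation model are replaced by trivial stubs, and the "large leap" critic and "pitch variety" selection rule are toy stand-ins chosen only to make the control flow runnable.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Draft:
    notes: List[int]      # MIDI pitch numbers
    score: float = 0.0    # critic score in [0, 1]


def conception_analysis(prompt: str) -> Dict:
    """Step 1 (stub for an LLM call): map a text request to musical attributes."""
    mood = "sad" if "sad" in prompt.lower() else "bright"
    # A natural minor for "sad", C major otherwise (assumption for illustration).
    scale = [57, 59, 60, 62, 64, 65, 67] if mood == "sad" else [60, 62, 64, 65, 67, 69, 71]
    return {"mood": mood, "scale": scale, "length": 8}


def draft_composition(plan: Dict, seed: int) -> Draft:
    """Step 2 (stub for a symbolic music generator): sample notes from the scale."""
    rng = random.Random(seed)
    return Draft([rng.choice(plan["scale"]) for _ in range(plan["length"])])


def evaluate(draft: Draft) -> float:
    """Toy critic: fraction of melodic steps that are not leaps above 7 semitones."""
    steps = list(zip(draft.notes, draft.notes[1:]))
    leaps = sum(abs(a - b) > 7 for a, b in steps)
    return 1.0 - leaps / max(1, len(steps))


def self_evaluate_and_modify(draft: Draft, plan: Dict, max_rounds: int = 3) -> Draft:
    """Step 3: critique the draft and repair flagged notes, up to max_rounds times."""
    for _ in range(max_rounds):
        if evaluate(draft) == 1.0:
            break
        notes = draft.notes
        for i in range(1, len(notes)):
            if abs(notes[i] - notes[i - 1]) > 7:
                # Replace the note with the scale tone nearest the midpoint of the leap.
                mid = (notes[i] + notes[i - 1]) / 2
                notes[i] = min(plan["scale"], key=lambda p: abs(p - mid))
    draft.score = evaluate(draft)
    return draft


def aesthetic_selection(candidates: List[Draft]) -> Draft:
    """Step 4: keep the best candidate (critic score first, then pitch variety)."""
    return max(candidates, key=lambda d: (d.score, len(set(d.notes))))


def compose(prompt: str, n_candidates: int = 4) -> Draft:
    plan = conception_analysis(prompt)
    drafts = [draft_composition(plan, seed) for seed in range(n_candidates)]
    revised = [self_evaluate_and_modify(d, plan) for d in drafts]
    return aesthetic_selection(revised)


if __name__ == "__main__":
    print(compose("a sad evening tune"))
```

In a real system, each stub would be backed by an LLM prompt or a symbolic music model; the sketch only shows how the four stages hand data to one another.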
