Generating Piano Music with Transformers: A Comparative Study of Scale, Data, and Metrics

Jonathan Lehmkuhl, Ábel Ilyés-Kun, Nico Bremes, Cemhan Kaan Özaltan, Frederik Muthers, Jiayi Yuan

Published 2025 in arXiv.org

ABSTRACT

Although a variety of transformers have been proposed for symbolic music generation in recent years, there has been little comprehensive study of how specific design choices affect the quality of the generated music. In this work, we systematically compare different datasets, model architectures, model sizes, and training strategies for the task of symbolic piano music generation. To support model development and evaluation, we examine a range of quantitative metrics and analyze how well they correlate with human judgments collected through listening studies. Our best-performing model, a 950M-parameter transformer trained on 80K MIDI files from diverse genres, produces outputs that are often rated as human-composed in a Turing-style listening survey.

PUBLICATION RECORD

Publication year: 2025
Venue: arXiv.org
Publication date: 2025-11-10
Fields of study: Computer Science, Engineering
Identifiers: DOI 10.48550/arXiv.2511.07268; arXiv 2511.07268
Source metadata: Semantic Scholar


