
Code, dataset, and trained models for generating traditional Chinese guzheng music with hierarchical patch-character language models. Compares a fine-tuned NotaGen-medium (pre-trained on ~1M Western scores) against an identical-architecture model trained from scratch on the same corpus, isolating the contribution of pre-training. The corpus comprises 26 hand-curated guzheng repertoire pieces and 99 pieces from the Guzheng-Tech99 dataset, expanded to 1,875 ABC samples (125 pieces × 15 keys).
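
As a quick sanity check on the corpus figures, the arithmetic can be sketched as follows. The constant names are illustrative only and do not come from the repository, and the description does not say which 15 keys are used, so the sketch treats that as a bare count.

```python
# Piece counts as stated in the description; names here are hypothetical.
HAND_CURATED_PIECES = 26   # hand-curated guzheng repertoire
GUZHENG_TECH99_PIECES = 99  # pieces taken from Guzheng-Tech99
NUM_KEYS = 15               # which 15 keys is an unstated detail

# Total distinct pieces across both sources.
total_pieces = HAND_CURATED_PIECES + GUZHENG_TECH99_PIECES

# One ABC sample per (piece, key) pair.
total_samples = total_pieces * NUM_KEYS

print(total_pieces)   # 125
print(total_samples)  # 1875
```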
