# Symbolic Music Generation with Diffusion Models

Supplementary code release for our work [Symbolic Music Generation with Diffusion Models](https://archives.ismir.net/ismir2021/paper/000058.pdf).

## Installation

All code is written in Python 3 ([Anaconda](https://www.anaconda.com/) recommended). To install the dependencies:

```
pip install -r requirements.txt
```

A copy of the [Magenta](https://github.com/magenta/magenta) codebase is required for access to MusicVAE and related components. [Installation instructions](https://github.com/magenta/magenta#installation) can be found on the Magenta public repository.

You will also need to download [pretrained MusicVAE checkpoints](https://github.com/magenta/magenta/tree/master/magenta/models/music_vae). For our experiments, we use the [2-bar melody model](https://storage.googleapis.com/magentadata/models/music_vae/checkpoints/cat-mel_2bar_big.tar).

## Datasets

We use the [Lakh MIDI Dataset](https://colinraffel.com/projects/lmd/) to train our models. Follow [these instructions](https://github.com/magenta/magenta/blob/master/magenta/scripts/README.md) to download and build the Lakh MIDI Dataset.

To encode the Lakh dataset with MusicVAE, use `scripts/generate_song_data_beam.py`:

```
python scripts/generate_song_data_beam.py \
  --checkpoint=/path/to/musicvae-ckpt \
  --input=/path/to/lakh_tfrecords \
  --output=/path/to/encoded_tfrecords
```

To preprocess and generate fixed-length latent sequences for training the diffusion and autoregressive models, refer to `scripts/transform_encoded_data.py`:

```
python scripts/transform_encoded_data.py \
  --encoded_data=/path/to/encoded_tfrecords \
  --output_path=/path/to/preprocess_tfrecords \
  --mode=sequences \
  --context_length=32
```
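For intuition, the following is a minimal sketch (not code from this repository) of what the encoding step above does for each song, assuming the Magenta MusicVAE Python API; the checkpoint path and example MIDI file are placeholders:

```python
# Hedged sketch of the per-song encoding performed by
# scripts/generate_song_data_beam.py: each 2-bar melody is mapped to a
# 512-dimensional MusicVAE latent. Paths below are placeholders.
import note_seq
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel

model = TrainedModel(
    configs.CONFIG_MAP['cat-mel_2bar_big'],
    batch_size=1,
    checkpoint_dir_or_path='/path/to/musicvae-ckpt')

# The NoteSequence should contain a single extractable 2-bar melody;
# encode() returns the sampled latent plus the posterior mean and std.
ns = note_seq.midi_file_to_note_sequence('/path/to/melody.mid')
z, mu, sigma = model.encode([ns])
print(z.shape)  # (1, 512)
```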
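A quick way to sanity-check the preprocessing output is to count the records it contains. A hedged snippet, assuming the step above writes standard TFRecord files under `--output_path` (the glob pattern is a placeholder to adjust):

```python
import tensorflow as tf

# Count records in the preprocessed dataset; adjust the pattern to match
# the shard names actually produced under --output_path.
files = tf.io.gfile.glob('/path/to/preprocess_tfrecords/*')
dataset = tf.data.TFRecordDataset(files)
print('records:', sum(1 for _ in dataset))
```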
## Training

#### Diffusion

```
python train_ncsn.py --flagfile=configs/ddpm-mel-32seq-512.cfg
```

#### TransformerMDN

```
python train_mdn.py --flagfile=configs/mdn-mel-32seq-512.cfg
```

## Sampling and Generation

#### Diffusion

```
python sample_ncsn.py \
  --flagfile=configs/ddpm-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples
```

#### TransformerMDN

```
python sample_mdn.py \
  --flagfile=configs/mdn-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples
```

#### Decoding sequences

To convert sequences of embeddings (generated by the diffusion or TransformerMDN models) into sequences of MIDI events, refer to `scripts/sample_audio.py`:

```
python scripts/sample_audio.py \
  --input=/path/to/latent-samples/[ncsn|mdn] \
  --output=/path/to/audio-midi \
  --n_synth=1000 \
  --include_wav=True
```

## Citing

If you use this code, please cite it as:

```
@inproceedings{mittal2021symbolicdiffusion,
  title={Symbolic Music Generation with Diffusion Models},
  author={Gautam Mittal and Jesse Engel and Curtis Hawthorne and Ian Simon},
  booktitle={Proceedings of the 22nd International Society for Music Information Retrieval Conference},
  year={2021},
  url={https://archives.ismir.net/ismir2021/paper/000058.pdf}
}
```

## Note

This is not an official Google product.