# Symbolic Music Generation with Diffusion Models

Supplementary code release for our work [Symbolic Music Generation with Diffusion Models](https://archives.ismir.net/ismir2021/paper/000058.pdf).

## Installation

All code is written in Python 3 ([Anaconda](https://www.anaconda.com/) recommended). To install the dependencies:

```
pip install -r requirements.txt
```

A copy of the [Magenta](https://github.com/magenta/magenta) codebase is required for access to MusicVAE and related components. [Installation instructions](https://github.com/magenta/magenta#installation) can be found on the Magenta public repository.

You will also need to download [pretrained MusicVAE checkpoints](https://github.com/magenta/magenta/tree/master/magenta/models/music_vae). For our experiments, we use the [2-bar melody model](https://storage.googleapis.com/magentadata/models/music_vae/checkpoints/cat-mel_2bar_big.tar).

## Datasets

We use the [Lakh MIDI Dataset](https://colinraffel.com/projects/lmd/) to train our models. Follow [these instructions](https://github.com/magenta/magenta/blob/master/magenta/scripts/README.md) to download and build the Lakh MIDI Dataset.

To encode the Lakh dataset with MusicVAE, use `scripts/generate_song_data_beam.py`:

```
python scripts/generate_song_data_beam.py \
  --checkpoint=/path/to/musicvae-ckpt \
  --input=/path/to/lakh_tfrecords \
  --output=/path/to/encoded_tfrecords
```

To preprocess and generate fixed-length latent sequences for training the diffusion and autoregressive models, refer to `scripts/transform_encoded_data.py`:

```
python scripts/transform_encoded_data.py \
  --encoded_data=/path/to/encoded_tfrecords \
  --output_path=/path/to/preprocess_tfrecords \
  --mode=sequences \
  --context_length=32
```
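For intuition, the following is a minimal sketch (not code from this repository) of what the encoding step above does for each song, assuming the Magenta MusicVAE Python API; the checkpoint path and example MIDI file are placeholders:

```python
# Hedged sketch of the per-song encoding performed by
# scripts/generate_song_data_beam.py: each 2-bar melody is mapped to a
# 512-dimensional MusicVAE latent. Paths below are placeholders.
import note_seq
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel

model = TrainedModel(
    configs.CONFIG_MAP['cat-mel_2bar_big'],
    batch_size=1,
    checkpoint_dir_or_path='/path/to/musicvae-ckpt')

# The NoteSequence should contain a single extractable 2-bar melody;
# encode() returns the sampled latent plus the posterior mean and std.
ns = note_seq.midi_file_to_note_sequence('/path/to/melody.mid')
z, mu, sigma = model.encode([ns])
print(z.shape)  # (1, 512)
```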
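A quick way to sanity-check the preprocessing output is to count the records it contains. A hedged snippet, assuming the step above writes standard TFRecord files under `--output_path` (the glob pattern is a placeholder to adjust):

```python
import tensorflow as tf

# Count records in the preprocessed dataset; adjust the pattern to match
# the shard names actually produced under --output_path.
files = tf.io.gfile.glob('/path/to/preprocess_tfrecords/*')
dataset = tf.data.TFRecordDataset(files)
print('records:', sum(1 for _ in dataset))
```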
## Training

#### Diffusion

```
python train_ncsn.py --flagfile=configs/ddpm-mel-32seq-512.cfg
```

#### TransformerMDN

```
python train_mdn.py --flagfile=configs/mdn-mel-32seq-512.cfg
```

## Sampling and Generation

#### Diffusion

```
python sample_ncsn.py \
  --flagfile=configs/ddpm-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples
```

#### TransformerMDN

```
python sample_mdn.py \
  --flagfile=configs/mdn-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples
```

#### Decoding sequences

To convert sequences of embeddings (generated by the diffusion or TransformerMDN models) into sequences of MIDI events, refer to `scripts/sample_audio.py`:

```
python scripts/sample_audio.py \
  --input=/path/to/latent-samples/[ncsn|mdn] \
  --output=/path/to/audio-midi \
  --n_synth=1000 \
  --include_wav=True
```

## Citing

If you use this code, please cite it as:

```
@inproceedings{mittal2021symbolicdiffusion,
  title={Symbolic Music Generation with Diffusion Models},
  author={Gautam Mittal and Jesse Engel and Curtis Hawthorne and Ian Simon},
  booktitle={Proceedings of the 22nd International Society for Music Information Retrieval Conference},
  year={2021},
  url={https://archives.ismir.net/ismir2021/paper/000058.pdf}
}
```

## Note

This is not an official Google product.