LQVAE-separation

Paper

Code for "Unsupervised Source Separation via Bayesian inference in the latent domain" - Link

Samples

GT Compressed Separated
Drums GT Compressed Drums Separated Drums
Bass GT Compressed Bass Separated Bass
Mix GT Compressed Mix Separated Mix

The separation is performed on a x64 compressed latent domain. The results can be upsampled via Jukebox upsamplers in order to increment perceptive quality (WIP).

https://user-images.githubusercontent.com/7619886/151544181-482e687d-538a-4886-b690-999dd4d64c42.mp4

https://user-images.githubusercontent.com/7619886/151544196-d43b0341-4214-4083-8b52-46574c02d800.mp4

https://user-images.githubusercontent.com/7619886/151544205-442793a6-b886-4c66-bc1a-6108abe260a7.mp4

Install

Install the conda package manager from https://docs.conda.io/en/latest/miniconda.html

conda create --name lqvae-separation python=3.7.5
conda activate lqvae-separation
sudo apt install mpich 
conda install mpi4py==3.0.3
pip install ffmpeg-python==0.2.0
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch 
(if problems try pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2)
pip install -r requirements.txt
pip install -e .

# for training
pip install av=7.0.1
pip install ./tensorboardX

Checkpoints

Separation with checkpoints

Training

LQ-VAE

PYTHONPATH=. mpiexec -n 1 python jukebox/train.py --hps=vqvae --sample_length=131072 --bs=8 
--audio_files_dir=data/train/ --labels=False --train --test --aug_shift --aug_blend --name=lq_vae --test_audio_files_dir=data/test

Priors

PYTHONPATH=. mpiexec -n 1 python jukebox/train.py --hps=vqvae,small_prior,all_fp16,cpu_ema --name=pior_source
 --audio_files_dir=data/source/train --test_audio_files_dir=data/source/test --labels=False --train --test --aug_shift
  --aug_blend --prior --levels=3 --level=2 --weight_decay=0.01 --save_iters=1000 --min_duration=24 --sample_length=1048576 
  --bs=16 --n_ctx=8192 --sample=True --sample_iters=1000 --restore_vqvae=logs/lq_vae/checkpoint_lq_vae.pth.tar

Codebook sums

PYTHONPATH=.. python codebook_precalc.py --save_path=checkpoints/codebook_sum_precalc.pt 
--restore_vqvae=../logs/lq_vae/checkpoint_lq_vae.pth.tar` --raw_to_tokens=64 --l_bins=2048
--sample_rate=22050 --alpha=[0.5, 0.5] --downs_t=(2, 2, 2) --commit=1.0 --emb_width=64

Separation with trained checkpoints

Evaluation

Future work

Citations

If you find the code useful for your research, please consider citing

@article{mancusi2021unsupervised,
  title={Unsupervised Source Separation via Bayesian Inference in the Latent Domain},
  author={Mancusi, Michele and Postolache, Emilian and Fumero, Marco and Santilli, Andrea and Cosmo, Luca and Rodol{\`a}, Emanuele},
  journal={arXiv preprint arXiv:2110.05313},
  year={2021}
}

as well as the Jukebox baseline: