Clone of the official FreeV repository: mel-to-wave vocoders with a pseudo-inversed mel filter.
One-liner code:

model_input = mel_spec @ mel_filter.pinverse().abs().clamp_min(1e-5)

Official demo page.
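
The one-liner can be tried on synthetic data. The following is a minimal sketch, not the repository's code: `toy_filter_bank` is a hypothetical stand-in for a real mel filter bank (overlapping triangles on a linear frequency axis), the tensor sizes are arbitrary, and the shapes `(n_freq, n_mels)` for `mel_filter` and `(n_frames, n_mels)` for `mel_spec` are assumptions chosen so that the matrix product is well defined.

```python
import torch


def toy_filter_bank(n_freq: int, n_mels: int) -> torch.Tensor:
    """Hypothetical stand-in for a mel filter bank, shape (n_freq, n_mels)."""
    freqs = torch.linspace(0.0, 1.0, n_freq).unsqueeze(1)      # (n_freq, 1)
    edges = torch.linspace(0.0, 1.0, n_mels + 2)               # band edges
    left, center, right = edges[:-2], edges[1:-1], edges[2:]   # each (n_mels,)
    rising = (freqs - left) / (center - left)
    falling = (right - freqs) / (right - center)
    # Triangle = min of the two slopes, floored at zero outside the band.
    return torch.clamp(torch.minimum(rising, falling), min=0.0)


torch.manual_seed(0)
n_freq, n_mels, n_frames = 65, 16, 10   # arbitrary toy sizes

mel_filter = toy_filter_bank(n_freq, n_mels)    # (n_freq, n_mels)
amplitude = torch.rand(n_frames, n_freq)        # fake amplitude spectrogram
mel_spec = amplitude @ mel_filter               # (n_frames, n_mels)

# The one-liner, exactly as written in this README.
# Method calls bind tighter than `@`, so .abs() and .clamp_min() act on
# the pseudo-inverse, and the result is then multiplied by mel_spec.
model_input = mel_spec @ mel_filter.pinverse().abs().clamp_min(1e-5)

print(model_input.shape)
```

The result has one row per frame and one column per frequency bin, so it can stand in for an amplitude spectrum of the same shape as the one the mel spectrogram was computed from.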
git clone https://github.com/BakerBunker/FreeV.git
cd FreeV
pip install -r requirements.txt

I tried using PGHI (Phase Gradient Heap Integration) to initialize the phase spectrum, but unfortunately it did not work.
| Model | Config File | Train Script |
|---|---|---|
| APNet2 | config.json | train.py |
| APNet2 w/pghi | config_pghi.json | train_pghi.py |
| FreeV | config2.json | train2.py |
| FreeV w/pghi | config2_pghi.json | train2_pghi.py |
Model checkpoints and TensorBoard training logs are available on Hugging Face.
To train a model, run its train script from the table above:
python <train-script>
Checkpoints and a copy of the configuration file are saved in the directory given by checkpoint_path in config.json.
To change the training and inference configuration, edit the parameters in config.json.
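
Editing the configuration can also be scripted. The following is a minimal sketch assuming only that the config is a flat JSON object containing a `checkpoint_path` key, as described above. The helper `update_config` and the directory names `old_dir` and `checkpoints/freev_run1` are hypothetical, and the demo runs on a throwaway file rather than the repository's real config.json.

```python
import json
import tempfile
from pathlib import Path


def update_config(config_path, **overrides):
    """Load a JSON config, apply keyword overrides, and write it back in place."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config.update(overrides)
    path.write_text(json.dumps(config, indent=4))
    return config


# Demo on a temporary file; "old_dir" and "checkpoints/freev_run1" are
# made-up values used only for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"checkpoint_path": "old_dir"}))

    updated = update_config(demo_path, checkpoint_path="checkpoints/freev_run1")
    reloaded = json.loads(demo_path.read_text())

print(updated["checkpoint_path"])
```

Pointing `update_config` at the actual config file before launching a train script would change where checkpoints are written, provided the script reads `checkpoint_path` from that file as stated above.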
Download the model pretrained on the LJSpeech dataset from Hugging Face.
Edit inference.py to run inference.
@misc{2406.08196,
Author = {Yuanjun Lv and Hai Li and Ying Yan and Junhui Liu and Danming Xie and Lei Xie},
Title = {FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter},
Year = {2024},
Eprint = {arXiv:2406.08196},
}