mirror of https://github.com/liuhaozhe6788/voice-cloning-collab.git synced 2025-12-16 11:48:12 +01:00

liuhaozhe6788 0577fcee5a Merge pull request #21 from liuhaozhe6788/develop

Develop

2022-12-01 14:21:43 +08:00

encoder

Update .gitignore

2022-11-18 11:22:55 +08:00

samples

Update .gitignore

2022-11-18 11:28:27 +08:00

synthesizer

Change the path where loss files are stored

2022-11-18 11:57:07 +08:00

toolbox

initial commit

2022-11-04 16:38:13 +08:00

utils

initial commit

2022-11-04 16:38:13 +08:00

vocoder

Change the path where loss files are stored

2022-11-18 11:57:07 +08:00

.gitignore

git commit -m "Untrack files in .gitignore"

2022-11-18 14:33:35 +08:00

attention.png

Handle the special case of decimals in text splitting, and the exception when the output audio speed is 0 in speed adjustment

2022-11-10 16:45:57 +08:00

demo_cli.py

merge the single sentences to input

2022-12-01 10:27:37 +08:00

demo_toolbox.py

initial commit

2022-11-04 16:38:13 +08:00

encoder_preprocess.py

Add LibriSpeech to the encoder preprocessing datasets

2022-11-17 20:41:22 +08:00

encoder_train.py

Change the model path

2022-11-17 21:36:21 +08:00

fixSpeed.py

Handle the special case of decimals in text splitting, and the exception when the output audio speed is 0 in speed adjustment

2022-11-10 16:45:57 +08:00

LICENSE.md

initial commit

2022-11-04 16:38:13 +08:00

myspsolution.praat

initial commit

2022-11-04 16:38:13 +08:00

README.md

Update README

2022-12-01 09:55:28 +08:00

requirements.txt

initial commit

2022-11-04 16:38:13 +08:00

synthesizer_preprocess_audio.py

initial commit

2022-11-04 16:38:13 +08:00

synthesizer_preprocess_embeds.py

Change the model path

2022-11-17 21:36:21 +08:00

synthesizer_train.py

Change the model path

2022-11-17 21:36:21 +08:00

text.txt

Handle the special case of decimals in text splitting, and the exception when the output audio speed is 0 in speed adjustment

2022-11-10 16:45:57 +08:00

update_plot.py

initial commit

2022-11-04 16:38:13 +08:00

vocoder_preprocess.py

Change the model path

2022-11-17 21:36:21 +08:00

vocoder_train.py

Change the path where loss files are stored

2022-11-18 11:57:07 +08:00

voxceleb1test_preprocess.py

initial commit

2022-11-04 16:38:13 +08:00

voxceleb1test.py

initial commit

2022-11-04 16:38:13 +08:00

README.md

Voice-Cloning

Installation

pip install -r requirements.txt

Training Commands

Encoder Preprocessing:

python encoder_preprocess.py <datasets_root>

Encoder Training:

python encoder_train.py my_run <datasets_root>/SV2TTS/encoder

Synthesizer Preprocessing:

python synthesizer_preprocess_audio.py <datasets_root>
python synthesizer_preprocess_embeds.py <datasets_root>/SV2TTS/synthesizer

Synthesizer Training:

python synthesizer_train.py my_run <datasets_root>/SV2TTS/synthesizer

Vocoder Preprocessing:

python vocoder_preprocess.py <datasets_root>

Vocoder Training:

python vocoder_train.py my_run <datasets_root>
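
The commands above share one dataset root and one run name, and the order matters because each training stage reads what the preceding preprocessing stage wrote. The following is a minimal sketch that only builds and prints the command lines in that order; it does not launch anything, since each stage can run for a long time. The helper name `training_commands` and the example path are hypothetical and not part of the repository.

```python
from pathlib import Path


def training_commands(datasets_root, run_id="my_run"):
    """Return the README's training commands, in order, as argument lists.

    `training_commands` is an illustrative helper, not a function from the repo.
    """
    root = Path(datasets_root)
    sv2tts = root / "SV2TTS"  # subdirectory used by the README's commands
    return [
        ["python", "encoder_preprocess.py", str(root)],
        ["python", "encoder_train.py", run_id, str(sv2tts / "encoder")],
        ["python", "synthesizer_preprocess_audio.py", str(root)],
        ["python", "synthesizer_preprocess_embeds.py", str(sv2tts / "synthesizer")],
        ["python", "synthesizer_train.py", run_id, str(sv2tts / "synthesizer")],
        ["python", "vocoder_preprocess.py", str(root)],
        ["python", "vocoder_train.py", run_id, str(root)],
    ]


if __name__ == "__main__":
    # Print the commands for a placeholder dataset root; nothing is executed.
    for cmd in training_commands("/path/to/datasets"):
        print(" ".join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; each printed line can then be started by hand once the previous stage has finished.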

Inference Commands

Terminal:

python demo_cli.py

GUI:

python demo_toolbox.py

Version updates

2022.05.19: We now compute the GE2E loss in the encoder on CUDA instead of the originally configured CPU, which speeds up encoder training.
2022.07.15: We added an animated loss plot for the synthesizer and vocoder.
2022.07.19: We added response time and Griffin-Lim vocoder results to demo_toolbox.
2022.07.29: We added model validation for the encoder, synthesizer, and vocoder.
2022.08.02: We added the VoxCeleb train and dev data for the encoder. We added noise reduction for the output wav from the vocoder.
noisereduce reference: https://github.com/timsainb/noisereduce
2022.08.06: We split long text into short sentences using spaCy before passing it to the synthesizer. Make sure to install the English model en_core_web_sm, for example with python -m spacy download en_core_web_sm
2022.09.02: We set prop_decrease=0.6 for male voices and 0.9 for female voices in the noisereduce function, so the output filter uses different parameters for male and female voices.
2022.09.26: We added speed adjustment for output audio using Praat. Install parselmouth with pip: pip install praat-parselmouth
2022.10.10: We added a voice filter ("voice beautification") for input audio. The input audio embed and the standard audio embed are weighted 7:3.
2022.10.25: We set small values (<0.06) in the embed to zero.
2022.10.26: The split frequency for input audio is 170 Hz. The split frequency for output noise reduction is 165 Hz.
2022.12.01: We merged the single sentences into the input.
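
The 2022.10.10 and 2022.10.25 entries describe two small operations on the speaker embed: a 7:3 weighted mix of the input embed with a standard embed, and zeroing of entries below 0.06. The sketch below illustrates both on plain Python lists. The function names are hypothetical, the real code presumably works on arrays, and whether the repository renormalizes the embed afterwards is not stated here. The threshold compares values directly, which assumes non-negative embed entries.

```python
def blend_embeds(input_embed, standard_embed, input_weight=0.7):
    """Weighted mix of two same-length embeds (7:3 by default).

    Illustrative helper; the name and signature are assumptions.
    """
    if len(input_embed) != len(standard_embed):
        raise ValueError("embeds must have the same length")
    w = input_weight
    return [w * a + (1.0 - w) * b for a, b in zip(input_embed, standard_embed)]


def zero_small_values(embed, threshold=0.06):
    """Set entries strictly below the threshold to zero.

    Assumes non-negative entries; for signed embeds, abs(x) would be compared.
    """
    return [0.0 if x < threshold else x for x in embed]


if __name__ == "__main__":
    # Toy 3-dimensional embeds, made up purely for illustration.
    mixed = blend_embeds([1.0, 0.0, 0.1], [0.0, 1.0, 0.0])
    print(zero_small_values(mixed))
```

Keeping the two steps as separate functions makes it easy to try other weight ratios or thresholds independently.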
