4668 Commits

Author SHA1 Message Date
Eren Gölge
7442bcefa5 Remove deprecated files (#1873)
- samplers.py is moved
- distribute.py is replaced by the 👟Trainer
2022-08-15 12:16:37 +02:00
Eren Gölge
4333492341 Fix BCE loss issue (#1872)
* Fix BCE loss issue

* Remove import
2022-08-15 11:27:21 +02:00
jchai.me
c30b6485ea updates to dataset analysis notebooks for compatibility with latest version of TTS (#1853) 2022-08-15 11:11:07 +02:00
manmay nakhashi
e4db7c51b5 Update capacitron_layers.py (#1664)
Crashes because of a dimension mismatch at line 57:
[batch, 256] vs [batch , 1, 512]
enc_out = torch.cat([enc_out, speaker_embedding], dim=-1)
2022-08-15 11:08:50 +02:00
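The shape conflict reported in e4db7c51b5 follows from the general concatenation rule: all inputs need the same number of dimensions, and every dimension except the one being concatenated must match. The sketch below models that rule in plain Python on shape tuples only. `cat_shape` is a hypothetical helper written for illustration; it is not part of TTS or torch, and it does not show the change the commit actually made.

```python
from typing import Sequence, Tuple


def cat_shape(shapes: Sequence[Tuple[int, ...]], dim: int) -> Tuple[int, ...]:
    """Return the shape a concatenation along `dim` would produce.

    Hypothetical helper: raises ValueError when the inputs could not be
    concatenated under the usual "same shape except along dim" rule.
    """
    rank = len(shapes[0])
    if any(len(s) != rank for s in shapes):
        raise ValueError(f"rank mismatch: {list(shapes)}")
    axis = dim % rank  # normalise negative dims such as -1
    for s in shapes[1:]:
        for i in range(rank):
            if i != axis and s[i] != shapes[0][i]:
                raise ValueError(f"size mismatch at dim {i}: {list(shapes)}")
    out = list(shapes[0])
    out[axis] = sum(s[axis] for s in shapes)
    return tuple(out)


batch = 8  # arbitrary batch size chosen for the example

# The shapes from the commit message differ in rank, so this fails.
try:
    cat_shape([(batch, 256), (batch, 1, 512)], dim=-1)
except ValueError as err:
    print("cannot concatenate:", err)

# Assumption: giving the 2-D tensor a singleton middle axis aligns the ranks.
print(cat_shape([(batch, 1, 256), (batch, 1, 512)], dim=-1))
```

Whether the real fix reshapes one tensor, the other, or both is not stated in the commit message, so the second call is only one possible way to make the shapes compatible.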
Eren Gölge
bfc63829ac Implement bucketed weighted sampling for VITS (#1871) 2022-08-15 11:08:11 +02:00
Eren Gölge
d46fbc240c Introduce numpy and torch transforms (#1705)
* Refactor audio processing functions

* Add tests for numpy transforms

* Fix imports

* Fix imports2
2022-08-08 11:57:50 +02:00
manmay nakhashi
7fd9b89ebf fix get_random_embeddings --> get_random_embedding (#1726)
* fix get_random_embeddings --> get_random_embedding

function typo leads to training crash, no such function

* fix typo

get_random_embedding
2022-08-07 14:06:03 +02:00
rbaraglia
75ac9e3f0c Fix language flags generated by espeak-ng phonemizer (#1801)
* fix language flags generated by espeak-ng phonemizer

* Style

* Updated language flag regex to consider all language codes alike
2022-08-07 13:57:40 +02:00
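As a rough illustration of what 75ac9e3f0c describes, the sketch below strips parenthesised language flags from a phoneme string using a single pattern that treats all language codes alike. The pattern, the function name `strip_language_flags`, and the sample strings are assumptions made for this example; the regex used in the actual commit may differ.

```python
import re

# Assumed flag shape: a 2-3 letter language code, optionally followed by
# hyphenated subtags, wrapped in parentheses, e.g. "(en)" or "(en-us)".
LANGUAGE_FLAG = re.compile(r"\([a-z]{2,3}(?:-[a-z0-9]{2,4})*\)")


def strip_language_flags(phonemes: str) -> str:
    """Remove every language flag matched by LANGUAGE_FLAG (hypothetical helper)."""
    return LANGUAGE_FLAG.sub("", phonemes)


print(strip_language_flags("(en)h@loU(fr)bOZu"))
print(strip_language_flags("(en-us)wV n(pt-br)dojs"))
```

Using one pattern for every code avoids maintaining a list of known languages, at the cost of also matching any short lowercase word that happens to appear in parentheses.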
Lars Kiesow
8c645080ac Adjust default to be able to process longer sentences (#1835)
Running `tts --text "$text" --out_path …` with somewhat longer
sentences in the text will lead to warnings like “Decoder stopped with
max_decoder_steps 500” and the sentences just being cut off in the
resulting WAV file.

This happens quite frequently when feeding longer texts (e.g. a blog
post) to `tts`. It's particularly frustrating since the error is not
always obvious in the output. You have to notice that there are missing
parts. This is something other users seem to have run into as well [1].

This patch simply increases the maximum number of steps allowed for the
tacotron decoder to fix this issue, resulting in a smoother default
behavior.

[1] https://github.com/mozilla/TTS/issues/734
2022-08-07 13:51:29 +02:00
p0p4k
903a77c197 Update wavenet.py (#1796)
* Update wavenet.py

Current version does not use "in_channels" argument. 
In glowTTS, we use normalizing flows and so "input dim" == "output dim" (channels and length). So, the existing code just uses hidden_channel sized tensor as input to first layer as well as outputs hidden_channel sized tensor. 
However, since it is a generic implementation, I believe it is better to update it for a more general use.

* "in_channels -> hidden_channels"
2022-08-01 12:20:37 +02:00
p0p4k
4fe50801b5 Update README.md; download progress bar in CLI. (#1797)
* Update README.md

- minor PR
- added model_info usage guide based on #1623 in README.md .

* "added tqdm bar for model download"

* Update manage.py

* fixed style

* fixed style

* sort imports
2022-08-01 12:17:47 +02:00
p0p4k
d9bad91a66 Update requirements.txt; inflect==5.6 (#1809)
New inflect version (6.0) depends on pydantic which has some issues irrelevant to 🐸 TTS. #1808 
Force inflect==5.6 (pydantic free) install to solve dependency issue.
2022-08-01 11:48:02 +02:00
Eren Gölge
7d8b1665c8 Fix rand_segment edge case (input_len == seg_len - 1) 2022-08-01 11:37:45 +02:00
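The edge case named in 7d8b1665c8 is the kind of off-by-one that appears when the input is shorter than the requested segment. The sketch below is a simplified, list-based stand-in, not the TTS implementation: `rand_segment`, its parameters, and the choice to pad short inputs are all assumptions for illustration.

```python
import random
from typing import List, Optional


def rand_segment(seq: List[int], seg_len: int, pad_value: int = 0,
                 rng: Optional[random.Random] = None) -> List[int]:
    """Return a random contiguous slice of length seg_len (hypothetical sketch)."""
    rng = rng or random.Random()
    # Pad first, so inputs shorter than seg_len (including seg_len - 1)
    # never produce a negative upper bound for the start index.
    if len(seq) < seg_len:
        seq = seq + [pad_value] * (seg_len - len(seq))
    max_start = len(seq) - seg_len  # always >= 0 after padding
    start = rng.randint(0, max_start)  # randint is inclusive on both ends
    return seq[start:start + seg_len]


# input_len == seg_len - 1: the only valid start index is 0.
print(rand_segment([1, 2, 3], seg_len=4))
# A longer input yields some contiguous window of four items.
print(rand_segment(list(range(10)), seg_len=4, rng=random.Random(0)))
```

Computing the start bound only after padding keeps the boundary cases (`input_len == seg_len` and `input_len == seg_len - 1`) on the same code path as every other input.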
vanIvan
5094499eba Fix & update WaveRNN vocoder model (#1749)
* Fixes KeyError bug. Adding logging to dashboard.

* Make pep8 compliant

* Make style compliant

* Still fixing style
2022-07-26 15:05:11 +02:00
Yuri Pourre
1a065fa6ed Update README.md (#1776)
Fix typo in different and code sample
2022-07-26 13:28:21 +02:00
p0p4k
669966d963 Update requirements.txt (#1791)
Support for #1775
2022-07-26 13:06:40 +02:00
p0p4k
10195c4eba Update decoder.py (#1792)
Minor comment correction.
2022-07-26 13:06:06 +02:00
Tsai Meng-Ting
9d32cbc3db Fix typo in download_vctk.sh (#1739)
typo in comment
2022-07-20 12:27:42 +02:00
ivan provalov
903d9c791a Fix for FloorDiv Function Warning (#1760)
* Fix for Floor Function Warning

Fix for Floor Function Warning

* Adding double quotes to fix formatting

Adding double quotes to fix formatting

* Update glow_tts.py

* Update glow_tts.py
2022-07-20 11:31:22 +02:00
WeberJulian
4f31402227 Fix aux tests (#1753)
* Set n_jobs to 1 for resample script

* Delete resample test

* Set n_jobs 1 in vad test

* delete vad test

* Revert "Delete resample test"

This reverts commit bb7c8466af.

* Remove tests with resample
2022-07-19 10:06:31 +02:00
Eren Gölge
f7587fc134 Fix SSIM loss correction 2022-07-13 10:47:12 +02:00
Eren Gölge
bc1f93c299 Fix device allocation 2022-07-12 19:05:25 +02:00
Eren Gölge
49bac724c0 Implement VitsAudioConfig (#1556)
* Implement VitsAudioConfig

* Update VITS LJSpeech recipe

* Update VITS VCTK recipe

* Make style

* Add missing decorator

* Add missing param

* Make style

* Update recipes

* Fix test

* Bug fix

* Exclude tests folder

* Make linter

* Make style
2022-07-12 18:49:58 +02:00
a-froghyar
34b80e0280 feat: updated recipes and lr fix (#1718)
- updated the recipes activating more losses for more stable training
- re-enabling guided attention loss
- fixed a bug where the wrong lr was fetched for logging
2022-07-12 15:00:53 +02:00
Eren Gölge
48a4f3647f Make lint 2022-07-12 14:58:26 +02:00
WeberJulian
c614f21982 Add durations as aux input for VITS (#1694)
* Add durations as aux input for VITS

* Make style

* Fix tts_tests

* Fix test_get_aux_input
2022-07-12 14:25:21 +02:00
Eren Gölge
2cf89b88c9 Make style 2022-07-12 14:12:57 +02:00
Eren Gölge
a6f73a18cb Fix BCELoss addressing #1192 2022-07-12 14:11:34 +02:00
Eren Gölge
eefd482f51 Separate loss tests 2022-07-12 12:35:46 +02:00
Eren Gölge
c17ff17a18 Fix SSIM loss 2022-07-12 12:35:24 +02:00
Eren Gölge
f1e35596e8 Remove redundant config field 2022-07-11 13:39:41 +02:00
WeberJulian
5cef6facb0 Fix tokenizer for punc only (#1717) 2022-07-06 22:59:41 +02:00
WeberJulian
9e00e31e37 Fix Publish CI (#1597)
* Try out manylinux

* temporary removal of useless pipeline

* remove check and use only manylinux

* Try --plat-name

* Add install requirements

* Add back other actions

* Add PR trigger

* Remove conditions

* Fix syntax

* Roll back some changes

* Add other python versions

* Add test pypi upload

* Add username

* Add back __token__ as username

* Modify name of entry to testpypi

* Set it to release only

* Fix version checking
2022-07-05 11:07:33 +02:00
camillem
5c821d9fa1 Fix the --model_name and --vocoder_name arguments need a <model_type> element (#1469)
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-06-27 10:32:43 +02:00
manmay nakhashi
577ec406f4 Fix checkpointing GAN models (#1641)
* checkpoint sae step crash fix

* checkpoint save step crash fix

* Update gan.py

updated requested changes

* crash fix
2022-06-22 12:07:46 +02:00
Eren Gölge
00e67092d8 Bump up to v0.7.1 2022-06-21 14:12:55 +02:00
Eren Gölge
3328be7a8e Remove GL message 2022-06-21 12:39:31 +02:00
WeberJulian
30c72e0d05 Add Thorsten VITS model (#1675)
Co-authored-by: Eren Gölge <egolge@coqui.ai>
2022-06-21 11:39:49 +02:00
p0p4k
71281ff1e4 Add support for model_info in CLI (#1623)
* model_info

* model_info

* model_info_by_idx and name

* model_info_by_idx and name

* model_info

* Update manage.py

* fixed linter

* fixed linter

* fixed linter

* fixed linter

* fixed return style checks

* fixed linter

* fixed linter

* fixed idx always positive

* added comments

* fix parser.args check

* fix parser.args check

* Make style

Co-authored-by: Eren Gölge <egolge@coqui.ai>
2022-06-20 23:28:17 +02:00
Eren Gölge
8b75e8be9c Bump up to v0.7.0 2022-06-20 13:50:09 +02:00
WeberJulian
6126c23498 Add synpaflex formatter (#1616)
* Add synpaflex formatter

* Fix formatter

* Make style
2022-06-20 13:36:26 +02:00
klotlabs
c44e39d9d6 Update training_a_model.md (#1620)
edited `...servers our needs.` to `...serves your needs.`
2022-06-08 23:40:05 +02:00
WeberJulian
f09ea11c71 Internal formatter (#1629)
* Add coqui formatter

* Make style
2022-06-08 14:31:03 +02:00
Aya-AlJafari
68cef28a88 Adding TTS Tutorials (#1584)
* Adding inferencing notebook

* added multispeaker explanation and usecase and renamed the file

* Adding training tutorial

* fixed dummy paths

* fixed review comments

* fixed metadata extension

Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-06-02 12:23:00 +02:00
Eren Gölge
f70e82cd19 Use fsspec and torch for embedding file IO (#1581)
* Use fsspec and torch for embedding file

* Fixup

* Fix load and save files

* Fix compute embedding script

* Set use_cuda to true if available

* Add dummy speakers.pth file

* Make style

* Change default speakers file extension

Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
2022-06-01 13:49:42 +02:00
Ryan Le-Nguyen
b6bd74a9a9 fix invalid json (#1599) 2022-05-31 10:20:10 +02:00
Noran Raskin
a790df4e94 Training recipes for thorsten dataset (#1020)
* Fix style

* Fix isort

* Remove tensorboardX from requirements

Co-authored-by: logan hart <72301874+loganhart420@users.noreply.github.com>
Co-authored-by: Eren Gölge <egolge@coqui.ai>
speaker_encoder_model
2022-05-30 12:07:31 +02:00
Eren Gölge
71111d14e4 Merge pull request #1587 from ribeiromiranda/patch-1
Fixed use_cuda issue in compute_embeddings.py
2022-05-29 14:51:08 +02:00
André R. de Miranda
3b84ef9524 Fixed use_cuda issue in compute_embeddings.py
Added use_cuda argument in self.init_encoder method
2022-05-20 12:46:46 -03:00
a-froghyar
8be21ec387 Capacitron (#977)
* new CI config

* initial Capacitron implementation

* delete old unused file

* fix empty formatting changes

* update losses and training script

* fix previous commit

* fix commit

* Add Capacitron test and first round of test fixes

* revert formatter change

* add changes to the synthesizer

* add stepwise gradual lr scheduler and changes to the recipe

* add inference script for dev use

* feat: add posterior inference arguments to synth methods
- added reference wav and text args for posterior inference
- some formatting

* fix: add espeak flag to base_tts and dataset APIs
- use_espeak_phonemes flag was not implemented in those APIs
- espeak is now able to be utilised for phoneme generation
- necessary phonemizer for the Capacitron model

* chore: update training script and style
- training script includes the espeak flag and other hyperparams
- made style

* chore: fix linting

* feat: add Tacotron 2 support

* leftover from dev

* chore:rename parser args

* feat: extract optimizers
- created a separate optimizer class to merge the two optimizers

* chore: revert arbitrary trainer changes

* fmt: revert formatting bug

* formatting again

* formatting fixed

* fix: log func

* fix: update optimizer
- Implemented load_state_dict for continuing training

* fix: clean optimizer init for standard models

* improvement: purge espeak flags and add training scripts

* Delete capacitronT2.py

delete old training script, new one is pushed

* feat: capacitron trainer methods
- extracted capacitron specific training operations from the trainer into custom
methods in taco1 and taco2 models

* chore: renaming and merging capacitron and gst style args

* fix: bug fixes from the previous commit

* fix: implement state_dict method on CapacitronOptimizer

* fix: call method

* fix: inference naming

* Delete train_capacitron.py

* fix: synthesize

* feat: update tests

* chore: fix style

* Delete capacitron_inference.py

* fix: fix train tts t2 capacitron tests

* fix: double forward in T2 train step

* fix: double forward in T1 train step

* fix: run make style

* fix: remove unused import

* fix: test for T1 capacitron

* fix: make lint

* feat: add blizzard2013 recipes

* make style

* fix: update recipes

* chore: make style

* Plot test sentences in Tacotron

* chore: make style and fix import

* fix: call forward first before problematic floordiv op

* fix: update recipes

* feat: add min_audio_len to recipes

* aux_input["style_mel"]

* chore: make style

* Make capacitron T2 recipe more stable

* Remove T1 capacitron Ljspeech

* feat: implement new grad clipping routine and update configs

* make style

* Add pretrained checkpoints

* Add default vocoder

* Change trainer package

* Fix grad clip issue for tacotron

* Fix scheduler issue with tacotron

Co-authored-by: Eren Gölge <egolge@coqui.ai>
Co-authored-by: WeberJulian <julian.weber@hotmail.fr>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2022-05-20 16:17:11 +02:00