
Pretrained models

Corentin Jemine edited this page Jun 21, 2019 · 11 revisions

Pretrained models come as a single archive that contains all three models (speaker encoder, synthesizer, vocoder). The archive has the same directory structure as the repo, so extract it and merge its contents into the root of the repository.
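If you'd rather script the merge than do it by hand, here is a minimal sketch. It assumes the download is a zip file; the function name `merge_archive_into_repo` and the file name `pretrained.zip` are made up for illustration and are not part of the repo. Extracting straight into the repository root merges the archive's folders with the existing ones and overwrites files that share a path.

```python
import zipfile
from pathlib import Path


def merge_archive_into_repo(archive_path, repo_root):
    """Extract a zip archive on top of repo_root and return the extracted file names.

    Assumption: the archive is a zip whose top-level folders mirror the repo layout.
    Existing folders are kept; files with the same path are overwritten.
    """
    repo_root = Path(repo_root)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(repo_root)
        # Directory entries end with "/"; keep only regular files in the report.
        return [name for name in zf.namelist() if not name.endswith("/")]


if __name__ == "__main__":
    # Hypothetical archive name; run this from the root of the repository.
    archive = Path("pretrained.zip")
    if archive.exists():
        for name in merge_archive_into_repo(archive, "."):
            print("extracted", name)
```

Afterwards, check that the model files landed under the encoder, synthesizer and vocoder folders of the repo rather than in a nested copy of the archive's top-level folder.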

Initial commit (latest release) [Google drive]

Encoder: trained 1.56M steps (20 days with a single GPU) with a batch size of 64
Synthesizer: trained 256k steps (1 week with 4 GPUs) with a batch size of 144
Vocoder: trained 428k steps (4 days with a single GPU) with a batch size of 100
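To compare the three runs, the figures above can be turned into GPU-days, steps per day, and total batch items processed. This is only back-of-the-envelope arithmetic: it assumes "1 week" means 7 days, treats the listed durations as exact, and counts batch items in whatever unit each model's batch size uses.

```python
# (steps, days, gpus, batch size), taken from the list above.
# Assumption: "1 week" is counted as 7 days.
RUNS = {
    "encoder": (1_560_000, 20, 1, 64),
    "synthesizer": (256_000, 7, 4, 144),
    "vocoder": (428_000, 4, 1, 100),
}


def summarize(steps, days, gpus, batch_size):
    """Derive rough cost and throughput figures for one training run."""
    return {
        "gpu_days": days * gpus,            # wall-clock days times number of GPUs
        "steps_per_day": steps / days,      # optimizer steps per wall-clock day
        "batch_items": steps * batch_size,  # total items seen, in batch-size units
    }


summary = {name: summarize(*run) for name, run in RUNS.items()}

for name, s in summary.items():
    print(f"{name}: {s['gpu_days']} GPU-days, "
          f"{s['steps_per_day']:,.0f} steps/day, "
          f"{s['batch_items']:,} batch items")
```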