forked from yl4579/StyleTTS2
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
12 changed files
with
213 additions
and
54 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -66,5 +66,5 @@ | |
"use_spectral_norm": false, | ||
"gin_channels": 256 | ||
}, | ||
"version": "1.2" | ||
"version": "1.3" | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
# CLI | ||
|
||
|
||
|
||
## Dataset | ||
|
||
`Dataset.bat` webui (`python webui_dataset.py`) consists of **slice audio** and **transcribe wavs**. | ||
|
||
### Slice audio | ||
|
||
```bash | ||
python slice.py -i <input_dir> -o <output_dir> -m <min_sec> -M <max_sec> | ||
``` | ||
|
||
Required: | ||
- `input_dir`: Path to the directory containing the audio files to slice. | ||
- `output_dir`: Path to the directory where the sliced audio files will be saved. | ||
|
||
Optional: | ||
- `min_sec`: Minimum duration of the sliced audio files in seconds (default 2). | ||
- `max_sec`: Maximum duration of the sliced audio files in seconds (default 12). | ||
|
||
### Transcribe wavs | ||
|
||
```bash | ||
python transcribe.py -i <input_dir> -o <output_file> --speaker_name <speaker_name> | ||
``` | ||
|
||
Required: | ||
- `input_dir`: Path to the directory containing the audio files to transcribe. | ||
- `output_file`: Path to the file where the transcriptions will be saved. | ||
- `speaker_name`: Name of the speaker. | ||
|
||
Optional | ||
- `--initial_prompt`: Initial prompt to use for the transcription (default value is specific to Japanese). | ||
- `--device`: `cuda` or `cpu` (default: `cuda`). | ||
- `--language`: `jp`, `en`, or `en` (default: `jp`). | ||
- `--model`: Whisper model, default: `large-v3` | ||
- `--compute_type`: default: `bfloat16` | ||
|
||
## Train | ||
|
||
`Train.bat` webui (`python webui_train.py`) consists of the following. | ||
|
||
### Preprocess audio | ||
```bash | ||
python resample.py -i <input_dir> -o <output_dir> [--normalize] [--trim] | ||
``` | ||
|
||
Required: | ||
- `input_dir`: Path to the directory containing the audio files to preprocess. | ||
- `output_dir`: Path to the directory where the preprocessed audio files will be saved. | ||
|
||
TO BE WRITTEN (WIP) | ||
|
||
これいる? |
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.