vegu-ai-tools
2025-12-05 16:55:14 +02:00
parent 61632464bd
commit bdbd028bc8
3 changed files with 73 additions and 6 deletions


@@ -2,6 +2,15 @@
Local zero shot voice cloning from .wav files.
!!! warning "FFmpeg Required"
    Chatterbox requires FFmpeg for audio processing. If you encounter errors like `Could not load libtorchcodec` or `FFmpeg version 8: Could not load this library`, you need to install FFmpeg.

    **Windows:** Run `install-ffmpeg.bat` from the Talemate root directory.

    **Linux/macOS:** Install FFmpeg using your system package manager (versions 4-8 supported).

    See the [TTS Troubleshooting Guide](troubleshooting.md#ffmpeg-not-found) for more details.
![Chatterbox API settings](/talemate/img/0.32.0/chatterbox-api-settings.png)
##### Device


@@ -13,14 +13,18 @@ In 0.32.0 Talemate's TTS (Text-to-Speech) agent has been completely refactored t
## Supported APIs
### Local APIs
- **[Kokoro](kokoro.md)** - Fastest generation with predefined voice models and mixing
- **[F5-TTS](f5tts.md)** - Fast voice cloning with occasional mispronunciations
- **[Chatterbox](chatterbox.md)** - High-quality voice cloning (slower generation)
### Remote APIs
- **[ElevenLabs](elevenlabs.md)** - Professional voice synthesis with voice cloning
- **[Google Gemini-TTS](google.md)** - Google's text-to-speech service
- **[OpenAI](openai.md)** - OpenAI's TTS-1 and TTS-1-HD models
## Troubleshooting
Having issues with TTS? See the [TTS Troubleshooting Guide](troubleshooting.md) for common problems and solutions, including FFmpeg installation and audio playback issues.
## Enable the Voice agent


@@ -0,0 +1,54 @@
# TTS Troubleshooting
Common issues and solutions for Text-to-Speech functionality in Talemate.
## FFmpeg Not Found
Some TTS providers, including [Chatterbox](chatterbox.md), require FFmpeg for audio processing.
### Symptoms
You may encounter errors like:
```
Could not load libtorchcodec. Likely causes:
1. FFmpeg is not properly installed in your environment. We support
versions 4, 5, 6, 7, and 8.
2. The PyTorch version is not compatible with this version of TorchCodec.
```
Or:
```
FFmpeg version 8: Could not load this library
```
You may also see more generic FFmpeg-related import or loading errors.
### Solution
#### Windows
Run the included `install-ffmpeg.bat` script from the Talemate root directory:
```batch
install-ffmpeg.bat
```
This will automatically download and install FFmpeg 8.0.1 into your virtual environment.
#### Linux/macOS
Install FFmpeg using your system's package manager. FFmpeg versions 4, 5, 6, 7, or 8 are supported.
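As a rough illustration of what "your system's package manager" can look like, the sketch below maps a few common package managers to their usual FFmpeg install command. The helper name `ffmpeg_install_hint` is hypothetical and not part of Talemate; it only prints the command so you can review it before running it. The version you get depends on your distribution, so it is worth checking that it falls in the supported 4 to 8 range.

```bash
# Hypothetical helper (not part of Talemate): print, without running,
# a typical FFmpeg install command for the named package manager.
ffmpeg_install_hint() {
    case "$1" in
        apt)    echo "sudo apt install ffmpeg" ;;   # Debian, Ubuntu
        pacman) echo "sudo pacman -S ffmpeg" ;;     # Arch Linux
        brew)   echo "brew install ffmpeg" ;;       # macOS with Homebrew
        *)      echo "unknown package manager: $1" >&2; return 1 ;;
    esac
}
```

For example, `ffmpeg_install_hint brew` prints the Homebrew command. Other distributions may package FFmpeg under a different name or repository.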
### Verification
After installing FFmpeg, verify it's accessible by running:
```bash
ffmpeg -version
```
You should see output showing the FFmpeg version (4.x, 5.x, 6.x, 7.x, or 8.x are all supported).
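If you prefer a scripted check, the following is a minimal sketch that reads the first line of `ffmpeg -version` and tests whether the major version is in the 4 to 8 range. The function names are hypothetical, and the parsing assumes a version string of the form `ffmpeg version 6.1.1 ...` or `ffmpeg version n8.0.1 ...`; some nightly or git builds use other formats and will be reported as unrecognized.

```bash
# Sketch only: these helpers are illustrative and not shipped with Talemate.

# Extract the major version from the first line of `ffmpeg -version` output.
# Prints nothing if the version string does not start with digits
# (optionally prefixed by "n"), as with some nightly/git builds.
ffmpeg_major() {
    printf '%s\n' "$1" | sed -nE 's/^ffmpeg version n?([0-9]+)\..*/\1/p'
}

# Succeed only for the major versions the guide lists as supported (4-8).
is_supported_major() {
    case "$1" in
        4|5|6|7|8) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the ffmpeg binary found on PATH and report the result.
check_ffmpeg() {
    if ! command -v ffmpeg >/dev/null 2>&1; then
        echo "ffmpeg not found on PATH"
        return 1
    fi
    local first_line major
    first_line=$(ffmpeg -version 2>/dev/null | head -n 1)
    major=$(ffmpeg_major "$first_line")
    if is_supported_major "$major"; then
        echo "ffmpeg major version $major is supported"
    else
        echo "unrecognized or unsupported ffmpeg version: $first_line"
        return 1
    fi
}
```

After sourcing these functions, run `check_ffmpeg` in the same shell (and, on Windows-style setups, the same virtual environment) that Talemate uses, since a different shell may see a different `PATH`.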
**Important:** Restart Talemate after installing FFmpeg for the changes to take effect.