docs

2025-12-16 11:47:48 +01:00 · 2025-12-05 16:55:14 +02:00
parent 61632464bd
commit bdbd028bc8
3 changed files with 73 additions and 6 deletions
--- a/docs/user-guide/agents/voice/chatterbox.md
+++ b/docs/user-guide/agents/voice/chatterbox.md
@@ -2,6 +2,15 @@

 Local zero shot voice cloning from .wav files.

+!!! warning "FFmpeg Required"
+    Chatterbox requires FFmpeg for audio processing. If you encounter errors like `Could not load libtorchcodec` or `FFmpeg version 8: Could not load this library`, you need to install FFmpeg.
+
+    **Windows:** Run `install-ffmpeg.bat` from the Talemate root directory.
+
+    **Linux/macOS:** Install FFmpeg using your system package manager (versions 4-8 supported).
+
+    See the [TTS Troubleshooting Guide](troubleshooting.md#ffmpeg-not-found) for more details.
+
 ![Chatterbox API settings](/talemate/img/0.32.0/chatterbox-api-settings.png)

 ##### Device
--- a/docs/user-guide/agents/voice/index.md
+++ b/docs/user-guide/agents/voice/index.md
@@ -13,14 +13,18 @@ In 0.32.0 Talemate's TTS (Text-to-Speech) agent has been completely refactored t
 ## Supported APIs

 ### Local APIs
- **Kokoro** - Fastest generation with predefined voice models and mixing
- **F5-TTS** - Fast voice cloning with occasional mispronunciations
- **Chatterbox** - High-quality voice cloning (slower generation)
+- **[Kokoro](kokoro.md)** - Fastest generation with predefined voice models and mixing
+- **[F5-TTS](f5tts.md)** - Fast voice cloning with occasional mispronunciations
+- **[Chatterbox](chatterbox.md)** - High-quality voice cloning (slower generation)

 ### Remote APIs
- **ElevenLabs** - Professional voice synthesis with voice cloning
- **Google Gemini-TTS** - Google's text-to-speech service
- **OpenAI** - OpenAI's TTS-1 and TTS-1-HD models
+- **[ElevenLabs](elevenlabs.md)** - Professional voice synthesis with voice cloning
+- **[Google Gemini-TTS](google.md)** - Google's text-to-speech service
+- **[OpenAI](openai.md)** - OpenAI's TTS-1 and TTS-1-HD models
+
+## Troubleshooting
+
+Having issues with TTS? See the [TTS Troubleshooting Guide](troubleshooting.md) for common problems and solutions, including FFmpeg installation and audio playback issues.

 ## Enable the Voice agent

--- a/docs/user-guide/agents/voice/troubleshooting.md
+++ b/docs/user-guide/agents/voice/troubleshooting.md
@@ -0,0 +1,54 @@
+# TTS Troubleshooting
+
+Common issues and solutions for Text-to-Speech functionality in Talemate.
+
+## FFmpeg Not Found
+
+Some TTS providers, including [Chatterbox](chatterbox.md), require FFmpeg for audio processing.
+
+### Symptoms
+
+You may encounter errors like:
+
+```
+Could not load libtorchcodec. Likely causes:
+ 1. FFmpeg is not properly installed in your environment. We support
+ versions 4, 5, 6, 7, and 8.
+ 2. The PyTorch version is not compatible with this version of TorchCodec.
+```
+
+Or:
+
+```
+FFmpeg version 8: Could not load this library
+```
+
+You may also see other FFmpeg-related import or loading errors.
+
+### Solution
+
+#### Windows
+
+Run the included `install-ffmpeg.bat` script from the Talemate root directory:
+
+```batch
+install-ffmpeg.bat
+```
+
+This will automatically download and install FFmpeg 8.0.1 into your virtual environment.
+
+#### Linux/macOS
+
+Install FFmpeg using your system's package manager. FFmpeg versions 4, 5, 6, 7, or 8 are supported.
+
+### Verification
+
+After installing FFmpeg, verify it's accessible by running:
+
+```bash
+ffmpeg -version
+```
+
+You should see output showing the FFmpeg version (4.x, 5.x, 6.x, 7.x, or 8.x are all supported).
+
+**Important:** Restart Talemate after installing FFmpeg for the changes to take effect.
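
The verification step in `troubleshooting.md` could also be scripted. The following is a minimal sketch and is not part of this commit: the helper names `parse_ffmpeg_major` and `check_ffmpeg` are hypothetical, and the version regex assumes the usual `ffmpeg version X.Y...` first line, so snapshot or git builds may not parse. It only confirms that an `ffmpeg` executable on `PATH` reports a supported major version. It does not prove that `libtorchcodec` will load.

```python
import re
import shutil
import subprocess
from typing import Optional

# Major versions listed as supported in the troubleshooting guide above.
SUPPORTED_MAJOR_VERSIONS = {4, 5, 6, 7, 8}


def parse_ffmpeg_major(version_output: str) -> Optional[int]:
    """Return the major version from `ffmpeg -version` output, or None.

    Assumes the first line looks like 'ffmpeg version 8.0.1-...' or
    'ffmpeg version n7.1 ...'; other formats (e.g. git snapshots) yield None.
    """
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"ffmpeg version n?(\d+)\.", first_line)
    return int(match.group(1)) if match else None


def check_ffmpeg() -> str:
    """Describe whether an ffmpeg binary on PATH reports a supported version."""
    path = shutil.which("ffmpeg")
    if path is None:
        return "ffmpeg not found on PATH"
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    major = parse_ffmpeg_major(result.stdout)
    if major is None:
        return f"ffmpeg found at {path}, but its version could not be parsed"
    if major in SUPPORTED_MAJOR_VERSIONS:
        return f"ffmpeg {major}.x found at {path} (supported)"
    return f"ffmpeg {major}.x found at {path} (outside the supported 4-8 range)"


if __name__ == "__main__":
    print(check_ffmpeg())
```

Run it from the same virtual environment Talemate uses, since on Windows `install-ffmpeg.bat` installs FFmpeg into that environment rather than system-wide.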