docs

2025-12-16 11:47:48 +01:00 · 2025-12-05 16:55:14 +02:00
parent 61632464bd
commit bdbd028bc8
3 changed files with 73 additions and 6 deletions
--- a/docs/user-guide/agents/voice/chatterbox.md
+++ b/docs/user-guide/agents/voice/chatterbox.md
@@ -2,6 +2,15 @@
 Local zero-shot voice cloning from .wav files.
 !!! warning "FFmpeg Required"
    Chatterbox requires FFmpeg for audio processing. If you encounter errors like `Could not load libtorchcodec` or `FFmpeg version 8: Could not load this library`, you need to install FFmpeg.
    **Windows:** Run `install-ffmpeg.bat` from the Talemate root directory.
    **Linux/macOS:** Install FFmpeg using your system package manager (versions 4-8 supported).
    See the [TTS Troubleshooting Guide](troubleshooting.md#ffmpeg-not-found) for more details.
 ![Chatterbox API settings](/talemate/img/0.32.0/chatterbox-api-settings.png)
 ##### Device
--- a/docs/user-guide/agents/voice/index.md
+++ b/docs/user-guide/agents/voice/index.md
@@ -13,14 +13,18 @@ In 0.32.0 Talemate's TTS (Text-to-Speech) agent has been completely refactored t
 ## Supported APIs
 ### Local APIs
- **Kokoro** - Fastest generation with predefined voice models and mixing
+- **[Kokoro](kokoro.md)** - Fastest generation with predefined voice models and mixing
- **F5-TTS** - Fast voice cloning with occasional mispronunciations
+- **[F5-TTS](f5tts.md)** - Fast voice cloning with occasional mispronunciations
- **Chatterbox** - High-quality voice cloning (slower generation)
+- **[Chatterbox](chatterbox.md)** - High-quality voice cloning (slower generation)
 ### Remote APIs
- **ElevenLabs** - Professional voice synthesis with voice cloning
+- **[ElevenLabs](elevenlabs.md)** - Professional voice synthesis with voice cloning
- **Google Gemini-TTS** - Google's text-to-speech service
+- **[Google Gemini-TTS](google.md)** - Google's text-to-speech service
- **OpenAI** - OpenAI's TTS-1 and TTS-1-HD models
+- **[OpenAI](openai.md)** - OpenAI's TTS-1 and TTS-1-HD models
 ## Troubleshooting
 Having issues with TTS? See the [TTS Troubleshooting Guide](troubleshooting.md) for common problems and solutions, including FFmpeg installation and audio playback issues.
 ## Enable the Voice agent
--- a/docs/user-guide/agents/voice/troubleshooting.md
+++ b/docs/user-guide/agents/voice/troubleshooting.md
@@ -0,0 +1,54 @@
 # TTS Troubleshooting
 Common issues and solutions for Text-to-Speech functionality in Talemate.
 ## FFmpeg Not Found
 Several TTS providers, including [Chatterbox](chatterbox.md), require FFmpeg for audio processing.
 ### Symptoms
 You may encounter errors like:
 ```
 Could not load libtorchcodec. Likely causes:
 1. FFmpeg is not properly installed in your environment. We support
 versions 4, 5, 6, 7, and 8.
 2. The PyTorch version is not compatible with this version of TorchCodec.
 ```
 Or:
 ```
 FFmpeg version 8: Could not load this library
 ```
 You may also see other FFmpeg-related import or loading errors.
 ### Solution
 #### Windows
 Run the included `install-ffmpeg.bat` script from the Talemate root directory:
 ```batch
 install-ffmpeg.bat
 ```
 This will automatically download and install FFmpeg 8.0.1 into your virtual environment.
 #### Linux/macOS
 Install FFmpeg using your system's package manager. FFmpeg versions 4, 5, 6, 7, or 8 are supported.
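The exact command depends on which package manager is present. As a rough sketch, the Python snippet below looks for a few common package managers and suggests the usual install command for each. The `INSTALL_COMMANDS` table and the `suggest_install_command` helper are hypothetical illustrations and not part of Talemate, and package names can differ between distributions, so treat the output as a starting point.

```python
import shutil

# Hypothetical lookup table: package manager binary -> usual FFmpeg install command.
# Assumption: the package is simply named "ffmpeg"; some distributions differ.
INSTALL_COMMANDS = {
    "apt-get": "sudo apt-get install ffmpeg",
    "pacman": "sudo pacman -S ffmpeg",
    "brew": "brew install ffmpeg",
}


def suggest_install_command(which=shutil.which):
    """Return the install command for the first package manager found, else None.

    `which` is injectable so the lookup can be exercised without touching PATH.
    """
    for manager, command in INSTALL_COMMANDS.items():
        if which(manager) is not None:
            return command
    return None


if __name__ == "__main__":
    command = suggest_install_command()
    print(command or "No known package manager found; install FFmpeg manually.")
```

If none of the listed package managers is found, the helper returns `None` and FFmpeg has to be installed by whatever means your system provides.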
 ### Verification
 After installing FFmpeg, verify it's accessible by running:
 ```bash
 ffmpeg -version
 ```
 You should see output showing the FFmpeg version (4.x, 5.x, 6.x, 7.x, or 8.x are all supported).
 **Important:** Restart Talemate after installing FFmpeg for the changes to take effect.
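
The verification step above can also be scripted. The following is a minimal sketch, assuming the first line of `ffmpeg -version` starts with `ffmpeg version <major>...`, which holds for typical release builds but not necessarily for development snapshots. The helper names `parse_major_version` and `check_ffmpeg` are hypothetical and not part of Talemate. The script only checks the command-line binary; whether TorchCodec can load the FFmpeg libraries is a separate matter.

```python
import re
import shutil
import subprocess

# Major versions listed as supported in the error message quoted earlier.
SUPPORTED_MAJORS = {4, 5, 6, 7, 8}


def parse_major_version(version_output):
    """Extract the major version from the first line of `ffmpeg -version`, or None."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    # Some builds prefix the version with a lowercase "n" (e.g. "n7.1").
    match = re.match(r"ffmpeg version n?(\d+)", first_line)
    return int(match.group(1)) if match else None


def check_ffmpeg():
    """Return (ok, message) describing whether a supported FFmpeg is on PATH."""
    if shutil.which("ffmpeg") is None:
        return False, "ffmpeg not found on PATH"
    result = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
    major = parse_major_version(result.stdout)
    if major is None:
        return False, "could not parse the FFmpeg version"
    if major not in SUPPORTED_MAJORS:
        return False, f"FFmpeg {major}.x is outside the supported range"
    return True, f"FFmpeg {major}.x detected"


if __name__ == "__main__":
    ok, message = check_ffmpeg()
    print(("OK: " if ok else "Problem: ") + message)
```

Run it from the same environment that Talemate uses, since that is where FFmpeg needs to be visible.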