0.35.0 (#242)
Major features:
- Autonomous scene direction via director agent (replaces auto-direct)
- Inline image display in scene feed
- Character visuals tab for portrait/cover image management
- Character message avatars with dynamic portrait selection
- Pocket TTS and llama.cpp client support
- Message appearance overhaul with configurable markdown display

Improvements:
- KoboldCpp: adaptive-p, min-p, presence/frequency penalty support
- Setup wizard for initial configuration
- Director chat action toggles
- Visual agent: resolution presets, prompt revision, auto-analysis
- Experimental concurrent requests for hosted LLM clients
- Node editor alignment shortcuts (X/Y) and color picker

Bugfixes:
- Empty response retry loop
- Client system prompt display
- Character detail pins loading
- ComfyUI workflow charset encoding
- Various layout and state issues

Breaking: Removed INSTRUCTOR embeddings
2
.gitattributes
vendored
Normal file
@@ -0,0 +1,2 @@
|
||||
# Force Unix line endings for shell scripts (prevents CRLF issues in Docker)
|
||||
*.sh text eol=lf
|
||||
2
.github/workflows/test.yml
vendored
@@ -11,7 +11,7 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
matrix:
|
||||
python-version: ['3.10', '3.11', '3.12', '3.13']
|
||||
python-version: ['3.11']
|
||||
fail-fast: false
|
||||
|
||||
steps:
|
||||
|
||||
3
.gitignore
vendored
@@ -32,4 +32,5 @@ scenes/
|
||||
tts_voice_samples/*.wav
|
||||
third-party-docs/
|
||||
legacy-state-reinforcements.yaml
|
||||
CLAUDE.md
|
||||
CLAUDE.md
|
||||
docs/update-progress
|
||||
|
||||
18
Dockerfile
@@ -48,6 +48,7 @@ RUN apt-get update && apt-get install -y \
|
||||
wget \
|
||||
tar \
|
||||
xz-utils \
|
||||
gettext-base \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Install uv in the final stage
|
||||
@@ -77,6 +78,9 @@ COPY --from=backend-build /app/src /app/src
|
||||
# Copy Node.js build artifacts from frontend-build stage
|
||||
COPY --from=frontend-build /app/dist /app/talemate_frontend/dist
|
||||
|
||||
# Preserve index.html as template for runtime envsubst substitution
|
||||
COPY --from=frontend-build /app/dist/index.html /app/talemate_frontend/dist/index.template.html
|
||||
|
||||
# Copy the frontend WSGI file if it exists
|
||||
COPY frontend_wsgi.py /app/frontend_wsgi.py
|
||||
|
||||
@@ -84,7 +88,14 @@ COPY frontend_wsgi.py /app/frontend_wsgi.py
|
||||
COPY config.example.yaml /app/config.yaml
|
||||
|
||||
# Copy essentials
|
||||
COPY scenes templates chroma* /app/
|
||||
COPY scenes/ /app/scenes/
|
||||
COPY templates/ /app/templates/
|
||||
COPY chroma* /app/
|
||||
COPY tts/ /app/tts/
|
||||
|
||||
# Copy entrypoint script for runtime environment variable substitution
|
||||
COPY docker-entrypoint.sh /app/docker-entrypoint.sh
|
||||
RUN chmod +x /app/docker-entrypoint.sh
|
||||
|
||||
# Set PYTHONPATH to include the src directory
|
||||
ENV PYTHONPATH=/app/src:$PYTHONPATH
|
||||
@@ -93,5 +104,6 @@ ENV PYTHONPATH=/app/src:$PYTHONPATH
|
||||
EXPOSE 5050
|
||||
EXPOSE 8080
|
||||
|
||||
# Use bash as the shell, activate the virtual environment, and run backend server
|
||||
CMD ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]
|
||||
# Use entrypoint for runtime config, CMD for the actual server
|
||||
ENTRYPOINT ["/app/docker-entrypoint.sh"]
|
||||
CMD ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]
|
||||
|
||||
@@ -9,13 +9,9 @@ creator:
|
||||
- an epic sci-fi adventure
|
||||
|
||||
## Long-term memory
|
||||
## Embeddings are now managed through the application UI
|
||||
## See: Presets -> Embeddings in the application settings
|
||||
|
||||
#chromadb:
|
||||
# embeddings: instructor
|
||||
# instructor_device: cuda
|
||||
# instructor_model: hkunlp/instructor-xl
|
||||
# openai_model: text-embedding-3-small
|
||||
|
||||
## Remote LLMs
|
||||
|
||||
#openai:
|
||||
|
||||
@@ -17,4 +17,5 @@ services:
|
||||
environment:
|
||||
- PYTHONUNBUFFERED=1
|
||||
- PYTHONPATH=/app/src:$PYTHONPATH
|
||||
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-}
|
||||
command: ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]
|
||||
@@ -15,4 +15,5 @@ services:
|
||||
environment:
|
||||
- PYTHONUNBUFFERED=1
|
||||
- PYTHONPATH=/app/src:$PYTHONPATH
|
||||
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-}
|
||||
command: ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]
|
||||
38
docker-entrypoint.sh
Normal file
@@ -0,0 +1,38 @@
|
||||
#!/bin/bash
set -e

#
# Talemate Docker Entrypoint
#
# Replaces environment variable placeholders in the frontend bundle
# at container startup, enabling runtime configuration.
#
# Environment Variables:
#   VITE_TALEMATE_BACKEND_WEBSOCKET_URL - WebSocket URL for backend connection
#                                         Leave empty/unset for auto-detection
#

TEMPLATE_FILE="/app/talemate_frontend/dist/index.template.html"
OUTPUT_FILE="/app/talemate_frontend/dist/index.html"

echo "============================================"
echo "Talemate Docker Entrypoint"
echo "============================================"

if [ -f "$TEMPLATE_FILE" ]; then
    echo "Substituting environment variables..."
    echo "  VITE_TALEMATE_BACKEND_WEBSOCKET_URL: ${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-<not set, will auto-detect>}"

    # Use envsubst to replace ${VAR} placeholders with actual values
    envsubst < "$TEMPLATE_FILE" > "$OUTPUT_FILE"

    echo "Runtime config applied to index.html"
else
    echo "Warning: Template file not found, skipping envsubst"
fi

echo "Starting Talemate..."
echo "============================================"

# Execute the main command
exec "$@"
|
||||
@@ -1,5 +1,6 @@
|
||||
nav:
|
||||
- 1. Installation: installation
|
||||
- 2. Connect a client: connect-a-client.md
|
||||
- 3. Load a scene: load-a-scene.md
|
||||
- 2. Setup Wizard: setup-wizard.md
|
||||
- 3. Connect a client: connect-a-client.md
|
||||
- 4. Load a scene: load-a-scene.md
|
||||
- ...
|
||||
@@ -1,3 +1,4 @@
|
||||
nav:
|
||||
- change-host-and-port.md
|
||||
- debug-logging.md
|
||||
- ...
|
||||
@@ -96,4 +96,33 @@ Start the backend and frontend as usual.
|
||||
|
||||
```batch
|
||||
start_custom.bat
|
||||
```
|
||||
```
|
||||
|
||||
## Docker Runtime Configuration

For Docker deployments, you can configure the WebSocket URL at container startup without rebuilding the image.

### Setting WebSocket URL via Environment Variable

```yaml
# docker-compose.yml
services:
  talemate:
    environment:
      - VITE_TALEMATE_BACKEND_WEBSOCKET_URL=ws://your-backend-host:5050/ws
```

Or via command line:

```bash
VITE_TALEMATE_BACKEND_WEBSOCKET_URL=ws://192.168.1.100:5050/ws docker compose up
```

### Configuration Priority

The WebSocket URL is determined in this order:

1. **Runtime environment variable** (`VITE_TALEMATE_BACKEND_WEBSOCKET_URL` at container start)
2. **Auto-detection** (`ws://<current-browser-hostname>:5050/ws`)

This means you can use a single Docker image across different environments (staging, production) by simply changing the environment variable.
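
The priority order described above can be sketched as a small shell function. This is only an illustration: the real lookup happens in the frontend, and `resolve_ws_url` and its hostname argument are hypothetical names introduced here, not part of Talemate.

```bash
# Hypothetical helper mirroring the documented priority order.
# $1 is the hostname the browser is using (assumed input for auto-detection).
resolve_ws_url() {
    if [ -n "${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-}" ]; then
        # 1. A non-empty runtime environment variable takes precedence
        printf '%s\n' "$VITE_TALEMATE_BACKEND_WEBSOCKET_URL"
    else
        # 2. Otherwise fall back to auto-detection on port 5050
        printf 'ws://%s:5050/ws\n' "$1"
    fi
}

# Prints the URL that would be chosen for a browser pointed at "localhost"
resolve_ws_url "localhost"
```

An empty value is treated the same as an unset one, which matches the `${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-}` default used in the compose files.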
|
||||
51
docs/getting-started/advanced/debug-logging.md
Normal file
@@ -0,0 +1,51 @@
|
||||
# Debug Logging

By default, Talemate logs at the `INFO` level. To enable more verbose `DEBUG` logging, set the `TALEMATE_DEBUG` environment variable to `1` before starting the server.

This will output detailed debug information from all components, which is useful for troubleshooting issues or reporting bugs.

#### :material-linux: Linux

Prefix the start command with the environment variable:

```bash
TALEMATE_DEBUG=1 ./start.sh
```

Or if running manually:

```bash
TALEMATE_DEBUG=1 uv run src/talemate/server/run.py runserver --host 0.0.0.0 --port 5050
```

#### :material-microsoft-windows: Windows

Set the environment variable before running the start script:

```batch
SET TALEMATE_DEBUG=1
start.bat
```

## Disabling debug logging

To return to normal logging, unset the variable or set it to `0`:

#### :material-linux: Linux

```bash
unset TALEMATE_DEBUG
```

Or simply run without the prefix:

```bash
./start.sh
```

#### :material-microsoft-windows: Windows

```batch
SET TALEMATE_DEBUG=0
start.bat
```
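
To check which level a given shell session would produce before starting the server, a small helper along these lines may be useful. It is a sketch, not Talemate code: `talemate_log_level` is a hypothetical name, and it assumes that only the value `1` enables debug output, as in the examples above.

```bash
# Hypothetical helper: reports the log level implied by TALEMATE_DEBUG.
# Assumption: only the value "1" switches logging to DEBUG; an unset
# variable or any other value (such as "0") leaves it at INFO.
talemate_log_level() {
    if [ "${TALEMATE_DEBUG:-0}" = "1" ]; then
        echo "DEBUG"
    else
        echo "INFO"
    fi
}

echo "Talemate would log at: $(talemate_log_level)"
```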
|
||||
@@ -1,5 +1,8 @@
|
||||
# Connect a client
|
||||
|
||||
!!! note "First time setup?"
|
||||
If this is your first time launching Talemate, the [Setup Wizard](setup-wizard.md) will guide you through adding your first client and configuring essential settings. This page covers manual client configuration for adding additional clients or if you skipped the wizard.
|
||||
|
||||
Once Talemate is up and running and you are connected, you will see a notification in the corner instructing you to configure a client.
|
||||
|
||||

|
||||
@@ -36,20 +39,23 @@ Select the API you want to use and click through to follow the instructions to c
|
||||
- [Anthropic](/talemate/user-guide/clients/types/anthropic/)
|
||||
- [mistral.ai](/talemate/user-guide/clients/types/mistral/)
|
||||
- [Cohere](/talemate/user-guide/clients/types/cohere/)
|
||||
- [DeepSeek](/talemate/user-guide/clients/types/deepseek/)
|
||||
- [Groq](/talemate/user-guide/clients/types/groq/)
|
||||
- [Google Gemini](/talemate/user-guide/clients/types/google/)
|
||||
- [OpenRouter](/talemate/user-guide/clients/types/openrouter/)
|
||||
|
||||
##### Local APIs
|
||||
|
||||
- [KoboldCpp](/talemate/user-guide/clients/types/koboldcpp/)
|
||||
- [Text-Generation-WebUI](/talemate/user-guide/clients/types/text-generation-webui/)
|
||||
- [llama.cpp](/talemate/user-guide/clients/types/llamacpp/)
|
||||
- [Ollama](/talemate/user-guide/clients/types/ollama/)
|
||||
- [Text-Generation-WebUI](/talemate/user-guide/clients/types/text-generation-webui/)
|
||||
- [LMStudio](/talemate/user-guide/clients/types/lmstudio/)
|
||||
- [TabbyAPI](/talemate/user-guide/clients/types/tabbyapi/)
|
||||
|
||||
##### Unofficial OpenAI API implementations
|
||||
|
||||
- [DeepInfra](/talemate/user-guide/clients/types/openai-compatible/#deepinfra)
|
||||
- llamacpp with the `api_like_OAI.py` wrapper
|
||||
|
||||
## Assign the client to the agents
|
||||
|
||||
|
||||
@@ -16,10 +16,54 @@ This happens because we mount the config file directly as a docker volume, and i
|
||||
|
||||
This will eventually be fixed, for now please make sure to copy the example config file before running the docker compose command.
|
||||
|
||||
### Configuring WebSocket URL at Runtime

If you need to connect the frontend to a backend running on a different host or port (e.g., behind a reverse proxy), you can configure this at container startup without rebuilding the image.

Set the `VITE_TALEMATE_BACKEND_WEBSOCKET_URL` environment variable:

```bash
# Using docker run
docker run -e VITE_TALEMATE_BACKEND_WEBSOCKET_URL=wss://api.example.com/ws ghcr.io/vegu-ai/talemate:latest

# Using docker-compose.yml
services:
  talemate:
    environment:
      - VITE_TALEMATE_BACKEND_WEBSOCKET_URL=wss://api.example.com/ws
```

**URL Format:**

- Use `ws://` for unencrypted connections
- Use `wss://` for SSL/TLS connections (required if behind HTTPS proxy)
- Include the `/ws` path suffix

**If not set**, the frontend automatically connects to `ws://<current-hostname>:5050/ws`.
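
The URL format rules above can be checked before starting the container. The following is a minimal sketch under stated assumptions: `check_ws_url` is a hypothetical helper, and it only tests the scheme and the `/ws` suffix, not whether the host is reachable.

```bash
# Hypothetical pre-flight check for the documented URL format.
# Succeeds only if the URL starts with ws:// or wss:// and ends in /ws.
check_ws_url() {
    case "$1" in
        ws://*/ws|wss://*/ws) return 0 ;;
        *) return 1 ;;
    esac
}

if check_ws_url "wss://api.example.com/ws"; then
    echo "URL format looks valid"
else
    echo "URL must use ws:// or wss:// and end with /ws"
fi
```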
|
||||
|
||||
## General
|
||||
|
||||
### Running behind reverse proxy with ssl
|
||||
### Running behind reverse proxy with SSL
|
||||
|
||||
Personally i have not been able to make this work yet, but its on my list, issue stems from some vue oddities when specifying the base urls while running in a dev environment. I expect once i start building the project for production this will be resolved.
|
||||
To run Talemate behind a reverse proxy with SSL:
|
||||
|
||||
If you do make it work, please reach out to me so i can update this documentation.
|
||||
1. Configure your reverse proxy to forward WebSocket connections to the backend (port 5050)
|
||||
2. Set the WebSocket URL to use your proxy's public address:
|
||||
|
||||
```yaml
|
||||
# docker-compose.yml
|
||||
environment:
|
||||
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=wss://your-domain.com/ws
|
||||
```
|
||||
|
||||
3. Ensure your proxy is configured to handle WebSocket upgrades. Example nginx config:
|
||||
|
||||
```nginx
location /ws {
    proxy_pass http://talemate:5050/ws;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
}
```
|
||||
@@ -29,7 +29,7 @@ Once sent, its now the AI's turn to respond - depending on the service and model
|
||||
|
||||
### Scenario tools
|
||||
|
||||
Above the chat input there is a set of tools to help you interact with the scenario.
|
||||
Above the chat input there is a set of tools to help you interact with the scenario. You may also notice an **agent activity bar** appearing above the tools when agents are working - this shows which agents are currently processing in the background.
|
||||
|
||||

|
||||
|
||||
|
||||
110
docs/getting-started/setup-wizard.md
Normal file
@@ -0,0 +1,110 @@
|
||||
# Setup Wizard

When you launch Talemate for the first time, a setup wizard will guide you through the initial configuration. This wizard helps you connect to an AI model and configure essential settings so you can start creating stories right away.

## Step 1: Choose Your AI Provider Type

The first decision is how you want to connect to an AI model. Talemate offers two options:

### Self-hosted

Choose this if you are running your own AI inference service on your computer or a server you control. This includes:

- KoboldCpp
- Text-Generation-WebUI
- llama.cpp
- LMStudio
- TabbyAPI

### Hosted API

Choose this if you want to use a cloud-based AI service. This requires an API key from the provider. Supported services include:

- OpenRouter
- Google
- Anthropic
- And others

![Setup wizard step 1 - Choose provider type](/talemate/img/0.35.0/setup-wizard-step1-provider-type.png)

## Step 2: Add Your Client

After selecting your provider type, you will choose a specific client to configure.

For **Hosted API** users, OpenRouter is selected by default. For **Self-hosted** users, KoboldCpp is selected by default.

Select your preferred client from the dropdown and click **Add Client** to open the client configuration dialog.

![Setup wizard step 2 - Add client](/talemate/img/0.35.0/setup-wizard-step2-add-client.png)

Once you complete the client configuration, the wizard will automatically advance to the next step.

!!! info "Reasoning Models"
    If you are using a reasoning model (like DeepSeek R1 or GLM), you will need to enable reasoning in the client settings after setup. For OpenRouter users, the default model selected during the wizard (Gemini 3 Flash) has reasoning enabled automatically, but if you switch to a different reasoning-capable model, you'll need to enable it manually in the client settings.

## Step 3: Configure Long-term Memory

Talemate uses embeddings to store and retrieve story details over time. This includes character names, relationships, locations, and other facts that should persist as your story grows.

### Embeddings Model

You have two options:

**Better (Recommended)**

Uses the `Alibaba-NLP/gte-base-en-v1.5` model. This provides more capable memory recall and is recommended for the best experience. The model weights are approximately 550MB and will be downloaded when you first load a scene. If using CUDA, plan for roughly 1GB or more of free VRAM.

**Standard**

A smaller, less capable model. Choose this only if you cannot spare the VRAM for the recommended model.

### Device

Choose where the embeddings model should run:

**CUDA**

Faster performance on NVIDIA GPUs. Recommended if you have an NVIDIA GPU with sufficient free VRAM. The wizard will detect your CUDA availability and display your GPU information.

**CPU**

Works on any system. Use this if you do not have an NVIDIA GPU or prefer not to use GPU memory for embeddings.

![Setup wizard step 3 - Memory configuration](/talemate/img/0.35.0/setup-wizard-step3-memory.png)

The wizard will automatically detect your system capabilities and suggest appropriate defaults based on your available hardware.

!!! note "Changing these settings later"
    You can change the embeddings model and device at any time in **Agents -> Memory**.

## Step 4: Configure Visual Agent (Optional)

This step only appears if you selected **Google** or **OpenRouter** as your client in Step 2.

Since these providers support image generation and analysis, you can configure the Visual Agent to use them for:

- Creating images
- Editing images
- Analyzing images

Choose **Enable** to automatically configure the Visual Agent, or **Skip** to configure it manually later.

![Setup wizard step 4 - Visual agent](/talemate/img/0.35.0/setup-wizard-step4-visual.png)

## Completing the Wizard

After completing all steps, click **Apply & finish** to save your settings and close the wizard. You are now ready to start using Talemate.

## Skipping the Wizard

If you prefer to configure Talemate manually, you can click **Skip setup for now** at any time. The wizard will close and you can configure clients and settings through the regular interface.

The wizard will not appear again once you have added at least one client. If you skipped the wizard without adding a client, it will appear again the next time you launch Talemate.

## Re-accessing Setup Options

The setup wizard is designed for first-time setup only. However, all the settings it configures can be accessed through the regular interface:

- **Client configuration**: Click **Add Client** in the clients panel on the right side of the screen
- **Memory settings**: Navigate to **Agents -> Memory** and open the settings
- **Visual Agent settings**: Navigate to **Agents -> Visual** and open the settings
BIN
docs/img/0.35.0/add-pocket-tts-voice.png
Normal file
|
After Width: | Height: | Size: 25 KiB |
BIN
docs/img/0.35.0/agent-activity-bar.png
Normal file
|
After Width: | Height: | Size: 8.9 KiB |
BIN
docs/img/0.35.0/app-settings-appearance-messages.png
Normal file
|
After Width: | Height: | Size: 93 KiB |
BIN
docs/img/0.35.0/app-settings-appearance-visuals.png
Normal file
|
After Width: | Height: | Size: 63 KiB |
BIN
docs/img/0.35.0/character-visuals-overview.png
Normal file
|
After Width: | Height: | Size: 481 KiB |
BIN
docs/img/0.35.0/character-visuals-portrait.png
Normal file
|
After Width: | Height: | Size: 314 KiB |
BIN
docs/img/0.35.0/client-deepseek-no-api-key.png
Normal file
|
After Width: | Height: | Size: 9.5 KiB |
BIN
docs/img/0.35.0/client-deepseek-ready.png
Normal file
|
After Width: | Height: | Size: 7.8 KiB |
BIN
docs/img/0.35.0/client-deepseek.png
Normal file
|
After Width: | Height: | Size: 59 KiB |
BIN
docs/img/0.35.0/client-llamacpp-could-not-connect.png
Normal file
|
After Width: | Height: | Size: 7.7 KiB |
BIN
docs/img/0.35.0/client-llamacpp-ready.png
Normal file
|
After Width: | Height: | Size: 9.0 KiB |
BIN
docs/img/0.35.0/client-llamacpp.png
Normal file
|
After Width: | Height: | Size: 50 KiB |
BIN
docs/img/0.35.0/debug-tools-edit-variable.png
Normal file
|
After Width: | Height: | Size: 4.7 KiB |
BIN
docs/img/0.35.0/debug-tools-open.png
Normal file
|
After Width: | Height: | Size: 2.5 KiB |
BIN
docs/img/0.35.0/debug-tools-vars-tab.png
Normal file
|
After Width: | Height: | Size: 16 KiB |
BIN
docs/img/0.35.0/deepseek-settings.png
Normal file
|
After Width: | Height: | Size: 40 KiB |
BIN
docs/img/0.35.0/director-action-node-example.png
Normal file
|
After Width: | Height: | Size: 151 KiB |
BIN
docs/img/0.35.0/director-actions-menu.png
Normal file
|
After Width: | Height: | Size: 63 KiB |
BIN
docs/img/0.35.0/director-character-management-settings.png
Normal file
|
After Width: | Height: | Size: 52 KiB |
BIN
docs/img/0.35.0/director-chat-actions-menu.png
Normal file
|
After Width: | Height: | Size: 50 KiB |
BIN
docs/img/0.35.0/director-chat-create-image.png
Normal file
|
After Width: | Height: | Size: 554 KiB |
BIN
docs/img/0.35.0/director-console-direction-tab.png
Normal file
|
After Width: | Height: | Size: 144 KiB |
BIN
docs/img/0.35.0/director-prompt-user-dialog.png
Normal file
|
After Width: | Height: | Size: 27 KiB |
BIN
docs/img/0.35.0/director-scene-direction-settings.png
Normal file
|
After Width: | Height: | Size: 84 KiB |
BIN
docs/img/0.35.0/gamestate-add-watch.png
Normal file
|
After Width: | Height: | Size: 11 KiB |
BIN
docs/img/0.35.0/inline-visual-context-menu.png
Normal file
|
After Width: | Height: | Size: 25 KiB |
BIN
docs/img/0.35.0/inline-visuals-example.png
Normal file
|
After Width: | Height: | Size: 1.1 MiB |
BIN
docs/img/0.35.0/pin-condition-tabs.png
Normal file
|
After Width: | Height: | Size: 15 KiB |
BIN
docs/img/0.35.0/pin-gamestate-conditions.png
Normal file
|
After Width: | Height: | Size: 27 KiB |
BIN
docs/img/0.35.0/pocket-tts-api-settings.png
Normal file
|
After Width: | Height: | Size: 61 KiB |
BIN
docs/img/0.35.0/pocket-tts-parameters.png
Normal file
|
After Width: | Height: | Size: 9.6 KiB |
BIN
docs/img/0.35.0/scene-tools-visual-menu.png
Normal file
|
After Width: | Height: | Size: 55 KiB |
BIN
docs/img/0.35.0/setup-wizard-step1-provider-type.png
Normal file
|
After Width: | Height: | Size: 41 KiB |
BIN
docs/img/0.35.0/setup-wizard-step2-add-client.png
Normal file
|
After Width: | Height: | Size: 37 KiB |
BIN
docs/img/0.35.0/setup-wizard-step3-memory.png
Normal file
|
After Width: | Height: | Size: 103 KiB |
BIN
docs/img/0.35.0/setup-wizard-step4-visual.png
Normal file
|
After Width: | Height: | Size: 48 KiB |
BIN
docs/img/0.35.0/visual-library-cover-crop-1.png
Normal file
|
After Width: | Height: | Size: 1.5 MiB |
BIN
docs/img/0.35.0/visual-library-cover-crop-2.png
Normal file
|
After Width: | Height: | Size: 1.1 MiB |
BIN
docs/img/0.35.0/world-editor-gamestate-tab.png
Normal file
|
After Width: | Height: | Size: 71 KiB |
BIN
docs/img/0.35.0/world-state-character-portraits-settings.png
Normal file
|
After Width: | Height: | Size: 53 KiB |
@@ -15,6 +15,7 @@ Roleplay with AI with a focus on strong narration and consistent world and game
|
||||
|
||||
### First steps
|
||||
|
||||
- [Setup Wizard](getting-started/setup-wizard.md) - Initial configuration on first launch
|
||||
- [Connect a client](getting-started/connect-a-client.md)
|
||||
- [Load a scene](getting-started/load-a-scene.md)
|
||||
- [Interact with the scene](user-guide/interacting)
|
||||
@@ -62,9 +62,19 @@ If > 0 will offset the instructions for the actor (both broad and character spec
|
||||
|
||||

|
||||
|
||||
Enable this setting to apply a writing style to the generated content.
|
||||
Content settings control what contextual information is included in the prompts sent to the AI when generating character dialogue.
|
||||
|
||||
Make sure the a writing style is selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) to apply the writing style to the generated content.
|
||||
##### Use Scene Intent
|
||||
|
||||
When enabled (default), the [scene intent](/talemate/user-guide/world-editor/scene/direction) (overall intention) will be included in the conversation prompt. This helps the AI generate dialogue that aligns with your story goals and the current scene direction.
|
||||
|
||||
Disable this if you want the AI to generate dialogue without being influenced by the scene direction settings.
|
||||
|
||||
##### Use Writing Style
|
||||
|
||||
When enabled (default), the writing style selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) will be applied to the generated dialogue.
|
||||
|
||||
Disable this if you want the AI to generate dialogue without following the scene's writing style template.
|
||||
|
||||
## Long Term Memory
|
||||
|
||||
|
||||
@@ -39,9 +39,13 @@ The director can help you with many tasks:
|
||||
- Create or modify characters, world entries, and story configuration
|
||||
- Advance time in your story
|
||||
- Manage game state variables (if your story uses them)
|
||||
- Generate images and illustrations (if the [Visualizer Agent](/talemate/user-guide/agents/visualizer) is configured)
|
||||
|
||||
Simply describe what you want in natural language, and the director will figure out how to accomplish it.
|
||||
|
||||
!!! tip "Visual Generation"
|
||||
When asking the director to create images, the generated visuals can appear in your scene feed as [inline visuals](/talemate/user-guide/inline-visuals). This is controlled by the **Auto-attach visuals** setting in the scene tools visualizer menu.
|
||||
|
||||
### Viewing action details
|
||||
|
||||
When the director performs an action, you can expand it to see exactly what was done:
|
||||
@@ -112,6 +116,55 @@ When rejected, the director acknowledges and waits for your next instruction:
|
||||
|
||||

|
||||
|
||||
## Enabling and Disabling Actions
|
||||
|
||||
!!! info "New in 0.35.0"
|
||||
Action toggles were introduced in version 0.35.0.
|
||||
|
||||
The director has access to many different actions for querying information, making changes, and progressing your story. You can control which actions the director is allowed to use by enabling or disabling them through the Actions menu.
|
||||
|
||||
### Accessing the Actions Menu
|
||||
|
||||
Click the **Actions** button in the chat toolbar to open the actions menu.
|
||||
|
||||

|
||||
|
||||
### Available Actions
|
||||
|
||||
The menu lists all actions grouped by category. Each action has a checkbox indicating whether it is enabled (checked) or disabled (unchecked).
|
||||
|
||||
Actions are organized into groups:
|
||||
|
||||
| Group | Actions |
|-------|---------|
| **Direct Scene** | Advance time, Create new narration, Direct character action, Yield to User |
| **Query** | Query World Information, Retrieve Context Directly, Query Game State, Query Scene Direction |
| **Update Context** | Existing Characters, World Information, Story Configuration, Static History, Character Creation |
| **Gamestate** | Make Changes (game state variables) |
| **User Interaction** | Prompt for text input (Scene Direction only) |
| **Visuals** | Create new Image(s), Edit Image(s) |
| **Misc** | Directly Retrieve Context |
|
||||
|
||||
!!! note "Scene Direction Only Actions"
|
||||
Some actions are only available during autonomous Scene Direction, not in Director Chat. The **Prompt for text input** action is one example - it allows the director to request information from you during autonomous direction but is not used during interactive chat sessions. See [Prompting the User for Input](/talemate/user-guide/agents/director/scene-direction/#prompting-the-user-for-input) for more details.
|
||||
|
||||
### Toggling Actions
|
||||
|
||||
Click on an action in the list to toggle it on or off:
|
||||
|
||||
- **Enabled** (checked): The director can use this action when responding to your requests
|
||||
- **Disabled** (unchecked): The director will not use this action, even if it would be helpful
|
||||
|
||||
When you disable an action, the director will work around it by using other available actions or by informing you that it cannot complete the requested task.
|
||||
|
||||
### Locked Actions
|
||||
|
||||
Some actions may be marked as "locked" and cannot be disabled. These are core actions required for the director to function properly. Locked actions appear grayed out in the menu and cannot be toggled.
|
||||
|
||||
### Persistence
|
||||
|
||||
Your action toggle settings are saved with the scene. When you reload the scene later, your enabled and disabled actions will be restored automatically.
|
||||
|
||||
## Director personas
|
||||
|
||||
You can customize the director's personality and initial greeting by assigning a persona:
|
||||
|
||||
@@ -1,16 +1,30 @@
|
||||
# Overview
|
||||
The director agent is responsible for guiding the scene progression and generating dynamic actions.
|
||||
|
||||
In the future it will shift / expose more of a game master role, controlling the progression of the story.
|
||||
The director agent serves as the game master for your scenes, guiding story progression and helping manage the creative experience. It provides several key features that work together to enhance your storytelling.
|
||||
|
||||
## Features
|
||||
|
||||
### Autonomous Scene Direction
|
||||
|
||||
!!! info "New in 0.35.0"
|
||||
Autonomous Scene Direction replaces the previous Auto Direction feature with a more capable implementation.
|
||||
|
||||
Allows the director to autonomously progress your scene using the same actions available through Director Chat. The director analyzes the scene context and decides when and how to move the story forward.
|
||||
|
||||
A strong LLM (100B+) with reasoning capabilities is highly recommended for best results.
|
||||
|
||||
See the [Autonomous Scene Direction](/talemate/user-guide/agents/director/scene-direction) page for detailed information.
|
||||
|
||||
### Director Chat
|
||||
|
||||
A conversational interface for interacting with the director directly. You can ask questions, request changes, and guide story progression through natural language.
|
||||
|
||||
See the [Director Chat](/talemate/user-guide/agents/director/chat) page for more information.
|
||||
|
||||
### Dynamic Actions
|
||||
Will occasionally generate clickable choices for the user during scene progression. This can be used to allow the user to make choices that will affect the scene or the story in some way without having to manually type out the choice.
|
||||
|
||||
Generates clickable choices for the user during scene progression. This allows you to make decisions that affect the scene or story without manually typing out your choice.
|
||||
|
||||
### Guide Scene
|
||||
Will use the summarizer agent's scene analysis to guide characters and the narrator for the next generation, hopefully improving the quality of the generated content.
|
||||
|
||||
### Auto Direction
|
||||
A very experimental feature that will cause the director to attempt to direct the scene automatically, instructing actors or the narrator to move the scene forward according to the story and scene intention.
|
||||
|
||||
!!! note "Experimental"
|
||||
This is the first iteration of this feature and is very much a work in progress. It will likely change substantially in the future.
|
||||
Uses the summarizer agent's scene analysis to guide characters and the narrator for the next generation, helping improve the quality and coherence of generated content.
|
||||
246
docs/user-guide/agents/director/scene-direction.md
Normal file
@@ -0,0 +1,246 @@
|
||||
# Autonomous Scene Direction
|
||||
|
||||
!!! info "New in 0.35.0"
|
||||
Autonomous Scene Direction is a new feature introduced in version 0.35.0 that replaces the previous Auto Direction feature.
|
||||
|
||||
Autonomous Scene Direction allows the director agent to progress your scene automatically, using the same actions available through the [Director Chat](/talemate/user-guide/agents/director/chat). Instead of manually requesting the director to take actions, the director will analyze the scene and decide what actions to take on its own.
|
||||
|
||||
## Requirements
|
||||
|
||||
!!! warning "Strong LLM Recommended"
|
||||
A strong language model (100B+ parameters) with reasoning capabilities is **highly recommended** for meaningful autonomous scene direction. While you may have some success with smaller 32B reasoning models, larger models will produce significantly better results.
|
||||
|
||||
See [Reasoning Model Support](/talemate/user-guide/clients/reasoning/) for information on enabling reasoning capabilities.
|
||||
|
||||
### Scene Intentions
|
||||
|
||||
Autonomous Scene Direction relies on having both a **story intention** and a **scene phase intention** set. Without these, the director lacks the context needed to make meaningful decisions about scene progression.
|
||||
|
||||
Set these in the [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction) section:
|
||||
|
||||
- **Overall Intention**: The big-picture goal and expectations for your story
|
||||
- **Current Phase Intention**: The specific goal and context for the current scene
|
||||
|
||||
## Enabling Scene Direction
|
||||
|
||||
Scene Direction is disabled by default. To enable it:
|
||||
|
||||
1. Open the **Director** agent settings from the agents panel
|
||||
2. Find the **Scene Direction** section
|
||||
3. Toggle the feature **On**
|
||||
|
||||

|
||||
|
||||
!!! tip "Quick Toggle"
|
||||
Scene Direction has a quick toggle in the agent settings panel, making it easy to turn on and off during play.
|
||||
|
||||
## How It Works
|
||||
|
||||
When Scene Direction is enabled, the director analyzes the scene after each turn and decides whether to take action. The director can:
|
||||
|
||||
- Instruct actors on what to do or say next
|
||||
- Guide the narrator to progress the story
|
||||
- Generate new content based on the scene's current state
|
||||
- Prompt the user for information when needed
|
||||
- Manage time progression in the story
|
||||
- Create or modify world entries and characters
|
||||
|
||||
The director uses the same actions available in Director Chat, but makes autonomous decisions about when and how to use them based on:
|
||||
|
||||
- The overall story intention
|
||||
- The current scene phase and intention
|
||||
- The recent scene history
|
||||
- The participation balance of characters and narrator
|
||||
|
||||
### Turn Balance
|
||||
|
||||
When **Maintain turn balance** is enabled (the default), the director tracks how often each character and the narrator have participated in recent scene history. This helps ensure:
|
||||
|
||||
- No single character dominates the conversation
|
||||
- The narrator provides enough scene-setting and descriptions
|
||||
- Neglected characters get opportunities to participate
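As a rough illustration of the bookkeeping described above, the sketch below counts how often each participant appears in recent history and picks the one with the fewest turns. The function names and the plain list of speaker names are assumptions made for this example; the director's actual tracking is not documented here and may weigh participation differently.

```python
from collections import Counter


def participation_counts(recent_speakers, participants):
    """Count turns per participant; participants with no turns stay at 0."""
    counts = Counter({name: 0 for name in participants})
    # Ignore speakers that are not part of the tracked participant list.
    counts.update(s for s in recent_speakers if s in counts)
    return counts


def most_neglected(recent_speakers, participants):
    """Return the participant with the fewest recent turns (first one on ties)."""
    counts = participation_counts(recent_speakers, participants)
    return min(participants, key=lambda name: counts[name])
```

With recent speakers `["Elena", "Elena", "Narrator", "Elena", "Kai"]` and a hypothetical fourth participant `"Mira"`, this sketch would single out `"Mira"` as the candidate for the next turn.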
|
||||
|
||||
### Prompting the User for Input
|
||||
|
||||
The director can prompt you for information during autonomous scene direction using the **Prompt for text input** action. This allows the director to request your input when it needs guidance or information to continue the story.
|
||||
|
||||
When the director uses this action, a text input dialog appears in your scene feed with a title and message explaining what information is needed. You can then type your response and submit it, or cancel the prompt if the input is optional.
|
||||
|
||||

|
||||
|
||||
Common situations where the director might prompt you:
|
||||
|
||||
- Asking what you want to do next after a significant story event
|
||||
- Requesting details for character creation when starting a new scene
|
||||
- Seeking clarification on your intentions when the story could branch in multiple directions
|
||||
|
||||
The dialog includes:
|
||||
|
||||
- **Title**: A brief heading indicating the nature of the prompt
|
||||
- **Body**: The full question or request from the director, which may be presented either in-character or out-of-character depending on the context
|
||||
- **Text Input**: A field where you enter your response (single-line or multi-line depending on the prompt)
|
||||
- **Continue Button**: Submit your response
|
||||
- **Cancel Button**: Dismiss the prompt without responding (only available if the input is optional)
|
||||
|
||||
After you submit your response, the director receives your input and uses it to inform its next actions in the scene.
|
||||
|
||||
This action can be enabled or disabled through the Actions menu if you prefer the director not to prompt you directly. See [Actions Menu](#actions-menu) for details on managing available actions.
|
||||
|
||||
## Settings
|
||||
|
||||
### General Settings
|
||||
|
||||
#### Enable Analysis Step
|
||||
|
||||
When enabled, the director performs an internal analysis step before deciding on actions. This helps the director think through complex situations and plan more carefully.
|
||||
|
||||
**Recommendation**: Keep this enabled for more thoughtful scene direction.
|
||||
|
||||
#### Response Token Budget
|
||||
|
||||
Controls the maximum tokens the director can use for reasoning and response generation. Higher values allow for more detailed analysis. Default is 2048.
|
||||
|
||||
#### Max Actions Per Turn
|
||||
|
||||
The maximum number of actions the director can execute in a single turn. This prevents runaway action chains. Default is 5.
|
||||
|
||||
#### Retries
|
||||
|
||||
How many times to retry if the director produces a malformed response. Default is 1.
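The two limits above can be pictured with a minimal sketch. It assumes, purely for illustration, that the director's response is JSON and that a "malformed" response is one that fails to parse; `generate`, `parse_with_retries`, and `cap_actions` are hypothetical names, not part of Talemate.

```python
import json


def parse_with_retries(generate, retries=1):
    """Call generate() until it returns valid JSON: one attempt plus `retries` retries."""
    for _ in range(1 + retries):
        raw = generate()
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed response, try again if attempts remain
    return None  # every attempt was malformed


def cap_actions(actions, max_actions=5):
    """Keep at most max_actions actions for a single turn."""
    return actions[:max_actions]
```

With the default of 1 retry, a single malformed response is tolerated; a second one in a row ends the attempt for that turn in this sketch.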
|
||||
|
||||
### Context Settings
|
||||
|
||||
#### Scene Context Ratio
|
||||
|
||||
Controls how the director's context budget is divided between scene context and direction history.
|
||||
|
||||
- **Lower values** (e.g., 0.30): 30% for scene context, 70% for direction history
|
||||
- **Higher values** (e.g., 0.70): 70% for scene context, 30% for direction history
|
||||
|
||||
Default is 0.30.
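The arithmetic behind the ratio can be sketched as follows. The function name and the idea of a single total token budget are assumptions for illustration; how Talemate computes the director's overall budget is not covered here.

```python
def split_context_budget(total_tokens, scene_context_ratio=0.30):
    """Split a token budget into (scene_context, direction_history) shares."""
    if not 0.0 <= scene_context_ratio <= 1.0:
        raise ValueError("scene_context_ratio must be between 0 and 1")
    scene_tokens = round(total_tokens * scene_context_ratio)
    # Whatever is not used for scene context goes to direction history.
    return scene_tokens, total_tokens - scene_tokens
```

For a hypothetical budget of 8000 tokens, the default ratio of 0.30 gives 2400 tokens to scene context and 5600 to direction history; a ratio of 0.70 reverses the split.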
|
||||
|
||||
#### Stale History Share
|
||||
|
||||
When the direction history needs to be compacted (summarized to save tokens), this controls what fraction is treated as "stale" and summarized versus kept verbatim.
|
||||
|
||||
- **Higher values**: Summarize more, keep less verbatim
|
||||
- **Lower values**: Summarize less, keep more recent messages
|
||||
|
||||
Default is 0.70.
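A minimal sketch of the split, assuming the share is applied to the number of history entries, oldest first. Talemate may measure the share in tokens rather than entries; the function name is hypothetical.

```python
def split_direction_history(messages, stale_share=0.70):
    """Return (stale, recent): the oldest share to summarize and the rest kept verbatim."""
    stale_count = round(len(messages) * stale_share)
    return messages[:stale_count], messages[stale_count:]
```

With ten history entries and the default of 0.70, the seven oldest entries would be summarized and the three most recent kept as they are.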
|
||||
|
||||
### Turn Balance Settings
|
||||
|
||||
#### Maintain Turn Balance
|
||||
|
||||
When enabled, the director tracks participation of characters and narrator to encourage variety in scene direction.
|
||||
|
||||
### Custom Instructions
|
||||
|
||||
Add global instructions that will be included in all scene direction prompts across all scenes. Use this for preferences that should apply universally to how the director behaves, regardless of which scene is loaded.
|
||||
|
||||
!!! tip "Scene-Specific Instructions"
|
||||
For instructions tailored to a specific story or scene, use the **Director Instructions** field in the [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction) section instead. Those instructions are stored with the scene and are ideal for genre-specific guidance, story-specific rules, or scene-specific constraints.
|
||||
|
||||
### Director Persona
|
||||
|
||||
The director persona selected in the agent settings applies to scene direction as well as director chat. Choosing a different persona can significantly affect the director's tone, decision-making style, and how it approaches scene progression.
|
||||
|
||||
See [Director Personas](/talemate/user-guide/agents/director/chat/#director-personas) for more information on available personas and how to customize them.
|
||||
|
||||
## The Direction Tab
|
||||
|
||||
When a scene is loaded, you can view the director's autonomous actions in the **Director Console**. Click the **Direction** tab (bullhorn icon) to see:
|
||||
|
||||

|
||||
|
||||
### Scene Type and Intention
|
||||
|
||||
At the top of the Direction tab, you can view and modify the current scene type and phase intention. Changes here affect how the director approaches scene progression.
|
||||
|
||||
### Direction Timeline
|
||||
|
||||
Below the scene settings is the **Direction Timeline**, which shows:
|
||||
|
||||
- The director's reasoning and analysis (if analysis step is enabled)
|
||||
- Actions taken by the director
|
||||
- Results of those actions
|
||||
|
||||
This provides full transparency into what the director is doing and why.
|
||||
|
||||
#### Clearing Direction History
|
||||
|
||||
You can clear the direction history using the **Clear** button. This will make the director "forget" previous actions taken during autonomous direction, but will not affect the scene history itself.
|
||||
|
||||
### Actions Menu
|
||||
|
||||
The **Actions** dropdown lets you enable or disable specific actions the director can use during scene direction. This gives you fine-grained control over what the director is allowed to do autonomously.
|
||||
|
||||

|
||||
|
||||
Some actions may be marked as "locked" and cannot be disabled - these are core actions required for scene direction to function.
|
||||
|
||||
For a complete list of available actions and their categories, see [Enabling and Disabling Actions](/talemate/user-guide/agents/director/chat/#enabling-and-disabling-actions) in the Director Chat documentation. The same actions are available for both Director Chat and Scene Direction, and your toggle settings are saved separately for each mode.
|
||||
|
||||
## Creating Custom Director Actions
|
||||
|
||||
Advanced users can create custom director actions using the Node Editor. This allows you to extend what the director can do during autonomous scene direction.
|
||||
|
||||
### Director Action Nodes
|
||||
|
||||
To create a custom action:
|
||||
|
||||
1. Open the **Node Editor** for your scene or module
|
||||
2. Create a new **Director Chat Action** node
|
||||
3. Define the action's:
|
||||
- **Name**: The identifier for the action
|
||||
- **Description**: What the action does (shown to the LLM)
|
||||
- **Instructions**: Detailed instructions for how to use the action
|
||||
|
||||
4. Connect **Director Chat Sub Action** nodes to define specific behaviors within your action
|
||||
5. Use **Director Chat Action Argument** nodes to define parameters the LLM can pass to your action
|
||||
|
||||

|
||||
|
||||
Sub-actions can be configured for:
|
||||
|
||||
- **Both** chat and scene direction modes
|
||||
- **Chat only** - only available when chatting with the director
|
||||
- **Scene Direction only** - only available during autonomous direction
|
||||
|
||||
### Sub-Action Properties
|
||||
|
||||
When creating a sub-action, you can configure:
|
||||
|
||||
- **Group**: Organizational group name for the action
|
||||
- **Action Title**: Display name shown to users
|
||||
- **Action ID**: Unique identifier
|
||||
- **Description (Chat)**: Description shown when used in chat mode
|
||||
- **Description (Scene Direction)**: Description shown when used in autonomous mode
|
||||
- **Availability**: Which modes the action is available in
|
||||
- **Force Enabled**: If true, prevents users from disabling this action
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Director Takes No Actions
|
||||
|
||||
- Verify both story intention and scene phase intention are set
|
||||
- Check that Scene Direction is enabled in agent settings
|
||||
- Ensure Auto Progress is enabled if using the default game loop
|
||||
|
||||
### Actions Seem Random or Unhelpful
|
||||
|
||||
- Consider using a stronger reasoning model
|
||||
- Review and refine your scene intentions
|
||||
- Check if the wrong actions are enabled - disable those you do not want
|
||||
|
||||
### Director Keeps Using Same Character
|
||||
|
||||
- Enable **Maintain turn balance**
|
||||
- Check if other characters are properly set up in the scene
|
||||
- Review character availability settings
|
||||
|
||||
### Performance Is Slow
|
||||
|
||||
- Reduce **Max actions per turn**
|
||||
- Lower **Response token budget**
|
||||
- Consider a faster model or API endpoint
|
||||
@@ -35,6 +35,20 @@ If `Direction` is selected, the actor will be given the direction as a direct in
|
||||
|
||||
If `Inner Monologue` is selected, the actor will be given the direction as a thought.
|
||||
|
||||
###### Direction Stickiness
|
||||
|
||||
!!! info "New in 0.35.0"
|
||||
|
||||
Controls how many scene messages the system looks back when retrieving character directions. This determines how long directions "stick" and continue to influence character behavior.
|
||||
|
||||
- **Range**: 1 to 20
|
||||
- **Default**: 5
|
||||
|
||||
When you direct an actor, that direction doesn't just apply to their next response—it persists across multiple turns based on this setting. For example, with a stickiness of 5, a direction to "act suspiciously" will continue to influence the character's behavior for up to 5 relevant scene messages.
|
||||
|
||||
!!! note "Time passage clears directions"
|
||||
Directions are automatically cleared when time passes in the scene. This ensures that directions given in one scene segment don't inappropriately carry over into a new time period.
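The lookback can be sketched as below. The message dictionaries, their `type` values, and the function name are hypothetical, and the sketch counts every scene message toward the window, whereas the actual implementation may only count "relevant" ones.

```python
def active_direction(messages, character, stickiness=5):
    """Return the most recent direction for `character` within the lookback window."""
    window = messages[-stickiness:]
    for message in reversed(window):
        # A time passage clears any direction given before it.
        if message["type"] == "time_passage":
            return None
        if message["type"] == "direction" and message.get("character") == character:
            return message["text"]
    return None
```

A direction given three messages ago is still found with a stickiness of 5, but not with a stickiness of 2, and never once a time passage message follows it.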
|
||||
|
||||
## Long Term Memory
|
||||
|
||||
--8<-- "docs/snippets/tips.md:agent_long_term_memory_settings"
|
||||
@@ -90,71 +104,89 @@ If enabled the director will guide the narrator in the scene.
|
||||
|
||||
The maximum number of tokens for the guidance. (e.g., how long should the guidance be).
|
||||
|
||||
## Auto Direction
|
||||
## Scene Direction
|
||||
|
||||
A very experimental first attempt at giving the reins to the director to direct the scene automatically.
|
||||
!!! info "New in 0.35.0"
|
||||
Scene Direction replaces the previous Auto Direction feature with significantly enhanced capabilities.
|
||||
|
||||
Currently it can only instruct actors and the narrator, but different actions will be exposed in the future. This is very early in the development cycle and will likely go through substantial changes.
|
||||
Autonomous Scene Direction allows the director to progress scenes automatically using the same actions available in Director Chat.
|
||||
|
||||
!!! note "Both overall and current intent need to be set for auto-direction to be available"
|
||||
If either the overall or current scene intention is not set, the auto-direction feature will not be available.
|
||||
For detailed information, see the dedicated [Autonomous Scene Direction](/talemate/user-guide/agents/director/scene-direction) documentation page.
|
||||
|
||||

|
||||

|
||||
|
||||
Story and scene intentions are set in the [Scene Direction](/talemate/user-guide/world-editor/scene/direction) section of the World Editor.
|
||||
##### Enable Scene Direction
|
||||
|
||||

|
||||
Toggle to enable or disable autonomous scene direction. This feature is disabled by default.
|
||||
|
||||
##### Enable Auto Direction
|
||||
!!! warning "Strong LLM Required"
|
||||
A strong language model (100B+) with reasoning capabilities is highly recommended. See [Reasoning Model Support](/talemate/user-guide/clients/reasoning/).
|
||||
|
||||
Turn auto direction on and off.
|
||||
##### Enable Analysis Step
|
||||
|
||||
!!! note "Auto progress needs to also be enabled"
|
||||
If auto direction is enabled, auto progress needs to be enabled as well.
|
||||
When enabled, the director performs an internal analysis step before deciding on actions.
|
||||
|
||||

|
||||
#### Natural flow
|
||||
##### Response Token Budget
|
||||
|
||||
Will place strict limits on actor turns based on the provided constraints. That means regardless of what the director would like to do, the actor availability will always take precedence.
|
||||
Maximum tokens for director reasoning and response generation. Default is 2048.
|
||||
|
||||
##### Max. Auto turns
|
||||
##### Max Actions Per Turn
|
||||
|
||||
Maximum turns the AI gets in succession (spread across characters). When this limit is reached, the player will get a turn no matter what.
|
||||
Maximum number of actions the director can execute per turn. Default is 5.
|
||||
|
||||
##### Max. Idle turns
|
||||
##### Retries
|
||||
|
||||
The maximum number of turns a character can go without speaking before they are automatically given a turn by the director. (per character)
|
||||
Retry count for malformed responses. Default is 1.
|
||||
|
||||
##### Max. Repeat Turns
|
||||
##### Scene Context Ratio
|
||||
|
||||
The maximum number of times a character can go in succession without speaking before the director will force them to speak. (per character)
|
||||
Balance between scene context and direction history in the token budget. Default is 0.30 (30% scene, 70% history).
|
||||
|
||||
##### Stale History Share
|
||||
|
||||
#### Instructions
|
||||
When compacting direction history, this fraction is summarized versus kept verbatim. Default is 0.70.
|
||||
|
||||
##### Instruct Actors
|
||||
##### Maintain Turn Balance
|
||||
|
||||
Allow the director to instruct actors.
|
||||
Track character and narrator participation to encourage variety in scene direction.
|
||||
|
||||
##### Instruct Narrator
|
||||
##### Custom Instructions
|
||||
|
||||
Allow the director to instruct the narrator.
|
||||
Custom instructions included in all scene direction prompts to guide the director's behavior.
|
||||
|
||||
##### Instruct Frequency
|
||||
## Character Management
|
||||
|
||||
Only pass on instructions to the actors or the narrator every N turns.
|
||||
The Character Management settings control how the director handles character creation and related tasks.
|
||||
|
||||
!!! note "Evaluation of the scene happens regardless"
|
||||
The director will evaluate the scene after each round regardless of the frequency. This setting merely controls how often the instructions are actually passed on.
|
||||

|
||||
|
||||
##### Evaluate Scene Intention
|
||||
### Character Creation
|
||||
|
||||
Allows the director to evaluate the current scene phase and switch to a different scene type or set a new intention.
|
||||
!!! info "New in 0.35.0"
|
||||
The **Limit character attributes** setting is new in version 0.35.0.
|
||||
|
||||
The number of turns between evaluations. (0 = NEVER)
|
||||
##### Limit character attributes
|
||||
|
||||
!!! note "Recommended to leave at 0 (never)"
|
||||
This isn't working well at this point, so it is recommended to leave it at 0 (never).
|
||||
Controls the maximum number of attributes that will be generated when creating or updating character sheets. This applies when the director creates new characters or when character sheets are generated through templates.
|
||||
|
||||
- **0** (default): No limit - attributes are generated without restriction
|
||||
- **1-40**: Limits the character sheet to this many attributes
|
||||
|
||||
When a limit is set, the AI is instructed to generate no more than the specified number of attributes, and any excess attributes are trimmed during processing.
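The trimming step might look like the following sketch. It assumes a character sheet is a plain dictionary and that the first attributes, in generation order, are the ones kept; which attributes Talemate actually drops is not specified here.

```python
from itertools import islice


def limit_attributes(attributes, limit=0):
    """Trim a character sheet to at most `limit` attributes; 0 means no limit."""
    if limit <= 0:
        return dict(attributes)
    # Dictionaries preserve insertion order, so the earliest attributes are kept.
    return dict(islice(attributes.items(), limit))
```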
|
||||
|
||||
This setting is useful when you want to keep character sheets concise, or when working with characters that might otherwise generate an excessive number of attributes.
|
||||
|
||||
### Persisting Characters
|
||||
|
||||
##### Assign Voice (TTS)
|
||||
|
||||
If enabled, the director will automatically assign a text-to-speech voice when creating a new character. This requires the TTS agent to be enabled and configured with available voices.
|
||||
|
||||
### Generating Visuals
|
||||
|
||||
##### Generate Visuals
|
||||
|
||||
If enabled, the director is allowed to generate visual assets (portraits, cover images) for characters when requested.
|
||||
|
||||
## Director Chat
|
||||
|
||||
|
||||
@@ -6,6 +6,18 @@ You can manage your available embeddings through the application settings.
|
||||
|
||||
In the settings dialogue go to **:material-tune: Presets** and then **:material-cube-unfolded: Embeddings**.
|
||||
|
||||
!!! warning "INSTRUCTOR Embeddings Removed (0.35.0)"
|
||||
INSTRUCTOR embeddings are no longer supported. If you were using INSTRUCTOR embeddings, your configuration has been automatically reset to use the default embedding model (all-MiniLM-L6-v2).
|
||||
|
||||
**Alternatives:**
|
||||
|
||||
- **all-MiniLM-L6-v2** (default) - Fast local embedding, good for most use cases
|
||||
- **Alibaba-NLP/gte-base-en-v1.5** - More accurate local embedding
|
||||
- **OpenAI text-embedding-3-small** - Cloud-based option (requires API key)
|
||||
- **KoboldCpp Client API** - Use an embedding model loaded in KoboldCpp (see [KoboldCpp Embeddings](koboldcpp.md))
|
||||
|
||||
Your existing scene memory databases will be re-imported automatically when you load them with the new embedding configuration.
|
||||
|
||||
<!--- --8<-- [start:embeddings_setup] -->
|
||||
## Pre-configured Embeddings
|
||||
|
||||
@@ -19,16 +31,6 @@ Fast, but the least accurate.
|
||||
|
||||
Sentence transformer model that is decently fast and accurate and will likely become the default for the Memory agent in the future.
|
||||
|
||||
### Instructor Models
|
||||
|
||||
!!! warning "Support of these likely deprecated"
|
||||
It's become increasingly difficult to install support for these while keeping other dependencies up to date.
|
||||
See [this issue](https://github.com/vegu-ai/talemate/issues/176) for more details.
|
||||
|
||||
Use the `Alibaba-NLP/Gte-Base-En-V1.5` embedding instead; it's pretty close in accuracy and much smaller.
|
||||
|
||||
Instructor embeddings, coming in three sizes: `base`, `large`, and `xl`. XL is the most accurate but also has the biggest memory footprint and is the slowest. Using `cuda` is recommended for the `xl` and `large` models.
|
||||
|
||||
### OpenAI text-embedding-3-small
|
||||
|
||||
OpenAI's current text embedding model. Fast and accurate, but not free.
|
||||
|
||||
@@ -2,8 +2,36 @@
|
||||
|
||||
The narrator agent handles the generation of narrative text. This could be progressing the story, describing the scene, or providing exposition and answers to questions.
|
||||
|
||||
### :material-script: Content
|
||||
## Scene Intention Awareness
|
||||
|
||||
The narrator agent is the first agent that can be influenced by one of your writing style templates.
|
||||
When you set a **story intention** and **scene phase intention** in the [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction), the narrator automatically incorporates this information into its prompts. This helps the narrator understand both the big-picture goals of your story and the specific objectives of the current scene.
|
||||
|
||||
Make sure a writing style is selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) to apply the writing style to the generated content.
|
||||
### How It Works
|
||||
|
||||
The narrator receives the following context when generating narrative:
|
||||
|
||||
- **Story Intention**: Your overarching expectations for the experience (tone, themes, pacing)
|
||||
- **Scene Type**: The current mode of play (e.g., roleplay, combat, investigation)
|
||||
- **Scene Phase Intention**: The specific goal and context for what's happening now
|
||||
|
||||
With this information, the narrator can better align its output with your creative vision. For example, if your scene intention indicates building tension before a reveal, the narrator will lean into that atmosphere rather than rushing to resolution.
|
||||
|
||||
### Setup
|
||||
|
||||
To take advantage of scene intention awareness:
|
||||
|
||||
1. Open the **World Editor** and navigate to **Scene > Direction**
|
||||
2. Set an **Overall Intention** describing the story's goals and expectations
|
||||
3. Set a **Scene Type** and **Current Scene Intention** for the current phase
|
||||
|
||||
Both the overall intention and current scene intention should be set for best results. Without them, the narrator generates content without this additional guidance.
|
||||
|
||||
For more details on configuring scene direction, see [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction).
|
||||
|
||||
## Content
|
||||
|
||||
### :material-script: Writing Style
|
||||
|
||||
The narrator agent can be influenced by one of your writing style templates.
|
||||
|
||||
Make sure a writing style is selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) to apply the writing style to the generated content.
|
||||
@@ -23,11 +23,19 @@ If checked and talemate detects a repetitive response (based on a threshold), it
|
||||
|
||||

|
||||
|
||||
The narrator agent is the first agent that can be influenced by one of your writing style templates.
|
||||
Content settings control what contextual information is included in the prompts sent to the AI when generating narration.
|
||||
|
||||
Enable this setting to apply a writing style to the generated content.
|
||||
##### Use Scene Intent
|
||||
|
||||
Make sure a writing style is selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) to apply the writing style to the generated content.
|
||||
When enabled (default), the [scene intent](/talemate/user-guide/world-editor/scene/direction) (overall intention) will be included in the narration prompt. This helps the AI generate narrative content that aligns with your story goals and the current scene direction.
|
||||
|
||||
Disable this if you want the AI to generate narration without being influenced by the scene direction settings.
|
||||
|
||||
##### Use Writing Style
|
||||
|
||||
When enabled (default), the writing style selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) will be applied to the generated narration.
|
||||
|
||||
Disable this if you want the AI to generate narration without following the scene's writing style template.
|
||||
|
||||
## :material-clock-fast: Narrate time passage
|
||||
|
||||
|
||||
@@ -139,6 +139,20 @@ These nodes should be connected to your prompt encoding nodes (for Qwen Image Ed
|
||||
|
||||

|
||||
|
||||
### Automatic Deactivation of Unused Reference Nodes
|
||||
|
||||
Talemate automatically handles situations where your workflow contains more reference nodes than you provide images for. When you run a generation:
|
||||
|
||||
- If you provide fewer reference images than the workflow supports, the unused reference nodes are automatically disconnected from the workflow graph
|
||||
- If you provide no reference images at all, all reference nodes are disconnected
|
||||
|
||||
This means you can use a single image editing workflow for both text-to-image generation and image editing operations. For example, if you configure `qwen_image_edit.json` as your image editing backend:
|
||||
|
||||
- When you generate with reference images, those images are uploaded and connected to the appropriate reference nodes
|
||||
- When you generate without reference images (pure text-to-image), all reference nodes are disconnected automatically, allowing the workflow to run as a standard text-to-image workflow
|
||||
|
||||
This behavior prevents errors that would otherwise occur if ComfyUI tried to process reference nodes without actual images loaded into them. You do not need to create separate workflows for text-to-image and image editing: a single workflow with reference nodes can serve both purposes, assuming the model supports it (qwen-image-edit 2511 appears to).
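The disconnection step can be pictured with a small sketch over a workflow in ComfyUI's API format, where each node is keyed by ID and a linked input is a `[source_node_id, output_index]` pair. The function name and the way reference nodes are passed in (an ordered list of node IDs) are assumptions for illustration; how Talemate identifies reference nodes internally is not described here.

```python
def disconnect_unused_references(workflow, reference_node_ids, provided_images):
    """Return a copy of the workflow without reference nodes that received no image."""
    # Reference nodes beyond the number of provided images are unused.
    unused = set(reference_node_ids[provided_images:])
    pruned = {}
    for node_id, node in workflow.items():
        if node_id in unused:
            continue  # drop the unused reference node itself
        inputs = {
            name: value
            for name, value in node.get("inputs", {}).items()
            # Drop links of the form [source_node_id, output_index] to unused nodes.
            if not (isinstance(value, list) and len(value) == 2 and value[0] in unused)
        }
        pruned[node_id] = {**node, "inputs": inputs}
    return pruned
```

Calling this with `provided_images=0` removes every reference node and its links, which corresponds to the pure text-to-image case described above.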
|
||||
|
||||
### Saving and Exporting the Workflow
|
||||
|
||||
Once your workflow is configured, you need to save it and export it in the API format for Talemate to use it.
|
||||
|
||||
@@ -13,6 +13,7 @@ The Visualizer agent supports multiple image generation backends, allowing you t
|
||||
- **Multiple Backend Support**: Works with various image generation services including Google, ComfyUI, AUTOMATIC1111, OpenAI, and more
|
||||
- **Style Templates**: Configure different visual styles for different types of content (character cards, portraits, scene backgrounds, etc.)
|
||||
- **Visual Library Integration**: Generated images are managed through the Visual Library, where you can organize, iterate, and save visual assets
|
||||
- **[Inline Visuals](../../inline-visuals.md)**: Generated images can appear directly in your scene feed alongside messages, providing an immersive visual storytelling experience (new in v0.35.0)
|
||||
- **Automatic Generation**: Optionally allow the agent to automatically generate visual content based on scene context
|
||||
- **Prompt Generation**: Supports both direct prompts and natural language instructions that incorporate character and scene context
|
||||
|
||||
@@ -57,8 +58,11 @@ Quick shortcuts are available through the scenario tools menu, allowing you to q
|
||||
|
||||
- **Visualize Scene**: Generate images of the current scene environment
|
||||
- **Visualize Character**: Generate character portraits or cards
|
||||
- **Visualize Moment**: Generate scene illustrations depicting the current story moment
|
||||
|
||||
These shortcuts support keyboard modifiers: hold **ALT** to generate prompts only (without creating images), or hold **CTRL** to use instruction mode.
|
||||
These shortcuts support keyboard modifiers: hold **ALT** to generate prompts only (without creating images), or hold **CTRL** to provide custom instructions.
|
||||
|
||||
When **Auto-attach visuals** is enabled in the visualizer menu, generated images will automatically appear in your scene feed as [inline visuals](../../inline-visuals.md). You can configure the display size and behavior of these images in the [Appearance Settings](../../app-settings/appearance.md#visuals).
|
||||
|
||||
### Director Chat
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||

|
||||
|
||||
The Visualizer agent settings are organized into two main tabs: **General** and **Styles**.
|
||||
The Visualizer agent settings are organized into three main sections: **General**, **Prompt Generation**, and **Styles**. Additionally, each backend may have its own configuration options, including [resolution presets](#resolution-presets) for local image generation backends.
|
||||
|
||||

|
||||
|
||||
@@ -37,6 +37,10 @@ When enabled, allows the Visualizer agent to automatically generate visual conte
|
||||
|
||||
This setting is disabled by default, giving you full control over when images are generated.
|
||||
|
||||
## Prompt Generation
|
||||
|
||||
The Prompt Generation section contains settings that control how image prompts are created and refined before being sent to the image generation backend.
|
||||
|
||||
### Fallback Prompt Type
|
||||
|
||||
Determines the prompt format used when no image generation backend is configured. It only applies to prompt-only generation, where prompts are produced without creating images.
|
||||
@@ -46,6 +50,62 @@ Available options:
|
||||
- **Keywords**: Generates prompts using keyword-based formatting
|
||||
- **Descriptive**: Generates prompts using descriptive text formatting
|
||||
|
||||
### Max. Prompt Generation Length
|
||||
|
||||
Controls the maximum token length for AI-generated image prompts. When you use Instruct mode or any feature that asks the AI to create a prompt for your image, this setting limits how long that generated prompt can be.
|
||||
|
||||
The default is 1024 tokens. The value can be adjusted from 512 to 4096 tokens.
|
||||
|
||||
Both keyword-style and descriptive prompts are always generated together, so this limit must accommodate both formats.
|
||||
|
||||
### Automatic Analysis of References
|
||||
|
||||
When enabled, reference images that lack analysis data will be automatically analyzed before being used in prompt generation. This ensures that the AI has detailed information about your reference images when creating prompts, which can lead to better results when generating variations or editing images.
|
||||
|
||||
#### How It Works
|
||||
|
||||
Normally, you analyze images manually using the **Analyze** button in the [Visual Library](visual-library.md#image-analysis). The analysis text captures details about the image content, which can then be used during prompt generation to help the AI understand what your reference images contain.
|
||||
|
||||
With **Automatic Analysis of References** enabled, any reference images that don't already have analysis data will be analyzed on-the-fly when you start a generation. This is particularly useful when:
|
||||
|
||||
- You have uploaded images that haven't been analyzed yet
|
||||
- You're using newly saved images as references before analyzing them
|
||||
- You want to ensure all references have analysis data without manually analyzing each one
|
||||
|
||||
The analysis results are saved to the asset metadata, so each image only needs to be analyzed once. Future generations using the same reference will use the cached analysis.
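The analyze-once behavior described above can be sketched roughly as follows. This is a minimal illustration, not Talemate's actual implementation: the `Reference` class, the `prepare_references` function, and the `analyze` callback are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Reference:
    """Hypothetical stand-in for a reference asset and its metadata."""
    asset_id: str
    analysis: Optional[str] = None  # cached analysis text, if any


def prepare_references(
    references: List[Reference],
    analyze: Callable[[Reference], str],
    auto_analyze: bool,
) -> int:
    """Analyze references that lack analysis data.

    Returns the number of analysis queries performed.
    """
    if not auto_analyze:
        return 0
    queries = 0
    for ref in references:
        if ref.analysis is None:
            # One AI query per unanalyzed reference; the result is stored
            # on the reference so later generations can reuse it.
            ref.analysis = analyze(ref)
            queries += 1
    return queries
```

Calling `prepare_references` a second time with the same references performs no further queries, which mirrors the "each image only needs to be analyzed once" behavior.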
|
||||
|
||||
#### Interaction with Prompt Revision
|
||||
|
||||
This setting works in conjunction with **Perform Extra Revision of Editing Prompts** (below). When both settings are enabled and you're generating with references:
|
||||
|
||||
1. First, any unanalyzed references are automatically analyzed
|
||||
2. Then, the prompt revision step uses this analysis data to refine your prompt
|
||||
|
||||
If you only enable prompt revision without automatic analysis, the revision step will still work but may have less information about unanalyzed references to work with.
|
||||
|
||||
!!! warning "Additional AI Queries"
|
||||
Enabling this option adds one AI query per unanalyzed reference image. You must have an image analysis backend configured for this feature to work.
|
||||
|
||||
This setting is disabled by default.
|
||||
|
||||
### Perform Extra Revision of Editing Prompts
|
||||
|
||||
When enabled, the AI will refine and simplify image editing prompts based on the provided reference images. This additional processing step can improve generation results by better aligning the prompt with your selected references.
|
||||
|
||||
This revision step analyzes the reference images (using their analysis data or tags) and adjusts the prompt to:
|
||||
|
||||
- Reference characters or elements by their image number instead of re-describing them
|
||||
- Preserve the scene composition and setting from your original prompt
|
||||
- Maintain important context like actions, positioning, and mood
|
||||
- Only describe differences from the reference images when needed
|
||||
|
||||
For example, instead of generating a prompt like "Elena, a tall woman with red hair and green eyes wearing a blue dress, stands in a dimly lit tavern," the revised prompt might become "Elena (IMAGE 1) stands in a dimly lit tavern, looking worried" - letting the reference image provide the character's appearance while your prompt defines the scene.
|
||||
|
||||
!!! warning "Additional AI Query"
|
||||
This adds an extra AI query to the prompt generation process when reference images are provided.
|
||||
|
||||
This setting is enabled by default.
|
||||
|
||||
## Styles Configuration
|
||||
|
||||

|
||||
@@ -72,3 +132,46 @@ Each style template can include:
|
||||
- Instructions (specific generation instructions)
|
||||
|
||||
These styles are applied automatically when generating images based on the visual type you select.
|
||||
|
||||
## Resolution Presets
|
||||
|
||||
Local image generation backends (ComfyUI, SD.Next, and AUTOMATIC1111) include a resolution preset picker that lets you quickly select image dimensions suited to your model. This feature appears in each backend's configuration section.
|
||||
|
||||
### How It Works
|
||||
|
||||
The resolution preset picker provides settings for three aspect ratios that match the format options available during image generation:
|
||||
|
||||
- **Square**: Used for character portraits and icons (e.g., `CHARACTER_PORTRAIT` visual type)
|
||||
- **Portrait**: Used for tall images like character cards (e.g., `CHARACTER_CARD`, `SCENE_CARD` visual types)
|
||||
- **Landscape**: Used for wide images like scene backgrounds (e.g., `SCENE_BACKGROUND`, `SCENE_ILLUSTRATION` visual types)
|
||||
|
||||
Each resolution setting displays two number fields for width and height, along with a dropdown menu button that reveals available presets.
|
||||
|
||||
### Available Presets
|
||||
|
||||
The preset picker includes resolution options optimized for different model types:
|
||||
|
||||
| Preset | Square | Portrait | Landscape |
|
||||
|--------|--------|----------|-----------|
|
||||
| **SD 1.5** | 512 x 512 | 512 x 768 | 768 x 512 |
|
||||
| **SDXL** | 1024 x 1024 | 832 x 1216 | 1216 x 832 |
|
||||
| **Qwen Image** | 1328 x 1328 | 928 x 1664 | 1664 x 928 |
|
||||
| **Z-Image Turbo** | 2048 x 2048 | 1088 x 1920 | 1920 x 1088 |
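For reference, the preset table and the visual type mapping described in this section can be expressed as a small lookup. The values come from the table and list above; the dictionary and function names are hypothetical and are not taken from Talemate's code.

```python
# (width, height) per preset and format, as listed in the table above.
RESOLUTION_PRESETS = {
    "SD 1.5": {"square": (512, 512), "portrait": (512, 768), "landscape": (768, 512)},
    "SDXL": {"square": (1024, 1024), "portrait": (832, 1216), "landscape": (1216, 832)},
    "Qwen Image": {"square": (1328, 1328), "portrait": (928, 1664), "landscape": (1664, 928)},
    "Z-Image Turbo": {"square": (2048, 2048), "portrait": (1088, 1920), "landscape": (1920, 1088)},
}

# Which format each visual type uses, per the "How It Works" list.
VISUAL_TYPE_FORMAT = {
    "CHARACTER_PORTRAIT": "square",
    "CHARACTER_CARD": "portrait",
    "SCENE_CARD": "portrait",
    "SCENE_BACKGROUND": "landscape",
    "SCENE_ILLUSTRATION": "landscape",
}


def resolution_for(preset: str, visual_type: str) -> tuple:
    """Return the (width, height) a preset would give for a visual type."""
    return RESOLUTION_PRESETS[preset][VISUAL_TYPE_FORMAT[visual_type]]
```

For example, `resolution_for("SDXL", "CHARACTER_CARD")` looks up the SDXL portrait resolution.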
|
||||
|
||||
### Selecting a Preset
|
||||
|
||||
To select a resolution preset:
|
||||
|
||||
1. Open the Visualizer agent settings
|
||||
2. Navigate to the backend configuration tab (ComfyUI, SD.Next, or AUTOMATIC1111)
|
||||
3. Find the **Resolutions** section
|
||||
4. Click the dropdown menu button next to the resolution you want to change
|
||||
5. Select the appropriate preset for your model
|
||||
|
||||
You can also manually enter custom width and height values by typing directly into the number fields.
|
||||
|
||||
!!! note "Backend-Specific Settings"
|
||||
Resolution presets are configured separately for text-to-image and image editing operations if you're using a backend that supports both (like ComfyUI or SD.Next). This allows you to use different resolutions for each type of generation.
|
||||
|
||||
!!! note "Cloud Backends"
|
||||
Cloud-based backends (Google, OpenAI, OpenRouter) do not have resolution presets because they use fixed or automatically determined resolutions based on the model's capabilities.
|
||||
|
||||
@@ -234,6 +234,44 @@ Assets can be configured for use as references in future generations. The **Refe
|
||||
|
||||
Tags are particularly useful for organizing large asset libraries and can be used to filter assets in the sidebar.
|
||||
|
||||
### Cover Crop
|
||||
|
||||
The **Cover crop** tab allows you to define a crop region for an image that will be applied whenever the image is used as a cover image (for characters or scenes). This is useful when you have a wide or tall image where only a specific portion should be displayed in the cover image area.
|
||||
|
||||

|
||||
|
||||
Cover images appear at the top of scenes or as character cards, and the crop ensures the most important part of your image is visible in those displays.
|
||||
|
||||
#### Setting Up a Crop Region
|
||||
|
||||
To define a crop region:
|
||||
|
||||
1. Select an asset from the Scene Assets tree to open it for editing
|
||||
2. Click the **Cover crop** tab in the metadata panel
|
||||
3. On the image preview, drag to draw a rectangular crop region
|
||||
4. Adjust the region as needed:
|
||||
- **Move**: Drag inside the crop box to reposition it
|
||||
- **Resize**: Drag any of the four corner handles to resize the crop region
|
||||
5. Click **Save** to save your changes
|
||||
|
||||

|
||||
|
||||
The area outside your crop region appears dimmed, giving you a preview of what will be visible when the image is used as a cover.
|
||||
|
||||
#### Resetting the Crop
|
||||
|
||||
To remove a custom crop and use the full image, click the **Reset** button in the top-right corner of the image preview. This sets the crop region to encompass the entire image.
|
||||
|
||||
#### When is the Crop Applied?
|
||||
|
||||
The crop region is automatically applied when:
|
||||
|
||||
- The image is set as a **scene cover image** and displayed in the scene header
|
||||
- The image is set as a **character cover image** and displayed in the character panel
|
||||
- The image appears in any other context that uses the cover image display
|
||||
|
||||
The original image file is never modified. The crop is applied dynamically when the image is displayed.
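A rough sketch of how a non-destructive crop can work is shown below. It assumes the crop region is stored as fractions of the image size, which is an assumption made for illustration only; the `CropRegion` class and `FULL_IMAGE` constant are hypothetical and do not describe how Talemate stores crop data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropRegion:
    """Hypothetical crop region stored as fractions (0.0 to 1.0) of the image."""
    x: float
    y: float
    width: float
    height: float

    def to_pixels(self, image_width: int, image_height: int) -> tuple:
        """Return (left, top, right, bottom) in pixels for a given image size."""
        left = round(self.x * image_width)
        top = round(self.y * image_height)
        right = round((self.x + self.width) * image_width)
        bottom = round((self.y + self.height) * image_height)
        return (left, top, right, bottom)


# Resetting the crop corresponds to a region covering the entire image.
FULL_IMAGE = CropRegion(0.0, 0.0, 1.0, 1.0)
```

Because only the region is stored and the pixel box is computed at display time, the image file itself stays untouched.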
|
||||
|
||||
## Image Analysis
|
||||
|
||||
Image analysis uses AI to extract detailed information from images. This is useful for:
|
||||
@@ -246,6 +284,9 @@ Image analysis uses AI to extract detailed information from images. This is usef
|
||||
!!! note "Image Analysis Backend Required"
|
||||
Image analysis requires the Image Analysis backend to be configured and available. If the backend is not configured, the Analyze button will be disabled or unavailable. Make sure you have an image analysis backend set up in your visual agent configuration before attempting to analyze images.
|
||||
|
||||
!!! tip "Automatic Analysis During Generation"
|
||||
If you prefer not to manually analyze each image, you can enable **Automatic Analysis of References** in the [Visualizer settings](settings.md#automatic-analysis-of-references). When enabled, any reference images lacking analysis data will be automatically analyzed before prompt generation.
|
||||
|
||||

|
||||
|
||||
Click **Analyze** to perform a quick analysis with default settings, or hold **Ctrl** (or **Cmd** on Mac) while clicking to open a dialog where you can specify custom analysis instructions.
|
||||
|
||||
@@ -16,6 +16,7 @@ In 0.32.0 Talemate's TTS (Text-to-Speech) agent has been completely refactored t
|
||||
- **[Kokoro](kokoro.md)** - Fastest generation with predefined voice models and mixing
|
||||
- **[F5-TTS](f5tts.md)** - Fast voice cloning with occasional mispronunciations
|
||||
- **[Chatterbox](chatterbox.md)** - High-quality voice cloning (slower generation)
|
||||
- **[Pocket TTS](pocket-tts.md)** - CPU-based voice cloning from Kyutai (no GPU required)
|
||||
|
||||
### Remote APIs
|
||||
- **[ElevenLabs](elevenlabs.md)** - Professional voice synthesis with voice cloning
|
||||
|
||||
143
docs/user-guide/agents/voice/pocket-tts.md
Normal file
@@ -0,0 +1,143 @@
|
||||
# Pocket TTS
|
||||
|
||||
Pocket TTS is a local CPU-based text-to-speech model from [Kyutai](https://kyutai.org/) that supports voice cloning from audio files. Unlike other local TTS options that require a GPU, Pocket TTS runs efficiently on your CPU, making it accessible on a wider range of hardware.
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- **CPU-only** - No GPU required, runs on standard computer hardware
|
||||
- **Voice cloning** - Clone voices from short audio samples (.wav files)
|
||||
- **Low resource usage** - Uses only 2 CPU cores with a small 100M parameter model
|
||||
- **Built-in voices** - Includes several ready-to-use voice samples
|
||||
- **English only** - Currently supports English language generation
|
||||
|
||||
## First-Time Setup
|
||||
|
||||
The first time you generate audio with Pocket TTS, it will automatically download the model weights. This is a one-time download.
|
||||
|
||||
!!! warning "Voice Cloning Access"
|
||||
Voice cloning requires accepting the model terms on Hugging Face. If voice cloning downloads are blocked:
|
||||
|
||||
1. Visit the [Pocket TTS model page](https://huggingface.co/kyutai/pocket-tts) and accept the terms
|
||||
2. Create a [Hugging Face access token](https://huggingface.co/settings/tokens)
|
||||
3. Set the token in your environment as `HF_TOKEN`
|
||||
4. Restart Talemate
|
||||
|
||||
## Configuration
|
||||
|
||||
##### Variant
|
||||
|
||||
The model variant identifier. The default `b6369a24` is the current recommended version.
|
||||
|
||||
##### Temperature
|
||||
|
||||
Controls voice variation during generation. Higher values (e.g., 1.0) produce more varied but potentially less stable output. Lower values (e.g., 0.5) produce more consistent results. Default is 0.7.
|
||||
|
||||
##### LSD Decode Steps
|
||||
|
||||
Number of decoding steps. Higher values can improve quality but increase generation time. Default is 1.
|
||||
|
||||
##### Noise Clamp
|
||||
|
||||
When set above 0, limits noise sampling to prevent extreme values. 0 disables clamping. Default is 0.
|
||||
|
||||
##### EOS Threshold
|
||||
|
||||
End-of-sequence detection threshold. Controls when the model stops generating audio. Default is -4.0.
|
||||
|
||||
##### Frames After EOS
|
||||
|
||||
Number of additional audio frames to generate after detecting the end of speech. 0 uses automatic detection. Default is 0.
|
||||
|
||||
##### Chunk Size
|
||||
|
||||
Text is split into chunks of this size for processing. Smaller values increase responsiveness but may affect natural flow between chunks. 0 disables chunking. Default is 256.
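To illustrate the trade-off, here is a minimal chunking sketch. It assumes the chunk size is measured in characters and that splits happen at word boundaries; both are assumptions for this example, and the `chunk_text` function is hypothetical rather than the splitter Talemate actually uses.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 256) -> List[str]:
    """Split text into chunks of roughly chunk_size characters.

    Splits at whitespace. A single word longer than chunk_size is kept
    whole, so a chunk can exceed the limit in that case.
    A chunk_size of 0 disables chunking.
    """
    if chunk_size <= 0:
        return [text] if text else []
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks mean the first audio can start sooner, while each chunk boundary is a place where intonation may not carry over.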
|
||||
|
||||
## Built-in Voices
|
||||
|
||||
Talemate includes several ready-to-use Pocket TTS voices. These are available immediately without any additional setup:
|
||||
|
||||
| Voice | Description |
|
||||
|-------|-------------|
|
||||
| Eva | Female, calm, mature, thoughtful |
|
||||
| Lisa | Female, energetic, young |
|
||||
| Adam | Male, calm, mature, thoughtful, deep |
|
||||
| Bradford | Male, calm, mature, thoughtful, deep |
|
||||
| Julia | Female, calm, mature |
|
||||
| Zoe | Female |
|
||||
| William | Male, young |
|
||||
|
||||
These voices use audio samples located in the `tts/voice/pocket_tts/` folder within your Talemate installation.
|
||||
|
||||
## Adding Custom Voices
|
||||
|
||||
### Voice Requirements
|
||||
|
||||
Pocket TTS voices use audio files as reference prompts for voice cloning:
|
||||
|
||||
- Audio file in .wav format
|
||||
- Clear speech with minimal background noise
|
||||
- Single speaker throughout the sample
|
||||
|
||||
### Creating a Voice
|
||||
|
||||
1. Open the Voice Library
|
||||
2. Click **:material-plus: New**
|
||||
3. Select "Pocket TTS" as the provider
|
||||
4. Configure the voice:
|
||||
|
||||

|
||||
|
||||
**Label:** A descriptive name for the voice (e.g., "Sarah - Warm Female")
|
||||
|
||||
**Voice ID / Upload File:** You have three options:
|
||||
|
||||
- Upload a .wav file containing the voice sample - the uploaded file becomes the voice ID
|
||||
- Enter a path to a local .wav file (relative to Talemate workspace or absolute path)
|
||||
- Enter a Hugging Face URL in the format `hf://kyutai/tts-voices/...`
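The three forms of voice ID listed above can be told apart with a simple check. This is an illustrative sketch only; the `classify_voice_id` function and its return labels are hypothetical and not part of Talemate.

```python
from pathlib import Path


def classify_voice_id(voice_id: str) -> str:
    """Classify a voice ID as a Hugging Face URL or a local file path."""
    if voice_id.startswith("hf://"):
        return "huggingface"
    if Path(voice_id).is_absolute():
        return "absolute_path"
    # Anything else is treated as relative to the Talemate workspace.
    return "relative_path"
```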
|
||||
|
||||
**Tags:** Add descriptive tags (gender, age, style) for organization and filtering
|
||||
|
||||
### Extra Voice Parameters
|
||||
|
||||

|
||||
|
||||
##### Truncate Prompt Audio
|
||||
|
||||
When enabled, truncates the voice prompt audio to 30 seconds when extracting the voice characteristics. This can help prevent memory issues with very long audio samples.
|
||||
|
||||
## Using Hugging Face Voice Catalog
|
||||
|
||||
Kyutai provides a catalog of voices on Hugging Face that you can use directly with Pocket TTS. To use a voice from the catalog:
|
||||
|
||||
1. Visit the [Kyutai voice catalog](https://huggingface.co/kyutai/tts-voices)
|
||||
2. Find a voice you want to use
|
||||
3. Copy the voice path
|
||||
4. In Talemate, create a new Pocket TTS voice and enter the path as the Voice ID in the format: `hf://kyutai/tts-voices/voice-name/file.wav`
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Model Download Issues
|
||||
|
||||
If the model fails to download:
|
||||
|
||||
- Check your internet connection
|
||||
- Verify you have accepted the terms on [Hugging Face](https://huggingface.co/kyutai/pocket-tts)
|
||||
- Make sure your `HF_TOKEN` environment variable is set correctly
|
||||
- Try restarting Talemate
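To confirm that the token is visible to a Python process, a quick check like the one below can help. It only tests whether `HF_TOKEN` is present and non-empty, not whether the token is valid; the `hf_token_status` helper is a hypothetical name used for this example.

```python
import os
from typing import Mapping


def hf_token_status(env: Mapping[str, str] = os.environ) -> str:
    """Report whether HF_TOKEN is present and non-empty in the environment."""
    token = env.get("HF_TOKEN", "").strip()
    return "set" if token else "missing"


if __name__ == "__main__":
    print(f"HF_TOKEN is {hf_token_status()}")
```

Run it from the same shell or environment that launches Talemate, since environment variables set elsewhere will not be visible.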
|
||||
|
||||
### Voice Cloning Not Working
|
||||
|
||||
If you can use built-in voices but voice cloning fails:
|
||||
|
||||
- Voice cloning requires accepting additional terms on Hugging Face
|
||||
- Follow the First-Time Setup instructions above to configure your Hugging Face token
|
||||
|
||||
### Generation Quality Issues
|
||||
|
||||
If the generated audio sounds unusual:
|
||||
|
||||
- Try adjusting the Temperature setting - lower values produce more consistent results
|
||||
- Ensure your voice reference audio is clear with minimal background noise
|
||||
- Try using a shorter audio sample (5-15 seconds often works well)
|
||||
@@ -9,6 +9,7 @@ Select which TTS APIs to enable. You can enable multiple APIs simultaneously:
|
||||
- **Kokoro** - Fastest generation with predefined voice models and mixing
|
||||
- **F5-TTS** - Fast voice cloning with occasional mispronunciations
|
||||
- **Chatterbox** - High-quality voice cloning (slower generation)
|
||||
- **Pocket TTS** - CPU-based voice cloning from Kyutai (no GPU required)
|
||||
- **ElevenLabs** - Professional voice synthesis with voice cloning
|
||||
- **Google Gemini-TTS** - Google's text-to-speech service
|
||||
- **OpenAI** - OpenAI's TTS-1 and TTS-1-HD models
|
||||
|
||||
@@ -67,6 +67,13 @@ Check the provider specific documentation for more information on how to configu
|
||||
- Specify reference text for better quality
|
||||
- Adjust speed and other parameters
|
||||
|
||||
**Pocket TTS:**
|
||||
|
||||
- Upload .wav reference files for voice cloning
|
||||
- Use built-in voice samples included with Talemate
|
||||
- Use voices from the Hugging Face voice catalog
|
||||
- Runs on CPU (no GPU required)
|
||||
|
||||
**Kokoro:**
|
||||
|
||||
- Select from predefined voice models
|
||||
|
||||
@@ -47,3 +47,46 @@ If enabled, the proposed changes will be presented as suggestions to the player.
|
||||
##### Player character
|
||||
|
||||
Enable this to have the player character be included in the progression checks.
|
||||
|
||||
## Character Portraits
|
||||
|
||||
!!! info "New in 0.35.0"
|
||||
Character portrait features allow automatic portrait selection and generation based on scene context.
|
||||
|
||||

|
||||
|
||||
The Character Portraits settings control how character avatars are displayed alongside dialogue messages and whether they should change automatically based on the scene context.
|
||||
|
||||
### Portrait Selection
|
||||
|
||||
##### Selection Frequency
|
||||
|
||||
Controls how often the World State Agent evaluates which portrait to use for a character based on the current scene context.
|
||||
|
||||
- **0**: Never automatically select portraits (portraits must be changed manually)
|
||||
- **1**: Evaluate with every new message
|
||||
- **2-10**: Evaluate every N messages
|
||||
|
||||
When a message is generated, the agent examines the content and context of the scene, then compares it against the tags associated with each portrait to find the best match.
|
||||
|
||||
!!! note "Minimum Portraits Required"
|
||||
A character needs at least 2 portraits in their visual configuration for automatic selection to activate. You can manage portraits in the [World Editor under Character > Visuals > Portrait](/talemate/user-guide/world-editor/characters/visuals/#portrait).
|
||||
|
||||
!!! tip "Tag Your Portraits"
|
||||
The selection algorithm relies on portrait tags to make decisions. Portraits without tags cannot be intelligently selected. Add descriptive tags like "happy", "sad", "angry", "combat", "formal" to each portrait using the Visual Library.
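The selection rules above can be summarized in a short sketch. It is a simplification under stated assumptions: the real agent evaluates scene context with an AI model, whereas this example substitutes a plain tag-overlap count, and the function names `should_evaluate` and `select_portrait` are hypothetical.

```python
from typing import Dict, Optional, Set


def should_evaluate(message_count: int, frequency: int) -> bool:
    """Apply the Selection Frequency setting.

    0 disables automatic selection, 1 evaluates on every message,
    and N evaluates on every Nth message.
    """
    if frequency <= 0:
        return False
    return message_count % frequency == 0


def select_portrait(
    portraits: Dict[str, Set[str]], context_tags: Set[str]
) -> Optional[str]:
    """Pick the portrait whose tags overlap most with the scene context.

    Returns None when fewer than 2 portraits exist or when no tagged
    portrait matches.
    """
    if len(portraits) < 2:
        return None
    best_id, best_score = None, 0
    for portrait_id, tags in portraits.items():
        score = len(tags & context_tags)
        if score > best_score:
            best_id, best_score = portrait_id, score
    return best_id
```

An untagged portrait always scores zero here, which is why tagging each portrait matters for automatic selection.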
|
||||
|
||||
### Generate New Portraits
|
||||
|
||||
##### Generate New Portraits
|
||||
|
||||
When enabled, the World State Agent can request the Director to generate new portraits when no suitable portrait is found for the current scene context.
|
||||
|
||||
For example, if a character is described as wearing formal attire at a party but no existing portrait shows them in formal wear, the system can automatically commission a new portrait showing the appropriate appearance.
|
||||
|
||||
!!! warning "Prerequisites"
|
||||
This feature requires:
|
||||
|
||||
- The Director's **Character Management > Generate Visuals** setting to be enabled
|
||||
- A Visual Agent with an image generation backend configured
|
||||
|
||||
When a new portrait is generated, it is automatically added to the character's portrait collection and tagged based on the scene context.
|
||||
|
||||