Major features:
- Autonomous scene direction via director agent (replaces auto-direct)
- Inline image display in scene feed
- Character visuals tab for portrait/cover image management
- Character message avatars with dynamic portrait selection
- Pocket TTS and llama.cpp client support
- Message appearance overhaul with configurable markdown display

Improvements:
- KoboldCpp: adaptive-p, min-p, presence/frequency penalty support
- Setup wizard for initial configuration
- Director chat action toggles
- Visual agent: resolution presets, prompt revision, auto-analysis
- Experimental concurrent requests for hosted LLM clients
- Node editor alignment shortcuts (X/Y) and color picker

Bugfixes:
- Empty response retry loop
- Client system prompt display
- Character detail pins loading
- ComfyUI workflow charset encoding
- Various layout and state issues

Breaking: Removed INSTRUCTOR embeddings
veguAI
2026-01-27 10:22:41 +02:00
committed by GitHub
parent 20af2a9f4b
commit d0ebe95ca6
493 changed files with 50656 additions and 10683 deletions

.gitattributes (new file)

@@ -0,0 +1,2 @@
# Force Unix line endings for shell scripts (prevents CRLF issues in Docker)
*.sh text eol=lf


@@ -11,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.10', '3.11', '3.12', '3.13']
python-version: ['3.11']
fail-fast: false
steps:

.gitignore

@@ -32,4 +32,5 @@ scenes/
tts_voice_samples/*.wav
third-party-docs/
legacy-state-reinforcements.yaml
CLAUDE.md
CLAUDE.md
docs/update-progress


@@ -48,6 +48,7 @@ RUN apt-get update && apt-get install -y \
wget \
tar \
xz-utils \
gettext-base \
&& rm -rf /var/lib/apt/lists/*
# Install uv in the final stage
@@ -77,6 +78,9 @@ COPY --from=backend-build /app/src /app/src
# Copy Node.js build artifacts from frontend-build stage
COPY --from=frontend-build /app/dist /app/talemate_frontend/dist
# Preserve index.html as template for runtime envsubst substitution
COPY --from=frontend-build /app/dist/index.html /app/talemate_frontend/dist/index.template.html
# Copy the frontend WSGI file if it exists
COPY frontend_wsgi.py /app/frontend_wsgi.py
@@ -84,7 +88,14 @@ COPY frontend_wsgi.py /app/frontend_wsgi.py
COPY config.example.yaml /app/config.yaml
# Copy essentials
COPY scenes templates chroma* /app/
COPY scenes/ /app/scenes/
COPY templates/ /app/templates/
COPY chroma* /app/
COPY tts/ /app/tts/
# Copy entrypoint script for runtime environment variable substitution
COPY docker-entrypoint.sh /app/docker-entrypoint.sh
RUN chmod +x /app/docker-entrypoint.sh
# Set PYTHONPATH to include the src directory
ENV PYTHONPATH=/app/src:$PYTHONPATH
@@ -93,5 +104,6 @@ ENV PYTHONPATH=/app/src:$PYTHONPATH
EXPOSE 5050
EXPOSE 8080
# Use bash as the shell, activate the virtual environment, and run backend server
CMD ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]
# Use entrypoint for runtime config, CMD for the actual server
ENTRYPOINT ["/app/docker-entrypoint.sh"]
CMD ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]


@@ -9,13 +9,9 @@ creator:
- an epic sci-fi adventure
## Long-term memory
## Embeddings are now managed through the application UI
## See: Presets -> Embeddings in the application settings
#chromadb:
# embeddings: instructor
# instructor_device: cuda
# instructor_model: hkunlp/instructor-xl
# openai_model: text-embedding-3-small
## Remote LLMs
#openai:


@@ -17,4 +17,5 @@ services:
environment:
- PYTHONUNBUFFERED=1
- PYTHONPATH=/app/src:$PYTHONPATH
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-}
command: ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]


@@ -15,4 +15,5 @@ services:
environment:
- PYTHONUNBUFFERED=1
- PYTHONPATH=/app/src:$PYTHONPATH
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-}
command: ["uv", "run", "src/talemate/server/run.py", "runserver", "--host", "0.0.0.0", "--port", "5050", "--frontend-host", "0.0.0.0", "--frontend-port", "8080"]

docker-entrypoint.sh (new file)

@@ -0,0 +1,38 @@
#!/bin/bash
set -e
#
# Talemate Docker Entrypoint
#
# Replaces environment variable placeholders in the frontend bundle
# at container startup, enabling runtime configuration.
#
# Environment Variables:
# VITE_TALEMATE_BACKEND_WEBSOCKET_URL - WebSocket URL for backend connection
# Leave empty/unset for auto-detection
#
TEMPLATE_FILE="/app/talemate_frontend/dist/index.template.html"
OUTPUT_FILE="/app/talemate_frontend/dist/index.html"
echo "============================================"
echo "Talemate Docker Entrypoint"
echo "============================================"
if [ -f "$TEMPLATE_FILE" ]; then
echo "Substituting environment variables..."
echo " VITE_TALEMATE_BACKEND_WEBSOCKET_URL: ${VITE_TALEMATE_BACKEND_WEBSOCKET_URL:-<not set, will auto-detect>}"
# Use envsubst to replace ${VAR} placeholders with actual values
envsubst < "$TEMPLATE_FILE" > "$OUTPUT_FILE"
echo "Runtime config applied to index.html"
else
echo "Warning: Template file not found, skipping envsubst"
fi
echo "Starting Talemate..."
echo "============================================"
# Execute the main command
exec "$@"


@@ -1,5 +1,6 @@
nav:
- 1. Installation: installation
- 2. Connect a client: connect-a-client.md
- 3. Load a scene: load-a-scene.md
- 2. Setup Wizard: setup-wizard.md
- 3. Connect a client: connect-a-client.md
- 4. Load a scene: load-a-scene.md
- ...


@@ -1,3 +1,4 @@
nav:
- change-host-and-port.md
- debug-logging.md
- ...


@@ -96,4 +96,33 @@ Start the backend and frontend as usual.
```batch
start_custom.bat
```
## Docker Runtime Configuration
For Docker deployments, you can configure the WebSocket URL at container startup without rebuilding the image.
### Setting WebSocket URL via Environment Variable
```yaml
# docker-compose.yml
services:
talemate:
environment:
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=ws://your-backend-host:5050/ws
```
Or via command line:
```bash
VITE_TALEMATE_BACKEND_WEBSOCKET_URL=ws://192.168.1.100:5050/ws docker compose up
```
### Configuration Priority
The WebSocket URL is determined in this order:
1. **Runtime environment variable** (`VITE_TALEMATE_BACKEND_WEBSOCKET_URL` at container start)
2. **Auto-detection** (`ws://<current-browser-hostname>:5050/ws`)
This means you can use a single Docker image across different environments (staging, production) by simply changing the environment variable.
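The substitution mechanism can be sketched as follows. This is a minimal illustration with a hypothetical template line — the real entrypoint applies `envsubst` (from `gettext-base`) to `index.template.html`; `sed` is used here only to emulate the same placeholder replacement:

```shell
WS_URL="ws://backend:5050/ws"
# Hypothetical template content; the real placeholder lives in index.template.html
echo 'var wsUrl = "${VITE_TALEMATE_BACKEND_WEBSOCKET_URL}";' > index.template.html
# envsubst performs this replacement in the real entrypoint; sed emulates it here:
sed "s|\${VITE_TALEMATE_BACKEND_WEBSOCKET_URL}|$WS_URL|g" index.template.html > index.html
cat index.html   # var wsUrl = "ws://backend:5050/ws";
```

Because the replacement happens at container start rather than at build time, the same image works unchanged in any environment.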


@@ -0,0 +1,51 @@
# Debug Logging
By default, Talemate logs at the `INFO` level. To enable more verbose `DEBUG` logging, set the `TALEMATE_DEBUG` environment variable to `1` before starting the server.
This will output detailed debug information from all components, which is useful for troubleshooting issues or reporting bugs.
#### :material-linux: Linux
Prefix the start command with the environment variable:
```bash
TALEMATE_DEBUG=1 ./start.sh
```
Or if running manually:
```bash
TALEMATE_DEBUG=1 uv run src/talemate/server/run.py runserver --host 0.0.0.0 --port 5050
```
#### :material-microsoft-windows: Windows
Set the environment variable before running the start script:
```batch
SET TALEMATE_DEBUG=1
start.bat
```
## Disabling debug logging
To return to normal logging, unset the variable or set it to `0`:
#### :material-linux: Linux
```bash
unset TALEMATE_DEBUG
```
Or simply run without the prefix:
```bash
./start.sh
```
#### :material-microsoft-windows: Windows
```batch
SET TALEMATE_DEBUG=0
start.bat
```


@@ -1,5 +1,8 @@
# Connect a client
!!! note "First time setup?"
If this is your first time launching Talemate, the [Setup Wizard](setup-wizard.md) will guide you through adding your first client and configuring essential settings. This page covers manual client configuration for adding additional clients or if you skipped the wizard.
Once Talemate is up and running and you are connected, you will see a notification in the corner instructing you to configure a client.
![no clients](/talemate/img/0.26.0/no-clients.png)
@@ -36,20 +39,23 @@ Select the API you want to use and click through to follow the instructions to c
- [Anthropic](/talemate/user-guide/clients/types/anthropic/)
- [mistral.ai](/talemate/user-guide/clients/types/mistral/)
- [Cohere](/talemate/user-guide/clients/types/cohere/)
- [DeepSeek](/talemate/user-guide/clients/types/deepseek/)
- [Groq](/talemate/user-guide/clients/types/groq/)
- [Google Gemini](/talemate/user-guide/clients/types/google/)
- [OpenRouter](/talemate/user-guide/clients/types/openrouter/)
##### Local APIs
- [KoboldCpp](/talemate/user-guide/clients/types/koboldcpp/)
- [Text-Generation-WebUI](/talemate/user-guide/clients/types/text-generation-webui/)
- [llama.cpp](/talemate/user-guide/clients/types/llamacpp/)
- [Ollama](/talemate/user-guide/clients/types/ollama/)
- [Text-Generation-WebUI](/talemate/user-guide/clients/types/text-generation-webui/)
- [LMStudio](/talemate/user-guide/clients/types/lmstudio/)
- [TabbyAPI](/talemate/user-guide/clients/types/tabbyapi/)
##### Unofficial OpenAI API implementations
- [DeepInfra](/talemate/user-guide/clients/types/openai-compatible/#deepinfra)
- llamacpp with the `api_like_OAI.py` wrapper
## Assign the client to the agents


@@ -16,10 +16,54 @@ This happens because we mount the config file directly as a docker volume, and i
This will eventually be fixed, for now please make sure to copy the example config file before running the docker compose command.
### Configuring WebSocket URL at Runtime
If you need to connect the frontend to a backend running on a different host or port (e.g., behind a reverse proxy), you can configure this at container startup without rebuilding the image.
Set the `VITE_TALEMATE_BACKEND_WEBSOCKET_URL` environment variable:
```bash
# Using docker run
docker run -e VITE_TALEMATE_BACKEND_WEBSOCKET_URL=wss://api.example.com/ws ghcr.io/vegu-ai/talemate:latest
# Using docker-compose.yml
services:
talemate:
environment:
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=wss://api.example.com/ws
```
**URL Format:**
- Use `ws://` for unencrypted connections
- Use `wss://` for SSL/TLS connections (required if behind HTTPS proxy)
- Include the `/ws` path suffix
**If not set**, the frontend automatically connects to `ws://<current-hostname>:5050/ws`.
## General
### Running behind reverse proxy with ssl
### Running behind reverse proxy with SSL
Personally i have not been able to make this work yet, but its on my list, issue stems from some vue oddities when specifying the base urls while running in a dev environment. I expect once i start building the project for production this will be resolved.
If you do make it work, please reach out to me so i can update this documentation.
To run Talemate behind a reverse proxy with SSL:
1. Configure your reverse proxy to forward WebSocket connections to the backend (port 5050)
2. Set the WebSocket URL to use your proxy's public address:
```yaml
# docker-compose.yml
environment:
- VITE_TALEMATE_BACKEND_WEBSOCKET_URL=wss://your-domain.com/ws
```
3. Ensure your proxy is configured to handle WebSocket upgrades. Example nginx config:
```nginx
location /ws {
proxy_pass http://talemate:5050/ws;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
}
```
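For reference, a fuller hypothetical nginx server block combining SSL termination with the WebSocket location might look like the sketch below — the domain, certificate paths, and upstream container name are placeholders, and the frontend/backend ports follow the defaults used elsewhere in this document (8080 and 5050):

```nginx
server {
    listen 443 ssl;
    server_name your-domain.com;

    ssl_certificate     /etc/ssl/certs/your-domain.crt;   # placeholder path
    ssl_certificate_key /etc/ssl/private/your-domain.key; # placeholder path

    # Frontend (container port 8080)
    location / {
        proxy_pass http://talemate:8080;
        proxy_set_header Host $host;
    }

    # Backend WebSocket (container port 5050)
    location /ws {
        proxy_pass http://talemate:5050/ws;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

With this shape, `VITE_TALEMATE_BACKEND_WEBSOCKET_URL` would be set to `wss://your-domain.com/ws` so the browser connects through the proxy rather than directly to port 5050.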


@@ -29,7 +29,7 @@ Once sent, it's now the AI's turn to respond - depending on the service and model
### Scenario tools
Above the chat input there is a set of tools to help you interact with the scenario.
Above the chat input there is a set of tools to help you interact with the scenario. You may also notice an **agent activity bar** appearing above the tools when agents are working - this shows which agents are currently processing in the background.
![Getting started ui element tools](/talemate/img/0.26.0/getting-started-ui-element-tools.png)


@@ -0,0 +1,110 @@
# Setup Wizard
When you launch Talemate for the first time, a setup wizard will guide you through the initial configuration. This wizard helps you connect to an AI model and configure essential settings so you can start creating stories right away.
## Step 1: Choose Your AI Provider Type
The first decision is how you want to connect to an AI model. Talemate offers two options:
### Self-hosted
Choose this if you are running your own AI inference service on your computer or a server you control. This includes:
- KoboldCpp
- Text-Generation-WebUI
- llama.cpp
- LMStudio
- TabbyAPI
### Hosted API
Choose this if you want to use a cloud-based AI service. This requires an API key from the provider. Supported services include:
- OpenRouter
- Google
- Anthropic
- And others
![Setup wizard step 1 - choose provider type](../img/0.35.0/setup-wizard-step1-provider-type.png)
## Step 2: Add Your Client
After selecting your provider type, you will choose a specific client to configure.
For **Hosted API** users, OpenRouter is selected by default. For **Self-hosted** users, KoboldCpp is selected by default.
Select your preferred client from the dropdown and click **Add Client** to open the client configuration dialog.
![Setup wizard step 2 - add client](../img/0.35.0/setup-wizard-step2-add-client.png)
Once you complete the client configuration, the wizard will automatically advance to the next step.
!!! info "Reasoning Models"
If you are using a reasoning model (like DeepSeek R1 or GLM), you will need to enable reasoning in the client settings after setup. For OpenRouter users, the default model selected during the wizard (Gemini 3 Flash) has reasoning enabled automatically, but if you switch to a different reasoning-capable model, you'll need to enable it manually in the client settings.
## Step 3: Configure Long-term Memory
Talemate uses embeddings to store and retrieve story details over time. This includes character names, relationships, locations, and other facts that should persist as your story grows.
### Embeddings Model
You have two options:
**Better (Recommended)**
Uses the `Alibaba-NLP/gte-base-en-v1.5` model. This provides more capable memory recall and is recommended for the best experience. The model weights are approximately 550MB and will be downloaded when you first load a scene. If using CUDA, plan for roughly 1GB or more of free VRAM.
**Standard**
A smaller, less capable model. Choose this only if you cannot spare the VRAM for the recommended model.
### Device
Choose where the embeddings model should run:
**CUDA**
Faster performance on NVIDIA GPUs. Recommended if you have an NVIDIA GPU with sufficient free VRAM. The wizard will detect your CUDA availability and display your GPU information.
**CPU**
Works on any system. Use this if you do not have an NVIDIA GPU or prefer not to use GPU memory for embeddings.
![Setup wizard step 3 - memory configuration](../img/0.35.0/setup-wizard-step3-memory.png)
The wizard will automatically detect your system capabilities and suggest appropriate defaults based on your available hardware.
!!! note "Changing these settings later"
You can change the embeddings model and device at any time in **Agents -> Memory**.
## Step 4: Configure Visual Agent (Optional)
This step only appears if you selected **Google** or **OpenRouter** as your client in Step 2.
Since these providers support image generation and analysis, you can configure the Visual Agent to use them for:
- Creating images
- Editing images
- Analyzing images
Choose **Enable** to automatically configure the Visual Agent, or **Skip** to configure it manually later.
![Setup wizard step 4 - visual agent](../img/0.35.0/setup-wizard-step4-visual.png)
## Completing the Wizard
After completing all steps, click **Apply & finish** to save your settings and close the wizard. You are now ready to start using Talemate.
## Skipping the Wizard
If you prefer to configure Talemate manually, you can click **Skip setup for now** at any time. The wizard will close and you can configure clients and settings through the regular interface.
The wizard will not appear again once you have added at least one client. If you skipped the wizard without adding a client, it will appear again the next time you launch Talemate.
## Re-accessing Setup Options
The setup wizard is designed for first-time setup only. However, all the settings it configures can be accessed through the regular interface:
- **Client configuration**: Click **Add Client** in the clients panel on the right side of the screen
- **Memory settings**: Navigate to **Agents -> Memory** and open the settings
- **Visual Agent settings**: Navigate to **Agents -> Visual** and open the settings

(Binary image files changed: documentation screenshots added and removed; contents not shown.)

@@ -15,6 +15,7 @@ Roleplay with AI with a focus on strong narration and consistent world and game
### First steps
- [Setup Wizard](getting-started/setup-wizard.md) - Initial configuration on first launch
- [Connect a client](getting-started/connect-a-client.md)
- [Load a scene](getting-started/load-a-scene.md)
- [Interact with the scene](user-guide/interacting)


@@ -62,9 +62,19 @@ If > 0 will offset the instructions for the actor (both broad and character spec
![Conversation agent content settings](/talemate/img/0.30.0/conversation-content-settings.png)
Enable this setting to apply a writing style to the generated content.
Content settings control what contextual information is included in the prompts sent to the AI when generating character dialogue.
Make sure a writing style is selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) to apply the writing style to the generated content.
##### Use Scene Intent
When enabled (default), the [scene intent](/talemate/user-guide/world-editor/scene/direction) (overall intention) will be included in the conversation prompt. This helps the AI generate dialogue that aligns with your story goals and the current scene direction.
Disable this if you want the AI to generate dialogue without being influenced by the scene direction settings.
##### Use Writing Style
When enabled (default), the writing style selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) will be applied to the generated dialogue.
Disable this if you want the AI to generate dialogue without following the scene's writing style template.
## Long Term Memory


@@ -39,9 +39,13 @@ The director can help you with many tasks:
- Create or modify characters, world entries, and story configuration
- Advance time in your story
- Manage game state variables (if your story uses them)
- Generate images and illustrations (if the [Visualizer Agent](/talemate/user-guide/agents/visualizer) is configured)
Simply describe what you want in natural language, and the director will figure out how to accomplish it.
!!! tip "Visual Generation"
When asking the director to create images, the generated visuals can appear in your scene feed as [inline visuals](/talemate/user-guide/inline-visuals). This is controlled by the **Auto-attach visuals** setting in the scene tools visualizer menu.
### Viewing action details
When the director performs an action, you can expand it to see exactly what was done:
@@ -112,6 +116,55 @@ When rejected, the director acknowledges and waits for your next instruction:
![Action Rejected](/talemate/img/0.33.0/director-chat-reject-0002.png)
## Enabling and Disabling Actions
!!! info "New in 0.35.0"
Action toggles were introduced in version 0.35.0.
The director has access to many different actions for querying information, making changes, and progressing your story. You can control which actions the director is allowed to use by enabling or disabling them through the Actions menu.
### Accessing the Actions Menu
Click the **Actions** button in the chat toolbar to open the actions menu.
![Director Chat Actions Menu](/talemate/img/0.35.0/director-chat-actions-menu.png)
### Available Actions
The menu lists all actions grouped by category. Each action has a checkbox indicating whether it is enabled (checked) or disabled (unchecked).
Actions are organized into groups:
| Group | Actions |
|-------|---------|
| **Direct Scene** | Advance time, Create new narration, Direct character action, Yield to User |
| **Query** | Query World Information, Retrieve Context Directly, Query Game State, Query Scene Direction |
| **Update Context** | Existing Characters, World Information, Story Configuration, Static History, Character Creation |
| **Gamestate** | Make Changes (game state variables) |
| **User Interaction** | Prompt for text input (Scene Direction only) |
| **Visuals** | Create new Image(s), Edit Image(s) |
| **Misc** | Directly Retrieve Context |
!!! note "Scene Direction Only Actions"
Some actions are only available during autonomous Scene Direction, not in Director Chat. The **Prompt for text input** action is one example - it allows the director to request information from you during autonomous direction but is not used during interactive chat sessions. See [Prompting the User for Input](/talemate/user-guide/agents/director/scene-direction/#prompting-the-user-for-input) for more details.
### Toggling Actions
Click on an action in the list to toggle it on or off:
- **Enabled** (checked): The director can use this action when responding to your requests
- **Disabled** (unchecked): The director will not use this action, even if it would be helpful
When you disable an action, the director will work around it by using other available actions or by informing you that it cannot complete the requested task.
### Locked Actions
Some actions may be marked as "locked" and cannot be disabled. These are core actions required for the director to function properly. Locked actions appear grayed out in the menu and cannot be toggled.
### Persistence
Your action toggle settings are saved with the scene. When you reload the scene later, your enabled and disabled actions will be restored automatically.
## Director personas
You can customize the director's personality and initial greeting by assigning a persona:


@@ -1,16 +1,30 @@
# Overview
The director agent is responsible for guiding the scene progression and generating dynamic actions.
In the future it will shift / expose more of a game master role, controlling the progression of the story.
The director agent serves as the game master for your scenes, guiding story progression and helping manage the creative experience. It provides several key features that work together to enhance your storytelling.
## Features
### Autonomous Scene Direction
!!! info "New in 0.35.0"
Autonomous Scene Direction replaces the previous Auto Direction feature with a more capable implementation.
Allows the director to autonomously progress your scene using the same actions available through Director Chat. The director analyzes the scene context and decides when and how to move the story forward.
A strong LLM (100B+) with reasoning capabilities is highly recommended for best results.
See the [Autonomous Scene Direction](/talemate/user-guide/agents/director/scene-direction) page for detailed information.
### Director Chat
A conversational interface for interacting with the director directly. You can ask questions, request changes, and guide story progression through natural language.
See the [Director Chat](/talemate/user-guide/agents/director/chat) page for more information.
### Dynamic Actions
Will occasionally generate clickable choices for the user during scene progression. This can be used to allow the user to make choices that will affect the scene or the story in some way without having to manually type out the choice.
Generates clickable choices for the user during scene progression. This allows you to make decisions that affect the scene or story without manually typing out your choice.
### Guide Scene
Will use the summarizer agent's scene analysis to guide characters and the narrator for the next generation, hopefully improving the quality of the generated content.
Uses the summarizer agent's scene analysis to guide characters and the narrator for the next generation, helping improve the quality and coherence of generated content.
### Auto Direction
A very experimental feature that will cause the director to attempt to direct the scene automatically, instructing actors or the narrator to move the scene forward according to the story and scene intention.
!!! note "Experimental"
This is the first iteration of this feature and is very much a work in progress. It will likely change substantially in the future.


@@ -0,0 +1,246 @@
# Autonomous Scene Direction
!!! info "New in 0.35.0"
Autonomous Scene Direction is a new feature introduced in version 0.35.0 that replaces the previous Auto Direction feature.
Autonomous Scene Direction allows the director agent to progress your scene automatically, using the same actions available through the [Director Chat](/talemate/user-guide/agents/director/chat). Instead of manually requesting the director to take actions, the director will analyze the scene and decide what actions to take on its own.
## Requirements
!!! warning "Strong LLM Recommended"
A strong language model (100B+ parameters) with reasoning capabilities is **highly recommended** for meaningful autonomous scene direction. While you may have some success with smaller 32B reasoning models, larger models will produce significantly better results.
See [Reasoning Model Support](/talemate/user-guide/clients/reasoning/) for information on enabling reasoning capabilities.
### Scene Intentions
Autonomous Scene Direction relies on having both a **story intention** and a **scene phase intention** set. Without these, the director lacks the context needed to make meaningful decisions about scene progression.
Set these in the [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction) section:
- **Overall Intention**: The big-picture goal and expectations for your story
- **Current Phase Intention**: The specific goal and context for the current scene
## Enabling Scene Direction
Scene Direction is disabled by default. To enable it:
1. Open the **Director** agent settings from the agents panel
2. Find the **Scene Direction** section
3. Toggle the feature **On**
![Director Scene Direction Settings](/talemate/img/0.35.0/director-scene-direction-settings.png)
!!! tip "Quick Toggle"
Scene Direction has a quick toggle in the agent settings panel, making it easy to turn on and off during play.
## How It Works
When Scene Direction is enabled, the director analyzes the scene after each turn and decides whether to take action. The director can:
- Instruct actors on what to do or say next
- Guide the narrator to progress the story
- Generate new content based on the scene's current state
- Prompt the user for information when needed
- Manage time progression in the story
- Create or modify world entries and characters
The director uses the same actions available in Director Chat, but makes autonomous decisions about when and how to use them based on:
- The overall story intention
- The current scene phase and intention
- The recent scene history
- The participation balance of characters and narrator
### Turn Balance
When **Maintain turn balance** is enabled (the default), the director tracks how often each character and the narrator have participated in recent scene history. This helps ensure:
- No single character dominates the conversation
- The narrator provides enough scene-setting and descriptions
- Neglected characters get opportunities to participate
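The balance heuristic can be pictured as a simple participation count over recent messages. The sketch below is illustrative only (the function and character names are hypothetical, not Talemate's internal API):

```python
from collections import Counter

def pick_underrepresented(recent_speakers: list[str], participants: list[str]) -> str:
    """Return the participant who has spoken least in recent history.

    recent_speakers: speaker of each recent message, oldest to newest.
    participants: all active characters plus the narrator.
    """
    counts = Counter(recent_speakers)
    # Prefer the participant with the fewest recent turns, giving
    # neglected characters (or the narrator) the next opportunity.
    return min(participants, key=lambda p: counts[p])

history = ["Elena", "Narrator", "Elena", "Elena"]
print(pick_underrepresented(history, ["Elena", "Marcus", "Narrator"]))  # Marcus
```

In practice the director weighs this balance against the story and scene intentions rather than following the count mechanically.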
### Prompting the User for Input
The director can prompt you for information during autonomous scene direction using the **Prompt for text input** action. This allows the director to request your input when it needs guidance or information to continue the story.
When the director uses this action, a text input dialog appears in your scene feed with a title and message explaining what information is needed. You can then type your response and submit it, or cancel the prompt if the input is optional.
![Prompt for Text Input Dialog](/talemate/img/0.35.0/director-prompt-user-dialog.png)
Common situations where the director might prompt you:
- Asking what you want to do next after a significant story event
- Requesting details for character creation when starting a new scene
- Seeking clarification on your intentions when the story could branch in multiple directions
The dialog includes:
- **Title**: A brief heading indicating the nature of the prompt
- **Body**: The full question or request from the director, which may be presented either in-character or out-of-character depending on the context
- **Text Input**: A field where you enter your response (single-line or multi-line depending on the prompt)
- **Continue Button**: Submit your response
- **Cancel Button**: Dismiss the prompt without responding (only available if the input is optional)
After you submit your response, the director receives your input and uses it to inform its next actions in the scene.
This action can be enabled or disabled through the Actions menu if you prefer the director not to prompt you directly. See [Actions Menu](#actions-menu) for details on managing available actions.
## Settings
### General Settings
#### Enable Analysis Step
When enabled, the director performs an internal analysis step before deciding on actions. This helps the director think through complex situations and plan more carefully.
**Recommendation**: Keep this enabled for more thoughtful scene direction.
#### Response Token Budget
Controls the maximum tokens the director can use for reasoning and response generation. Higher values allow for more detailed analysis. Default is 2048.
#### Max Actions Per Turn
The maximum number of actions the director can execute in a single turn. This prevents runaway action chains. Default is 5.
#### Retries
How many times to retry if the director produces a malformed response. Default is 1.
### Context Settings
#### Scene Context Ratio
Controls how the director's context budget is divided between scene context and direction history.
- **Lower values** (e.g., 0.30): 30% for scene context, 70% for direction history
- **Higher values** (e.g., 0.70): 70% for scene context, 30% for direction history
Default is 0.30.
#### Stale History Share
When the direction history needs to be compacted (summarized to save tokens), this controls what fraction is treated as "stale" and summarized versus kept verbatim.
- **Higher values**: Summarize more, keep less verbatim
- **Lower values**: Summarize less, keep more recent messages
Default is 0.70.
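Taken together, the two context settings carve up the budget and history with straightforward fractions. A minimal sketch of the arithmetic using the defaults (function names are hypothetical, not Talemate's implementation):

```python
def split_budget(total_tokens: int, scene_context_ratio: float = 0.30) -> tuple[int, int]:
    # Scene context gets the ratio's share of tokens;
    # direction history gets the remainder.
    scene = int(total_tokens * scene_context_ratio)
    return scene, total_tokens - scene

def split_history(messages: list[str], stale_share: float = 0.70) -> tuple[list[str], list[str]]:
    # The oldest stale_share fraction is summarized during compaction;
    # the most recent messages are kept verbatim.
    cutoff = int(len(messages) * stale_share)
    return messages[:cutoff], messages[cutoff:]

scene_budget, history_budget = split_budget(8192)
print(scene_budget, history_budget)  # 2457 5735
stale, fresh = split_history([f"msg{i}" for i in range(10)])
print(len(stale), len(fresh))  # 7 3
```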
### Turn Balance Settings
#### Maintain Turn Balance
When enabled, the director tracks participation of characters and narrator to encourage variety in scene direction.
### Custom Instructions
Add global instructions that will be included in all scene direction prompts across all scenes. Use this for preferences that should apply universally to how the director behaves, regardless of which scene is loaded.
!!! tip "Scene-Specific Instructions"
    For instructions tailored to a specific story or scene, use the **Director Instructions** field in the [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction) section instead. Those instructions are stored with the scene and are ideal for genre-specific guidance, story-specific rules, or scene-specific constraints.
### Director Persona
The director persona selected in the agent settings applies to scene direction as well as director chat. Choosing a different persona can significantly affect the director's tone, decision-making style, and how it approaches scene progression.
See [Director Personas](/talemate/user-guide/agents/director/chat/#director-personas) for more information on available personas and how to customize them.
## The Direction Tab
When a scene is loaded, you can view the director's autonomous actions in the **Director Console**. Click the **Direction** tab (bullhorn icon) to see:
![Director Console Scene Direction Tab](/talemate/img/0.35.0/director-console-direction-tab.png)
### Scene Type and Intention
At the top of the Direction tab, you can view and modify the current scene type and phase intention. Changes here affect how the director approaches scene progression.
### Direction Timeline
Below the scene settings is the **Direction Timeline**, which shows:
- The director's reasoning and analysis (if analysis step is enabled)
- Actions taken by the director
- Results of those actions
This provides full transparency into what the director is doing and why.
#### Clearing Direction History
You can clear the direction history using the **Clear** button. This will make the director "forget" previous actions taken during autonomous direction, but will not affect the scene history itself.
### Actions Menu
The **Actions** dropdown lets you enable or disable specific actions the director can use during scene direction. This gives you fine-grained control over what the director is allowed to do autonomously.
![Director Actions Menu](/talemate/img/0.35.0/director-actions-menu.png)
Some actions may be marked as "locked" and cannot be disabled; these are core actions required for scene direction to function.
For a complete list of available actions and their categories, see [Enabling and Disabling Actions](/talemate/user-guide/agents/director/chat/#enabling-and-disabling-actions) in the Director Chat documentation. The same actions are available for both Director Chat and Scene Direction, and your toggle settings are saved separately for each mode.
## Creating Custom Director Actions
Advanced users can create custom director actions using the Node Editor. This allows you to extend what the director can do during autonomous scene direction.
### Director Action Nodes
To create a custom action:
1. Open the **Node Editor** for your scene or module
2. Create a new **Director Chat Action** node
3. Define the action's:
- **Name**: The identifier for the action
- **Description**: What the action does (shown to the LLM)
- **Instructions**: Detailed instructions for how to use the action
4. Connect **Director Chat Sub Action** nodes to define specific behaviors within your action
5. Use **Director Chat Action Argument** nodes to define parameters the LLM can pass to your action
![Director Action Node Example](/talemate/img/0.35.0/director-action-node-example.png)
Sub-actions can be configured for:
- **Both** chat and scene direction modes
- **Chat only** - only available when chatting with the director
- **Scene Direction only** - only available during autonomous direction
### Sub-Action Properties
When creating a sub-action, you can configure:
- **Group**: Organizational group name for the action
- **Action Title**: Display name shown to users
- **Action ID**: Unique identifier
- **Description (Chat)**: Description shown when used in chat mode
- **Description (Scene Direction)**: Description shown when used in autonomous mode
- **Availability**: Which modes the action is available in
- **Force Enabled**: If true, prevents users from disabling this action
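Conceptually, these properties map to a small record. The sketch below is a hypothetical illustration of that shape (field names mirror the list above; this is not Talemate's internal schema):

```python
from dataclasses import dataclass

@dataclass
class SubAction:
    group: str                    # organizational group name
    title: str                    # display name shown to users
    action_id: str                # unique identifier
    description_chat: str         # shown when used in chat mode
    description_direction: str    # shown when used in autonomous mode
    availability: str = "both"    # "both", "chat", or "scene_direction"
    force_enabled: bool = False   # if True, users cannot disable this action

roll = SubAction(
    group="dice",
    title="Roll Dice",
    action_id="roll_dice",
    description_chat="Roll dice when the user asks for a check.",
    description_direction="Roll dice to resolve uncertain outcomes.",
)
print(roll.availability)  # both
```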
## Troubleshooting
### Director Takes No Actions
- Verify both story intention and scene phase intention are set
- Check that Scene Direction is enabled in agent settings
- Ensure Auto Progress is enabled if using the default game loop
### Actions Seem Random or Unhelpful
- Consider using a stronger reasoning model
- Review and refine your scene intentions
- Check if the wrong actions are enabled - disable those you do not want
### Director Keeps Using Same Character
- Enable **Maintain turn balance**
- Check if other characters are properly set up in the scene
- Review character availability settings
### Performance Is Slow
- Reduce **Max actions per turn**
- Lower **Response token budget**
- Consider a faster model or API endpoint

---

If `Direction` is selected, the actor will be given the direction as a direct instruction.
If `Inner Monologue` is selected, the actor will be given the direction as a thought.
###### Direction Stickiness
!!! info "New in 0.35.0"
Controls how many scene messages the system looks back when retrieving character directions. This determines how long directions "stick" and continue to influence character behavior.
- **Range**: 1 to 20
- **Default**: 5
When you direct an actor, that direction doesn't just apply to their next response—it persists across multiple turns based on this setting. For example, with a stickiness of 5, a direction to "act suspiciously" will continue to influence the character's behavior for up to 5 relevant scene messages.
!!! note "Time passage clears directions"
    Directions are automatically cleared when time passes in the scene. This ensures that directions given in one scene segment don't inappropriately carry over into a new time period.
## Long Term Memory
--8<-- "docs/snippets/tips.md:agent_long_term_memory_settings"
If enabled, the director will guide the narrator in the scene.

The maximum number of tokens for the guidance. (e.g., how long should the guidance be).

## Scene Direction

!!! info "New in 0.35.0"
    Scene Direction replaces the previous Auto Direction feature with significantly enhanced capabilities.

Autonomous Scene Direction allows the director to progress scenes automatically using the same actions available in Director Chat.

For detailed information, see the dedicated [Autonomous Scene Direction](/talemate/user-guide/agents/director/scene-direction) documentation page.

![Director Scene Direction Settings](/talemate/img/0.35.0/director-scene-direction-settings.png)

Story and scene intentions are set in the [Scene Direction](/talemate/user-guide/world-editor/scene/direction) section of the World Editor.

##### Enable Scene Direction

Toggle to enable or disable autonomous scene direction. This feature is disabled by default.

!!! warning "Strong LLM Required"
    A strong language model (100B+) with reasoning capabilities is highly recommended. See [Reasoning Model Support](/talemate/user-guide/clients/reasoning/).

##### Enable Analysis Step

When enabled, the director performs an internal analysis step before deciding on actions.

##### Response Token Budget

Maximum tokens for director reasoning and response generation. Default is 2048.

##### Max Actions Per Turn

Maximum number of actions the director can execute per turn. Default is 5.

##### Retries

Retry count for malformed responses. Default is 1.

##### Scene Context Ratio

Balance between scene context and direction history in the token budget. Default is 0.30 (30% scene, 70% history).

##### Stale History Share

When compacting direction history, this fraction is summarized versus kept verbatim. Default is 0.70.

##### Maintain Turn Balance

Track character and narrator participation to encourage variety in scene direction.

##### Custom Instructions

Custom instructions included in all scene direction prompts to guide the director's behavior.

## Character Management

The Character Management settings control how the director handles character creation and related tasks.

![Director Character Management Settings](/talemate/img/0.35.0/director-character-management-settings.png)

### Character Creation

!!! info "New in 0.35.0"
    The **Limit character attributes** setting is new in version 0.35.0.

##### Limit character attributes

Controls the maximum number of attributes that will be generated when creating or updating character sheets. This applies when the director creates new characters or when character sheets are generated through templates.

- **0** (default): No limit - attributes are generated without restriction
- **1-40**: Limits the character sheet to this many attributes

When a limit is set, the AI is instructed to generate no more than the specified number of attributes, and any excess attributes are trimmed during processing.

This setting is useful when you want to keep character sheets concise, or when working with characters that might otherwise generate an excessive number of attributes.

### Persisting Characters

##### Assign Voice (TTS)

If enabled, the director will automatically assign a text-to-speech voice when creating a new character. This requires the TTS agent to be enabled and configured with available voices.

### Generating Visuals

##### Generate Visuals

If enabled, the director is allowed to generate visual assets (portraits, cover images) for characters when requested.
## Director Chat

---

You can manage your available embeddings through the application settings.

In the settings dialogue go to **:material-tune: Presets** and then **:material-cube-unfolded: Embeddings**.

!!! warning "INSTRUCTOR Embeddings Removed (0.35.0)"
    INSTRUCTOR embeddings are no longer supported. If you were using INSTRUCTOR embeddings, your configuration has been automatically reset to use the default embedding model (all-MiniLM-L6-v2).

    **Alternatives:**

    - **all-MiniLM-L6-v2** (default) - Fast local embedding, good for most use cases
    - **Alibaba-NLP/gte-base-en-v1.5** - More accurate local embedding
    - **OpenAI text-embedding-3-small** - Cloud-based option (requires API key)
    - **KoboldCpp Client API** - Use an embedding model loaded in KoboldCpp (see [KoboldCpp Embeddings](koboldcpp.md))

    Your existing scene memory databases will be re-imported automatically when you load them with the new embedding configuration.

<!--- --8<-- [start:embeddings_setup] -->

## Pre-configured Embeddings

### all-MiniLM-L6-v2

Fast, but the least accurate.

### Alibaba-NLP/gte-base-en-v1.5

Sentence transformer model that is decently fast and accurate and will likely become the default for the Memory agent in the future.
### OpenAI text-embedding-3-small
OpenAI's current text embedding model. Fast and accurate, but not free.

---

The narrator agent handles the generation of narrative text. This could be progressing the story, describing the scene, or providing exposition and answers to questions.

## Scene Intention Awareness

When you set a **story intention** and **scene phase intention** in the [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction), the narrator automatically incorporates this information into its prompts. This helps the narrator understand both the big-picture goals of your story and the specific objectives of the current scene.

### How It Works

The narrator receives the following context when generating narrative:

- **Story Intention**: Your overarching expectations for the experience (tone, themes, pacing)
- **Scene Type**: The current mode of play (e.g., roleplay, combat, investigation)
- **Scene Phase Intention**: The specific goal and context for what's happening now

With this information, the narrator can better align its output with your creative vision. For example, if your scene intention indicates building tension before a reveal, the narrator will lean into that atmosphere rather than rushing to resolution.

### Setup

To take advantage of scene intention awareness:

1. Open the **World Editor** and navigate to **Scene > Direction**
2. Set an **Overall Intention** describing the story's goals and expectations
3. Set a **Scene Type** and **Current Scene Intention** for the current phase

Both the overall intention and current scene intention should be set for best results. Without them, the narrator generates content without this additional guidance.

For more details on configuring scene direction, see [World Editor - Scene Direction](/talemate/user-guide/world-editor/scene/direction).

## Content

### :material-script: Writing Style

The narrator agent can be influenced by one of your writing style templates.

Make sure a writing style is selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) to apply the writing style to the generated content.

---

![Narrator agent content settings](/talemate/img/0.29.0/narrator-content-settings.png)

Content settings control what contextual information is included in the prompts sent to the AI when generating narration.

##### Use Scene Intent

When enabled (default), the [scene intent](/talemate/user-guide/world-editor/scene/direction) (overall intention) will be included in the narration prompt. This helps the AI generate narrative content that aligns with your story goals and the current scene direction.

Disable this if you want the AI to generate narration without being influenced by the scene direction settings.

##### Use Writing Style

When enabled (default), the writing style selected in the [Scene Settings](/talemate/user-guide/world-editor/scene/settings) will be applied to the generated narration.

Disable this if you want the AI to generate narration without following the scene's writing style template.
## :material-clock-fast: Narrate time passage

---

These nodes should be connected to your prompt encoding nodes (for Qwen Image Edit).
![Three identical interface nodes labeled "Talemate Reference 1," "2," and "3" are arranged horizontally within a dark-themed node-based editor. Each node features output ports for "IMAGE" and "MASK," along with a file selection field showing "image_qwen_image_edit" and a "choose file to upload" button. Blue and red connection wires link these nodes to other off-screen elements in the workflow.](/talemate/img/0.34.0/comfyui.workflow.setup.talemate-references.png)
### Automatic Deactivation of Unused Reference Nodes
Talemate automatically handles situations where your workflow contains more reference nodes than you provide images for. When you run a generation:
- If you provide fewer reference images than the workflow supports, the unused reference nodes are automatically disconnected from the workflow graph
- If you provide no reference images at all, all reference nodes are disconnected
This means you can use a single image editing workflow for both text-to-image generation and image editing operations. For example, if you configure `qwen_image_edit.json` as your image editing backend:
- When you generate with reference images, those images are uploaded and connected to the appropriate reference nodes
- When you generate without reference images (pure text-to-image), all reference nodes are disconnected automatically, allowing the workflow to run as a standard text-to-image workflow
This behavior prevents errors that would otherwise occur if ComfyUI tried to process reference nodes without actual images loaded into them. You do not need to create separate workflows for text-to-image and image editing - a single workflow with reference nodes can serve both purposes, assuming the model supports it (qwen-image-edit 2511 seems to).
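Since ComfyUI API-format workflows are JSON graphs keyed by node id, with node inputs linking to other nodes as `[node_id, output_index]` pairs, the deactivation step can be pictured as pruning unused reference nodes and their links before submission. A simplified sketch (illustrative only, not Talemate's actual code; node ids and classes are hypothetical):

```python
import copy

def prune_unused_references(workflow: dict, reference_ids: list[str],
                            provided_images: int) -> dict:
    """Drop reference nodes that have no image to load, plus links to them.

    workflow: ComfyUI API-format graph (node id -> node definition).
    reference_ids: ids of the reference nodes, in order.
    """
    pruned = copy.deepcopy(workflow)
    unused = reference_ids[provided_images:]
    for node_id in unused:
        pruned.pop(node_id, None)
    # Disconnect inputs in remaining nodes that pointed at a removed node.
    for node in pruned.values():
        node["inputs"] = {
            key: value
            for key, value in node["inputs"].items()
            if not (isinstance(value, list) and value and value[0] in unused)
        }
    return pruned

workflow = {
    "1": {"class_type": "LoadImage", "inputs": {}},  # reference node 1
    "2": {"class_type": "LoadImage", "inputs": {}},  # reference node 2
    "3": {"class_type": "TextEncode", "inputs": {"image": ["2", 0], "text": "prompt"}},
}
pruned = prune_unused_references(workflow, ["1", "2"], provided_images=1)
print(sorted(pruned))  # ['1', '3']
```

With zero provided images, every reference node is removed, leaving a plain text-to-image graph.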
### Saving and Exporting the Workflow
Once your workflow is configured, you need to save it and export it in the API format for Talemate to use it.

---

The Visualizer agent supports multiple image generation backends.
- **Multiple Backend Support**: Works with various image generation services including Google, ComfyUI, AUTOMATIC1111, OpenAI, and more
- **Style Templates**: Configure different visual styles for different types of content (character cards, portraits, scene backgrounds, etc.)
- **Visual Library Integration**: Generated images are managed through the Visual Library, where you can organize, iterate, and save visual assets
- **[Inline Visuals](../../inline-visuals.md)**: Generated images can appear directly in your scene feed alongside messages, providing an immersive visual storytelling experience (new in v0.35.0)
- **Automatic Generation**: Optionally allow the agent to automatically generate visual content based on scene context
- **Prompt Generation**: Supports both direct prompts and natural language instructions that incorporate character and scene context
Quick shortcuts are available through the scenario tools menu.
- **Visualize Scene**: Generate images of the current scene environment
- **Visualize Character**: Generate character portraits or cards
- **Visualize Moment**: Generate scene illustrations depicting the current story moment
These shortcuts support keyboard modifiers: hold **ALT** to generate prompts only (without creating images), or hold **CTRL** to provide custom instructions.
When **Auto-attach visuals** is enabled in the visualizer menu, generated images will automatically appear in your scene feed as [inline visuals](../../inline-visuals.md). You can configure the display size and behavior of these images in the [Appearance Settings](../../app-settings/appearance.md#visuals).
### Director Chat

---

![This image displays a dark user interface header labeled "Visualizer," accented with a chromatic aberration effect and a green status dot. Below the title, there are two badges: one labeled "Google" with a monitor icon, and a warning badge featuring a triangle alert symbol that reads "No backend configured."](/talemate/img/0.34.0/visual-agent-general-1.png)
The Visualizer agent settings are organized into three main sections: **General**, **Prompt Generation**, and **Styles**. Additionally, each backend may have its own configuration options, including [resolution presets](#resolution-presets) for local image generation backends.
![A dark-mode settings interface for a 'Visualizer' tool displaying the 'General' configuration tab. It features dropdown menus showing 'Google' selected as the client with no backends currently configured, alongside a slider for image generation timeout and checkboxes for automatic setup options.](/talemate/img/0.34.0/visual-agent-general-2.png)
When enabled, allows the Visualizer agent to automatically generate visual content.
This setting is disabled by default, giving you full control over when images are generated.
## Prompt Generation
The Prompt Generation section contains settings that control how image prompts are created and refined before being sent to the image generation backend.
### Fallback Prompt Type
Determines the format used for prompt-only generation when no backends are configured. This setting only affects the output format when generating prompts without actually creating images.
Available options:
- **Keywords**: Generates prompts using keyword-based formatting
- **Descriptive**: Generates prompts using descriptive text formatting
### Max. Prompt Generation Length
Controls the maximum token length for AI-generated image prompts. When you use Instruct mode or any feature that asks the AI to create a prompt for your image, this setting limits how long that generated prompt can be.
The default is 1024 tokens, and can be adjusted from 512 to 4096 tokens.
Both keyword-style and descriptive prompts are always generated together, so this limit must accommodate both formats.
### Automatic Analysis of References
When enabled, reference images that lack analysis data will be automatically analyzed before being used in prompt generation. This ensures that the AI has detailed information about your reference images when creating prompts, which can lead to better results when generating variations or editing images.
#### How It Works
Normally, you analyze images manually using the **Analyze** button in the [Visual Library](visual-library.md#image-analysis). The analysis text captures details about the image content, which can then be used during prompt generation to help the AI understand what your reference images contain.
With **Automatic Analysis of References** enabled, any reference images that don't already have analysis data will be analyzed on-the-fly when you start a generation. This is particularly useful when:
- You have uploaded images that haven't been analyzed yet
- You're using newly saved images as references before analyzing them
- You want to ensure all references have analysis data without manually analyzing each one
The analysis results are saved to the asset metadata, so each image only needs to be analyzed once. Future generations using the same reference will use the cached analysis.
#### Interaction with Prompt Revision
This setting works in conjunction with **Perform Extra Revision of Editing Prompts** (below). When both settings are enabled and you're generating with references:
1. First, any unanalyzed references are automatically analyzed
2. Then, the prompt revision step uses this analysis data to refine your prompt
If you only enable prompt revision without automatic analysis, the revision step will still work but may have less information about unanalyzed references to work with.
!!! warning "Additional AI Queries"
    Enabling this option adds one AI query per unanalyzed reference image. You must have an image analysis backend configured for this feature to work.
This setting is disabled by default.
### Perform Extra Revision of Editing Prompts
When enabled, the AI will refine and simplify image editing prompts based on the provided reference images. This additional processing step can improve generation results by better aligning the prompt with your selected references.
This revision step analyzes the reference images (using their analysis data or tags) and adjusts the prompt to:
- Reference characters or elements by their image number instead of re-describing them
- Preserve the scene composition and setting from your original prompt
- Maintain important context like actions, positioning, and mood
- Only describe differences from the reference images when needed
For example, instead of generating a prompt like "Elena, a tall woman with red hair and green eyes wearing a blue dress, stands in a dimly lit tavern," the revised prompt might become "Elena (IMAGE 1) stands in a dimly lit tavern, looking worried" - letting the reference image provide the character's appearance while your prompt defines the scene.
!!! warning "Additional AI Query"
    This adds an extra AI query to the prompt generation process when reference images are provided.
This setting is enabled by default.
## Styles Configuration
![This image shows the "Styles" configuration tab within the dark-mode "Visualizer" interface, listing various dropdown menus for customizing generation settings. Options include "Art Style," set to "Digital Art," alongside selectors for character cards, portraits, and scene elements. A "Manage styles" box at the bottom indicates that additional styles can be created in the Templates manager.](/talemate/img/0.34.0/visual-agent-general-3.png)
Each style template can include:
- Instructions (specific generation instructions)
These styles are applied automatically when generating images based on the visual type you select.
## Resolution Presets
Local image generation backends (ComfyUI, SD.Next, and AUTOMATIC1111) include a resolution preset picker that lets you quickly select appropriate image dimensions for your generated images. This feature appears in each backend's configuration section.
### How It Works
The resolution preset picker provides settings for three aspect ratios that match the format options available during image generation:
- **Square**: Used for character portraits and icons (e.g., `CHARACTER_PORTRAIT` visual type)
- **Portrait**: Used for tall images like character cards (e.g., `CHARACTER_CARD`, `SCENE_CARD` visual types)
- **Landscape**: Used for wide images like scene backgrounds (e.g., `SCENE_BACKGROUND`, `SCENE_ILLUSTRATION` visual types)
Each resolution setting displays two number fields for width and height, along with a dropdown menu button that reveals available presets.
### Available Presets
The preset picker includes resolution options optimized for different model types:
| Preset | Square | Portrait | Landscape |
|--------|--------|----------|-----------|
| **SD 1.5** | 512 x 512 | 512 x 768 | 768 x 512 |
| **SDXL** | 1024 x 1024 | 832 x 1216 | 1216 x 832 |
| **Qwen Image** | 1328 x 1328 | 928 x 1664 | 1664 x 928 |
| **Z-Image Turbo** | 2048 x 2048 | 1088 x 1920 | 1920 x 1088 |
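The preset table above can be captured as a simple lookup. The dictionary layout below is a sketch for clarity, not Talemate's actual configuration format:

```python
# (width, height) pairs taken from the preset table; keys are illustrative.
RESOLUTION_PRESETS = {
    "SD 1.5":        {"square": (512, 512),   "portrait": (512, 768),   "landscape": (768, 512)},
    "SDXL":          {"square": (1024, 1024), "portrait": (832, 1216),  "landscape": (1216, 832)},
    "Qwen Image":    {"square": (1328, 1328), "portrait": (928, 1664),  "landscape": (1664, 928)},
    "Z-Image Turbo": {"square": (2048, 2048), "portrait": (1088, 1920), "landscape": (1920, 1088)},
}

def preset_resolution(model: str, aspect: str) -> tuple[int, int]:
    """Look up (width, height) for a model preset and aspect ratio."""
    return RESOLUTION_PRESETS[model][aspect]
```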
### Selecting a Preset
To select a resolution preset:
1. Open the Visualizer agent settings
2. Navigate to the backend configuration tab (ComfyUI, SD.Next, or AUTOMATIC1111)
3. Find the **Resolutions** section
4. Click the dropdown menu button next to the resolution you want to change
5. Select the appropriate preset for your model
You can also manually enter custom width and height values by typing directly into the number fields.
!!! note "Backend-Specific Settings"
Resolution presets are configured separately for text-to-image and image editing operations if you're using a backend that supports both (like ComfyUI or SD.Next). This allows you to use different resolutions for each type of generation.
!!! note "Cloud Backends"
Cloud-based backends (Google, OpenAI, OpenRouter) do not have resolution presets because they use fixed or automatically determined resolutions based on the model's capabilities.

Assets can be configured for use as references in future generations.
Tags are particularly useful for organizing large asset libraries and can be used to filter assets in the sidebar.
### Cover Crop
The **Cover crop** tab allows you to define a crop region for an image that will be applied whenever the image is used as a cover image (for characters or scenes). This is useful when you have a wide or tall image where only a specific portion should be displayed in the cover image area.
![Cover crop tab showing the crop editor interface](/talemate/img/0.35.0/visual-library-cover-crop-1.png)
Cover images appear at the top of scenes or as character cards, and the crop ensures the most important part of your image is visible in those displays.
#### Setting Up a Crop Region
To define a crop region:
1. Select an asset from the Scene Assets tree to open it for editing
2. Click the **Cover crop** tab in the metadata panel
3. On the image preview, drag to draw a rectangular crop region
4. Adjust the region as needed:
- **Move**: Drag inside the crop box to reposition it
- **Resize**: Drag any of the four corner handles to resize the crop region
5. Click **Save** to save your changes
![Cover crop editor with a crop region defined](/talemate/img/0.35.0/visual-library-cover-crop-2.png)
The area outside your crop region appears dimmed, giving you a preview of what will be visible when the image is used as a cover.
#### Resetting the Crop
To remove a custom crop and use the full image, click the **Reset** button in the top-right corner of the image preview. This sets the crop region to encompass the entire image.
#### When is the Crop Applied?
The crop region is automatically applied when:
- The image is set as a **scene cover image** and displayed in the scene header
- The image is set as a **character cover image** and displayed in the character panel
- The image appears in any other context that uses the cover image display
The original image file is never modified. The crop is applied dynamically when the image is displayed.
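A minimal sketch of the crop-region arithmetic implied above (function names are assumptions, not Talemate's internal API): resetting selects the full image, and a stored region can be clamped to the image bounds before display:

```python
def full_image_crop(width: int, height: int) -> tuple[int, int, int, int]:
    """The 'Reset' behavior: a crop region covering the entire image."""
    return (0, 0, width, height)

def clamp_crop(crop: tuple[int, int, int, int], width: int, height: int) -> tuple[int, int, int, int]:
    """Clamp a (left, top, right, bottom) region to the image bounds."""
    left, top, right, bottom = crop
    left = max(0, min(left, width))
    top = max(0, min(top, height))
    right = max(left, min(right, width))
    bottom = max(top, min(bottom, height))
    return (left, top, right, bottom)
```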
## Image Analysis
Image analysis uses AI to extract detailed information from images. This is useful for:
!!! note "Image Analysis Backend Required"
Image analysis requires the Image Analysis backend to be configured and available. If the backend is not configured, the Analyze button will be disabled or unavailable. Make sure you have an image analysis backend set up in your visual agent configuration before attempting to analyze images.
!!! tip "Automatic Analysis During Generation"
If you prefer not to manually analyze each image, you can enable **Automatic Analysis of References** in the [Visualizer settings](settings.md#automatic-analysis-of-references). When enabled, any reference images lacking analysis data will be automatically analyzed before prompt generation.
![A tooltip appearing above the "Analyze" button displays the message: "Analyze the image using AI. (Ctrl: set instructions)." The button is part of a dark-mode toolbar, positioned next to a partially visible "Set Cover" option.](/talemate/img/0.34.0/visual-library-21.png)
Click **Analyze** to perform a quick analysis with default settings, or hold **Ctrl** (or **Cmd** on Mac) to open a dialog where you can specify custom analysis instructions.

In 0.32.0 Talemate's TTS (Text-to-Speech) agent has been completely refactored.
- **[Kokoro](kokoro.md)** - Fastest generation with predefined voice models and mixing
- **[F5-TTS](f5tts.md)** - Fast voice cloning with occasional mispronunciations
- **[Chatterbox](chatterbox.md)** - High-quality voice cloning (slower generation)
- **[Pocket TTS](pocket-tts.md)** - CPU-based voice cloning from Kyutai (no GPU required)
### Remote APIs
- **[ElevenLabs](elevenlabs.md)** - Professional voice synthesis with voice cloning

# Pocket TTS
Pocket TTS is a local CPU-based text-to-speech model from [Kyutai](https://kyutai.org/) that supports voice cloning from audio files. Unlike other local TTS options that require a GPU, Pocket TTS runs efficiently on your CPU, making it accessible on a wider range of hardware.
![Pocket TTS API settings](/talemate/img/0.35.0/pocket-tts-api-settings.png)
## Key Features
- **CPU-only** - No GPU required, runs on standard computer hardware
- **Voice cloning** - Clone voices from short audio samples (.wav files)
- **Low resource usage** - Uses only 2 CPU cores with a small 100M parameter model
- **Built-in voices** - Includes several ready-to-use voice samples
- **English only** - Currently supports English language generation
## First-Time Setup
The first time you generate audio with Pocket TTS, it will automatically download the model weights. This is a one-time download.
!!! warning "Voice Cloning Access"
Voice cloning requires accepting the model terms on Hugging Face. If voice cloning downloads are blocked:
1. Visit the [Pocket TTS model page](https://huggingface.co/kyutai/pocket-tts) and accept the terms
2. Create a [Hugging Face access token](https://huggingface.co/settings/tokens)
3. Set the token in your environment as `HF_TOKEN`
4. Restart Talemate
## Configuration
##### Variant
The model variant identifier. The default `b6369a24` is the current recommended version.
##### Temperature
Controls voice variation during generation. Higher values (e.g., 1.0) produce more varied but potentially less stable output. Lower values (e.g., 0.5) produce more consistent results. Default is 0.7.
##### LSD Decode Steps
Number of decoding steps. Higher values can improve quality but increase generation time. Default is 1.
##### Noise Clamp
When set above 0, limits noise sampling to prevent extreme values. 0 disables clamping. Default is 0.
##### EOS Threshold
End-of-sequence detection threshold. Controls when the model stops generating audio. Default is -4.0.
##### Frames After EOS
Number of additional audio frames to generate after detecting the end of speech. 0 uses automatic detection. Default is 0.
##### Chunk Size
Text is split into chunks of this size for processing. Smaller values increase responsiveness but may affect natural flow between chunks. 0 disables chunking. Default is 256.
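The chunking semantics can be sketched as follows. The real splitter likely respects word or sentence boundaries, so this naive fixed-size version only illustrates the size and disable behavior:

```python
def chunk_text(text: str, chunk_size: int = 256) -> list[str]:
    """Split text into fixed-size chunks; chunk_size <= 0 disables chunking."""
    if chunk_size <= 0:
        return [text]
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```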
## Built-in Voices
Talemate includes several ready-to-use Pocket TTS voices. These are available immediately without any additional setup:
| Voice | Description |
|-------|-------------|
| Eva | Female, calm, mature, thoughtful |
| Lisa | Female, energetic, young |
| Adam | Male, calm, mature, thoughtful, deep |
| Bradford | Male, calm, mature, thoughtful, deep |
| Julia | Female, calm, mature |
| Zoe | Female |
| William | Male, young |
These voices use audio samples located in the `tts/voice/pocket_tts/` folder within your Talemate installation.
## Adding Custom Voices
### Voice Requirements
Pocket TTS voices use audio files as reference prompts for voice cloning:
- Audio file in .wav format
- Clear speech with minimal background noise
- Single speaker throughout the sample
### Creating a Voice
1. Open the Voice Library
2. Click **:material-plus: New**
3. Select "Pocket TTS" as the provider
4. Configure the voice:
![Add Pocket TTS voice](/talemate/img/0.35.0/add-pocket-tts-voice.png)
**Label:** A descriptive name for the voice (e.g., "Sarah - Warm Female")
**Voice ID / Upload File:** You have three options:
- Upload a .wav file containing the voice sample; the uploaded file becomes the voice ID
- Enter a path to a local .wav file (relative to the Talemate workspace, or an absolute path)
- Enter a Hugging Face URL in the format `hf://kyutai/tts-voices/...`
**Tags:** Add descriptive tags (gender, age, style) for organization and filtering
### Extra Voice Parameters
![Pocket TTS extra voice parameters](/talemate/img/0.35.0/pocket-tts-parameters.png)
##### Truncate Prompt Audio
When enabled, truncates the voice prompt audio to 30 seconds when extracting the voice characteristics. This can help prevent memory issues with very long audio samples.
## Using Hugging Face Voice Catalog
Kyutai provides a catalog of voices on Hugging Face that you can use directly with Pocket TTS. To use a voice from the catalog:
1. Visit the [Kyutai voice catalog](https://huggingface.co/kyutai/tts-voices)
2. Find a voice you want to use
3. Copy the voice path
4. In Talemate, create a new Pocket TTS voice and enter the path as the Voice ID in the format: `hf://kyutai/tts-voices/voice-name/file.wav`
## Troubleshooting
### Model Download Issues
If the model fails to download:
- Check your internet connection
- Verify you have accepted the terms on [Hugging Face](https://huggingface.co/kyutai/pocket-tts)
- Make sure your `HF_TOKEN` environment variable is set correctly
- Try restarting Talemate
### Voice Cloning Not Working
If you can use built-in voices but voice cloning fails:
- Voice cloning requires accepting additional terms on Hugging Face
- Follow the First-Time Setup instructions above to configure your Hugging Face token
### Generation Quality Issues
If the generated audio sounds unusual:
- Try adjusting the Temperature setting - lower values produce more consistent results
- Ensure your voice reference audio is clear with minimal background noise
- Try using a shorter audio sample (5-15 seconds often works well)

Select which TTS APIs to enable. You can enable multiple APIs simultaneously:
- **Kokoro** - Fastest generation with predefined voice models and mixing
- **F5-TTS** - Fast voice cloning with occasional mispronunciations
- **Chatterbox** - High-quality voice cloning (slower generation)
- **Pocket TTS** - CPU-based voice cloning from Kyutai (no GPU required)
- **ElevenLabs** - Professional voice synthesis with voice cloning
- **Google Gemini-TTS** - Google's text-to-speech service
- **OpenAI** - OpenAI's TTS-1 and TTS-1-HD models

Check the provider-specific documentation for more information on how to configure voices.
- Specify reference text for better quality
- Adjust speed and other parameters
**Pocket TTS:**
- Upload .wav reference files for voice cloning
- Use built-in voice samples included with Talemate
- Use voices from the Hugging Face voice catalog
- Runs on CPU (no GPU required)
**Kokoro:**
- Select from predefined voice models

If enabled, the proposed changes will be presented as suggestions to the player.
##### Player character
Enable this to have the player character be included in the progression checks.
## Character Portraits
!!! info "New in 0.35.0"
Character portrait features allow automatic portrait selection and generation based on scene context.
![World state agent character portraits settings](/talemate/img/0.35.0/world-state-character-portraits-settings.png)
The Character Portraits settings control how character avatars are displayed alongside dialogue messages and whether they should change automatically based on the scene context.
### Portrait Selection
##### Selection Frequency
Controls how often the World State Agent evaluates which portrait to use for a character based on the current scene context.
- **0**: Never automatically select portraits (portraits must be changed manually)
- **1**: Evaluate with every new message
- **2-10**: Evaluate every N messages
When a message is generated, the agent examines the content and context of the scene, then compares it against the tags associated with each portrait to find the best match.
!!! note "Minimum Portraits Required"
A character needs at least 2 portraits in their visual configuration for automatic selection to activate. You can manage portraits in the [World Editor under Character > Visuals > Portrait](/talemate/user-guide/world-editor/characters/visuals/#portrait).
!!! tip "Tag Your Portraits"
The selection algorithm relies on portrait tags to make decisions. Portraits without tags cannot be intelligently selected. Add descriptive tags like "happy", "sad", "angry", "combat", "formal" to each portrait using the Visual Library.
### Generate New Portraits
##### Generate New Portraits
When enabled, the World State Agent can request the Director to generate new portraits when no suitable portrait is found for the current scene context.
For example, if a character is described as wearing formal attire at a party but no existing portrait shows them in formal wear, the system can automatically commission a new portrait showing the appropriate appearance.
!!! warning "Prerequisites"
This feature requires:
- The Director's **Character Management > Generate Visuals** setting to be enabled
- A Visual Agent with an image generation backend configured
When a new portrait is generated, it is automatically added to the character's portrait collection and tagged based on the scene context.
