0.29.0 (#167)

* set 0.29.0
* tweaks for dig layered history (wip)
* move director agent to directory
* relock
* remove "none" from dig_layered_history response
* determine character development
* update character sheet from character development (wip)
* org imports
* alert outdated template overrides during startup
* editor controls normalization of exposition
* dialogue formatting refactor
* fix narrator.clean_result forcing * regardless of editor fix exposition setting
* move more of the dialogue cleanup logic into the editor fix exposition handlers
* remove cruft
* change to normal selects and add some margin
* move formatting option up
* always strip partial sentences
* separates exposition fixes from other dialogue cleanup operations, since we still want those
* add novel formatting style
* honor formatting config when no markers are supplied
* fix issue where sometimes character message formatting would miss character name
* director can now guide actors through scene analysis
* style fixes
* typo
* select correct system message on direction type
* prompt tweaks
* disable by default
* add support for dynamic instruction injection and include missing guide for internal note usage
* change favicon and also indicate business through favicon
* img
* support xtc, dry and smoothing in text gen webui
* prompt tweaks
* support xtc, dry, smoothing in koboldcpp client
* reorder
* dry, xtc and smoothing factor exposed to tabby api client
* urls to third party API documentation
* remove bos token
* add missing preset
* focal
* focal progress
* focal progress and generated suggestions progress
* fix issue with discard all suggestions
* apply suggestions
* move suggestion ux into the world state manager
* support generation options for suggestion generation
* unused import
* refactor focal to json based approach
* focal and character suggestion tweaks
* remove cruft
* remove cruft
* relock
* prompt tweaks
* layout spacing updates
* ux elements for removal of scenes from quick load menu
* context investigation refactor WIP
* context investigation refactor
* context investigation refactor
* context investigation refactor
* cleanup
* move scene analysis to summarizer agent
* remove deprecated context investigation logic
* context investigation refactor continued - split into separate file for easier maint
* allow direct specification of response context length
* context investigation and scene analyzation progress
* change analysis length config to number
* remove old dig-layered-history templates
* summarizer - deep analysis is only available if there is layered history
* move world_state agent to dedicated directory
* remove unused imports
* automatic character progression WIP
* character suggestions progress
* app busy flag based on agent business
* indicate suggestions in world state overview
* fix issue with user input cleanup
* move conversation agent to a dedicated submodule
* Response in action analyze_text_and_extract_context is too short #162
* move narrator agent to its own submodule
* narrator improvements WIP
* narration improvements WIP
* fix issue with regen of character exit narration
* narration improvements WIP
* prompt tweaks
* last_message_of_type can set max iterations
* fix multiline parsing
* prompt tweaks
* director guide actors based of scene analysis
* director guidance for actors
* prompt tweaks
* prompt tweaks
* prompt tweaks
* fix automatic character proposals not propagating to the ux
* fix analysis length
* support director guidance in legacy chat format
* typo
* prompt tweaks
* prompt tweaks
* error handling
* length config
* prompt tweaks
* typo
* remove cruft
* prompt tweak
* prompt tweak
* time passage style changes
* remove cruft
* deep analysis context investigations honor call limit
* refactor conversation agent long term memory to use new memory rag mixin - also streamline prompts
* tweaks to RAG mixin agent config
* fix narration highlighting
* context investigation fixes director narration guidance summarization tweaks
* director guide narration progress context investigation fixes that would cause looping of investigations and failure to dig into the correct layers
* prompt tweaks
* summarization improvements
* separate deep analysis chapter selection from analysis into its own prompt
* character entry and exit
* cache analysis per subtype and some narrator prompt tweaks
* separate layered history logic into its own summarizer mixin and expose some additional options
* scene can now set an overall writing style using writing style templates narrator option to enable writing style
* narrate query writing style support
* scene tools - narrator actions refactor to handler and own component
* narrator query / look at narrations emitted as context investigation messages refactor context investigation message display scene message meta data object
* include narrative direction
* improve context investigation message prompt insert
* reorg supported parameters
* fix bug when no message history exists
* WIP make regenerate work nicely with director guidance
* WIP make regenerate work nicely with director guidance
* regenerate conversation fixes
* help text
* ux tweaks
* relock
* turn off deep analysis and context investigations by default
* long term memory options for director and summarizer
* long term memory caching
* fix summarization cache toggle not showing up in ux
* ux tweaks
* layered history summarization includes character information for mentioned characters
* deepseek client added
* Add fork button to narrator message
* analyze and guidance support for time passage narration
* cache based on message fingerprint instead of id
* configurable system prompts WIP
* configurable system prompts WIP
* client overrides for system prompts wired to ux
* system prompt overhaul
* fix issue with unknown system prompt kind
* add button to manually request dynamic choices from the director move the generate choices logic of the director agent to its own submodule
* remove cruft
* 30 may be too long and is causing the client to disappear temporarily
* support dynamic choice generate for non player characters
* enable `actor` tab for player characters
* creator agent now has access to rag tools improve acting instruction generation
* client timeout fixes
* fix issue where scene removal menu stayed open after remove
* expose scene restore functionality to ux
* create initial restore point
* fix creator extra-context template
* didn't mean to remove this
* intro scene should be edited through world editor
* fix alert
* fix partial quotes regardless of editor setting director guidance for conversation reminds to put speech in quotes
* fix @ instructions not being passed through to director guidance prompt
* anthropic mode list updated
* default off
* cohere model list updated
* reset actAs on next scene load
* prompt tweaks
* prompt tweaks
* prompt tweaks
* prompt tweaks
* prompt tweaks
* remove debug cruft
* relock
* docs on changing host / port
* fix issue with narrator / director actions not available on fresh install
* fix issue with long content classification determination result
* take this reminder to put speech into quotes out for now, it seems to do more harm than good
* fix some remaining issues with auto exposition fixes
* prompt tweaks
* prompt tweaks
* fix issue during reload
* expensive and warning ux passthrough for agent config
* layered summary analysation defaults to on
* what's new info block added
* docs
* what's new updated
* remove old images
* old img cleanup script
* prompt tweaks
* improve auto prompt template detection via huggingface
* add gpt-4o-realtime-preview add gpt-4o-mini-realtime-preview
* add o1 and o3-mini
* fix o1 and o3
* fix o1 and o3
* more o1 / o3 fixes
* o3 fixes
Dockerfile Update (#174)
2025-12-16 19:57:47 +01:00 · 2025-02-01 17:44:06 +02:00 · 2025-01-30 02:29:44 +02:00 · 2024-11-24 15:43:27 +02:00 · 2024-09-23 12:55:34 +03:00 · 2024-07-26 21:51:07 +03:00
932 changed files with 66207 additions and 21172 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -0,0 +1,30 @@
+name: ci 
+on:
+  push:
+    branches:
+      - master 
+      - main
+      - prep-0.26.0
+permissions:
+  contents: write
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Configure Git Credentials
+        run: |
+          git config user.name github-actions[bot]
+          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
+      - uses: actions/setup-python@v5
+        with:
+          python-version: 3.x
+      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV 
+      - uses: actions/cache@v4
+        with:
+          key: mkdocs-material-${{ env.cache_id }}
+          path: .cache
+          restore-keys: |
+            mkdocs-material-
+      - run: pip install mkdocs-material mkdocs-awesome-pages-plugin mkdocs-glightbox
+      - run: mkdocs gh-deploy --force
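The workflow above keys the MkDocs cache on `date --utc '+%V'`, the ISO 8601 week number, so a fresh cache entry is written about once a week while `restore-keys` still allows falling back to any older `mkdocs-material-` entry. A minimal Python sketch of the same key derivation, for illustration only; `weekly_cache_key` is a hypothetical helper and not part of the workflow or the repository:

```python
from datetime import datetime, timezone


def weekly_cache_key(now: datetime, prefix: str = "mkdocs-material-") -> str:
    """Build a cache key the way the workflow does.

    Mirrors `date --utc '+%V'`: the zero-padded ISO 8601 week number,
    evaluated in UTC. `now` is expected to be timezone-aware.
    """
    iso_week = now.astimezone(timezone.utc).isocalendar()[1]
    return f"{prefix}{iso_week:02d}"


if __name__ == "__main__":
    # Prints the key the workflow would use for the current week.
    print(weekly_cache_key(datetime.now(timezone.utc)))
```

Because the key only changes when the ISO week changes, every push within the same week reuses one cache entry.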
--- a/.gitignore
+++ b/.gitignore
@@ -1,10 +1,20 @@
-.lmer
 *.pyc
-problems
 *.swp
 *.swo
 *.egg-info
-tales/
 *-internal*
 *.internal*
 *_internal*
+talemate_env
+chroma
+config.yaml
+templates/llm-prompt/user/*.jinja2
+templates/world-state/*.yaml
+scenes/
+!scenes/infinity-quest-dynamic-scenario/
+!scenes/infinity-quest-dynamic-scenario/assets/
+!scenes/infinity-quest-dynamic-scenario/templates/
+!scenes/infinity-quest-dynamic-scenario/infinity-quest.json
+!scenes/infinity-quest/assets/
+!scenes/infinity-quest/infinity-quest.json
+tts_voice_samples/*.wav
--- a/Dockerfile
+++ b/Dockerfile
@@ -0,0 +1,86 @@
+# Stage 1: Frontend build
+FROM node:21 AS frontend-build
+
+ENV NODE_ENV=development
+
+WORKDIR /app
+
+# Copy the frontend directory contents into the container at /app
+COPY ./talemate_frontend /app
+
+# Install all dependencies and build
+RUN npm install && npm run build
+
+# Stage 2: Backend build
+FROM python:3.11-slim AS backend-build
+
+WORKDIR /app
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    bash \
+    gcc \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install poetry
+RUN pip install poetry
+
+# Copy poetry files
+COPY pyproject.toml poetry.lock* /app/
+
+# Create a virtual environment
+RUN python -m venv /app/talemate_env
+
+# Activate virtual environment and install dependencies
+RUN . /app/talemate_env/bin/activate && \
+    poetry config virtualenvs.create false && \
+    poetry install  --only main --no-root
+
+# Copy the Python source code
+COPY ./src /app/src
+
+# Conditional PyTorch+CUDA install
+ARG CUDA_AVAILABLE=false
+RUN . /app/talemate_env/bin/activate && \
+    if [ "$CUDA_AVAILABLE" = "true" ]; then \
+        echo "Installing PyTorch with CUDA support..." && \
+        pip uninstall torch torchaudio -y && \
+        pip install torch~=2.4.1 torchaudio~=2.4.1 --index-url https://download.pytorch.org/whl/cu121; \
+    fi
+
+# Stage 3: Final image
+FROM python:3.11-slim
+
+WORKDIR /app
+
+RUN apt-get update && apt-get install -y \
+    bash \
+    && rm -rf /var/lib/apt/lists/*
+
+# Copy virtual environment from backend-build stage
+COPY --from=backend-build /app/talemate_env /app/talemate_env
+
+# Copy Python source code
+COPY --from=backend-build /app/src /app/src
+
+# Copy Node.js build artifacts from frontend-build stage
+COPY --from=frontend-build /app/dist /app/talemate_frontend/dist
+
+# Copy the frontend WSGI file if it exists
+COPY frontend_wsgi.py /app/frontend_wsgi.py
+
+# Copy base config
+COPY config.example.yaml /app/config.yaml
+
+# Copy essentials
+COPY scenes templates chroma* /app/
+
+# Set PYTHONPATH to include the src directory
+ENV PYTHONPATH=/app/src:$PYTHONPATH
+
+# Make ports available to the world outside this container
+EXPOSE 5050
+EXPOSE 8080
+
+# Use bash as the shell, activate the virtual environment, and run backend server
+CMD ["/bin/bash", "-c", "source /app/talemate_env/bin/activate && python src/talemate/server/run.py runserver --host 0.0.0.0 --port 5050 --frontend-host 0.0.0.0 --frontend-port 8080"]
--- a/README.md
+++ b/README.md
@@ -1,167 +1,44 @@
 # Talemate

-Allows you to play roleplay scenarios with large language models. 
+Roleplay with AI with a focus on strong narration and consistent world and game state tracking.

-It does not run any large language models itself but relies on existing APIs. Currently supports **text-generation-webui** and **openai**.
+|![Screenshot 3](docs/img/ss-1.png)|![Screenshot 3](docs/img/ss-2.png)|
+|------------------------------------------|------------------------------------------|
+|![Screenshot 4](docs/img/ss-4.png)|![Screenshot 1](docs/img/Screenshot_15.png)|
+|![Screenshot 2](docs/img/Screenshot_16.png)|![Screenshot 3](docs/img/Screenshot_17.png)|

-This means you need to either have an openai api key or know how to setup [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) (locally or remotely via gpu renting.)
+## Core Features

-![Screenshot 1](docs/img/Screenshot_8.png)
-![Screenshot 2](docs/img/Screenshot_2.png)
+- Multiple agents for dialogue, narration, summarization, direction, editing, world state management, character/scenario creation, text-to-speech, and visual generation
+- Supports per agent API selection
+- Long-term memory and passage of time tracking
+- Narrative world state management to reinforce character and world truths
+- Creative tools for managing NPCs, AI-assisted character, and scenario creation with template support
+- Context management for character details, world information, past events, and pinned information
+- Customizable templates for all prompts using Jinja2
+- Modern, responsive UI

-## Current features
+## Documentation

-- responive modern ui
-- agents
-    - conversation
-    - narration
-    - summarization
-    - director
-    - creative
-- multi-client (agents can be connected to separate LLMs)
-- long term memory (very experimental at this point)
-- narrative world state
-- narrative tools
-- creative tools 
-    - AI backed character creation with template support (jinja2)
-    - AI backed scenario creation
-- runpod integration
-- overridable templates foe all LLM prompts. (jinja2)
+- [Installation and Getting started](https://vegu-ai.github.io/talemate/)
+- [User Guide](https://vegu-ai.github.io/talemate/user-guide/interacting/)

-## Planned features
+## Supported APIs

-Kinda making it up as i go along, but i want to lean more into gameplay through AI, keeping track of gamestates, moving away from simply roleplaying towards a more game-ified experience.
+- [OpenAI](https://platform.openai.com/overview)
+- [Anthropic](https://www.anthropic.com/)
+- [mistral.ai](https://mistral.ai/)
+- [Cohere](https://www.cohere.com/)
+- [Groq](https://www.groq.com/)
+- [Google Gemini](https://console.cloud.google.com/)

-In no particular order:
+Supported self-hosted APIs:
+- [KoboldCpp](https://koboldai.org/cpp) ([Local](https://koboldai.org/cpp), [Runpod](https://koboldai.org/runpodcpp), [VastAI](https://koboldai.org/vastcpp), also includes image gen support)
+- [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) (local or with runpod support)
+- [LMStudio](https://lmstudio.ai/)
+- [TabbyAPI](https://github.com/theroyallab/tabbyAPI/)

-- Gameplay loop governed by AI
-- Improved world state
-- Dynamic player choice generation
-- Better creative tools
-    - node based scenario / character creation
-- Improved long term memory (base is there, but its very rough at the moment)
-- Improved director agent
-    - Right now this doesn't really work well on anything but GPT-4 (and even there it's debatable). It tends to steer the story in a way that introduces pacing issues. It needs a model that is creative but also reasons really well i think.
-- Automatic1111 client
-
-# Quickstart
-
-## Installation
-
-### Windows
-
-1. Download and install Python 3.10 or higher from the [official Python website](https://www.python.org/downloads/windows/).
-1. Download and install Node.js from the [official Node.js website](https://nodejs.org/en/download/). This will also install npm.
-1. Download the Talemate project to your local machine. Download from [the Releases page](https://github.com/final-wombat/talemate/releases).
-1. Unpack the download and run `install.bat` by double clicking it. This will set up the project on your local machine.
-1. Once the installation is complete, you can start the backend and frontend servers by running `start.bat`.
-1. Navigate your browser to http://localhost:8080
-
-### Linux
-
-`python 3.10` or higher is required.
-
-1. `git clone git@github.com:final-wombat/talemate`
-1. `cd talemate`
-1. `source install.sh`
-1. Start the backend: `python src/talemate/server/run.py runserver --host 0.0.0.0 --port 5001`.
-1. Open a new terminal, navigate to the `talemate_frontend` directory, and start the frontend server by running `npm run serve`.
-
-## Configuration
-
-### OpenAI
-
-To set your openai api key, open `config.yaml` in any text editor and uncomment / add
-
-```yaml
-openai:
-    api_key: sk-my-api-key-goes-here
-```
-
-You will need to restart the backend for this change to take effect.
-
-### RunPod
-
-To set your runpod api key, open `config.yaml` in any text editor and uncomment / add
-
-```yaml
-runpod:
-    api_key: my-api-key-goes-here
-```
-You will need to restart the backend for this change to take effect.
-
-Once the api key is set Pods loaded from text-generation-webui templates (or the bloke's runpod llm template) will be autoamtically added to your client list in talemate. 
-
-**ATTENTION**: Talemate is not a suitable for way for you to determine whether your pod is currently running or not. **Always** check the runpod dashboard to see if your pod is running or not.
-
-## Recommended Models
-
-Note: this is my personal opinion while using talemate. If you find a model that works better for you, let me know about it.
-
-Will be updated as i test more models and over time.
-
-| Model Name                    | Type            | Notes                                                                                                             |
-|-------------------------------|-----------------|-------------------------------------------------------------------------------------------------------------------|
-| [Nous Hermes LLama2](https://huggingface.co/TheBloke/Nous-Hermes-Llama2-GPTQ) | 13B model | My go-to model for 13B parameters. It's good at roleplay and also smart enough to handle the world state and narrative tools. A 13B model loaded via exllama also allows you run chromadb with the xl instructor embeddings off of a single 4090. |
-| [Xwin-LM-13B](https://huggingface.co/TheBloke/Xwin-LM-13B-V0.1-GPTQ) | 13B model | Really strong model, roleplaying capability still needs more testing |
-| [MythoMax](https://huggingface.co/TheBloke/MythoMax-L2-13B-GPTQ) | 13B model | Similar quality to Hermes LLama2, but a bit more creative. Rarely fails on JSON responses. |
-| [Synthia v1.2 34B](https://huggingface.co/TheBloke/Synthia-34B-v1.2-GPTQ) | 34B model | Cannot be run at full context together with chromadb instructor models on a single 4090. But a great choice if you're running chromadb with the default embeddings (or on cpu). |
-| [Xwin-LM-70B](https://huggingface.co/TheBloke/Xwin-LM-70B-V0.1-GPTQ) | 70B model | Great choice if you have the hardware to run it (or can rent it). |
-| [Synthia v1.2 70B](https://huggingface.co/TheBloke/Synthia-70B-v1.2-GPTQ) | 70B model | Great choice if you have the hardware to run it (or can rent it). |
-| [GPT-4](https://platform.openai.com/) | Remote         | Still the best for consistency and reasoning, but is heavily censored. While talemate will send a general "decensor" system prompt, depending on the type of content you want to roleplay, there is a chance your key will be banned. **If you do use this make sure to monitor your api usage, talemate tends to send a lot more requests than other roleplaying applications.** |
-| [GPT-3.5-turbo](https://platform.openai.com/)                 | Remote         | It's really inconsistent with JSON responses, plus its probably still just as heavily censored as GPT-4. If you want to run it i'd suggest running it for the conversation agent, and use GPT-4 for the other agents. **If you do use this make sure to monitor your api usage, talemate tends to send a lot more requests than other roleplaying applications.** |
-
-I have not tested with Llama 1 models in a while, Lazarus was really good at roleplay, but started failing on JSON requirements.
-
-I have not tested with anything below 13B parameters.
-
-## Connecting to an LLM
-
-On the right hand side click the "Add Client" button. If there is no button, you may need to toggle the client options by clicking this button:
-
-![Client options](docs/img/client-options-toggle.png)
-
-### Text-generation-webui
-
-In the modal if you're planning to connect to text-generation-webui, you can likely leave everything as is and just click Save.
-
-![Add client modal](docs/img/add-client-modal.png)
-
-### OpenAI
-
-If you want to add an OpenAI client, just change the client type and select the apropriate model.
-
-![Add client modal](docs/img/add-client-modal-openai.png)
-
-### Ready to go
-
-You will know you are good to go when the client and all the agents have a green dot next to them.
-
-![Ready to go](docs/img/client-setup-complete.png)
-
-## Load the introductory scenario "Infinity Quest"
-
-Generated using talemate creative tools, mostly used for testing / demoing.
-
-You can load it (and any other talemate scenarios or save files) by expanding the "Load" menu in the top left corner and selecting the middle tab. Then simple search for a partial name of the scenario you want to load and click on the result.
-
-![Load scenario location](docs/img/load-scene-location.png)
-
-## Loading character cards
-
-Supports both v1 and v2 chara specs.
-
-Expand the "Load" menu in the top left corner and either click on "Upload a character card" or simply drag and drop a character card file into the same area.
-
-![Load character card location](docs/img/load-card-location.png)
-
-Once a character is uploaded, talemate may actually take a moment because it needs to convert it to a talemate format and will also run additional LLM prompts to generate character attributes and world state.
-
-Make sure you save the scene after the character is loaded as it can then be loaded as normal talemate scenario in the future.
-
-## Further documentation
-
-- Creative mode (docs WIP)
-- Prompt template overrides
-- [ChromaDB (long term memory)](docs/chromadb.md)
-- Runpod Integration
+Generic OpenAI api implementations (tested and confirmed working):
+- [DeepInfra](https://deepinfra.com/)
+- [llamacpp](https://github.com/ggerganov/llama.cpp) with the `api_like_OAI.py` wrapper
+- let me know if you have tested any other implementations and they failed / worked or landed somewhere in between
--- a/config.example.yaml
+++ b/config.example.yaml
@@ -1,24 +1,40 @@
+agents: {}
+clients: {}
 creator:
  content_context:
-  - a fun and engaging slice of life story aimed at an adult audience.
-  - a terrifying horror story aimed at an adult audience.
-  - a thrilling action story aimed at an adult audience.
-  - a mysterious adventure aimed at an adult audience.
-  - an epic sci-fi adventure aimed at an adult audience.
-game:
-  default_player_character:
-    color: '#6495ed'
-    description: a young man with a penchant for adventure.
-    gender: male
-    name: Elmer
+  - a fun and engaging slice of life story
+  - a terrifying horror story
+  - a thrilling action story
+  - a mysterious adventure
+  - an epic sci-fi adventure
+
+## Long-term memory

 #chromadb:
 #  embeddings: instructor
 #  instructor_device: cuda
 #  instructor_model: hkunlp/instructor-xl
+#  openai_model: text-embedding-3-small 
  
+## Remote LLMs
+
 #openai:
 #  api_key: <API_KEY>

 #runpod:
-#  api_key: <API_KEY>
+#  api_key: <API_KEY>
+
+## TTS (Text-to-Speech)
+
+#elevenlabs:
+#  api_key: <API_KEY>
+
+#coqui:
+#  api_key: <API_KEY>
+
+#tts:
+#  device: cuda
+#  model: tts_models/multilingual/multi-dataset/xtts_v2
+#  voices:
+#  - label: <name>
+#    value: <path to .wav for voice sample>
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -0,0 +1,21 @@
+version: '3.8'
+
+services:
+  talemate:
+    build:
+      context: .
+      dockerfile: Dockerfile
+      args:
+        - CUDA_AVAILABLE=${CUDA_AVAILABLE:-false}
+    ports:
+      - "${FRONTEND_PORT:-8080}:8080"
+      - "${BACKEND_PORT:-5050}:5050"
+    volumes:
+      - ./config.yaml:/app/config.yaml
+      - ./scenes:/app/scenes
+      - ./templates:/app/templates
+      - ./chroma:/app/chroma
+    environment:
+      - PYTHONUNBUFFERED=1
+      - PYTHONPATH=/app/src:$PYTHONPATH
+    command: ["/bin/bash", "-c", "source /app/talemate_env/bin/activate && python src/talemate/server/run.py runserver --host 0.0.0.0 --port 5050 --frontend-host 0.0.0.0 --frontend-port 8080"]
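The compose file above uses `${VAR:-default}` interpolation (`${FRONTEND_PORT:-8080}`, `${BACKEND_PORT:-5050}`, `${CUDA_AVAILABLE:-false}`): the default applies when the variable is unset or empty. The sketch below imitates only that one form to show how the port mappings resolve; `interpolate` is a hypothetical helper written for illustration, not Docker Compose's own implementation, and it ignores the other interpolation forms Compose supports.

```python
import re

# Matches only the `${NAME:-default}` form used in the compose file above.
_DEFAULT_FORM = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):-(?P<default>[^}]*)\}")


def interpolate(value: str, env: dict[str, str]) -> str:
    """Replace `${NAME:-default}` with env[NAME], or with the default
    when NAME is unset or set to an empty string."""

    def _substitute(match: re.Match) -> str:
        current = env.get(match.group("name"), "")
        return current if current else match.group("default")

    return _DEFAULT_FORM.sub(_substitute, value)


if __name__ == "__main__":
    # No overrides: the frontend is published on host port 8080.
    print(interpolate("${FRONTEND_PORT:-8080}:8080", {}))
    # With FRONTEND_PORT=9090 only the host side of the mapping changes.
    print(interpolate("${FRONTEND_PORT:-8080}:8080", {"FRONTEND_PORT": "9090"}))
```

Only the host side of each mapping is configurable this way; the container side stays at 8080 and 5050, which are the ports passed to `runserver` in the `command` line.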
--- a/docs/.pages
+++ b/docs/.pages
@@ -0,0 +1,5 @@
+nav:
+    - Home: index.md
+    - Getting started: getting-started
+    - User guide: user-guide
+    - Developer guide: dev
--- a/docs/chromadb.md
+++ b/docs/chromadb.md
@@ -1,61 +0,0 @@
-# ChromaDB
-
-Talemate uses ChromaDB to maintain long-term memory. The default embeddings used are really fast but also not incredibly accurate. If you want to use more accurate embeddings you can use the instructor embeddings or the openai embeddings. See below for instructions on how to enable these.
-
-In my testing so far, instructor-xl has proved to be the most accurate (even more-so than openai)
-
-## Local instructor embeddings
-
-If you want chromaDB to use the more accurate (but much slower) instructor embeddings add the following to `config.yaml`:
-
-**Note**: The `xl` model takes a while to load even with cuda. Expect a minute of loading time on the first scene you load.
-
-```yaml
-chromadb:
-    embeddings: instructor
-    instructor_device: cpu
-    instructor_model: hkunlp/instructor-xl
-```
-
-### Instructor embedding models
-
-- `hkunlp/instructor-base` (smallest / fastest)
-- `hkunlp/instructor-large` 
-- `hkunlp/instructor-xl` (largest / slowest) - requires about 5GB of memory
-
-You will need to restart the backend for this change to take effect.
-
-**NOTE** - The first time you do this it will need to download the instructor model you selected. This may take a while, and the talemate backend will be un-responsive during that time.
-
-Once the download is finished, if talemate is still un-responsive, try reloading the front-end to reconnect. When all fails just restart the backend as well. I'll try to make this more robust in the future.
-
-### GPU support
-
-If you want to use the instructor embeddings with GPU support, you will need to install pytorch with CUDA support. 
-
-To do this on windows, run `install-pytorch-cuda.bat` from the project directory. Then change your device in the config to `cuda`:
-
-```yaml
-chromadb:
-    embeddings: instructor
-    instructor_device: cuda
-    instructor_model: hkunlp/instructor-xl
-```
-
-## OpenAI embeddings
-
-First make sure your openai key is specified in the `config.yaml` file
-
-```yaml
-openai:
-  api_key: <your-key-here>
-```
-
-Then add the following to `config.yaml` for chromadb:
-
-```yaml
-chromadb:
-    embeddings: openai
-```
-
-**Note**: As with everything openai, using this isn't free. It's way cheaper than their text completion though. ALSO - if you send super explicit content they may flag / ban your key, so keep that in mind (i hear they usually send warnings first though), and always monitor your usage on their dashboard.
--- a/docs/cleanup.py
+++ b/docs/cleanup.py
@@ -0,0 +1,166 @@
+import os
+import re
+import subprocess
+from pathlib import Path
+import argparse
+
+def find_image_references(md_file):
+    """Find all image references in a markdown file."""
+    with open(md_file, 'r', encoding='utf-8') as f:
+        content = f.read()
+    
+    pattern = r'!\[.*?\]\((.*?)\)'
+    matches = re.findall(pattern, content)
+    
+    cleaned_paths = []
+    for match in matches:
+        path = match.lstrip('/')
+        if 'img/' in path:
+            path = path[path.index('img/') + 4:]
+            # Only keep references to versioned images
+            parts = os.path.normpath(path).split(os.sep)
+            if len(parts) >= 2 and parts[0].replace('.', '').isdigit():
+                cleaned_paths.append(path)
+    
+    return cleaned_paths
+
+def scan_markdown_files(docs_dir):
+    """Recursively scan all markdown files in the docs directory."""
+    md_files = []
+    for root, _, files in os.walk(docs_dir):
+        for file in files:
+            if file.endswith('.md'):
+                md_files.append(os.path.join(root, file))
+    return md_files
+
+def find_all_images(img_dir):
+    """Find all image files in version subdirectories."""
+    image_files = []
+    for root, _, files in os.walk(img_dir):
+        # Get the relative path from img_dir to current directory
+        rel_dir = os.path.relpath(root, img_dir)
+        
+        # Skip if we're in the root img directory
+        if rel_dir == '.':
+            continue
+            
+        # Check if the immediate parent directory is a version number
+        parent_dir = rel_dir.split(os.sep)[0]
+        if not parent_dir.replace('.', '').isdigit():
+            continue
+            
+        for file in files:
+            if file.lower().endswith(('.png', '.jpg', '.jpeg', '.gif', '.svg')):
+                rel_path = os.path.relpath(os.path.join(root, file), img_dir)
+                image_files.append(rel_path)
+    return image_files
+
+def grep_check_image(docs_dir, image_path):
+    """
+    Check if versioned image is referenced anywhere using grep.
+    Returns True if any reference is found, False otherwise.
+    """
+    try:
+        # Split the image path to get version and filename
+        parts = os.path.normpath(image_path).split(os.sep)
+        version = parts[0]  # e.g., "0.29.0"
+        filename = parts[-1]  # e.g., "world-state-suggestions-2.png"
+        
+        # For versioned images, require both version and filename to match
+        version_pattern = f"{version}.*{filename}"
+        try:
+            result = subprocess.run(
+                ['grep', '-r', '-l', version_pattern, docs_dir],
+                capture_output=True,
+                text=True
+            )
+            if result.stdout.strip():
+                print(f"Found reference to {image_path} with version pattern: {version_pattern}")
+                return True
+        except subprocess.CalledProcessError:
+            pass
+            
+    except Exception as e:
+        print(f"Error during grep check for {image_path}: {e}")
+    
+    return False
+
+def main():
+    parser = argparse.ArgumentParser(description='Find and optionally delete unused versioned images in MkDocs project')
+    parser.add_argument('--docs-dir', type=str, required=True, help='Path to the docs directory')
+    parser.add_argument('--img-dir', type=str, required=True, help='Path to the images directory')
+    parser.add_argument('--delete', action='store_true', help='Delete unused images')
+    parser.add_argument('--verbose', action='store_true', help='Show all found references and files')
+    parser.add_argument('--skip-grep', action='store_true', help='Skip the additional grep validation')
+    args = parser.parse_args()
+
+    # Convert paths to absolute paths
+    docs_dir = os.path.abspath(args.docs_dir)
+    img_dir = os.path.abspath(args.img_dir)
+
+    print(f"Scanning markdown files in: {docs_dir}")
+    print(f"Looking for versioned images in: {img_dir}")
+
+    # Get all markdown files
+    md_files = scan_markdown_files(docs_dir)
+    print(f"Found {len(md_files)} markdown files")
+
+    # Collect all image references
+    used_images = set()
+    for md_file in md_files:
+        refs = find_image_references(md_file)
+        used_images.update(refs)
+
+    # Get all actual images (only from version directories)
+    all_images = set(find_all_images(img_dir))
+
+    if args.verbose:
+        print("\nAll versioned image references found in markdown:")
+        for img in sorted(used_images):
+            print(f"- {img}")
+        
+        print("\nAll versioned images in directory:")
+        for img in sorted(all_images):
+            print(f"- {img}")
+
+    # Find potentially unused images
+    unused_images = all_images - used_images
+
+    # Additional grep validation if not skipped
+    if not args.skip_grep and unused_images:
+        print("\nPerforming additional grep validation...")
+        actually_unused = set()
+        for img in unused_images:
+            if not grep_check_image(docs_dir, img):
+                actually_unused.add(img)
+        
+        if len(actually_unused) != len(unused_images):
+            print(f"\nGrep validation found {len(unused_images) - len(actually_unused)} additional image references!")
+        unused_images = actually_unused
+
+    # Report findings
+    print("\nResults:")
+    print(f"Total versioned images found: {len(all_images)}")
+    print(f"Versioned images referenced in markdown: {len(used_images)}")
+    print(f"Unused versioned images: {len(unused_images)}")
+
+    if unused_images:
+        print("\nUnused versioned images:")
+        for img in sorted(unused_images):
+            print(f"- {img}")
+            
+        if args.delete:
+            print("\nDeleting unused versioned images...")
+            for img in unused_images:
+                full_path = os.path.join(img_dir, img)
+                try:
+                    os.remove(full_path)
+                    print(f"Deleted: {img}")
+                except Exception as e:
+                    print(f"Error deleting {img}: {e}")
+            print("\nDeletion complete")
+    else:
+        print("\nNo unused versioned images found!")
+
+if __name__ == "__main__":
+    main()
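The grep check in the script above builds its pattern from raw path parts, so the `.` characters in a version such as `0.29.0` act as regex wildcards. Below is a minimal pure-Python sketch of the same line-based check with the parts escaped. The helper name `reference_exists` is hypothetical and not part of the script; it only illustrates the idea.

```python
import os
import re


def reference_exists(docs_dir: str, image_path: str) -> bool:
    """Return True if any file under docs_dir has a line that mentions
    both the version directory and the filename of image_path."""
    parts = os.path.normpath(image_path).split(os.sep)
    version, filename = parts[0], parts[-1]
    # re.escape keeps "." literal, so "0.29.0" cannot match "0x29y0".
    pattern = re.compile(re.escape(version) + ".*" + re.escape(filename))

    for root, _dirs, files in os.walk(docs_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if any(pattern.search(line) for line in handle):
                        return True
            except OSError:
                continue  # unreadable file: skip it and keep scanning
    return False
```

Like `grep -r`, this scans every file under the docs directory line by line, so it should flag the same references while avoiding a subprocess call.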
--- a/docs/dev/agents/example/test/init.py
+++ b/docs/dev/agents/example/test/init.py
@@ -0,0 +1,48 @@
+from talemate.agents.base import Agent, AgentAction
+from talemate.agents.registry import register
+from talemate.events import GameLoopEvent
+import talemate.emit.async_signals
+from talemate.emit import emit
+
+@register()
+class TestAgent(Agent):
+    
+    agent_type = "test"
+    verbose_name = "Test"
+    
+    def __init__(self, client):
+        self.client = client
+        self.is_enabled = True
+        self.actions = {
+            "test": AgentAction(
+                enabled=True,
+                label="Test",
+                description="Test",
+            ),
+        }
+        
+    @property
+    def enabled(self):
+        return self.is_enabled
+
+    @property
+    def has_toggle(self):
+        return True
+
+    @property
+    def experimental(self):
+        return True
+
+    def connect(self, scene):
+        super().connect(scene)
+        talemate.emit.async_signals.get("game_loop").connect(self.on_game_loop)
+        
+    async def on_game_loop(self, emission: GameLoopEvent):
+        """
+        Called at the beginning of every game loop.
+        """
+
+        if not self.enabled:
+            return
+
+        emit("status", status="info", message="Annoying you with a test message every game loop.")
--- a/docs/dev/client/example/runpod_vllm/init.py
+++ b/docs/dev/client/example/runpod_vllm/init.py
@@ -0,0 +1,130 @@
+"""
+An attempt to write a client against the runpod serverless vllm worker.
+
+This is close to functional, but since RunPod serverless GPU availability is currently terrible, I have
+been unable to properly test it.
+
+Putting it here for now since I think it makes a decent example of how to write a client against a new service.
+"""
+
+import pydantic
+import structlog
+import runpod
+import asyncio
+import aiohttp
+from talemate.client.base import ClientBase, ExtraField
+from talemate.client.registry import register
+from talemate.emit import emit
+from talemate.config import Client as BaseClientConfig
+
+log = structlog.get_logger("talemate.client.runpod_vllm")
+
+class Defaults(pydantic.BaseModel):
+    max_token_length: int = 4096
+    model: str = ""
+    runpod_id: str = ""
+    
+class ClientConfig(BaseClientConfig):
+    runpod_id: str = ""
+
+@register()
+class RunPodVLLMClient(ClientBase):
+    client_type = "runpod_vllm"
+    conversation_retries = 5
+    config_cls = ClientConfig
+
+    class Meta(ClientBase.Meta):
+        title: str = "Runpod VLLM"
+        name_prefix: str = "Runpod VLLM"
+        enable_api_auth: bool = True
+        manual_model: bool = True
+        defaults: Defaults = Defaults()
+        extra_fields: dict[str, ExtraField] = {
+            "runpod_id": ExtraField(
+                name="runpod_id",
+                type="text",
+                label="Runpod ID",
+                required=True,
+                description="The Runpod ID to connect to.",
+            )
+        }
+
+
+    def __init__(self, model=None, runpod_id=None, **kwargs):
+        self.model_name = model
+        self.runpod_id = runpod_id
+        super().__init__(**kwargs)
+
+    @property
+    def experimental(self):
+        return False
+
+
+    def set_client(self, **kwargs):
+        log.debug("set_client", kwargs=kwargs, runpod_id=self.runpod_id)
+        self.runpod_id = kwargs.get("runpod_id", self.runpod_id)
+        
+        
+    def tune_prompt_parameters(self, parameters: dict, kind: str):
+        super().tune_prompt_parameters(parameters, kind)
+
+        keys = list(parameters.keys())
+
+        valid_keys = ["temperature", "top_p", "max_tokens"]
+
+        for key in keys:
+            if key not in valid_keys:
+                del parameters[key]
+
+    async def get_model_name(self):
+        return self.model_name
+
+    async def generate(self, prompt: str, parameters: dict, kind: str):
+        """
+        Generates text from the given prompt and parameters.
+        """
+        prompt = prompt.strip()
+
+        self.log.debug("generate", prompt=prompt[:128] + " ...", parameters=parameters)
+
+        try:
+            
+            async with aiohttp.ClientSession() as session:
+                endpoint = runpod.AsyncioEndpoint(self.runpod_id, session)
+                
+                run_request = await endpoint.run({
+                    "input": {
+                        "prompt": prompt,   
+                    }
+                    #"parameters": parameters
+                })
+                
+                # Poll once per iteration until the job reaches a terminal state
+                status = await run_request.status()
+                while status not in ["COMPLETED", "FAILED", "CANCELLED"]:
+                    log.debug("generate", status=status)
+                    await asyncio.sleep(0.1)
+                    status = await run_request.status()
+                
+                log.debug("generate", status=status)
+                    
+                response = await run_request.output()
+                
+                log.debug("generate", response=response)
+                
+                return response["choices"][0]["tokens"][0]
+            
+        except Exception as e:
+            self.log.error("generate error", e=e)
+            emit(
+                "status", message="Error during generation (check logs)", status="error"
+            )
+            return ""
+
+    def reconfigure(self, **kwargs):
+        if kwargs.get("model"):
+            self.model_name = kwargs["model"]
+        if "runpod_id" in kwargs:
+            self.api_auth = kwargs["runpod_id"]
+        log.warning("reconfigure", kwargs=kwargs)
+        self.set_client(**kwargs)
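The polling loop in `generate` above waits for a terminal status with no upper bound. Below is a minimal sketch of the same pattern with a timeout. `FakeJob` is a hypothetical stand-in for the job handle, and the non-terminal status names in the usage line are illustrative only; nothing here calls the real `runpod` library.

```python
import asyncio

TERMINAL = {"COMPLETED", "FAILED", "CANCELLED"}


class FakeJob:
    """Hypothetical stand-in for a job handle with an async status() method."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    async def status(self):
        # Return queued statuses in order, then keep repeating the last one.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]


async def wait_for_terminal(job, interval=0.01, timeout=5.0):
    """Poll job.status() once per iteration until it is terminal or time runs out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await job.status()
        if status in TERMINAL:
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout}s")
        await asyncio.sleep(interval)


final = asyncio.run(wait_for_terminal(FakeJob(["IN_QUEUE", "IN_PROGRESS", "COMPLETED"])))
```

Returning the final status lets the caller decide how to handle `FAILED` or `CANCELLED` before trying to read any output.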
--- a/docs/dev/client/example/test/init.py
+++ b/docs/dev/client/example/test/init.py
@@ -0,0 +1,67 @@
+import pydantic
+from openai import AsyncOpenAI
+
+from talemate.client.base import ClientBase
+from talemate.client.registry import register
+
+
+class Defaults(pydantic.BaseModel):
+    api_url: str = "http://localhost:1234"
+    max_token_length: int = 4096
+
+@register()
+class TestClient(ClientBase):
+    client_type = "test"
+
+    class Meta(ClientBase.Meta):
+        name_prefix: str = "test"
+        title: str = "Test"
+        defaults: Defaults = Defaults()
+
+    def set_client(self, **kwargs):
+        self.client = AsyncOpenAI(base_url=self.api_url + "/v1", api_key="sk-1111")
+
+    def tune_prompt_parameters(self, parameters: dict, kind: str):
+        
+        """
+        Talemate adds a bunch of parameters to the prompt, but not all of them are valid for all clients.
+        
+        This method is called before the prompt is sent to the client, and it allows the client to remove
+        any parameters that it doesn't support.
+        """
+        
+        super().tune_prompt_parameters(parameters, kind)
+
+        keys = list(parameters.keys())
+
+        valid_keys = ["temperature", "top_p"]
+
+        for key in keys:
+            if key not in valid_keys:
+                del parameters[key]
+
+    async def get_model_name(self):
+        
+        """
+        This should return the name of the model that is being used.
+        """
+        
+        return "Mock test model"
+
+    async def generate(self, prompt: str, parameters: dict, kind: str):
+        """
+        Generates text from the given prompt and parameters.
+        """
+        human_message = {"role": "user", "content": prompt.strip()}
+
+        self.log.debug("generate", prompt=prompt[:128] + " ...", parameters=parameters)
+
+        try:
+            response = await self.client.chat.completions.create(
+                model=self.model_name, messages=[human_message], **parameters
+            )
+
+            return response.choices[0].message.content
+        except Exception as e:
+            self.log.error("generate error", e=e)
+            return ""
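The `tune_prompt_parameters` docstring above describes dropping parameters the client does not support. The filtering step can be sketched on its own, without any Talemate imports. The function name `keep_supported` is hypothetical and used only for illustration.

```python
def keep_supported(parameters: dict, valid_keys: list) -> None:
    """Remove, in place, every entry whose key is not in valid_keys."""
    # Iterate over a copy of the keys: deleting from a dict while iterating
    # over it directly raises RuntimeError.
    for key in list(parameters.keys()):
        if key not in valid_keys:
            del parameters[key]


params = {"temperature": 0.7, "top_k": 40, "top_p": 0.9}
keep_supported(params, ["temperature", "top_p"])
```

The dictionary is mutated in place rather than replaced, which matches how the example clients above modify the `parameters` argument they receive.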
--- a/docs/dev/index.md
+++ b/docs/dev/index.md
@@ -0,0 +1,3 @@
+# Coming soon
+
+Developer documentation is coming soon. Stay tuned! 
--- a/docs/dev/templates.md
+++ b/docs/dev/templates.md
@@ -0,0 +1,85 @@
+# Template Overrides
+
+!!! warning "Old documentation"
+    This documentation is outdated and needs to be updated, but it may still contain useful information.
+
+## Introduction to Templates
+
+In Talemate, templates are used to generate dynamic content for various agents involved in roleplaying scenarios. These templates leverage the Jinja2 templating engine, allowing for the inclusion of variables, conditional logic, and custom functions to create rich and interactive narratives.
+
+## Template Structure
+
+A typical template in Talemate consists of several sections, each enclosed within special section tags (`<|SECTION:NAME|>` and `<|CLOSE_SECTION|>`). These sections can include character details, dialogue examples, scenario overviews, tasks, and additional context. Templates utilize loops and blocks to iterate over data and render content conditionally based on the task requirements.
+
+## Overriding Templates
+
+Users can customize the behavior of Talemate by overriding the default templates. To override a template, create a new template file with the same name in the `./templates/prompts/{agent}/` directory. When a custom template is present, Jinja2 will prioritize it over the default template located in the `./src/talemate/prompts/templates/{agent}/` directory.
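The lookup order described above can be sketched as a plain path check. This is an illustration only: Talemate resolves templates through Jinja2, and the helper name `resolve_template` is hypothetical. The two default directories are the ones named in the paragraph above.

```python
import os


def resolve_template(agent: str, name: str,
                     override_root: str = "./templates/prompts",
                     default_root: str = "./src/talemate/prompts/templates") -> str:
    """Return the override path if that file exists, else the default path."""
    override = os.path.join(override_root, agent, name)
    if os.path.isfile(override):
        return override
    return os.path.join(default_root, agent, name)
```

For example, `resolve_template("creator", "character-details-human.jinja2")` would return the file under `./templates/prompts/creator/` only if you have created one there.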
+
+## Creator Agent Templates
+
+The creator agent templates allow for the creation of new characters within the character creator. Following the naming convention `character-attributes-*.jinja2`, `character-details-*.jinja2`, and `character-example-dialogue-*.jinja2`, users can add new templates that will be available for selection in the character creator.
+
+### Requirements for Creator Templates
+
+- All three types (`attributes`, `details`, `example-dialogue`) need to be available for a choice to be valid in the character creator.
+- Users can check the human templates for an understanding of how to structure these templates.
+
+### Example Templates
+
+- `src/talemate/prompts/templates/creator/character-attributes-human.jinja2`
+- `src/talemate/prompts/templates/creator/character-details-human.jinja2`
+- `src/talemate/prompts/templates/creator/character-example-dialogue-human.jinja2`
+
+These example templates can serve as a guide for users to create their own custom templates for the character creator.
+
+### Extending Existing Templates
+
+Jinja2's template inheritance feature allows users to extend existing templates and add extra information. By using the `{% extends "template-name.jinja2" %}` tag, a new template can inherit everything from an existing template and then add or override specific blocks of content.
+
+#### Example
+
+To add a description of a character's hairstyle to the human character details template, you could create a new template like this:
+
+```jinja2
+{% extends "character-details-human.jinja2" %}
+{% block questions %}
+{% if character_details.q("what does "+character.name+"'s hair look like?") -%}
+Briefly describe {{ character.name }}'s hair-style using a narrative writing style that reminds of mid 90s point and click adventure games. (2 - 3 sentences).
+{% endif %}
+{% endblock %}
+```
+
+This example shows how to extend the `character-details-human.jinja2` template and add a block for questions about the character's hair. The `{% block questions %}` tag is used to define a section where additional questions can be inserted or existing ones can be overridden.
+
+## Advanced Template Topics
+
+### Jinja2 Functions in Talemate
+
+Talemate exposes several functions to the Jinja2 template environment, providing utilities for data manipulation, querying, and controlling content flow. Here's a list of available functions:
+
+1. `set_prepared_response(response, prepend)`: Sets the prepared response with an optional prepend string. This function allows the template to specify the beginning of the LLM response when processing the rendered template. For example, `set_prepared_response("Certainly!")` will ensure that the LLM's response starts with "Certainly!".
+2. `set_prepared_response_random(responses, prefix)`: Chooses a random response from a list and sets it as the prepared response with an optional prefix.
+3. `set_eval_response(empty)`: Prepares the response for evaluation, optionally initializing a counter for an empty string.
+4. `set_json_response(initial_object, instruction, cutoff)`: Prepares for a JSON response with an initial object and optional instruction and cutoff.
+5. `set_question_eval(question, trigger, counter, weight)`: Sets up a question for evaluation with a trigger, counter, and weight.
+6. `disable_dedupe()`: Disables deduplication of the response text.
+7. `random(min, max)`: Generates a random integer between the specified minimum and maximum.
+8. `query_scene(query, at_the_end, as_narrative)`: Queries the scene with a question and returns the formatted response.
+9. `query_text(query, text, as_question_answer)`: Queries a text with a question and returns the formatted response.
+10. `query_memory(query, as_question_answer, **kwargs)`: Queries the memory with a question and returns the formatted response.
+11. `instruct_text(instruction, text)`: Instructs the text with a command and returns the result.
+12. `retrieve_memories(lines, goal)`: Retrieves memories based on the provided lines and an optional goal.
+13. `uuidgen()`: Generates a UUID string.
+14. `to_int(x)`: Converts the given value to an integer.
+15. `config`: Accesses the configuration settings.
+16. `len(x)`: Returns the length of the given object.
+17. `count_tokens(x)`: Counts the number of tokens in the given text.
+18. `print(x)`: Prints the given object (mainly for debugging purposes).
+
+These functions enhance the capabilities of templates, allowing for dynamic and interactive content generation.
+
+### Error Handling
+
+Errors encountered during template rendering are logged and propagated to the user interface. This ensures that users are informed of any issues that may arise, allowing them to troubleshoot and resolve problems effectively.
+
+By following these guidelines, users can create custom templates that tailor the Talemate experience to their specific storytelling needs.
--- a/docs/dev/third-party-reference.md
+++ b/docs/dev/third-party-reference.md
@@ -0,0 +1,14 @@
+## Third Party API docs
+
+### Chat completions
+
+- [Anthropic](https://docs.anthropic.com/en/api/messages)
+- [Cohere](https://docs.cohere.com/reference/chat)
+- [Google AI](https://ai.google.dev/api/generate-content#v1beta.GenerationConfig)
+- [Groq](https://console.groq.com/docs/api-reference#chat-create)
+- [KoboldCpp](https://lite.koboldai.net/koboldcpp_api#/api/v1)
+- [LMStudio](https://lmstudio.ai/docs/api/rest-api)
+- [Mistral AI](https://docs.mistral.ai/api/)
+- [OpenAI](https://platform.openai.com/docs/api-reference/completions)
+- [TabbyAPI](https://theroyallab.github.io/tabbyAPI/#operation/chat_completion_request_v1_chat_completions_post)
+- [Text-Generation-WebUI](https://github.com/oobabooga/text-generation-webui/blob/main/extensions/openai/typing.py)
--- a/docs/getting-started/.pages
+++ b/docs/getting-started/.pages
@@ -0,0 +1,5 @@
+nav:
+    - 1. Installation: installation
+    - 2. Connect a client: connect-a-client.md
+    - 3. Load a scene: load-a-scene.md
+    - ...
--- a/docs/getting-started/advanced/.pages
+++ b/docs/getting-started/advanced/.pages
@@ -0,0 +1,3 @@
+nav:
+    - change-host-and-port.md
+    - ...
--- a/docs/getting-started/advanced/change-host-and-port.md
+++ b/docs/getting-started/advanced/change-host-and-port.md
@@ -0,0 +1,102 @@
+# Changing host and port
+
+## Backend
+
+By default, the backend listens on `localhost:5050`.
+
+To run the server on a different host and port, you need to change the values passed to the `--host` and `--port` parameters during startup and also make sure the frontend knows the new values.
+
+### Changing the host and port for the backend
+
+#### :material-linux: Linux
+
+Copy `start.sh` to `start_custom.sh` and edit the `--host` and `--port` parameters in the `runserver` command.
+
+```bash
+#!/bin/sh
+. talemate_env/bin/activate
+python src/talemate/server/run.py runserver --host 0.0.0.0 --port 1234
+```
+
+#### :material-microsoft-windows: Windows
+
+Copy `start.bat` to `start_custom.bat` and edit the `--host` and `--port` parameters in the `runserver` command.
+
+```batch
+start cmd /k "cd talemate_env\Scripts && activate && cd ../../ && python src\talemate\server\run.py runserver --host 0.0.0.0 --port 1234"
+```
+
+### Letting the frontend know about the new host and port
+
+Copy `talemate_frontend/example.env.development.local` to `talemate_frontend/.env.production.local` and edit the `VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL`.
+
+```env
+VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL=ws://localhost:1234
+```
+
+Next, rebuild the frontend.
+
+```bash
+cd talemate_frontend
+npm run build
+```
+
+### Start the backend and frontend
+
+Start the backend and frontend as usual.
+
+#### :material-linux: Linux
+
+```bash
+./start_custom.sh
+```
+
+#### :material-microsoft-windows: Windows
+
+```batch
+start_custom.bat
+```
+
+## Frontend
+
+By default, the frontend listens on `localhost:8080`.
+
+To change the frontend host and port, you need to change the values passed to the `--frontend-host` and `--frontend-port` parameters during startup.
+
+### Changing the host and port for the frontend
+
+#### :material-linux: Linux
+
+Copy `start.sh` to `start_custom.sh` and edit the `--frontend-host` and `--frontend-port` parameters.
+
+```bash
+#!/bin/sh
+. talemate_env/bin/activate
+python src/talemate/server/run.py runserver --host 0.0.0.0 --port 5055 \
+--frontend-host localhost --frontend-port 8082
+```
+
+#### :material-microsoft-windows: Windows
+
+Copy `start.bat` to `start_custom.bat` and edit the `--frontend-host` and `--frontend-port` parameters.
+
+```batch
+start cmd /k "cd talemate_env\Scripts && activate && cd ../../ && python src\talemate\server\run.py runserver --host 0.0.0.0 --port 5055 --frontend-host localhost --frontend-port 8082"
+```
+
+### Start the backend and frontend
+
+Start the backend and frontend as usual.
+
+#### :material-linux: Linux
+
+```bash
+./start_custom.sh
+```
+
+#### :material-microsoft-windows: Windows
+
+```batch
+start_custom.bat
+```
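After restarting with a custom host and port, it can help to confirm that something is actually listening there before debugging the frontend. Below is a minimal sketch using only the standard library. The helper name `is_listening` is hypothetical, and the check only tests that a TCP connection can be opened, not that the Talemate backend is the process answering.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the examples above you would call `is_listening("localhost", 1234)` for the backend or `is_listening("localhost", 8082)` for the frontend.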
+
--- a/docs/getting-started/connect-a-client.md
+++ b/docs/getting-started/connect-a-client.md
@@ -0,0 +1,68 @@
+# Connect a client
+
+Once Talemate is up and running and you are connected, you will see a notification in the corner instructing you to configure a client.
+
+![no clients](/talemate/img/0.26.0/no-clients.png)
+
+Talemate uses client(s) to connect to local or remote AI text generation APIs like koboldcpp, text-generation-webui or OpenAI.
+
+## Add a new client
+
+On the right-hand side, click the **:material-plus-box: ADD CLIENT** button.
+
+![connect a client add client](/talemate/img/0.26.0/connect-a-client-add-client.png)
+
+!!! note "No button?"
+    If there is no button, you may need to toggle the client options by clicking this button:
+
+    ![open clients](/talemate/img/0.26.0/open-clients.png)
+
+The client configuration window will appear. Here you can choose the type of client you want to add.
+
+![connect a client add client modal](/talemate/img/0.26.0/connect-a-client-add-client-modal.png)
+
+## Choose an API / Client Type
+
+We have support for multiple local and remote APIs. You can choose to use one or more of them.
+
+!!! note "Local vs remote"
+    A local API runs on your machine, while a remote API runs on a server somewhere else. 
+
+Select the API you want to use and click through to follow the instructions to configure a client for it:
+
+##### Remote APIs
+
+- [OpenAI](/talemate/user-guide/clients/types/openai/)
+- [Anthropic](/talemate/user-guide/clients/types/anthropic/)
+- [mistral.ai](/talemate/user-guide/clients/types/mistral/)
+- [Cohere](/talemate/user-guide/clients/types/cohere/)
+- [Groq](/talemate/user-guide/clients/types/groq/)
+- [Google Gemini](/talemate/user-guide/clients/types/google/)
+
+##### Local APIs
+
+- [KoboldCpp](/talemate/user-guide/clients/types/koboldcpp/)
+- [Text-Generation-WebUI](/talemate/user-guide/clients/types/text-generation-webui/) 
+- [LMStudio](/talemate/user-guide/clients/types/lmstudio/)
+- [TabbyAPI](/talemate/user-guide/clients/types/tabbyapi/)
+
+##### Unofficial OpenAI API implementations
+
+- [DeepInfra](/talemate/user-guide/clients/types/openai-compatible/#deepinfra)
+- llamacpp with the `api_like_OAI.py` wrapper
+
+## Assign the client to the agents
+
+When you add your first client, Talemate will automatically assign it to all agents. Once the client is configured and assigned, all agents should have a green dot next to them (or a grey dot if the agent is currently disabled).
+
+![Connect a client assigned](/talemate/img/0.26.0/connect-a-client-ready.png)
+
+You can tell the client is assigned to the agent by checking the tag beneath the agent name, which will contain the client name if it is assigned.
+
+![Agent has client assigned](/talemate/img/0.26.0/agent-has-client-assigned.png)
+
+## It's not assigned!
+
+If for some reason the client is not assigned to the agent, you can manually assign it to all agents by clicking the **:material-transit-connection-variant: Assign to all agents** button.
+
+![Connect a client assign to all agents](/talemate/img/0.26.0/connect-a-client-assign-to-all-agents.png)
--- a/docs/getting-started/installation/.pages
+++ b/docs/getting-started/installation/.pages
@@ -0,0 +1,5 @@
+nav:
+    - windows.md
+    - linux.md
+    - docker.md
+    - ...
--- a/docs/getting-started/installation/docker.md
+++ b/docs/getting-started/installation/docker.md
@@ -0,0 +1,22 @@
+!!! example "Experimental"
+    Running Talemate through Docker has not received a lot of testing from me, so please let me know if you encounter any issues.
+    
+    You can do so by creating an issue on the [:material-github: GitHub repository](https://github.com/vegu-ai/talemate)
+
+## Quick install instructions
+
+1. `git clone https://github.com/vegu-ai/talemate.git`
+1. `cd talemate`
+1. Copy the config file
+    1. Linux: `cp config.example.yaml config.yaml`
+    1. Windows: `copy config.example.yaml config.yaml`
+1. If your host has a CUDA compatible Nvidia GPU
+    1. Windows (via PowerShell): `$env:CUDA_AVAILABLE="true"; docker compose up`
+    1. Linux: `CUDA_AVAILABLE=true docker compose up`
+1. If your host does **NOT** have a CUDA compatible Nvidia GPU
+    1. Windows: `docker compose up`
+    1. Linux: `docker compose up`
+1. Navigate your browser to http://localhost:8080
+
+!!! note
+    When connecting to local APIs running on the host machine (e.g. text-generation-webui), you need to use `host.docker.internal` as the hostname.
--- a/docs/getting-started/installation/linux.md
+++ b/docs/getting-started/installation/linux.md
@@ -1,3 +1,27 @@
+
+## Quick install instructions
+
+!!! warning
+    Python 3.12 is currently not supported.
+
+### Dependencies
+
+1. Node.js and npm - see instructions [here](https://nodejs.org/en/download/package-manager/)
+1. Python 3.10 or 3.11 - see instructions [here](https://www.python.org/downloads/)
+
+### Installation
+
+1. `git clone https://github.com/vegu-ai/talemate.git`
+1. `cd talemate`
+1. `source install.sh`
+    - When asked if you want to install PyTorch with CUDA support, choose `y` if you have
+        a CUDA-compatible Nvidia GPU and have installed the necessary drivers.
+1. `source start.sh`
+
+If everything went well, you can proceed to [connect a client](../../connect-a-client).
+
+## Additional Information
+
 ### Setting Up a Virtual Environment

 1. Open a terminal.
@@ -14,7 +38,7 @@

 1. With the virtual environment activated and dependencies installed, you can start the backend server.
 2. Navigate to the `src/talemate/server` directory.
-3. Run the server with `python run.py runserver --host 0.0.0.0 --port 5001`.
+3. Run the server with `python run.py runserver --host 0.0.0.0 --port 5050`.

 ### Running the Frontend

@@ -22,4 +46,4 @@
 2. If you haven't already, install npm dependencies by running `npm install`.
 3. Start the server with `npm run serve`.

-Please note that you may need to set environment variables or modify the host and port as per your setup. You can refer to the `runserver.sh` and `frontend.sh` files for more details.
+Please note that you may need to set environment variables or modify the host and port as per your setup. You can refer to the `runserver.sh` and `frontend.sh` files for more details.
--- a/docs/getting-started/installation/troubleshoot.md
+++ b/docs/getting-started/installation/troubleshoot.md
@@ -0,0 +1,28 @@
+# Common issues
+
+## Windows
+
+### Installation fails with "Microsoft Visual C++" error
+    
+If your installation fails with a notification to upgrade "Microsoft Visual C++", go to https://visualstudio.microsoft.com/visual-cpp-build-tools/, click "Download Build Tools" and run the installer.
+
+-  During installation, make sure you select the C++ development package (upper left corner)
+-  Run `reinstall.bat` inside the Talemate directory
+
+## Docker
+
+### Docker has created `config.yaml` directory
+
+If you do not copy the example config to `config.yaml` before running `docker compose up`, Docker will create a `config.yaml` directory in the root of the project. This will cause the backend to fail to start.
+
+This happens because we mount the config file directly as a Docker volume, and if it does not exist, Docker will create a directory with the same name.
+
+This will eventually be fixed. For now, please make sure to copy the example config file before running the `docker compose up` command.
+
+## General
+
+### Running behind a reverse proxy with SSL
+
+Personally, I have not been able to make this work yet, but it's on my list. The issue stems from some Vue oddities when specifying the base URLs while running in a dev environment. I expect this will be resolved once I start building the project for production.
+
+If you do make it work, please reach out to me so I can update this documentation.
--- a/docs/getting-started/installation/windows.md
+++ b/docs/getting-started/installation/windows.md
@@ -0,0 +1,42 @@
+## Quick install instructions
+
+!!! warning
+    Python 3.12 is currently not supported.
+
+1. Download and install Python 3.10 or Python 3.11 from the [official Python website](https://www.python.org/downloads/windows/).
+    - [Click here for direct link to python 3.11.9 download](https://www.python.org/downloads/release/python-3119/)
+1. Download and install Node.js from the [official Node.js website](https://nodejs.org/en/download/prebuilt-installer). This will also install npm.
+1. Download the Talemate project to your local machine. Download from [the Releases page](https://github.com/vegu-ai/talemate/releases).
+1. Unpack the download and run `install.bat` by double clicking it. This will set up the project on your local machine.
+1. Once the installation is complete, you can start the backend and frontend servers by running `start.bat`.
+1. Navigate your browser to http://localhost:8080
+
+If everything went well, you can proceed to [connect a client](../../connect-a-client).
+
+## Additional Information
+
+### How to Install Python 3.10 or 3.11
+
+1. Visit the official Python website's download page for Windows at [https://www.python.org/downloads/windows/](https://www.python.org/downloads/windows/).
+2. Find the latest version of Python 3.10 or 3.11 and click on one of the download links. (You will likely want the Windows installer (64-bit))
+3. Run the installer file and follow the setup instructions. Make sure to check the box that says "Add Python 3.10 to PATH" (or the equivalent for your version) before you click "Install Now".
+
+### How to Install npm
+
+1. Download Node.js from the official site [https://nodejs.org/en/download/prebuilt-installer](https://nodejs.org/en/download/prebuilt-installer).
+2. Run the installer (the .msi installer is recommended).
+3. Follow the prompts in the installer (Accept the license agreement, click the NEXT button a bunch of times and accept the default installation settings).
+
+### Usage of the Supplied bat Files
+
+#### install.bat
+
+This batch file is used to set up the project on your local machine. It creates a virtual environment, activates it, installs poetry, and uses poetry to install dependencies. It then navigates to the frontend directory and installs the necessary npm packages.
+
+To run this file, simply double click on it or open a command prompt in the same directory and type `install.bat`.
+
+#### start.bat
+
+This batch file is used to start the backend and frontend servers. It opens two command prompts, one for the frontend and one for the backend.
+
+To run this file, simply double click on it or open a command prompt in the same directory and type `start.bat`.
--- a/docs/getting-started/load-a-scene.md
+++ b/docs/getting-started/load-a-scene.md
@@ -0,0 +1,57 @@
+# Load a scenario
+
+Once you've set up a client and assigned it to all the agents, you will be presented with the `Home` screen. From here, you can load Talemate scenarios and upload character cards.
+
+To load the introductory `Infinity Quest` scenario, simply click on its entry in the `Quick Load` section.
+
+![Load infinity quest](/talemate/img/0.26.0/getting-started-load-screen.png)
+
+!!! info "First time may take a moment"
+    When you load a scenario for the first time, Talemate will need to initialize the long-term memory model, which likely means a download. Just be patient and it will be ready soon.
+
+## Interacting with the scenario
+
+After a moment of loading, you will see the scenario's introductory message and be able to send a text interaction.
+
+![Getting started scene 1](/talemate/img/0.26.0/getting-started-scene-1.png)
+
+It's time to send the first message.
+
+Spoken words should be wrapped in `"` and actions in `*`. If you supply only one of the two, Talemate will automatically supply the other.
+
+![Getting started first interaction](/talemate/img/0.26.0/getting-started-first-interaction.png)
+
+Once sent, it's the AI's turn to respond. Depending on the service and model selected, this can take a moment.
+
+![Getting started first ai response](/talemate/img/0.26.0/getting-started-first-ai-response.png)
+
+## Quick overview of UI elements
+
+### Scenario tools
+
+Above the chat input there is a set of tools to help you interact with the scenario.
+
+![Getting started ui element tools](/talemate/img/0.26.0/getting-started-ui-element-tools.png)
+
+These include tools to, for example:
+
+- regenerate the most recent AI response
+- give directions to characters
+- narrate the scene
+- advance time
+- save the current scene state
+- and more ...
+
+A full guide can be found in the [Scenario Tools](/talemate/user-guide/scenario-tools) section of the user guide.
+
+### World state
+
+Shows a summary of the current scene state.
+
+![getting started world state 1](/talemate/img/0.26.0/getting-started-world-state-1.png)
+
+Each item can be expanded for more information.
+
+![getting started world state 2](/talemate/img/0.26.0/getting-started-world-state-2.png)
+
+Find out more about the world state in the [World State](/talemate/user-guide/world-state) section of the user guide.
--- a/docs/img/0.20.0/comfyui-base-workflow.png
+++ b/docs/img/0.20.0/comfyui-base-workflow.png
--- a/docs/img/0.20.0/visual-queue.png
+++ b/docs/img/0.20.0/visual-queue.png
--- a/docs/img/0.20.0/visualize-scene-tools.png
+++ b/docs/img/0.20.0/visualize-scene-tools.png
--- a/docs/img/0.20.0/visualizer-busy.png
+++ b/docs/img/0.20.0/visualizer-busy.png
--- a/docs/img/0.20.0/visualze-new-images.png
+++ b/docs/img/0.20.0/visualze-new-images.png
--- a/docs/img/0.26.0/agent-disabled.png
+++ b/docs/img/0.26.0/agent-disabled.png
--- a/docs/img/0.26.0/agent-enabled.png
+++ b/docs/img/0.26.0/agent-enabled.png
--- a/docs/img/0.26.0/agent-has-client-assigned.png
+++ b/docs/img/0.26.0/agent-has-client-assigned.png
--- a/docs/img/0.26.0/anthropic-settings.png
+++ b/docs/img/0.26.0/anthropic-settings.png
--- a/docs/img/0.26.0/auto-progress-off.png
+++ b/docs/img/0.26.0/auto-progress-off.png
--- a/docs/img/0.26.0/autosave-blocked.png
+++ b/docs/img/0.26.0/autosave-blocked.png
--- a/docs/img/0.26.0/autosave-disabled.png
+++ b/docs/img/0.26.0/autosave-disabled.png
--- a/docs/img/0.26.0/autosave-enabled.png
+++ b/docs/img/0.26.0/autosave-enabled.png
--- a/docs/img/0.26.0/client-anthropic-no-api-key.png
+++ b/docs/img/0.26.0/client-anthropic-no-api-key.png
--- a/docs/img/0.26.0/client-anthropic-ready.png
+++ b/docs/img/0.26.0/client-anthropic-ready.png
--- a/docs/img/0.26.0/client-anthropic.png
+++ b/docs/img/0.26.0/client-anthropic.png
--- a/docs/img/0.26.0/client-assigned-prompt-template.png
+++ b/docs/img/0.26.0/client-assigned-prompt-template.png
--- a/docs/img/0.26.0/client-cohere-no-api-key.png
+++ b/docs/img/0.26.0/client-cohere-no-api-key.png
--- a/docs/img/0.26.0/client-cohere-ready.png
+++ b/docs/img/0.26.0/client-cohere-ready.png
--- a/docs/img/0.26.0/client-cohere.png
+++ b/docs/img/0.26.0/client-cohere.png
--- a/docs/img/0.26.0/client-deepinfra-ready.png
+++ b/docs/img/0.26.0/client-deepinfra-ready.png
--- a/docs/img/0.26.0/client-deepinfra.png
+++ b/docs/img/0.26.0/client-deepinfra.png
--- a/docs/img/0.26.0/client-google-creds-missing.png
+++ b/docs/img/0.26.0/client-google-creds-missing.png
--- a/docs/img/0.26.0/client-google-ready.png
+++ b/docs/img/0.26.0/client-google-ready.png
--- a/docs/img/0.26.0/client-google.png
+++ b/docs/img/0.26.0/client-google.png
--- a/docs/img/0.26.0/client-groq-no-api-key.png
+++ b/docs/img/0.26.0/client-groq-no-api-key.png
--- a/docs/img/0.26.0/client-groq-ready.png
+++ b/docs/img/0.26.0/client-groq-ready.png
--- a/docs/img/0.26.0/client-groq.png
+++ b/docs/img/0.26.0/client-groq.png
--- a/docs/img/0.26.0/client-hibernate-1.png
+++ b/docs/img/0.26.0/client-hibernate-1.png
--- a/docs/img/0.26.0/client-hibernate-2.png
+++ b/docs/img/0.26.0/client-hibernate-2.png
--- a/docs/img/0.26.0/client-koboldcpp-could-not-connect.png
+++ b/docs/img/0.26.0/client-koboldcpp-could-not-connect.png
--- a/docs/img/0.26.0/client-koboldcpp-ready.png
+++ b/docs/img/0.26.0/client-koboldcpp-ready.png
--- a/docs/img/0.26.0/client-koboldcpp.png
+++ b/docs/img/0.26.0/client-koboldcpp.png
--- a/docs/img/0.26.0/client-lmstudio-could-not-connect.png
+++ b/docs/img/0.26.0/client-lmstudio-could-not-connect.png
--- a/docs/img/0.26.0/client-lmstudio-ready.png
+++ b/docs/img/0.26.0/client-lmstudio-ready.png
--- a/docs/img/0.26.0/client-lmstudio.png
+++ b/docs/img/0.26.0/client-lmstudio.png
--- a/docs/img/0.26.0/client-mistral-no-api-key.png
+++ b/docs/img/0.26.0/client-mistral-no-api-key.png
--- a/docs/img/0.26.0/client-mistral-ready.png
+++ b/docs/img/0.26.0/client-mistral-ready.png
--- a/docs/img/0.26.0/client-mistral.png
+++ b/docs/img/0.26.0/client-mistral.png
--- a/docs/img/0.26.0/client-ooba-could-not-connect.png
+++ b/docs/img/0.26.0/client-ooba-could-not-connect.png
--- a/docs/img/0.26.0/client-ooba-no-model-loaded.png
+++ b/docs/img/0.26.0/client-ooba-no-model-loaded.png
--- a/docs/img/0.26.0/client-ooba-ready.png
+++ b/docs/img/0.26.0/client-ooba-ready.png
--- a/docs/img/0.26.0/client-ooba.png
+++ b/docs/img/0.26.0/client-ooba.png
--- a/docs/img/0.26.0/client-openai-no-api-key.png
+++ b/docs/img/0.26.0/client-openai-no-api-key.png
--- a/docs/img/0.26.0/client-openai-ready.png
+++ b/docs/img/0.26.0/client-openai-ready.png
--- a/docs/img/0.26.0/client-openai.png
+++ b/docs/img/0.26.0/client-openai.png
--- a/docs/img/0.26.0/client-tabbyapi-could-not-connect.png
+++ b/docs/img/0.26.0/client-tabbyapi-could-not-connect.png
--- a/docs/img/0.26.0/client-tabbyapi-ready.png
+++ b/docs/img/0.26.0/client-tabbyapi-ready.png
--- a/docs/img/0.26.0/client-tabbyapi.png
+++ b/docs/img/0.26.0/client-tabbyapi.png
--- a/docs/img/0.26.0/client-unknown-prompt-template-modal.png
+++ b/docs/img/0.26.0/client-unknown-prompt-template-modal.png
--- a/docs/img/0.26.0/client-unknown-prompt-template.png
+++ b/docs/img/0.26.0/client-unknown-prompt-template.png
--- a/docs/img/0.26.0/cohere-settings.png
+++ b/docs/img/0.26.0/cohere-settings.png
--- a/docs/img/0.26.0/connect-a-client-add-client-modal.png
+++ b/docs/img/0.26.0/connect-a-client-add-client-modal.png
--- a/docs/img/0.26.0/connect-a-client-add-client.png
+++ b/docs/img/0.26.0/connect-a-client-add-client.png
--- a/docs/img/0.26.0/connect-a-client-assign-to-all-agents.png
+++ b/docs/img/0.26.0/connect-a-client-assign-to-all-agents.png
--- a/docs/img/0.26.0/connect-a-client-ready.png
+++ b/docs/img/0.26.0/connect-a-client-ready.png
--- a/docs/img/0.26.0/create-new-scene-test.png
+++ b/docs/img/0.26.0/create-new-scene-test.png
--- a/docs/img/0.26.0/create-new-scene.png
+++ b/docs/img/0.26.0/create-new-scene.png
--- a/docs/img/0.26.0/elevenlabs-ready.png
+++ b/docs/img/0.26.0/elevenlabs-ready.png
--- a/docs/img/0.26.0/elevenlabs-settings.png
+++ b/docs/img/0.26.0/elevenlabs-settings.png
--- a/docs/img/0.26.0/getting-started-first-ai-response.png
+++ b/docs/img/0.26.0/getting-started-first-ai-response.png
--- a/docs/img/0.26.0/getting-started-first-interaction.png
+++ b/docs/img/0.26.0/getting-started-first-interaction.png
--- a/docs/img/0.26.0/getting-started-load-screen.png
+++ b/docs/img/0.26.0/getting-started-load-screen.png
--- a/docs/img/0.26.0/getting-started-scene-1.png
+++ b/docs/img/0.26.0/getting-started-scene-1.png
--- a/docs/img/0.26.0/getting-started-ui-element-tools.png
+++ b/docs/img/0.26.0/getting-started-ui-element-tools.png
--- a/docs/img/0.26.0/getting-started-world-state-1.png
+++ b/docs/img/0.26.0/getting-started-world-state-1.png
--- a/docs/img/0.26.0/getting-started-world-state-2.png
+++ b/docs/img/0.26.0/getting-started-world-state-2.png
--- a/docs/img/0.26.0/google-settings.png
+++ b/docs/img/0.26.0/google-settings.png
--- a/docs/img/0.26.0/groq-settings.png
+++ b/docs/img/0.26.0/groq-settings.png
--- a/docs/img/0.26.0/inference-presets-1.png
+++ b/docs/img/0.26.0/inference-presets-1.png
--- a/docs/img/0.26.0/interacting-input-act-as-character.png
+++ b/docs/img/0.26.0/interacting-input-act-as-character.png
--- a/docs/img/0.26.0/interacting-input-act-as-narrator.png
+++ b/docs/img/0.26.0/interacting-input-act-as-narrator.png
--- a/docs/img/0.26.0/interacting-input-request.png
+++ b/docs/img/0.26.0/interacting-input-request.png
--- a/docs/img/0.26.0/mistral-settings.png
+++ b/docs/img/0.26.0/mistral-settings.png
--- a/docs/img/0.26.0/no-api-token.png
+++ b/docs/img/0.26.0/no-api-token.png