Mirror of https://github.com/vegu-ai/talemate.git (synced 2025-12-24 15:39:34 +01:00)

Compare commits (11 commits)
| SHA1 |
|---|
| ddfbd6891b |
| 143dd47e02 |
| cc7cb773d1 |
| 02c88f75a1 |
| 419371e0fb |
| 6e847bf283 |
| ceedd3019f |
| a28cf2a029 |
| 60cb271e30 |
| 1874234d2c |
| ef99539e69 |
Dockerfile.frontend
@@ -1,13 +1,19 @@
# Use an official node runtime as a parent image
FROM node:20

# Make sure we are in a development environment (this isn't a production ready Dockerfile)
ENV NODE_ENV=development

# Echo that this isn't a production ready Dockerfile
RUN echo "This Dockerfile is not production ready. It is intended for development purposes only."

# Set the working directory in the container
WORKDIR /app

# Copy the frontend directory contents into the container at /app
COPY ./talemate_frontend /app

# Install any needed packages specified in package.json
# Install all dependencies
RUN npm install

# Make port 8080 available to the world outside this container
README.md (37 lines changed)
@@ -16,6 +16,7 @@ Supported APIs:
- [Google Gemini](https://console.cloud.google.com/)

Supported self-hosted APIs:
- [KoboldCpp](https://koboldai.org/cpp) ([Local](https://koboldai.org/cpp), [Runpod](https://koboldai.org/runpodcpp), [VastAI](https://koboldai.org/vastcpp), also includes image gen support)
- [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) (local or with runpod support)
- [LMStudio](https://lmstudio.ai/)

@@ -56,6 +57,7 @@ Please read the documents in the `docs` folder for more advanced configuration a
- [Ready to go](#ready-to-go)
- [Load the introductory scenario "Infinity Quest"](#load-the-introductory-scenario-infinity-quest)
- [Loading character cards](#loading-character-cards)
- [Configure for hosting](#configure-for-hosting)
- [Text-to-Speech (TTS)](docs/tts.md)
- [Visual Generation](docs/visual.md)
- [ChromaDB (long term memory) configuration](docs/chromadb.md)
@@ -93,16 +95,19 @@ There is also a [troubleshooting guide](docs/troubleshoot.md) that might help.

### Docker

:warning: Some users currently experience issues with missing dependencies inside the docker container, issue tracked at [#114](https://github.com/vegu-ai/talemate/issues/114)

1. `git clone https://github.com/vegu-ai/talemate.git`
1. `cd talemate`
1. `docker-compose up`
1. `cp config.example.yaml config.yaml`
1. `docker compose up`
1. Navigate your browser to http://localhost:8080

:warning: When connecting local APIs running on the host machine (e.g. text-generation-webui), you need to use `host.docker.internal` as the hostname.
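A quick way to sanity-check that connection from inside the container is a small Python probe. This is only a sketch: the port 5000 and the `/v1/models` route are assumptions based on a typical text-generation-webui setup, so adjust both to your configuration.

```python
# Hypothetical reachability check, run from inside the talemate container.
# Port 5000 and the /v1/models path are assumptions; adjust for your setup.
import httpx

url = "http://host.docker.internal:5000/v1/models"

try:
    response = httpx.get(url, timeout=5)
    print(response.status_code, response.text[:200])
except httpx.HTTPError as exc:
    print(f"Could not reach {url}: {exc}")
```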
#### To shut down the Docker container

Just closing the terminal window will not stop the Docker container. You need to run `docker-compose down` to stop the container.
Just closing the terminal window will not stop the Docker container. You need to run `docker compose down` to stop the container.

#### How to install Docker
@@ -168,19 +173,9 @@ In the case for `bartowski_Nous-Hermes-2-Mistral-7B-DPO-exl2_8_0` that is `ChatM

### Recommended Models

As of 2024.03.07 my personal regular drivers (the ones i test with) are:
Any of the top models in any of the size classes here should work well (i wouldn't recommend going lower than 7B):

- Kunoichi-7B
- sparsetral-16x7B
- Nous-Hermes-2-Mistral-7B-DPO
- brucethemoose_Yi-34B-200K-RPMerge
- dolphin-2.7-mixtral-8x7b
- rAIfle_Verdict-8x7B
- Mixtral-8x7B-instruct

That said, any of the top models in any of the size classes here should work well (i wouldn't recommend going lower than 7B):

https://www.reddit.com/r/LocalLLaMA/comments/18yp9u4/llm_comparisontest_api_edition_gpt4_vs_gemini_vs/
[https://oobabooga.github.io/benchmark.html](https://oobabooga.github.io/benchmark.html)

## DeepInfra via OpenAI Compatible client
@@ -253,3 +248,17 @@ Expand the "Load" menu in the top left corner and either click on "Upload a char

Once a character is uploaded, talemate may take a moment, as it needs to convert the card to the talemate format and will also run additional LLM prompts to generate character attributes and world state.

Make sure you save the scene after the character is loaded, as it can then be loaded as a normal talemate scenario in the future.

## Configure for hosting

By default talemate is configured to run locally. If you want to host it behind a reverse proxy or on a server, you will need to create some environment variables in the `talemate_frontend/.env.development.local` file.

Start by copying `talemate_frontend/example.env.development.local` to `talemate_frontend/.env.development.local`.

Then open the file and edit the `ALLOWED_HOSTS` and `VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL` variables.

```sh
ALLOWED_HOSTS=example.com
# wss if behind ssl, ws if not
VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL=wss://example.com:5050
```
docker-compose.yml
@@ -23,5 +23,5 @@ services:
      dockerfile: Dockerfile.frontend
    ports:
      - "8080:8080"
    volumes:
      - ./talemate_frontend:/app
    #volumes:
    #  - ./talemate_frontend:/app
poetry.lock (generated, 1098 lines): file diff suppressed because it is too large.
pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "poetry.masonry.api"

[tool.poetry]
name = "talemate"
version = "0.25.0"
version = "0.25.5"
description = "AI-backed roleplay and narrative tools"
authors = ["FinalWombat"]
license = "GNU Affero General Public License v3.0"

@@ -51,7 +51,8 @@ chromadb = ">=0.4.17,<1"
InstructorEmbedding = "^1.0.1"
torch = ">=2.1.0"
torchaudio = ">=2.3.0"
sentence-transformers="^2.2.2"
# locked for instructor embeddings
sentence-transformers="==2.2.2"

[tool.poetry.dev-dependencies]
pytest = "^6.2"
@@ -2,4 +2,4 @@ from .agents import Agent
from .client import TextGeneratorWebuiClient
from .tale_mate import *

VERSION = "0.25.0"
VERSION = "0.25.5"
@@ -221,6 +221,9 @@ class Agent(ABC):
|
||||
if callback:
|
||||
await callback()
|
||||
|
||||
async def setup_check(self):
|
||||
return False
|
||||
|
||||
async def ready_check(self, task: asyncio.Task = None):
|
||||
self.ready_check_error = None
|
||||
if task:
|
||||
|
||||
@@ -668,7 +668,9 @@ class ConversationAgent(Agent):
|
||||
|
||||
total_result = util.handle_endofline_special_delimiter(total_result)
|
||||
|
||||
if total_result.startswith(":\n"):
|
||||
log.info("conversation agent", total_result=total_result)
|
||||
|
||||
if total_result.startswith(":\n") or total_result.startswith(": "):
|
||||
total_result = total_result[2:]
|
||||
|
||||
# movie script format
|
||||
|
||||
@@ -80,6 +80,11 @@ class VisualBase(Agent):
|
||||
),
|
||||
},
|
||||
),
|
||||
"automatic_setup": AgentAction(
|
||||
enabled=True,
|
||||
label="Automatic Setup",
|
||||
description="Automatically setup the visual agent if the selected client has an implementation of the selected backend. (Like the KoboldCpp Automatic1111 api)",
|
||||
),
|
||||
"automatic_generation": AgentAction(
|
||||
enabled=False,
|
||||
label="Automatic Generation",
|
||||
@@ -187,8 +192,10 @@ class VisualBase(Agent):
|
||||
prev_ready = self.backend_ready
|
||||
self.backend_ready = False
|
||||
self.ready_check_error = str(error)
|
||||
await self.setup_check()
|
||||
if prev_ready:
|
||||
await self.emit_status()
|
||||
|
||||
|
||||
async def ready_check(self):
|
||||
if not self.enabled:
|
||||
@@ -198,6 +205,15 @@ class VisualBase(Agent):
|
||||
task = asyncio.create_task(fn())
|
||||
await super().ready_check(task)
|
||||
|
||||
async def setup_check(self):
|
||||
|
||||
if not self.actions["automatic_setup"].enabled:
|
||||
return
|
||||
|
||||
backend = self.backend
|
||||
if self.client and hasattr(self.client, f"visual_{backend.lower()}_setup"):
|
||||
await getattr(self.client, f"visual_{backend.lower()}_setup")(self)
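As a rough, self-contained sketch of the dispatch used in `setup_check` above (the class names below are made up; the real implementations live in `talemate.agents.visual` and `talemate.client.koboldcpp`), the agent looks for a `visual_<backend>_setup` coroutine on its client and calls it when present:

```python
# Simplified illustration of the getattr-based setup dispatch.
import asyncio


class FakeKoboldCppClient:
    async def visual_automatic1111_setup(self, visual_agent) -> bool:
        print("configuring visual agent against the client's A1111-compatible API")
        return True


class FakeVisualAgent:
    backend = "AUTOMATIC1111"

    def __init__(self, client):
        self.client = client

    async def setup_check(self):
        fn_name = f"visual_{self.backend.lower()}_setup"
        if self.client and hasattr(self.client, fn_name):
            await getattr(self.client, fn_name)(self)


asyncio.run(FakeVisualAgent(FakeKoboldCppClient()).setup_check())
```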
|
||||
|
||||
async def apply_config(self, *args, **kwargs):
|
||||
|
||||
try:
|
||||
|
||||
@@ -5,9 +5,10 @@ from talemate.client.anthropic import AnthropicClient
|
||||
from talemate.client.cohere import CohereClient
|
||||
from talemate.client.google import GoogleClient
|
||||
from talemate.client.groq import GroqClient
|
||||
from talemate.client.koboldcpp import KoboldCppClient
|
||||
from talemate.client.lmstudio import LMStudioClient
|
||||
from talemate.client.mistral import MistralAIClient
|
||||
from talemate.client.openai import OpenAIClient
|
||||
from talemate.client.openai_compat import OpenAICompatibleClient
|
||||
from talemate.client.registry import CLIENT_CLASSES, get_client_class, register
|
||||
from talemate.client.textgenwebui import TextGeneratorWebuiClient
|
||||
from talemate.client.textgenwebui import TextGeneratorWebuiClient
|
||||
@@ -122,6 +122,10 @@ class ClientBase:
|
||||
"""
|
||||
return self.Meta().requires_prompt_template
|
||||
|
||||
@property
|
||||
def max_tokens_param_name(self):
|
||||
return "max_tokens"
|
||||
|
||||
def set_client(self, **kwargs):
|
||||
self.client = AsyncOpenAI(base_url=self.api_url, api_key="sk-1111")
|
||||
|
||||
@@ -410,7 +414,6 @@ class ClientBase:
|
||||
self.log.warning("client status error", e=e, client=self.name)
|
||||
self.model_name = None
|
||||
self.connected = False
|
||||
self.toggle_disabled_if_remote()
|
||||
self.emit_status()
|
||||
return
|
||||
|
||||
@@ -626,7 +629,7 @@ class ClientBase:
|
||||
is_repetition, similarity_score, matched_line = util.similarity_score(
|
||||
response, finalized_prompt.split("\n"), similarity_threshold=80
|
||||
)
|
||||
|
||||
|
||||
if not is_repetition:
|
||||
# not a repetition, return the response
|
||||
|
||||
@@ -660,7 +663,7 @@ class ClientBase:
|
||||
|
||||
# then we pad the max_tokens by the pad_max_tokens amount
|
||||
|
||||
prompt_param["max_tokens"] += pad_max_tokens
|
||||
prompt_param[self.max_tokens_param_name] += pad_max_tokens
|
||||
|
||||
# send the prompt again
|
||||
# we use the repetition_adjustment method to further encourage
|
||||
@@ -682,7 +685,7 @@ class ClientBase:
|
||||
|
||||
# a lot of the times the response will now contain the repetition + something new
|
||||
# so we dedupe the response to remove the repetition on sentences level
|
||||
|
||||
|
||||
response = util.dedupe_sentences(
|
||||
response, matched_line, similarity_threshold=85, debug=True
|
||||
)
|
||||
@@ -752,3 +755,29 @@ class ClientBase:
|
||||
new_lines.append(line)
|
||||
|
||||
return "\n".join(new_lines)
|
||||
|
||||
|
||||
    def process_response_for_indirect_coercion(self, prompt: str, response: str) -> str:
        """
        A lot of remote APIs don't let us control the prompt template and we cannot directly
        append the beginning of the desired response to the prompt.

        With indirect coercion we tell the LLM what the beginning of the response should be
        and then hopefully it will adhere to it and we can strip it off the actual response.
        """

        _, right = prompt.split("\nStart your response with: ")
        expected_response = right.strip()
        if expected_response and expected_response.startswith("{"):
            if response.startswith("```json") and response.endswith("```"):
                response = response[7:-3].strip()

        if right and response.startswith(right):
            response = response[len(right) :].strip()

        return response
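To illustrate what the method above strips, here is a standalone sketch outside the client class, with made-up prompt text: when the prompt ends with a "Start your response with:" hint and the model echoes that prefix, the prefix is removed from the returned text.

```python
# Standalone sketch of the indirect-coercion stripping above.
# Assumes the prompt actually contains the "Start your response with: " hint.
def strip_indirect_coercion(prompt: str, response: str) -> str:
    _, right = prompt.split("\nStart your response with: ")
    expected = right.strip()
    # JSON responses are sometimes wrapped in a markdown fence; unwrap first
    if expected.startswith("{") and response.startswith("```json") and response.endswith("```"):
        response = response[7:-3].strip()
    if right and response.startswith(right):
        response = response[len(right):].strip()
    return response


prompt = "Describe the scene.\nStart your response with: The tavern"
response = "The tavern smells of spilled ale and woodsmoke."
print(strip_indirect_coercion(prompt, response))  # "smells of spilled ale and woodsmoke."
```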
|
||||
|
||||
@@ -1,16 +0,0 @@
|
||||
import asyncio
|
||||
import json
|
||||
import logging
|
||||
import random
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Callable, Union
|
||||
|
||||
import requests
|
||||
|
||||
import talemate.client.system_prompts as system_prompts
|
||||
import talemate.util as util
|
||||
from talemate.client.registry import register
|
||||
from talemate.client.textgenwebui import RESTTaleMateClient
|
||||
from talemate.emit import Emission, emit
|
||||
|
||||
# NOT IMPLEMENTED AT THIS POINT
|
||||
src/talemate/client/koboldcpp.py (new file, 306 lines)
@@ -0,0 +1,306 @@
|
||||
import random
|
||||
import re
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
# import urljoin
|
||||
from urllib.parse import urljoin, urlparse
|
||||
import httpx
|
||||
import structlog
|
||||
|
||||
from talemate.client.base import STOPPING_STRINGS, ClientBase, Defaults, ExtraField
|
||||
from talemate.client.registry import register
|
||||
import talemate.util as util
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from talemate.agents.visual import VisualBase
|
||||
|
||||
log = structlog.get_logger("talemate.client.koboldcpp")
|
||||
|
||||
|
||||
class KoboldCppClientDefaults(Defaults):
|
||||
api_url: str = "http://localhost:5001"
|
||||
api_key: str = ""
|
||||
|
||||
|
||||
@register()
|
||||
class KoboldCppClient(ClientBase):
|
||||
auto_determine_prompt_template: bool = True
|
||||
client_type = "koboldcpp"
|
||||
|
||||
class Meta(ClientBase.Meta):
|
||||
name_prefix: str = "KoboldCpp"
|
||||
title: str = "KoboldCpp"
|
||||
enable_api_auth: bool = True
|
||||
defaults: KoboldCppClientDefaults = KoboldCppClientDefaults()
|
||||
|
||||
@property
|
||||
def request_headers(self):
|
||||
headers = {}
|
||||
headers["Content-Type"] = "application/json"
|
||||
if self.api_key:
|
||||
headers["Authorization"] = f"Bearer {self.api_key}"
|
||||
return headers
|
||||
|
||||
@property
|
||||
def url(self) -> str:
|
||||
parts = urlparse(self.api_url)
|
||||
return f"{parts.scheme}://{parts.netloc}"
|
||||
|
||||
@property
|
||||
def is_openai(self) -> bool:
|
||||
"""
|
||||
kcpp has two apis
|
||||
|
||||
open-ai implementation at /v1
|
||||
their own implementation at /api/v1
|
||||
"""
|
||||
return "/api/v1" not in self.api_url
|
||||
|
||||
@property
|
||||
def api_url_for_model(self) -> str:
|
||||
if self.is_openai:
|
||||
# join /model to url
|
||||
return urljoin(self.api_url, "models")
|
||||
else:
|
||||
# join /models to url
|
||||
return urljoin(self.api_url, "model")
|
||||
|
||||
@property
|
||||
def api_url_for_generation(self) -> str:
|
||||
if self.is_openai:
|
||||
# join /v1/completions
|
||||
return urljoin(self.api_url, "completions")
|
||||
else:
|
||||
# join /api/v1/generate
|
||||
return urljoin(self.api_url, "generate")
|
||||
|
||||
@property
|
||||
def max_tokens_param_name(self):
|
||||
if self.is_openai:
|
||||
return "max_tokens"
|
||||
else:
|
||||
return "max_length"
|
||||
|
||||
def api_endpoint_specified(self, url: str) -> bool:
|
||||
return "/v1" in self.api_url
|
||||
|
||||
def ensure_api_endpoint_specified(self):
|
||||
if not self.api_endpoint_specified(self.api_url):
|
||||
# url doesn't specify the api endpoint
|
||||
# use the koboldcpp united api
|
||||
self.api_url = urljoin(self.api_url.rstrip("/") + "/", "/api/v1/")
|
||||
if not self.api_url.endswith("/"):
|
||||
self.api_url += "/"
|
||||
|
||||
def __init__(self, **kwargs):
|
||||
self.api_key = kwargs.pop("api_key", "")
|
||||
super().__init__(**kwargs)
|
||||
self.ensure_api_endpoint_specified()
|
||||
|
||||
def tune_prompt_parameters(self, parameters: dict, kind: str):
|
||||
super().tune_prompt_parameters(parameters, kind)
|
||||
if not self.is_openai:
|
||||
# adjustments for united api
|
||||
parameters["max_length"] = parameters.pop("max_tokens")
|
||||
parameters["max_context_length"] = self.max_token_length
|
||||
if "repetition_penalty_range" in parameters:
|
||||
parameters["rep_pen_range"] = parameters.pop("repetition_penalty_range")
|
||||
if "repetition_penalty" in parameters:
|
||||
parameters["rep_pen"] = parameters.pop("repetition_penalty")
|
||||
if parameters.get("stop_sequence"):
|
||||
parameters["stop_sequence"] = parameters.pop("stopping_strings")
|
||||
|
||||
if parameters.get("extra_stopping_strings"):
|
||||
if "stop_sequence" in parameters:
|
||||
parameters["stop_sequence"] += parameters.pop("extra_stopping_strings")
|
||||
else:
|
||||
parameters["stop_sequence"] = parameters.pop("extra_stopping_strings")
|
||||
|
||||
|
||||
allowed_params = [
|
||||
"max_length",
|
||||
"max_context_length",
|
||||
"rep_pen",
|
||||
"rep_pen_range",
|
||||
"top_p",
|
||||
"top_k",
|
||||
"temperature",
|
||||
"stop_sequence",
|
||||
]
|
||||
else:
|
||||
allowed_params = ["max_tokens", "presence_penalty", "top_p", "temperature"]
|
||||
|
||||
# drop unsupported params
|
||||
for param in list(parameters.keys()):
|
||||
if param not in allowed_params:
|
||||
del parameters[param]
|
||||
|
||||
def set_client(self, **kwargs):
|
||||
self.api_key = kwargs.get("api_key", self.api_key)
|
||||
self.ensure_api_endpoint_specified()
|
||||
|
||||
|
||||
|
||||
|
||||
async def get_model_name(self):
|
||||
self.ensure_api_endpoint_specified()
|
||||
async with httpx.AsyncClient() as client:
|
||||
response = await client.get(
|
||||
self.api_url_for_model,
|
||||
timeout=2,
|
||||
headers=self.request_headers,
|
||||
)
|
||||
|
||||
if response.status_code == 404:
|
||||
raise KeyError(f"Could not find model info at: {self.api_url_for_model}")
|
||||
|
||||
response_data = response.json()
|
||||
if self.is_openai:
|
||||
# {"object": "list", "data": [{"id": "koboldcpp/dolphin-2.8-mistral-7b", "object": "model", "created": 1, "owned_by": "koboldcpp", "permission": [], "root": "koboldcpp"}]}
|
||||
model_name = response_data.get("data")[0].get("id")
|
||||
else:
|
||||
# {"result": "koboldcpp/dolphin-2.8-mistral-7b"}
|
||||
model_name = response_data.get("result")
|
||||
|
||||
# split by "/" and take last
|
||||
if model_name:
|
||||
model_name = model_name.split("/")[-1]
|
||||
|
||||
return model_name
|
||||
|
||||
async def tokencount(self, content:str) -> int:
|
||||
"""
|
||||
KoboldCpp has a tokencount endpoint we can use to count tokens
|
||||
for the prompt and response
|
||||
|
||||
If the endpoint is not available, we will use the default token count estimate
|
||||
"""
|
||||
|
||||
# extract scheme and host from api url
|
||||
|
||||
parts = urlparse(self.api_url)
|
||||
|
||||
url_tokencount = f"{parts.scheme}://{parts.netloc}/api/extra/tokencount"
|
||||
|
||||
async with httpx.AsyncClient() as client:
|
||||
response = await client.post(
|
||||
url_tokencount,
|
||||
json={"prompt":content},
|
||||
timeout=None,
|
||||
headers=self.request_headers,
|
||||
)
|
||||
|
||||
if response.status_code == 404:
|
||||
# kobold united doesn't have tokencount endpoint
|
||||
return util.count_tokens(content)
|
||||
|
||||
tokencount = len(response.json().get("ids",[]))
|
||||
return tokencount
|
||||
|
||||
async def generate(self, prompt: str, parameters: dict, kind: str):
|
||||
"""
|
||||
Generates text from the given prompt and parameters.
|
||||
"""
|
||||
|
||||
parameters["prompt"] = prompt.strip(" ")
|
||||
|
||||
self._returned_prompt_tokens = await self.tokencount(parameters["prompt"] )
|
||||
|
||||
async with httpx.AsyncClient() as client:
|
||||
response = await client.post(
|
||||
self.api_url_for_generation,
|
||||
json=parameters,
|
||||
timeout=None,
|
||||
headers=self.request_headers,
|
||||
)
|
||||
response_data = response.json()
|
||||
try:
|
||||
if self.is_openai:
|
||||
response_text = response_data["choices"][0]["text"]
|
||||
else:
|
||||
response_text = response_data["results"][0]["text"]
|
||||
except (TypeError, KeyError) as exc:
|
||||
log.error("Failed to generate text", exc=exc, response_data=response_data, response_status=response.status_code)
|
||||
response_text = ""
|
||||
|
||||
self._returned_response_tokens = await self.tokencount(response_text)
|
||||
return response_text
|
||||
|
||||
|
||||
def jiggle_randomness(self, prompt_config: dict, offset: float = 0.3) -> dict:
|
||||
"""
|
||||
adjusts temperature and repetition_penalty
|
||||
by random values using the base value as a center
|
||||
"""
|
||||
|
||||
temp = prompt_config["temperature"]
|
||||
|
||||
if "rep_pen" in prompt_config:
|
||||
rep_pen_key = "rep_pen"
|
||||
elif "presence_penalty" in prompt_config:
|
||||
rep_pen_key = "presence_penalty"
|
||||
else:
|
||||
rep_pen_key = "repetition_penalty"
|
||||
|
||||
min_offset = offset * 0.3
|
||||
|
||||
prompt_config["temperature"] = random.uniform(temp + min_offset, temp + offset)
|
||||
try:
|
||||
if rep_pen_key == "presence_penalty":
|
||||
presence_penalty = prompt_config["presence_penalty"]
|
||||
prompt_config["presence_penalty"] = round(random.uniform(
|
||||
presence_penalty + 0.1, presence_penalty + offset
|
||||
),1)
|
||||
else:
|
||||
rep_pen = prompt_config[rep_pen_key]
|
||||
prompt_config[rep_pen_key] = random.uniform(
|
||||
rep_pen + min_offset * 0.3, rep_pen + offset * 0.3
|
||||
)
|
||||
except KeyError:
|
||||
pass
|
||||
|
||||
def reconfigure(self, **kwargs):
|
||||
if "api_key" in kwargs:
|
||||
self.api_key = kwargs.pop("api_key")
|
||||
|
||||
super().reconfigure(**kwargs)
|
||||
|
||||
|
||||
async def visual_automatic1111_setup(self, visual_agent:"VisualBase") -> bool:
|
||||
|
||||
"""
|
||||
Automatically configure the visual agent for automatic1111
|
||||
if the koboldcpp server has a SD model available
|
||||
"""
|
||||
|
||||
if not self.connected:
|
||||
return False
|
||||
|
||||
sd_models_url = urljoin(self.url, "/sdapi/v1/sd-models")
|
||||
|
||||
async with httpx.AsyncClient() as client:
|
||||
|
||||
try:
|
||||
response = await client.get(
|
||||
url=sd_models_url, timeout=2
|
||||
)
|
||||
except Exception as exc:
|
||||
log.error(f"Failed to fetch sd models from {sd_models_url}", exc=exc)
|
||||
return False
|
||||
|
||||
if response.status_code != 200:
|
||||
return False
|
||||
|
||||
response_data = response.json()
|
||||
|
||||
sd_model = response_data[0].get("model_name") if response_data else None
|
||||
|
||||
if not sd_model:
|
||||
return False
|
||||
|
||||
log.info("automatic1111_setup", sd_model=sd_model)
|
||||
|
||||
visual_agent.actions["automatic1111"].config["api_url"].value = self.url
|
||||
visual_agent.is_enabled = True
|
||||
return True
|
||||
|
||||
@@ -25,12 +25,16 @@ SUPPORTED_MODELS = [
|
||||
"mistral-large-latest",
|
||||
]
|
||||
|
||||
JSON_OBJECT_RESPONSE_MODELS = SUPPORTED_MODELS
|
||||
|
||||
JSON_OBJECT_RESPONSE_MODELS = [
|
||||
"open-mixtral-8x22b",
|
||||
"mistral-small-latest",
|
||||
"mistral-medium-latest",
|
||||
"mistral-large-latest",
|
||||
]
|
||||
|
||||
class Defaults(pydantic.BaseModel):
|
||||
max_token_length: int = 16384
|
||||
model: str = "open-mixtral-8x7b"
|
||||
model: str = "open-mixtral-8x22b"
|
||||
|
||||
|
||||
@register()
|
||||
@@ -53,7 +57,7 @@ class MistralAIClient(ClientBase):
|
||||
requires_prompt_template: bool = False
|
||||
defaults: Defaults = Defaults()
|
||||
|
||||
def __init__(self, model="open-mixtral-8x7b", **kwargs):
|
||||
def __init__(self, model="open-mixtral-8x22b", **kwargs):
|
||||
self.model_name = model
|
||||
self.api_key_status = None
|
||||
self.config = load_config()
|
||||
@@ -115,7 +119,7 @@ class MistralAIClient(ClientBase):
|
||||
return
|
||||
|
||||
if not self.model_name:
|
||||
self.model_name = "open-mixtral-8x7b"
|
||||
self.model_name = "open-mixtral-8x22b"
|
||||
|
||||
if max_token_length and not isinstance(max_token_length, int):
|
||||
max_token_length = int(max_token_length)
|
||||
|
||||
@@ -136,13 +136,15 @@ class ModelPrompt:
|
||||
"""
|
||||
|
||||
matches = []
|
||||
|
||||
cleaned_model_name = model_name.replace("/", "__")
|
||||
|
||||
# Iterate over all templates in the loader's directory
|
||||
for template_name in self.env.list_templates():
|
||||
# strip extension
|
||||
template_name_match = os.path.splitext(template_name)[0]
|
||||
# Check if the model name is in the template filename
|
||||
if template_name_match.lower() in model_name.lower():
|
||||
if template_name_match.lower() in cleaned_model_name.lower():
|
||||
matches.append(template_name)
|
||||
|
||||
# If there are no matches, return None
|
||||
@@ -163,16 +165,17 @@ class ModelPrompt:
|
||||
"""
|
||||
|
||||
template_name = template_name.split(".jinja2")[0]
|
||||
|
||||
cleaned_model_name = model_name.replace("/", "__")
|
||||
|
||||
shutil.copyfile(
|
||||
os.path.join(STD_TEMPLATE_PATH, template_name + ".jinja2"),
|
||||
os.path.join(USER_TEMPLATE_PATH, model_name + ".jinja2"),
|
||||
os.path.join(USER_TEMPLATE_PATH, cleaned_model_name + ".jinja2"),
|
||||
)
|
||||
|
||||
return os.path.join(USER_TEMPLATE_PATH, model_name + ".jinja2")
|
||||
return os.path.join(USER_TEMPLATE_PATH, cleaned_model_name + ".jinja2")
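The `/` to `__` cleaning exists because model names reported by backends can contain slashes, which are not valid in template filenames. A small self-contained sketch of the effect (the model name is just an example):

```python
# Sketch: mapping a backend-reported model name to a template filename.
import os

model_name = "koboldcpp/dolphin-2.8-mistral-7b"     # example name from an API
cleaned_model_name = model_name.replace("/", "__")  # safe for filenames

print(cleaned_model_name + ".jinja2")  # koboldcpp__dolphin-2.8-mistral-7b.jinja2

# Matching also works in the other direction: a template named "Mistral.jinja2"
# matches because "mistral" appears in the cleaned model name.
template_name_match = os.path.splitext("Mistral.jinja2")[0]
print(template_name_match.lower() in cleaned_model_name.lower())  # True
```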
|
||||
|
||||
def query_hf_for_prompt_template_suggestion(self, model_name: str):
|
||||
print("query_hf_for_prompt_template_suggestion", model_name)
|
||||
api = huggingface_hub.HfApi()
|
||||
|
||||
try:
|
||||
|
||||
@@ -28,12 +28,14 @@ SUPPORTED_MODELS = [
|
||||
"gpt-4-turbo-preview",
|
||||
"gpt-4-turbo-2024-04-09",
|
||||
"gpt-4-turbo",
|
||||
"gpt-4o-2024-05-13",
|
||||
"gpt-4o",
|
||||
]
|
||||
|
||||
# any model starting with gpt-4- is assumed to support 'json_object'
|
||||
# for others we need to explicitly state the model name
|
||||
JSON_OBJECT_RESPONSE_MODELS = [
|
||||
"gpt-4-1106-preview",
|
||||
"gpt-4-0125-preview",
|
||||
"gpt-4-turbo-preview",
|
||||
"gpt-4o",
|
||||
"gpt-3.5-turbo-0125",
|
||||
]
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
import urllib
|
||||
|
||||
import random
|
||||
import pydantic
|
||||
import structlog
|
||||
from openai import AsyncOpenAI, NotFoundError, PermissionDeniedError
|
||||
@@ -20,6 +20,7 @@ class Defaults(pydantic.BaseModel):
|
||||
max_token_length: int = 8192
|
||||
model: str = ""
|
||||
api_handles_prompt_template: bool = False
|
||||
double_coercion: str = None
|
||||
|
||||
|
||||
class ClientConfig(BaseClientConfig):
|
||||
@@ -43,9 +44,9 @@ class OpenAICompatibleClient(ClientBase):
|
||||
"api_handles_prompt_template": ExtraField(
|
||||
name="api_handles_prompt_template",
|
||||
type="bool",
|
||||
label="API Handles Prompt Template",
|
||||
label="API handles prompt template (chat/completions)",
|
||||
required=False,
|
||||
description="The API handles the prompt template, meaning your choice in the UI for the prompt template below will be ignored.",
|
||||
description="The API handles the prompt template, meaning your choice in the UI for the prompt template below will be ignored. This is not recommended and should only be used if the API does not support the `completions` andpoint or you don't know which prompt template to use.",
|
||||
)
|
||||
}
|
||||
|
||||
@@ -83,13 +84,12 @@ class OpenAICompatibleClient(ClientBase):
|
||||
def tune_prompt_parameters(self, parameters: dict, kind: str):
|
||||
super().tune_prompt_parameters(parameters, kind)
|
||||
|
||||
keys = list(parameters.keys())
|
||||
allowed_params = ["max_tokens", "presence_penalty", "top_p", "temperature"]
|
||||
|
||||
valid_keys = ["temperature", "top_p", "max_tokens"]
|
||||
|
||||
for key in keys:
|
||||
if key not in valid_keys:
|
||||
del parameters[key]
|
||||
# drop unsupported params
|
||||
for param in list(parameters.keys()):
|
||||
if param not in allowed_params:
|
||||
del parameters[param]
|
||||
|
||||
def prompt_template(self, system_message: str, prompt: str):
|
||||
|
||||
@@ -117,16 +117,27 @@ class OpenAICompatibleClient(ClientBase):
|
||||
"""
|
||||
Generates text from the given prompt and parameters.
|
||||
"""
|
||||
human_message = {"role": "user", "content": prompt.strip()}
|
||||
|
||||
self.log.debug("generate", prompt=prompt[:128] + " ...", parameters=parameters)
|
||||
|
||||
try:
|
||||
response = await self.client.chat.completions.create(
|
||||
model=self.model_name, messages=[human_message], **parameters
|
||||
)
|
||||
|
||||
return response.choices[0].message.content
|
||||
if self.api_handles_prompt_template:
|
||||
# OpenAI API handles prompt template
|
||||
# Use the chat completions endpoint
|
||||
self.log.debug("generate (chat/completions)", prompt=prompt[:128] + " ...", parameters=parameters)
|
||||
human_message = {"role": "user", "content": prompt.strip()}
|
||||
response = await self.client.chat.completions.create(
|
||||
model=self.model_name, messages=[human_message], **parameters
|
||||
)
|
||||
response = response.choices[0].message.content
|
||||
return self.process_response_for_indirect_coercion(prompt, response)
|
||||
else:
|
||||
# Talemate handles prompt template
|
||||
# Use the completions endpoint
|
||||
self.log.debug("generate (completions)", prompt=prompt[:128] + " ...", parameters=parameters)
|
||||
parameters["prompt"] = prompt
|
||||
response = await self.client.completions.create(
|
||||
model=self.model_name, **parameters
|
||||
)
|
||||
return response.choices[0].text
|
||||
except PermissionDeniedError as e:
|
||||
self.log.error("generate error", e=e)
|
||||
emit("status", message="Client API: Permission Denied", status="error")
|
||||
@@ -151,7 +162,33 @@ class OpenAICompatibleClient(ClientBase):
|
||||
self.api_key = kwargs["api_key"]
|
||||
if "api_handles_prompt_template" in kwargs:
|
||||
self.api_handles_prompt_template = kwargs["api_handles_prompt_template"]
|
||||
# TODO: why isn't this calling super()?
|
||||
if "enabled" in kwargs:
|
||||
self.enabled = bool(kwargs["enabled"])
|
||||
|
||||
if "double_coercion" in kwargs:
|
||||
self.double_coercion = kwargs["double_coercion"]
|
||||
|
||||
log.warning("reconfigure", kwargs=kwargs)
|
||||
|
||||
self.set_client(**kwargs)
|
||||
|
||||
def jiggle_randomness(self, prompt_config: dict, offset: float = 0.3) -> dict:
|
||||
"""
|
||||
adjusts temperature and presence penalty
|
||||
by random values using the base value as a center
|
||||
"""
|
||||
|
||||
temp = prompt_config["temperature"]
|
||||
|
||||
min_offset = offset * 0.3
|
||||
|
||||
prompt_config["temperature"] = random.uniform(temp + min_offset, temp + offset)
|
||||
|
||||
try:
|
||||
presence_penalty = prompt_config["presence_penalty"]
|
||||
prompt_config["presence_penalty"] = round(random.uniform(
|
||||
presence_penalty + 0.1, presence_penalty + offset
|
||||
),1)
|
||||
except KeyError:
|
||||
pass
|
||||
@@ -11,10 +11,15 @@ __all__ = [
|
||||
"PRESET_SIMPLE_1",
|
||||
]
|
||||
|
||||
# TODO: refactor abstraction and make configurable
|
||||
|
||||
PRESENCE_PENALTY_BASE = 0.2
|
||||
|
||||
PRESET_TALEMATE_CONVERSATION = {
|
||||
"temperature": 0.65,
|
||||
"top_p": 0.47,
|
||||
"top_k": 42,
|
||||
"presence_penalty": PRESENCE_PENALTY_BASE,
|
||||
"repetition_penalty": 1.18,
|
||||
"repetition_penalty_range": 2048,
|
||||
}
|
||||
@@ -23,6 +28,7 @@ PRESET_TALEMATE_CREATOR = {
|
||||
"temperature": 0.7,
|
||||
"top_p": 0.9,
|
||||
"top_k": 20,
|
||||
"presence_penalty": PRESENCE_PENALTY_BASE,
|
||||
"repetition_penalty": 1.15,
|
||||
"repetition_penalty_range": 512,
|
||||
}
|
||||
@@ -31,6 +37,7 @@ PRESET_LLAMA_PRECISE = {
|
||||
"temperature": 0.7,
|
||||
"top_p": 0.1,
|
||||
"top_k": 40,
|
||||
"presence_penalty": PRESENCE_PENALTY_BASE,
|
||||
"repetition_penalty": 1.18,
|
||||
}
|
||||
|
||||
@@ -45,6 +52,7 @@ PRESET_DIVINE_INTELLECT = {
|
||||
"temperature": 1.31,
|
||||
"top_p": 0.14,
|
||||
"top_k": 49,
|
||||
"presence_penalty": PRESENCE_PENALTY_BASE,
|
||||
"repetition_penalty_range": 1024,
|
||||
"repetition_penalty": 1.17,
|
||||
}
|
||||
@@ -53,6 +61,7 @@ PRESET_SIMPLE_1 = {
|
||||
"temperature": 0.7,
|
||||
"top_p": 0.9,
|
||||
"top_k": 20,
|
||||
"presence_penalty": PRESENCE_PENALTY_BASE,
|
||||
"repetition_penalty": 1.15,
|
||||
}
|
||||
|
||||
|
||||
@@ -51,6 +51,39 @@ class TextGeneratorWebuiClient(ClientBase):
|
||||
# is this needed?
|
||||
parameters["max_new_tokens"] = parameters["max_tokens"]
|
||||
parameters["stop"] = parameters["stopping_strings"]
|
||||
|
||||
|
||||
# textgenwebui does not error on unsupported parameters
|
||||
# but we should still drop them so they don't get passed to the API
|
||||
# and show up in our prompt debugging tool.
|
||||
|
||||
# note that this is not the full list of their supported parameters
|
||||
# but only those we send.
|
||||
|
||||
allowed_params = [
|
||||
"temperature",
|
||||
"top_p",
|
||||
"top_k",
|
||||
"max_tokens",
|
||||
"repetition_penalty",
|
||||
"repetition_penalty_range",
|
||||
"max_tokens",
|
||||
"stopping_strings",
|
||||
"skip_special_tokens",
|
||||
"stream",
|
||||
# is this needed?
|
||||
"max_new_tokens",
|
||||
"stop",
|
||||
# talemate internal
|
||||
# These will be removed before sending to the API
|
||||
# but we keep them here since they are used during the prompt finalization
|
||||
"extra_stopping_strings",
|
||||
]
|
||||
|
||||
# drop unsupported params
|
||||
for param in list(parameters.keys()):
|
||||
if param not in allowed_params:
|
||||
del parameters[param]
|
||||
|
||||
def set_client(self, **kwargs):
|
||||
self.api_key = kwargs.get("api_key", self.api_key)
|
||||
|
||||
@@ -187,3 +187,5 @@ async def agent_ready_checks():
|
||||
for agent in AGENTS.values():
|
||||
if agent and agent.enabled:
|
||||
await agent.ready_check()
|
||||
elif agent and not agent.enabled:
|
||||
await agent.setup_check()
|
||||
|
||||
@@ -11,6 +11,20 @@ class TestPromptPayload(pydantic.BaseModel):
|
||||
kind: str
|
||||
|
||||
|
||||
def ensure_number(v):
|
||||
"""
|
||||
If v is a numeric str, convert it to an int or float; otherwise return it unchanged.
|
||||
"""
|
||||
|
||||
if isinstance(v, str):
|
||||
if v.isdigit():
|
||||
return int(v)
|
||||
try:
|
||||
return float(v)
|
||||
except ValueError:
|
||||
return v
|
||||
return v
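For reference, the helper's behavior on a few representative inputs (a self-contained copy of the function above, with illustrative values):

```python
def ensure_number(v):
    """If v is a numeric str, convert it to an int or float; otherwise return it unchanged."""
    if isinstance(v, str):
        if v.isdigit():
            return int(v)
        try:
            return float(v)
        except ValueError:
            return v
    return v


print(ensure_number("42"))        # 42 (int)
print(ensure_number("0.7"))       # 0.7 (float)
print(ensure_number("creative"))  # 'creative' (left unchanged)
print(ensure_number(1.15))        # 1.15 (non-str values pass through)
```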
|
||||
|
||||
class DevToolsPlugin:
|
||||
router = "devtools"
|
||||
|
||||
@@ -34,7 +48,7 @@ class DevToolsPlugin:
|
||||
log.info(
|
||||
"Testing prompt",
|
||||
payload={
|
||||
k: v for k, v in payload.generation_parameters.items() if k != "prompt"
|
||||
k: ensure_number(v) for k, v in payload.generation_parameters.items() if k != "prompt"
|
||||
},
|
||||
)
|
||||
|
||||
|
||||
@@ -2123,7 +2123,7 @@ class Scene(Emitter):
|
||||
|
||||
async def add_to_recent_scenes(self):
|
||||
log.debug("add_to_recent_scenes", filename=self.filename)
|
||||
config = Config(**self.config)
|
||||
config = load_config(as_model=True)
|
||||
config.recent_scenes.push(self)
|
||||
config.save()
|
||||
|
||||
|
||||
@@ -5,7 +5,7 @@ import json
|
||||
import re
|
||||
import textwrap
|
||||
from typing import List, Union
|
||||
|
||||
import struct
|
||||
import isodate
|
||||
import structlog
|
||||
from colorama import Back, Fore, Style, init
|
||||
@@ -179,6 +179,29 @@ def color_emotes(text: str, color: str = "blue") -> str:
|
||||
def extract_metadata(img_path, img_format):
|
||||
return chara_read(img_path)
|
||||
|
||||
def read_metadata_from_png_text(image_path:str) -> dict:
|
||||
|
||||
"""
|
||||
Reads the character metadata from the tEXt chunk of a PNG image.
|
||||
"""
|
||||
|
||||
# Read the image
|
||||
with open(image_path, 'rb') as f:
|
||||
png_data = f.read()
|
||||
|
||||
# Split the PNG data into chunks
|
||||
offset = 8 # Skip the PNG signature
|
||||
while offset < len(png_data):
|
||||
length = struct.unpack('!I', png_data[offset:offset+4])[0]
|
||||
chunk_type = png_data[offset+4:offset+8]
|
||||
chunk_data = png_data[offset+8:offset+8+length]
|
||||
if chunk_type == b'tEXt':
|
||||
keyword, text_data = chunk_data.split(b'\x00', 1)
|
||||
if keyword == b'chara':
|
||||
return json.loads(base64.b64decode(text_data).decode('utf-8'))
|
||||
offset += 12 + length
|
||||
|
||||
raise ValueError('No character metadata found.')
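For context, the PNG layout walked above is an 8-byte signature followed by chunks of 4-byte length, 4-byte type, data, and a 4-byte CRC; character cards store a base64-encoded JSON payload in a `tEXt` chunk keyed `chara`. Below is a sketch that builds such a chunk, to show the byte layout the reader expects (the card data is made up):

```python
# Sketch: construct the tEXt chunk layout that read_metadata_from_png_text parses.
import base64
import json
import struct
import zlib

card = {"name": "Example Character", "description": "A test card."}  # hypothetical
text_data = base64.b64encode(json.dumps(card).encode("utf-8"))
chunk_data = b"chara" + b"\x00" + text_data

chunk = (
    struct.pack("!I", len(chunk_data))                     # 4-byte big-endian length
    + b"tEXt"                                              # chunk type
    + chunk_data                                           # keyword, NUL, payload
    + struct.pack("!I", zlib.crc32(b"tEXt" + chunk_data))  # CRC over type + data
)
print(len(chunk), chunk[:20])
```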
|
||||
|
||||
def chara_read(img_url, input_format=None):
|
||||
if input_format is None:
|
||||
@@ -194,7 +217,6 @@ def chara_read(img_url, input_format=None):
|
||||
image = Image.open(io.BytesIO(image_data))
|
||||
|
||||
exif_data = image.getexif()
|
||||
|
||||
if format == "webp":
|
||||
try:
|
||||
if 37510 in exif_data:
|
||||
@@ -235,7 +257,15 @@ def chara_read(img_url, input_format=None):
|
||||
return base64_decoded_data
|
||||
else:
|
||||
log.warn("chara_load", msg="No chara data found in PNG image.")
|
||||
return False
|
||||
log.warn("chara_load", msg="Trying to read from PNG text.")
|
||||
|
||||
try:
|
||||
return read_metadata_from_png_text(img_url)
|
||||
except ValueError:
|
||||
return False
|
||||
except Exception as exc:
|
||||
log.error("chara_load", msg="Error reading metadata from PNG text.", exc_info=exc)
|
||||
return False
|
||||
else:
|
||||
return None
|
||||
|
||||
|
||||
talemate_frontend/example.env.development.local (new file, 3 lines)
@@ -0,0 +1,3 @@
|
||||
ALLOWED_HOSTS=example.com
|
||||
# wss if behind ssl, ws if not
|
||||
VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL=wss://example.com:5050
|
||||
talemate_frontend/package-lock.json (generated, 4 lines changed)
@@ -1,12 +1,12 @@
|
||||
{
|
||||
"name": "talemate_frontend",
|
||||
"version": "0.25.0",
|
||||
"version": "0.25.5",
|
||||
"lockfileVersion": 2,
|
||||
"requires": true,
|
||||
"packages": {
|
||||
"": {
|
||||
"name": "talemate_frontend",
|
||||
"version": "0.25.0",
|
||||
"version": "0.25.5",
|
||||
"dependencies": {
|
||||
"@codemirror/lang-markdown": "^6.2.5",
|
||||
"@codemirror/theme-one-dark": "^6.1.2",
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "talemate_frontend",
|
||||
"version": "0.25.0",
|
||||
"version": "0.25.5",
|
||||
"private": true,
|
||||
"scripts": {
|
||||
"serve": "vue-cli-service serve",
|
||||
|
||||
@@ -244,6 +244,13 @@ export default {
|
||||
client.api_key = data.api_key;
|
||||
client.double_coercion = data.data.double_coercion;
|
||||
client.data = data.data;
|
||||
for (let key in client.data.meta.extra_fields) {
|
||||
if (client.data[key] === null || client.data[key] === undefined) {
|
||||
client.data[key] = client.data.meta.defaults[key];
|
||||
}
|
||||
client[key] = client.data[key];
|
||||
}
|
||||
|
||||
} else if(!client) {
|
||||
console.log("Adding new client", data);
|
||||
|
||||
@@ -259,6 +266,16 @@ export default {
|
||||
double_coercion: data.data.double_coercion,
|
||||
data: data.data,
|
||||
});
|
||||
|
||||
// apply extra field defaults
|
||||
let client = this.state.clients[this.state.clients.length - 1];
|
||||
for (let key in client.data.meta.extra_fields) {
|
||||
if (client.data[key] === null || client.data[key] === undefined) {
|
||||
client.data[key] = client.data.meta.defaults[key];
|
||||
}
|
||||
client[key] = client.data[key];
|
||||
}
|
||||
|
||||
// sort the clients by name
|
||||
this.state.clients.sort((a, b) => (a.name > b.name) ? 1 : -1);
|
||||
}
|
||||
|
||||
@@ -56,9 +56,9 @@
|
||||
</v-row>
|
||||
<v-row v-for="field in clientMeta().extra_fields" :key="field.name">
|
||||
<v-col cols="12">
|
||||
<v-text-field v-model="client.data[field.name]" v-if="field.type === 'text'" :label="field.label"
|
||||
<v-text-field v-model="client[field.name]" v-if="field.type === 'text'" :label="field.label"
|
||||
:rules="[rules.required]" :hint="field.description"></v-text-field>
|
||||
<v-checkbox v-else-if="field.type === 'bool'" v-model="client.data[field.name]"
|
||||
<v-checkbox v-else-if="field.type === 'bool'" v-model="client[field.name]"
|
||||
:label="field.label" :hint="field.description" density="compact"></v-checkbox>
|
||||
</v-col>
|
||||
</v-row>
|
||||
|
||||
@@ -248,7 +248,7 @@ export default {
|
||||
messageHandlers: [],
|
||||
scene: {},
|
||||
appConfig: {},
|
||||
autcompleting: false,
|
||||
autocompleting: false,
|
||||
autocompletePartialInput: "",
|
||||
autocompleteCallback: null,
|
||||
autocompleteFocusElement: null,
|
||||
@@ -303,9 +303,11 @@ export default {
|
||||
|
||||
this.connecting = true;
|
||||
let currentUrl = new URL(window.location.href);
|
||||
console.log(currentUrl);
|
||||
let websocketUrl = process.env.VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL || `ws://${currentUrl.hostname}:5050/ws`;
|
||||
|
||||
this.websocket = new WebSocket(`ws://${currentUrl.hostname}:5050/ws`);
|
||||
console.log("urls", { websocketUrl, currentUrl }, {env : process.env});
|
||||
|
||||
this.websocket = new WebSocket(websocketUrl);
|
||||
console.log("Websocket connecting ...")
|
||||
this.websocket.onmessage = this.handleMessage;
|
||||
this.websocket.onopen = () => {
|
||||
|
||||
@@ -1,4 +1,12 @@
|
||||
const { defineConfig } = require('@vue/cli-service')
|
||||
|
||||
const ALLOWED_HOSTS = ((process.env.ALLOWED_HOSTS || "all") !== "all" ? process.env.ALLOWED_HOSTS.split(",") : "all")
|
||||
const VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL = process.env.VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL || null
|
||||
|
||||
console.log("NODE_ENV", process.env.NODE_ENV)
|
||||
console.log("ALLOWED_HOSTS", ALLOWED_HOSTS)
|
||||
console.log("VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL", VUE_APP_TALEMATE_BACKEND_WEBSOCKET_URL)
|
||||
|
||||
module.exports = defineConfig({
|
||||
transpileDependencies: true,
|
||||
|
||||
@@ -9,6 +17,7 @@ module.exports = defineConfig({
|
||||
},
|
||||
|
||||
devServer: {
|
||||
allowedHosts: ALLOWED_HOSTS,
|
||||
client: {
|
||||
overlay: {
|
||||
warnings: false,
|
||||
|
||||
templates/llm-prompt/std/Mistral.jinja2 (new file, 1 line)
@@ -0,0 +1 @@
|
||||
<s>[INST] {{ system_message }} {{ user_message }} [/INST] {{ coercion_message }}