## ChromaDB
If you want ChromaDB to use the more accurate (but much slower) instructor embeddings, add the following to `config.yaml`:
```yaml
chromadb:
  embeddings: instructor
  instructor_device: cpu
  instructor_model: hkunlp/instructor-xl
```
You will need to restart the backend for this change to take effect.

**NOTE** - The first time you do this, the backend needs to download the instructor model you selected. This may take a while, and the talemate backend will be unresponsive during that time.

Once the download is finished, if talemate is still unresponsive, try reloading the front-end to reconnect. If all else fails, restart the backend as well.

### GPU support
If you want to use the instructor embeddings with GPU support, you will need to install PyTorch with CUDA support.
To do this on Windows, run `install-pytorch-cuda.bat` from the project root. Then change your device in the config to `cuda`:
```yaml
chromadb:
  embeddings: instructor
  instructor_device: cuda
  instructor_model: hkunlp/instructor-xl
```
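
Before switching `instructor_device` to `cuda`, it can help to confirm that the installed PyTorch build actually sees a GPU. The following is a minimal sketch, not part of talemate: the helper name `pick_instructor_device` is hypothetical, and it assumes you run it in the same Python environment the backend uses.

```python
def pick_instructor_device() -> str:
    """Return "cuda" if PyTorch reports a usable CUDA device, otherwise "cpu".

    Hypothetical helper for illustration only; talemate does not ship it.
    """
    try:
        import torch  # only importable if PyTorch is installed
    except ImportError:
        # No PyTorch at all, so CUDA cannot be used either.
        return "cpu"
    # torch.cuda.is_available() is False for CPU-only PyTorch builds.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"instructor_device: {pick_instructor_device()}")
```

If this prints `cpu` after you ran the install script, the CUDA-enabled PyTorch build is probably not the one being picked up, and setting `instructor_device: cuda` is unlikely to work.
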
Instructor embedding models:
- `hkunlp/instructor-base` (smallest / fastest)
- `hkunlp/instructor-large`
- `hkunlp/instructor-xl` (largest / slowest) - requires about 5GB of GPU memory
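
Because a typo in `config.yaml` only shows up after a backend restart, a quick sanity check of the `chromadb` section can save a round trip. This is a rough sketch under stated assumptions: `check_chromadb_config` is a hypothetical helper, it takes the already-parsed config as a plain dict, and it only knows the device and model values mentioned in this document, so other values may still be valid.

```python
# Values taken from this document; other values may also be accepted by talemate.
KNOWN_MODELS = {
    "hkunlp/instructor-base",
    "hkunlp/instructor-large",
    "hkunlp/instructor-xl",
}
KNOWN_DEVICES = {"cpu", "cuda"}


def check_chromadb_config(config: dict) -> list:
    """Return warning strings for the `chromadb` section of a parsed config.

    Hypothetical helper for illustration; it is not part of talemate.
    """
    section = config.get("chromadb") or {}
    if section.get("embeddings") != "instructor":
        # This sketch only checks the instructor settings.
        return []

    warnings = []
    device = section.get("instructor_device")
    if device not in KNOWN_DEVICES:
        warnings.append(f"unrecognised instructor_device: {device!r}")
    model = section.get("instructor_model")
    if model not in KNOWN_MODELS:
        warnings.append(f"unrecognised instructor_model: {model!r}")
    return warnings


if __name__ == "__main__":
    example = {
        "chromadb": {
            "embeddings": "instructor",
            "instructor_device": "cuda",
            "instructor_model": "hkunlp/instructor-xl",
        }
    }
    for warning in check_chromadb_config(example):
        print(warning)
```

The function returns warnings instead of raising, since a value it does not recognise is not necessarily wrong.
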