feat(ocr): add support for Docling OCR engine and language configuration

This commit adds support for configuring the OCR engine and language(s) for Docling.
Configuration can be set via the environment variables `DOCLING_OCR_ENGINE` and `DOCLING_OCR_LANG`, or through the UI.

Fixes #13133
This commit is contained in:
Athanasios Oikonomou
2025-05-03 00:31:00 +03:00
committed by Athanasios Oikonomou
parent 7d184c3a14
commit 657162e96d
5 changed files with 67 additions and 2 deletions

View File

@@ -202,6 +202,8 @@ from open_webui.config import (
CONTENT_EXTRACTION_ENGINE,
TIKA_SERVER_URL,
DOCLING_SERVER_URL,
DOCLING_OCR_ENGINE,
DOCLING_OCR_LANG,
DOCUMENT_INTELLIGENCE_ENDPOINT,
DOCUMENT_INTELLIGENCE_KEY,
MISTRAL_OCR_API_KEY,
@@ -635,6 +637,8 @@ app.state.config.ENABLE_WEB_LOADER_SSL_VERIFICATION = ENABLE_WEB_LOADER_SSL_VERI
app.state.config.CONTENT_EXTRACTION_ENGINE = CONTENT_EXTRACTION_ENGINE
app.state.config.TIKA_SERVER_URL = TIKA_SERVER_URL
app.state.config.DOCLING_SERVER_URL = DOCLING_SERVER_URL
app.state.config.DOCLING_OCR_ENGINE = DOCLING_OCR_ENGINE
app.state.config.DOCLING_OCR_LANG = DOCLING_OCR_LANG
app.state.config.DOCUMENT_INTELLIGENCE_ENDPOINT = DOCUMENT_INTELLIGENCE_ENDPOINT
app.state.config.DOCUMENT_INTELLIGENCE_KEY = DOCUMENT_INTELLIGENCE_KEY
app.state.config.MISTRAL_OCR_API_KEY = MISTRAL_OCR_API_KEY