RAG Settings
RAG Settings can be accessed by navigating to Admin / Settings / Documents. The Documents section of the Admin Settings controls how documents are processed, chunked, embedded, and retrieved across the platform. These settings apply globally and affect all Knowledge Bases.
These are advanced settings that affect the quality of document retrieval across all Knowledge Bases. Changes should be made carefully and tested before applying them broadly. If you're unsure about a setting, consult your organization's administrator.
Embedding Models
Embedding models convert text into numerical vectors used for search and retrieval.
Dense Embedding Model
The dense embedding model is used for semantic search — it understands the meaning of text to find relevant results even when exact keywords don't match. This is the primary model used for document retrieval.
Sparse Embedding Model
The sparse embedding model is used for keyword-based search. It is active when the Sparse or Hybrid search type is selected. Sparse search excels at finding exact term matches.
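As a rough mental model (not the platform's actual implementation), a sparse embedding can be pictured as a term-count mapping that is mostly zeros, while a dense embedding is a fixed-length float vector compared by cosine similarity:

```python
import math
from collections import Counter

def sparse_embed(text: str) -> Counter:
    # Sparse vector: one dimension per vocabulary term, mostly zeros,
    # so it is stored as a {term: count} mapping.
    return Counter(text.lower().split())

def sparse_score(query: str, doc: str) -> int:
    # Exact-term overlap — the reason sparse search excels at keyword matches.
    q, d = sparse_embed(query), sparse_embed(doc)
    return sum(min(count, d[term]) for term, count in q.items())

def cosine(a: list[float], b: list[float]) -> float:
    # Dense vectors are compared by angle, so near-synonyms can score high
    # even with zero exact-term overlap.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

print(sparse_score("invoice 4821", "Invoice 4821 was paid in March"))  # 2
print(round(cosine([1.0, 0.0], [1.0, 1.0]), 3))  # 0.707
```

Real embedding models learn these vectors from data; the toy functions above only illustrate the shape of each representation.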
Document Chunking
When a document is uploaded, it is split into smaller pieces called "chunks". These chunks are what get embedded and searched. The chunking strategy affects retrieval quality significantly.
Chunk Type
Determines how documents are split into chunks:
- Header (Markdown) — Splits on Markdown headers, preserving document structure. Recommended for well-structured documents.
- Recursive Character — Splits by character count with overlap. A general-purpose fallback.
- LLM-based Chunker — Uses an LLM to intelligently determine chunk boundaries based on content semantics. Produces higher-quality chunks but is slower and more expensive.
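The Header strategy can be sketched as follows. This toy version starts a new chunk at each Markdown header; the real splitter likely also tracks header levels and attaches them as metadata, which is omitted here:

```python
import re

def split_on_headers(markdown: str) -> list[str]:
    # Start a new chunk whenever a Markdown header (#, ##, ...) begins a line,
    # so each chunk follows the document's own structure.
    chunks, current = [], []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s", line) and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

doc = "# Intro\nWelcome.\n## Setup\nInstall it.\n## Usage\nRun it."
print(split_on_headers(doc))  # 3 chunks, one per header section
```

Because chunks align with sections, a retrieved chunk tends to be a self-contained unit of meaning, which is why this strategy is recommended for well-structured documents.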
Chunk Size / Chunk Overlap
Visible when Chunk Type is Header or Recursive Character.
- Chunk Size — The target size (in characters) for each chunk.
- Chunk Overlap — How many characters overlap between consecutive chunks. Overlap helps preserve context across chunk boundaries.
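The interaction of these two settings can be shown with a simplified sliding-window sketch. A real Recursive Character splitter also prefers natural separators such as paragraphs and sentences before falling back to raw character positions, which this version skips:

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    # Each chunk starts (chunk_size - chunk_overlap) characters after the
    # previous one, so the last chunk_overlap characters are repeated at the
    # start of the next chunk, preserving context across boundaries.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

Note that a sentence cut at position `d` in one chunk reappears at the start of the next, so a query matching that sentence can still retrieve a chunk containing it whole.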
LLM Chunker Settings
Visible when Chunk Type is LLM-based Chunker.
- Target Min/Max Words — Soft guidelines the LLM uses when deciding chunk size.
- Hard Min/Max Words — Hard limits. Chunks below the hard minimum are merged; chunks above the hard maximum are split.
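The merge/split behavior of the hard limits could be enforced by a post-processing pass like this hypothetical one (the LLM proposes boundaries first; this step only guarantees the limits hold):

```python
def enforce_word_limits(chunks: list[str], hard_min: int, hard_max: int) -> list[str]:
    # Pass 1: merge any chunk below the hard minimum into the previous chunk.
    merged: list[str] = []
    for chunk in chunks:
        if merged and len(chunk.split()) < hard_min:
            merged[-1] = merged[-1] + " " + chunk
        else:
            merged.append(chunk)
    # Pass 2: split any chunk above the hard maximum into hard_max-word pieces.
    result: list[str] = []
    for chunk in merged:
        words = chunk.split()
        for i in range(0, len(words), hard_max):
            result.append(" ".join(words[i:i + hard_max]))
    return result

print(enforce_word_limits(
    ["one two three four five six", "tiny", "alpha beta gamma"],
    hard_min=3, hard_max=4,
))
# ['one two three four', 'five six tiny', 'alpha beta gamma']
```

Here `"tiny"` falls below the hard minimum and is merged backward; the resulting seven-word chunk then exceeds the hard maximum and is split.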
Document Loading
PDF Loader Type
Controls how text is extracted from uploaded PDF files:
- PyPDF PDF Parser — Directly extracts text from the PDF. Fast and reliable for text-heavy documents.
- LLM Vision — Converts each PDF page to an image and uses a vision model (VLM) to transcribe the content. Recommended for documents with complex layouts, tables, or images.
The PDF loader type is a global setting; however, the vision model's behavior can still be customized per Knowledge Base via the Transcription Instructions button.
Query Settings
These settings control how the system retrieves relevant chunks when a user asks a question.
Search Type
- Dense (Semantic) — Uses only the dense embedding model. Best for meaning-based queries.
- Sparse (Keyword) — Uses only the sparse embedding model. Best for exact keyword matching.
- Hybrid (Dense + Sparse) — Combines both approaches. Generally recommended for the best results.
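Dense and sparse scores live on incompatible scales, so hybrid systems typically combine ranked result lists rather than raw scores. One common technique (an assumption here; the document does not specify the platform's fusion method) is Reciprocal Rank Fusion:

```python
def rrf_fuse(dense_ranking: list[str], sparse_ranking: list[str], k: int = 60) -> list[str]:
    # Reciprocal Rank Fusion: each result contributes 1 / (k + rank) from every
    # list it appears in, rewarding documents ranked well by both searches.
    scores: dict[str, float] = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

print(rrf_fuse(["a", "b", "c"], ["b", "d", "a"]))  # ['b', 'a', 'd', 'c']
```

Document `b` wins because both searches rank it highly, even though neither ranks it first — the behavior that makes hybrid search a good default.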
Search Limit
The maximum number of chunks retrieved from the vector database per query.
Reranking
When enabled, a reranking model re-scores the initially retrieved chunks for better relevance.
- Reranker Model — The model used for reranking.
- Rerank Limit — How many of the top results to keep after reranking. Must be less than or equal to the Search Limit.
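The relationship between the two limits can be sketched as a retrieve-then-rerank pipeline. The word-overlap scorer below is a toy stand-in for a real reranker model:

```python
def rerank_results(query, candidates, score_fn, search_limit, rerank_limit):
    # The reranker can only re-score chunks that were actually retrieved,
    # which is why Rerank Limit must not exceed Search Limit.
    if rerank_limit > search_limit:
        raise ValueError("Rerank Limit must be <= Search Limit")
    retrieved = candidates[:search_limit]  # initial vector-DB retrieval
    rescored = sorted(retrieved, key=lambda c: score_fn(query, c), reverse=True)
    return rescored[:rerank_limit]

# Toy relevance score standing in for the reranker model (an assumption):
overlap = lambda q, c: len(set(q.split()) & set(c.split()))

top = rerank_results(
    "install troubleshooting steps",
    ["install guide", "billing faq", "install troubleshooting steps"],
    overlap, search_limit=3, rerank_limit=2,
)
print(top)  # ['install troubleshooting steps', 'install guide']
```

The Search Limit controls recall (how many candidates enter the pipeline); the Rerank Limit controls precision (how many survive to reach the LLM).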
RAG Prompt Template
The prompt template used to inject retrieved context into the LLM's prompt. Uses {{CONTEXT}} and {{QUERY}} as placeholders. Modify this to change how the LLM uses retrieved knowledge when answering questions.
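Conceptually, rendering the template is plain placeholder substitution. The template text below is an illustrative example, not the platform's default:

```python
# Hypothetical template using the documented placeholders:
RAG_TEMPLATE = """Use the following context to answer the question.

Context:
{{CONTEXT}}

Question: {{QUERY}}"""

def render_prompt(template: str, context: str, query: str) -> str:
    # {{CONTEXT}} is replaced with the retrieved chunks,
    # {{QUERY}} with the user's question.
    return template.replace("{{CONTEXT}}", context).replace("{{QUERY}}", query)

print(render_prompt(RAG_TEMPLATE, "Chunk 1...\nChunk 2...", "What is RAG?"))
```

Editing the surrounding instructions (for example, adding "answer only from the context above") is how you change the LLM's grounding behavior without touching retrieval itself.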
LLM Loader Prompts
These prompts affect document parsing and cleaning behavior across all Knowledge Bases. Adjust with caution.
Global Transcription System Prompt
The system prompt sent to the vision model when using the LLM Vision PDF loader. This controls how the model transcribes PDF pages globally. Individual Knowledge Bases can append additional instructions via their Transcription Instructions setting.
Global Chunk Cleaning Prompt
The system prompt used when the Clean Chunks action is triggered within a Knowledge Base. This instructs the LLM on how to clean and normalize chunk text (e.g., fix formatting, expand abbreviations, correct typos).