RAG Settings
RAG Settings can be accessed by navigating to Admin / Settings / Documents. The Documents section of the Admin Settings controls how documents are processed, chunked, embedded, and retrieved across the platform. These settings apply globally and affect all Knowledge Bases.
These are advanced settings that affect the quality of document retrieval across all Knowledge Bases. Changes should be made carefully and tested before applying them broadly. If you're unsure about a setting, consult your organization's administrator.
Embedding Models
Embedding models convert text into numerical vectors used for search and retrieval.
Dense Embedding Model
The dense embedding model is used for semantic search — it understands the meaning of text to find relevant results even when exact keywords don't match. This is the primary model used for document retrieval.
Sparse Embedding Model
The sparse embedding model is used for keyword-based search. It is active when the Sparse or Hybrid search type is selected. Sparse search excels at finding exact term matches.
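As a rough mental model (not the platform's actual implementation), a sparse embedding can be pictured as a term-count mapping that is mostly zeros, while a dense embedding is a fixed-length float vector compared by cosine similarity:

```python
import math
from collections import Counter

def sparse_embed(text: str) -> Counter:
    # Sparse vector: one dimension per vocabulary term, mostly zeros,
    # so it is stored as a {term: count} mapping.
    return Counter(text.lower().split())

def sparse_score(query: str, doc: str) -> int:
    # Exact-term overlap — the reason sparse search excels at keyword matches.
    q, d = sparse_embed(query), sparse_embed(doc)
    return sum(min(count, d[term]) for term, count in q.items())

def cosine(a: list[float], b: list[float]) -> float:
    # Dense vectors are compared by angle, so near-synonyms can score high
    # even with zero exact-term overlap.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

print(sparse_score("invoice 4821", "Invoice 4821 was paid in March"))  # 2
print(round(cosine([1.0, 0.0], [1.0, 1.0]), 3))  # 0.707
```

Real embedding models learn these vectors from data; the toy functions above only illustrate the shape of each representation.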
Document Chunking
When a document is uploaded, it is split into smaller pieces called "chunks". These chunks are what get embedded and searched. The chunking strategy affects retrieval quality significantly.
Chunk Type
Determines how documents are split into chunks:
- Header (Markdown) — Splits on Markdown headers, preserving document structure. Recommended for well-structured documents.
- Recursive Character — Splits by character count with overlap. A general-purpose fallback.
- LLM-based Chunker — Uses an LLM to intelligently determine chunk boundaries based on content semantics. Produces higher-quality chunks but is slower and more expensive.
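The Header strategy can be sketched as follows. This toy version starts a new chunk at each Markdown header; the real splitter likely also tracks header levels and attaches them as metadata, which is omitted here:

```python
import re

def split_on_headers(markdown: str) -> list[str]:
    # Start a new chunk whenever a Markdown header (#, ##, ...) begins a line,
    # so each chunk follows the document's own structure.
    chunks, current = [], []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s", line) and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

doc = "# Intro\nWelcome.\n## Setup\nInstall it.\n## Usage\nRun it."
print(split_on_headers(doc))  # 3 chunks, one per header section
```

Because chunks align with sections, a retrieved chunk tends to be a self-contained unit of meaning, which is why this strategy is recommended for well-structured documents.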
Chunk Size / Chunk Overlap
Visible when Chunk Type is Header or Recursive Character.
- Chunk Size — The target size (in characters) for each chunk.
- Chunk Overlap — How many characters overlap between consecutive chunks. Overlap helps preserve context across chunk boundaries.
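The interaction of these two settings can be shown with a simplified sliding-window sketch. A real Recursive Character splitter also prefers natural separators such as paragraphs and sentences before falling back to raw character positions, which this version skips:

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    # Each chunk starts (chunk_size - chunk_overlap) characters after the
    # previous one, so the last chunk_overlap characters are repeated at the
    # start of the next chunk, preserving context across boundaries.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

Note that a sentence cut at position `d` in one chunk reappears at the start of the next, so a query matching that sentence can still retrieve a chunk containing it whole.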
LLM Chunker Settings
Visible when Chunk Type is LLM-based Chunker.
- Target Min/Max Words — Soft guidelines the LLM uses when deciding chunk size.
- Hard Min/Max Words — Hard limits. Chunks below the hard minimum are merged; chunks above the hard maximum are split.
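The merge/split behavior of the hard limits could be enforced by a post-processing pass like this hypothetical one (the LLM proposes boundaries first; this step only guarantees the limits hold):

```python
def enforce_word_limits(chunks: list[str], hard_min: int, hard_max: int) -> list[str]:
    # Pass 1: merge any chunk below the hard minimum into the previous chunk.
    merged: list[str] = []
    for chunk in chunks:
        if merged and len(chunk.split()) < hard_min:
            merged[-1] = merged[-1] + " " + chunk
        else:
            merged.append(chunk)
    # Pass 2: split any chunk above the hard maximum into hard_max-word pieces.
    result: list[str] = []
    for chunk in merged:
        words = chunk.split()
        for i in range(0, len(words), hard_max):
            result.append(" ".join(words[i:i + hard_max]))
    return result

print(enforce_word_limits(
    ["one two three four five six", "tiny", "alpha beta gamma"],
    hard_min=3, hard_max=4,
))
# ['one two three four', 'five six tiny', 'alpha beta gamma']
```

Here `"tiny"` falls below the hard minimum and is merged backward; the resulting seven-word chunk then exceeds the hard maximum and is split.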
Document Loading
PDF Loader Type
Controls how text is extracted from uploaded PDF files:
- PyPDF PDF Parser — Directly extracts text from the PDF. Fast and reliable for text-heavy documents.
- LLM Vision — Converts each PDF page to an image and uses a vision model (VLM) to transcribe the content. Recommended for documents with complex layouts, tables, or images.
The PDF loader type is a global setting; however, the vision model's behavior can still be customized per Knowledge Base via the Transcription Instructions button.
Query Settings
These settings control how the system retrieves relevant chunks when a user asks a question.
Search Type
- Dense (Semantic) — Uses only the dense embedding model. Best for meaning-based queries.
- Sparse (Keyword) — Uses only the sparse embedding model. Best for exact keyword matching.
- Hybrid (Dense + Sparse) — Combines both approaches. Generally recommended for the best results.
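Dense and sparse scores live on incompatible scales, so hybrid systems typically combine ranked result lists rather than raw scores. One common technique (an assumption here; the document does not specify the platform's fusion method) is Reciprocal Rank Fusion:

```python
def rrf_fuse(dense_ranking: list[str], sparse_ranking: list[str], k: int = 60) -> list[str]:
    # Reciprocal Rank Fusion: each result contributes 1 / (k + rank) from every
    # list it appears in, rewarding documents ranked well by both searches.
    scores: dict[str, float] = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

print(rrf_fuse(["a", "b", "c"], ["b", "d", "a"]))  # ['b', 'a', 'd', 'c']
```

Document `b` wins because both searches rank it highly, even though neither ranks it first — the behavior that makes hybrid search a good default.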
Search Limit
The maximum number of chunks retrieved from the vector database per query.
Reranking
When enabled, a reranking model re-scores the initially retrieved chunks for better relevance.
- Reranker Model — The model used for reranking.
- Rerank Limit — How many of the top results to keep after reranking. Must be less than or equal to the Search Limit.
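The relationship between the two limits can be sketched as a retrieve-then-rerank pipeline. The word-overlap scorer below is a toy stand-in for a real reranker model:

```python
def rerank_results(query, candidates, score_fn, search_limit, rerank_limit):
    # The reranker can only re-score chunks that were actually retrieved,
    # which is why Rerank Limit must not exceed Search Limit.
    if rerank_limit > search_limit:
        raise ValueError("Rerank Limit must be <= Search Limit")
    retrieved = candidates[:search_limit]  # initial vector-DB retrieval
    rescored = sorted(retrieved, key=lambda c: score_fn(query, c), reverse=True)
    return rescored[:rerank_limit]

# Toy relevance score standing in for the reranker model (an assumption):
overlap = lambda q, c: len(set(q.split()) & set(c.split()))

top = rerank_results(
    "install troubleshooting steps",
    ["install guide", "billing faq", "install troubleshooting steps"],
    overlap, search_limit=3, rerank_limit=2,
)
print(top)  # ['install troubleshooting steps', 'install guide']
```

The Search Limit controls recall (how many candidates enter the pipeline); the Rerank Limit controls precision (how many survive to reach the LLM).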
RAG Prompt Template
The prompt template used to inject retrieved context into the LLM's prompt. Uses {{CONTEXT}} and {{QUERY}} as placeholders. Modify this to change how the LLM uses retrieved knowledge when answering questions.
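Conceptually, rendering the template is plain placeholder substitution. The template text below is an illustrative example, not the platform's default:

```python
# Hypothetical template using the documented placeholders:
RAG_TEMPLATE = """Use the following context to answer the question.

Context:
{{CONTEXT}}

Question: {{QUERY}}"""

def render_prompt(template: str, context: str, query: str) -> str:
    # {{CONTEXT}} is replaced with the retrieved chunks,
    # {{QUERY}} with the user's question.
    return template.replace("{{CONTEXT}}", context).replace("{{QUERY}}", query)

print(render_prompt(RAG_TEMPLATE, "Chunk 1...\nChunk 2...", "What is RAG?"))
```

Editing the surrounding instructions (for example, adding "answer only from the context above") is how you change the LLM's grounding behavior without touching retrieval itself.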
LLM Loader Prompts
These prompts affect document parsing and cleaning behavior across all Knowledge Bases. Adjust with caution.
Global Transcription System Prompt
The system prompt sent to the vision model when using the LLM Vision PDF loader. This controls how the model transcribes PDF pages globally. Individual Knowledge Bases can append additional instructions via their Transcription Instructions setting.
Global Chunk Cleaning Prompt
The system prompt used when the Clean Chunks action is triggered within a Knowledge Base. This instructs the LLM on how to clean and normalize chunk text (e.g., fix formatting, expand abbreviations, correct typos).