
What is a Corpus?

A corpus is a collection of documents that your agent can search and reference during conversations. When a contact asks a question, the agent automatically queries the relevant corpus to find the most accurate answer from your content rather than relying solely on its general knowledge. This is powered by retrieval-augmented generation (RAG): documents are embedded and indexed so the agent can quickly find the most relevant passages for any question.
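To make the retrieval step concrete, here is a minimal sketch in Python. It is an illustration only, not the product's implementation: real embeddings are learned dense vectors, whereas this toy `embed` function (a hypothetical name) uses plain word counts, and `retrieve` simply ranks passages by cosine similarity to the question.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k passages ranked by similarity to the question."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


# Illustrative content only; any resemblance to real policies is accidental.
passages = [
    "Refunds are issued within 5 business days of approval.",
    "Our support line is open Monday to Friday, 9am to 5pm.",
    "Shipping to Canada takes 7 to 10 business days.",
]
print(retrieve("When is the support line open?", passages))
```

The question shares the words "is", "support", "line", and "open" with the second passage and nothing with the others, so that passage ranks first.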

Supported Document Types

You can add documents to a corpus in two ways:

| Method | Formats | Description |
| --- | --- | --- |
| Text | Plain text | Paste or type content directly |
| File Upload | PDF, DOCX, TXT | Upload document files |
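If you prepare files in bulk before uploading, a quick extension check can catch unsupported formats early. The sketch below assumes that file type is determined by extension; `is_supported_upload` and `SUPPORTED_EXTENSIONS` are hypothetical names for illustration, not part of the product.

```python
from pathlib import Path

# Extensions taken from the File Upload formats listed above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt"}


def is_supported_upload(filename: str) -> bool:
    """Return True if the file extension is one of the listed upload formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["Handbook.PDF", "pricing.docx", "notes.md"]:
    print(name, is_supported_upload(name))
```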

Managing a Corpus

| Action | Description |
| --- | --- |
| Create Corpus | Create a new empty knowledge base |
| Add Documents | Upload files or add text content to the corpus |
| List Documents | View all documents in a corpus |
| Delete Documents | Remove specific documents from the corpus |
| Query | Search the corpus directly (useful for testing) |
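The actions above map naturally onto a small data structure. The following in-memory model is a sketch for reasoning about the lifecycle only; the `Corpus` class and its method names are hypothetical and do not reflect the product's actual API, and `query` uses naive word overlap rather than embedding search.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Corpus:
    """Hypothetical in-memory corpus; not the product's actual API."""

    name: str
    documents: dict[int, str] = field(default_factory=dict)
    _ids: itertools.count = field(
        default_factory=lambda: itertools.count(1), repr=False
    )

    def add_document(self, text: str) -> int:
        """Add Documents: store text and return its new ID."""
        doc_id = next(self._ids)
        self.documents[doc_id] = text
        return doc_id

    def list_documents(self) -> list[int]:
        """List Documents: return all document IDs in ascending order."""
        return sorted(self.documents)

    def delete_document(self, doc_id: int) -> None:
        """Delete Documents: remove one document; unknown IDs are ignored."""
        self.documents.pop(doc_id, None)

    def query(self, text: str) -> list[int]:
        """Query: IDs of documents sharing at least one word with the text."""
        words = set(text.lower().split())
        return [
            doc_id
            for doc_id, doc in self.documents.items()
            if words & set(doc.lower().split())
        ]


corpus = Corpus("faq")  # Create Corpus: starts empty
a = corpus.add_document("Refunds take 5 business days")
b = corpus.add_document("Support hours are 9am to 5pm")
corpus.delete_document(a)
print(corpus.list_documents())
print(corpus.query("support hours"))
```

Querying the corpus directly, as in the last line, is a quick way to confirm that a document you just added is retrievable before attaching the corpus to a persona.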

Attaching to Personas

A corpus is attached to a persona to make it available during conversations. Each persona can have one corpus attached. When attached, the agent automatically gets a search_knowledge tool that it can invoke to query the corpus. You can control the scope of the attachment:

| Scope | Description |
| --- | --- |
| voice | Knowledge is only available during voice calls |
| chat | Knowledge is only available during chat conversations |
| all | Knowledge is available across all channels |

Keep your corpus focused on the type of questions your agent will actually receive. A smaller, well-curated corpus will give better results than a large, unfocused one.
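The scope rule for attachments can be expressed as a one-line check. This sketch assumes the only channels are `voice` and `chat`, as the scope descriptions suggest; the function name `knowledge_available` is hypothetical.

```python
VALID_SCOPES = {"voice", "chat", "all"}


def knowledge_available(scope: str, channel: str) -> bool:
    """Return True if a corpus attached with `scope` is usable on `channel`."""
    if scope not in VALID_SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    # "all" covers every channel; otherwise scope must match the channel.
    return scope == "all" or scope == channel


print(knowledge_available("voice", "chat"))
print(knowledge_available("all", "chat"))
```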

How It Works in Practice

When a contact asks a question during a conversation with your agent:
  1. The agent decides the question requires looking up information
  2. It invokes the search_knowledge tool with a search query
  3. The most relevant passages from your corpus are returned
  4. The agent uses those passages to formulate an accurate, grounded response
This happens within the normal flow of the conversation: the contact simply gets an accurate answer based on your content.
If the corpus doesn’t contain information relevant to the question, the agent will fall back to its general knowledge or let the contact know it doesn’t have that information, depending on your prompt instructions.
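The lookup and fallback behavior described above can be sketched as follows. This is a simplified illustration, not the agent's actual logic: the `search_knowledge` function here is a word-overlap stand-in for the real tool, the `allow_general_knowledge` flag is a hypothetical substitute for your prompt instructions, and the sketch skips the agent's decision about whether a lookup is needed at all.

```python
import re


def search_knowledge(query: str, corpus: list[str]) -> list[str]:
    """Stand-in for the tool: passages sharing at least one word with the query."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    return [
        p for p in corpus if words & set(re.findall(r"[a-z0-9]+", p.lower()))
    ]


def respond(question: str, corpus: list[str], allow_general_knowledge: bool) -> str:
    """Ground the answer in the corpus when possible, otherwise fall back."""
    passages = search_knowledge(question, corpus)
    if passages:
        # Relevant passages found: answer from the corpus content.
        return f"Based on your content: {passages[0]}"
    if allow_general_knowledge:
        # Nothing relevant, and the prompt permits general knowledge.
        return "Answering from general knowledge."
    # Nothing relevant, and the prompt asks the agent to say so.
    return "I don't have that information."


corpus = ["Refunds are issued within 5 business days."]
print(respond("How long do refunds take?", corpus, allow_general_knowledge=True))
print(respond("Where can I park?", corpus, allow_general_knowledge=True))
print(respond("Where can I park?", corpus, allow_general_knowledge=False))
```

Which of the two fallback branches applies is a prompt-design choice: a strict support agent would typically decline, while a general assistant might answer from its own knowledge.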