Chatbot (RAG)

Building a chatbot

The POST /ask endpoint is the heart of the API. Give it a question and one or more docsets, and it retrieves relevant passages and generates an answer grounded in them, with citations.

Request

POST /ask

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| question | string | | The user’s question. |
| docSetNames | string[] | | Docsets to search. |
| pastMessages | Message[] | | Prior turns, oldest first (see multi-turn). |
| config | object | | Retrieval/generation tuning (see config). |
| promptContext | string | | Extra system context prepended to the prompt. |
| stream | boolean | | Stream the answer as SSE (see streaming). |

A Message is { "content": string, "role": "user" | "assistant" }.
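
As a rough illustration of how the fields above fit together, the sketch below builds (but does not send) a request using only the Python standard library. The helper name `build_ask_request`, the base URL, and the token are placeholders invented for this example; the Bearer authorization header mirrors the curl examples on this page.

```python
import json
import urllib.request
from typing import Any, Optional


def build_ask_request(
    api_url: str,
    api_token: str,
    question: str,
    doc_set_names: list[str],
    past_messages: Optional[list[dict[str, str]]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /ask request. Hypothetical helper."""
    body: dict[str, Any] = {"question": question, "docSetNames": doc_set_names}
    if past_messages:
        # Prior turns, oldest first; the current question stays in "question".
        body["pastMessages"] = past_messages
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/ask",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder URL and token; nothing is sent until urlopen() is called.
request = build_ask_request(
    "https://api.example.com",
    "example-token",
    "What is our refund policy?",
    ["company-policies"],
)
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping construction separate from sending makes the body easy to inspect or log before any network call is made.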

Response

| Field | Type | Description |
| --- | --- | --- |
| answer | string | The generated answer. Citation markers like [1] refer to citations. |
| citations | Citation[] | Sources used, indexed to the markers in answer. |
| usage | object | Token usage: inputTokens, outputTokens, calls, perModel. |

A Citation is:

    {
      "index": 1,
      "fileName": "refund-policy.pdf",
      "pageNumber": "2",
      "sourceType": "S3",
      "sourceMetadata": {
        "sourceId": "policies/refund-policy.pdf",
        "driveId": null
      }
    }

sourceType is one of S3, SHAREPOINT, or GOOGLEDRIVE. For SharePoint sources, sourceMetadata.driveId is also set.
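
On the client side, it can be convenient to map each citation onto a typed object. The following is a minimal sketch; the `Citation` dataclass and `parse_citation` function are names made up for this example, and only the fields shown in the sample citation above are read.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Citation:
    index: int
    file_name: str
    page_number: str  # a string in the sample response, not an integer
    source_type: str  # "S3", "SHAREPOINT", or "GOOGLEDRIVE"
    source_id: Optional[str]
    drive_id: Optional[str]  # set for SharePoint sources


def parse_citation(raw: dict[str, Any]) -> Citation:
    """Convert one entry of the citations array into a Citation."""
    meta = raw.get("sourceMetadata") or {}
    return Citation(
        index=raw["index"],
        file_name=raw["fileName"],
        page_number=raw["pageNumber"],
        source_type=raw["sourceType"],
        source_id=meta.get("sourceId"),
        drive_id=meta.get("driveId"),
    )


sample = {
    "index": 1,
    "fileName": "refund-policy.pdf",
    "pageNumber": "2",
    "sourceType": "S3",
    "sourceMetadata": {"sourceId": "policies/refund-policy.pdf", "driveId": None},
}
citation = parse_citation(sample)
print(citation.file_name, citation.page_number)
```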

Multi-turn conversations

Pass the conversation so far in pastMessages (oldest first). Don’t include the current question there — it goes in question.

    curl -X POST "$API_URL/ask" \
      -H "Authorization: Bearer $API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "question": "And how do I start one?",
        "docSetNames": ["company-policies"],
        "pastMessages": [
          { "role": "user", "content": "What is our refund policy?" },
          { "role": "assistant", "content": "Refunds are available within 30 days [1]." }
        ]
      }'
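
One way to keep pastMessages in order across turns is a small client-side history object. The sketch below is an assumption about how a client might do this, not part of the API; the `Conversation` class and its method names are hypothetical.

```python
from typing import Any


class Conversation:
    """Tracks prior turns so each new request carries them in pastMessages."""

    def __init__(self, doc_set_names: list[str]) -> None:
        self.doc_set_names = doc_set_names
        self.past_messages: list[dict[str, str]] = []

    def next_body(self, question: str) -> dict[str, Any]:
        # The current question goes in "question", never in pastMessages.
        return {
            "question": question,
            "docSetNames": self.doc_set_names,
            "pastMessages": list(self.past_messages),
        }

    def record_turn(self, question: str, answer: str) -> None:
        # Append in chronological order so the list stays oldest first.
        self.past_messages.append({"role": "user", "content": question})
        self.past_messages.append({"role": "assistant", "content": answer})


chat = Conversation(["company-policies"])
first_body = chat.next_body("What is our refund policy?")
chat.record_turn(
    "What is our refund policy?",
    "Refunds are available within 30 days [1].",
)
second_body = chat.next_body("And how do I start one?")
```

Here `second_body` matches the JSON body of the curl example above: two prior messages, with the follow-up question kept separate.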

Tuning with config

All config fields are optional; defaults are shown.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| retrievalLimit | integer (1–200) | 50 | How many passages to retrieve. |
| minimumSimilarityScore | number (0–1) | 0.2 | Discard passages below this similarity. |
| percentageThreshold | number (0–1) | 0.8 | Confidence threshold for answering. |
| model | string | | Override the answering model. |
| enableEntityDisambiguation | boolean | false | Enable entity scoping and clarifying questions. |

    {
      "question": "What changed in the 2024 policy?",
      "docSetNames": ["company-policies"],
      "config": {
        "retrievalLimit": 30,
        "minimumSimilarityScore": 0.5
      }
    }
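
A client may want to check config values against the documented ranges before sending. The sketch below assumes the ranges in the table are inclusive; how the server treats out-of-range values is not stated here, and `validate_config` is a hypothetical helper.

```python
from typing import Any


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_config(config: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the ranges look fine."""
    problems: list[str] = []
    limit = config.get("retrievalLimit")
    if limit is not None:
        if not (_is_number(limit) and isinstance(limit, int) and 1 <= limit <= 200):
            problems.append("retrievalLimit must be an integer from 1 to 200")
    for key in ("minimumSimilarityScore", "percentageThreshold"):
        value = config.get(key)
        if value is not None and not (_is_number(value) and 0 <= value <= 1):
            problems.append(f"{key} must be a number from 0 to 1")
    return problems


ok = validate_config({"retrievalLimit": 30, "minimumSimilarityScore": 0.5})
bad = validate_config({"retrievalLimit": 500, "percentageThreshold": 1.5})
print(ok)
print(bad)
```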

promptContext

Use promptContext to add system-level guidance for a single request — for example, a persona or formatting instruction:

    {
      "question": "Summarize the refund policy.",
      "docSetNames": ["company-policies"],
      "promptContext": "Answer in two short bullet points for a support agent."
    }

Streaming

For a responsive chat UI, stream the answer as it’s generated. Enable streaming either by setting "stream": true in the body or by sending Accept: text/event-stream. The response is a Server-Sent Events (SSE) stream of data: lines, each carrying a JSON chunk.

    curl -N -X POST "$API_URL/ask" \
      -H "Authorization: Bearer $API_TOKEN" \
      -H "Content-Type: application/json" \
      -H "Accept: text/event-stream" \
      -d '{
        "question": "What is our refund policy?",
        "docSetNames": ["company-policies"],
        "stream": true
      }'

The stream opens with a :ok comment line to establish the connection. If an error occurs mid-stream, an event with an error field is emitted before the stream closes.
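
The stream handling described above can be sketched as a small line parser. It assumes each data: line holds one complete JSON document, as stated above. The chunk schema is not documented here, so the `text` field in the sample lines is purely hypothetical; `iter_chunks` and `StreamError` are also names invented for this example.

```python
import json
from typing import Any, Iterable, Iterator


class StreamError(Exception):
    """Raised when the stream emits a chunk carrying an error field."""


def iter_chunks(lines: Iterable[str]) -> Iterator[Any]:
    """Yield decoded JSON chunks from raw SSE lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank separators and comment lines such as ":ok".
        if not line or line.startswith(":"):
            continue
        # Ignore other SSE fields (event:, id:, retry:).
        if not line.startswith("data:"):
            continue
        chunk = json.loads(line[len("data:"):].strip())
        if isinstance(chunk, dict) and "error" in chunk:
            raise StreamError(str(chunk["error"]))
        yield chunk


# Sample lines; the "text" field is a made-up placeholder.
sample_lines = [
    ":ok",
    "",
    'data: {"text": "Refunds are available "}',
    "",
    'data: {"text": "within 30 days [1]."}',
    "",
]
chunks = list(iter_chunks(sample_lines))
print(len(chunks))
```

In a real client, the lines would come from the HTTP response body read incrementally rather than from a list.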

Rendering citations

The answer contains inline markers like [1], [2] that map to entries in citations by their index. A typical UI replaces each marker with a link or tooltip showing the citation’s fileName and pageNumber.
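
As a rough sketch of the marker replacement, the function below expands each [n] marker into a plain-text label built from the matching citation. The label format and the name `render_citations` are choices made for this example; a real UI would more likely emit a link or tooltip element. Markers with no matching citation are left unchanged.

```python
import re
from typing import Any


def render_citations(answer: str, citations: list[dict[str, Any]]) -> str:
    """Replace [n] markers with a label showing fileName and pageNumber."""
    by_index = {c["index"]: c for c in citations}

    def replace(match: re.Match) -> str:
        citation = by_index.get(int(match.group(1)))
        if citation is None:
            # No citation with this index: keep the marker as written.
            return match.group(0)
        return (
            f'[{citation["index"]}: {citation["fileName"]}, '
            f'p. {citation["pageNumber"]}]'
        )

    return re.sub(r"\[(\d+)\]", replace, answer)


rendered = render_citations(
    "Refunds are available within 30 days [1].",
    [{"index": 1, "fileName": "refund-policy.pdf", "pageNumber": "2"}],
)
print(rendered)
# prints: Refunds are available within 30 days [1: refund-policy.pdf, p. 2].
```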