Building a chatbot
The POST /ask endpoint is the heart of the API. Give it a question and one or
more docsets, and it retrieves relevant passages and generates an answer grounded
in them, with citations.
Request
POST /ask
| Field | Type | Required | Description |
|---|---|---|---|
| question | string | ✅ | The user’s question. |
| docSetNames | string[] | ✅ | Docsets to search. |
| pastMessages | Message[] | | Prior turns, oldest first (see multi-turn). |
| config | object | | Retrieval/generation tuning (see config). |
| promptContext | string | | Extra system context prepended to the prompt. |
| stream | boolean | | Stream the answer as SSE (see streaming). |
A Message is { "content": string, "role": "user" | "assistant" }.
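As an illustration of how the request fields fit together, here is a minimal Python sketch that assembles a body for POST /ask. The function name `build_ask_body` and its validation rules are assumptions made for this example, not part of the API; only the JSON field names come from the table above.

```python
from typing import Any, Optional


def build_ask_body(
    question: str,
    doc_set_names: list[str],
    past_messages: Optional[list[dict[str, str]]] = None,
    config: Optional[dict[str, Any]] = None,
    prompt_context: Optional[str] = None,
    stream: bool = False,
) -> dict[str, Any]:
    """Assemble a JSON-serializable body for POST /ask (hypothetical helper)."""
    if not question:
        raise ValueError("question is required")
    if not doc_set_names:
        raise ValueError("docSetNames needs at least one docset")
    for message in past_messages or []:
        # A Message role is either "user" or "assistant".
        if message.get("role") not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {message.get('role')!r}")

    body: dict[str, Any] = {
        "question": question,
        "docSetNames": list(doc_set_names),
    }
    # Optional fields are left out entirely when unset.
    if past_messages:
        body["pastMessages"] = list(past_messages)
    if config:
        body["config"] = dict(config)
    if prompt_context:
        body["promptContext"] = prompt_context
    if stream:
        body["stream"] = True
    return body
```

The resulting dictionary can be serialized with `json.dumps` and sent by whichever HTTP client you already use.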
Response
| Field | Type | Description |
|---|---|---|
| answer | string | The generated answer. Citation markers like [1] refer to citations. |
| citations | Citation[] | Sources used, indexed to the markers in answer. |
| usage | object | Token usage: inputTokens, outputTokens, calls, perModel. |
A Citation is:
{
  "index": 1,
  "fileName": "refund-policy.pdf",
  "pageNumber": "2",
  "sourceType": "S3",
  "sourceMetadata": { "sourceId": "policies/refund-policy.pdf", "driveId": null }
}

sourceType is one of S3, SHAREPOINT, or GOOGLEDRIVE. For SharePoint
sources, sourceMetadata.driveId is also set.
Multi-turn conversations
Pass the conversation so far in pastMessages (oldest first). Don’t include the
current question there — it goes in question.
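A client usually has to carry this history itself between requests. The sketch below shows one way to do that in Python; the `Conversation` class and its method names are hypothetical, and it assumes the assistant's answer is stored verbatim, citation markers included.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Keeps prior turns, oldest first, for use as pastMessages (hypothetical helper)."""

    doc_set_names: list[str]
    past_messages: list[dict[str, str]] = field(default_factory=list)

    def next_body(self, question: str) -> dict:
        """Body for the next POST /ask; the new question stays out of pastMessages."""
        body = {"question": question, "docSetNames": list(self.doc_set_names)}
        if self.past_messages:
            body["pastMessages"] = list(self.past_messages)
        return body

    def record_turn(self, question: str, answer: str) -> None:
        """Call once a response arrives so the next request includes this turn."""
        self.past_messages.append({"role": "user", "content": question})
        self.past_messages.append({"role": "assistant", "content": answer})
```

Recording the turn only after the answer arrives keeps the current question out of pastMessages, as the rule above requires.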
curl
curl -X POST "$API_URL/ask" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "And how do I start one?",
    "docSetNames": ["company-policies"],
    "pastMessages": [
      { "role": "user", "content": "What is our refund policy?" },
      { "role": "assistant", "content": "Refunds are available within 30 days [1]." }
    ]
  }'

Tuning with config
All config fields are optional; defaults are shown.
| Field | Type | Default | Description |
|---|---|---|---|
| retrievalLimit | integer (1–200) | 50 | How many passages to retrieve. |
| minimumSimilarityScore | number (0–1) | 0.2 | Discard passages below this similarity. |
| percentageThreshold | number (0–1) | 0.8 | Confidence threshold for answering. |
| model | string | | Override the answering model. |
| enableEntityDisambiguation | boolean | false | Enable entity scoping and clarifying questions. |
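The ranges in the table above can be checked on the client before a request is sent. The following Python sketch is one way to do so; `check_config` is a hypothetical helper, it assumes the documented ranges are inclusive, and the server remains the source of truth for what is accepted.

```python
from typing import Any

# Fields documented as numbers in the range 0 to 1.
_UNIT_INTERVAL_FIELDS = ("minimumSimilarityScore", "percentageThreshold")


def check_config(config: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means nothing obvious is wrong."""
    problems: list[str] = []

    limit = config.get("retrievalLimit")
    if limit is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 200:
            problems.append("retrievalLimit must be an integer from 1 to 200")

    for name in _UNIT_INTERVAL_FIELDS:
        value = config.get(name)
        if value is None:
            continue
        if isinstance(value, bool) or not isinstance(value, (int, float)) or not 0 <= value <= 1:
            problems.append(f"{name} must be a number from 0 to 1")

    flag = config.get("enableEntityDisambiguation")
    if flag is not None and not isinstance(flag, bool):
        problems.append("enableEntityDisambiguation must be a boolean")

    model = config.get("model")
    if model is not None and not isinstance(model, str):
        problems.append("model must be a string")

    return problems
```

Returning a list rather than raising on the first problem lets a settings form show every issue at once.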
{
  "question": "What changed in the 2024 policy?",
  "docSetNames": ["company-policies"],
  "config": { "retrievalLimit": 30, "minimumSimilarityScore": 0.5 }
}

promptContext
Use promptContext to add system-level guidance for a single request — for
example, a persona or formatting instruction:
{
  "question": "Summarize the refund policy.",
  "docSetNames": ["company-policies"],
  "promptContext": "Answer in two short bullet points for a support agent."
}

Streaming
For a responsive chat UI, stream the answer as it’s generated. Enable streaming
either by setting "stream": true in the body or by sending
Accept: text/event-stream. The response is a Server-Sent Events stream of
data: lines, each carrying a JSON chunk.
curl
curl -N -X POST "$API_URL/ask" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "question": "What is our refund policy?",
    "docSetNames": ["company-policies"],
    "stream": true
  }'

The stream opens with a :ok comment line to establish the connection. If an
error occurs mid-stream, an event with an error field is emitted before the
stream closes.
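On the client side, consuming the stream comes down to skipping comment lines, decoding each data: line, and watching for the error field. The Python sketch below illustrates that under two assumptions: every data: line holds one complete JSON document, and the error field sits at the top level of the chunk. `iter_chunks` and `StreamError` are hypothetical names, and the fields inside a normal chunk are not specified here, so the function yields them undecoded beyond JSON parsing.

```python
import json
from typing import Any, Iterable, Iterator


class StreamError(RuntimeError):
    """Raised when a chunk carries an error field (hypothetical exception)."""


def iter_chunks(lines: Iterable[str]) -> Iterator[Any]:
    """Yield the decoded JSON chunk from each data: line of an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(":"):
            # Blank lines separate events; ':' starts a comment such as :ok.
            continue
        if not line.startswith("data:"):
            # Other SSE fields (event:, id:, retry:) are ignored in this sketch.
            continue
        payload = line[len("data:"):]
        if payload.startswith(" "):
            # SSE allows one optional space after the colon.
            payload = payload[1:]
        chunk = json.loads(payload)
        if isinstance(chunk, dict) and "error" in chunk:
            raise StreamError(str(chunk["error"]))
        yield chunk
```

The function takes any iterable of text lines, so it can be fed from an HTTP client that exposes the response body line by line, or from a list of strings in a unit test.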
Rendering citations
The answer contains inline markers like [1], [2] that map to entries in
citations by their index. A typical UI replaces each marker with a link or
tooltip showing the citation’s fileName and pageNumber.
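The marker replacement described above could look like the following Python sketch, which renders each known marker as a superscript with a tooltip. The function name `render_answer_html` and the choice of a `<sup title="...">` element are assumptions for illustration; the citation fields used (index, fileName, pageNumber) are the ones documented for a Citation.

```python
import html
import re
from typing import Any

# Matches inline markers such as [1] or [12].
_MARKER = re.compile(r"\[(\d+)\]")


def render_answer_html(answer: str, citations: list[dict[str, Any]]) -> str:
    """Replace [n] markers with <sup> tooltips naming the cited file and page."""
    by_index = {c["index"]: c for c in citations}

    def render_marker(match: re.Match) -> str:
        citation = by_index.get(int(match.group(1)))
        if citation is None:
            # Leave markers without a matching citation as plain text.
            return html.escape(match.group(0))
        title = citation["fileName"]
        if citation.get("pageNumber"):
            title += f", p. {citation['pageNumber']}"
        return f'<sup title="{html.escape(title, quote=True)}">[{match.group(1)}]</sup>'

    # Escape the text between markers so the answer cannot inject markup.
    parts: list[str] = []
    last = 0
    for match in _MARKER.finditer(answer):
        parts.append(html.escape(answer[last:match.start()]))
        parts.append(render_marker(match))
        last = match.end()
    parts.append(html.escape(answer[last:]))
    return "".join(parts)
```

Looking citations up by their index field, rather than by list position, keeps the rendering correct even if the citations array is not sorted.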