Indexes & Documents

An index is an app-scoped container of RAG-ready documents. Upload PDFs, Markdown, HTML, DOCX, XLSX, or text into an index, and the platform chunks and embeds them so the agent can recall passages through semantic search during its think step.

Indexes are for knowledge: the content you want the agent to retrieve from. Customer file blobs and structured records belong in your backend (PocketBase, Supabase, or your own database), exposed to the agent via MCP.

Go

// Errors are discarded for brevity; check them in real code.
index, _ := client.CreateIndex(ctx, tavora.CreateIndexInput{
    Name:        "Support docs",
    Description: "FAQ + product manuals",
})

indexes, _ := client.ListIndexes(ctx)    // every index in the app
got, _ := client.GetIndex(ctx, index.ID) // fetch one index by ID
_, _ = client.UpdateIndex(ctx, index.ID, tavora.UpdateIndexInput{
    Name: "Support knowledge",
})
_ = client.DeleteIndex(ctx, index.ID)

TypeScript

const index = await client.createIndex({
  name: 'Support docs',
  description: 'FAQ + product manuals',
});

const all = await client.listIndexes();
const got = await client.getIndex(index.id);
await client.updateIndex(index.id, { name: 'Support knowledge' });
await client.deleteIndex(index.id);

Uploading documents

Documents are uploaded as multipart bytes plus optional provenance metadata. Indexable file types (PDF, DOCX, XLSX, CSV, MD, HTML, TXT, images) are chunked and embedded; other types are stored as opaque files (status: "stored") and are listable but not searchable.

Go

doc, _ := client.UploadDocument(ctx, tavora.UploadDocumentInput{
    IndexID:  index.ID,
    FilePath: "./faq.md",
    // Optional provenance; round-trips through document metadata.
    Source:   "https://example.com/faq",
    Task:     "support-handbook",
    Type:     "faq",
    Tags:     []string{"v2026.05", "public"},
})

TypeScript

import { openAsBlob } from 'node:fs';

const file = await openAsBlob('./faq.md');
const doc = await client.uploadDocument({
  indexId: index.id,
  file,
  filename: 'faq.md',
  source: 'https://example.com/faq',
  task: 'support-handbook',
  type: 'faq',
  tags: ['v2026.05', 'public'],
});

Versioning by name

Re-uploading with the same name to the same index bumps the document’s version. Older versions stay queryable (is_latest=false) and can be fetched with ?version=N. Pass if_version for optimistic concurrency: a mismatch returns 409, which asVersionConflict lets you detect so you can retry the rewrite.
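
The if_version check can be pictured with a small in-memory model. This is a sketch, not the SDK: VersionedStore, UploadResult, and rewriteWithRetry are hypothetical names that only mimic the behavior described above, where each upload under the same name bumps the version and a stale if_version is refused with a 409-style result.

```typescript
// Hypothetical in-memory model of versioning by name; not the real SDK.
interface StoredDoc {
  name: string;
  version: number;
  body: string;
  isLatest: boolean;
}

type UploadResult =
  | { ok: true; doc: StoredDoc }
  | { ok: false; status: 409; currentVersion: number };

class VersionedStore {
  private rows: StoredDoc[] = [];

  latest(name: string): StoredDoc | undefined {
    return this.rows.find((r) => r.name === name && r.isLatest);
  }

  // Counterpart of fetching an older version with ?version=N.
  get(name: string, version: number): StoredDoc | undefined {
    return this.rows.find((r) => r.name === name && r.version === version);
  }

  // Upload with an optional if_version precondition.
  upload(name: string, body: string, ifVersion?: number): UploadResult {
    const prev = this.latest(name);
    const currentVersion = prev ? prev.version : 0;
    if (ifVersion !== undefined && ifVersion !== currentVersion) {
      return { ok: false, status: 409, currentVersion };
    }
    if (prev) prev.isLatest = false; // older versions stay fetchable
    const doc: StoredDoc = { name, version: currentVersion + 1, body, isLatest: true };
    this.rows.push(doc);
    return { ok: true, doc };
  }
}

// Re-read the latest version and reapply the edit whenever the precondition fails.
function rewriteWithRetry(
  store: VersionedStore,
  name: string,
  edit: (body: string) => string,
  maxAttempts = 3,
): StoredDoc {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const base = store.latest(name);
    if (!base) throw new Error(`no document named ${name}`);
    const result = store.upload(name, edit(base.body), base.version);
    if (result.ok) return result.doc;
  }
  throw new Error(`gave up after ${maxAttempts} version conflicts`);
}
```

The point of the loop is that each retry starts from the freshly read latest version, so a competing write is incorporated rather than overwritten.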

Extracted-markdown siblings

Non-markdown indexable types (PDF, DOCX, XLSX, …) generate an extracted markdown sibling on upload — a second document row with content_type=text/markdown, parent_id pointing at the original, and metadata.derived_from="extraction". Chunks attach to the sibling so search hits cite the editable form.
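
To make the sibling relationship concrete, here is a minimal sketch that picks the extracted sibling out of a list of document rows. Only content_type, parent_id, and metadata.derived_from come from the description above; the DocumentRow shape and the extractedSibling helper are hypothetical.

```typescript
// Hypothetical row shape; only content_type, parent_id, and
// metadata.derived_from are taken from the text.
interface DocumentRow {
  id: string;
  name: string;
  content_type: string;
  parent_id?: string;
  metadata?: { derived_from?: string };
}

// Find the extracted-markdown sibling of an original upload, if one exists.
function extractedSibling(
  rows: DocumentRow[],
  originalId: string,
): DocumentRow | undefined {
  return rows.find(
    (r) =>
      r.parent_id === originalId &&
      r.content_type === "text/markdown" &&
      r.metadata?.derived_from === "extraction",
  );
}
```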

Dedup

Every uploaded document is hashed server-side; the hex SHA-256 digest is returned as content_sha256. Find duplicates with ?content_sha256=<hex> or the shorthand ?duplicate_of=<id>.
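
If you want to check for a duplicate before uploading, you can compute the same kind of digest locally. The sketch below assumes the server hashes the raw uploaded bytes, which the text does not state explicitly; sha256Hex and duplicateQuery are hypothetical helper names, and only the content_sha256 query parameter comes from the docs.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the given bytes or string.
function sha256Hex(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Build the duplicate-lookup query string from local content.
// Assumption: the server's content_sha256 covers the same raw bytes.
function duplicateQuery(data: Uint8Array | string): string {
  return `?content_sha256=${sha256Hex(data)}`;
}
```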

Searching

Go

chunks, _ := client.Search(ctx, tavora.SearchInput{
    Query:   "what document formats are supported?",
    IndexID: index.ID,
    Limit:   5,
})
// One row per matched chunk.

docs, _ := client.SearchDocuments(ctx, tavora.SearchInput{
    Query:   "what artifacts cover refund policy?",
    IndexID: index.ID,
    Limit:   5,
})
// One row per distinct document, best chunk inlined as best_chunk.preview.

TypeScript

const chunks = await client.search({
  query: 'what document formats are supported?',
  indexId: index.id,
  limit: 5,
});

const docs = await client.searchDocuments({
  query: 'what artifacts cover refund policy?',
  indexId: index.id,
  limit: 5,
});

search() returns chunks and is the agent’s default. searchDocuments() is the right call when the question is “which artifacts are about X” rather than “which passages are about X”.
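
The difference between the two result shapes can be illustrated by collapsing chunk hits into one row per document. This is only a sketch of the relationship, not how the platform implements searchDocuments: the ChunkHit and DocumentHit types, the documentId and score fields, and the assumption that a higher score is better are all hypothetical; only best_chunk.preview is named in the docs.

```typescript
// Hypothetical hit shapes for illustration.
interface ChunkHit {
  documentId: string;
  score: number; // assumption: higher means more relevant
  text: string;
}

interface DocumentHit {
  documentId: string;
  best_chunk: { score: number; preview: string };
}

// Keep the best-scoring chunk per document, then order documents by that score.
function collapseToDocuments(chunks: ChunkHit[], previewLength = 80): DocumentHit[] {
  const best = new Map<string, ChunkHit>();
  for (const chunk of chunks) {
    const current = best.get(chunk.documentId);
    if (!current || chunk.score > current.score) {
      best.set(chunk.documentId, chunk);
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .map((c) => ({
      documentId: c.documentId,
      best_chunk: { score: c.score, preview: c.text.slice(0, previewLength) },
    }));
}
```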

Deletion

deleteDocument is soft by default: it sets deleted_at, clears is_latest, and is idempotent, returning 204 whether the row existed or was already gone. Use deleteDocumentHard to remove the row and the on-disk file.
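
The two deletion modes can be modeled in a few lines. The sketch below is an in-memory stand-in, not the SDK: DocTable and its methods are hypothetical, and two details are choices of this sketch rather than documented behavior, namely that a repeated soft delete keeps the first deleted_at timestamp and that hardDelete reports whether a row was removed.

```typescript
// Hypothetical in-memory model of the deletion semantics; not the real SDK.
interface DeletableRow {
  id: string;
  is_latest: boolean;
  deleted_at?: string; // ISO timestamp, set by a soft delete
}

class DocTable {
  private rows = new Map<string, DeletableRow>();

  insert(id: string): void {
    this.rows.set(id, { id, is_latest: true });
  }

  get(id: string): DeletableRow | undefined {
    return this.rows.get(id);
  }

  // Soft delete: marks the row but keeps it. Always answers 204,
  // whether the row exists, is already deleted, or was never there.
  softDelete(id: string, now: Date = new Date()): 204 {
    const row = this.rows.get(id);
    if (row && row.deleted_at === undefined) {
      row.deleted_at = now.toISOString();
      row.is_latest = false;
    }
    return 204;
  }

  // Hard delete: the row itself is removed.
  // Returning a boolean is a choice of this sketch.
  hardDelete(id: string): boolean {
    return this.rows.delete(id);
  }
}
```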

Per-session restrictions

Pass index_ids on createAgentSession to scope the agent’s retrieval to a subset of an app’s indexes. This is useful for per-tenant knowledge slicing (one index per tenant, pinned at session creation). Omit it and the agent can search every enabled index in the app.
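
For per-tenant slicing, one approach is to resolve the tenant's indexes before creating the session and refuse to continue when none are found. A minimal sketch, assuming you keep your own tenant-to-index mapping: sessionScope and the mapping shape are hypothetical, and the exact casing of the option in the TypeScript SDK may differ from index_ids.

```typescript
// Hypothetical mapping maintained by your app: tenant ID -> index IDs.
type TenantIndexMap = Record<string, string[]>;

// Resolve the index_ids to pin for a tenant's session.
// Fails closed: omitting index_ids would let the agent search every
// enabled index in the app, so a missing mapping is an error here.
function sessionScope(
  tenantId: string,
  tenantIndexes: TenantIndexMap,
): { index_ids: string[] } {
  const ids = tenantIndexes[tenantId];
  if (!ids || ids.length === 0) {
    throw new Error(`no indexes configured for tenant ${tenantId}`);
  }
  return { index_ids: ids.slice() }; // copy so callers cannot mutate the map
}
```

The returned object is meant to be merged into whatever other options your createAgentSession call takes.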