MCP servers

Tavora agents can call tools exposed by any Model Context Protocol (MCP) server the workspace has registered. Inside an execute_js block the agent writes code like:

var list = require('tasklist-example').create_task_list({ name: 'Europe trip' });
var added = 0;
var cities = ['Berlin', 'Munich', 'Hamburg', 'Cologne'];
for (var i = 0; i < cities.length; i++) {
  require('tasklist-example').add_task({
    list_id: list.content.id,
    title: 'Visit ' + cities[i]
  });
  added++;
}
return added + ' tasks added';

One execute_js turn can make as many MCP calls as the plan needs. The SDK-side workflow to make that work is three steps:

Register — store the URL, transport, and auth for the server.
Test — dial the server, capture its tool schemas, and materialise a governed skill row.
(Optional) Bind — pin specific MCP skills to an agent version via skills_json.

1. Register

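// Marshal the auth config so quotes or backslashes in the secret are escaped.
authConfig, err := json.Marshal(map[string]string{"type": "bearer", "token": secret})
if err != nil {
    return err
}
server, err := client.CreateMCPServer(ctx, tavora.CreateMCPServerInput{
    Name:       "tasklist-example",
    URL:        "https://example.com/mcp",
    Transport:  "streamable_http", // or "sse"
    AuthConfig: authConfig,
})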

Supported transports are streamable_http (recommended) and sse. AuthConfig accepts one of two shapes:

Type	JSON
Bearer token	`{"type":"bearer","token":"..."}`
Custom header	`{"type":"header","name":"X-Foo","value":"..."}`
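Because AuthConfig is raw JSON, a misspelled or missing field is easy to overlook. The sketch below checks both shapes from the table on the client side before encoding them. The authConfig type and the encode helper are hypothetical and not part of the Tavora SDK; only the JSON field names come from the table above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// authConfig mirrors the two JSON shapes in the table above.
// Hypothetical helper type, not part of the Tavora SDK.
type authConfig struct {
	Type  string `json:"type"`
	Token string `json:"token,omitempty"` // bearer only
	Name  string `json:"name,omitempty"`  // header only
	Value string `json:"value,omitempty"` // header only
}

// validate reports whether the fields required by the chosen type are set.
func (a authConfig) validate() error {
	switch a.Type {
	case "bearer":
		if a.Token == "" {
			return errors.New("bearer auth needs a token")
		}
	case "header":
		if a.Name == "" || a.Value == "" {
			return errors.New("header auth needs a name and a value")
		}
	default:
		return fmt.Errorf("unknown auth type %q", a.Type)
	}
	return nil
}

// encode validates the config and returns JSON suitable for AuthConfig.
func encode(a authConfig) (json.RawMessage, error) {
	if err := a.validate(); err != nil {
		return nil, err
	}
	b, err := json.Marshal(a)
	return json.RawMessage(b), err
}

func main() {
	// A value containing a quote is escaped by the encoder.
	raw, err := encode(authConfig{Type: "header", Name: "X-Foo", Value: `a"b`})
	fmt.Println(string(raw), err)

	// A bearer config without a token is rejected before any network call.
	_, err = encode(authConfig{Type: "bearer"})
	fmt.Println(err)
}
```

The result of encode can be passed as AuthConfig when registering the server.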

The server on your side must validate the same secret on every incoming request — see examples/tasklist/internal/web/mcp.go for a reference bearer-gate middleware.
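For orientation, a bearer gate can be as small as the sketch below. It is not the code in examples/tasklist/internal/web/mcp.go; the names validBearer and bearerGate, the /mcp path, and the secret value are assumptions made for illustration. It uses only the Go standard library.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// validBearer reports whether an Authorization header value carries the
// expected secret. The comparison is constant-time to avoid timing leaks.
func validBearer(header, secret string) bool {
	const prefix = "Bearer "
	if secret == "" || !strings.HasPrefix(header, prefix) {
		return false
	}
	token := strings.TrimPrefix(header, prefix)
	return subtle.ConstantTimeCompare([]byte(token), []byte(secret)) == 1
}

// bearerGate rejects any request that does not present the shared secret.
func bearerGate(secret string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !validBearer(r.Header.Get("Authorization"), secret) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Stand-in for the real MCP handler.
	mcp := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	gated := bearerGate("s3cret", mcp)

	// Exercise the gate in-process: right token, wrong token, no header.
	for _, auth := range []string{"Bearer s3cret", "Bearer wrong", ""} {
		req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
		if auth != "" {
			req.Header.Set("Authorization", auth)
		}
		rec := httptest.NewRecorder()
		gated.ServeHTTP(rec, req)
		fmt.Println(rec.Code)
	}
}
```

Only the first request reaches the wrapped handler; the other two are answered with 401 by the gate.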

2. Test

result, err := client.TestMCPServer(ctx, server.ID)
// result.Tools        → []MCPToolSchema captured from tools/list
// result.Drift        → diff vs the prior snapshot (added/removed/changed)
// result.IsFirstTest  → true if this is the initial capture
// result.Skill        → the materialised skill row

TestMCPServer does three things atomically on the backend:

Dials the MCP server using the registered auth.
Calls tools/list and captures every advertised tool’s name, description, and input schema.
Upserts a type='mcp' skill row linked to the server, storing the captured tools under parameters.tools with a tested_at timestamp.

If the dial or list fails, TestMCPServer returns an error and no skill row is created or modified — broken MCP servers surface here rather than mid-run.
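To make the capture step concrete: in MCP, tools/list is a JSON-RPC 2.0 method, and each tool in the response carries a name, a description, and an inputSchema. The sketch below decodes such a response and treats a JSON-RPC error as a failure. The toolSchema type and parseToolsList function are hypothetical local helpers; the SDK's MCPToolSchema and the backend's actual capture code may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolSchema holds the three fields captured per tool.
// Hypothetical local type; the SDK's MCPToolSchema may differ.
type toolSchema struct {
	Name        string          `json:"name"`
	Description string          `json:"description"`
	InputSchema json.RawMessage `json:"inputSchema"`
}

// parseToolsList extracts the tools from a JSON-RPC 2.0 tools/list response.
func parseToolsList(body []byte) ([]toolSchema, error) {
	var resp struct {
		Result struct {
			Tools []toolSchema `json:"tools"`
		} `json:"result"`
		Error *struct {
			Code    int    `json:"code"`
			Message string `json:"message"`
		} `json:"error"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode tools/list response: %w", err)
	}
	if resp.Error != nil {
		return nil, fmt.Errorf("tools/list failed: %s (code %d)", resp.Error.Message, resp.Error.Code)
	}
	return resp.Result.Tools, nil
}

func main() {
	// The request an MCP client sends to enumerate tools.
	req, _ := json.Marshal(map[string]any{"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
	fmt.Println(string(req))

	// A sample response with one tool (made-up data).
	body := []byte(`{"jsonrpc":"2.0","id":1,"result":{"tools":[
	  {"name":"add_task","description":"Add a task","inputSchema":{"type":"object"}}]}}`)
	tools, err := parseToolsList(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range tools {
		fmt.Println(t.Name, string(t.InputSchema))
	}
}
```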

Drift detection

Re-running TestMCPServer against a server with an existing snapshot returns a Drift struct listing what changed:

if !result.IsFirstTest && result.Drift.HasDrift() {
    fmt.Printf("added: %v\n", result.Drift.Added)
    fmt.Printf("removed: %v\n", result.Drift.Removed)
    for _, c := range result.Drift.Changed {
        fmt.Printf("changed: %s (%s)\n", c.Name, c.What)
    }
}

Drift entries are keyed by tool name. The comparison is structural, so whitespace-only changes in an input schema do not count as drift.
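One way to get a structural comparison is to decode both schemas and compare the decoded values, which ignores whitespace and object key order. The sketch below does that and groups the differences by tool name. It is an illustration only: sameSchema and diffTools are hypothetical, the backend's algorithm is not shown in this document, and this version looks only at input schemas.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"sort"
)

// sameSchema compares two JSON schemas structurally: both are decoded first,
// so whitespace and object key order do not affect the result.
func sameSchema(a, b string) bool {
	var x, y any
	if json.Unmarshal([]byte(a), &x) != nil || json.Unmarshal([]byte(b), &y) != nil {
		return a == b // fall back to a byte comparison for invalid JSON
	}
	return reflect.DeepEqual(x, y)
}

// diffTools compares two snapshots that map tool name to input schema.
func diffTools(prev, next map[string]string) (added, removed, changed []string) {
	for name, schema := range next {
		old, ok := prev[name]
		switch {
		case !ok:
			added = append(added, name)
		case !sameSchema(old, schema):
			changed = append(changed, name)
		}
	}
	for name := range prev {
		if _, ok := next[name]; !ok {
			removed = append(removed, name)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(added)
	sort.Strings(removed)
	sort.Strings(changed)
	return added, removed, changed
}

func main() {
	prev := map[string]string{
		"add_task": `{"type":"object"}`,
		"old_tool": `{}`,
	}
	next := map[string]string{
		"add_task": "{ \"type\" : \"object\" }", // whitespace differs only
		"new_tool": `{}`,
	}
	added, removed, changed := diffTools(prev, next)
	fmt.Println(added, removed, changed) // prints: [new_tool] [old_tool] []
}
```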

CLI equivalent

tavora mcp create --name tasklist-example --url https://... \
    --transport streamable_http
tavora mcp test SERVER_ID

3. Runtime behaviour

On every agent run, the runtime decides how each registered MCP server reaches the sandbox, if at all:

Server state	Result
Tested, skill enabled	Sandbox pack built from stored schemas — no fresh `tools/list` handshake.
Tested, skill disabled	Server skipped (admin kill-switch).
Never tested	Fallback: dial + list at run time (backwards compat).
Server disabled	Always skipped regardless of skill state.

This is why Test is the recommended path: it pins the schemas an agent run sees, so changes upstream surface as drift on the next Test rather than silent behaviour shifts mid-production.
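The table can be read as a small decision function. The sketch below restates it in Go; the serverState type, the resolve function, and the three outcome strings are hypothetical names chosen for this example, not the runtime's own code.

```go
package main

import "fmt"

// serverState captures the three flags the table distinguishes.
type serverState struct {
	ServerEnabled bool
	Tested        bool
	SkillEnabled  bool // only meaningful once the server has been tested
}

// resolve returns how a server reaches the sandbox for one run:
// "snapshot" (pack built from stored schemas), "dial" (tools/list at run
// time), or "skip".
func resolve(s serverState) string {
	switch {
	case !s.ServerEnabled:
		return "skip" // a disabled server wins over any skill state
	case !s.Tested:
		return "dial" // backwards-compatible fallback
	case !s.SkillEnabled:
		return "skip" // admin kill-switch
	default:
		return "snapshot"
	}
}

func main() {
	states := []serverState{
		{ServerEnabled: true, Tested: true, SkillEnabled: true},
		{ServerEnabled: true, Tested: true, SkillEnabled: false},
		{ServerEnabled: true, Tested: false},
		{ServerEnabled: false, Tested: true, SkillEnabled: true},
	}
	for _, s := range states {
		fmt.Printf("%+v -> %s\n", s, resolve(s))
	}
}
```

Checking the server flag first is what makes the last table row hold regardless of skill state.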

4. (Optional) Per-version binding

An agent version’s skills_json can pin which MCP servers that version uses. A non-empty skills_json acts as a filter: only the bound MCP skills load. An empty skills_json loads every enabled MCP server in the workspace (the legacy default-all behaviour, kept for backwards compatibility).

// Create a new agent version pinned to one MCP server and one module skill.
v, err := client.CreateAgentVersion(ctx, agentID, tavora.CreateAgentVersionInput{
    FromVersionID: prevVersion.ID,
    Semver:        "1.2.0",
    PersonaMD:     "You plan trips using the tasklist-example server.",
    Skills: []tavora.SkillBinding{
        {SkillID: tasklistMCPSkillID, Version: "latest"},
        {SkillID: helpersModuleSkillID, Version: "latest"},
    },
})

To get tasklistMCPSkillID, read result.Skill.ID from the TestMCPServer call that first captured the server, or look up the materialised skill row later with ListSkills.
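The filter rule for skills_json can be sketched as follows, on the reading that an empty binding list means "load everything enabled" and any non-empty list restricts loading to the MCP skills it names. selectMCPSkills and the skill IDs are hypothetical; the runtime's real selection code is not shown in this document.

```go
package main

import "fmt"

// selectMCPSkills returns the MCP skill IDs that load for an agent version.
// enabledMCP lists the workspace's enabled MCP skills; bound lists every
// skill ID in the version's skills_json (MCP and non-MCP alike).
func selectMCPSkills(enabledMCP, bound []string) []string {
	if len(bound) == 0 {
		// Legacy default-all: nothing is pinned, so everything enabled loads.
		return append([]string(nil), enabledMCP...)
	}
	pinned := make(map[string]bool, len(bound))
	for _, id := range bound {
		pinned[id] = true
	}
	var out []string
	for _, id := range enabledMCP {
		if pinned[id] {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	enabled := []string{"mcp-tasklist", "mcp-calendar"}

	fmt.Println(selectMCPSkills(enabled, nil))                                     // no bindings
	fmt.Println(selectMCPSkills(enabled, []string{"mcp-tasklist", "mod-helpers"})) // one MCP skill pinned
	fmt.Println(selectMCPSkills(enabled, []string{"mod-helpers"}))                 // only a module skill pinned
}
```

Note the third case: under this reading, binding only a module skill still makes skills_json non-empty, so no MCP server loads for that version.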

How the agent calls MCP tools

Every MCP tool is exposed as a JavaScript method on a module named for the MCP server:

// 'tasklist-example' is the server's name.
var mod = require('tasklist-example');
var list = mod.create_task_list({ name: 'Europe' });
mod.add_task({ list_id: list.content.id, title: 'Visit Berlin' });

Calls emit skill_call events in the run trace and are gated by the workspace’s tool policies (key format: <server-name>.<tool-name>).
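As an illustration of the key format, the sketch below builds a policy key from a server name and a tool name and looks it up in a map. Only the <server-name>.<tool-name> format comes from this page. The policy values ("allow", "deny") and the lookup function are invented for the example and say nothing about how Tavora stores or evaluates tool policies.

```go
package main

import "fmt"

// policyKey builds the key under which a tool's policy is looked up.
func policyKey(server, tool string) string {
	return server + "." + tool
}

// lookup returns the policy recorded for a tool, and whether one exists.
// The map and its string values are hypothetical.
func lookup(policies map[string]string, server, tool string) (string, bool) {
	decision, ok := policies[policyKey(server, tool)]
	return decision, ok
}

func main() {
	policies := map[string]string{
		"tasklist-example.add_task":         "allow",
		"tasklist-example.delete_task_list": "deny", // hypothetical tool name
	}
	for _, tool := range []string{"add_task", "delete_task_list", "create_task_list"} {
		decision, ok := lookup(policies, "tasklist-example", tool)
		fmt.Println(policyKey("tasklist-example", tool), decision, ok)
	}
}
```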

Debugging

tavora mcp test SERVER_ID — dials and prints the captured tools + drift block. Useful first check when an MCP call is failing.
tavora agents interactive SESSION_ID — REPL that shows every skill_call event inline.
/platform/mcp-servers — UI with a Test button and last-tested timestamp on each row.

Reference implementation

examples/tasklist in the repo is a full end-to-end MCP server: it registers itself on boot (CreateMCPServer), exposes five tools over streamable_http with bearer auth, and ships a web UI that shows the agent, the chat turns, and the resulting SQLite tasks. Start there when building your own.