MCP servers¶
The MCP layer is how every local capability reaches the agent. Claude Cowork — the operator
that drives Cos — runs in a sandbox that blocks outbound HTTP. A skill running inside that VM
cannot fetch the board API, read the OpenWhispr SQLite store, or call the guard sidecar directly.
So each capability is wrapped as an MCP server the operator mounts as a tool surface, and the
network boundary is crossed once, in a place we control, rather than scattered through skill code.
That single design choice — expose, don't call — is what makes the whole system sandbox-portable. It also gives every capability a uniform shape: a stdio MCP server, optionally fronted by an HTTP bridge, with a consistent lifecycle and (for writes) consistent actor attribution.
Inventory¶
Five core servers — board, openwhispr, calendar, guard, vault — plus an optional WhatsApp
add-on (six in all). Setup is split across skills: mcp-bridge-setup
wires the four non-voice core servers (board, calendar, guard, vault), while openwhispr and
whatsapp each have their own setup skill.
| Server | Bridge port | Runtime | Archetype | Role |
|---|---|---|---|---|
| board | :8001 |
Node (stdio) | Fetch wrapper | The one write path to the board — cases, tasks, notes, messages, reminders, priorities, labels, hierarchy |
| openwhispr | :8002 |
Node (stdio) | Local-store reader + watermark | Reads the external OpenWhispr SQLite + audio dir; owns the processed-watermark |
| calendar | :8003 |
Node (stdio) | Fetch wrapper | Calendar events on the board, linked to cases |
| guard | :8004 |
Node (stdio) | Fetch wrapper (→ sidecar) | Prompt-injection scan + sender whitelist; fronts the guard sidecar |
| vault | :8005 |
Node (stdio) + Agent SDK | Embedded agent | Ingest/query the knowledge vault via a headless Claude session |
:8006 (bridge) · :8010 (Go sidecar) |
Python (stdio) + Go | External two-process bridge | Read/send WhatsApp via an upstream whatsapp-mcp checkout |
Sidecars (not in .mcp.json; the MCPs call them over HTTP): search :8008, guard
:8009, whatsapp-go :8010.
Server archetypes¶
The interesting engineering is that these servers are not uniform under the hood. They fall into four archetypes with very different cost, trust, and statefulness profiles.
1. Thin fetch wrapper over the board HTTP API — no LLM¶
board · calendar · guard are pure fetch wrappers. Every tool maps to one HTTP route on
CRM_BASE_URL (guard's routes hit its sidecar instead); the server never shells out to curl and
makes no LLM calls. They are cheap, synchronous, and deterministic.
This is the single-seam principle made concrete: the board's HTTP API is the one write path.
The web UI is the human face of that API; the board MCP is its agent twin. Both write through
the same routes — which is why every agent write carries actor attribution (below) so the board's
append-only activity[] log can record human vs agent vs system.
Reminders and priorities ride the board MCP
Reminders (REM-<n>) and priorities (PRI-<n>) are board-native sub-resources on
/api/reminders and /api/priorities — no new server, port, or .mcp.json change. The
board server's surface grew; the topology did not.
2. Embedded Agent SDK session per call — vault¶
vault is the odd one out and is documented in depth on The vault agent. Where
its siblings are stateless fetch calls, each ingest / query tool call spawns a full,
short-lived Claude Agent SDK session (@anthropic-ai/claude-agent-sdk) scoped to the vault
filesystem. Two consequences the caller must plan for: it needs an ANTHROPIC_API_KEY in its
environment, and each call takes seconds to minutes and costs tokens — so you batch
material into one ingest, never loop.
Because the vault MCP is itself bridged at vault:8005, a naïve inner session could re-mount this
server and recurse. The session options are written to make that impossible (mcpServers: {} +
strictMcpConfig: true, the re-entrant tools in disallowedTools, settingSources: ["project"]
anchored to the vault cwd) — see The vault agent.
3. Local-store reader that owns a watermark — openwhispr¶
openwhispr reads the external OpenWhispr desktop app's store
directly — its SQLite DB (transcriptions.db, read-only via the sqlite3 CLI so it is safe
while the app is running under WAL) and the sibling audio/ dir of .webm recordings. OpenWhispr
has no native "mark read", so this server owns the idempotency primitive: a single
{ id, created } watermark file. list_transcripts returns only notes sorting strictly after the
watermark; mark_processed(id) advances it. That is the whole reason this server exists — it is
what makes the voice loop list → route → mark re-runnable at-least-once without re-emitting a
note (see the per-channel-watermark discipline in the spec).
4. External two-process bridge — whatsapp¶
whatsapp wraps a separate upstream repo (whatsapp-mcp,
a sibling checkout) and is two processes:
- a Go whatsmeow HTTP sidecar (
whatsapp-bridge) that talks to WhatsApp Web, run as a launchd sidecar on:8010(pinned off upstream's:8080, commonly taken; not in.mcp.json); and - a Python stdio MCP, bridged to the clients on
:8006.
The Python MCP reads messages.db directly for all reads, so a triage sweep works even if the
Go sidecar is momentarily down; only tool calls that send need the sidecar's REST API. The Go
bridge's bearer token is resolved at runtime from the checkout (store/.bridge-token), so the
secret never lands in the installed plist — a token rotation needs no re-render.
The sidecar pattern¶
Three capabilities are too heavy or too foreign to live inside a Node MCP, so they run as HTTP sidecars beside or behind the MCP that fronts them:
| Sidecar | Port | Why it is a separate process |
|---|---|---|
| search | :8008 |
A Python embeddings service; the board calls it for semantic ranking |
| guard | :8009 |
A Python FastAPI service hosting the prompt-injection classifier (a HuggingFace model) |
| whatsapp-go | :8010 |
A Go whatsmeow daemon — a different runtime entirely from the Python MCP |
The split keeps the MCP layer thin and lets each sidecar pick its own runtime, warm a model once (rather than per request), and be restarted independently under launchd. The two security-relevant sidecars embody a deliberate duality:
- the guard is a security control and fails CLOSED — an
unreachable
:8009makesscan_emailreturn anUNAVAILABLE → UNTRUSTEDverdict (as non-error text, so the agent reads the action rather than retrying); - search is an availability accelerator and fails OPEN — an
unreachable
:8008silently degrades to keyword ranking with no error.
Same architecture, opposite failure posture, chosen on purpose. Both deep dives live on their own pages; this page only summarizes them.
The two wiring paths¶
Every server is a stdio MCP, but the two clients consume it differently — and that difference drives the most subtle contract in this layer.
flowchart LR
subgraph Cowork["Claude Cowork Desktop"]
direction TB
CW["claude_desktop_config.json<br/>(direct stdio command)"]
end
subgraph Code["Claude Code"]
direction TB
CC[".mcp.json<br/>(type: http, url)"]
end
CW -- "spawns long-lived child" --> S1["MCP server<br/>(stdio)"]
CC -- "HTTP :8001-:8006" --> BR["launchd LaunchAgent<br/>supergateway bridge"]
BR -- "spawns fresh child<br/>per request" --> S2["MCP server<br/>(stdio)"]
S1 -. "idle-exit OFF" .-> S1
S2 -. "COS_MCP_IDLE_EXIT_MS" .-> S2
- Cowork Desktop spawns each server as a direct stdio
commandentry inclaude_desktop_config.json. It does not accept HTTPurlentries — validated; this is why Cowork cannot point at the bridges. - Claude Code reaches each server over
.mcp.json(HTTPurlentries pointing atlocalhost:8001-8006), which a supergateway + launchd HTTP bridge fronts. Each bridge is a LaunchAgent that runs supergateway wrapping the stdio server as Streamable HTTP.
ensure-bridges.sh runs at board boot to nudge the bridges awake, but it only WARNs — it
never blocks the board on a bridge being down.
The child-lifecycle / idle-exit contract¶
The shared packages/mcp-kit start() helper serves both consumers, and they have opposite
lifecycles. This is genuinely subtle and the source of a classic failure mode:
- Direct stdio (Cowork / by hand) — one long-lived child for the whole session; the client
never respawns it. So the idle-exit timer is OFF by default. If it were on, an idle-exit here
would manifest as the dreaded "server transport closed unexpectedly → MCP not responding". A
real disconnect closes stdin, which reaps the child cleanly (the stdin
end/closebackstop is always armed). - supergateway bridge (Claude Code) — in stateless StreamableHttp mode supergateway spawns a
fresh child per request and frees it only on child-exit/protocol-error, never on normal
completion. So idle children leak. Each bridge LaunchAgent therefore opts in with
COS_MCP_IDLE_EXIT_MS=300000, reaping idle children after 5 minutes (a request in flight disarms the timer; supergateway respawns on the next call).
The rule that falls out: COS_MCP_IDLE_EXIT_MS lives only in the bridge plists, never in the
Cowork config. Turning it on globally would break Cowork; leaving it off everywhere would leak
under Code. The regression is guarded by tests/mcp-kit-idle.mjs.
Same env var, two clients, opposite correct values
The idle-exit timer is a per-transport decision, not a global one. If you add a new core MCP,
wire it through mcp-kit start() and set COS_MCP_IDLE_EXIT_MS in the bridge LaunchAgent
only — do not set it in claude_desktop_config.json.
Actor attribution — the agent twin¶
Because the board MCP and the board UI write through the same HTTP API, the board needs to know
who made each change. Every write the MCP issues (anything that isn't a GET) is attributed
to the agent two ways, for robustness against either route convention: an x-actor: agent
header and { "actor": "agent" } folded into the JSON body (added even to bodyless writes like
a soft-delete DELETE). UI writes are attributed to human. The board stamps the resulting
activity[] entry accordingly — you never pass actor yourself.
The read side is the companion contract: get_case surfaces the case's human-actor activity as
a leading "Manual actions by the user (human)" block, and these are authoritative. An agent
must never undo a human manual edit — not a lane move, a task completion, a field edit, or an
archive/restore. When an inference conflicts with a manual action, the agent adds a note or routes
it through the propose → approve → commit queue (propose / approve / reject) instead of
overwriting. This human-in-the-loop discipline is what the triage skills lean on — see
Triage skills and the Platform API for the board contract
those writes hit.
Where to go next¶
- Platform API — the board HTTP API these wrappers sit on.
- The vault agent — the embedded-Agent-SDK server in depth.
- Triage skills — the operator skills that drive these tools.
- Prompt-injection guard · Semantic search — the two sidecars, deep.
- Case hierarchy — the three-tier tree the board tools maintain.