
sdk backend

the sdk backend runs agent programs as managed child processes driven by their native app-server JSON-RPC protocol — no tmux session, no terminal multiplexer, no manual attachment required.

what is the sdk backend?

in kasmos, every agent instance is backed by an ExecutionSession. the default backend is ExecutionModeTmux; the sdk alternative is ExecutionModeSDK. both implement the same session.ExecutionSession interface, so the rest of the system — the TUI, the wave orchestrator, the instance lifecycle — is unaware of which backend is in use.

the sdk backend is selected by session.NewExecutionSession when the resolved execution mode is "sdk":

// session/execution.go
func NewExecutionSession(mode ExecutionMode, name, program string, skipPermissions bool) ExecutionSession {
	switch NormalizeExecutionMode(mode) {
	case ExecutionModeSDK:
		return sdk.New(name, program, skipPermissions)
	default:
		return newTmuxExecutionSession(name, program, skipPermissions)
	}
}

the concrete type is *sdk.Session from session/sdk/session.go. it wraps a Transport (one per supported program) and a Renderer that converts the agent's structured event stream into readable text.
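to make the wrapping concrete, here is a minimal sketch of a session that pairs one transport with one renderer. it is not the code in session/sdk/session.go: the Event struct, the single-method Transport interface, the Prompt method, and echoTransport are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical stand-in for one item of the agent's structured stream.
type Event struct {
	Kind string // "text" or "tool_call" in this sketch
	Text string
}

// Transport is a hypothetical, reduced view of a per-program transport.
type Transport interface {
	SendPrompt(prompt string) ([]Event, error)
}

// Renderer converts structured events into readable lines.
type Renderer struct {
	lines []string
}

// Render appends one readable line per event.
func (r *Renderer) Render(ev Event) {
	if ev.Kind == "tool_call" {
		r.lines = append(r.lines, "[tool] "+ev.Text)
		return
	}
	r.lines = append(r.lines, ev.Text)
}

// Content returns everything rendered so far.
func (r *Renderer) Content() string { return strings.Join(r.lines, "\n") }

// Session owns exactly one transport and one renderer.
type Session struct {
	transport Transport
	renderer  *Renderer
}

// NewSession wires a transport to a fresh renderer.
func NewSession(t Transport) *Session {
	return &Session{transport: t, renderer: &Renderer{}}
}

// Prompt sends text through the transport and renders every event that comes back.
func (s *Session) Prompt(p string) error {
	events, err := s.transport.SendPrompt(p)
	if err != nil {
		return err
	}
	for _, ev := range events {
		s.renderer.Render(ev)
	}
	return nil
}

// echoTransport is a fake transport used only to exercise the sketch.
type echoTransport struct{}

func (echoTransport) SendPrompt(prompt string) ([]Event, error) {
	return []Event{
		{Kind: "tool_call", Text: "read_file main.go"},
		{Kind: "text", Text: "done: " + prompt},
	}, nil
}

func main() {
	s := NewSession(echoTransport{})
	if err := s.Prompt("fix the build"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.renderer.Content())
}
```

because the session depends only on the Transport interface, adding a program in this sketch means adding one more implementation; the renderer and the callers stay unchanged.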

supported programs

the sdk backend requires a Transport implementation that speaks the agent's app-server protocol. only two programs have transports today:

| program | transport | protocol |
| --- | --- | --- |
| claude | ClaudeTransport | JSON-RPC 2.0 over stdio with --app-server |
| codex | CodexTransport | Codex App Server protocol over stdio |

sdk.SupportsProgram(program) reports whether a given program string maps to a supported transport. session.ResolveExecutionMode calls it automatically:

// session/execution.go
func ResolveExecutionMode(requested ExecutionMode, program string) ExecutionMode {
	normalised := NormalizeExecutionMode(requested)
	if normalised == ExecutionModeSDK && !sdk.SupportsProgram(program) {
		return ExecutionModeTmux
	}
	return normalised
}

if you set execution_mode = "sdk" for a program that has no transport (e.g. opencode, gemini, amp, or a custom binary), kasmos silently falls back to tmux mode. the instance will run correctly in tmux; the sdk mode is simply not applied.
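because the fallback is silent, it can help to see it in isolation. the sketch below re-implements the resolution in one file so it runs without kasmos. ResolveExecutionMode mirrors the snippet above, but the bodies of NormalizeExecutionMode and SupportsProgram are assumptions (the real ones live in session/execution.go and session/sdk/registry.go), and fellBack is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

type ExecutionMode string

const (
	ExecutionModeTmux ExecutionMode = "tmux"
	ExecutionModeSDK  ExecutionMode = "sdk"
)

// NormalizeExecutionMode is an assumed implementation: trim, lower-case,
// and treat anything other than "sdk" as tmux.
func NormalizeExecutionMode(m ExecutionMode) ExecutionMode {
	if strings.ToLower(strings.TrimSpace(string(m))) == string(ExecutionModeSDK) {
		return ExecutionModeSDK
	}
	return ExecutionModeTmux
}

// SupportsProgram is an assumed implementation: it matches on the base name
// of the first word of the program string.
func SupportsProgram(program string) bool {
	fields := strings.Fields(program)
	if len(fields) == 0 {
		return false
	}
	switch filepath.Base(fields[0]) {
	case "claude", "codex":
		return true
	}
	return false
}

// ResolveExecutionMode follows the logic shown in the documentation snippet.
func ResolveExecutionMode(requested ExecutionMode, program string) ExecutionMode {
	normalised := NormalizeExecutionMode(requested)
	if normalised == ExecutionModeSDK && !SupportsProgram(program) {
		return ExecutionModeTmux
	}
	return normalised
}

// fellBack is a hypothetical helper that surfaces the otherwise silent downgrade.
func fellBack(requested ExecutionMode, program string) bool {
	return NormalizeExecutionMode(requested) == ExecutionModeSDK &&
		ResolveExecutionMode(requested, program) == ExecutionModeTmux
}

func main() {
	for _, program := range []string{"claude", "codex", "opencode", "gemini"} {
		mode := ResolveExecutionMode("sdk", program)
		fmt.Printf("%s: resolved=%s fellBack=%t\n", program, mode, fellBack("sdk", program))
	}
}
```

a check like fellBack is one way to log a warning when a profile asks for sdk but the program has no transport.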

why it exists

the sdk backend was built for automated wave execution — running multiple coder agents concurrently without requiring a tmux server or any human at a terminal.

key advantages over tmux:

  • typed bidirectional control: kasmos drives the agent via its app-server protocol (query, interrupt, approvals, structured stream) instead of scraping a tmux terminal.
  • permission flow support: SendPermissionResponse forwards a structured choice to the agent without needing a live PTY.
  • structured output: the Renderer accumulates typed events (text deltas, tool calls, tool results, permission prompts) into a line buffer, exposing the same CapturePaneContent / CapturePaneContentWithOptions surface as tmux.
  • no tmux dependency: works anywhere exec.Cmd works, including CI environments.
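the structured-output point can be illustrated with a small line buffer. this is a sketch of the general technique, not the kasmos Renderer: the event kinds, the lineBuffer type, the bracketed prefixes, and the Capture method are hypothetical, and the real CapturePaneContent signatures are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

type EventKind int

const (
	TextDelta EventKind = iota
	ToolCall
	ToolResult
	PermissionPrompt
)

type Event struct {
	Kind EventKind
	Text string
}

// lineBuffer accumulates rendered output. partial holds streamed text that
// has not yet been terminated by a newline.
type lineBuffer struct {
	lines   []string
	partial string
}

// flush commits an unfinished line so a structured entry starts on its own line.
func (b *lineBuffer) flush() {
	if b.partial != "" {
		b.lines = append(b.lines, b.partial)
		b.partial = ""
	}
}

// Apply folds one typed event into the buffer.
func (b *lineBuffer) Apply(ev Event) {
	switch ev.Kind {
	case TextDelta:
		// Deltas can split text at any byte, so only complete lines are committed.
		parts := strings.Split(b.partial+ev.Text, "\n")
		b.lines = append(b.lines, parts[:len(parts)-1]...)
		b.partial = parts[len(parts)-1]
	case ToolCall:
		b.flush()
		b.lines = append(b.lines, "[tool] "+ev.Text)
	case ToolResult:
		b.flush()
		b.lines = append(b.lines, "[result] "+ev.Text)
	case PermissionPrompt:
		b.flush()
		b.lines = append(b.lines, "[permission] "+ev.Text)
	}
}

// Capture returns everything rendered so far, including an unfinished line.
func (b *lineBuffer) Capture() string {
	all := b.lines
	if b.partial != "" {
		all = append(append([]string{}, b.lines...), b.partial)
	}
	return strings.Join(all, "\n")
}

func main() {
	var buf lineBuffer
	for _, ev := range []Event{
		{TextDelta, "hel"},
		{TextDelta, "lo\nwor"},
		{ToolCall, "ls"},
		{TextDelta, "ok"},
	} {
		buf.Apply(ev)
	}
	fmt.Println(buf.Capture())
}
```

keeping the output as plain lines is what lets a caller written against a tmux pane capture read sdk output without knowing the difference.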

interactive operations

sdk sessions do not support terminal attachment. three methods return ErrInteractiveOnly:

  • Attach() — sdk sessions are in-process and cannot be attached to a terminal
  • DetachSafely() — nothing to detach from
  • SetDetachedSize(width, height int) — no PTY to resize

SendKeys and TapEnter are supported by sdk sessions — they buffer and submit text via the transport's SendPrompt method. see how it works for details.
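a caller that works with both backends can treat ErrInteractiveOnly as a signal to skip attachment rather than as a failure. the sketch below assumes ErrInteractiveOnly is a sentinel error and that SendKeys takes a string; the method signatures, the sdkSession type, and recordingTransport are hypothetical and only model the buffer-then-submit behaviour described above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrInteractiveOnly is modelled here as a sentinel error (an assumption).
var ErrInteractiveOnly = errors.New("operation requires an interactive session")

// promptSender is a hypothetical, reduced view of a transport.
type promptSender interface {
	SendPrompt(prompt string) error
}

// sdkSession buffers typed text until TapEnter submits it.
type sdkSession struct {
	transport promptSender
	pending   strings.Builder
}

// Attach is unsupported: there is no terminal to attach to.
func (s *sdkSession) Attach() error { return ErrInteractiveOnly }

// SendKeys only buffers; nothing reaches the agent yet.
func (s *sdkSession) SendKeys(keys string) error {
	s.pending.WriteString(keys)
	return nil
}

// TapEnter submits the buffered text as one prompt and clears the buffer.
func (s *sdkSession) TapEnter() error {
	prompt := s.pending.String()
	s.pending.Reset()
	if strings.TrimSpace(prompt) == "" {
		return nil // nothing to submit
	}
	return s.transport.SendPrompt(prompt)
}

// recordingTransport is a fake transport that remembers submitted prompts.
type recordingTransport struct{ prompts []string }

func (r *recordingTransport) SendPrompt(p string) error {
	r.prompts = append(r.prompts, p)
	return nil
}

func main() {
	tr := &recordingTransport{}
	s := &sdkSession{transport: tr}

	if err := s.Attach(); errors.Is(err, ErrInteractiveOnly) {
		fmt.Println("attach skipped:", err)
	}

	_ = s.SendKeys("run the ")
	_ = s.SendKeys("tests")
	if err := s.TapEnter(); err != nil {
		fmt.Println("submit failed:", err)
	}
	fmt.Println(tr.prompts)
}
```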

standalone sdk agents

in addition to automated wave execution, the sdk backend is available when you spawn a standalone (ad-hoc) agent from the TUI. the three spawn entry points each resolve the execution mode in slightly different ways.

N — new with prompt

N ("new with prompt") creates an instance whose execution mode is resolved from the default/chat/ad-hoc profile — whichever [agents.*] block is mapped to the empty-string agent type. if that profile sets execution_mode = "sdk" and the resolved program is claude or codex, the session uses the sdk backend. for any other program the mode silently falls back to tmux.

the resolution happens in standaloneExecutionMode (app/app_state.go):

func (m *home) standaloneExecutionMode(agentType, program string) session.ExecutionMode {
	profile := m.profileForAgent(agentType)
	normalised := config.NormalizeExecutionMode(profile.ExecutionMode)
	requested := session.ExecutionMode(normalised)
	return session.ResolveExecutionMode(requested, program)
}
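the lookup can be reproduced outside the TUI. this sketch keeps the two-step shape (read the profile for the agent type, then resolve against the program), but the Profile struct, the profiles map, and the string agent-type keys are hypothetical stand-ins for the [agents.*] configuration. treating a missing profile as tmux is also an assumption.

```go
package main

import "fmt"

// Profile is a hypothetical, reduced view of one [agents.*] block.
type Profile struct {
	ExecutionMode string // raw config value, e.g. "sdk" or ""
}

// profiles is keyed by agent type; the empty string is the default profile.
var profiles = map[string]Profile{
	"":      {ExecutionMode: "sdk"},
	"fixer": {ExecutionMode: "sdk"},
}

// supportsSDK stands in for sdk.SupportsProgram.
func supportsSDK(program string) bool {
	return program == "claude" || program == "codex"
}

// standaloneMode reads the profile for agentType, then resolves the requested
// mode against the program that will actually run.
func standaloneMode(agentType, program string) string {
	profile, ok := profiles[agentType]
	if !ok {
		return "tmux" // assumption: no profile means the default backend
	}
	if profile.ExecutionMode == "sdk" && supportsSDK(program) {
		return "sdk"
	}
	return "tmux"
}

func main() {
	fmt.Println(standaloneMode("", "claude"))        // default profile, supported program
	fmt.Println(standaloneMode("fixer", "opencode")) // profile asks for sdk, program has no transport
	fmt.Println(standaloneMode("master", "codex"))   // no profile configured in this sketch
}
```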

s — quick launch

s calls quickLaunchAgent, which always uses the fixer profile:

m.standaloneExecutionMode(session.AgentTypeFixer, fixerProgram)

set execution_mode = "sdk" in your [agents.fixer] block to make quick-launch sessions use the sdk backend when the configured program supports it.

S — spawn ad-hoc agent

S opens a multi-step spawn flow: first a harness picker (if multiple programs are configured), then an execution-mode picker for sdk-capable harnesses, then a name/branch/path form. the execution mode is resolved from the master profile via:

m.standaloneExecutionMode(session.AgentTypeMaster, program)

execution-mode picker rows by harness:

| harness | picker rows |
| --- | --- |
| codex | tmux · sdk · sdk-fast |
| claude | tmux · sdk |
| any other | picker is skipped (always tmux) |

sdk-fast is a codex-only option. it maps to ExecutionMode = sdk plus SDKSpeedTier = "fast" — the same fast lane as typing /fast inside a codex TUI pane. codex forwards serviceTier: "fast" on thread/start, which consumes 2× your usage budget in exchange for faster turn latency. claude has no equivalent tier and shows only tmux and sdk.

for unsupported harnesses (opencode, gemini, amp, custom binaries) session.ResolveExecutionMode always returns tmux regardless of what the profile says — see sdk.SupportsProgram in session/sdk/registry.go.
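the picker rules above reduce to a small mapping from (harness, choice) to spawn options. the sketch below is illustrative only: SpawnOptions, pickerRows, and optionsFor are hypothetical names, and the field names ExecutionMode and SDKSpeedTier are borrowed from the description above without claiming they match the real struct.

```go
package main

import "fmt"

// SpawnOptions is a hypothetical container for the picker's result.
type SpawnOptions struct {
	ExecutionMode string
	SDKSpeedTier  string
}

// pickerRows returns the rows offered for a harness; nil means the picker is skipped.
func pickerRows(harness string) []string {
	switch harness {
	case "codex":
		return []string{"tmux", "sdk", "sdk-fast"}
	case "claude":
		return []string{"tmux", "sdk"}
	default:
		return nil
	}
}

// optionsFor translates a picker choice into spawn options.
func optionsFor(harness, choice string) (SpawnOptions, error) {
	rows := pickerRows(harness)
	if rows == nil {
		// Unsupported harnesses never see the picker and always run in tmux.
		return SpawnOptions{ExecutionMode: "tmux"}, nil
	}
	for _, row := range rows {
		if row != choice {
			continue
		}
		if choice == "sdk-fast" {
			// sdk-fast is the sdk backend plus the fast speed tier.
			return SpawnOptions{ExecutionMode: "sdk", SDKSpeedTier: "fast"}, nil
		}
		return SpawnOptions{ExecutionMode: choice}, nil
	}
	return SpawnOptions{}, fmt.Errorf("%q is not offered for harness %q", choice, harness)
}

func main() {
	for _, c := range [][2]string{{"codex", "sdk-fast"}, {"claude", "sdk"}, {"gemini", "sdk"}, {"claude", "sdk-fast"}} {
		opts, err := optionsFor(c[0], c[1])
		fmt.Printf("%s/%s -> %+v err=%v\n", c[0], c[1], opts, err)
	}
}
```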

standalone vs daemon-managed sdk

standalone sdk agents run in-process inside the TUI. they are not registered with the daemon, so the web admin UI has no way to reach them. see managing instances for the full list of web-ui limitations.

when to choose sdk

| scenario | recommended mode |
| --- | --- |
| automated claude or codex coder in a wave | sdk |
| standalone quick-launch (s) with fixer profile sdk | sdk |
| ad-hoc spawn (S / N) with master/default profile sdk | sdk |
| ad-hoc codex spawn where faster turns outweigh 2× cost | sdk-fast (codex only, via S picker) |
| any other harness (opencode, gemini, amp, custom) | tmux (sdk silently falls back anyway) |
| local development with live terminal attachment | tmux |
| interactive kas monitor debugging | tmux |
| agent that needs PTY-based prompt detection | tmux |

next steps

  • how it works — transport lifecycle, session internals, env vars, renderer
  • configuration — setting execution_mode = "sdk" in config.toml
  • logs and output — structured event rendering and the capture range api
  • sdk vs tmux — full interface comparison