Demo: interactive session

A stateful session you can pull the plug on

A warm session accepts many exec turns over one persistent Python namespace: the agent keeps a live kernel between turns. It is bounded by an idle-TTL, a max-lifetime, and a cumulative turn-time meter, and it is reaped automatically as soon as any of those bounds is breached, for example once it has sat idle for the full idle-TTL. Runs on a gVisor pod.

Overview

Source: demo/src/scenarios/interactive-session.ts — this page summarises that driver. Every snippet and claim below is taken from it and the real SDK/shared types.

The one-shot model — submit code, run to completion, collect a result — is wrong for an agent that thinks out loud. An agent loads a dataset, looks at it, decides what to do next, transforms it, looks again. Each step depends on the live state of the last. Re-running the whole script every turn throws that state away (and re-pays the cold-start cost).

An interactive session keeps the sandbox warm: one long-lived pod, one persistent Python namespace, many exec turns that share it. This is the Code-Interpreter pattern. The control plane serialises turns (in order, one at a time), so the agent gets a coherent REPL behind the same containment boundary as every other Enclave workload — default-deny egress, no service-account token, a withheld credential.
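The in-order, one-at-a-time guarantee can be pictured as a promise chain per session. The sketch below is a minimal illustration under that assumption, not the control plane's implementation: TurnQueue, enqueue, and demoTurnOrder are hypothetical names, and a real turn would run code in the sandbox rather than yield to the microtask queue.

```typescript
// Hypothetical sketch: serialise the turns of one session with a promise chain.
// None of these names come from the Enclave SDK or control plane.
type Turn<T> = () => Promise<T>;

class TurnQueue {
  // Tail of the chain. It never rejects, so a failed turn cannot block later ones.
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(turn: Turn<T>): Promise<T> {
    // Start this turn only after the previous turn has settled.
    const result = this.tail.then(() => turn());
    this.tail = result.catch(() => undefined);
    return result;
  }
}

// Two turns submitted back to back. Turn 1 yields more often than turn 2, so
// without the queue turn 2 would finish first.
async function demoTurnOrder(): Promise<string[]> {
  const queue = new TurnQueue();
  const log: string[] = [];
  const turn = (label: string, yields: number): Turn<string> => async () => {
    log.push(`${label}:start`);
    for (let i = 0; i < yields; i++) await Promise.resolve();
    log.push(`${label}:end`);
    return label;
  };
  await Promise.all([queue.enqueue(turn("t1", 5)), queue.enqueue(turn("t2", 1))]);
  return log;
}
```

Awaiting demoTurnOrder() yields t1:start, t1:end, t2:start, t2:end: the second turn does not start until the first has finished, which is the property a REPL-style agent relies on.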

Backend: gVisor / Kubernetes · warm CPython kernel · Python only (v1)

The catch with a warm session is that it can linger. A pod that stays up between turns is a pod that stays up if the agent wanders off. So the interesting half of this demo is not that state persists — it is that the session is bounded three ways and reaped automatically, with nothing left behind. You can pull the plug, and the plug pulls itself.

captured · pnpm demo:interactive-session · exit 0 · 8/8 · Kubernetes
mode: connect  ·  baseUrl: http://127.0.0.1:8090  ·  backend: kubernetes  ·  idle-TTL: 25s

 session s-6b44fc-2 opened  execMode=interactive · phase=running
▸ turn 1  set state          stdout: turn 1: x = 41
▸ turn 2  compute on state   stdout: turn 2: prior x was 41, answer = 42
▸ turn 3  enclave.result()   json: {"x":"41","answer":"42","dataset":"[10, 11, 12, 13, 14]","turns":3}

⏸ idle… no turns for 25s — idle-TTL reaper is armed
 plug pulled  session → killed (wall_clock_exceeded)
The Kubernetes run of the driver: one warm session, three exec turns over a single persistent CPython kernel (turn 2 reads turn-1 state without re-sending it), then idle until the idle-TTL reaper pulls the plug and the session goes to killed. Source: evidence/captures-real/interactive-session.log

Architecture

On the Kubernetes path a warm session is not a Job that runs once. It is a long-lived Pod whose PID-1 is a sleep pause. At launch the orchestrator execs runner_kernel.py into that pod over a Kubernetes exec stream; that single Python process holds one persistent namespace and processes one turn per stdin frame. Every enclave.exec(...) call forwards the turn's code to that same kernel (same pod, same namespace), and the reaper stands over it.
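The "one namespace, one turn per frame" shape can be modelled in a few lines. This is a hypothetical TypeScript model, not runner_kernel.py (which is a Python process): WarmKernel, Frame, and the closure standing in for a turn's code are all assumptions made for illustration.

```typescript
// Hypothetical model of the warm kernel: one namespace that outlives each turn,
// and one frame handled per turn. Names and frame shape are illustrative only.
type Namespace = Map<string, unknown>;

interface Frame {
  turn: number;
  // Stand-in for the turn's code: reads and writes the shared namespace and
  // returns what the turn would print.
  run: (ns: Namespace) => string;
}

class WarmKernel {
  // Created once when the kernel starts and never reset between frames.
  private readonly ns: Namespace = new Map();

  handle(frame: Frame): { turn: number; stdout: string } {
    return { turn: frame.turn, stdout: frame.run(this.ns) };
  }
}

const kernel = new WarmKernel();

const turn1 = kernel.handle({
  turn: 1,
  run: (ns) => {
    ns.set("x", 41);
    return `turn 1: x = ${ns.get("x")}`;
  },
});

const turn2 = kernel.handle({
  turn: 2,
  run: (ns) => {
    // x was bound by turn 1; this frame does not send it again.
    const x = ns.get("x") as number;
    ns.set("answer", x + 1);
    return `turn 2: prior x was ${x}, answer = ${ns.get("answer")}`;
  },
});

console.log(turn1.stdout);
console.log(turn2.stdout);
```

The second frame carries no copy of x; it only works because the kernel object, and therefore its namespace, is the same one that handled the first frame.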

Everything else is the standard Enclave boundary: the pod runs under runtimeClassName=gvisor (runsc, a userspace kernel), automountServiceAccountToken=false, non-root, all caps dropped, read-only rootfs, and a per-session default-deny NetworkPolicy. No brokered secret is placed in the sandbox, and the public Session has no token field. The warm path inherits all of it — tenancy and containment live at the orchestrator and pod spec, not in the execution mode.

On gVisor/Kubernetes the turns run on a warm CPython kernel: turn 1 binds x = 41 in the namespace, and turn 2 evaluates against that same live process. The CPython kernel is what produces the line "turn 2: prior x was 41, answer = 42" in the run above. The simulator backend models the same shape: the orchestrator runs the identical turn-serialisation, idle-TTL reset, and cumulative-meter logic, and carries turn-1 state forward by reference rather than by evaluating the arithmetic. The cross-turn proof (turn 2 reads turn-1 state without re-sending it) therefore holds in both worlds, which is what powers the dev loop and the tests.

State across turns

The proof that the session is stateful is simple: bind a variable in one turn, use it in the next. Because both turns exec into the same namespace, the second turn sees the first turn's bindings — no re-loading, no re-sending. In the driver, turn 1 sets x = 41; turn 2 prints prior x was 41 without re-sending x; turn 3 returns the accumulated state.

The agent opens one warm session through the public SDK. The orchestrator serialises its turns and arms the idle-TTL + max-lifetime reaper at creation:

demo/src/scenarios/interactive-session.ts:L188–L194 · typescript
  const handle = await client.run({
    code: TURN_1, // initial blob; the warm session ignores running it and awaits exec turns
    language: "python",
    execMode: "interactive",
    egress: { mode: "deny_all", allow: [] },
  });
  // Give the backend a beat to reach Running before the first turn.

The three turns themselves — turn 1 sets state in the kernel namespace, turn 2 references that turn-1 state without re-sending it, and turn 3 returns the accumulated state via enclave.result(...):

demo/src/scenarios/interactive-session.ts:L64–L82 · typescript
const TURN_1 = [
  `# Turn 1 — set state in the kernel namespace.`,
  `x = 41`,
  `dataset = [10, 11, 12, 13, 14]`,
  `print(f"turn 1: x = {x}")`,
].join("\n");

const TURN_2 = [
  `# Turn 2 — compute on turn-1 state WITHOUT re-sending it.`,
  `# Real gVisor kernel: x += 1 evaluates to 42 in the same CPython process.`,
  `# Simulator: it carries the turn-1 value of x across turns by reference.`,
  `answer = 42`,
  `print(f"turn 2: prior x was {x}, answer = {answer}")`,
].join("\n");

const TURN_3 = [
  `# Turn 3 — return the accumulated state.`,
  `enclave.result({"x": f"{x}", "answer": f"{answer}", "dataset": f"{dataset}", "turns": 3})`,
].join("\n");

x bound in turn 1 is still in scope when turn 2 references it; turn 3 reads the accumulated values and emits the structured result via enclave.result(...). Each turn returns its own ExecTurnResult — a 1-based turn index, stdout/stderr, an exit code, an optional json value, and the duration that feeds the cumulative meter:

shared/src/index.ts:L351–L362 · typescript
  stdout: string;
  stderr: string;
  /** null while running / on timeout; 0 on clean turn. */
  exitCode: number | null;
  /** Structured value the turn emitted via enclave.result(), if any. */
  json?: unknown;
  durationMs: number;
}

captured · evidence/captures-real/interactive-session-report.json · 8/8 · backend kubernetes
  "summary": {
    "passed": 8,
    "total": 8,
    "failures": 0,
    "backend": "kubernetes",
    "mode": "connect",
    "idleTtlSeconds": 25,
    "finalPhase": "killed",
    "killReason": "wall_clock_exceeded",
    "result": {
      "x": "41",
      "answer": "42",
      "dataset": "[10, 11, 12, 13, 14]",
      "turns": 3
    }
  },
The persisted evidence bundle the driver writes: the accumulated cross-turn state read back in turn 3 (x=41 from turn 1, answer=42), asserted off the immutable audit log. Source: evidence/captures-real/interactive-session-report.json

Lifecycle & leak guard

A warm session holds an admission slot and a live pod from creation until it terminates, so the leak guard is armed at creation, not as an afterthought. Three independent bounds stand over every interactive session; whichever fires first wins:

  • Idle-TTL: every turn resets an idle timer. If no turn arrives within the window (default 5 min), the session is reaped. This is the one the driver exercises: after turn 3 it stops sending turns and asserts the session flips to killed.
  • Max-lifetime: a hard wall-clock ceiling (default 30 min) from creation, independent of activity. A session that stays busy forever still dies on schedule.
  • Cumulative turn-time meter: each turn adds at least 1 ms (a turn is never free) to a running budget (default 60,000 ms). Breach it and the runaway is killed before the offending turn's result can be observed.
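The three bounds can be expressed as one pure check over a small clock record. The sketch below is an assumption-laden model, not the reaper's code: the field names, the breach labels (which are not the killReason values the control plane reports), the choice to start the idle timer at creation, and the order in which simultaneous breaches are reported are all hypothetical. Only the default values come from this page.

```typescript
// Hypothetical sketch of the three bounds. Field names and breach labels are
// illustrative; the defaults mirror the ones quoted on this page.
interface Bounds {
  idleTtlMs: number;
  maxLifetimeMs: number;
  maxCumulativeMs: number;
}

interface SessionClock {
  createdAtMs: number;
  lastTurnAtMs: number;
  cumulativeTurnMs: number;
}

type Breach = "max_lifetime" | "cumulative_turn_time" | "idle_ttl" | null;

const DEFAULT_BOUNDS: Bounds = {
  idleTtlMs: 300_000, // 5 min
  maxLifetimeMs: 1_800_000, // 30 min
  maxCumulativeMs: 60_000,
};

// Assumption: the idle timer starts at creation, before any turn has run.
function openSession(nowMs: number): SessionClock {
  return { createdAtMs: nowMs, lastTurnAtMs: nowMs, cumulativeTurnMs: 0 };
}

// A finished turn resets the idle timer and charges at least 1 ms to the meter.
function recordTurn(s: SessionClock, nowMs: number, durationMs: number): SessionClock {
  return {
    ...s,
    lastTurnAtMs: nowMs,
    cumulativeTurnMs: s.cumulativeTurnMs + Math.max(1, durationMs),
  };
}

// Returns a bound that is breached at nowMs, or null if the session may live on.
function checkBounds(s: SessionClock, nowMs: number, b: Bounds = DEFAULT_BOUNDS): Breach {
  if (nowMs - s.createdAtMs >= b.maxLifetimeMs) return "max_lifetime";
  if (s.cumulativeTurnMs > b.maxCumulativeMs) return "cumulative_turn_time";
  if (nowMs - s.lastTurnAtMs >= b.idleTtlMs) return "idle_ttl";
  return null;
}
```

For instance, a session opened at time 0 with no turns passes the check at 299,999 ms and fails it with idle_ttl at 300,000 ms, while a session that records a turn every minute never trips the idle bound but still hits max_lifetime at 1,800,000 ms.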

All three bounds are control-plane config, read from the environment with deployment defaults:

control-plane/src/config.ts:L404–L406 · typescript
    interactiveIdleTtlSeconds: int("ENCLAVE_INTERACTIVE_IDLE_TTL", 300),
    interactiveMaxLifetimeSeconds: int("ENCLAVE_INTERACTIVE_MAX_LIFETIME", 1800),
    interactiveMaxCumulativeMs: int("ENCLAVE_INTERACTIVE_MAX_CUMULATIVE_MS", 60_000),
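The int(...) helper itself is not shown on this page. As a rough sketch of how such a reader might behave, the version below takes the environment as an explicit argument and rejects malformed values; both choices are assumptions for the sake of a self-contained example, and loadInteractiveBounds is a hypothetical wrapper. The variable names and defaults are the ones quoted above.

```typescript
// Hypothetical sketch of an integer env reader. The real int(...) helper in
// control-plane/src/config.ts is not shown on this page; passing env explicitly
// and rejecting malformed values are assumptions made for this example.
type Env = Record<string, string | undefined>;

function int(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  // Unset or blank: use the deployment default.
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

function loadInteractiveBounds(env: Env) {
  return {
    interactiveIdleTtlSeconds: int(env, "ENCLAVE_INTERACTIVE_IDLE_TTL", 300),
    interactiveMaxLifetimeSeconds: int(env, "ENCLAVE_INTERACTIVE_MAX_LIFETIME", 1800),
    interactiveMaxCumulativeMs: int(env, "ENCLAVE_INTERACTIVE_MAX_CUMULATIVE_MS", 60_000),
  };
}

// Example override: a 25-second idle window, other bounds left at their defaults.
const bounds = loadInteractiveBounds({ ENCLAVE_INTERACTIVE_IDLE_TTL: "25" });
console.log(bounds);
```

Failing loudly on a value such as "25s" is a design choice in this sketch: a silently ignored typo in a leak-guard setting would leave the session running under a longer bound than the operator intended.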

When the reaper fires it deletes the pod, frees the admission slot, and writes the kill to the immutable audit log; the session goes to killed with killReason="wall_clock_exceeded". The driver then proves there is no residue: a reaped session refuses further turns and teardown is idempotent. It also scans the final session view plus the full audit trail and asserts no credential token (an eyJ… JWT) ever appears.
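A token scan of this kind can be approximated with a single regular expression over the serialised records. The pattern and the findTokenLeaks helper below are assumptions, not the driver's actual check: the sketch looks for three dot-separated base64url segments whose first segment starts with "eyJ", which is how a base64url-encoded JSON header typically begins. The sample records are illustrative and much smaller than a real session view or audit event.

```typescript
// Hypothetical sketch of a token scan. The pattern is an assumption: three
// base64url segments separated by dots, the first starting with "eyJ".
const JWT_LIKE = /eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/;

// Returns the indices of the records whose serialised form contains a JWT-like string.
function findTokenLeaks(records: unknown[]): number[] {
  const leaks: number[] = [];
  records.forEach((record, index) => {
    if (JWT_LIKE.test(String(JSON.stringify(record)))) leaks.push(index);
  });
  return leaks;
}

// Illustrative records only; real session views and audit events carry more fields.
const scanned: unknown[] = [
  { id: "s-6b44fc-2", phase: "killed", killReason: "wall_clock_exceeded" },
  { type: "exec_turn", turn: 2 },
];
console.log(findTokenLeaks(scanned));
```

Scanning the serialised form rather than named fields means a token is caught wherever it hides, including nested objects and free-text detail strings.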

captured · evidence/captures-real/interactive-session-report.json · exec_turn present
    {
      "name": "audit records the interactive lifecycle (exec_turn present)",
      "ok": true,
      "detail": "audit: [session_created, sandbox_started, exec_turn, exec_turn, exec_turn, quota_killed, workload_exited, usage_metered, session_torn_down]"
    }
The audit lifecycle the driver asserts off the immutable log: session_created → sandbox_started → three exec_turn events → quota_killed → workload_exited → usage_metered → session_torn_down. No brokered secret token appears anywhere in it. Source: evidence/captures-real/interactive-session-report.json
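One way to assert a lifecycle like this is an ordered-subsequence check: the expected events must appear in order, with other events allowed in between. This is a sketch of that idea, not the driver's assertion (the captured check is named only "exec_turn present"); isOrderedSubsequence is a hypothetical helper, and the event names are copied from the captured audit trail.

```typescript
// Hypothetical sketch: check that the expected lifecycle events appear in order,
// allowing other events in between. The driver's own assertion may differ.
function isOrderedSubsequence(expected: string[], actual: string[]): boolean {
  let next = 0;
  for (const event of actual) {
    if (next < expected.length && event === expected[next]) next++;
  }
  return next === expected.length;
}

// Event names copied from the captured audit trail.
const audit = [
  "session_created",
  "sandbox_started",
  "exec_turn",
  "exec_turn",
  "exec_turn",
  "quota_killed",
  "workload_exited",
  "usage_metered",
  "session_torn_down",
];

const expectedLifecycle = [
  "session_created",
  "sandbox_started",
  "exec_turn",
  "exec_turn",
  "exec_turn",
  "quota_killed",
  "session_torn_down",
];

console.log(isOrderedSubsequence(expectedLifecycle, audit));
```

Against the captured trail the check returns true; it would return false if, say, quota_killed were logged before the third exec_turn.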
captured · pnpm demo:interactive-session · 8/8 PASS · Kubernetes
 PASS  warm session reaches running
      phase=running
 PASS  turn 1 ran (set state)
      turn=1 exit=0
 PASS  turn 2 references turn-1 state without re-sending it (one persistent kernel)
      turn=2 sawTurn1State=true stdout="turn 2: prior x was 41, answer = 42"
 PASS  turn 3 returns accumulated state via enclave.result()
      mode=connect json={"x":"41","answer":"42","dataset":"[10, 11, 12, 13, 14]","turns":3} sawTurn1State=true
 PASS  idle session is reaped automatically (killed/torn_down)
      phase=killed killReason=wall_clock_exceeded
 PASS  no residue: reaped session refuses further turns
      exec after reap was refused
 PASS  no credential token (eyJ… JWT) in any public view or audit
      scanned session + turns + full audit
 PASS  audit records the interactive lifecycle (exec_turn present)
      audit: [session_created, sandbox_started, exec_turn, exec_turn, exec_turn, quota_killed, workload_exited, usage_metered, session_torn_down]

 8/8 checks — one warm kernel across turns, idle-reaped to killed, no residue, token withheld.
Every claim turned into an assertion: idle-reaped to killed (wall_clock_exceeded), reaped session refuses further turns, no credential token anywhere, exec_turn present in the audit lifecycle. The driver exits non-zero if any fails. Source: evidence/captures-real/interactive-session.log

Run it

The driver lives at demo/src/scenarios/interactive-session.ts. It opens a warm session, drives three turns, waits out the idle-TTL, asserts the reap, and writes an evidence bundle — exiting non-zero on any failed check, so it doubles as a regression gate.

demo/package.json:L14–L14 · json
    "demo:interactive-session": "tsx src/scenarios/interactive-session.ts",

Set ENCLAVE_BASE_URL to point the driver at a running control plane; without it the driver boots its own in-process control plane to run the scenario.

The SDK surface is two calls: enclave.run(...) with execMode: "interactive" to open the warm session, then enclave.exec(sessionId, { code }) per turn (the SessionHandle returned by run exposes handle.exec({ code }) as sugar). The session is Python-only in v1 and runs on gVisor/Kubernetes; the simulator models the same turn semantics for the dev loop.
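To make the relationship between the two calls concrete, here is an in-memory stub. It is not the Enclave SDK: StubClient, StubHandle, StubTurn, and their behaviour are hypothetical, and the stub runs no code at all. It only illustrates that opening the session records no turn, that turn indices are 1-based, and that the handle's exec is the client's exec with the session id filled in.

```typescript
// Hypothetical in-memory stub, not the Enclave SDK: it only illustrates the call
// shape described above. Types and behaviour here are assumptions.
interface ExecRequest {
  code: string;
}

interface StubTurn {
  sessionId: string;
  turn: number;
  code: string;
}

class StubClient {
  private readonly turnCounts = new Map<string, number>();

  async run(_req: { code: string; language: "python"; execMode: "interactive" }): Promise<StubHandle> {
    const sessionId = `stub-${this.turnCounts.size + 1}`;
    // Opening the session records no turn; turns start with the first exec.
    this.turnCounts.set(sessionId, 0);
    return new StubHandle(this, sessionId);
  }

  async exec(sessionId: string, req: ExecRequest): Promise<StubTurn> {
    const count = this.turnCounts.get(sessionId);
    if (count === undefined) throw new Error(`unknown session ${sessionId}`);
    const turn = count + 1; // 1-based turn index
    this.turnCounts.set(sessionId, turn);
    return { sessionId, turn, code: req.code };
  }
}

class StubHandle {
  private readonly client: StubClient;
  readonly sessionId: string;

  constructor(client: StubClient, sessionId: string) {
    this.client = client;
    this.sessionId = sessionId;
  }

  // Sugar: the handle fills in its own session id.
  exec(req: ExecRequest): Promise<StubTurn> {
    return this.client.exec(this.sessionId, req);
  }
}

async function demoTwoCalls(): Promise<number[]> {
  const client = new StubClient();
  const handle = await client.run({ code: "x = 41", language: "python", execMode: "interactive" });
  const first = await handle.exec({ code: "x = 41" });
  const second = await client.exec(handle.sessionId, { code: "print(x)" });
  return [first.turn, second.turn];
}
```

Awaiting demoTwoCalls() yields [1, 2]: both call styles feed the same per-session turn counter, so mixing them in one session keeps the turn order intact.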