# Agents SDK TypeScript Goes Sandbox-Native: Building Codex-Powered Agents with the Open-Source Harness
On 6 May 2026, OpenAI released v0.9.1 of the Agents SDK for TypeScript — the first version to ship sandbox agents and the open-source harness that underpins Codex itself1. The Python SDK gained these features in mid-April2; TypeScript developers now have parity. This matters for Codex CLI users because it means you can embed the same harness–compute separation that powers codex exec into your own TypeScript applications, CI pipelines, and multi-agent orchestrators — and swap the underlying compute provider at runtime without touching agent logic.
This article covers the architecture, walks through practical integration patterns with Codex CLI, and examines provider trade-offs for production deployments.
## Why the Harness–Compute Split Matters
Codex CLI has always separated its reasoning loop from command execution: the agent decides what to do, and the sandbox decides where and how it runs3. The Agents SDK formalises this boundary into two clean layers:
| Layer | Responsibilities | Where it runs |
|---|---|---|
| Harness | Model calls, tool dispatch, handoffs, approvals, memory, tracing, recovery | Trusted infrastructure (your server, CI runner, Lambda) |
| Compute | File I/O, shell commands, package installation, port exposure | Sandboxed environment (local Unix, Docker, E2B, Modal…) |
```mermaid
graph TB
    subgraph Harness["Harness (Trusted)"]
        A[Agent Loop] --> B[Tool Router]
        A --> C[Memory]
        A --> D[Tracing]
        B --> E[Approval Gate]
    end
    subgraph Compute["Compute (Sandboxed)"]
        F[Shell]
        G[Filesystem]
        H[apply_patch]
        I[Skills]
    end
    E -->|"Approved"| F
    E -->|"Approved"| G
    E -->|"Approved"| H
    B -->|"Skill request"| I
```
This split means credentials, API keys, and orchestration state never enter the sandbox. The sandbox sees only the files and environment variables you explicitly declare in the Manifest4.
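The approval gate in the diagram above is conceptually just a harness-side predicate that decides whether a proposed command may cross into the sandbox. The sketch below illustrates the idea only — the SDK routes approvals through its tool router, and the function name and patterns here are hypothetical:

```typescript
// Illustrative only: a harness-side approval gate. The real SDK's approval
// mechanism is configurable; this shows the shape of the decision, not its API.
const BLOCKED_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/, // recursive force-delete
  /curl[^|]*\|\s*sh/, // pipe-to-shell installs
];

function approveCommand(command: string): boolean {
  // Reject anything matching a known-dangerous pattern; everything else
  // is forwarded to the sandboxed compute layer.
  return !BLOCKED_PATTERNS.some((pattern) => pattern.test(command));
}
```

Because this check runs in the harness, it can consult secrets, policy services, or a human reviewer — none of which the sandbox ever sees.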
## Getting Started: A Minimal Sandbox Agent
Install the SDK and create a sandbox agent that reviews a workspace:
```bash
npm install @openai/agents@0.9.1
```
```typescript
import { run } from "@openai/agents";
import {
  SandboxAgent,
  Manifest,
  file,
  shell,
  filesystem,
} from "@openai/agents/sandbox";
import { UnixLocalSandboxClient } from "@openai/agents/sandbox/local";

const manifest = new Manifest({
  entries: {
    "AGENTS.md": file({
      content: `# Review Agent
## Rules
- Run tests before summarising.
- Never modify production files.`,
    }),
    "src/index.ts": file({ content: "export const greet = () => 'hello';" }),
    "src/index.test.ts": file({
      content: `import { greet } from './index';
test('greets', () => expect(greet()).toBe('hello'));`,
    }),
  },
});

const reviewer = new SandboxAgent({
  name: "Code Reviewer",
  model: "gpt-5.5",
  instructions:
    "Read AGENTS.md first. Run the test suite, then summarise results.",
  defaultManifest: manifest,
  capabilities: [shell(), filesystem()],
});

const result = await run(
  reviewer,
  "Review this workspace and report any issues.",
  { sandbox: { client: new UnixLocalSandboxClient() } },
);

console.log(result.finalOutput);
```
The `UnixLocalSandboxClient` runs commands on the host machine inside a restricted directory — fast for local development, unsuitable for untrusted code4.
## Capabilities: What a Sandbox Agent Can Do
Each capability grants the agent a set of tools. The defaults mirror what Codex CLI provides in a standard session4:
| Capability | Tools added | Requires |
|---|---|---|
| `shell()` | `run_command`, interactive input | — |
| `filesystem()` | `apply_patch`, `view_image`, file read/write | — |
| `skills()` | Skill discovery and materialisation | — |
| `memory()` | Cross-run lesson retention | `shell()` |
| `compaction()` | Context trimming for long sessions | — |
The `apply_patch` tool uses the same v4a diff format as Codex CLI5, ensuring patches generated in one environment apply cleanly in the other.
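For orientation, a v4a patch is a plain-text envelope around one or more file operations. The hunk below is illustrative — it updates the `src/index.ts` file from the earlier manifest example:

```text
*** Begin Patch
*** Update File: src/index.ts
@@
-export const greet = () => 'hello';
+export const greet = (name = 'world') => `hello, ${name}`;
*** End Patch
```

The same envelope supports `*** Add File:` and `*** Delete File:` operations, which is what makes patches portable between Codex CLI sessions and SDK agents.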
## Loading Skills from Git
Skills — the same SKILL.md format used by Codex CLI6 — can be loaded from local directories or Git repositories:
```typescript
import { skills, gitRepo } from "@openai/agents/sandbox";

const agent = new SandboxAgent({
  name: "Skilled Agent",
  model: "gpt-5.5",
  instructions: "Use available skills when appropriate.",
  capabilities: [
    shell(),
    filesystem(),
    skills({
      from: gitRepo({
        repo: "your-org/codex-skills",
        ref: "main",
      }),
    }),
  ],
});
```
This means teams that maintain skill libraries for Codex CLI can reuse them directly in TypeScript agents without duplication6.
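A minimal skill in such a repository might look like the following. The frontmatter fields shown (`name`, `description`) are the commonly documented core of the SKILL.md format; treat the exact schema and the repository contents as assumptions for illustration:

```markdown
---
name: changelog-writer
description: Drafts a changelog entry from the git log since the last release tag.
---

# Changelog Writer

1. Run `git describe --tags --abbrev=0` to find the last release tag.
2. Summarise commits since that tag into Added / Changed / Fixed sections.
3. Write the result to CHANGELOG.md without touching other files.
```

Because the body is plain markdown instructions, the same file works whether the consumer is Codex CLI or a TypeScript `SandboxAgent`.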
## Nine Sandbox Providers at Runtime
The provider is a runtime argument — change it without modifying agent logic4:
| Provider | Client class | Best for |
|---|---|---|
| Unix-local | `UnixLocalSandboxClient` | Fast local iteration |
| Docker | `DockerSandboxClient` | Local container isolation |
| Blaxel | `BlaxelSandboxClient` | Managed multi-tenant |
| Cloudflare | `CloudflareSandboxClient` | Edge compute |
| Daytona | `DaytonaSandboxClient` | Multi-language devbox |
| E2B | `E2BSandboxClient` | Managed preview URLs |
| Modal | `ModalSandboxClient` | Serverless GPU access |
| Runloop | `RunloopSandboxClient` | Devbox with tunnels |
| Vercel | `VercelSandboxClient` | Deployment integration |
Switching from local development to production is a one-line change:
```typescript
import { DockerSandboxClient } from "@openai/agents/sandbox/local";

const result = await run(agent, "Analyse this codebase.", {
  sandbox: {
    client: new DockerSandboxClient({
      image: "node:22-bookworm-slim",
    }),
  },
});
```
For CI pipelines, Docker or E2B provides repeatable isolation without host contamination4.
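In a pipeline, the provider choice can hang off a single environment variable so the same agent code runs unchanged in every environment. The mapping below is illustrative glue — `SANDBOX_TARGET` is an assumed variable name, and the actual client classes are instantiated at the call site from the table above:

```typescript
// Map a CI environment variable to a sandbox target. Illustrative only;
// the concrete SDK client is constructed from this value at the call site.
type SandboxTarget = "unix-local" | "docker" | "e2b";

function resolveSandboxTarget(envValue: string | undefined): SandboxTarget {
  switch (envValue) {
    case "docker":
      return "docker"; // CI: repeatable container isolation
    case "e2b":
      return "e2b"; // production: managed isolation, preview URLs
    default:
      return "unix-local"; // local dev: fastest iteration
  }
}
```

At the call site you would pass `process.env.SANDBOX_TARGET` and switch on the result to construct the matching client — keeping agent logic identical everywhere.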
## Session State and Resume

Sandbox sessions support persistence and resumption — the same pattern Codex CLI uses for `codex resume`7:
```typescript
// First run
const session = await client.create({ manifest });
const firstResult = await run(agent, "Set up the project.", {
  maxTurns: 20,
  sandbox: { session },
});

// Serialise state
const frozen = await client.serializeSessionState?.(session.state);
await session.close?.();

// Later: resume from frozen state
const resumed = await client.resume(
  await client.deserializeSessionState(frozen),
);
const secondResult = await run(agent, "Now add integration tests.", {
  maxTurns: 20,
  sandbox: { session: resumed },
});
```
This enables long-running workflows that span CI stages, human review gates, or time-bounded compute budgets4.
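Spanning CI stages means the frozen state must survive as an artifact between jobs. A minimal sketch, assuming the frozen value is JSON-serialisable (the SDK does not necessarily guarantee this — check your provider's serialisation contract):

```typescript
import { readFileSync, writeFileSync } from "node:fs";

// Persist a frozen sandbox session state between CI stages as a file
// artifact. Treating the state as JSON-serialisable is an assumption
// made for this sketch.
function saveFrozenState(path: string, frozen: unknown): void {
  writeFileSync(path, JSON.stringify(frozen), "utf8");
}

function loadFrozenState(path: string): unknown {
  return JSON.parse(readFileSync(path, "utf8"));
}
```

Stage one would call `saveFrozenState` after `serializeSessionState`, upload the file as a pipeline artifact, and stage two would download it and feed `loadFrozenState`'s result to `deserializeSessionState`.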
## Sandbox Memory: Cross-Run Learning

The `memory()` capability persists lessons across runs without polluting conversational context4. Memory files follow a defined layout:
```text
workspace/
  memories/
    memory_summary.md       # Distilled guidance
    MEMORY.md               # Active memory document
    raw_memories.md         # Unprocessed entries
    raw_memories/<run>.md   # Per-run raw memories
    rollout_summaries/      # Run-level summaries
```
Three memory modes control behaviour:
- Default — read existing memories, generate new entries
- Read-only — consume prior lessons without writing
- Generate-only — produce memories without reading
This mirrors Codex CLI’s native memories system8, and memories written by Codex sessions can be read by TypeScript agents if they share the same workspace directory.
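When sharing a workspace across surfaces, the layout above translates into fixed paths relative to the workspace root. A small helper — illustrative, not an SDK API — makes those paths explicit for tooling that inspects or syncs memories:

```typescript
// Resolve the memory files for a shared workspace, following the layout
// documented above. Illustrative helper, not part of the SDK.
function memoryPaths(workspaceRoot: string) {
  const base = `${workspaceRoot}/memories`;
  return {
    summary: `${base}/memory_summary.md`, // distilled guidance
    active: `${base}/MEMORY.md`, // active memory document
    raw: `${base}/raw_memories.md`, // unprocessed entries
    rawDir: `${base}/raw_memories`, // per-run raw memory files
    rolloutSummaries: `${base}/rollout_summaries`, // run-level summaries
  };
}
```

Pointing both a Codex CLI session and a TypeScript agent at the same `workspaceRoot` is what makes the cross-language memory sharing described above work.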
## Codex CLI Integration Patterns

### Pattern 1: Codex CLI as MCP Server, TypeScript Agent as Orchestrator
Run `codex mcp-server` and connect a TypeScript `SandboxAgent` to it via the MCP transport9. The orchestrator handles routing and approvals; Codex handles code execution:
```typescript
import { MCPServerStdio } from "@openai/agents/mcp";

const codexMcp = new MCPServerStdio({
  name: "codex",
  command: "codex",
  args: ["mcp-server", "--path", "/workspace/project"],
});

const orchestrator = new SandboxAgent({
  name: "Orchestrator",
  model: "gpt-5.5",
  mcpServers: [codexMcp],
  instructions: "Delegate coding tasks to the Codex MCP server.",
  capabilities: [shell(), filesystem()],
});
```
### Pattern 2: Shared Manifest for Codex CLI and TypeScript Agents
Define workspace contents once in a Manifest, use it across both surfaces:
```typescript
const sharedManifest = new Manifest({
  entries: {
    repo: gitRepo({ repo: "your-org/your-project", ref: "main" }),
    "AGENTS.md": file({ content: agentsContent }),
  },
});

// TypeScript agent uses the manifest directly
const result = await run(agent, task, {
  sandbox: { client, manifest: sharedManifest },
});

// Codex CLI uses the same repo + AGENTS.md via normal git clone
// $ codex exec "Run the test suite" --path /workspace/your-project
```
### Pattern 3: Handoff from Non-Sandbox to Sandbox Agent
A lightweight orchestration agent without sandbox capabilities delegates workspace-heavy tasks to a sandbox agent4:
```typescript
import { Agent } from "@openai/agents";

const triage = new Agent({
  name: "Triage",
  model: "gpt-5.5",
  instructions: "Classify incoming requests. Hand off coding tasks.",
  handoffs: [sandboxCodingAgent],
});
```
The triage agent handles natural-language routing; the sandbox agent handles file manipulation. This separates concerns cleanly and keeps the triage agent’s context lean.
## What This Means for Codex CLI Users

The TypeScript SDK release closes a significant gap. Prior to v0.9.1, TypeScript developers who wanted sandbox-powered agents had three options: shell out to `codex exec`, use the Python SDK, or build their own harness10. Now they can build natively in TypeScript with the same primitives:
- Portable skills — SKILL.md files work across Codex CLI, the Python SDK, and now the TypeScript SDK6.
- Portable manifests — workspace definitions are provider-agnostic and SDK-agnostic.
- Shared memory — agents across languages can read the same memory files if they share a workspace.
- Consistent patching — `apply_patch` uses the same v4a format everywhere5.
The beta label is worth noting: API details, defaults, and supported capabilities may change1. Pin your SDK version in package.json and watch the changelog.
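Pinning here means an exact version rather than a caret range, so a `npm install` never silently pulls a breaking beta release:

```json
{
  "dependencies": {
    "@openai/agents": "0.9.1"
  }
}
```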
## Decision Framework: When to Use What
```mermaid
flowchart TD
    A[Need an agent with workspace?] -->|No| B[Use standard Agent class]
    A -->|Yes| C{Language?}
    C -->|TypeScript| D[SandboxAgent from @openai/agents/sandbox]
    C -->|Python| E[SandboxAgent from openai-agents]
    C -->|Shell script| F[codex exec]
    D --> G{Deployment?}
    E --> G
    G -->|Local dev| H[UnixLocalSandboxClient]
    G -->|CI pipeline| I[DockerSandboxClient]
    G -->|Production| J[E2B / Modal / Cloudflare]
    G -->|Enterprise| K[Daytona / Runloop]
```
## Limitations and Sharp Edges
- Beta status — sandbox agents in TypeScript are beta; expect breaking changes between minor versions1.
- No code mode yet — the TypeScript SDK does not yet support the Codex CLI-style `code_mode` that restricts the agent to code-only output. OpenAI lists this as planned1.
- No subagent primitive — TypeScript sandbox agents cannot natively spawn sub-sandbox-agents the way Codex CLI’s MultiAgentV2 does11. Use handoffs or tool-based composition instead.
- Provider parity varies — snapshot support, port exposure, and mount types differ across providers. Test your target provider early4.
- Memory requires persistence — cross-run memory only works if the workspace directory (or cloud mount) survives between runs4.
## Citations

1. OpenAI Agents SDK TypeScript documentation — v0.9.1 release, sandbox agents beta status, planned features.
2. OpenAI, “The next evolution of the Agents SDK” — April 2026 announcement of harness and sandbox for the Python SDK.
3. OpenAI, “Unlocking the Codex harness: how we built the App Server” — Codex app-server architecture and harness–compute separation.
4. OpenAI API docs, “Sandbox Agents” — Manifest format, capabilities, provider table, session state, memory, secret management.
5. OpenAI Codex CLI features — `apply_patch` v4a diff format, shared across Codex and the Agents SDK.
6. OpenAI, “Agent Skills” — SKILL.md format specification, progressive disclosure, skill loading from Git.
7. OpenAI Codex CLI features — `codex resume` session persistence.
8. OpenAI Codex Memories documentation — native memories pipeline, memory file layout.
9. OpenAI, “Use Codex with the Agents SDK” — running Codex as an MCP server for SDK orchestration.
10. OpenAI Developer Changelog, 6 May 2026 — “The updated Agents SDK became available in TypeScript, featuring sandbox agents and an open-source harness built in.”
11. OpenAI Codex CLI subagents documentation — MultiAgentV2 thread orchestration and depth handling.