Claude Code’s query-engine.ts vs Codex CLI’s codex-rs: Comparing Agent Loop Architectures

Every agentic coding tool reduces to the same fundamental pattern: send a prompt, stream a response, execute tool calls, feed results back, repeat. But how that loop is implemented — the language, the state model, the concurrency strategy — profoundly affects reliability, extensibility, and performance. This article dissects the agent loop cores of two leading tools: Claude Code’s QueryEngine (TypeScript) and Codex CLI’s codex-rs (Rust), drawing on the source code exposed in March 2026 and the open-source Codex repository.

The Agent Loop Pattern

Both tools implement the same conceptual loop:

flowchart TD
    A[User Input] --> B[Build Prompt + Context]
    B --> C[Stream Model Response]
    C --> D{Tool Calls?}
    D -->|Yes| E[Execute Tools]
    E --> F[Inject Results]
    F --> C
    D -->|No| G[Yield Final Response]
    G --> H[Run Stop Hooks]

The differences lie entirely in how each system implements these stages and what escape hatches they provide when things go wrong.
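The loop in the diagram can be sketched in a few lines. This is a minimal illustration, not either tool's code: `ChatMessage`, `Model`, `Tools`, and `runAgentLoop` are hypothetical names, and the model is a plain synchronous function so the control flow stays visible.

```typescript
// Hypothetical types for illustration; neither tool exposes exactly these.
type ToolCall = { name: string; input: string };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type ChatMessage = { role: "user" | "assistant" | "tool"; content: string };
type Model = (messages: ChatMessage[]) => ModelReply;
type Tools = Record<string, (input: string) => string>;

function runAgentLoop(model: Model, tools: Tools, userInput: string, maxTurns = 10): ChatMessage[] {
  const messages: ChatMessage[] = [{ role: "user", content: userInput }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = model(messages);
    messages.push({ role: "assistant", content: reply.text });
    // No tool calls means the reply is final.
    if (reply.toolCalls.length === 0) return messages;
    // Otherwise execute each tool and inject its result for the next turn.
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const result = tool ? tool(call.input) : `unknown tool: ${call.name}`;
      messages.push({ role: "tool", content: result });
    }
  }
  throw new Error("agent loop exceeded maxTurns");
}
```

The `maxTurns` guard is an assumption added for safety; the production loops described below rely on richer exit conditions.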

Claude Code: The TypeScript Monolith

QueryEngine — The 46K-Line Brain

QueryEngine.ts is the single file that owns Claude Code’s entire query lifecycle ¹. One instance is created per conversation, holding mutable state including message history (mutableMessages), cumulative token usage (totalUsage), permission denial records, and an AbortController for cancellation ².

The entry point is submitMessage(), an async generator that:

Builds the system prompt via fetchSystemPromptParts()
Writes the transcript to disk before the API call — enabling crash recovery ²
Delegates to query(), which in turn calls queryLoop()
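The three steps above can be sketched as a chain of async generators. Only the method names `submitMessage`, `query`, and `queryLoop` and the field `mutableMessages` come from the description; the class name, the `EngineEvent` type, the in-memory `persisted` array (a stand-in for writing the transcript to disk), and all bodies are assumptions.

```typescript
// Sketch only: names mirror the article, bodies are assumptions.
type EngineEvent = { kind: "text"; value: string } | { kind: "done" };

class QueryEngineSketch {
  readonly mutableMessages: string[] = [];
  // Stand-in for transcript writes to disk.
  readonly persisted: string[][] = [];

  async *submitMessage(prompt: string): AsyncGenerator<EngineEvent> {
    this.mutableMessages.push(prompt);
    // Persist before the model call so a crash mid-request loses nothing.
    this.persisted.push([...this.mutableMessages]);
    // Delegate: every event from the inner generators streams to the caller.
    yield* this.query();
  }

  private async *query(): AsyncGenerator<EngineEvent> {
    yield* this.queryLoop();
  }

  private async *queryLoop(): AsyncGenerator<EngineEvent> {
    // A real loop would call the model here; this one echoes the last message.
    const last = this.mutableMessages[this.mutableMessages.length - 1] ?? "";
    yield { kind: "text", value: `echo: ${last}` };
    yield { kind: "done" };
  }
}
```

A caller consumes the stream with `for await (const event of engine.submitMessage("hello"))`, which is what lets each layer surface partial output without buffering.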

The queryLoop() State Machine

queryLoop() is a while(true) loop carrying a typed State object between iterations ². This state tracks messages, tool contexts, compaction metadata, and a transition reason code explaining why the loop continued. Seven continuation reasons are defined ²:

| Transition | Trigger |
| --- | --- |
| `max_output_tokens_escalate` | Hit 8K cap; retry at 64K |
| `max_output_tokens_recovery` | Output limit hit; inject nudge (max 3×) |
| `reactive_compact_retry` | Prompt overflow; compact and retry |
| `collapse_drain_retry` | Context collapse stages exhausted |
| `stop_hook_blocking` | Hook error injected as user message |
| `token_budget_continuation` | Budget check; nudge and continue |
| (implicit tool loop) | Model returned `tool_use` blocks |

The loop exits by yielding a Terminal state with reasons like completed, model_error, prompt_too_long, or aborted_streaming ².
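A loop with typed continuation and terminal reasons might look like the sketch below. The reason strings are the ones listed above; `"tool_use"` as a label for the implicit tool loop, the `StepResult` and `LoopState` shapes, and the `maxIterations` guard are assumptions made for illustration.

```typescript
// Reason names come from the article; "tool_use" stands in for the implicit tool loop.
type ContinueReason =
  | "max_output_tokens_escalate"
  | "max_output_tokens_recovery"
  | "reactive_compact_retry"
  | "collapse_drain_retry"
  | "stop_hook_blocking"
  | "token_budget_continuation"
  | "tool_use";

type TerminalReason = "completed" | "model_error" | "prompt_too_long" | "aborted_streaming";

type StepResult =
  | { type: "continue"; reason: ContinueReason }
  | { type: "terminal"; reason: TerminalReason };

interface LoopState {
  iteration: number;
  // Audit trail of why the loop kept going.
  transitions: ContinueReason[];
}

function queryLoopSketch(
  step: (state: LoopState) => StepResult,
  maxIterations = 50,
): { reason: TerminalReason; state: LoopState } {
  const state: LoopState = { iteration: 0, transitions: [] };
  while (state.iteration < maxIterations) {
    const result = step(state);
    if (result.type === "terminal") return { reason: result.reason, state };
    state.transitions.push(result.reason);
    state.iteration += 1;
  }
  throw new Error(`loop did not terminate within ${maxIterations} iterations`);
}
```

Because every iteration records a reason, the `transitions` array doubles as a debugging log for runs that take unexpectedly many model calls.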

Streaming and Retry

queryModel() streams responses via Anthropic’s SDK, reconstructing AssistantMessage objects per content block ². When a tool_use block arrives, the engine sets needsFollowUp = true and continues the loop.

Retry logic uses exponential backoff with jitter, up to 10 attempts ². Notable behaviours:

529 (overloaded): Foreground retries; background bails immediately
Opus fallback: After 3 consecutive 529s, throws FallbackTriggeredError
OAuth 401: Forces token refresh before retry
Persistent mode: Retries indefinitely with a 30-minute backoff cap, yielding heartbeats every 30 seconds
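Exponential backoff with jitter is simple to sketch. The version below uses the "full jitter" variant (a random delay between zero and the exponential ceiling), which is an assumption: the source does not say which jitter strategy Claude Code uses, and `withRetrySketch`, `backoffDelay`, and the option names are hypothetical. The status-specific behaviours in the list above are left out.

```typescript
// Sketch of exponential backoff with full jitter; constants are illustrative.
interface RetryOptions {
  maxAttempts: number; // the article cites up to 10
  baseDelayMs: number;
  maxDelayMs: number;
  random: () => number; // injectable for tests, e.g. Math.random
  sleep: (ms: number) => Promise<void>;
}

function backoffDelay(
  attempt: number,
  opts: Pick<RetryOptions, "baseDelayMs" | "maxDelayMs" | "random">,
): number {
  // The ceiling doubles per attempt and is capped at maxDelayMs.
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt));
  return Math.floor(opts.random() * ceiling);
}

async function withRetrySketch<T>(
  fn: (attempt: number) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      // Sleep only if another attempt will follow.
      if (attempt < opts.maxAttempts - 1) {
        await opts.sleep(backoffDelay(attempt, opts));
      }
    }
  }
  throw lastError;
}
```

Injecting `random` and `sleep` keeps the function deterministic under test; production code would pass `Math.random` and a timer-based sleep.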

Tool Registry (tools.ts)

tools.ts serves as the central dispatch map ³. Every tool registers into this map, so the loop stays identical as tools are added. The permission system classifies operations into four modes: default, auto, bypass, and the confusingly named yolo. Contrary to intuition, yolo uses a SAFE_YOLO_ALLOWLISTED_TOOLS set that restricts execution to read-only operations such as FILE_READ, GREP, and GLOB ⁴.
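A dispatch map with an allowlist check can be sketched as follows. The mode names and the three allowlisted tool names come from the text; `toolRegistry`, `registerTool`, `dispatchTool`, and the handler signature are hypothetical. How the default, auto, and bypass modes actually gate execution is not described in the source, so this sketch simply lets them through.

```typescript
type ToolHandler = (input: string) => string;
type PermissionMode = "default" | "auto" | "bypass" | "yolo";

// Central dispatch map: adding a tool never changes the loop that calls dispatchTool.
const toolRegistry = new Map<string, ToolHandler>();

function registerTool(toolName: string, handler: ToolHandler): void {
  toolRegistry.set(toolName, handler);
}

// Mirrors the article's description of SAFE_YOLO_ALLOWLISTED_TOOLS; the exact contents are assumed.
const SAFE_ALLOWLIST = new Set<string>(["FILE_READ", "GREP", "GLOB"]);

function dispatchTool(toolName: string, input: string, mode: PermissionMode): string {
  const handler = toolRegistry.get(toolName);
  if (!handler) throw new Error(`unknown tool: ${toolName}`);
  // In this sketch only yolo is restricted; other modes pass through unchanged.
  if (mode === "yolo" && !SAFE_ALLOWLIST.has(toolName)) {
    throw new Error(`tool ${toolName} is not allowlisted in yolo mode`);
  }
  return handler(input);
}
```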

Context Management Pipeline

Claude Code employs a five-stage context reduction pipeline, applied in priority order before each API call ²:

applyToolResultBudget() — caps result byte size, externalises large outputs
snipCompact — removes provably unneeded middle messages
microcompact — merges tool-result/user pairs; cached variant uses API-side cache edits
contextCollapse — read-time projection over full history
autoCompact — full summarisation when approaching the blocking limit

A circuit breaker exits with PROMPT_TOO_LONG_ERROR_MESSAGE if context exceeds limits after all compaction attempts ².
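A staged reduction pipeline with a final circuit breaker might be structured like this. The two stages shown are simplified stand-ins rather than the five real ones, and the early exit once the context fits, the four-characters-per-token estimate, and all names are assumptions.

```typescript
type Msg = { role: "user" | "assistant" | "tool"; content: string };
type Reducer = (messages: Msg[]) => Msg[];

// Crude estimate (about 4 characters per token); a real system would use a tokenizer.
function estimateTokens(messages: Msg[]): number {
  return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

// Stand-in for a result budget: truncate oversized tool output.
const capToolResults: Reducer = (ms) =>
  ms.map((m) => (m.role === "tool" && m.content.length > 40 ? { ...m, content: m.content.slice(0, 40) } : m));

// Stand-in for aggressive compaction: keep only the first and last message.
const keepEnds: Reducer = (ms) => ms.filter((_, i) => i === 0 || i === ms.length - 1);

function reduceContext(messages: Msg[], stages: Reducer[], limit: number): Msg[] {
  let current = messages;
  for (const stage of stages) {
    // Cheapest stages run first; stop as soon as the context fits.
    if (estimateTokens(current) <= limit) break;
    current = stage(current);
  }
  // Circuit breaker: give up rather than send an oversized prompt.
  if (estimateTokens(current) > limit) {
    throw new Error("prompt too long after all compaction stages");
  }
  return current;
}
```

Ordering stages from least to most lossy means a conversation only pays for summarisation when cheaper trimming has already failed.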

Codex CLI: The Rust State Machine

codex.rs — Submission/Event Architecture

Where Claude Code uses generators, Codex CLI uses Rust’s async runtime. The central Codex struct processes operations through a submission_loop running as a dedicated tokio task ⁵. Operations arrive as typed Op messages, and results flow back as EventMsg values — a clean message-passing architecture that enables non-blocking interaction across TUI, CLI, and IDE modes ⁵.

flowchart LR
    subgraph Input Sources
        TUI[TUI]
        CLI[CLI exec]
        IDE[IDE Extension]
    end
    subgraph codex-rs Core
        SL[submission_loop]
        CM[ContextManager]
        MC[ModelClient]
        TO[ToolOrchestrator]
    end
    TUI -->|Op| SL
    CLI -->|Op| SL
    IDE -->|Op| SL
    SL --> CM
    CM --> MC
    MC -->|SSE Stream| SL
    SL -->|tool_call| TO
    TO -->|result| SL
    SL -->|EventMsg| TUI
    SL -->|EventMsg| CLI
    SL -->|EventMsg| IDE
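The message-passing shape can be illustrated in TypeScript, kept in the same language as the other sketches in this article even though codex-rs is Rust. The variant names below are illustrative and should not be read as the actual `Op` and `EventMsg` definitions; the real loop runs as an async tokio task rather than a synchronous function.

```typescript
// Illustrative variants; codex-rs defines its own Op and EventMsg enums.
type Op =
  | { type: "user_input"; text: string }
  | { type: "interrupt" }
  | { type: "shutdown" };

type EventMsg =
  | { type: "agent_message"; text: string }
  | { type: "turn_aborted" }
  | { type: "shutdown_complete" };

// Processes operations strictly in order and reports results only through emit,
// so any frontend (TUI, CLI, IDE) can drive the same core.
function submissionLoop(ops: readonly Op[], emit: (event: EventMsg) => void): void {
  for (const op of ops) {
    switch (op.type) {
      case "user_input":
        emit({ type: "agent_message", text: `received: ${op.text}` });
        break;
      case "interrupt":
        emit({ type: "turn_aborted" });
        break;
      case "shutdown":
        emit({ type: "shutdown_complete" });
        return; // operations queued after shutdown are ignored
    }
  }
}
```

Because callers only see typed messages, swapping the frontend requires no change to the loop itself.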

CodexThread — Turn Orchestration

Each conversation turn is managed by a CodexThread ⁵:

User input is wrapped in a Submission
ContextManager builds the prompt, incorporating message history and token tracking
ModelClient streams responses via SSE
Tool calls interrupt the stream for execution
Results feed back for continuation

State is maintained across turns via ContextManager (message history, token counts, cached prompt prefixes) and Session (turn-level response assembly) ⁵. Session rollouts are persisted to compressed JSONL in ~/.codex/sessions/, enabling replay and forking for multi-agent workflows ⁵.

Tool Orchestration

The ToolOrchestrator acts as middleware between tool invocations and execution runtimes, enforcing three policies ⁶:

Approval policy — routes to human review or guardian sub-agent
Sandbox policy — selects appropriate OS-level confinement
Command safety classification — categorises operations by risk
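The three policies can be pictured as a middleware function that every tool call passes through. This is a sketch under assumptions: `orchestrate`, `OrchestratorPolicies`, the two-level `Risk` type, and the order in which the policies run are all hypothetical, and the sketch is written in TypeScript rather than Rust for consistency with the other examples.

```typescript
type Risk = "safe" | "risky";
type Decision =
  | { allowed: true; sandbox: string }
  | { allowed: false; reason: string };

interface OrchestratorPolicies {
  classify: (command: string) => Risk; // command safety classification
  approve: (command: string, risk: Risk) => boolean; // approval policy (human or guardian)
  selectSandbox: (risk: Risk) => string; // sandbox policy
}

function orchestrate(command: string, policies: OrchestratorPolicies): Decision {
  const risk = policies.classify(command);
  if (!policies.approve(command, risk)) {
    return { allowed: false, reason: `approval denied for ${risk} command` };
  }
  return { allowed: true, sandbox: policies.selectSandbox(risk) };
}
```

Keeping the policies as injected functions means a test can swap a human reviewer for a stub without touching the execution path.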

Built-in tools include ⁵:

| Tool | Purpose |
| --- | --- |
| `apply_patch` | Structured file editing via unified diff |
| `js_repl` | Persistent Node.js kernel |
| `tool_search` | BM25-powered semantic search |
| `spawn_agent` / `wait_agent` | Hierarchical multi-agent spawning |
| `request_permissions` | Mid-turn sandbox escalation |
| `web_search` | Live web search |
| MCP tools | External servers via `McpConnectionManager` |

Sandbox Architecture

Codex CLI’s sandbox translates high-level SandboxPolicy enums into OS-native primitives ⁷:

macOS: Seatbelt profiles
Linux: Landlock rules (with Bubblewrap as default since v0.115.0 ⁸)
Windows: Restricted tokens with OS-level egress rules ⁹

Three policy levels are available: DangerFullAccess, ReadOnly, and WorkspaceWrite (write access limited to the current working directory, with .git/ protected) ⁷.
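The translation from policy to platform mechanism can be sketched as a pure function. The three policy names and the `.git/` protection come from the text; `planSandbox`, the `SandboxPlan` shape, and the mechanism labels are hypothetical and only describe the plan, they do not configure any real sandbox.

```typescript
type SandboxPolicy = "DangerFullAccess" | "ReadOnly" | "WorkspaceWrite";
type Platform = "macos" | "linux" | "windows";

interface SandboxPlan {
  mechanism: string; // descriptive label only
  unrestricted: boolean;
  writableRoots: string[];
  protectedPaths: string[];
}

// Labels follow the per-OS list above; they are not real configuration values.
const MECHANISMS: Record<Platform, string> = {
  macos: "seatbelt",
  linux: "bubblewrap",
  windows: "restricted-token",
};

function planSandbox(policy: SandboxPolicy, platform: Platform, cwd: string): SandboxPlan {
  switch (policy) {
    case "DangerFullAccess":
      return { mechanism: "none", unrestricted: true, writableRoots: [], protectedPaths: [] };
    case "ReadOnly":
      return { mechanism: MECHANISMS[platform], unrestricted: false, writableRoots: [], protectedPaths: [] };
    case "WorkspaceWrite":
      // Writes are confined to the working directory, with .git kept read-only.
      return {
        mechanism: MECHANISMS[platform],
        unrestricted: false,
        writableRoots: [cwd],
        protectedPaths: [`${cwd}/.git`],
      };
  }
}
```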

Architectural Comparison

graph TB
    subgraph "Claude Code (TypeScript)"
        QE[QueryEngine.ts<br/>~46K LOC monolith]
        QE --> Tools1[tools.ts registry]
        QE --> CM1[5-stage compaction]
        QE --> Retry1[withRetry — 10 attempts]
        QE --> Stream1[SSE → async generators]
    end
    subgraph "Codex CLI (Rust)"
        CRS[codex.rs<br/>~90 crate workspace]
        CRS --> Tools2[ToolOrchestrator]
        CRS --> CM2[ContextManager]
        CRS --> Retry2[Model-level retry]
        CRS --> Stream2[SSE → tokio channels]
    end

Language and Concurrency

Claude Code’s async generator chain (submitMessage → query → queryLoop → queryModel → withRetry) yields streaming tokens at every level ². This is elegant but inherently single-threaded — JavaScript’s event loop handles I/O concurrency, but CPU-bound work (context compaction, token counting) blocks the loop.

Codex CLI’s tokio runtime provides true multi-threaded async ⁵. The submission_loop processes operations sequentially for state consistency, but tool execution, sandbox setup, and MCP communication run on separate tasks. Its equivalent of Claude Code’s streaming tool execution starts tools in parallel while the model stream is still open, reducing multi-tool latency ⁵.
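Starting tools while the stream is still open comes down to launching each call without awaiting it and collecting the results at the end. The sketch below shows the idea in TypeScript with hypothetical names (`consumeStream`, `StreamChunk`, `AsyncTool`); it is not taken from either codebase, and in a single-threaded runtime it overlaps I/O waits rather than CPU work.

```typescript
type StreamChunk =
  | { kind: "text"; value: string }
  | { kind: "tool_call"; name: string; input: string };
type AsyncTool = (input: string) => Promise<string>;

async function consumeStream(
  stream: AsyncIterable<StreamChunk>,
  tools: Record<string, AsyncTool>,
): Promise<{ text: string; results: string[] }> {
  let text = "";
  const pending: Promise<string>[] = [];
  for await (const chunk of stream) {
    if (chunk.kind === "text") {
      text += chunk.value;
      continue;
    }
    const tool = tools[chunk.name];
    // Start the tool immediately without awaiting; the stream keeps flowing.
    // Failures are converted to result strings so no rejection goes unhandled.
    pending.push(
      tool
        ? tool(chunk.input).catch((err: unknown) => `error: ${String(err)}`)
        : Promise.resolve(`unknown tool: ${chunk.name}`),
    );
  }
  // Results come back in call order even if tools finish out of order.
  const results = await Promise.all(pending);
  return { text, results };
}
```

With several slow tools in one response, total latency approaches that of the slowest tool instead of the sum of all of them.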

State Management

Claude Code carries a mutable State object through a while(true) loop with seven typed continuation reasons ². This is explicit but creates a single, complex state machine.

Codex CLI uses message-passing between Op submissions and EventMsg responses ⁵. State lives in dedicated managers (ContextManager, Session, RolloutRecorder), each with a focused responsibility. This separation makes it easier to test individual components and reason about state transitions.

Crash Recovery

Both systems prioritise resumability. Claude Code writes transcripts before API calls ². Codex CLI persists compressed JSONL rollouts that can be replayed to reconstruct any session state ⁵. Codex’s approach additionally supports forking — branching a sub-agent from a primary session’s state ⁵.
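Replay and forking over a JSONL log can be sketched in a few functions. This is an in-memory illustration with hypothetical names (`RolloutItem`, `appendRollout`, `replayRollout`, `forkRollout`); it skips compression and file I/O, and the record shape is assumed rather than taken from Codex's rollout format.

```typescript
type RolloutItem = { turn: number; role: "user" | "assistant"; content: string };

// Append one JSON object per line; the log is never rewritten in place.
function appendRollout(log: string, item: RolloutItem): string {
  return log + JSON.stringify(item) + "\n";
}

// Rebuild session state by parsing every line up to the requested turn.
function replayRollout(log: string, upToTurn: number = Infinity): RolloutItem[] {
  return log
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RolloutItem)
    .filter((item) => item.turn <= upToTurn);
}

// A fork is a new log seeded with the parent's history up to a chosen turn.
function forkRollout(log: string, atTurn: number): string {
  return replayRollout(log, atTurn).reduce(appendRollout, "");
}
```

Because the parent log is only read, a forked sub-agent can diverge freely without affecting the session it branched from.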

Extensibility

Claude Code’s monolithic QueryEngine means every new feature (streaming tool execution, context collapse, auto-dream) adds complexity to the same file ¹. The generator chain makes it difficult to insert middleware without modifying the core.

Codex CLI’s crate workspace (~90 member crates ⁵) provides natural module boundaries. New tools register via ToolOrchestrator without touching the core loop. MCP servers connect through McpConnectionManager with stdio or HTTP transport ⁵, and the approval system can delegate to guardian sub-agents — a pattern not available in Claude Code’s permission model.

Scale and Performance

Claude Code’s TypeScript codebase reportedly spans ~390K lines of code ¹. The Codex CLI repository is 95.6% Rust, and the compiled binary is self-contained with fast startup times ⁸. For CI and batch workflows (codex exec), the Rust binary’s cold-start advantage is material because there is no Node.js runtime to initialise.

What Practitioners Can Learn

The comparison reveals three design principles for agent loop architecture:

Explicit continuation reasons beat implicit loops. Claude Code’s typed transition reasons (six named codes plus the implicit tool loop) document every reason the loop continues. This is worth adopting even in simpler agents: when debugging why an agent made 47 API calls, a typed reason code is invaluable.
Separate state from orchestration. Codex CLI’s split between ContextManager, ToolOrchestrator, and Session makes each concern testable in isolation. Claude Code’s single QueryEngine trades this for co-located logic, which is faster to prototype but harder to maintain at scale.
Transcript-first design enables recovery. Claude Code writes its transcript before each API call, and Codex CLI persists rollouts that can be replayed. For any production agent, the pattern of persisting intent before execution should be non-negotiable.

Neither tool’s “secret sauce” is the agent loop in isolation. It is the co-optimisation between model and harness: how the system prompt is structured, which tools are exposed, how context is managed, and how failures are recovered. Terminal-Bench results indicate that the same model can score 16 percentage points higher in one harness than another ¹⁰, which suggests that the harness as a whole, not the model alone, is the competitive differentiator.
