Codex CLI Web Search Configuration: Cached vs Live Modes, Domain Allow-Lists, and Prompt Injection Defence

Every coding agent eventually needs to look something up. A deprecated API flag, a new framework release, an unfamiliar error code — the model’s training data is a frozen snapshot, and your dependencies are not. Codex CLI ships a first-party web search tool that addresses this knowledge gap, but it introduces an attack surface that most developers never think about until something goes wrong¹.

This article covers the three web search modes, the new object-form configuration with domain allow-lists and context sizing, the security implications of each mode, and practical configuration patterns for solo developers, teams, and enterprise deployments.

Why Web Search Matters in an Agent Loop

When Codex encounters an unfamiliar library version or a build error it cannot explain from context alone, it can invoke its built-in web_search tool to retrieve documentation, Stack Overflow answers, or changelog entries¹. You will see web_search items in the transcript or in codex exec --json output whenever the agent looks something up¹.

The problem: every web search result is untrusted input. A live fetch can return a page containing hidden instructions — prompt injection — that cause the agent to execute unintended actions². Cached search mitigates this by serving results from an OpenAI-maintained index rather than fetching arbitrary live pages, but it trades freshness for safety¹.

The Three Modes

Codex CLI offers three web search modes, set via the top-level web_search key in config.toml³:

Mode	Behaviour	Default When
`cached`	Returns pre-indexed results from OpenAI’s web index. No live fetches.	Standard sandbox modes (`workspace-write`, `read-only`)
`live`	Fetches the most recent data from the web in real time.	`--yolo` / `danger-full-access` sandbox modes
`disabled`	Removes the web search tool entirely. The agent cannot search.	Explicitly set by user or admin

# ~/.codex/config.toml — set the default mode
web_search = "cached"

The critical detail: when you switch to --yolo or any full-access sandbox setting, Codex silently upgrades web_search from cached to live¹³. If you are running a permissive sandbox for a trusted repository, you may be exposing the agent to live web content without realising it.

The Object-Form Configuration

Since v0.128, the tools.web_search key accepts an object form that provides fine-grained control over search behaviour³:

[tools.web_search]
context_size = "medium"
allowed_domains = [
  "docs.openai.com",
  "developers.openai.com",
  "github.com",
  "stackoverflow.com",
  "developer.mozilla.org",
]

context_size

Controls how much content Codex retrieves from each search result³:

Value	Use Case
`low`	Quick lookups — error codes, API signatures, version numbers
`medium`	Balanced — documentation sections, code examples
`high`	Deep dives — full tutorials, architectural explanations

Higher context sizes consume more of the context window per search. For cost-sensitive CI pipelines using codex exec, low is usually sufficient.

allowed_domains

An array of domain strings that restricts which sites the agent may retrieve results from³. This is the single most effective defence against prompt injection through web search because it limits the attack surface to domains you trust.

[tools.web_search]
allowed_domains = [
  "docs.openai.com",
  "developer.mozilla.org",
  "rust-lang.org",
  "docs.rs",
]

If the array is empty or omitted, all domains are permitted. For enterprise deployments, populate this list from a shared managed_config.toml to enforce team-wide defaults⁴.

location

An optional object providing approximate geographical context for location-sensitive searches³:

[tools.web_search]
context_size = "medium"
allowed_domains = ["docs.openai.com", "github.com"]

[tools.web_search.location]
country = "GB"
region = "England"
city = "London"
timezone = "Europe/London"

This is rarely needed for coding tasks but can be useful when searching for region-specific API endpoints or compliance documentation.

Profile-Based Mode Switching

Different workflows demand different search modes. Codex profiles let you switch without editing your base configuration³⁵:

# Base config — conservative default
web_search = "cached"

[profiles.research]
web_search = "live"
model = "o3"

[profiles.ci]
web_search = "disabled"
model = "o4-mini"

[profiles.secure]
web_search = "cached"

Invoke a profile with codex --profile research or codex exec --profile ci. The CI profile disables web search entirely — in a pipeline, the agent should work from the repository and its AGENTS.md context, not from arbitrary web content⁶.

flowchart TD
    A[codex invoked] --> B{Which profile?}
    B -->|default| C[web_search = cached]
    B -->|research| D[web_search = live]
    B -->|ci| E[web_search = disabled]
    B -->|--yolo flag| F[web_search = live<br/>auto-upgraded]
    C --> G[OpenAI index only]
    D --> H[Live web fetches]
    E --> I[No web search tool]
    F --> H

The Prompt Injection Threat Model

Web search introduces a specific class of prompt injection: the agent fetches a page that contains hidden instructions designed to manipulate its behaviour². This is not hypothetical — indirect prompt injection via web content has been demonstrated against multiple agent frameworks⁷.

How the Attack Works

The agent encounters an unfamiliar error and invokes web_search.
A search result contains adversarial text (e.g., hidden in HTML comments, white-on-white text, or legitimate-looking documentation).
The agent incorporates the injected instructions into its reasoning.
The injected instructions cause the agent to exfiltrate data, modify files maliciously, or bypass approval policies.

Defence Layers

OpenAI’s “Running Codex safely” blog post, published on 8 May 2026, describes a multi-layered approach to mitigating this risk⁸:

Layer	Mechanism	Configuration
Cached search	Pre-indexed results curated by OpenAI; no live page fetches	`web_search = "cached"`
Domain allow-lists	Restrict retrievable domains to trusted sources	`allowed_domains = [...]`
Sandbox isolation	OS-enforced sandbox limits what the agent can do with injected instructions	`sandbox = "workspace-write"`
Approval policy	High-risk actions require human approval regardless of web content	`approval_policy = "on-failure"`
Guardian auto-review	Automated reviewer agent checks for data exfiltration and credential probing	`automatic_approval_review_policy = "auto_review"`

The key insight from OpenAI’s internal deployment is that these layers compose⁸. Cached search reduces the probability of encountering injected content. Domain allow-lists reduce it further. The sandbox and approval policy limit the blast radius if injection does occur. Guardian catches the most egregious exploitation attempts.

When to Use Live Mode

Live mode is appropriate when:

You are researching a newly released library or API (the cached index may lag by hours or days).
You are debugging a production incident and need the latest status page or changelog entry.
You are working on a trusted, isolated machine with full-access sandbox intentionally enabled.

Even in live mode, combine it with allowed_domains to limit exposure:

web_search = "live"

[tools.web_search]
context_size = "low"
allowed_domains = [
  "status.openai.com",
  "github.com",
  "docs.python.org",
]

Enterprise Configuration with requirements.toml

For teams on Codex Business or Enterprise plans, administrators can enforce web search policies via managed configuration⁴⁹. The requirements.toml file constrains security-sensitive settings that individual users cannot override:

# requirements.toml — admin-enforced policy
web_search = "cached"

This ensures that no developer on the team can switch to live mode without an administrator updating the policy. Combined with a managed allowed_domains list, this gives security teams confidence that the agent’s web access is bounded⁴.

Administrators can also push different policies to different user groups via the Codex Policies page in the admin console, allowing research teams live access whilst restricting production-facing teams to cached mode⁹.

Practical Recipes

Recipe 1: Documentation Lookup in CI

Use codex exec with web search disabled to avoid any external data dependency in your pipeline:

codex exec \
  --profile ci \
  --sandbox workspace-write \
  "Fix the failing TypeScript type check in src/api/handler.ts"

Recipe 2: Research Session with Guardrails

For a development session where you need current documentation but want protection:

# .codex/config.toml (repo-level)
web_search = "live"

[tools.web_search]
context_size = "medium"
allowed_domains = [
  "docs.openai.com",
  "developer.mozilla.org",
  "nodejs.org",
  "typescriptlang.org",
]

Recipe 3: Air-Gapped Enterprise Deployment

For regulated environments where no external web access is permitted:

web_search = "disabled"

Combine with MCP documentation servers (such as Context7¹⁰) to provide the agent with current library documentation without any live web access.

Verifying Your Configuration

Check which web search mode is active in a running session:

# In the TUI, check current settings
/status

In non-interactive mode, the --json flag on codex exec includes web_search items in the output transcript, letting you audit exactly which searches the agent performed and which domains it accessed¹.

Key Takeaways

Default to cached mode. It covers the vast majority of documentation lookups without exposing the agent to live prompt injection vectors.
Use allowed_domains whenever you enable live mode. An unrestricted live search is an open door to prompt injection.
Disable web search in CI. Your pipeline should be deterministic; web search introduces non-determinism and security risk.
Layer your defences. Cached search, domain allow-lists, sandbox isolation, and approval policies compose into a robust security posture.
Watch for silent mode upgrades. The --yolo and danger-full-access flags automatically switch web search from cached to live.

Citations

OpenAI, “Features – Codex CLI,” OpenAI Developers, 2026. https://developers.openai.com/codex/cli/features ↩ ↩² ↩³ ↩⁴ ↩⁵ ↩⁶
OpenAI, “Agent Approvals & Security – Codex,” OpenAI Developers, 2026. https://developers.openai.com/codex/agent-approvals-security ↩ ↩²
OpenAI, “Configuration Reference – Codex,” OpenAI Developers, 2026. https://developers.openai.com/codex/config-reference ↩ ↩² ↩³ ↩⁴ ↩⁵ ↩⁶ ↩⁷
OpenAI, “Managed Configuration – Codex Enterprise,” OpenAI Developers, 2026. https://developers.openai.com/codex/enterprise/managed-configuration ↩ ↩² ↩³
OpenAI, “Advanced Configuration – Codex,” OpenAI Developers, 2026. https://developers.openai.com/codex/config-advanced ↩
OpenAI, “Non-interactive Mode – Codex,” OpenAI Developers, 2026. https://developers.openai.com/codex/noninteractive ↩
Greshake, K. et al., “Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection,” arXiv:2302.12173, 2023. https://arxiv.org/abs/2302.12173 ↩
OpenAI, “Running Codex Safely at OpenAI,” OpenAI Blog, 8 May 2026. https://openai.com/index/running-codex-safely/ ↩ ↩²
OpenAI, “Admin Setup – Codex Enterprise,” OpenAI Developers, 2026. https://developers.openai.com/codex/enterprise/admin-setup ↩ ↩²
OpenAI, “Sample Configuration – Codex,” OpenAI Developers, 2026. https://developers.openai.com/codex/config-sample ↩