Codex CLI Reasoning Tiers: Mapping the June 2026 Model Picker to CLI Profiles for Cross-Surface Consistency

On 10 June 2026, OpenAI replaced the ChatGPT model picker with a simplified six-tier reasoning menu: Instant, Medium, High, Extra High, Pro Standard, and Pro Extended¹. The labels are friendlier, but they hide a problem every team running both the Codex App and Codex CLI will eventually hit: a developer selects “High” in the App, then switches to the CLI and gets the same model at a different reasoning effort, producing inconsistent behaviour for the same task. This article maps the new tiers to concrete config.toml profiles, explains the token economics behind each level, and provides a team standardisation playbook that keeps agent behaviour consistent regardless of which surface a developer uses.

What Changed on 10 June

Before the update, ChatGPT exposed model names directly — GPT-5.5, GPT-5.5 Thinking, and so on — alongside a separate thinking-level toggle (Standard, Extended, Heavy). The June 10 redesign collapsed both dimensions into a single selector¹:

Picker Label	Previous Name	Underlying Model	Reasoning Effort	Access
Instant	GPT-5.5 Instant	GPT-5.5	`none` / `minimal`	Plus, Pro, Enterprise, Edu
Medium	Thinking Standard	GPT-5.5	`medium`	Plus, Pro, Enterprise, Edu
High	Thinking Extended	GPT-5.5	`high`	Plus, Pro, Enterprise, Edu
Extra High	Thinking Heavy	GPT-5.5	`xhigh`	Pro only
Pro Standard	Pro Standard	GPT-5.5	`high` + extended compute	Pro only
Pro Extended	Pro Extended	GPT-5.5	`xhigh` + extended compute	Pro only

The model underneath is the same across all six tiers — GPT-5.5². What changes is the reasoning budget the model allocates before producing output. Instant skips extended reasoning entirely; Extra High and the Pro tiers let the model think for significantly longer, burning substantially more tokens in the process³.
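The table can also be read as a simple lookup. The sketch below is illustrative only: the dictionary and function names are hypothetical, and `none` is picked for Instant even though the table lists `none` / `minimal`.

```python
# Picker label -> (reasoning effort, uses extended compute), transcribed from the table above.
# Instant is listed as `none` / `minimal`; `none` is an arbitrary choice for this sketch.
PICKER_TIERS = {
    "Instant": ("none", False),
    "Medium": ("medium", False),
    "High": ("high", False),
    "Extra High": ("xhigh", False),
    "Pro Standard": ("high", True),
    "Pro Extended": ("xhigh", True),
}


def reasoning_effort_for(label: str) -> str:
    """Return the reasoning effort behind a picker label."""
    if label not in PICKER_TIERS:
        raise ValueError(f"unknown picker label: {label!r}")
    effort, _extended = PICKER_TIERS[label]
    return effort


def uses_extended_compute(label: str) -> bool:
    """Return True for the tiers the table marks as '+ extended compute'."""
    if label not in PICKER_TIERS:
        raise ValueError(f"unknown picker label: {label!r}")
    return PICKER_TIERS[label][1]


print(reasoning_effort_for("High"))           # high
print(reasoning_effort_for("Pro Standard"))   # high
print(uses_extended_compute("Pro Standard"))  # True
```

High and Pro Standard share a reasoning effort and differ only in the extended compute flag, which is why the second value is kept alongside the first.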

Why This Matters for CLI Users

The Codex App now abstracts away model_reasoning_effort behind friendly labels. The CLI does not. A developer who picks “High” in the App and then runs codex from a terminal with the default config.toml will land on medium reasoning effort — the CLI’s default for GPT-5.5⁴. The agent will produce noticeably different results for complex tasks: shallower analysis, fewer edge cases considered, and occasionally missed architectural concerns that the App session handled correctly.

For teams using both surfaces — and the June 2026 data suggests most enterprise teams do⁵ — this divergence creates a subtle quality inconsistency that is difficult to debug because neither surface reports the other’s settings.

Mapping Tiers to CLI Profiles

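The solution is a set of named profiles that mirror the App’s tier labels. Since Codex CLI moved to per-file profile configuration in v0.135⁶, each profile lives in its own file under ~/.codex/, named after the tier it mirrors. The four profiles below cover the tiers that model_reasoning_effort can express; the two Pro tiers are covered under Limitations and Caveats.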

Instant Profile

# ~/.codex/instant.config.toml
model = "gpt-5.5"
model_reasoning_effort = "none"
model_reasoning_summary = "none"
model_verbosity = "low"

Best for: quick lookups, simple renames, trivial formatting fixes. Latency is minimal and token cost is lowest³.

Medium Profile

# ~/.codex/medium.config.toml
model = "gpt-5.5"
model_reasoning_effort = "medium"

This is the default reasoning tier and the recommended starting point for most development work. GPT-5.5 defaults to medium reasoning effort when no override is specified⁴, so this profile mainly serves as an explicit anchor — ensuring the developer knows they are matching the App’s “Medium” tier rather than relying on an implicit default.

High Profile

# ~/.codex/high.config.toml
model = "gpt-5.5"
model_reasoning_effort = "high"

Best for: code review, refactoring decisions, security analysis, and any task where missing an edge case costs more than the extra reasoning tokens⁷.

Extra High Profile

# ~/.codex/xhigh.config.toml
model = "gpt-5.5"
model_reasoning_effort = "xhigh"

Reserve this for genuinely hard problems — architectural decisions, complex migration planning, and security audits. The token burn rate roughly doubles compared to high³. This tier maps to the App’s “Extra High” and is only available to Pro subscribers in the App, though CLI users with API keys can access xhigh regardless of their ChatGPT subscription tier⁸.
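For teams that would rather generate the four files than copy them by hand, the following is a minimal sketch. It assumes the file naming and keys shown in the profiles above; the function names are hypothetical. The demo writes into a temporary directory rather than ~/.codex/ so that nothing is overwritten by accident.

```python
import tempfile
from pathlib import Path

# Profile name -> reasoning effort, as listed in the four profiles above.
PROFILES = {
    "instant": "none",
    "medium": "medium",
    "high": "high",
    "xhigh": "xhigh",
}


def render_profile(effort: str) -> str:
    """Render the TOML body for one profile."""
    lines = ['model = "gpt-5.5"', f'model_reasoning_effort = "{effort}"']
    if effort == "none":
        # The Instant profile above also trims summaries and verbosity.
        lines += ['model_reasoning_summary = "none"', 'model_verbosity = "low"']
    return "\n".join(lines) + "\n"


def write_profiles(target_dir: Path) -> list[Path]:
    """Write <name>.config.toml for every profile and return the paths."""
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, effort in PROFILES.items():
        path = target_dir / f"{name}.config.toml"
        path.write_text(render_profile(effort), encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Demo only: write to a throwaway directory and list the results.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_profiles(Path(tmp)):
            print(path.name)
```

Pointing write_profiles at a checked-in directory in the repository keeps the generated files reviewable before anyone copies them into place.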

Using Profiles

Launch a session with a specific profile:

codex --profile high

Or set a default in your base config.toml:

# ~/.codex/config.toml
model = "gpt-5.5"
model_reasoning_effort = "medium"

The /effort slash command allows mid-session switching without restarting⁹:

/effort high

The Token Economics of Each Tier

Understanding the cost implications is essential for budget-conscious teams. GPT-5.5 pricing sits at $5.00 per million input tokens and $30.00 per million output tokens¹⁰. Reasoning tokens — the hidden “thinking” tokens the model generates internally — count as output tokens but are not visible in the response³.

graph LR
    A[Instant<br/>none/minimal] -->|~1x base cost| B[Medium<br/>medium effort]
    B -->|~1.5-2x| C[High<br/>high effort]
    C -->|~2-3x| D[Extra High<br/>xhigh effort]
    D -->|~3-5x| E[Pro Extended<br/>xhigh + extended]

    style A fill:#2d6a4f,color:#fff
    style B fill:#40916c,color:#fff
    style C fill:#e9c46a,color:#000
    style D fill:#e76f51,color:#fff
    style E fill:#9b2226,color:#fff

The multipliers above are approximate and task-dependent. A simple code formatting task may see a negligible difference between Instant and High. In a complex architectural review at xhigh, reasoning tokens can outnumber visible output tokens by 5:1³, and every one of them is billed at the output rate.
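The arithmetic behind these figures can be sketched as follows. The prices are the ones quoted in this section; the token counts are hypothetical, chosen only to show what a 5:1 reasoning-to-output ratio does to the bill for a single turn.

```python
INPUT_USD_PER_M = 5.00    # GPT-5.5 input price quoted above
OUTPUT_USD_PER_M = 30.00  # GPT-5.5 output price quoted above


def turn_cost_usd(input_tokens: int, visible_output_tokens: int,
                  reasoning_tokens: int = 0) -> float:
    """Cost of one turn; reasoning tokens are billed as output tokens."""
    billed_output = visible_output_tokens + reasoning_tokens
    total = input_tokens * INPUT_USD_PER_M + billed_output * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical review: 20,000 input tokens and 2,000 visible output tokens.
no_reasoning = turn_cost_usd(20_000, 2_000)
# Same turn with reasoning tokens at 5x the visible output.
xhigh_5_to_1 = turn_cost_usd(20_000, 2_000, reasoning_tokens=10_000)

print(f"${no_reasoning:.2f}")  # $0.16
print(f"${xhigh_5_to_1:.2f}")  # $0.46
```

In this example the visible response is identical in length, yet the turn costs nearly three times as much once the hidden reasoning tokens are counted.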

Cached Input Discount

GPT-5.5 cached input tokens cost $0.50 per million, a 90% discount on standard input pricing¹⁰. For teams running repeated sessions against the same codebase, this makes high and xhigh sessions more affordable on second and subsequent turns, since AGENTS.md files, system prompts, and tool definitions all hit the cache. The discount applies to input only; reasoning tokens are still billed at the full output rate.
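A rough illustration of the discount, using the two input prices quoted above. The prompt size and the cached share are hypothetical; actual cache hit rates depend on how much of the prompt repeats between turns.

```python
INPUT_USD_PER_M = 5.00         # standard input price quoted above
CACHED_INPUT_USD_PER_M = 0.50  # cached input price quoted above


def input_cost_usd(input_tokens: int, cached_tokens: int = 0) -> float:
    """Input-side cost of one turn, with part of the prompt served from cache."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = uncached * INPUT_USD_PER_M + cached_tokens * CACHED_INPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical 50,000-token prompt; 40,000 tokens repeat on later turns.
first_turn = input_cost_usd(50_000)
later_turn = input_cost_usd(50_000, cached_tokens=40_000)

print(f"${first_turn:.2f}")  # $0.25
print(f"${later_turn:.2f}")  # $0.07
```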

Plan Mode Reasoning Effort

Codex CLI supports a separate plan_mode_reasoning_effort key that activates when you use the /plan command⁹. This allows a useful pattern: run interactive sessions at medium but escalate to high when the agent enters planning mode:

# ~/.codex/config.toml
model = "gpt-5.5"
model_reasoning_effort = "medium"
plan_mode_reasoning_effort = "high"

This mirrors a common human workflow — think harder when planning, move faster when executing — and keeps costs manageable while ensuring plans are thorough.

Cross-Surface Consistency for Teams

The requirements.toml Approach

Enterprise administrators can constrain configuration across the organisation using requirements.toml¹¹. It cannot directly set or enforce a reasoning effort, but it can restrict the available models and enforce managed configuration bundles, a feature that landed in v0.137 and was extended with cloud-managed config bundles in June 2026¹²:

# requirements.toml (managed by admin)
[model]
allowed = ["gpt-5.5"]

Repository-Level AGENTS.md

For project-specific consistency, include reasoning guidance in the repository’s AGENTS.md:

## Reasoning Guidelines

- Code review tasks: use `high` reasoning effort
- Formatting and linting: `none` or `minimal` is sufficient
- Security-sensitive changes: always use `xhigh`
- Default for general development: `medium`

This does not enforce profile selection, but it primes the agent to behave consistently and gives team members a shared reference for which tier to select.
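The same guidelines can be encoded as a small helper for scripts or onboarding docs. This is a sketch under stated assumptions: the task-kind keys are hypothetical labels invented here, `none` is chosen where the guidelines allow `none` or `minimal`, and the profile names assume the four profile files defined earlier in this article.

```python
# Task kind -> reasoning effort, following the guidelines above.
# The task-kind keys are hypothetical labels invented for this sketch.
GUIDELINE_EFFORT = {
    "code_review": "high",
    "formatting": "none",  # the guidelines allow `none` or `minimal`
    "linting": "none",
    "security": "xhigh",
}
DEFAULT_EFFORT = "medium"  # default for general development

# Reasoning effort -> profile name, matching the profile files defined earlier.
EFFORT_TO_PROFILE = {
    "none": "instant",
    "medium": "medium",
    "high": "high",
    "xhigh": "xhigh",
}


def suggested_command(task_kind: str) -> str:
    """Return the CLI invocation that matches the guideline for a task kind."""
    effort = GUIDELINE_EFFORT.get(task_kind, DEFAULT_EFFORT)
    return f"codex --profile {EFFORT_TO_PROFILE[effort]}"


print(suggested_command("security"))      # codex --profile xhigh
print(suggested_command("formatting"))    # codex --profile instant
print(suggested_command("feature_work"))  # codex --profile medium
```

Unknown task kinds fall back to the medium profile, which matches the guideline's stated default.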

Shared Profile Distribution via Plugins

Teams can distribute standardised profiles through the Codex plugin system¹³. A plugin manifest can bundle profile configuration files alongside skills and hooks, ensuring every team member has identical tier definitions:

{
  "name": "team-profiles",
  "version": "1.0.0",
  "skills": [],
  "config_files": [
    "instant.config.toml",
    "medium.config.toml",
    "high.config.toml",
    "xhigh.config.toml"
  ]
}

The “Show Additional Models” Escape Hatch

The June 10 update also introduced a “Show additional models” toggle in the App’s web settings, revealing legacy options including o3, o4-mini, and GPT-4.1¹. For CLI users, these models remain accessible via config.toml without any toggle, but most are now retired from the consumer ChatGPT picker and approaching API deprecation windows¹⁴. Teams standardising on the new tier model should treat these legacy options as transitional and plan their removal from CI/CD profiles before the deprecation deadlines.

A Practical Team Standardisation Checklist

Audit current usage: Run codex doctor --json and check what model and reasoning effort each team member is actually using¹⁵.
Create shared profiles: Distribute four standard profile files (instant, medium, high, xhigh) through your plugin marketplace or repository.
Set sensible defaults: Configure the base config.toml with medium as the default and high for plan mode.
Document tier selection: Add reasoning guidelines to your project’s AGENTS.md so developers know which tier to use for which task.
Monitor token consumption: Use the workspace analytics dashboard (available for Business and Enterprise accounts) to track whether xhigh usage is justified by measurably better outcomes¹⁶.
Align App and CLI: Communicate to the team that the App’s “High” equals --profile high in the CLI. Post the mapping table in your team wiki.

Limitations and Caveats

Pro Standard and Pro Extended tiers use extended compute budgets that are not directly replicable through the CLI’s model_reasoning_effort key alone. These tiers involve server-side resource allocation tied to Pro subscriptions¹. CLI users with API keys can use xhigh but will not get the exact same extended compute behaviour. ⚠️
Automatic Instant-to-Medium switching: The App offers a setting where Instant auto-switches to Medium for harder questions². The CLI has no equivalent: you get exactly the effort level you configure. Teams that rely on this adaptive behaviour in the App will need to pick the tier themselves in the CLI. The trade-off has an upside, since deterministic reasoning depth makes results easier to reproduce.
Token visibility: Reasoning tokens are not surfaced in the CLI’s standard output. Use codex doctor or the workspace analytics dashboard to understand actual token consumption per session¹⁵.

Conclusion

The simplified model picker is a usability improvement for casual ChatGPT users, but it creates a translation gap for teams working across the Codex App and CLI. Named profiles that map one-to-one with the picker’s four non-Pro tiers close most of that gap. The investment is small (four TOML files and a team convention), but the payoff is significant: consistent agent behaviour, predictable costs, and no more debugging sessions that turn out to be reasoning-effort mismatches in disguise.

Citations

1. OpenAI, “ChatGPT Release Notes — June 10, 2026: Simplified model picker”, https://help.openai.com/en/articles/6825453-chatgpt-release-notes
2. OpenAI, “GPT-5.5 in ChatGPT”, https://help.openai.com/en/articles/11909943
3. OpenAI, “Reasoning models — reasoning effort and token economics”, https://developers.openai.com/api/docs/guides/reasoning
4. OpenAI, “Using GPT-5.5 — default reasoning effort”, https://developers.openai.com/api/docs/guides/latest-model
5. OpenAI, “Codex is now generally available — 4 million weekly developers”, https://openai.com/index/codex-now-generally-available/
6. OpenAI, “Codex CLI Advanced Configuration — profile files”, https://developers.openai.com/codex/config-advanced
7. Arsturn, “GPT-5 Reasoning Effort Levels Explained”, https://www.arsturn.com/blog/gpt-5-reasoning-effort-levels-explained
8. OpenAI, “OpenAI API Pricing — GPT-5.5”, https://openai.com/api/pricing/
9. Codex Knowledge Base, “Reasoning Effort Tuning: Minimal to xhigh for Cost and Speed”, https://codex.danielvaughan.com/2026/03/27/reasoning-effort-tuning/
10. OpenAI, “API Pricing — GPT-5.5: $5/$30 per 1M tokens”, https://openai.com/api/pricing/
11. OpenAI, “Codex CLI — requirements.toml policy enforcement”, https://developers.openai.com/codex/cli
12. OpenAI, “Codex Changelog — v0.137: cloud-managed config bundles”, https://developers.openai.com/codex/changelog
13. OpenAI, “Codex Plugin System — bundling skills and configuration”, https://deepwiki.com/openai/codex/5.11-plugins-system
14. OpenAI, “Model deprecations and retirements”, https://help.openai.com/en/articles/9624314-model-release-notes
15. OpenAI, “Codex CLI — codex doctor diagnostics”, https://developers.openai.com/codex/cli/features
16. OpenAI, “Codex Pricing — workspace analytics”, https://developers.openai.com/codex/pricing