Codex Pay-As-You-Go Pricing: Modelling Costs for Multi-Agent Workflows

The April 2026 Pricing Restructure

On 2 April 2026, OpenAI replaced Codex’s per-message credit system with token-based billing aligned to API usage¹. The same announcement lowered the annual ChatGPT Business seat price from $25 to $20² and introduced a new seat type—Codex-only seats—that carry no fixed monthly fee and bill purely on token consumption¹. The restructure applies to new and existing Plus, Pro, ChatGPT Business, and new ChatGPT Enterprise plans³.

The headline change is transparency: every token consumed by every agent is now individually measurable, making multi-agent cost modelling feasible for the first time.

Seat Types After the Restructure

Teams on ChatGPT Business and Enterprise now choose between two seat types:

Seat Type	Fixed Cost	Codex Access	Rate Limits	Billing Model
Standard Business	$20/seat/month	Included (capped)	Plan limits apply	Subscription + optional credits
Codex-only	$0/seat/month	Full access	No rate limits	Token consumption only

Codex-only seats provide full Codex access without ChatGPT workspace features¹. Usage is billed on token consumption, giving teams a clearer view of how spend maps to actual work¹. For teams that need Codex but not the broader ChatGPT workspace, this removes the per-seat floor entirely.

The $500 Credit Promotion

Eligible ChatGPT Business workspaces can earn $100 in Codex credits for each new Codex-only member who sends their first Codex message, up to $500 per workspace¹. This provides a low-risk evaluation path—enough credit to run meaningful multi-agent trials before committing to ongoing spend.

The Rate Card: Models and Token Costs

The April 2026 rate card lists four models with credits priced per million tokens⁴:

Model	Input Credits/1M	Cached Input Credits/1M	Output Credits/1M
GPT-5.4	62.50	6.250	375
GPT-5.3-Codex	43.75	4.375	350
GPT-5.4-Mini	18.75	1.875	113
GPT-5.1-Codex-Mini	6.25	0.625	50

Fast mode doubles credit consumption⁴. Cached input is billed at one-tenth of the regular input rate, which provides significant savings for repeated repository context⁴.
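As a rough illustration of how the rate card translates into credits, the sketch below prices one session from its token counts. The rates are copied from the table above; the function name `session_credits` and the example token counts are hypothetical, and applying the fast-mode factor uniformly to all three token classes is an assumption based on the statement that fast mode doubles credit consumption.

```python
# Credits per 1M tokens, copied from the April 2026 rate card above.
RATE_CARD = {
    "GPT-5.4":            {"input": 62.50, "cached": 6.250, "output": 375.0},
    "GPT-5.3-Codex":      {"input": 43.75, "cached": 4.375, "output": 350.0},
    "GPT-5.4-Mini":       {"input": 18.75, "cached": 1.875, "output": 113.0},
    "GPT-5.1-Codex-Mini": {"input": 6.25,  "cached": 0.625, "output": 50.0},
}


def session_credits(model, uncached_in, cached_in, out, fast=False):
    """Credits for one session (hypothetical helper, not part of any SDK)."""
    rates = RATE_CARD[model]
    credits = (
        uncached_in * rates["input"]
        + cached_in * rates["cached"]
        + out * rates["output"]
    ) / 1_000_000
    # ASSUMPTION: fast mode doubles the whole bill, not one token class.
    return credits * (2 if fast else 1)


# Hypothetical session: 100K uncached input, 400K cached input, 50K output.
standard = session_credits("GPT-5.4", 100_000, 400_000, 50_000)
fast = session_credits("GPT-5.4", 100_000, 400_000, 50_000, fast=True)
print(f"standard: {standard:.2f} credits, fast: {fast:.2f} credits")
```

With these example counts, output tokens account for most of the credits even though they are a small share of the token volume.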

For teams using API key mode directly (via preferred_auth_method = "apikey" in the CLI config), dollar-denominated pricing applies⁴:

Model	API Input $/1M	API Cached $/1M	API Output $/1M
gpt-5.1-codex-mini	$0.25	$0.025	$2.00
gpt-5.3-codex	$1.75	$0.175	$14.00
gpt-5.4	$2.50	$0.25	$15.00

The cost differential between GPT-5.1-Codex-Mini and GPT-5.4 is roughly 10× on input and 7.5× on output. This gap is the primary lever for multi-agent cost optimisation.
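The ratios quoted above follow directly from the API price table. A minimal sketch, where `price_ratio` is a hypothetical helper name:

```python
# API prices in USD per 1M tokens, copied from the table above.
API_PRICES = {
    "gpt-5.1-codex-mini": {"input": 0.25, "cached": 0.025, "output": 2.00},
    "gpt-5.3-codex":      {"input": 1.75, "cached": 0.175, "output": 14.00},
    "gpt-5.4":            {"input": 2.50, "cached": 0.25,  "output": 15.00},
}


def price_ratio(expensive, cheap, kind):
    """How many times more the `expensive` model costs for one token class."""
    return API_PRICES[expensive][kind] / API_PRICES[cheap][kind]


for kind in ("input", "cached", "output"):
    ratio = price_ratio("gpt-5.4", "gpt-5.1-codex-mini", kind)
    print(f"{kind}: {ratio:.1f}x")
```

Because the output ratio is smaller than the input ratio, the saving from switching models depends on how output-heavy a task is.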

Modelling Multi-Agent Costs

Single-Agent Baseline

A typical single-agent Codex session performing multi-file feature work (touching ~5 files) costs approximately $0.16 on GPT-5.4 API pricing⁵. A full day of intensive single-agent coding runs to roughly $2.00⁵. These figures assume standard (non-fast) mode.

Scaling to Parallel Agents

Multi-agent workflows multiply token consumption roughly linearly with agent count. Five parallel codex exec processes each maintaining their own context window will consume approximately 5× the tokens of a single session⁶. However, three factors modify this naive multiplier:

Cached input tokens: When multiple agents work on the same repository, subsequent agents benefit from cached input pricing, which is one-tenth of the regular input rate. In practice, repository context (the largest input component) is heavily shared.
Model mixing: Not every agent needs GPT-5.4. Routine tasks such as linting, test generation, and documentation can run on GPT-5.1-Codex-Mini at one-tenth the input cost and roughly one-seventh the output cost.
Fast mode selectivity: Reserving fast mode (2× credits) for the orchestrating agent while workers run in standard mode keeps worker costs at the standard rate instead of doubling them.
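To see how the first two factors bend the naive multiplier, the sketch below compares five agents under three scenarios using the API prices listed earlier. The per-agent workload (500K input tokens, 50K output tokens), the 80% cache hit rate, and the two-plus-three model split are hypothetical values chosen for illustration; fast mode is left out of this sketch.

```python
# USD per 1M tokens, from the API pricing table earlier in the article.
PRICES = {
    "gpt-5.4":            {"input": 2.50, "cached": 0.25,  "output": 15.00},
    "gpt-5.1-codex-mini": {"input": 0.25, "cached": 0.025, "output": 2.00},
}


def run_cost(model, input_tokens, output_tokens, cache_hit_rate=0.0):
    """Dollar cost of one agent run, splitting input into cached and uncached."""
    p = PRICES[model]
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    return (
        uncached * p["input"] + cached * p["cached"] + output_tokens * p["output"]
    ) / 1_000_000


# Hypothetical per-agent workload.
IN_TOK, OUT_TOK = 500_000, 50_000

single = run_cost("gpt-5.4", IN_TOK, OUT_TOK)
naive_five = 5 * single
cached_five = 5 * run_cost("gpt-5.4", IN_TOK, OUT_TOK, cache_hit_rate=0.8)
mixed_five = (
    2 * run_cost("gpt-5.4", IN_TOK, OUT_TOK, cache_hit_rate=0.8)
    + 3 * run_cost("gpt-5.1-codex-mini", IN_TOK, OUT_TOK, cache_hit_rate=0.8)
)

for label, cost in [
    ("naive", naive_five),
    ("cached", cached_five),
    ("cached + mixed", mixed_five),
]:
    print(f"{label}: ${cost:.3f} ({cost / single:.2f}x one uncached run)")
```

Under these assumptions the effective multiplier falls well below 5×, although the exact figure is sensitive to the cache hit rate actually achieved.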

Worked Example: Five-Agent Feature Pipeline

Consider a staged pipeline where an orchestrator delegates to four specialist workers:

graph LR
    O[Orchestrator<br/>GPT-5.4 Fast] --> A[Planner<br/>GPT-5.3-Codex]
    O --> B[Implementer<br/>GPT-5.4]
    O --> C[Test Writer<br/>GPT-5.1-Codex-Mini]
    O --> D[Reviewer<br/>GPT-5.3-Codex]
    A --> B
    B --> C
    C --> D

Assuming each agent processes approximately 500K input tokens with an 80% cache hit rate (100K billed at the full input rate, 400K at the cached rate) and generates 50K output tokens per run, API pricing gives:

Agent	Model	Uncached Input Cost	Cached Input Cost	Output Cost	Total
Orchestrator	GPT-5.4 (fast, 2×)	$0.25	$0.10	$1.50	$1.85
Planner	GPT-5.3-Codex	$0.175	$0.07	$0.70	$0.95
Implementer	GPT-5.4	$0.25	$0.10	$0.75	$1.10
Test Writer	GPT-5.1-Codex-Mini	$0.025	$0.01	$0.10	$0.14
Reviewer	GPT-5.3-Codex	$0.175	$0.07	$0.70	$0.95
				Pipeline Total	$4.99
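The table can be reproduced with a short script. This is a sketch under the stated workload assumptions, not an official calculator: the `output_multiplier` parameter is a hypothetical device that mirrors the orchestrator row, where the 2× fast-mode factor appears in the output column only.

```python
# USD per 1M tokens, from the API pricing table earlier in the article.
PRICES = {
    "gpt-5.1-codex-mini": {"input": 0.25, "cached": 0.025, "output": 2.00},
    "gpt-5.3-codex":      {"input": 1.75, "cached": 0.175, "output": 14.00},
    "gpt-5.4":            {"input": 2.50, "cached": 0.25,  "output": 15.00},
}

INPUT_TOKENS = 500_000
CACHE_HIT_RATE = 0.80
OUTPUT_TOKENS = 50_000

# (agent, model, output multiplier). The 2.0 mirrors the table's orchestrator
# row, which doubles only the output column for fast mode.
AGENTS = [
    ("Orchestrator", "gpt-5.4", 2.0),
    ("Planner", "gpt-5.3-codex", 1.0),
    ("Implementer", "gpt-5.4", 1.0),
    ("Test Writer", "gpt-5.1-codex-mini", 1.0),
    ("Reviewer", "gpt-5.3-codex", 1.0),
]


def agent_costs(model, output_multiplier=1.0):
    """Return (uncached input, cached input, output) cost in USD for one run."""
    p = PRICES[model]
    cached_tokens = INPUT_TOKENS * CACHE_HIT_RATE
    uncached_tokens = INPUT_TOKENS - cached_tokens
    uncached = uncached_tokens * p["input"] / 1e6
    cached = cached_tokens * p["cached"] / 1e6
    output = OUTPUT_TOKENS * p["output"] * output_multiplier / 1e6
    return uncached, cached, output


pipeline_total = 0.0
for name, model, mult in AGENTS:
    uncached, cached, output = agent_costs(model, mult)
    total = uncached + cached + output
    pipeline_total += total
    print(f"{name:<13} ${uncached:.3f}  ${cached:.3f}  ${output:.3f}  ${total:.3f}")
print(f"Pipeline total: ${pipeline_total:.3f}")
```

Summing the unrounded rows gives about $4.975; the table's $4.99 adds the per-agent totals after each has been rounded to the cent.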

Running this pipeline 10 times per day across a working month (22 days) yields approximately $1,098/month for the team. Compare this with five Standard Business seats at $20/seat/month ($100 total) with capped usage: at this model mix, pay-as-you-go is cheaper only if the team averages fewer than about 0.9 pipeline runs per day ($100 ÷ ($4.99 × 22)).
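The projection and break-even arithmetic can be sketched as follows, taking the $4.99 pipeline total from the table and the $20/seat/month Standard Business price from the seat table. The function names are hypothetical, and the comparison ignores the usage cap on subscription seats, so it is a lower bound on the subscription's effective cost rather than a like-for-like figure.

```python
def monthly_cost(cost_per_run, runs_per_day, working_days=22):
    """Pay-as-you-go spend for a month of pipeline runs, in USD."""
    return cost_per_run * runs_per_day * working_days


def break_even_runs_per_day(fixed_monthly_cost, cost_per_run, working_days=22):
    """Daily runs at which pay-as-you-go spend equals a fixed monthly cost."""
    return fixed_monthly_cost / (cost_per_run * working_days)


PIPELINE_COST = 4.99  # USD per run, pipeline total from the table above
SEATS = 5
SEAT_PRICE = 20.0     # USD per Standard Business seat per month (seat table)

payg = monthly_cost(PIPELINE_COST, runs_per_day=10)
threshold = break_even_runs_per_day(SEATS * SEAT_PRICE, PIPELINE_COST)
print(f"pay-as-you-go at 10 runs/day: ${payg:,.2f}/month")
print(f"break-even vs {SEATS} seats: {threshold:.2f} runs/day")
```

Swapping in a cheaper model mix lowers `PIPELINE_COST` and raises the break-even threshold proportionally.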

The Break-Even Calculation

graph TD
    subgraph Decision
        Q{Daily pipeline<br/>runs per dev?}
        Q -->|< 5 runs| PAY[Pay-as-you-go<br/>Codex-only seats]
        Q -->|5-10 runs| HYBRID[Hybrid: Standard seats<br/>+ Codex-only overflow]
        Q -->|> 10 runs| SUB[Standard Business<br/>seats with Pro upgrade]
    end

The crossover point depends heavily on model mix, so treat the run-count bands above as rough guides. Teams that aggressively route routine work to GPT-5.1-Codex-Mini can push the break-even threshold significantly higher.

CI/CD Billing Under the New Model

For codex exec in CI/CD pipelines, API key mode is the recommended authentication method⁷. Key considerations:

# CI configuration: use API key auth with the cheapest viable model
export OPENAI_API_KEY="${CODEX_CI_TOKEN}"
# --output-last-message expects a file path; last_message.txt is a placeholder name
codex exec --model gpt-5.1-codex-mini \
           --ephemeral \
           --json \
           --output-last-message last_message.txt \
           "Run the test suite and fix any failures"

--ephemeral skips session persistence, reducing overhead in stateless CI environments⁷
Model selection is critical: a CI lint-and-fix job with the worked example's token profile (500K input at 80% cache hits, 50K output) costs ~$0.14 per run on GPT-5.1-Codex-Mini versus ~$1.85 on GPT-5.4 with fast mode
--json emits machine-readable JSON Lines events to stdout, and --output-last-message writes the agent's final message to the given file, which together simplify pipeline integration⁷
Each CI invocation is independently billable, making per-pipeline cost attribution straightforward under token-based billing

A team running 50 CI pipeline invocations per day on GPT-5.1-Codex-Mini would spend approximately $7/day or $154/month—comparable to a single Pro subscription but with full auditability.
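The same arithmetic extends to other models. The sketch below reuses the per-run figures quoted above, which assume the worked example's token profile; a real CI job may consume considerably more or fewer tokens, so treat the results as indicative. `ci_spend` is a hypothetical helper name.

```python
# Per-run cost in USD, taken from the figures quoted above.
COST_PER_RUN = {
    "gpt-5.1-codex-mini": 0.14,
    "gpt-5.4 (fast)": 1.85,
}
RUNS_PER_DAY = 50
WORKING_DAYS = 22


def ci_spend(cost_per_run, runs_per_day=RUNS_PER_DAY, working_days=WORKING_DAYS):
    """Return (daily, monthly) CI spend in USD."""
    daily = cost_per_run * runs_per_day
    return daily, daily * working_days


for model, cost in COST_PER_RUN.items():
    daily, monthly = ci_spend(cost)
    print(f"{model}: ${daily:,.2f}/day, ${monthly:,.2f}/month")
```

At 50 runs per day, the gap between the two options is roughly $7/day versus $92.50/day, which is why model selection dominates CI cost.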

Comparison with Claude Code Max

For teams evaluating both tools, the pricing models differ fundamentally:

Dimension	Codex Pay-as-You-Go	Claude Code Max 5×	Claude Code Max 20×
Monthly cost	Variable (token-based)	$100/month fixed	$200/month fixed
Multi-agent multiplier	Linear with agents	~3× for 3 agents, ~7× with plan mode⁸	~3× for 3 agents, ~7× with plan mode⁸
Cost transparency	Per-token granularity	Opaque (usage counted against limit)	Opaque (usage counted against limit)
Rate limits	None (Codex-only seats)	Plan-based caps	Higher plan-based caps
Overage model	Pay for additional tokens	Hit limit, wait for reset	Hit limit, wait for reset

Claude Code’s flat-rate model is cheaper for heavy users—a $200/month Max 20× subscription can deliver $1,000–$5,000 worth of equivalent API compute⁸. However, Anthropic’s April 2026 restriction on third-party tool access means this value is locked within the Anthropic ecosystem⁹.

Codex’s token-based model offers superior cost attribution for enterprise teams who need to allocate spend across projects, teams, and CI pipelines. The trade-off is predictability: flat-rate subscriptions cap downside risk, while pay-as-you-go can surprise teams with unexpectedly high bills during intensive sprints.

Practical Recommendations

Start with the $500 credit promotion to benchmark your team’s actual token consumption before committing to a billing model
Instrument token usage early: use --json output to capture per-task token counts and build a consumption baseline
Model-mix aggressively: route test generation, documentation, and lint fixes to GPT-5.1-Codex-Mini; reserve GPT-5.4 for architectural decisions and complex refactoring
Use cached context intentionally: structure multi-agent workflows so agents share repository context, maximising the share of input billed at the cached rate (one-tenth of the regular input rate)
Set billing alerts: with no rate limits on Codex-only seats, a runaway agent loop can burn through credits rapidly
Hybrid seat strategy: assign Standard Business seats to developers who need ChatGPT workspace features; use Codex-only seats for CI/CD service accounts and occasional contributors
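The instrumentation and alerting recommendations can be combined in a small script that totals token usage from `codex exec --json` output and checks it against a budget. This is a sketch under loud assumptions: the event shape (a JSON object with a `usage` field holding `input_tokens`, `cached_input_tokens`, and `output_tokens`) and the convention that `input_tokens` includes cached tokens should both be verified against the output of your CLI version. The sample events, the budget value, and all function names are hypothetical.

```python
import json

# USD per 1M tokens for gpt-5.1-codex-mini, from the API pricing table.
PRICE = {"input": 0.25, "cached": 0.025, "output": 2.00}


def usage_from_jsonl(lines):
    """Sum token usage across every event that carries a `usage` object.

    ASSUMPTION: usage events look like
    {"type": "...", "usage": {"input_tokens": n, "cached_input_tokens": n,
    "output_tokens": n}}. Adjust the keys if your CLI emits a different shape.
    """
    totals = {"input_tokens": 0, "cached_input_tokens": 0, "output_tokens": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON noise in the stream
        usage = event.get("usage") if isinstance(event, dict) else None
        if not isinstance(usage, dict):
            continue
        for key in totals:
            totals[key] += int(usage.get(key) or 0)
    return totals


def cost_usd(totals, price=PRICE):
    # ASSUMPTION: input_tokens includes cached tokens, so subtract them.
    cached = totals["cached_input_tokens"]
    uncached = max(totals["input_tokens"] - cached, 0)
    return (
        uncached * price["input"]
        + cached * price["cached"]
        + totals["output_tokens"] * price["output"]
    ) / 1_000_000


def over_budget(spent_usd, budget_usd):
    return spent_usd >= budget_usd


# Hypothetical sample events standing in for real CLI output.
sample = [
    '{"type": "turn.started"}',
    '{"type": "turn.completed", "usage": {"input_tokens": 500000, '
    '"cached_input_tokens": 400000, "output_tokens": 50000}}',
]
totals = usage_from_jsonl(sample)
spent = cost_usd(totals)
status = "ALERT" if over_budget(spent, budget_usd=0.10) else "ok"
print(totals, f"${spent:.3f}", status)
```

In a pipeline, the same functions could read the JSON Lines stream from a saved log file and fail the job when the budget check trips.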

Citations

1. OpenAI, “Codex now offers pay-as-you-go pricing for teams,” 2 April 2026. https://openai.com/index/codex-flexible-pricing-for-teams/
2. Winbuzzer, “AI Coding: OpenAI Switches Codex to Pay-as-You-Go, Cuts Seat Cost to $20,” 4 April 2026. https://winbuzzer.com/2026/04/04/openai-switches-codex-pay-as-you-go-pricing-cuts-business-seat-cost-xcxwbn/
3. Lilting Channel, “OpenAI Codex Moves from Message-Based Credits to Token-Based Pricing,” April 2026. https://lilting.ch/en/articles/openai-codex-token-based-pricing-rate-card
4. OpenAI, “Codex rate card,” April 2026. https://help.openai.com/en/articles/20001106-codex-rate-card
5. Flowith Blog, “OpenAI Codex Pricing 2026: API Costs, Token Limits, and Which Tier Makes Sense for Your Dev Workflow.” https://flowith.io/blog/openai-codex-pricing-2026-api-costs-token-limits/
6. Get AI Perks, “OpenAI Codex Pricing 2026: Credits, Limits & API Costs.” https://www.getaiperks.com/en/articles/codex-pricing
7. OpenAI Developers, “Command line options – Codex CLI.” https://developers.openai.com/codex/cli/reference
8. SSD Nodes, “Claude Code Pricing in 2026: Every Plan Explained.” https://www.ssdnodes.com/blog/claude-code-pricing-in-2026-every-plan-explained-pro-max-api-teams/
9. PYMNTS, “Anthropic’s Claude Subscription Shift Signals New AI Pricing Era,” April 2026. https://www.pymnts.com/artificial-intelligence-2/2026/third-party-agents-lose-access-as-anthropic-tightens-claude-usage-rules/