Workspace Agents Credit Pricing Starts July 6: A Codex CLI Practitioner’s Budget Preparation Guide

OpenAI’s free period for Workspace Agents ends on 6 July 2026, when credit-based billing takes effect for Business and Enterprise accounts ¹. That gives teams twenty-two days to audit their agent usage, configure cost controls, and set budget expectations. This article walks through the credit arithmetic, the CLI configuration levers that directly affect spend, and a concrete preparation checklist.

What Changes on 6 July

Workspace Agents — the cloud-hosted Codex instances that run asynchronously on GitHub repositories, respond to Slack threads, and execute scheduled workflows — have been free since their launch ¹. From 6 July, every agent run will consume credits drawn from a workspace’s purchased pool ².

The key distinction for CLI practitioners: local Codex CLI sessions authenticated with an API key are unaffected by this change. API-key usage continues on standard per-token billing ³. The credit system applies to ChatGPT-authenticated sessions (subscription plans) and Workspace Agent runs initiated through the Codex App, IDE extensions, or cloud integrations ².

flowchart LR
    subgraph Unaffected["Unaffected by July 6"]
        A[API Key Auth] --> B[Per-Token Billing]
    end
    subgraph CreditBilling["Credit Billing from July 6"]
        C[ChatGPT Auth] --> D[Subscription Credits]
        E[Workspace Agent] --> F[Workspace Credits]
    end
    style CreditBilling fill:#fff3cd,stroke:#856404
    style Unaffected fill:#d4edda,stroke:#155724

The Credit Rate Card

Credits are consumed based on three token categories, with rates varying by model ² ⁴:

Model	Input (per 1M tokens)	Cached Input (per 1M tokens)	Output (per 1M tokens)
GPT-5.5	125 credits	12.5 credits	750 credits
GPT-5.4	62.5 credits	6.25 credits	375 credits
GPT-5.4 mini	18.75 credits	1.875 credits	113 credits
Codex-Spark	3.75 credits	0.375 credits	15 credits

The critical ratios to internalise: on the GPT models, output tokens cost 6× as much as input tokens (4× on Codex-Spark), and cached input costs one-tenth as much as fresh input ⁴. Every cost-optimisation strategy flows from these two facts.
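The ratios can be checked directly against the rate card. The following is a minimal sketch that only restates the table values; the function name `ratios` is illustrative and not part of any Codex tooling.

```python
# Rate card from the table above, in credits per 1M tokens.
RATES = {
    "GPT-5.5": {"input": 125.0, "cached": 12.5, "output": 750.0},
    "GPT-5.4": {"input": 62.5, "cached": 6.25, "output": 375.0},
    "GPT-5.4 mini": {"input": 18.75, "cached": 1.875, "output": 113.0},
    "Codex-Spark": {"input": 3.75, "cached": 0.375, "output": 15.0},
}


def ratios(model: str) -> tuple[float, float]:
    """Return (output/input, input/cached) rate ratios for one model."""
    r = RATES[model]
    return r["output"] / r["input"], r["input"] / r["cached"]


for name in RATES:
    out_in, in_cached = ratios(name)
    print(f"{name}: output/input = {out_in:.2f}, input/cached = {in_cached:.2f}")
```

The cached-input discount is uniform across the table, while the output multiplier depends on the model, so it is worth recomputing whenever the rate card changes.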

What a Typical Run Costs

A GPT-5.5 Workspace Agent run consuming 20,000 fresh input tokens, 80,000 cached input tokens, and 5,000 output tokens costs approximately 7.25 credits ². Across a team, typical monthly spend falls between $100 and $200 per developer depending on model selection and usage intensity ⁵.
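The per-run figure follows from multiplying each token count by its rate. Here is a small sketch of that arithmetic, assuming the GPT-5.5 rates from the rate card; `run_credits` is a hypothetical helper name.

```python
# GPT-5.5 rates from the rate card, in credits per 1M tokens.
GPT_5_5 = {"input": 125.0, "cached": 12.5, "output": 750.0}


def run_credits(rates: dict, fresh_input: int, cached_input: int, output: int) -> float:
    """Credits for one run: each token count times its per-1M rate."""
    return (
        fresh_input * rates["input"]
        + cached_input * rates["cached"]
        + output * rates["output"]
    ) / 1_000_000


cost = run_credits(GPT_5_5, fresh_input=20_000, cached_input=80_000, output=5_000)
print(cost)  # 7.25
```

The three terms contribute 2.5, 1.0, and 3.75 credits respectively, so the 5,000 output tokens account for more than half the cost of the run.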

CLI Configuration Levers That Cut Spend

The same config.toml profiles that control local CLI behaviour also influence how Workspace Agents consume credits when they inherit configuration from your project’s .codex/config.toml. Here are the levers that matter most.

1. Model Routing with Named Profiles

Route cheap tasks through cheaper models. Moving from GPT-5.5 to GPT-5.4 halves the per-token credit rate, and GPT-5.4 mini cuts it further ⁶:

# ~/.codex/config.toml

[profiles.default]
model = "gpt-5.5"
model_reasoning_effort = "medium"

[profiles.bulk]
model = "gpt-5.4-mini"
model_reasoning_effort = "low"
service_tier = "flex"

[profiles.review]
model = "gpt-5.5"
model_reasoning_effort = "high"

GPT-5.4 mini consumes roughly 3–4 credits per message compared to GPT-5.5’s 10–14 credits ⁶. Routing subagent fan-out, test generation, and lint-fix tasks through the bulk profile stretches your credit allocation considerably.
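One way to make the routing decision explicit is a small lookup in whatever script launches agent tasks. This is a sketch only: the task labels are hypothetical, and the profile names are the three defined in the config above.

```python
# Hypothetical task labels; adapt these to however your team tags agent work.
BULK_TASKS = {"subagent-fanout", "test-generation", "lint-fix"}
REVIEW_TASKS = {"code-review"}


def pick_profile(task_kind: str) -> str:
    """Map a task label to one of the profiles from config.toml."""
    if task_kind in BULK_TASKS:
        return "bulk"
    if task_kind in REVIEW_TASKS:
        return "review"
    # Anything unrecognised falls back to the default profile.
    return "default"


for kind in ("lint-fix", "code-review", "feature-work"):
    print(kind, "->", pick_profile(kind))
```

Falling back to `default` rather than `bulk` is a deliberate choice here: an unknown task gets the more capable model, at the price of higher credit use.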

2. Service Tier Selection

The service_tier key controls latency-cost tradeoffs ⁷:

# Flex tier: ~50% lower credit rates, higher latency
service_tier = "flex"

# Fast tier: 2× credit rates, lower latency
service_tier = "fast"

For Workspace Agent runs that execute asynchronously (the user is not waiting), flex is almost always the correct choice. Reserve fast for interactive CLI sessions where latency matters ⁷.
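To see what the tier choice means per run, the multipliers quoted above can be applied to a base cost. This sketch assumes the multipliers apply uniformly to all token categories, which the source does not confirm; the `standard` label for the default tier is also an assumption.

```python
# Multipliers taken from the text: flex is roughly half, fast is double.
# "standard" is a hypothetical label for the unmodified rate card.
TIER_MULTIPLIER = {"standard": 1.0, "flex": 0.5, "fast": 2.0}


def tiered_credits(base_credits: float, tier: str) -> float:
    """Scale a rate-card cost by the tier multiplier."""
    return base_credits * TIER_MULTIPLIER[tier]


base = 7.25  # the typical GPT-5.5 run from the rate-card example
for tier in TIER_MULTIPLIER:
    print(f"{tier}: {tiered_credits(base, tier)} credits")
```

Under these assumptions the same run costs four times as much on fast as on flex.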

3. Token Budget Controls

Cap output verbosity and context window usage to prevent runaway credit consumption ⁸:

# Cap the tokens kept from each tool call's output
tool_output_token_limit = 8000

# Trigger compaction before context fills
model_auto_compact_token_limit = 80000

# Reduce model verbosity (valid values: low, medium, high)
model_verbosity = "low"

Since output tokens cost 6× as much as input tokens on the GPT models, even modest reductions in output length produce significant savings.
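A rough way to size the saving, assuming GPT-5.5 rates from the rate card; the trimmed-token and run counts below are illustrative, not measurements.

```python
# GPT-5.5 rates from the rate card, in credits per 1M tokens.
OUTPUT_RATE = 750.0
INPUT_RATE = 125.0


def credits_saved(output_tokens_trimmed: int, runs: int) -> float:
    """Credits saved by trimming this many output tokens on every run."""
    return output_tokens_trimmed * OUTPUT_RATE / 1_000_000 * runs


def equivalent_input_tokens(output_tokens: int) -> float:
    """Fresh input tokens that cost the same as the given output tokens."""
    return output_tokens * OUTPUT_RATE / INPUT_RATE


print(credits_saved(1_000, runs=2_000))  # 1500.0
print(equivalent_input_tokens(1_000))    # 6000.0
```

Trimming 1,000 output tokens from each of 2,000 monthly runs would save 1,500 credits under these assumptions.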

4. Maximise Cache Hits

Cached input tokens cost 90% less than fresh input ⁴. To maximise cache hit rates:

Maintain long sessions within a single repository rather than spawning many short sessions
Structure AGENTS.md with stable preamble sections that rarely change
Pin MCP server configurations in project config so the tool schema preamble is consistent across runs
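The effect of these habits shows up as a blended input rate. The sketch below weights the fresh and cached GPT-5.5 rates by a cache hit rate; the hit rates are hypothetical, since real values depend on session length and prompt stability.

```python
def blended_input_rate(fresh_rate: float, cached_rate: float, cache_hit_rate: float) -> float:
    """Effective credits per 1M input tokens at a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0.0 and 1.0")
    return fresh_rate * (1.0 - cache_hit_rate) + cached_rate * cache_hit_rate


# GPT-5.5 rates from the rate card: 125 fresh, 12.5 cached.
for hit_rate in (0.0, 0.5, 0.8):
    rate = blended_input_rate(125.0, 12.5, hit_rate)
    print(f"hit rate {hit_rate:.0%}: {rate:.2f} credits per 1M input tokens")
```

At an 80% hit rate the blended rate is about 35 credits per 1M tokens, roughly 28% of the fresh rate.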

graph TD
    A[Fresh Input Token] -->|125 credits/1M| B[GPT-5.5]
    C[Cached Input Token] -->|12.5 credits/1M| B
    D[Output Token] -->|750 credits/1M| E[Response]

    style A fill:#f8d7da,stroke:#721c24
    style C fill:#d4edda,stroke:#155724
    style D fill:#f8d7da,stroke:#721c24

Enterprise Admin Controls

For Business and Enterprise workspaces, admins should configure credit governance before 6 July ⁹:

Monthly Credit Limits

Enterprise admins can set monthly credit limits per workspace through the admin dashboard, introduced in Codex CLI v0.137.0 ¹⁰. This prevents unexpected cost overruns during the transition period.
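A starting value for the limit can be derived from expected run volume plus headroom. This is a sketch with hypothetical inputs: the run count and the 20% headroom are placeholders, and the per-run cost is the GPT-5.5 example from the rate card section.

```python
def suggested_monthly_limit(runs_per_month: int, credits_per_run: float, headroom: float = 0.2) -> int:
    """Expected monthly credits plus a fractional safety margin, rounded to whole credits."""
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    return round(runs_per_month * credits_per_run * (1.0 + headroom))


# 2,000 runs a month at 7.25 credits each, with 20% headroom.
print(suggested_monthly_limit(2_000, 7.25))  # 17400
```

Replace the placeholder inputs with figures from the analytics export once a baseline exists.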

Managed Configuration Bundles

Deploy requirements.toml policies through the Codex Policies page to enforce cost-conscious defaults across your organisation ⁹:

# requirements.toml — deployed via admin panel
[policy]
default_model = "gpt-5.4"
max_model = "gpt-5.5"
service_tier = "flex"
approval_policy = "unless-allow-listed"

[policy.workspace_agents]
enabled = true
max_concurrent = 3

Analytics Dashboard

The self-serve analytics dashboard provides usage breakdowns by surface (CLI, IDE, cloud, desktop), model, and user ⁹. Export data as CSV or JSON for integration with internal cost-tracking systems. Note that dashboard data can lag by up to 12 hours ¹¹.

The 22-Day Preparation Checklist

Use the remaining free period to establish your cost baseline:

Week 1 (14–20 June): Measure

Export current usage from the analytics dashboard — note agent run counts, model distribution, and token volumes per developer
Identify your top-5 workspace agent workflows by credit consumption using the rate card above
Run codex doctor across your team’s environments to verify configuration consistency ¹⁰
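The ranking step can be scripted once usage is exported. The sketch below assumes a CSV with columns `workflow`, `model`, `input_tokens`, `cached_input_tokens`, and `output_tokens`; these column names are hypothetical, and the real export schema may differ, so adjust the keys accordingly.

```python
import csv
import io
from collections import defaultdict

# Rate card from this article, in credits per 1M tokens.
RATES = {
    "GPT-5.5": {"input": 125.0, "cached": 12.5, "output": 750.0},
    "GPT-5.4": {"input": 62.5, "cached": 6.25, "output": 375.0},
    "GPT-5.4 mini": {"input": 18.75, "cached": 1.875, "output": 113.0},
    "Codex-Spark": {"input": 3.75, "cached": 0.375, "output": 15.0},
}


def top_workflows(csv_text: str, n: int = 5) -> list[tuple[str, float]]:
    """Sum estimated credits per workflow and return the n most expensive."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        rates = RATES[row["model"]]
        totals[row["workflow"]] += (
            int(row["input_tokens"]) * rates["input"]
            + int(row["cached_input_tokens"]) * rates["cached"]
            + int(row["output_tokens"]) * rates["output"]
        ) / 1_000_000
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up sample rows standing in for a real export.
SAMPLE_EXPORT = """\
workflow,model,input_tokens,cached_input_tokens,output_tokens
pr-review,GPT-5.5,20000,80000,5000
nightly-lint,GPT-5.4 mini,40000,10000,8000
pr-review,GPT-5.5,20000,80000,5000
"""

for workflow, credits in top_workflows(SAMPLE_EXPORT):
    print(f"{workflow}: {credits:.2f} credits")
```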

Week 2 (21–27 June): Optimise

Create named profiles for cost-tiered workflows (default, bulk, review) in your project .codex/config.toml
Set service_tier = "flex" on all Workspace Agent configurations — they run asynchronously and do not need fast-tier latency
Review AGENTS.md files for oversized content that inflates fresh input token counts on every run
Configure tool_output_token_limit and model_auto_compact_token_limit to cap per-run token spend

Week 3 (28 June – 4 July): Govern

Set monthly credit limits in the admin dashboard to establish guardrails
Deploy requirements.toml with organisation-wide model and tier defaults
Establish a credit burn alert — export usage weekly and compare against your budget projection
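A burn alert can be as simple as a linear projection from month-to-date usage. The sketch below assumes usage is spread roughly evenly across the month, which may not hold for teams with bursty release cycles; all figures are hypothetical.

```python
def projected_month_end(credits_used: float, day_of_month: int, days_in_month: int) -> float:
    """Linear projection of month-end credit use from month-to-date use."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must fall within the month")
    return credits_used / day_of_month * days_in_month


def over_budget(credits_used: float, day_of_month: int, days_in_month: int, monthly_budget: float) -> bool:
    """True when the projection exceeds the monthly budget."""
    return projected_month_end(credits_used, day_of_month, days_in_month) > monthly_budget


# 6,000 credits used by day 10 of a 31-day month, against a 17,400-credit budget.
print(projected_month_end(6_000, 10, 31))   # 18600.0
print(over_budget(6_000, 10, 31, 17_400))   # True
```

Because the dashboard export can lag, treat the projection as a trend indicator rather than an exact balance.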

Final Check (5 July)

Validate that all Workspace Agent configurations inherit the optimised project config
Brief your team on the per-run credit cost expectations so nobody is surprised on Monday

API Key as a Cost Escape Valve

Teams running heavy agent workloads should compare both billing paths: the break-even point between API-key billing and subscription credit consumption is reported at roughly 10–40 sessions per month ⁴. For CI/CD pipelines and scheduled automation, API-key authentication with --ephemeral and --ignore-user-config flags avoids credit consumption entirely, billing directly against your API account at standard token rates ³.

The hybrid approach — subscription for interactive sessions, API key for automation — remains the most cost-effective pattern:

# Project config: automation profile uses API key
[profiles.ci]
model = "gpt-5.4-mini"
service_tier = "flex"
# API key set via OPENAI_API_KEY in CI environment

What to Watch After 6 July

OpenAI has signalled potential token price cuts ahead of their IPO ¹², and the competitive pressure from Anthropic’s credit-inclusive Claude Max plans and Google’s Gemini Pro subscriptions suggests that credit rates may not remain static. Build your cost monitoring infrastructure now so you can respond to rate card changes quickly.

The workspace agent pricing transition is not a crisis — it is a signal to treat agent compute as a first-class engineering cost, the same way teams learned to manage cloud compute budgets a decade ago. The teams that measure, profile, and optimise during the free period will barely notice the switch. The teams that do not will get an unpleasant invoice in August.

Citations

1. OpenAI Workspace Agents Free Ride Ends July 6. TechTimes, 10 June 2026.
2. Codex Pricing, OpenAI Developers. OpenAI, June 2026.
3. Codex CLI Authentication Paths: ChatGPT Login vs API Key. Codex Knowledge Base, 13 June 2026.
4. Codex Pricing (2026): Free vs $20 Plus vs $100 Pro, with Credit Burn Rates and Real Session Costs. Morph, June 2026.
5. OpenAI Codex Pricing Explained: Credits, Seats, Fast Mode, and What Teams Actually Pay in 2026. Nerova, 2026.
6. The June 2026 Coding Agent Billing Reset. Codex Knowledge Base, 1 June 2026.
7. Dynamic Model Routing in Codex CLI: Mid-Session Switching, /fast Mode, and Service Tier Workflows. Codex Knowledge Base, 12 April 2026.
8. Codex CLI Configuration Complete Guide: Hierarchy, Profiles, and Trust. Codex Knowledge Base, 16 April 2026.
9. Admin Setup, Codex Enterprise. OpenAI Developers, June 2026.
10. Codex Updates, June 2026. Releasebot, June 2026.
11. Codex Enterprise Analytics and Compliance APIs. Codex Knowledge Base, 11 May 2026.
12. AI Token Price War: OpenAI Pre-IPO Cuts. Codex Knowledge Base, 12 June 2026.