The Carbon Footprint of Coding Agents: What 250,000 Tonnes of CO₂ Means for Codex CLI Token Strategy

AI coding agents are no longer a novelty — they are infrastructure. With adoption doubling in new GitHub projects over six months ¹, the aggregate environmental cost has become measurable and, frankly, uncomfortable. A March 2026 analysis estimates that AI coding agents now emit approximately 250,000 tonnes of CO₂ per year ², a figure that grew 5.4× in the six months from September 2025 ². That places coding agents in the same carbon bracket as a small airline.

This article breaks down where those emissions come from, examines the emerging measurement standards, and maps the concrete Codex CLI configuration levers that reduce token waste — and therefore energy consumption — without sacrificing output quality.

Where the Carbon Comes From

The emissions chain for a single AI-assisted code commit follows four stages:

flowchart LR
    A["Tokens generated<br/>~766K output tokens/commit"] --> B["GPU inference time<br/>~3.27 GPU hours/commit"]
    B --> C["Energy consumed<br/>~2.14 kWh/commit"]
    C --> D["Carbon emitted<br/>~613g CO₂e/commit"]

Each commit through an agentic coding workflow produces roughly 613 grams of CO₂e ². The calculation chains output token count → GPU hours → kilowatt-hours (including data centre Power Usage Effectiveness) → carbon intensity of the regional electricity grid ³.
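
The chain can be sketched in a few lines of Python. This is a minimal illustration, not the cited methodology: the per-GPU-hour power draw (about 0.65 kW, PUE included) and the grid intensity (about 286 g CO₂e/kWh) are back-calculated here from the per-commit figures quoted above, and should be read as assumptions.

```python
# Per-commit figures quoted in the article.
GPU_HOURS_PER_COMMIT = 3.27
KWH_PER_COMMIT = 2.14
GRAMS_CO2E_PER_COMMIT = 613.0

# Back-calculated, illustrative factors (assumptions, not published values).
KW_PER_GPU_HOUR = KWH_PER_COMMIT / GPU_HOURS_PER_COMMIT   # includes data centre PUE
GRID_G_PER_KWH = GRAMS_CO2E_PER_COMMIT / KWH_PER_COMMIT   # regional grid intensity


def commit_emissions_g(gpu_hours,
                       kw_per_gpu_hour=KW_PER_GPU_HOUR,
                       grid_g_per_kwh=GRID_G_PER_KWH):
    """Chain GPU hours -> kWh -> grams of CO2e for one commit."""
    energy_kwh = gpu_hours * kw_per_gpu_hour
    return energy_kwh * grid_g_per_kwh


if __name__ == "__main__":
    print(f"{commit_emissions_g(GPU_HOURS_PER_COMMIT):.0f} g CO2e per commit")
    # Halving GPU time halves emissions under this linear model.
    print(f"{commit_emissions_g(GPU_HOURS_PER_COMMIT / 2):.1f} g CO2e at half the GPU time")
```

Because every stage is a multiplication, any reduction in tokens (and therefore GPU time) carries through to emissions in the same proportion under this model.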

The numbers diverge dramatically depending on model choice. Research from early 2026 found that GPT-4-class models emit between 5× and 19× more CO₂e than a human programmer completing the same task ⁴. Smaller models can approach human-equivalent emissions when they succeed, but their higher failure rate often forces retry loops that erode the saving ⁴.

Agentic workloads are not typical queries

A median Claude Code session consumes approximately 41 Wh — 138× more than a typical single LLM query of ~0.3 Wh ⁵. The gap exists because coding agents operate with system prompts and tool descriptions consuming ~20,000 tokens before the first user message, make 5–10 tool calls per user message, and chain dozens of requests per session ⁵. The Green Software Foundation notes that agentic tasks can consume up to 1,000× more tokens than equivalent code-reasoning interactions ⁶.
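
A rough sketch of why the fixed prefix matters: if each request re-sends the ~20,000-token prefix, the overhead scales with the number of requests. The session shape used below (10 user messages) is a hypothetical assumption, and the model ignores prompt caching.

```python
PREFIX_TOKENS = 20_000      # system prompt + tool descriptions (article figure)
SESSION_WH = 41.0           # median agent session (article figure)
SINGLE_QUERY_WH = 0.3       # typical single LLM query (article figure)


def prefix_tokens_per_session(user_messages, tool_calls_per_message,
                              prefix_tokens=PREFIX_TOKENS):
    """Input tokens spent re-sending the fixed prefix, one request per tool call."""
    requests = user_messages * tool_calls_per_message
    return requests * prefix_tokens


if __name__ == "__main__":
    # Hypothetical session: 10 user messages, 5 to 10 tool calls each.
    low = prefix_tokens_per_session(10, 5)
    high = prefix_tokens_per_session(10, 10)
    print(f"prefix overhead: {low:,} to {high:,} input tokens")
    print(f"energy ratio from rounded inputs: {SESSION_WH / SINGLE_QUERY_WH:.0f}x")
```

With the rounded inputs the energy ratio comes out at about 137×, close to the quoted 138×.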

Token consumption also exhibits alarming variance: runs on identical tasks can differ by up to 30×, yet higher consumption does not correlate with improved accuracy ⁶. This is the clearest signal that waste — not capability — drives a significant share of emissions.

Measuring What Matters: SCI for AI

The Green Software Foundation ratified the Software Carbon Intensity for AI (SCI for AI) specification in December 2025, extending the ISO/IEC 21031:2024 standard to AI workloads ⁷. The methodology converts energy to carbon using location-specific grid intensity factors, includes embodied hardware emissions, and defines standardised functional units for meaningful comparison ⁷.

For coding agents, the natural functional unit is per commit or per resolved task. The SCI formula decomposes into:

SCI = ((E × I) + M) / R

Where E is energy consumed, I is the location-based carbon intensity, M is embodied emissions, and R is the functional unit ⁸.
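
As a worked example, the formula can be evaluated per commit. The energy figure is the article's per-commit estimate; the grid intensity (286 g/kWh) and the embodied share (50 g) are hypothetical placeholders chosen only to show the arithmetic.

```python
def sci(energy_kwh, intensity_g_per_kwh, embodied_g, functional_units):
    """SCI = ((E * I) + M) / R, returned in grams of CO2e per functional unit."""
    if functional_units <= 0:
        raise ValueError("functional_units (R) must be positive")
    return (energy_kwh * intensity_g_per_kwh + embodied_g) / functional_units


if __name__ == "__main__":
    # One commit as the functional unit (R = 1).
    per_commit = sci(energy_kwh=2.14, intensity_g_per_kwh=286.0,
                     embodied_g=50.0, functional_units=1)
    print(f"{per_commit:.2f} g CO2e per commit")
```

Choosing a different functional unit, such as per resolved task, changes only R; the numerator stays the same.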

Critically, measurement tools remain immature. The Carbonlog methodology, which estimates energy from first principles using real-time latency and tokens-per-second benchmarks, found that comparable tools produce estimates differing by up to 19× for identical queries ³. Batch size assumptions alone can shift estimates by 80% ³. Treat any per-token carbon figure as an order-of-magnitude indicator, not a precise measurement.

Regulatory pressure is arriving

The amended reporting standards under the Corporate Sustainability Reporting Directive (CSRD) are expected to be formally adopted by the European Commission by mid-to-late 2026 and to apply from financial year 2027, with early adoption permitted from FY2026 ⁹. Organisations running agentic coding workflows at scale will increasingly need to account for the resulting emissions under Scope 3, Category 1 (purchased goods and services) ³.

Codex CLI Levers for Carbon Reduction

Codex CLI’s configuration primitives map directly to the efficiency strategies the Green Software Foundation recommends: routing to appropriate models, minimising context, bounding retries, and caching aggressively ⁶. None of these require sacrificing developer experience.

1. Named profiles for model-appropriate routing

The single highest-impact lever is routing routine tasks to smaller, cheaper, faster models. A frontier model burning through 766,000 output tokens on a simple file rename is environmental waste.

# ~/.codex/config.toml

# Smaller model for routine work
[profiles.lightweight]
model = "o4-mini"
model_reasoning_effort = "medium"

# Frontier model for architectural decisions
[profiles.deep]
model = "o3"
model_reasoning_effort = "high"

Running codex --profile lightweight for routine file operations, linting fixes, and boilerplate generation reserves the frontier model for architectural decisions ¹⁰. For tasks within its capability envelope, o4-mini consumes a fraction of the tokens and GPU time that o3 does ¹⁰.
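
Routing can also be scripted. The sketch below is illustrative only: the task categories are hypothetical, the profile names are the two used in this article, and the command is built but not executed.

```python
# Hypothetical task categories considered safe for the smaller model.
ROUTINE_TASKS = {"rename", "lint", "boilerplate"}


def pick_profile(task_kind):
    """Return the article's 'lightweight' profile for routine work, else 'deep'."""
    return "lightweight" if task_kind in ROUTINE_TASKS else "deep"


def build_command(task_kind, prompt):
    """Build an argv list for the codex CLI using the chosen profile."""
    return ["codex", "--profile", pick_profile(task_kind), prompt]


if __name__ == "__main__":
    print(build_command("lint", "Fix all lint warnings in src/"))
    print(build_command("architecture", "Propose a module boundary for billing"))
```

Defaulting unknown task kinds to the larger model is a deliberate choice here: a failed attempt on the small model that forces a retry can cost more than routing conservatively.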

2. Context compaction thresholds

A June 2026 context engineering study from Microsoft found that carrying the full conversation history is actively harmful: recency pruning plus summarisation reached 91.6% task completion with 63.9% fewer tokens, whereas full history reached only 71% ¹¹. To a first approximation, fewer tokens means proportionally less GPU time and less energy.

# Trigger compaction before the context window fills
model_auto_compact_token_limit = 80000

# Limit tool output to prevent context bloat
tool_output_token_limit = 4096
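
The completion and token figures from the study can be combined into a single number: tokens spent per completed task. The sketch below assumes, simplistically, that a failed run costs as much as a successful one and is simply repeated.

```python
def tokens_per_completed_task(relative_tokens, completion_rate):
    """Expected relative token spend per successfully completed task."""
    if not 0 < completion_rate <= 1:
        raise ValueError("completion_rate must be in (0, 1]")
    return relative_tokens / completion_rate


# Full history is the baseline: 1.0 relative tokens at 71% completion.
full_history = tokens_per_completed_task(1.0, 0.71)
# Pruning plus summarisation: 63.9% fewer tokens at 91.6% completion.
pruned = tokens_per_completed_task(1.0 - 0.639, 0.916)

saving = 1.0 - pruned / full_history

if __name__ == "__main__":
    print(f"tokens saved per completed task: {saving:.0%}")
```

Under that assumption the saving per completed task is about 72%, larger than the headline 63.9%, because the pruned configuration also fails less often.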

3. Prompt caching

Cached input tokens are billed at roughly 10% of the uncached rate ¹⁰, reflecting the dramatically lower compute required. For repeated operations — CI pipeline runs, batch refactors, multi-file migrations — the energy saving compounds:

flowchart TD
    A["First request<br/>Full prompt processed"] --> B["Prompt prefix cached"]
    B --> C["Subsequent requests<br/>Cached input tokens at ~10% of uncached rate"]
    C --> D["Saving applies to cached input only<br/>Output tokens still cost full compute"]
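
To see how the cache hit rate affects input-side cost, the billing ratio can be used as a rough proxy. This is a sketch under that assumption: price is not a measured energy figure, and the saving applies to input tokens only.

```python
CACHED_RATE = 0.10  # cached input billed at roughly 10% of the uncached rate


def effective_input_cost(cache_hit_fraction, cached_rate=CACHED_RATE):
    """Relative cost of input tokens, where 1.0 means nothing was cached."""
    if not 0.0 <= cache_hit_fraction <= 1.0:
        raise ValueError("cache_hit_fraction must be between 0 and 1")
    return cache_hit_fraction * cached_rate + (1.0 - cache_hit_fraction)


if __name__ == "__main__":
    for hit in (0.0, 0.5, 0.8, 1.0):
        print(f"hit rate {hit:.0%}: input cost {effective_input_cost(hit):.2f}x")
```

Keeping the system prompt and tool descriptions stable between requests is what makes a high hit rate possible, since only an unchanged prefix can be reused.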

4. Bounded retry budgets via hooks

Unbounded tool-call loops are a primary source of token waste. PostToolUse hooks can enforce exit conditions that prevent runaway retries:

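#!/bin/bash
# .codex/hooks/post-tool-use.sh
# Stop once the test suite has been run more than 3 times in this session.
# CODEX_TOOL_NAME and CODEX_SESSION_LOG are assumed to be provided by the
# hook environment; adjust them to whatever your Codex CLI version exposes.
if [ "$CODEX_TOOL_NAME" = "shell" ]; then
  # grep -c prints 0 itself when nothing matches (and exits non-zero), so only
  # default to 0 when grep prints nothing, e.g. because the log is missing.
  count=$(grep -cE 'pytest|npm test|go test' "$CODEX_SESSION_LOG" 2>/dev/null)
  count=${count:-0}
  if [ "$count" -gt 3 ]; then
    echo "STOP: Test retry limit reached; escalate to human review"
    exit 1
  fi
fi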
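
The same budget logic can be expressed and unit-tested outside the hook. The sketch below is a Python restatement of the counting rule only; how log lines reach it is left open, and the function names are hypothetical.

```python
import re

# Commands treated as "a test run", mirroring the shell hook's pattern.
TEST_COMMAND = re.compile(r"\b(pytest|npm test|go test)\b")


def count_test_runs(log_lines):
    """Count log lines that contain a test command."""
    return sum(1 for line in log_lines if TEST_COMMAND.search(line))


def should_stop(log_lines, limit=3):
    """True once the number of test runs exceeds the retry budget."""
    return count_test_runs(log_lines) > limit


if __name__ == "__main__":
    session = ["ls src/", "pytest -q", "pytest -q", "go test ./...", "npm test"]
    print(count_test_runs(session), should_stop(session))
```

Like the shell version, this counts matching lines rather than distinct failures, so it is a coarse budget rather than a diagnosis of why the agent is looping.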

5. `codex exec` for batch operations

The non-interactive codex exec mode skips the TUI entirely, eliminating overhead tokens from interactive session management ¹⁰. For CI/CD pipelines and scripted workflows, this is the most token-efficient execution path:

codex exec --model o4-mini --quiet "Fix all lint warnings in src/"

6. `/usage` for visibility

The /usage command, added in v0.140.0, provides daily, weekly, and cumulative token activity views ¹². You cannot optimise what you cannot measure. Establishing a baseline of per-task token consumption is the prerequisite for any meaningful efficiency strategy.
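
Once per-task token counts have been collected, a baseline can be summarised with a median and a spread. The sample values below are hypothetical, and the sketch does not parse /usage output, whose format is not described here.

```python
from statistics import median


def summarise(tokens_per_task):
    """Return the median and the max/min spread of per-task token counts."""
    if not tokens_per_task or min(tokens_per_task) <= 0:
        raise ValueError("need at least one positive token count")
    return {
        "median": median(tokens_per_task),
        "spread": max(tokens_per_task) / min(tokens_per_task),
    }


if __name__ == "__main__":
    # Hypothetical per-task totals recorded by hand over one week.
    samples = [100_000, 200_000, 3_000_000]
    print(summarise(samples))
```

A large spread on similar tasks is the signal to look for, since the variance described earlier is where most of the avoidable waste sits.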

A Practical Carbon Budget

Combining the available data, we can sketch a rough per-developer carbon budget for agentic coding:

Scenario	Commits/day	Output tokens/commit	Est. CO₂e/day	Est. CO₂e/year (260 working days)
Heavy agentic (frontier model, no optimisation)	8	766,000	4.9 kg	1,274 kg
Moderate agentic (mixed models, compaction)	8	250,000	1.6 kg	416 kg
Optimised (small model routing, caching, hooks)	8	100,000	0.6 kg	156 kg

The optimised scenario represents an 88% reduction from the unoptimised baseline — achievable entirely through Codex CLI configuration, not behaviour change.
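
The budget can be reproduced with a short calculation. This sketch assumes that emissions scale linearly with output tokens from the 613 g, 766,000-token baseline, and that a year has 260 working days, which is the multiplier implied by the table's yearly column.

```python
GRAMS_PER_COMMIT = 613.0     # article figure for a 766,000-token commit
BASELINE_TOKENS = 766_000
WORKING_DAYS = 260           # assumption implied by the yearly column


def daily_kg(commits_per_day, tokens_per_commit):
    """Daily CO2e in kg, scaling linearly with output tokens."""
    grams_per_token = GRAMS_PER_COMMIT / BASELINE_TOKENS
    return commits_per_day * tokens_per_commit * grams_per_token / 1000.0


def yearly_kg(commits_per_day, tokens_per_commit, working_days=WORKING_DAYS):
    """Yearly CO2e in kg over the assumed number of working days."""
    return daily_kg(commits_per_day, tokens_per_commit) * working_days


if __name__ == "__main__":
    scenarios = [("heavy", 766_000), ("moderate", 250_000), ("optimised", 100_000)]
    for name, tokens in scenarios:
        print(f"{name}: {daily_kg(8, tokens):.2f} kg/day, "
              f"{yearly_kg(8, tokens):.0f} kg/year")
```

Unrounded values differ slightly from the table, which rounds the daily figure before annualising: the optimised scenario comes out near 0.64 kg per day and 166 kg per year, a reduction of about 87%.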

The Uncomfortable Arithmetic

The 250,000-tonne annual figure is a snapshot of a curve that was growing at 25× per year as of March 2026 ². If coding agent adoption continues its doubling trajectory ¹, and per-agent efficiency remains static, we are looking at millions of tonnes within two years.
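
The projection can be made explicit. The sketch below assumes that emissions track adoption and that adoption keeps doubling every six months with no efficiency gains; both are simplifying assumptions rather than forecasts.

```python
def project_tonnes(current_tonnes, months_ahead, doubling_months=6.0):
    """Project annual emissions under a constant doubling period."""
    return current_tonnes * 2 ** (months_ahead / doubling_months)


if __name__ == "__main__":
    for months in (6, 12, 24):
        print(f"+{months} months: {project_tonnes(250_000, months):,.0f} tonnes/year")
```

Four doublings in two years take 250,000 tonnes to 4 million tonnes per year, which is the "millions of tonnes" figure in the paragraph above.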

The counterargument — that agents increase developer productivity, reducing the need for additional developers and their associated commuting, office energy, and hardware emissions — has merit but remains unquantified. No peer-reviewed study has yet performed a full lifecycle comparison.

What is quantified is the token-to-carbon chain, and the levers to compress it. Codex CLI already ships with the configuration primitives needed to cut per-task emissions by close to an order of magnitude. The question is whether teams choose to use them.

Citations

1. Robbes, R. et al. (2026). “Agentic Very Much: Coding Agent Adoption Has Doubled in New GitHub Projects.” arXiv:2606.07448
2. CNaught (2026). “AI Coding Agents Are Emitting 250,000 Tonnes of Carbon Emissions Each Year.” https://www.cnaught.com/blog/ai-coding-agents-are-emitting-250-000-tonnes-of-carbon-emissions-each-year-heres-how-we-got-there
3. CNaught (2026). “How to Actually Measure the Carbon Footprint of AI Code.” https://www.cnaught.com/blog/how-to-actually-measure-the-carbon-footprint-of-ai-code
4. Castano, J. et al. (2025). “A Comparative Study of AI and Human Programming on Environmental Sustainability.” Scientific Reports. https://www.nature.com/articles/s41598-025-24658-5
5. Couch, S.P. (2026). “Electricity Use of AI Coding Agents.” https://simonpcouch.com/blog/2026-01-20-cc-impact/
6. Green Software Foundation (2026). “Tokens and Greens: Measuring the Impacts of Agentic AI.” https://greensoftware.foundation/articles/tokens-and-greens-measuring-the-impacts-of-agentic-ai/
7. Green Software Foundation (2025). “SCI for AI — Software Carbon Intensity for Artificial Intelligence.” https://greensoftware.foundation/standards/sci-ai/
8. Green Software Foundation. “SCI — Software Carbon Intensity.” ISO/IEC 21031:2024. https://greensoftware.foundation/standards/sci/
9. Green Software Foundation (2026). “Software Carbon Intensity for AI and EU AI Act Environmental Compliance.” https://greensoftware.foundation/policy/research/sci-ai-eu-ai-act/
10. Codex Knowledge Base (2026). “Codex CLI After the Pro Boost: Rate Limit Reality, Token Economics, and Cost Optimisation for June 2026.” https://codex.danielvaughan.com/2026/06/02/codex-cli-post-promotion-rate-limits-token-economics-cost-optimisation-june-2026/
11. Lodha, A. et al. (2026). “Efficient Context Engineering for Agentic Software Development.” arXiv:2606.10209
12. OpenAI (2026). “Codex CLI Changelog — v0.140.0.” https://developers.openai.com/codex/changelog
