The ExecPlan Pattern: Structuring 7-Hour Codex Sessions with PLANS.md
The single biggest determinant of whether a Codex session completes a complex task or drifts into incoherence is not the model — it is the planning scaffold. OpenAI’s official ExecPlan pattern, documented in the OpenAI Cookbook by Aaron Friel1, has enabled sessions lasting seven hours from a single prompt2. A separate 4-file variant pushed that to 25 hours and ~30,000 lines of generated code3. This article covers both patterns, when to use each, and how to integrate them into agentic pod workflows.
Why Planning Scaffolds Matter
Without external planning documents, a coding agent relies entirely on its context window to maintain coherence. As sessions extend beyond 30 minutes, three failure modes emerge:
- Context drift — the agent forgets constraints stated early in the conversation
- Oscillation — the agent reverses earlier decisions because the rationale has scrolled out of context
- Scope creep — without an explicit boundary, the agent takes on adjacent work
ExecPlans solve all three by externalising the plan to disk. The agent reads the plan file before each major step, treating it as ground truth rather than relying on chat history1. Alex Embiricos, product lead for Codex at OpenAI, described this as the core insight: “don’t jump into coding — make Codex plan first”4.
The Single-File ExecPlan Pattern
Setup: AGENTS.md Integration
Add the following to your project’s AGENTS.md:
# ExecPlans
When writing complex features or significant refactors, use an ExecPlan
(as described in .agent/PLANS.md) from design to implementation.
This gives the model a shorthand term — ExecPlan — that triggers structured planning behaviour1. You can also install Friel’s dedicated Codex skill by cloning the .codex/skills/execplans directory into your project5.
The 12 Required Sections
Every ExecPlan must include these sections1:
| # | Section | Purpose |
|---|---|---|
| 1 | Purpose / Big Picture | User-visible behaviour gained; how to observe it working |
| 2 | Progress | Timestamped checklist reflecting actual current state |
| 3 | Surprises & Discoveries | Unexpected behaviours with evidence |
| 4 | Decision Log | All decisions with rationale and date/author |
| 5 | Outcomes & Retrospective | Summary of achievements, gaps, lessons |
| 6 | Context and Orientation | Full-path file names; defined non-obvious terms |
| 7 | Plan of Work | Prose sequence of edits with concrete file locations |
| 8 | Concrete Steps | Exact commands, working directories, expected output |
| 9 | Validation and Acceptance | Observable behaviour verification with specific inputs/outputs |
| 10 | Idempotence and Recovery | Safe retry/rollback paths |
| 11 | Artifacts and Notes | Key transcripts and diffs as indented examples |
| 12 | Interfaces and Dependencies | Prescriptive library/module specs with signatures |
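A plan that silently drops one of the twelve sections is easy to miss in review. The sketch below checks a plan file for each section title from the table above. It assumes every section is written as a Markdown heading of level one to three with no trailing whitespace; the name `missing_sections` is illustrative and is not part of the Cookbook or the Codex skill.

```shell
#!/usr/bin/env bash
# Sketch: report which ExecPlan sections are absent from a plan file.
# Assumption: each section appears as a heading line ("# Title" to "### Title").

REQUIRED_SECTIONS=(
  "Purpose / Big Picture"
  "Progress"
  "Surprises & Discoveries"
  "Decision Log"
  "Outcomes & Retrospective"
  "Context and Orientation"
  "Plan of Work"
  "Concrete Steps"
  "Validation and Acceptance"
  "Idempotence and Recovery"
  "Artifacts and Notes"
  "Interfaces and Dependencies"
)

# missing_sections FILE
# Prints one missing section title per line; returns 1 if any are missing.
missing_sections() {
  local plan_file="$1" section status=0
  for section in "${REQUIRED_SECTIONS[@]}"; do
    # -x matches whole lines, -F treats the title literally, -q stays silent.
    if ! grep -qxF -e "# $section" -e "## $section" -e "### $section" "$plan_file"; then
      echo "$section"
      status=1
    fi
  done
  return "$status"
}
```

Running `missing_sections .agent/PLANS.md` before a long session gives a quick completeness check; it says nothing about whether the content of each section is adequate.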
Non-Negotiable Rules
The Cookbook specifies several hard constraints1:
- Fully self-contained — a stateless agent or human novice must be able to read the plan top-to-bottom and produce a working result, with no external references
- Living document — updated continuously as work proceeds; every revision maintains self-containment
- Observable outcomes — acceptance criteria frame success as “the server returns HTTP 200 with body X”, not “the function is implemented correctly”
- Idempotent steps — every operation must be safely repeatable after a crash or interruption
- Prose-first — reserve checklists for the Progress section; all other sections use narrative prose
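The idempotence rule is easier to apply with a concrete shape in mind. The following is a minimal sketch of one idempotent step, assuming the step in question is "make sure a configuration line exists"; the helper name `ensure_line` and the example line are hypothetical and only illustrate the check-before-write idea.

```shell
#!/usr/bin/env bash
# Sketch of an idempotent step: running it twice leaves the same result
# as running it once, so it is safe to repeat after an interruption.

# ensure_line FILE LINE
# Appends LINE to FILE only if an identical line is not already present.
ensure_line() {
  local file="$1" line="$2"
  touch "$file"   # no-op for content if the file already exists
  # -x whole-line match, -F literal match, -- ends option parsing.
  if ! grep -qxF -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}
```

A plain `echo "$line" >> "$file"` would duplicate the line on every retry, which is the kind of step the rule is meant to exclude.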
Formatting Constraints
ExecPlans follow strict formatting rules to avoid confusing the model1:
- Single fenced markdown code block (labelled md) when embedded in a prompt
- Two newlines after every heading
- No nested triple-backtick fences — use indentation for code examples instead
- Include terminal transcripts and diffs as indented blocks
Invoking the Pattern
Once configured, invoke ExecPlans with a prompt like:
Use the execplans skill. Create an ExecPlan for this migration,
keep it updated while you work, and implement the plan end to end.
The agent will create (or refresh) the plan file, work through each section, and update Progress timestamps as milestones complete5.
The 4-File Durable Memory Pattern
For sessions exceeding eight hours or involving extreme complexity, OpenAI’s engineering team demonstrated a 4-file variant during a 25-hour autonomous session that produced a complete design tool from a blank repository3.
The Four Files
graph TD
A["Prompt.md<br/>Specification Anchor"] --> B["Plan.md<br/>Milestone Sequencing"]
B --> C["Implement.md<br/>Execution Runbook"]
C --> D["Documentation.md<br/>Status & Audit Log"]
D -->|"Agent revisits"| A
style A fill:#e1f5fe
style B fill:#f3e5f5
style C fill:#fff3e0
style D fill:#e8f5e9
| File | Role | Key Contents |
|---|---|---|
| Prompt.md | Immutable specification anchor | Goals, non-goals, hard constraints, “done when” criteria3 |
| Plan.md | Milestone sequencing | Discrete milestones with acceptance criteria and validation commands3 |
| Implement.md | Execution runbook | Operational discipline: milestone-by-milestone adherence, scoped diffs, mandatory validation3 |
| Documentation.md | Status and audit log | Current milestone status, decision rationale, known issues, smoke-test commands3 |
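The four files can be put in place before the session starts. This is a small sketch of a scaffold step, assuming the files live together in one directory and that placeholder headings are an acceptable starting point; the function name `scaffold_memory_files` is illustrative, and the real contents still have to be written by you or the agent.

```shell
#!/usr/bin/env bash
# Sketch: create the four durable-memory files without overwriting existing ones.

# scaffold_memory_files [DIR]
# Prints "created <path>" for each file it had to create.
scaffold_memory_files() {
  local dir="${1:-.}" name
  mkdir -p "$dir"
  for name in Prompt.md Plan.md Implement.md Documentation.md; do
    if [[ ! -e "$dir/$name" ]]; then
      # Placeholder heading only; existing files are left untouched.
      printf '# %s\n\n' "${name%.md}" > "$dir/$name"
      echo "created $dir/$name"
    fi
  done
}
```

Because existing files are skipped, the step can be rerun safely, which matters for Prompt.md in particular since it is meant to stay fixed once written.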
Session Metrics
The 25-hour session achieved3:
- ~13 million tokens consumed
- ~30,000 lines of code generated
- Model: gpt-5.3-codex at “Extra High” reasoning
- Deliverables: Canvas editing, live collaboration, prototype mode, threaded comments, multi-format export
The critical enabler was the verification loop — after every milestone, the agent ran lint, typecheck, test suites, and build validation before proceeding3.
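The verification loop amounts to a gate that runs checks in order and refuses to continue on the first failure. Below is a minimal sketch of such a gate, assuming each check can be expressed as a shell command string; the function name `verify_milestone` and the npm commands in the comment are hypothetical stand-ins for whatever lint, typecheck, test, and build commands a project actually uses.

```shell
#!/usr/bin/env bash
# Sketch: a fail-fast verification gate to run after each milestone.

# verify_milestone COMMAND...
# Runs each command string in order and stops at the first failure.
verify_milestone() {
  local cmd
  for cmd in "$@"; do
    echo "verify: $cmd"
    if ! bash -c "$cmd"; then
      echo "verify: FAILED at '$cmd'" >&2
      return 1
    fi
  done
  echo "verify: all checks passed"
}

# Hypothetical usage (commands depend on the project):
#   verify_milestone "npm run lint" "npm run typecheck" "npm test" "npm run build"
```

Listing the same commands in Plan.md as the milestone's validation commands keeps the gate and the plan in agreement.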
Choosing Between the Two Patterns
flowchart TD
Start["Complex task identified"] --> Q1{"Estimated duration?"}
Q1 -->|"< 8 hours"| Single["Single-file ExecPlan<br/>.agent/PLANS.md"]
Q1 -->|"> 8 hours"| Four["4-File Durable Memory<br/>Prompt.md + Plan.md +<br/>Implement.md + Documentation.md"]
Q1 -->|"Unknown"| Q2{"File count > 50?"}
Q2 -->|"Yes"| Four
Q2 -->|"No"| Single
Single --> Execute["Execute with verification loops"]
Four --> Execute
| Aspect | Single-File ExecPlan | 4-File Pattern |
|---|---|---|
| Duration | Up to ~8 hours | 8–25+ hours |
| Files affected | Moderate (< 50) | Large (50+) |
| Model requirement | gpt-5.2-codex or later | gpt-5.3-codex at high reasoning |
| Setup overhead | Minimal: one short AGENTS.md snippet | Moderate: four files to template |
| Best for | Feature development, refactors | Full-scale migrations, greenfield builds |
| Token budget | Not reported | ~13M tokens |
For most daily development work, the single-file ExecPlan is sufficient. Reserve the 4-file pattern for sessions where you expect the agent to run unattended for more than about eight hours, or to touch more than about 50 files3.
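The decision rule can be written down directly. The sketch below encodes the flowchart's thresholds, assuming the duration estimate is a whole number of hours or the literal word `unknown`; the function name `choose_pattern` is illustrative, and treating exactly eight hours as single-file is an assumption, since the flowchart leaves that boundary open.

```shell
#!/usr/bin/env bash
# Sketch: pick a planning pattern from an estimated duration and file count.

# choose_pattern HOURS FILE_COUNT
# HOURS is an integer estimate or the literal word "unknown".
choose_pattern() {
  local hours="$1" files="${2:-0}"
  if [[ "$hours" == "unknown" ]]; then
    # With no duration estimate, fall back to the number of files affected.
    if (( files > 50 )); then echo "four-file"; else echo "single-file"; fi
  elif (( hours > 8 )); then
    echo "four-file"
  else
    # Assumption: exactly 8 hours stays on the single-file side.
    echo "single-file"
  fi
}
```

For example, `choose_pattern unknown 80` prints `four-file`, while `choose_pattern 3 10` prints `single-file`.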
Integrating ExecPlans with Agentic Pods
ExecPlans become particularly powerful in multi-agent configurations.
Template ExecPlans for Subagents
Create a .agent/PLANS-TEMPLATE.md that subagents inherit when spawned:
# In your orchestrator's AGENTS.md
[subagent.defaults]
planning = ".agent/PLANS-TEMPLATE.md"
Each subagent creates its own ExecPlan from the template, giving the orchestrator a standardised format to review6.
Worktree + Plan Pairing
When running parallel agents in Git worktrees, each worktree should maintain its own ExecPlan. This prevents cross-contamination between concurrent tasks:
graph LR
O["Orchestrator Agent"] --> W1["Worktree: auth-refactor<br/>PLANS.md: Auth migration"]
O --> W2["Worktree: perf-optimise<br/>PLANS.md: Query caching"]
O --> W3["Worktree: test-coverage<br/>PLANS.md: Integration tests"]
W1 --> M["Merge results"]
W2 --> M
W3 --> M
The orchestrator can read each subagent’s PLANS.md to monitor progress without sharing context windows4,6.
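Monitoring can be as simple as counting checklist items in each worktree's plan. This is a rough sketch, assuming each worktree keeps its plan at `.agent/PLANS.md` and records Progress as Markdown checkboxes; the function name `summarise_plans` is hypothetical, and the worktree paths are passed in explicitly (for instance, taken from `git worktree list`).

```shell
#!/usr/bin/env bash
# Sketch: summarise checklist progress across several worktrees.

# summarise_plans WORKTREE_DIR...
# Prints "<dir>: <done>/<total> steps done" for each worktree's plan.
summarise_plans() {
  local dir plan done_count open_count
  for dir in "$@"; do
    plan="$dir/.agent/PLANS.md"
    if [[ ! -f "$plan" ]]; then
      echo "$dir: no plan found"
      continue
    fi
    # grep -c prints the count but exits 1 when it is zero, hence "|| true".
    done_count="$(grep -c -e '^[[:space:]]*- \[[xX]]' "$plan" || true)"
    open_count="$(grep -c -e '^[[:space:]]*- \[ ]' "$plan" || true)"
    echo "$dir: $done_count/$((done_count + open_count)) steps done"
  done
}
```

The counts only reflect what the subagent has written down, so they are a progress signal rather than a substitute for reading the Decision Log and Surprises sections.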
Hook-Based Plan Enforcement
A PreToolUse hook can verify that high-impact operations are documented in the active plan before execution:
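#!/bin/bash
# .codex/hooks/pre-tool-use.sh
# Block file deletions not documented in the active ExecPlan.
# CODEX_TOOL and CODEX_TARGET are assumed to be provided by the hook runner.
PLAN_FILE=".agent/PLANS.md"
if [[ "$CODEX_TOOL" == "rm" || "$CODEX_TOOL" == "file_delete" ]]; then
  # Only enforce when a plan exists and a target path was supplied.
  if [[ -f "$PLAN_FILE" && -n "$CODEX_TARGET" ]]; then
    # -F matches the path literally instead of treating it as a regex.
    if ! grep -qF -- "$CODEX_TARGET" "$PLAN_FILE"; then
      echo "BLOCKED: $CODEX_TARGET not documented in ExecPlan" >&2
      exit 1
    fi
  fi
fi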
This creates a lightweight governance layer — the plan becomes not just a guide but an enforceable contract7.
Practical Tips from the Community
Community adoption of ExecPlans has surfaced several hard-won lessons2,8:
- Start with Progress: if you’re converting an existing session to use ExecPlans mid-flight, write the Progress section first to anchor the agent’s understanding of current state
- Keep plans under 2,000 words: overly detailed plans consume context budget that the agent needs for actual code generation
- Use .ai/plans/ for agent-agnostic storage: Kaushik Gopal recommends this directory structure over .agent/ for teams using multiple AI tools2:

      .ai/
      ├── plans/
      │   ├── PLANS.md            # Master template
      │   ├── auth-migration.md   # Active plan
      │   └── tmp/                # Gitignored scratch plans
      └── ...

- Gitignore temporary plans: add .ai/plans/tmp/ to .gitignore for exploratory work; promote plans to the parent directory when they represent committed work
- 30-minute sessions are normal: whilst the pattern enables 7+ hour sessions, most practitioners report their sweet spot is 30–60 minutes of focused, plan-guided execution2
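The word budget is simple to check mechanically. Here is a minimal sketch, assuming `wc -w` is a good enough approximation of word count for a Markdown plan; the function name `check_plan_length` and its optional limit argument are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: warn when a plan reaches or exceeds a word budget.

# check_plan_length FILE [LIMIT]
# Returns 0 when FILE has fewer than LIMIT words (default 2000), 1 otherwise.
check_plan_length() {
  local plan_file="$1" limit="${2:-2000}" words
  # Some wc implementations pad the number with spaces, so strip them.
  words="$(wc -w < "$plan_file" | tr -d '[:space:]')"
  if (( words >= limit )); then
    echo "$plan_file: $words words (limit $limit)" >&2
    return 1
  fi
  echo "$plan_file: $words words"
}
```

Code samples and transcripts inside the plan count towards the total, so a plan that fails the check may need its Artifacts and Notes section trimmed rather than its prose.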
The Sora Case Study
OpenAI used the Plans.md technique internally to build the Sora Android app in 28 days4. The technique guided Codex through multi-step architecture and implementation, with each milestone independently reviewable. When approximately 50% of OpenAI staff adopted Codex with planning patterns, they produced roughly 70% more pull requests than non-users4 — a productivity gain attributable largely to the coherence that structured planning provides.
Getting Started
The fastest path to adopting ExecPlans:
- Add the AGENTS.md snippet: two lines, immediate effect
- Copy the Codex skill: clone friel-openai/plans.md5 into .codex/skills/execplans/
- Run your first plan: pick a task that would normally take 2–4 hours and use the prompt “Create an ExecPlan for [task], keep it updated, implement end to end”
- Review the living document: after the session, read the plan to understand what the agent decided and why
The ExecPlan pattern is not about constraining the agent — it is about giving it the same tool that experienced engineers use instinctively: a written plan that survives interruptions, captures decisions, and keeps work focused on observable outcomes.
Citations
1. Aaron Friel, “Using PLANS.md for multi-hour problem solving,” OpenAI Cookbook, October 2025. https://developers.openai.com/cookbook/articles/codex_exec_plans
2. Kaushik Gopal, “ExecPlans — How to get your coding agent to run for hours,” kau.sh, 2026. https://kau.sh/blog/exec-plans/
3. OpenAI, “Run long horizon tasks with Codex,” OpenAI Developer Blog, February 2026. https://developers.openai.com/blog/run-long-horizon-tasks-with-codex
4. Alex Embiricos, “Advanced Codex Workflows,” How I AI podcast / Lenny’s Newsletter, January 12, 2026. https://www.lennysnewsletter.com/p/this-week-on-how-i-ai-the-power-users
5. Aaron Friel, “plans.md — Codex skill for file-backed execution plans,” GitHub, 2025–2026. https://github.com/friel-openai/plans.md
6. OpenAI, “Subagents — Codex,” OpenAI Developer Docs, 2026. https://developers.openai.com/codex/subagents
7. OpenAI Community, “Plans.md file mentioned in the Shipping with Codex talk at Dev Day,” OpenAI Developer Forum, 2025. https://community.openai.com/t/plans-md-file-mentioned-in-the-shipping-with-codex-talk-at-dev-day/1361628
8. ninjaa, “openai-codex-exec-plan,” GitHub, 2026. https://github.com/ninjaa/openai-codex-exec-plan