Codex CLI Pull Request Workflows: Branch to Merge with Agent-Assisted Review and CI Integration
Pull requests are where code quality lives or dies. Most teams treat Codex CLI as a code-generation tool and then fall back to manual workflows the moment they push a branch. That wastes most of the agent’s value. Codex CLI can assist at every stage of the PR lifecycle — from branch creation through local review, CI gating, GitHub-native review comments, and post-merge cleanup. This article maps the complete workflow, shows the configuration needed at each stage, and identifies the integration points most teams miss.
The End-to-End PR Pipeline
A modern Codex-assisted PR workflow has six stages. Each stage has a corresponding Codex surface — CLI, GitHub cloud, or CI — and specific configuration requirements.
flowchart LR
A[Branch & Worktree] --> B[Implement with Codex]
B --> C[Local /review]
C --> D[Commit & Push]
D --> E[CI: codex-action]
E --> F[GitHub: @codex review]
F --> G[Fix & Merge]
Stage 1: Branch Isolation with Git Worktrees
Running Codex on your main working tree is a recipe for merge conflicts. The recommended pattern is to create a git worktree for each feature branch, giving Codex an isolated filesystem to operate in while your main checkout stays clean [1].
# Create a worktree for the feature branch
git worktree add ../feature-auth-refactor -b feature/auth-refactor
# Start Codex in the worktree directory
cd ../feature-auth-refactor
codex
Each worktree gets its own Codex session, branch, and file state. You can run multiple Codex sessions in parallel across different worktrees without interference [1]. Project-level .codex/config.toml and AGENTS.md files travel with the repository, so every worktree inherits the same agent configuration.
For teams running multiple agents simultaneously, the --add-dir flag lets Codex write to additional directories beyond the worktree root [2]:
codex --cd ../feature-auth-refactor --add-dir ../shared-types
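The branch-to-directory convention used above (feature/auth-refactor becomes ../feature-auth-refactor) can be wrapped in a few helpers that also cover post-merge cleanup. This is a minimal sketch: the function names worktree_dir, new_worktree, and remove_worktree are hypothetical, and only the underlying git worktree add and git worktree remove commands come from Git itself.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Codex or Git.

# Map a branch name to a sibling directory:
# feature/auth-refactor -> ../feature-auth-refactor
worktree_dir() {
  printf '../%s\n' "$(printf '%s' "$1" | tr '/' '-')"
}

# Create the worktree on a new branch and print its path on stdout.
# Git's own output is redirected to stderr so the path can be captured cleanly.
new_worktree() (
  dir="$(worktree_dir "$1")"
  git worktree add "$dir" -b "$1" >&2 && printf '%s\n' "$dir"
)

# Remove the worktree after the PR has merged; the branch itself is left alone.
remove_worktree() {
  git worktree remove "$(worktree_dir "$1")"
}
```

With these in place, cd "$(new_worktree feature/auth-refactor)" followed by codex starts a session in the new worktree, and remove_worktree feature/auth-refactor tidies up after the merge.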
Stage 2: Implementation with Agent Guidance
The implementation phase is where most teams already use Codex well, but two patterns improve PR quality downstream.
Write AGENTS.md Review Guidelines Early
AGENTS.md is not just for build commands. Adding a Review guidelines section means both the implementation agent and the review agent follow the same standards [3]:
## Review guidelines
- Flag any function longer than 40 lines as P1
- All public API endpoints must have integration tests
- Do not log PII — flag as P0 if detected
- Prefer composition over inheritance in new code
- Every database migration must be reversible
Codex discovers these guidelines automatically during both /review and @codex review on GitHub [3]. Writing them before implementation, not after, prevents the agent from generating code that its own review will flag.
Use Plan Mode for Non-Trivial Changes
For changes that touch more than three files, start with plan mode [4]:
> /plan Refactor the auth middleware to extract token validation into a standalone module
Codex produces a plan without making changes. Review the plan, adjust scope, then switch to implementation. This reduces the diff size per PR and produces cleaner commit histories.
Stage 3: Local Review with /review
The /review command launches a dedicated reviewer that analyses your diff and reports prioritised findings without touching the working tree [5]. It operates in three modes:
| Mode | What It Reviews | When to Use |
|---|---|---|
| Review against base branch | Merge-base diff vs upstream | Before opening the PR |
| Review uncommitted changes | Staged + unstaged files | Before committing |
| Review a commit | Specific SHA | After committing, before pushing |
> /review
# Select: "Review against base branch"
# Select: main
The reviewer flags P0 (must fix) and P1 (should fix) issues with file paths and line ranges [5]. It respects the Review guidelines section in your AGENTS.md, so the findings align with your team’s standards [3].
Configuring a Review-Specific Model
You can route review tasks to a different model using a named profile [6]:
# ~/.codex/config.toml
[profiles.review]
model = "gpt-5.5"
model_reasoning_effort = "high"
approval_policy = "on-request"
Then invoke it:
codex -p review
Using GPT-5.5 with high reasoning effort for reviews catches subtle issues (race conditions, security regressions, missing edge cases) that faster models miss [7]. The extra cost is worth it because review runs once per PR, not once per edit.
Stage 4: Commit and Push
After local review passes, commit with a descriptive message. Codex can generate commit messages that summarise the diff:
> Write a conventional commit message for the current staged changes
For teams enforcing commit conventions, add the format to your AGENTS.md:
## Commit conventions
- Use Conventional Commits format: type(scope): description
- Types: feat, fix, refactor, test, docs, chore
- Scope is the module or directory name
- Description is imperative mood, lowercase, no period
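The mechanical parts of these conventions can also be checked outside the agent, for example in a commit-msg hook. The sketch below is an assumption-laden illustration: valid_commit_subject is a hypothetical name, the scope pattern is a guess at what a module or directory name may contain, and imperative mood is not checked because a regular expression cannot judge it.

```shell
#!/bin/sh
# Hypothetical checker for the commit conventions listed above.
# Accepts:  type(scope): description
#   - type is one of feat, fix, refactor, test, docs, chore
#   - scope is a module or directory name (assumed character set: letters,
#     digits, dot, underscore, slash, hyphen)
#   - description starts with a lowercase letter and does not end with a period
valid_commit_subject() {
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|refactor|test|docs|chore)\([A-Za-z0-9._/-]+\): [a-z].*[^.]$'
}
```

In a commit-msg hook, Git passes the message file as the first argument, so valid_commit_subject "$(head -n 1 "$1")" || exit 1 rejects a non-conforming subject line.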
Stage 5: CI Integration with codex-action
The openai/codex-action@v1 GitHub Action installs Codex CLI, starts the Responses API proxy, and runs codex exec with your prompt inside the CI runner [8]. This gives you an agent-powered quality gate in your pipeline.
Basic PR Review Workflow
name: Codex PR Review
on:
  pull_request:
    types: [opened, synchronize, reopened]
jobs:
  codex-review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0
      - name: Run Codex Review
        id: review
        uses: openai/codex-action@v1
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt-file: .github/codex/prompts/review.md
          output-file: codex-review.md
          model: gpt-5.5
          sandbox: read-only
The safety-strategy input defaults to drop-sudo, which irreversibly removes sudo privileges from the runner, a sensible default for review tasks [8]. For tasks that need to run builds or tests, use workspace-write sandbox mode.
Structured Output for Automated Gating
Use --output-schema to enforce machine-readable review output [8]:
{
  "type": "object",
  "properties": {
    "p0_issues": { "type": "integer" },
    "p1_issues": { "type": "integer" },
    "summary": { "type": "string" },
    "approve": { "type": "boolean" }
  },
  "required": ["p0_issues", "p1_issues", "approve"]
}
Then gate the merge on the result:
- name: Check Review Result
  run: |
    P0=$(jq '.p0_issues' codex-review.md)
    if [ "$P0" -gt 0 ]; then
      echo "::error::Codex found $P0 P0 issues; blocking merge"
      exit 1
    fi
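The step above looks only at p0_issues. If that field comes back null or non-numeric, the -gt test errors out inside the if condition and the step passes. A stricter variant can fail closed and also honour the approve field from the schema. This is a sketch under stated assumptions: review_gate is a hypothetical function name, jq is assumed to be installed on the runner, and the file is assumed to contain only the JSON object described by the schema.

```shell
#!/bin/sh
# Hypothetical gate: succeeds only when the review file is valid JSON,
# reports zero P0 issues, and has approve set to true.
# Anything else (missing file, invalid JSON, null fields) is a failure.
review_gate() {
  if [ ! -s "$1" ]; then
    echo "review output missing or empty: $1" >&2
    return 2
  fi
  # jq -e exits non-zero when the expression yields false or null,
  # and also when the input cannot be parsed.
  jq -e '(.p0_issues == 0) and (.approve == true)' "$1" >/dev/null
}
```

In the workflow this would be called as review_gate codex-review.md, with a non-zero exit status failing the step.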
Security Considerations
Three rules for running Codex in CI [8]:
- Restrict triggers. Use allow-users and allow-bots to prevent untrusted actors from injecting prompts via PR descriptions.
- Sanitise inputs. Never pass raw PR body text as a prompt; it is an injection vector.
- Isolate the step. Run Codex as the final job step to contain side effects. Rotate API keys immediately if exposure is suspected.
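One way to apply the second rule is to build the prompt from a fixed template and interpolate only metadata that has passed a strict check, leaving the PR title and body out entirely. The sketch below is illustrative rather than a complete defence: safe_ref and build_review_prompt are hypothetical names, and the accepted character set is a conservative assumption about what a base branch name needs.

```shell
#!/bin/sh
# Hypothetical helper: accept only conservative ref names.
# Rejects empty values, leading dashes, ".." sequences, and any character
# outside letters, digits, dot, underscore, slash, and hyphen.
safe_ref() {
  case "$1" in
    ''|-*|*..*|*[!A-Za-z0-9._/-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Build the prompt from a fixed template plus the validated base ref.
# PR titles and bodies are never interpolated.
build_review_prompt() {
  if ! safe_ref "$1"; then
    echo "refusing unsafe base ref" >&2
    return 1
  fi
  printf 'Review the changes in this pull request against the base branch %s.\n' "$1"
}
```

In a workflow, the base ref would typically arrive through an environment variable populated from github.base_ref rather than being pasted into the script text, for example build_review_prompt "$BASE_REF".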
Stage 6: GitHub-Native Review with @codex
Once the PR is open, Codex can review it directly on GitHub as a cloud-based coding agent [3]. This is separate from the CI action: it posts standard GitHub review comments visible to all reviewers.
Manual Trigger
Comment on the PR:
@codex review
Codex reacts with an eyes emoji, reads the diff against the base branch, and posts a review with P0 and P1 findings as inline comments [3].
Automatic Reviews
Enable automatic reviews in Codex settings to have every new PR reviewed without a manual trigger [3]. This is useful for high-volume repositories where human reviewers want an initial triage before investing time.
Custom Review Instructions
Add context to the trigger:
@codex review for security regressions in the auth module
Codex scopes its analysis accordingly [3].
Fix-and-Push Loop
After Codex posts findings, you can ask it to fix the issues in the same PR [3]:
@codex fix the P1 issue in auth/validate.ts
Codex starts a cloud task using the PR as context, implements the fix, and pushes a commit to the branch — provided it has write access to the repository. This closes the loop without switching back to your terminal.
Connecting the Stages: A Complete Configuration
Here is the minimum configuration set for the full pipeline:
Repository Structure
project/
  AGENTS.md                  # Review guidelines + build commands
  .codex/
    config.toml              # Project-level Codex config
  .github/
    codex/
      prompts/
        review.md            # CI review prompt
    workflows/
      codex-review.yml       # GitHub Action workflow
Project-Level config.toml
# .codex/config.toml
model = "gpt-5.5"
approval_policy = "on-request"
sandbox_mode = "workspace-write"
[features]
web_search = "disabled"
CI Review Prompt (.github/codex/prompts/review.md)
Review the changes in this pull request against the base branch.
Focus on:
1. Correctness — logic errors, off-by-one, null handling
2. Security — injection, auth bypass, secrets in code
3. Performance — N+1 queries, unbounded loops, missing indices
4. Test coverage — new code paths without tests
Output structured JSON with p0_issues, p1_issues, summary, and approve fields.
Follow the Review guidelines in AGENTS.md.
The Decision Matrix: Which Review Surface When
flowchart TD
A[Changes ready for review] --> B{Still local?}
B -->|Yes| C["/review in CLI"]
C --> D{P0 issues found?}
D -->|Yes| E[Fix locally, re-review]
D -->|No| F[Push branch, open PR]
B -->|No, PR is open| F
F --> G[codex-action runs in CI]
G --> H{CI gate passes?}
H -->|No| I[Fix issues, push again]
H -->|Yes| J[@codex review on GitHub]
J --> K{Findings?}
K -->|Yes| L[@codex fix or manual fix]
K -->|No| M[Human reviewer approves]
L --> G
M --> N[Merge]
Running /review locally before pushing catches approximately 60-70% of issues that would otherwise appear in CI or GitHub review [9]. This reduces CI costs, shortens feedback loops, and keeps the PR comment thread focused on genuinely ambiguous decisions that need human judgement.
Common Mistakes
Running all reviews at the same reasoning level. Local /review during active development can use medium reasoning effort. CI and GitHub reviews, which run once and gate the merge, should use high effort with GPT-5.5 [7].
Skipping AGENTS.md review guidelines. Without explicit guidelines, Codex applies generic heuristics. Teams report a 40-50% increase in actionable findings after adding project-specific rules to AGENTS.md [3].
Using codex-action without fetch-depth: 0. The action needs full git history to compute the merge-base diff. Shallow clones produce incomplete reviews [8].
Treating @codex review as a replacement for human review. Agent review catches mechanical issues: bugs, missing tests, security patterns. Human reviewers catch architectural misalignment, team convention drift, and whether the change solves the right problem. Use both [9].
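The shallow-clone mistake in particular is easy to detect before the review step runs. The following is a minimal sketch of such a guard: merge_base_ready is a hypothetical function name, and it relies on git rev-parse --is-shallow-repository, which assumes a reasonably recent Git on the runner.

```shell
#!/bin/sh
# Hypothetical CI guard: succeeds only when the checkout has full history
# and a merge base between the given ref and HEAD can be computed.
merge_base_ready() {
  if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
    echo "shallow clone detected; set fetch-depth: 0 or run git fetch --unshallow" >&2
    return 1
  fi
  # Fails when the ref is unknown or shares no history with HEAD.
  git merge-base "$1" HEAD >/dev/null
}
```

Calling merge_base_ready origin/main ahead of the Codex step turns an incomplete review into an explicit failure.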
Citations
[1] Codex Worktrees Documentation. OpenAI Developers, accessed June 2026.
[2] Codex CLI Features: Multiple Writable Roots. OpenAI Developers, accessed June 2026.
[3] Code Review in GitHub: Codex Integration. OpenAI Developers, accessed June 2026.
[4] Best Practices: Plan First for Difficult Tasks. OpenAI Developers, accessed June 2026.
[5] Codex App Review: Local Review Modes. OpenAI Developers, accessed June 2026.
[6] Advanced Configuration: Profiles. OpenAI Developers, accessed June 2026.
[7] Codex Models: GPT-5.5 Capabilities. OpenAI Developers, accessed June 2026.
[8] Codex GitHub Action. OpenAI Developers, accessed June 2026.
[9] What 33,000 Agentic Pull Requests Reveal: Empirical Lessons for Codex CLI Practitioners. Codex Knowledge Base, April 2026.