Codex CLI Diagnostic Toolkit: Tracing, Sandbox Testing, and the Built-In Debugging Commands
Codex CLI ships with a surprisingly deep set of diagnostic tools that most developers never discover. When an agent session stalls, a sandbox blocks a legitimate command, or a config key silently fails to take effect, knowing how to reach for RUST_LOG, codex sandbox, or /debug-config can save hours of guesswork. This article is a systematic reference to every built-in diagnostic surface in Codex CLI as of v0.118.0.
The Diagnostic Surface Area
Codex CLI’s diagnostic capabilities span four layers: runtime tracing via environment variables, interactive slash commands inside the TUI, standalone CLI subcommands for offline testing, and post-session analysis via JSONL rollout files.
graph TD
A[Codex CLI Diagnostics] --> B[Runtime Tracing]
A --> C[TUI Slash Commands]
A --> D[Standalone Subcommands]
A --> E[Post-Session Analysis]
B --> B1["RUST_LOG env var"]
B --> B2["LOG_FORMAT=json"]
B --> B3["OpenTelemetry export"]
C --> C1["/status"]
C --> C2["/debug-config"]
C --> C3["/feedback"]
D --> D1["codex sandbox"]
D --> D2["codex execpolicy check"]
D --> D3["codex debug"]
D --> D4["codex login status"]
E --> E1["JSONL rollout files"]
E --> E2["codex-tui.log"]
Runtime Tracing with RUST_LOG
Since Codex CLI is built in Rust atop the standard tracing crate1, the RUST_LOG environment variable controls verbosity at module granularity. The default level for Codex crates is info2.
Basic Usage
# Global debug logging
RUST_LOG=debug codex
# Trace-level logging (extremely verbose)
RUST_LOG=trace codex
# Debug logging in non-interactive mode
RUST_LOG=debug codex exec "refactor the auth module"
Module-Targeted Tracing
The real power lies in per-module targeting. Codex’s Rust workspace exposes several key tracing targets2:
# Debug the core agent loop while keeping everything else at info
RUST_LOG=info,codex_core=debug codex
# Trace shell command execution specifically
RUST_LOG=codex_exec=trace,codex_core=debug codex
# Debug sandbox behaviour
RUST_LOG=codex_sandbox=debug,codex_process_hardening=debug codex
# Trace API request/response details
RUST_LOG=codex_core::api=trace codex
# Debug MCP server connections
RUST_LOG=codex_core::mcp=debug codex
# Trace configuration resolution
RUST_LOG=codex_core::config=trace codex
# Trace authentication flows
RUST_LOG=codex_core::auth=trace codex
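The filter strings above follow the comma-separated `target=level` directive syntax that Rust's tracing ecosystem reads from RUST_LOG. As a minimal sketch, the helper below composes such a string from a default level and per-target overrides. The names `build_rust_log` and `tracing_env` are hypothetical and not part of Codex CLI, and the target names are simply the ones listed in this article.

```python
import os

VALID_LEVELS = ("error", "warn", "info", "debug", "trace")


def build_rust_log(default_level, overrides=None):
    """Compose a RUST_LOG value such as 'info,codex_core=debug'.

    Hypothetical helper for illustration; not part of Codex CLI.
    """
    overrides = overrides or {}
    for level in (default_level, *overrides.values()):
        if level not in VALID_LEVELS:
            raise ValueError(f"unknown log level: {level!r}")
    directives = [default_level]
    # Sort targets so the generated string is stable across runs.
    directives += [f"{target}={level}" for target, level in sorted(overrides.items())]
    return ",".join(directives)


def tracing_env(default_level, overrides=None):
    """Return a copy of the current environment with RUST_LOG set."""
    env = dict(os.environ)
    env["RUST_LOG"] = build_rust_log(default_level, overrides)
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run(["codex"], env=...)`, which keeps the verbose filter scoped to one launch instead of the whole shell session.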
Structured Log Output
For machine-parseable logs — useful when piping into log aggregation — set the format to JSON2:
RUST_LOG=debug LOG_FORMAT=json codex exec "run tests" 2>&1 | tee codex-debug.log
The compact format is also available via RUST_LOG_FORMAT=compact2.
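Once logs are JSON, they can be filtered programmatically rather than by eye. The sketch below assumes each log line is a JSON object carrying `level` and `target` keys, which is the shape commonly produced by Rust's JSON tracing formatter. Whether Codex emits exactly these field names is an assumption, so check a sample line from your own log first. `filter_log_lines` is a hypothetical helper.

```python
import json

LEVEL_ORDER = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4}


def filter_log_lines(lines, min_level="WARN", target_prefix=""):
    """Yield parsed JSON log records at or above min_level.

    Assumes each record carries "level" and "target" keys (an assumption
    about the log shape). Lines that are not JSON objects, such as
    interleaved command output, are skipped.
    """
    threshold = LEVEL_ORDER[min_level.upper()]
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        level = str(record.get("level", "")).upper()
        if LEVEL_ORDER.get(level, -1) < threshold:
            continue
        if not str(record.get("target", "")).startswith(target_prefix):
            continue
        yield record
```

For example, opening the `codex-debug.log` file captured above and iterating over `filter_log_lines(handle, "ERROR", "codex_core")` would surface only error records from targets beginning with `codex_core`.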
Log File Locations
Codex writes TUI logs to ~/.codex/log/codex-tui.log, with automatic rotation3. In codex exec mode, timestamped log files appear at ~/.codex/logs/codex-tui-<timestamp>.log2. These can be safely deleted when no longer needed, but they are invaluable for post-mortem debugging.
# Monitor logs in real time during a session
tail -f ~/.codex/logs/codex-tui-*.log
⚠️ Performance warning: Debug and trace levels can reduce throughput by 10–50%2. Reserve them for active troubleshooting, not production workflows.
TUI Slash Commands for Live Diagnostics
Three slash commands provide in-session diagnostic information without leaving the TUI.
/status — Session Overview
The /status command displays the current session configuration and token usage4. This is your first stop when something feels off — it confirms which model is active, the current reasoning effort level, token consumption, and the effective sandbox mode.
/debug-config — Configuration Layer Diagnostics
When a config key appears to have no effect, /debug-config reveals the full configuration resolution stack5. It prints:
- Layer order (lowest to highest precedence)
- The effective value of each key and which layer set it
- Policy details: allowed_approval_policies, allowed_sandbox_modes, mcp_servers, rules, enforce_residency, and experimental_network
This is particularly useful in enterprise environments where requirements.toml may silently override your config.toml settings5. If your sandbox_mode = "danger-full-access" is being ignored, /debug-config will show you that a managed policy is enforcing workspace-write.
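To make the idea of a resolution stack concrete, here is a minimal sketch of layered configuration merging that records which layer supplied each effective value. It is a simplification under stated assumptions: the layer names and the `model` value are hypothetical, and a managed policy is modelled as nothing more than the highest-precedence layer, which may not match how Codex actually enforces requirements.

```python
def resolve_layers(layers):
    """Merge config layers given from lowest to highest precedence.

    `layers` is a list of (layer_name, settings_dict) pairs. Returns a dict
    mapping each key to (effective_value, name_of_layer_that_set_it).
    """
    effective = {}
    for name, settings in layers:
        for key, value in settings.items():
            # A later (higher-precedence) layer overwrites earlier ones.
            effective[key] = (value, name)
    return effective


# Hypothetical layer names and values, for illustration only.
layers = [
    ("defaults", {"sandbox_mode": "workspace-write", "model": "example-model"}),
    ("config.toml", {"sandbox_mode": "danger-full-access"}),
    ("requirements.toml", {"sandbox_mode": "workspace-write"}),
]

for key, (value, source) in sorted(resolve_layers(layers).items()):
    print(f"{key} = {value!r}  (set by {source})")
```

Tracking the source layer next to each value is what turns a silent override into something visible: the `sandbox_mode` entry reports `requirements.toml` as its origin even though `config.toml` asked for something else.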
/feedback — Structured Bug Reports
The /feedback command collects diagnostic information and submits it directly to OpenAI’s maintainers3. When invoked, it captures:
- Request ID (essential for OpenAI support tickets)
- Session ID
- Connection status (connected/reconnecting/disconnected)
- Last error message
- Active tools count
- MCP server connection status
Always run /feedback before closing a session that exhibited unexpected behaviour — the request ID is the single most useful datum when filing issues on GitHub3.
The codex sandbox Subcommand
The codex sandbox subcommand6 lets you test arbitrary commands under the exact same sandbox enforcement that Codex applies during agent sessions — without starting an agent session. This is indispensable when diagnosing why a build tool or test runner fails under sandboxing.
Platform-Specific Syntax
# macOS — test a command under Seatbelt enforcement
codex sandbox macos -- npm run build
# macOS — with full-auto permissions and denial logging
codex sandbox macos --full-auto --log-denials -- cargo test
# Linux — test under Landlock/bubblewrap enforcement
codex sandbox linux -- pytest tests/
# Linux — full-auto mode (workspace-write equivalent)
codex sandbox linux --full-auto -- make install
# Windows — test under restricted token enforcement
codex sandbox windows --full-auto -- dotnet test
The --log-denials flag on macOS is particularly valuable: it prints every Seatbelt denial to stderr, showing exactly which filesystem path or network operation was blocked6.
Legacy Aliases
The older codex debug seatbelt and codex debug landlock commands still work as aliases7:
# These are equivalent:
codex sandbox macos -- ls /etc
codex debug seatbelt -- ls /etc
Practical Use: Diagnosing Build Failures
A common scenario: your Rust project builds fine outside Codex but fails under the agent’s sandbox. Use codex sandbox to isolate the issue:
# Step 1: Test the build under sandbox
codex sandbox linux -- cargo build 2>&1 | grep -i denied
# Step 2: If failures appear, try with full-auto (workspace-write)
codex sandbox linux --full-auto -- cargo build
# Step 3: If it still fails, the likely cause is blocked network access
# (e.g., crates.io downloads blocked by the sandbox)
This workflow avoids the cost of starting a full agent session just to debug sandbox restrictions.
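The three steps can also be scripted. The sketch below wraps the `codex sandbox` invocations shown above and classifies the outcome. It assumes that `codex sandbox` exits with the wrapped command's exit status, which you should verify on your own installation. The `run_sandboxed` and `diagnose` names are hypothetical.

```python
import subprocess


def run_sandboxed(command, platform="linux", full_auto=False):
    """Run `command` (a list of args) via `codex sandbox` and return its exit code.

    Assumes the sandbox wrapper propagates the command's exit status.
    """
    args = ["codex", "sandbox", platform]
    if full_auto:
        args.append("--full-auto")
    args += ["--", *command]
    return subprocess.run(args).returncode


def diagnose(command, platform="linux", runner=run_sandboxed):
    """Classify a sandbox failure by retrying with --full-auto."""
    if runner(command, platform, False) == 0:
        return "passes under the default sandbox"
    if runner(command, platform, True) == 0:
        return "needs --full-auto (workspace write access)"
    return "fails even with --full-auto; network access is a likely cause"
```

Calling `diagnose(["cargo", "build"])` runs the build at most twice. The `runner` parameter exists so the decision logic can be exercised with a stub, without Codex installed.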
Platform Implementation Details
On macOS 12+, codex sandbox invokes Apple’s Seatbelt framework via /usr/bin/sandbox-exec with a runtime-generated profile controlling filesystem and network access6. On Linux, the sandbox uses a dual-mode pipeline: Landlock LSM by default, or bubblewrap (vendored in codex-rs/vendor/bubblewrap/) when enabled via features.use_linux_sandbox_bwrap = true6. The bubblewrap path provides stronger isolation through PID namespace separation (--unshare-pid), network namespace isolation (--unshare-net), and seccomp filters6.
flowchart LR
subgraph macOS
A[codex sandbox macos] --> B[sandbox-exec]
B --> C[Seatbelt profile]
C --> D[Command runs isolated]
end
subgraph Linux
E[codex sandbox linux] --> F{bwrap enabled?}
F -->|Yes| G[bubblewrap]
F -->|No| H[Landlock + seccomp]
G --> I[Namespace isolation]
H --> I
I --> J[Command runs isolated]
end
The codex execpolicy check Subcommand
Before deploying Starlark .rules files, validate them offline with codex execpolicy check8. This subcommand evaluates one or more rule files against a proposed command and reports the decision without executing anything.
# Test a command against your rules
codex execpolicy check \
--pretty \
--rules ~/.codex/rules/default.rules \
-- gh pr view 7888 --json title,body,comments
The output shows:
- Effective decision: the strictest severity across all matched rules (forbidden > prompt > allow)8
- matchedRules: every rule whose prefix matched, with the exact matchedPrefix shown8
You can combine multiple rule files:
codex execpolicy check \
--pretty \
--rules ~/.codex/rules/default.rules \
--rules .codex/rules/project.rules \
-- rm -rf node_modules
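The "strictest severity wins" rule is easy to state as code. The sketch below ranks the three decisions in the order given above and returns the strictest one among the matched rules. Returning `None` when nothing matched is an assumption made for this illustration, not documented Codex behaviour, and `effective_decision` is a hypothetical name.

```python
SEVERITY = {"allow": 0, "prompt": 1, "forbidden": 2}


def effective_decision(matched_decisions):
    """Return the strictest decision among matched rules.

    Returns None when no rule matched (an assumption for this sketch).
    """
    if not matched_decisions:
        return None
    return max(matched_decisions, key=SEVERITY.__getitem__)
```

With two rule files loaded, a command matched by an `allow` rule in one file and a `forbidden` rule in the other resolves to `forbidden`, which is why adding a permissive project rule cannot loosen a stricter default.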
Unit Tests in Rules Files
The match and not_match fields in prefix_rule() function as inline unit tests8. Codex validates these examples when it loads your rules — if a match example does not trigger the rule, or a not_match example does, loading fails. Always populate these fields:
prefix_rule(
pattern = "rm -rf",
decision = "forbidden",
match = ["rm -rf /", "rm -rf node_modules"],
not_match = ["rm file.txt", "rmdir empty"]
)
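The load-time check can be pictured with a small model. The sketch below treats a rule as a token-wise prefix match and verifies the `match` and `not_match` examples against it. This is a simplified stand-in for illustration, not Codex's actual matcher, and `prefix_matches` and `validate_examples` are hypothetical names.

```python
import shlex


def prefix_matches(pattern, command):
    """True when the command's tokens start with the pattern's tokens."""
    pattern_tokens = shlex.split(pattern)
    command_tokens = shlex.split(command)
    return command_tokens[: len(pattern_tokens)] == pattern_tokens


def validate_examples(pattern, match=(), not_match=()):
    """Return a list of error strings; an empty list means every example holds."""
    errors = []
    for example in match:
        if not prefix_matches(pattern, example):
            errors.append(f"match example did not trigger: {example!r}")
    for example in not_match:
        if prefix_matches(pattern, example):
            errors.append(f"not_match example triggered: {example!r}")
    return errors
```

Matching on tokens rather than raw characters is what keeps `rmdir empty` from being caught by a rule meant for `rm`. The examples in the rule above pass this check with no errors.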
The codex debug Subcommand
The codex debug command is the entry point for lower-level debugging utilities7:
# List available debug subcommands
codex debug --help
# Test the V2 app-server protocol with a single message
codex debug app-server send-message-v2 "Hello, world"
The send-message-v2 subcommand initialises the app-server, starts a thread, sends a single user message, and streams all server notifications back to the terminal7. This is useful for verifying that the app-server protocol is functioning correctly without starting the full TUI.
Authentication Diagnostics
When sessions fail to start with authentication errors, two commands help isolate the issue:
# Check current auth state without triggering a login flow
codex login status
# Inspect the auth token file directly
jq '.expires_at' ~/.codex/auth.json
The codex login status command reports whether you are authenticated, the method used (browser OAuth, device code, or API key), and whether the token is valid7. A common failure pattern is a corrupted or expired auth.json file — the fix is to run codex logout followed by codex login3.
OpenTelemetry Integration
For production observability beyond ad-hoc tracing, Codex CLI supports OpenTelemetry export via the [otel] config section9:
[otel]
enabled = true
endpoint = "http://localhost:4317"
sampling_ratio = 1.0
service_name = "codex-cli"
This exports spans covering API calls, tool invocations, and sandbox operations to any OTLP-compatible backend (Jaeger, Grafana Tempo, SigNoz)9. Environment variables OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_SERVICE_NAME also work2.
⚠️ Note: As of v0.118.0, codex exec does not yet export OTel metrics, and codex mcp-server mode has no telemetry support9.
Post-Session Analysis with JSONL Rollout Files
Every Codex session writes a JSONL rollout file under the ~/.codex/sessions/ directory10. These files contain RolloutItem events (SessionMeta, UserMessage, ResponseItem, EventMsg, ApprovalDecision) and are invaluable for understanding what happened during a session that went wrong.
# Find the latest session rollout
ls -t ~/.codex/sessions/*.jsonl | head -1
# Count tool calls in a session
jq 'select(.type == "ResponseItem") | .item.type' ~/.codex/sessions/<session>.jsonl | \
sort | uniq -c | sort -rn
# Extract all approval decisions
jq 'select(.type == "ApprovalDecision")' ~/.codex/sessions/<session>.jsonl
The community codex-replay tool renders these JSONL files as browsable HTML, and the ccusage project provides daily and monthly cost reports parsed from rollout token counters10.
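When jq is not available, the same first-pass summary can be produced with a few lines of standard-library code. The sketch below counts rollout records by their top-level `type` field, the same field the jq filters above select on. Anything beyond that field is not assumed, and `count_event_types` is a hypothetical helper.

```python
import json
from collections import Counter


def count_event_types(path):
    """Count rollout records by their top-level "type" field."""
    counts = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # Tolerate a truncated final line from an interrupted session.
                continue
            if isinstance(record, dict):
                counts[record.get("type", "<missing>")] += 1
    return counts
```

Calling `count_event_types(path).most_common()` on a rollout file gives a quick picture of a session, for instance whether it contains any ApprovalDecision records at all, before you dig into individual events.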
A Diagnostic Workflow Checklist
When something goes wrong, work through this sequence:
- Check config: Run /debug-config to verify your settings are taking effect
- Check auth: Run codex login status to rule out credential issues
- Check sandbox: Use codex sandbox <platform> -- <command> to test commands in isolation
- Check rules: Use codex execpolicy check --pretty --rules <file> -- <command> to validate execution policies
- Enable tracing: Restart with RUST_LOG=debug codex and monitor ~/.codex/log/codex-tui.log
- Review the rollout: Inspect the JSONL session file for the failed session
- File a report: Run /feedback to capture diagnostic context before closing
This top-down approach moves from cheap (no restart required) to expensive (restart with tracing), minimising disruption to your workflow.
Citations
- Tracing & Verbose Logging — Codex CLI Advanced Documentation
- Codex CLI Logs: Location, Debug Flags & 401 Error Fix — SmartScope
- Sandboxing Architecture — Codex CLI Documentation
- Command Line Options — Codex CLI Reference — OpenAI Developers
- Execution Policy (execpolicy) README — OpenAI Codex GitHub
- Codex CLI OpenTelemetry: Observability and Metrics in Production — Codex Resources
- Codex CLI Session Analytics: Mining the JSONL Rollout Format — Codex Resources