Hook Gotchas: Seven PreToolUse and PostToolUse Timing Mistakes That Break Deterministic Enforcement in Codex CLI
Hooks are the deterministic enforcement layer that makes agent compliance guaranteed rather than probable. When a prompt says “never write secrets to files”, there is always a non-zero probability the model ignores it. When a PreToolUse hook exits with code 2, the tool call is blocked with mathematical certainty. No probability. No retries. No reasoning.
This makes hooks the single most powerful compliance mechanism in Codex CLI. It also makes misconfiguring them uniquely dangerous — because a broken hook does not fail loudly. It fails silently, downgrading your “guaranteed block” to “nothing happened at all.”
After reviewing community issue reports, enterprise deployment audits, and the patterns documented in the Claude Certified Architect exam preparation materials, seven hook mistakes recur consistently. Each one either (a) silently disables enforcement, (b) blocks the wrong things, or (c) creates performance problems that lead teams to disable hooks entirely.
Gotcha 1: Using PostToolUse When You Need PreToolUse
The mistake: Putting compliance logic in a PostToolUse hook to prevent dangerous operations.
Why it breaks: PostToolUse fires after the tool has already executed. If your hook checks whether a file write contains secrets, the file has already been written by the time the hook runs. Your “block” is actually a “detect after the damage is done.”
The correct pattern:
{
  "hooks": {
    "PreToolUse": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "bash .hooks/pre-write-secrets-check.sh",
            "timeout": 5000
          }
        ]
      }
    ]
  }
}
PreToolUse receives the tool call before execution. Exit code 2 blocks the call entirely. The tool never runs. The file is never written. The secret never touches the filesystem.
The CCA-F principle: “When the question uses words like must, never, or guarantee, the guarantee always wins.” If something must never happen, PreToolUse is the only option. PostToolUse can only report, transform, or clean up — it cannot prevent.
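The ordering argument can be made concrete with a toy model. The sketch below is not how Codex CLI is implemented; run_tool_call, secrets_check, and the in-memory disk list are hypothetical names introduced only to show why the same check prevents a write in the pre position and merely reports it in the post position.

```python
from typing import Callable, List

BLOCK = 2  # the exit code the article uses for a deliberate block


def secrets_check(content: str) -> int:
    # Stand-in check for illustration; a real hook would call a proper scanner.
    return BLOCK if "AKIA" in content else 0


def run_tool_call(
    content: str,
    disk: List[str],
    pre_hooks: List[Callable[[str], int]],
    post_hooks: List[Callable[[str], int]],
) -> str:
    """Toy ordering model: pre hooks, then the tool itself, then post hooks."""
    for hook in pre_hooks:
        if hook(content) == BLOCK:
            return "blocked before execution"  # the tool never runs
    disk.append(content)  # the "tool" runs here: the write has now happened
    for hook in post_hooks:
        if hook(content) == BLOCK:
            return "flagged after execution"  # too late to prevent the write
    return "ok"
```

With the check registered as a post hook, the content is already in disk when the hook objects. With the same check registered as a pre hook, disk stays empty.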
Gotcha 2: Assuming Exit Code 1 Blocks Execution
The mistake: Writing a PreToolUse hook that exits with code 1 when it detects a violation, expecting the tool call to be blocked.
Why it breaks: In Codex CLI’s hook model, exit codes have specific meanings:
| Exit code | Effect in PreToolUse | Effect in PostToolUse |
|---|---|---|
| 0 | Allow (proceed) | Accept output as-is |
| 1 | Hook error (logged, but tool still runs) | Hook error (logged, output unchanged) |
| 2 | Block (tool call prevented) | Transform/reject output |
Exit code 1 means “the hook itself failed” — a script error, missing dependency, or timeout. Codex CLI treats this as a hook malfunction, not as a deliberate enforcement decision. The tool call proceeds on the assumption that you would rather have a working agent with a broken hook than a completely stuck agent.
The fix: Always use exit 2 for deliberate blocks:
#!/bin/bash
# .hooks/pre-write-secrets-check.sh
# Read the proposed write (the hook payload) from stdin
CONTENT=$(cat)
# Check for secret patterns (AWS access key ID, GitHub token, sk- prefixed API key)
if printf '%s\n' "$CONTENT" | grep -qE '(AKIA[A-Z0-9]{16}|ghp_[a-zA-Z0-9]{36}|sk-[a-zA-Z0-9]{48})'; then
    echo "BLOCKED: Detected potential secret in proposed file write" >&2
    exit 2  # EXIT 2 = deterministic block
fi
exit 0  # EXIT 0 = allow
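The same trap applies to hooks written in other languages. A Python script that raises an unhandled exception exits with status 1, which the table above treats as a hook error, so the tool call proceeds. A minimal sketch of a fail-closed wrapper follows, assuming the hook is invoked as a command and reads its payload from stdin. check_payload and fail_closed are hypothetical names, and the regex is copied from the shell script above.

```python
import re
import sys
from typing import Callable

ALLOW, BLOCK = 0, 2

# Same patterns as the shell script above.
SECRET_RE = re.compile(r"AKIA[A-Z0-9]{16}|ghp_[a-zA-Z0-9]{36}|sk-[a-zA-Z0-9]{48}")


def check_payload(raw: str) -> int:
    """Return BLOCK if the raw payload contains a secret-like token."""
    return BLOCK if SECRET_RE.search(raw) else ALLOW


def fail_closed(check: Callable[[str], int], raw: str) -> int:
    """Run a check; turn any crash into a deliberate block instead of exit 1."""
    try:
        return check(raw)
    except Exception as exc:
        print(f"BLOCKED: hook crashed ({exc!r})", file=sys.stderr)
        return BLOCK
```

The script would then end with sys.exit(fail_closed(check_payload, sys.stdin.read())). Failing closed is a design choice that suits "must never" rules; for advisory checks, letting the error surface as exit 1 may be preferable.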
Gotcha 3: Setting Timeouts Too Short for Meaningful Checks
The mistake: Using the default or a very short timeout (1000-2000ms) for hooks that run Gradle, Docker, or any compilation step.
Why it breaks: When a hook times out, it is treated as exit code 1 (hook error) — which means the tool call proceeds. Your 30-second ArchUnit check that verifies layer boundaries never finishes, and the enforcement silently disappears.
Teams often discover this only when reviewing logs weeks later: “Why did the hook allow a domain-to-adapter dependency? Oh — it timed out on every invocation and was never actually checking anything.”
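The failure mode is easier to see in a toy model of the behaviour described above. This is an illustrative sketch, not Codex CLI's runner: pre_tool_decision is a hypothetical function, and its timeout is in seconds only because that is what Python's subprocess.run expects.

```python
import subprocess
from typing import List


def pre_tool_decision(cmd: List[str], timeout_s: float) -> str:
    """Map a hook command's outcome to a decision, following the article's table."""
    try:
        code = subprocess.run(cmd, timeout=timeout_s, capture_output=True).returncode
    except subprocess.TimeoutExpired:
        # A timed-out hook is handled like exit 1: logged, and the tool still runs.
        return "hook error: tool runs"
    if code == 0:
        return "allow"
    if code == 2:
        return "block"
    return "hook error: tool runs"
```

A hook that would have exited 2 after five seconds never gets the chance if its timeout is 200 ms: the model returns "hook error: tool runs", which is indistinguishable from no enforcement unless someone reads the logs.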
The fix: Set timeouts proportional to what the hook actually does:
{
  "hooks": {
    "PostToolUse": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "bash .hooks/post-write-check.sh",
            "timeout": 120000
          }
        ]
      }
    ]
  }
}
| Check type | Realistic timeout |
|---|---|
| Secret scanning (regex) | 5,000ms |
| Lint/format check | 15,000ms |
| Unit test compilation | 30,000ms |
| Full test suite | 120,000ms |
| Container build + scan | 300,000ms |
Rule of thumb: Measure your hook’s p99 execution time, then set the timeout to 3x that value. If your check cannot run within 5 minutes, it belongs in a ./gradlew verify task triggered less frequently, not on every file write.
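The rule of thumb is simple arithmetic once you have timing samples. The sketch below assumes you have already collected hook durations in milliseconds (for example, by timing the script over many runs) and uses the nearest-rank definition of a percentile, which is one of several common conventions. The names p99 and recommended_timeout_ms are hypothetical.

```python
import math
from typing import Sequence


def p99(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of the measured durations."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer arithmetic
    return ordered[rank - 1]


def recommended_timeout_ms(samples_ms: Sequence[float], factor: float = 3.0) -> int:
    """Timeout = factor x p99, rounded up to a whole millisecond."""
    return math.ceil(p99(samples_ms) * factor)
```

For one hundred samples of 1 ms through 100 ms, the p99 is 99 ms and the recommended timeout is 297 ms. With only a handful of samples the p99 is just the slowest run, so measure enough invocations for the tail to be meaningful.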
Gotcha 4: Running Expensive Hooks on Every Tool Call
The mistake: Attaching a full ./gradlew test run to PostToolUse without filtering which tool calls trigger it.
Why it breaks: PostToolUse fires on every tool call — reads, writes, bash commands, search operations. If your hook runs the full test suite every time the agent reads a file, a session that would take 5 minutes takes 3 hours. The team disables hooks “temporarily” to unblock the agent. Temporarily becomes permanently.
The fix: Filter hooks by tool name or by what changed:
#!/bin/bash
# .hooks/post-write-check.sh
# Only run checks when files were actually modified
TOOL_NAME="${CODEX_HOOK_TOOL_NAME:-}"
# Skip non-write operations
case "$TOOL_NAME" in
    write|edit|apply_patch) ;;
    *) exit 0 ;;
esac
# Only check modified source files, not config or docs
CHANGED_FILE="${CODEX_HOOK_FILE_PATH:-}"
case "$CHANGED_FILE" in
    src/main/java/*|src/test/java/*) ;;
    *) exit 0 ;;
esac
# Now run the targeted check
./gradlew spotlessCheck checkstyleMain -q 2>&1 || exit 2
The principle: Hooks should be fast-path by default. Only do expensive work when the trigger justifies it.
Gotcha 5: Using Prompt Hooks When You Need Command Hooks
The mistake: Using hook types that involve LLM evaluation (“prompt” or “agent” hooks) for compliance enforcement.
Why it breaks: The CCA-F exam materials make this distinction explicit and testable:
- Command hooks ("type": "command") are deterministic. They run a shell script. The exit code decides. No LLM involved. 100% reliable.
- Prompt hooks are probabilistic. They ask the model “should this be allowed?” The model’s answer has a non-zero failure rate.
If your compliance requirement is “never write AWS credentials to files,” a prompt hook might evaluate to “this looks like test data, allow it” on a well-crafted test fixture that happens to contain real credentials. A command hook running gitleaks will catch it every time regardless of what the model thinks about context.
The rule: For anything that uses the words “must”, “never”, “always”, or “guaranteed” — use command hooks only. Reserve prompt/agent hooks for advisory guidance (“suggest better variable names”), not enforcement.
Gotcha 6: Not Passing Context to the Hook Script
The mistake: Writing hook scripts that check for violations but have no information about what is being written or where.
Why it breaks: A PreToolUse hook receives information about the proposed tool call via stdin or environment variables. If your script ignores this context and instead scans the whole repository, it is (a) slow, (b) checking unchanged files, and (c) missing the actual content of the proposed write.
The fix: Use the hook’s input context:
#!/bin/bash
# PreToolUse hook receives the tool call payload on stdin
PAYLOAD=$(cat)
# Extract the file path and content from the JSON payload
FILE_PATH=$(printf '%s' "$PAYLOAD" | jq -r '.file_path // empty')
CONTENT=$(printf '%s' "$PAYLOAD" | jq -r '.content // empty')
# Targeted check: only scan the specific content being written
if [ -n "$CONTENT" ]; then
    # Any non-zero exit from gitleaks (findings or a scanner failure) blocks the write
    if ! printf '%s\n' "$CONTENT" | gitleaks detect --pipe --no-banner 2>/dev/null; then
        echo "BLOCKED: Secret detected in content being written to $FILE_PATH" >&2
        exit 2
    fi
fi
exit 0
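One subtlety in scripts like the one above: if the payload does not have the expected shape, the extracted content is empty and the check is skipped without any signal. A minimal sketch of a stricter decision function follows, assuming the hook is registered only for write-style calls and that the payload carries a top-level content field as in the script above. The real payload shape should be confirmed against the hook documentation for your version; decide is a hypothetical name.

```python
import json
from typing import Callable

ALLOW, BLOCK = 0, 2


def decide(raw: str, contains_secret: Callable[[str], bool]) -> int:
    """Return an exit code for a write payload, refusing when it cannot be read."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return BLOCK  # unreadable payload: fail closed rather than skip the check
    if not isinstance(payload, dict) or "content" not in payload:
        return BLOCK  # expected field missing: the shape assumption is wrong
    return BLOCK if contains_secret(str(payload["content"])) else ALLOW
```

Blocking on a missing field is a deliberate choice for "must never" rules. It would be too strict for a hook that also fires on reads or shell commands, which is another reason to filter by tool before checking content.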
Gotcha 7: Assuming Hooks Work in the API
The mistake: Writing hook logic assuming it will intercept tool calls when using the Claude/OpenAI API directly (not through Claude Code or Codex CLI).
Why it breaks: Hooks are a feature of the client tool (Codex CLI or Claude Code), not the underlying API. The API has no interception layer — tool calls are returned in the response, and your application code decides what to do with them. There is no PreToolUse event at the API level.
If you are building a custom agent harness on the raw API, you need to implement your own tool call validation in your application code — typically as a middleware layer between the API response parser and the tool executor:
# Custom harness equivalent of PreToolUse
# contains_secrets() and actually_execute() are placeholders for your own code
def execute_tool_call(tool_call):
    # Your enforcement logic here (equivalent to hook scripts)
    if tool_call.name == "write_file":
        if contains_secrets(tool_call.arguments["content"]):
            return {"is_error": True, "content": "BLOCKED: secret detected"}
    # Proceed with execution
    return actually_execute(tool_call)
The principle: Hooks are a Codex CLI / Claude Code convenience layer over a pattern you would otherwise implement manually. If you move to a custom harness, you must rebuild the enforcement — it does not come free from the API.
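The snippet above leaves contains_secrets and actually_execute undefined. A self-contained version might look like the sketch below. The ToolCall dataclass, the injected executor, and the regex (borrowed from the earlier shell script) are illustrative assumptions, not part of any API or SDK.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict

SECRET_RE = re.compile(r"AKIA[A-Z0-9]{16}|ghp_[a-zA-Z0-9]{36}|sk-[a-zA-Z0-9]{48}")


@dataclass
class ToolCall:
    name: str
    arguments: Dict[str, Any]


def contains_secrets(text: str) -> bool:
    """True if the text contains a secret-like token."""
    return SECRET_RE.search(text) is not None


def execute_tool_call(
    tool_call: ToolCall,
    executor: Callable[[ToolCall], Dict[str, Any]],
) -> Dict[str, Any]:
    """Validate first, execute second: the harness-level equivalent of PreToolUse."""
    if tool_call.name == "write_file":
        content = str(tool_call.arguments.get("content", ""))
        if contains_secrets(content):
            # The executor is never reached, so nothing touches the filesystem.
            return {"is_error": True, "content": "BLOCKED: secret detected"}
    return executor(tool_call)
```

Passing the executor in as a parameter keeps the validation step testable on its own: a test can supply a fake executor and assert that it was never called for a blocked write.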
The Meta-Pattern: Enforcement vs Advisory
Every hook configuration decision reduces to one question: Is this a guarantee or a suggestion?
| Requirement | Enforcement type | Hook approach |
|---|---|---|
| “Must never contain secrets” | Guarantee | PreToolUse, command, exit 2 |
| “Should follow naming conventions” | Advisory (strong) | PostToolUse, command, warning in output |
| “Might want to consider shorter methods” | Advisory (soft) | No hook — put it in AGENTS.md |
| “Must pass all tests before commit” | Guarantee | PreToolUse on commit-like operations, command, exit 2 |
| “Should generate documentation” | Advisory | Prompt in AGENTS.md |
If you blur this boundary — using prompt-based enforcement for guarantees, or using deterministic hooks for soft suggestions — you either get false security (thinking something is blocked when it is not) or frustrated agents (blocked from reasonable actions by overly rigid gates).
Hooks are the closest thing to a type system for agent behaviour. Use them like one: strict where it matters, absent where it doesn’t.
References
- OpenAI, “Hooks — Codex CLI,” developers.openai.com, 2026. PreToolUse/PostToolUse event model, exit code semantics, timeout handling.
- Anthropic, “Claude Code Hooks,” code.claude.com, 2026. The same hook model in Claude Code — confirms the shared enforcement pattern.
- CCA-F exam anti-patterns (Domain 1, Domain 3). “Using prompts for guaranteed compliance” is a confirmed distractor; the correct answer is always deterministic enforcement via hooks.