Codex CLI for Consumer-Driven Contract Testing: Pact Generation, Provider Verification, and CI Contract Gates

Consumer-driven contract testing solves one of the thorniest problems in microservice architectures: how do you know your services are compatible before deploying them together? The Pact framework [1] has long been the industry standard for this, but writing and maintaining consumer contracts across dozens of services is tedious, error-prone work. Codex CLI turns this into an agent-driven workflow: generating consumer tests from existing integration patterns, running provider verification in non-interactive pipelines, and enforcing contract hygiene through CI gates.
This article covers Pact V4 specification contracts [2] with pact-js and pact-jvm, using Codex CLI v0.130+ [3] with GPT-5.5 [4] as the recommended model for complex code generation tasks.
Why Contract Testing Needs Agent Assistance
Traditional integration tests verify compatibility by running services together. Contract tests invert this: the consumer declares what it expects from a provider, and each side verifies independently [1]. The approach scales far better, but introduces its own maintenance burden:
- Consumer test authorship requires understanding the provider’s API surface and encoding expectations as Pact interactions.
- Provider verification must run against every registered consumer contract on every change.
- Contract drift accumulates when teams add endpoints or change payloads without updating contracts.
Codex CLI addresses each of these by reading your existing consumer code, inferring the API calls being made, and generating Pact interactions that match your actual usage patterns rather than aspirational API documentation.
Encoding Contract Testing Conventions in AGENTS.md
Before generating any contracts, encode your team’s conventions so Codex produces consistent output. AGENTS.md files are discovered from the Git root to the working directory, with closer files overriding earlier guidance [5].
```markdown
<!-- services/order-service/AGENTS.md -->
# Contract Testing Conventions

## Pact Configuration
- Use Pact V4 specification for all new contracts
- Consumer names follow the pattern: `{service-name}-consumer`
- Provider names follow the pattern: `{service-name}-provider`
- Pact files are written to `pacts/` directory at project root
- All interactions MUST include provider states

## Interaction Naming
- Format: `a request to {HTTP method} {resource} {condition}`
- Example: `a request to GET /orders when orders exist`

## Prohibited Patterns
- NEVER use `like()` matchers on ID fields; use `uuid()` instead
- NEVER hardcode base URLs in consumer tests
- NEVER skip provider state setup in verification

## Test Structure
- One Pact test file per provider dependency
- Group interactions by resource path
- Include both happy-path and error-case interactions
```
Place a project-level AGENTS.md at the Git root for cross-service rules, and service-level files for specific provider relationships [5].
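Conventions like these are easier to keep when they can also be checked mechanically. The following is a minimal sketch, not part of Pact or Codex: the helper names (`isValidDescription`, `isValidParticipant`, `findNamingViolations`) are hypothetical, and the regular expressions are one possible reading of the naming rules in the example AGENTS.md.

```typescript
// Hypothetical helpers that check names against the example AGENTS.md conventions.
const HTTP_METHODS = ["GET", "POST", "PUT", "PATCH", "DELETE"] as const;

// Convention: "a request to {HTTP method} {resource} {condition}"
const DESCRIPTION_PATTERN = new RegExp(
  `^a request to (${HTTP_METHODS.join("|")}) (/\\S*) (\\S.*)$`
);

function isValidDescription(description: string): boolean {
  return DESCRIPTION_PATTERN.test(description);
}

// Convention: "{service-name}-consumer" or "{service-name}-provider",
// with the service name assumed to be lowercase kebab-case.
function isValidParticipant(name: string, role: "consumer" | "provider"): boolean {
  return new RegExp(`^[a-z0-9]+(-[a-z0-9]+)*-${role}$`).test(name);
}

interface PactNaming {
  consumer: string;
  provider: string;
  descriptions: string[];
}

// Returns one human-readable message per violated convention.
function findNamingViolations(pact: PactNaming): string[] {
  const violations: string[] = [];
  if (!isValidParticipant(pact.consumer, "consumer")) {
    violations.push(`consumer name "${pact.consumer}" does not end in -consumer`);
  }
  if (!isValidParticipant(pact.provider, "provider")) {
    violations.push(`provider name "${pact.provider}" does not end in -provider`);
  }
  for (const description of pact.descriptions) {
    if (!isValidDescription(description)) {
      violations.push(`interaction "${description}" does not follow the naming format`);
    }
  }
  return violations;
}
```

A check like this could run in a test or a pre-commit step, so that generated contracts are rejected early when they drift from the agreed naming.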
Generating Consumer Contracts with Codex CLI
Interactive Consumer Test Generation
Point Codex at your consumer service code and ask it to generate Pact tests from existing API client usage:
```bash
codex --model gpt-5.5 \
  "Analyse the API client in src/clients/payment-client.ts.
  Identify every HTTP call made to the payment service.
  Generate a Pact V4 consumer test file covering each interaction
  with appropriate matchers and provider states.
  Write the test to tests/contract/payment.consumer.pact.test.ts"
```
Codex reads the client code, identifies the HTTP methods, paths, request bodies, and expected response shapes, then generates Pact interactions using the V4 DSL [2]. Because it reads your actual code rather than an OpenAPI specification, the generated contracts reflect real usage, including the query parameters your code actually sends and the response fields it actually reads.
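To make "reflects real usage" concrete, here is a small dependency-free sketch of the idea. It is not the Pact DSL and not what Codex emits: the types `ObservedCall` and `InteractionDraft` and the function `draftInteraction` are hypothetical, and the sample payload is invented. It only shows how an expectation can be narrowed to the fields a consumer actually reads.

```typescript
// Hypothetical model of one HTTP call found in a client, and the draft
// expectation derived from it. Illustrative only; not the Pact DSL.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface ObservedCall {
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
  query: Record<string, string>; // parameters the client actually sends
  fieldsRead: string[];          // top-level response fields the client actually reads
}

interface InteractionDraft {
  description: string;
  method: ObservedCall["method"];
  path: string;
  query: Record<string, string>;
  responseBody: Record<string, Json>;
}

function draftInteraction(
  call: ObservedCall,
  sampleResponse: Record<string, Json>,
  condition: string
): InteractionDraft {
  // Keep only the fields the consumer reads, so the provider stays free
  // to change everything else without breaking the contract.
  const responseBody: Record<string, Json> = {};
  for (const field of call.fieldsRead) {
    if (field in sampleResponse) {
      responseBody[field] = sampleResponse[field];
    }
  }
  return {
    description: `a request to ${call.method} ${call.path} ${condition}`,
    method: call.method,
    path: call.path,
    query: call.query,
    responseBody,
  };
}

// Invented example: the provider returns four fields, the client reads two.
const draft = draftInteraction(
  { method: "GET", path: "/payments", query: { status: "settled" }, fieldsRead: ["id", "amount"] },
  { id: "3f2b8c1e-9d4a-4c6b-8e2f-1a7b5c9d0e3f", amount: 1250, currency: "EUR", internalAuditTrail: [] },
  "when settled payments exist"
);
```

In a real consumer test the retained fields would then be wrapped in matchers rather than compared as literal values.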
Structured Contract Audit with codex exec
For CI integration or batch auditing, use `codex exec` with `--output-schema` to produce machine-readable contract gap reports [6]:
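```json
{
  "type": "object",
  "properties": {
    "service": { "type": "string" },
    "provider_dependencies": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "provider": { "type": "string" },
          "endpoints_used": { "type": "integer" },
          "endpoints_covered": { "type": "integer" },
          "missing_interactions": {
            "type": "array",
            "items": { "type": "string" }
          }
        },
        "required": ["provider", "endpoints_used", "endpoints_covered", "missing_interactions"],
        "additionalProperties": false
      }
    }
  },
  "required": ["service", "provider_dependencies"],
  "additionalProperties": false
}
```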
```bash
codex exec \
  "Audit contract test coverage for the order service.
  Compare API client calls in src/clients/ against
  existing Pact tests in tests/contract/.
  Report coverage gaps." \
  --output-schema contract-audit-schema.json \
  -o contract-audit-report.json \
  --sandbox read-only
```
This produces a structured report identifying which provider endpoints lack contract coverage, enabling targeted remediation.
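A report in this shape is straightforward to consume from a script. The sketch below assumes the report follows the schema shown earlier; the function names (`coverageRatio`, `providersBelow`) and the sample data are hypothetical. It treats a provider with zero used endpoints as fully covered, which is a design choice rather than anything the schema dictates.

```typescript
// Types mirroring the audit schema shown earlier in this section.
interface ProviderDependency {
  provider: string;
  endpoints_used: number;
  endpoints_covered: number;
  missing_interactions: string[];
}

interface ContractAuditReport {
  service: string;
  provider_dependencies: ProviderDependency[];
}

// Coverage as a ratio in [0, 1]. A provider with no used endpoints is
// treated as fully covered (assumption) to avoid dividing by zero.
function coverageRatio(dep: ProviderDependency): number {
  return dep.endpoints_used === 0 ? 1 : dep.endpoints_covered / dep.endpoints_used;
}

// Providers whose coverage falls below the given threshold.
function providersBelow(report: ContractAuditReport, threshold: number): ProviderDependency[] {
  return report.provider_dependencies.filter((dep) => coverageRatio(dep) < threshold);
}

// Invented sample report, parsed the way a CI script would parse the output file.
const sampleReport = JSON.parse(`{
  "service": "order-service",
  "provider_dependencies": [
    { "provider": "payment-service-provider", "endpoints_used": 5, "endpoints_covered": 3,
      "missing_interactions": ["POST /refunds", "GET /payments/{id}"] },
    { "provider": "inventory-service-provider", "endpoints_used": 4, "endpoints_covered": 4,
      "missing_interactions": [] },
    { "provider": "notification-service-provider", "endpoints_used": 0, "endpoints_covered": 0,
      "missing_interactions": [] }
  ]
}`) as ContractAuditReport;

for (const dep of providersBelow(sampleReport, 0.8)) {
  console.log(`${dep.provider}: missing ${dep.missing_interactions.join(", ")}`);
}
```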
Provider Verification Workflows
Provider verification confirms that the provider’s implementation satisfies all registered consumer contracts. Codex CLI can automate the verification setup and diagnose failures.
Setting Up Verification with Provider States
Provider state handlers are the most error-prone part of contract testing. Codex can generate them from your existing test fixtures:
```bash
codex --model gpt-5.5 \
  "Read the Pact files in pacts/ directory.
  Extract all provider states referenced by consumers.
  Generate provider state handlers in
  tests/contract/provider-states.ts that use our existing
  test fixture factories in tests/fixtures/.
  Each state handler should set up the database via
  the repository layer, not by direct SQL insertion."
```
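For orientation, a generated handler file might look roughly like the sketch below. Everything in it is hypothetical: `OrderRepository` is an in-memory stand-in for a real repository layer, `buildOrder` stands in for a fixture factory, and the state names are borrowed from the naming example earlier. The returned object is a plain map from state name to async function, which is the general shape pact-js accepts for its `stateHandlers` verifier option; check the exact typing against the pact-js version in use.

```typescript
// In-memory stand-in for a repository layer (hypothetical).
interface Order {
  id: string;
  status: string;
}

class OrderRepository {
  private orders = new Map<string, Order>();

  async save(order: Order): Promise<void> {
    this.orders.set(order.id, order);
  }

  async clear(): Promise<void> {
    this.orders.clear();
  }

  async count(): Promise<number> {
    return this.orders.size;
  }
}

// Stand-in for an existing fixture factory (hypothetical).
function buildOrder(overrides: Partial<Order> = {}): Order {
  return { id: "3f2b8c1e-9d4a-4c6b-8e2f-1a7b5c9d0e3f", status: "PENDING", ...overrides };
}

type StateHandler = () => Promise<void>;

// One handler per provider state referenced by consumer contracts.
// Each handler resets state first so handlers do not depend on run order.
function createStateHandlers(repo: OrderRepository): Record<string, StateHandler> {
  return {
    "orders exist": async () => {
      await repo.clear();
      await repo.save(buildOrder());
    },
    "no orders exist": async () => {
      await repo.clear();
    },
  };
}
```

Going through the repository rather than raw SQL keeps the handlers valid when the schema changes, which is the reason the prompt asks for it.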
Non-Interactive Provider Verification in CI
Run provider verification as part of your CI pipeline using `codex exec` [6]:
```bash
codex exec \
  "Run pact provider verification for the payment service.
  Use the Pact files from the pact-broker at
  \$PACT_BROKER_URL with consumer version selectors
  for the main branch and deployed environments.
  If verification fails, analyse the failure output and
  suggest the minimal provider change needed to satisfy
  the contract." \
  --sandbox workspace-write \
  --model gpt-5.5
```
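The "consumer version selectors for the main branch and deployed environments" in that prompt could translate into verifier options along the following lines. This is a sketch under assumptions: the field names are modelled on pact-js `Verifier` options (`pactBrokerUrl`, `consumerVersionSelectors`, `publishVerificationResult`) and should be confirmed against the pact-js version in use, while `PROVIDER_BASE_URL`, the provider name, and `buildVerificationOptions` are hypothetical.

```typescript
// Local subset of verifier options; field names modelled on pact-js (assumption).
interface ConsumerVersionSelector {
  mainBranch?: boolean;
  deployedOrReleased?: boolean;
}

interface VerificationOptions {
  provider: string;
  providerBaseUrl: string;
  pactBrokerUrl: string;
  consumerVersionSelectors: ConsumerVersionSelector[];
  publishVerificationResult: boolean;
}

// Builds options from environment variables so no URL is hardcoded.
function buildVerificationOptions(
  env: Record<string, string | undefined>
): VerificationOptions {
  const brokerUrl = env.PACT_BROKER_URL;
  const providerBaseUrl = env.PROVIDER_BASE_URL; // hypothetical variable name
  if (!brokerUrl) throw new Error("PACT_BROKER_URL is not set");
  if (!providerBaseUrl) throw new Error("PROVIDER_BASE_URL is not set");

  return {
    provider: "payment-service-provider", // follows the example naming convention
    providerBaseUrl,
    pactBrokerUrl: brokerUrl,
    // Verify against contracts from each consumer's main branch and from
    // consumer versions currently deployed or released.
    consumerVersionSelectors: [{ mainBranch: true }, { deployedOrReleased: true }],
    // Only publish results from CI, never from a developer machine.
    publishVerificationResult: env.CI === "true",
  };
}
```

Publishing results only when `CI` is set keeps local experiments from polluting the broker's verification history.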
The Contract Testing Flow
```mermaid
flowchart TD
    A[Consumer Code Change] --> B[Codex Analyses API Clients]
    B --> C[Generate/Update Pact Consumer Tests]
    C --> D[Run Consumer Tests Locally]
    D --> E{Tests Pass?}
    E -->|No| F[Codex Diagnoses Matcher Issues]
    F --> C
    E -->|Yes| G[Publish Pact to Broker]
    G --> H[Trigger Provider Verification]
    H --> I[Codex Sets Up Provider States]
    I --> J[Run Provider Verification]
    J --> K{Verification Passes?}
    K -->|No| L[Codex Analyses Breaking Changes]
    L --> M[Generate Migration Suggestions]
    K -->|Yes| N[Deploy Consumer Safely]
```
Building a Reusable Contract Auditor Skill
Encode the contract audit workflow as a reusable Codex skill [7]:
```markdown
<!-- .codex/skills/contract-auditor/SKILL.md -->
# Contract Auditor Skill

## Purpose
Audit and maintain Pact consumer-driven contracts across services.

## Capabilities
1. **Coverage audit**: Compare API client calls against existing
   Pact interactions to find coverage gaps
2. **Contract generation**: Generate Pact V4 consumer tests from
   API client code
3. **Drift detection**: Compare Pact contracts against current
   provider OpenAPI specs to find drift
4. **Provider state scaffolding**: Generate provider state handlers
   from consumer Pact files and existing test fixtures

## Workflow
1. Read all files matching `src/clients/**/*.{ts,js,java,kt}`
2. Extract HTTP calls (method, path, request/response shapes)
3. Read existing Pact tests in `tests/contract/`
4. Identify uncovered interactions
5. Generate missing Pact tests following AGENTS.md conventions
6. Verify generated tests compile and pass

## Constraints
- Use Pact V4 specification only
- Apply matchers from `@pact-foundation/pact` MatchersV3
- Never generate contracts for internal-only endpoints
  marked with @internal JSDoc tag
```
Enforcing Contract Hygiene with Hooks
Use Codex hooks to enforce contract testing conventions on every agent interaction [8]. Configure a PostToolUse hook that validates generated test files:
```toml
# codex.toml
[hooks.post_tool_use.contract_lint]
event = "PostToolUse"
command = """
if echo "$CODEX_TOOL_RESULT" | grep -q "pact.test"; then
  npx pact-lint check tests/contract/ --spec v4 --strict
fi
"""
on_failure = "stop"
```
This ensures that any Pact test file Codex creates or modifies conforms to the V4 specification and passes your team’s linting rules before the agent continues.
CI Pipeline: Contract Gate
Integrate contract testing as a deployment gate in GitHub Actions [9]:
```yaml
# .github/workflows/contract-gate.yml
name: Contract Gate
on:
  pull_request:
    paths:
      - 'src/clients/**'
      - 'tests/contract/**'
jobs:
  contract-audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run contract coverage audit
        env:
          CODEX_API_KEY: $
        run: |
          npx @openai/codex exec \
            "Audit contract test coverage. Compare all API client
            calls in src/clients/ against Pact tests in
            tests/contract/. Fail if any provider dependency
            has less than 80% endpoint coverage." \
            --output-schema .codex/schemas/contract-audit.json \
            -o contract-report.json \
            --sandbox read-only \
            --model gpt-5.5
      - name: Check coverage threshold
        run: |
          node -e "
            const r = require('./contract-report.json');
            const failing = r.provider_dependencies
              .filter(d => d.endpoints_covered / d.endpoints_used < 0.8);
            if (failing.length > 0) {
              console.error('Contract coverage below 80%:', failing);
              process.exit(1);
            }
          "
      - name: Run consumer Pact tests
        run: npm run test:contract:consumer
      - name: Run provider verification
        run: npm run test:contract:provider
```
Bi-Directional Contract Testing with OpenAPI
For teams already maintaining OpenAPI specifications, Codex can bridge between specification-first and consumer-driven approaches. ⚠️ Bi-directional contract testing (BDCT) is a PactFlow commercial feature, not available in Pact OSS [10].
```bash
codex --model gpt-5.5 \
  "Compare the OpenAPI spec at docs/openapi.yaml against
  all consumer Pact files in pacts/.
  Identify any consumer expectations that exceed what
  the OpenAPI spec declares — these represent undocumented
  API usage that should either be added to the spec or
  removed from consumer code."
```
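This hybrid approach can catch both directions of drift: consumers relying on endpoints the specification does not document, which is what the prompt above asks for, and, if you extend the prompt to request it, specification endpoints that no consumer actually uses.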
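The comparison itself can be sketched as a plain set difference over method and path pairs. The code below is illustrative only and is not how PactFlow's BDCT works: `Endpoint`, `templateToRegex`, and `findDrift` are hypothetical names, and it assumes the OpenAPI paths and Pact request paths have already been extracted into simple lists. It matches only on method and path, not on request or response bodies.

```typescript
interface Endpoint {
  method: string;
  path: string;
}

// Turn an OpenAPI path template such as /orders/{id} into a regex that
// matches concrete request paths such as /orders/42.
function templateToRegex(template: string): RegExp {
  const pattern = template
    .split("/")
    .map((segment) =>
      /^\{[^}]+\}$/.test(segment)
        ? "[^/]+" // path parameter: any single segment
        : segment.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") // escape literal text
    )
    .join("/");
  return new RegExp(`^${pattern}$`);
}

function endpointMatches(declared: Endpoint, used: Endpoint): boolean {
  return (
    declared.method.toUpperCase() === used.method.toUpperCase() &&
    templateToRegex(declared.path).test(used.path)
  );
}

// undocumented: requests found in Pact files with no matching spec operation.
// unused: spec operations that no Pact interaction exercises.
function findDrift(spec: Endpoint[], pact: Endpoint[]) {
  const undocumented = pact.filter((used) => !spec.some((declared) => endpointMatches(declared, used)));
  const unused = spec.filter((declared) => !pact.some((used) => endpointMatches(declared, used)));
  return { undocumented, unused };
}

// Invented example data.
const drift = findDrift(
  [
    { method: "GET", path: "/orders" },
    { method: "GET", path: "/orders/{id}" },
    { method: "DELETE", path: "/orders/{id}" },
  ],
  [
    { method: "GET", path: "/orders" },
    { method: "GET", path: "/orders/42" },
    { method: "POST", path: "/orders" },
  ]
);
```

With this data, `drift.undocumented` holds the `POST /orders` request and `drift.unused` holds `DELETE /orders/{id}`.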
Model Selection for Contract Testing Tasks
| Task | Recommended Model | Rationale |
|---|---|---|
| Consumer test generation | GPT-5.5 [4] | Complex code analysis across multiple files |
| Contract audit (structured) | GPT-5.5 | Accurate cross-referencing of clients and tests |
| Provider state scaffolding | GPT-5.5 | Requires understanding test fixture patterns |
| Simple contract lint check | GPT-5.3-Codex-Spark [4] | Fast, low-cost verification task |
Anti-Patterns
- Generating contracts from OpenAPI specs alone: contracts should reflect actual consumer usage, not theoretical API surface. Use client code as the source of truth.
- Skipping provider states: stateless Pact interactions are brittle and misleading. Always specify the precondition.
- One massive Pact file per provider: split by resource or domain boundary for independent evolution.
- Trusting generated matchers without review: Codex may over-constrain responses with exact matchers where flexible matchers (`like()`, `eachLike()`) are appropriate. Always review generated matcher choices.
- Running contract generation with `--sandbox danger-full-access`: read-only or workspace-write is sufficient for contract test generation. Over-permissioned sandboxes risk unintended side effects.
Known Limitations
- `--output-schema` and `exec resume` cannot be combined: if your contract audit runs across multiple sessions, you must re-specify the schema each time [11].
- Context window constraints: services with very large API client surfaces (50+ endpoints per provider) may exceed the context window. Split audits by provider in these cases.
- Non-deterministic generation: Codex may produce slightly different Pact interactions on repeated runs. Pin generated contracts in version control and use Codex for updates, not regeneration.
- Sandbox network isolation: provider verification requires network access to the Pact Broker. Use `--sandbox danger-full-access` only for verification runs, not generation.
- Pact V4 plugin support: Codex can generate standard HTTP and message interactions but may not correctly configure Pact V4 plugin-based interactions for custom transport protocols [2].