Patrick Debois Formalizes Context Engineering's CI/CD Moment
Patrick Debois — the practitioner who coined "DevOps" in 2009 — presented the Context Development Lifecycle (CDLC) at AI Engineer Europe 2026, arguing that software delivery has completed a full shift: context is now the primary engineering artifact, and it demands the same discipline code received under DevOps. The talk arrives as the Agent Trace Spec v0.1.0 — an RFC backed by Cursor, Cognition, Cloudflare, and Vercel — establishes a multi-vendor standard for agent execution logs.
What the Source Actually Says
The CDLC is a four-phase infinity loop mirroring the SDLC: Generate (making implicit knowledge explicit, from simple prompts through spec-driven development), Evaluate (TDD for context), Distribute (context as a versioned package), and Observe (learning from production agent behaviour). Debois is explicit that the DevOps parallel is not decorative. He asks "what if context had the rigor of code?", and he speaks from direct experience, having lived through the analogous inflection point in software delivery.
The Evaluate phase carries the most immediately actionable detail. Debois proposes a context testing ladder: linting checks prompt syntax; a Grammarly-style LLM reviewer scores legibility and completeness (Tessl's eval UI scored a Terraform skill at 75% Discovery, 2/3 Specificity, 3/3 Completeness); LLM-as-Judge runs unit-style assertions (does generated code follow a CLAUDE.md naming convention?); and LLM-as-Judge with tools executes live E2E tests — the judge issues a real curl, confirms HTTP 200 OK, and returns a PASS verdict. For non-deterministic evals, the prescription is practical: run five or more trials, define error budgets rather than exact pass/fail thresholds, and treat production failures as the richest source of new test cases. "Vendor metrics lie — define your own."
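The error-budget idea above can be sketched in a few lines of Python. This is a minimal illustration, not Debois's or Tessl's implementation: the names `EvalResult` and `run_eval`, the default of five trials, and the 20% budget are all assumptions chosen for the example, and `check` stands in for whatever judge call a team actually uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalResult:
    """Outcome of repeating one non-deterministic eval several times."""
    trials: int
    failures: int
    error_budget: float  # maximum tolerated failure rate, e.g. 0.2 for 20%

    @property
    def failure_rate(self) -> float:
        return self.failures / self.trials

    @property
    def within_budget(self) -> bool:
        return self.failure_rate <= self.error_budget


def run_eval(check: Callable[[], bool], trials: int = 5,
             error_budget: float = 0.2) -> EvalResult:
    """Run `check` `trials` times and compare the failure rate to the budget.

    `check` is a hypothetical callable that returns True when one trial
    passes (for example, an LLM-as-Judge verdict reduced to a boolean).
    """
    if trials < 1:
        raise ValueError("trials must be at least 1")
    if not 0.0 <= error_budget <= 1.0:
        raise ValueError("error_budget must be between 0 and 1")
    failures = sum(1 for _ in range(trials) if not check())
    return EvalResult(trials=trials, failures=failures, error_budget=error_budget)


if __name__ == "__main__":
    # Scripted verdicts stand in for a real judge so the demo is repeatable.
    verdicts = iter([True, True, False, True, True])
    result = run_eval(lambda: next(verdicts), trials=5, error_budget=0.4)
    print(result.failures, result.failure_rate, result.within_budget)
```

The design choice worth noting is that a single failed trial does not fail the eval; only exceeding the agreed budget does, which is what makes the approach workable for non-deterministic output.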
On distribution, a three-stage maturity arc emerges: a committed SKILL.md in git (zero friction), a versioned installable package with bundled evals (tessl install acme/skill@1.2.0), and a searchable registry with security scanning. Snyk has already built a skill scanner detecting prompt injection and credential handling issues across 9 checks. Debois is candid that current public registries are early-stage — "99.9% of skills is crap" — and flags context dependency hell (package conflicts analogous to npm) as the next emerging problem. The Observe loop closes via agent logs: the Agent Trace Spec provides a common format so "missing context" patterns are searchable across tools, not siloed per vendor.
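To make the "context dependency hell" concern concrete, here is a rough sketch of detecting conflicting skill pins. It assumes pins use the `name@version` shape seen in `acme/skill@1.2.0`; the manifest structure, the requester names, and the functions `parse_pin` and `find_conflicts` are hypothetical, and the check does exact version matching only rather than any real resolver's range semantics.

```python
from collections import defaultdict


def parse_pin(pin: str) -> tuple[str, str]:
    """Split a pin such as 'acme/skill@1.2.0' into (name, version)."""
    name, sep, version = pin.rpartition("@")
    if not sep or not name or not version:
        raise ValueError(f"expected name@version, got {pin!r}")
    return name, version


def find_conflicts(manifests: dict[str, list[str]]) -> dict[str, dict[str, list[str]]]:
    """Return skills that are pinned to more than one version.

    `manifests` maps a requester (a repo, team, or parent skill; the shape
    is an assumption for this sketch) to the list of pins it declares.
    The result maps skill name -> version -> requesters wanting that version.
    """
    wanted: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for requester, pins in manifests.items():
        for pin in pins:
            name, version = parse_pin(pin)
            wanted[name][version].append(requester)
    return {
        name: {version: sorted(reqs) for version, reqs in versions.items()}
        for name, versions in wanted.items()
        if len(versions) > 1  # more than one distinct version requested
    }


if __name__ == "__main__":
    manifests = {
        "team-a": ["acme/skill@1.2.0"],
        "team-b": ["acme/skill@2.0.0", "acme/lint@0.3.1"],
    }
    print(find_conflicts(manifests))
```

A check like this could run at install time or in CI, so that two teams pulling incompatible versions of the same skill find out before an agent does.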
Strategic Take
The error-budget eval model and the Agent Trace Spec are the two pieces with immediate leverage for teams already running coding agents. Define your own eval thresholds before your context estate grows unmanageable. With eight companies aligned on the Spec (Cursor, Cognition, Cloudflare, and Vercel among them), it is well positioned to become the observability baseline, and instrumenting now avoids retroactive compliance later.

