Methodology
Convergence Engineering Development (CED)
A specification-first methodology for producing systematically reliable AI-generated software, validated through controlled experiments.
Overview
Convergence Engineering Development (CED) is a specification-first methodology for producing enterprise-quality software using AI code generation. Instead of generating code and hoping it works, CED specifies contracts, interfaces, and integration points first — then generates implementations that must satisfy those specifications. Every failure mode discovered during verification is fed back into the methodology and permanently eliminated.
CED evolved from what was originally called Spec-Driven Development (SDD). The rename reflects the methodology's core mechanism: iterative convergence through specification, generation, scoring, and revision — not just specification-first development, but a closed-loop system that converges toward zero failure modes.
Evolution
The methodology has evolved through four phases of experimentation:
| Phase | Scope | Result |
|---|---|---|
| Phase 0 | Layer 0 backend convergence (NestJS + Prisma + PostgreSQL) | 10 trials, 11 methodology iterations, 34 failure modes, full convergence |
| Layered Convergence | 10 full-stack layers (backend through cross-layer integration) | 44 trials, 102 failure modes, all 10 layers converged |
| Discrete Convergence | Replace LLM scoring with deterministic tools across 24 dimensions | 28 trials, 64 failure modes, all 5 phases converged |
| Normative Convergence | 5 epistemic layers, 40 dimensions mapped to ISO/IEC 25010:2023 | 2 trials, 9 scorer bugs found — Goodhart's Law question open |
| Two Roads to Deployment | Pipeline (19K LOC, 8 phases) vs agent loop (833 new lines, model cascade) | 19 trials, 43 failure modes — agent loop produced output; pipeline continues iterating |
Key milestones: the Layer 0 methodology reached its terminal iteration in March 2026. Layered convergence extended the methodology to the full stack. Discrete convergence replaced LLM-based scoring with deterministic tools. Normative convergence added epistemic depth — and surfaced a fundamental Goodhart's Law problem: when the scorer and the code co-evolve, convergence may reflect gaming of the metric rather than genuine quality.
Core Concepts
CED follows a structured phase progression from specification through verification:
| Phase | Name | Purpose |
|---|---|---|
| A | Specification | Define contracts, data models, and API surfaces before any code is generated |
| B–C4 | Implementation | Generate and wire implementations against the specification |
| C5 | Hardening | Systematic verification against the full convention set |
| C6 | Reporting | Score each application, identify new failure modes, revise methodology |
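The strict ordering above can be sketched as a gated pipeline. The phase identifiers mirror the table; the gate predicates and `run_gated` function are hypothetical stand-ins for CED's phase gate checklists, shown only to illustrate that no phase begins until the previous phase's gate passes.

```python
from typing import Callable

# Phase identifiers mirror the table above; gate predicates are
# illustrative stand-ins for CED's phase gate checklists.
PHASES = ("A:specification", "B-C4:implementation", "C5:hardening", "C6:reporting")


def run_gated(project: dict, gates: dict[str, Callable[[dict], bool]]):
    """Run phases strictly in order: a phase's gate must pass before
    the next phase starts. Returns (completed_phases, failed_phase)
    where failed_phase is None on a full run."""
    completed = []
    for phase in PHASES:
        if not gates[phase](project):
            return completed, phase  # halt: gate did not pass
        completed.append(phase)
    return completed, None
```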
Applications are scored across 6 equally-weighted dimensions. The methodology codifies 31 conventions and 28 anti-patterns, each discovered through a specific failure in an earlier trial. A 49-item reproducibility checklist ensures consistent application across projects.
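Equal weighting means each dimension contributes one sixth of the final score. A minimal sketch, assuming hypothetical dimension names (the real six are defined in the full methodology document):

```python
# Hypothetical dimension names; the actual six are defined in the
# full CED methodology document.
DIMENSIONS = ("correctness", "security", "performance",
              "maintainability", "test-coverage", "documentation")


def score_application(by_dimension: dict[str, float]) -> float:
    """Equal-weight aggregate: each dimension contributes 1/6 of the
    final 0-10 score. Refuses to score if any dimension is missing."""
    missing = set(DIMENSIONS) - by_dimension.keys()
    if missing:
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    return sum(by_dimension[d] for d in DIMENSIONS) / len(DIMENSIONS)
```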
Research Results
Four experiments across 93 trials. Layered convergence validated the build-score-revise cycle across 10 full-stack layers. Discrete convergence replaced LLM scoring with deterministic tools. Normative convergence added 5 epistemic layers mapped to ISO/IEC 25010:2023 across 40 dimensions — and surfaced a Goodhart's Law problem: the scorer and the code co-evolved, making it impossible to determine whether the 9.79/10.0 score reflects genuine quality or optimization to the metric.
The most recent experiment — Two Roads to Deployment — shifts from measurement to architecture. It compares a gated pipeline (19K lines, 8 phases) against a guided agent loop (833 new lines, model cascade) for AI code generation. The agent loop produced a working application with 99.7% local compute. Both approaches continue development.
Full results are on the experiments page.
Access
This page presents CED at a conceptual level. The full methodology document — including detailed convention definitions, the anti-pattern index, phase gate checklists, scoring criteria, and the 49-item reproducibility checklist — is available on request.
Contact Stephen Deslate to discuss access or collaboration.