Experiment
Layered Convergence
Can a specification-first methodology, iterated through build-score-revise trials, converge across 10 full-stack quality layers?
Hypothesis
Convergence Engineering Development (CED) demonstrated convergence at the backend layer (Layer 0) across 10 trials. But backend API quality is only one dimension of production software. The hypothesis: the same specification-first, build-score-revise methodology can converge across all quality layers of a full-stack application.
Convergence means that the methodology has absorbed enough failure modes in a given domain that new trials produce zero new findings. Each layer introduces a distinct quality domain — from integration testing to cross-layer integration — and must converge independently before the next layer begins.
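The convergence criterion above can be sketched as a simple check. This is an illustrative sketch, not the experiment's actual tooling: the `Trial` shape and the idea that convergence requires a fixed number of consecutive clean trials (`required`) are assumptions drawn from the protocol description.

```typescript
// Hypothetical trial record: the failure modes first discovered in that trial.
type Trial = { id: number; newFailureModes: string[] };

// A layer has converged when the most recent `required` trials each
// produced zero new failure modes. The exact value of `required` is
// an assumption; the page notes some layers used a relaxed count.
function hasConverged(trials: Trial[], required: number): boolean {
  if (trials.length < required) return false;
  return trials
    .slice(-required)
    .every((t) => t.newFailureModes.length === 0);
}
```

Under this sketch, a layer that discovers a failure mode in its latest trial resets the clock: the clean-trial window must be rebuilt from scratch.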
Method
Ten progressive layers, each targeting a distinct quality domain. For each layer, three enterprise applications are built in different business domains (event management, booking, telehealth) and scored across multiple dimensions, with failure modes tracked throughout. Each discovered failure mode is fed back into the methodology and must remain absent from all subsequent trials.
Layers are sequential — each must converge before the next begins. This trades speed for rigor: rather than testing everything at once, each domain receives focused attention until no new failure modes emerge.
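The feedback loop described above implies a regression invariant: once a failure mode has been discovered and absorbed into the methodology, it must never reappear in a later trial. A minimal sketch of that check, with hypothetical trial records and failure-mode identifiers:

```typescript
// Hypothetical record of which failure modes were observed in each trial.
type TrialObservations = { trial: number; observed: Set<string> };

// Returns failure modes that reappeared after the trial in which they
// were first discovered -- a protocol violation if non-empty.
function findRegressions(history: TrialObservations[]): string[] {
  const firstSeen = new Map<string, number>();
  const regressions = new Set<string>();
  for (const { trial, observed } of history) {
    for (const fm of observed) {
      const first = firstSeen.get(fm);
      if (first === undefined) {
        firstSeen.set(fm, trial); // first discovery: absorbed into methodology
      } else if (trial > first) {
        regressions.add(fm); // reappeared after absorption
      }
    }
  }
  return [...regressions];
}
```

An empty result means every discovered failure mode stayed fixed, which is the invariant the layered protocol enforces before declaring convergence.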
Results
- 44 trials
- 132 applications
- 102 failure modes
- 10/10 layers converged
All 10 layers converged. The methodology progressed from build-breaking structural failures in Trial 1 to zero new findings across all quality domains. Scores dipped when new layers introduced fresh domains — each layer expanded the scoring criteria to match its domain — then recovered as failure modes were absorbed.
Charts: failure-mode discovery (per-trial and cumulative), score trajectory, new failures per trial, and severity tier progression.
Layer Progression
Each layer added a distinct quality domain. Layers converged sequentially — each had to reach zero new failure modes before the next began.
| Layer | Name | Domain | Trials | Failure Modes | Status |
|---|---|---|---|---|---|
| 0 | Backend API | NestJS + Prisma + PostgreSQL | 1-10 | 34 | Converged |
| 1 | Integration Testing | End-to-end test coverage | 11-14 | 4 | Converged |
| 2 | Frontend | Next.js + React | 15-20 | 24 | Converged |
| 3 | Specifications | Structured spec framework | 21-27 | 8 | Converged |
| 4 | Infrastructure | Docker + CI/CD | 28-31 | 3 | Converged |
| 5 | Monorepo | Turborepo + pnpm workspaces | 32-35 | 1 | Converged |
| 6 | Security | Auth, validation, OWASP | 36-37 | 15 | Converged |
| 7 | Performance | Caching, optimization | 38-39 | 3 | Converged |
| 8 | Monitoring | Logging, health, observability | 40-42 | 5 | Converged |
| 9 | Cross-Layer Integration | Full-stack integration | 43-44 | 4 | Converged |
Transparency
Honest accounting of where the experiment deviated from ideal conditions, its known limitations, and how the protocol was corrected during the research.
Protocol Relaxation
Relaxed convergence criteria
Layers 6 (Security), 7 (Performance), and 9 (Cross-Layer Integration) converged with relaxed criteria — fewer consecutive clean trials than the standard protocol required. The methodology was still applied in full, but convergence was declared earlier than the baseline protocol would have allowed.
Known Limitations
No runtime execution
Verification was static analysis and code review, not live database or API testing. Generated applications were not deployed or executed against real infrastructure.
Single AI system
All building, scoring, and auditing was performed by Claude instances. There was no independent third-party verification or cross-model validation.
Tech stack specificity
Only the NestJS + Next.js + Prisma + PostgreSQL stack was validated. Results may not transfer to other frameworks, languages, or database systems.
Self-Correction
Original run invalidated
An original T15-T49 run was invalidated after scientific review found structural gaps in the experimental protocol. The entire layered convergence sequence (Layers 1-9) was re-run with a corrected protocol.
- No layer progression: trials did not converge each layer before starting the next
- Copied trials: duplicate trial structures across layers
- Self-assessment bias: scoring was not sufficiently independent from generation
- Suspicious regularity: scores showed patterns inconsistent with genuine discovery
- No cross-layer integration: layers were tested in isolation without verifying interactions
- Pre-allocated trial ranges: trial numbers were assigned before results were known
Corrected protocol
The re-run addressed all six issues. This self-correction is documented as evidence of scientific rigor — the willingness to invalidate results and re-run with a better protocol is more valuable than getting it right the first time.
Data Access
All trial data is open source. Every convergence point, score, and failure mode count referenced on this page can be verified in the repository.