Research Journal
Documenting the research as it happens — experiments, convergence data, and what I'm learning about making AI-generated software reliable.
·9 min
Two Roads to Deployment
A 19K-line pipeline failed 13 times. An 833-line command succeeded in 3.5 hours. Both approaches continue.
architecture · local-models · cost-analysis · forge
·8 min
The Measurement Problem
74 trials taught me that measuring AI code quality is at least as hard as generating it.
reflection · methodology · scoring