- The Factuality Ladder
  LLMs make stuff up. You can't fix that. But you can make the lies predictable, detectable, and blocked from spreading. Five rungs, one meta-pattern, no fine-tu…
- Building ClawClamp: Autonomous AI Agents Without Losing Sleep
  OpenClaw has 512 known vulnerabilities and wants access to your email, calendar, and code. I built a containment harness that starts with zero access and kills…
- Your AI Tools Are Lying to You (And Each Other)
  I caught three AI models fabricating security reports, complete with real CVE numbers, for vulnerabilities they never looked up. 686 trials, nine models, and…
- Structure Beats Scale
  I let a model that costs a tenth of a cent per call review code from a model that costs twelve cents. Across 3,121 runs, the cheap reviewer made the expensive g…
- The Only Limit Is Noticing
  I built a compliance-aware Kubernetes scheduler in a day. The hard part wasn't the code. It was seeing that the problem existed.
- What I Learned Running 19 Sprints With AI Agents
  Nine more sprints, a third project, and five falsification experiments later, I can tell you what actually matters. The answer is simpler, and more uncomfortab…
- Today I Stopped Babysitting AI Agent Teams and Started Coaching Them
  I built an encrypted task manager with AI agent teams. I wrote about it. People seemed interested in the workflow: the skills, the waves…
- How I Let Robots Build My Encryption App
  And how it got me excited to be a Product Manager again
- What I write with
  When I think, I write. Most often with a pen, staining a piece of paper with ink. I have amassed a collection of pens, inks, and paper with…

Stevo's Projects
Stevo's Work
Here's the stuff without NDAs.