Never Trust a Monkey: The Chasm, the Craft, and the Chain of AI-Assisted Code
Abstract
We’re in the middle of another leap in abstraction. Like compilers, cloud, and containers before it, AI coding agents arrived with hype, fear, and broken assumptions. We gave the monkeys GPUs. Sometimes they output Shakespeare. Other times, they confidently ship code that compiles, passes tests, and still does the wrong thing. The problem is the gap between what we mean and what actually runs. This talk delivers a practical framework for working with AI agents, built on three ideas: the Chasm between human intent and the code that actually runs, the Context that replaces guessing with grounding (APIs, conventions, constraints, domain rules), and the Chain that keeps intent alive through a structured flow from prompt to spec to test to code, where every step produces a verifiable artifact validated externally. Through interactive demonstrations and honest war stories, we’ll trace how intent gets lost and build the guardrails that prevent it. You’ll leave with a working model for AI-assisted development where humans own the meaning and machines do the typing. Trust your context. Trust your guardrails. Never trust a monkey.
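The prompt → spec → test → code chain can be sketched in miniature. The feature, the Gherkin-style spec, and the `order_total` function below are all hypothetical stand-ins, not part of the talk's framework; the point is only that each step leaves behind an artifact the next step is checked against:

```python
# Step 1: human intent, captured as a Gherkin-style spec
# (hypothetical feature, for illustration only).
SPEC = """
Feature: Bulk discount
  Scenario: Orders of 10 or more items get 10% off
    Given an order of 10 items at 5.00 each
    When the total is computed
    Then the total is 45.00
"""

# Step 2: a test derived from the spec BEFORE any code exists.
# This is the verifiable artifact that keeps the intent alive,
# validated externally (here, simply by running it).
def test_bulk_discount():
    total = order_total(quantity=10, unit_price=5.00)
    assert abs(total - 45.00) < 1e-9  # Then: 10% off 50.00

# Step 3: the implementation (agent-generated or not) is accepted
# only if the artifact from step 2 passes against it.
def order_total(quantity: int, unit_price: float) -> float:
    subtotal = quantity * unit_price
    discount = 0.10 if quantity >= 10 else 0.0
    return subtotal * (1 - discount)

test_bulk_discount()  # the chain holds only if this passes
```

The ordering is the whole trick: the test is authored from the spec, not from the code, so an implementation that "compiles, passes tests, and still does the wrong thing" has one fewer place to hide.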
Resources
Research — AI code quality
- CodeRabbit: State of AI vs Human Code Generation (December 2025) — 1.7× more issues in AI PRs; +75% logic errors; 8× performance problems (470 real PRs)
- Sonar: State of Code Developer Survey (January 2026) — 96% don’t fully trust AI output; only 48% always verify; AI = 42% of committed code
- Stack Overflow 2025 Developer Survey (AI section) — 84% use AI; 46% actively distrust accuracy; 66% frustrated by “almost-right” AI code
- Stack Overflow 2026 follow-up: Mind the Gap — the AI trust gap widens as adoption rises
- Qodo: 2025 State of AI Code Quality — 60% say AI misses critical context; 1-in-5 suggestions contain factual errors
- Sonar: Assessing the Quality and Security of AI-Generated Code (arXiv 2508.14727) — no correlation between Pass@1 test performance and overall code quality
- Apiiro: Faster code, greater risks — 322% more privilege-escalation paths in AI code
- METR: Task-Completion Time Horizons of Frontier AI Models — the “Moore’s Law for AI agents” chart
The Intent Integrity Chain
- Intent Integrity Kit (IIKit) — GitHub — the framework, open-source
- Tessl.io — make agents work in real codebases
Spec-driven development
- Martin Fowler: Understanding Spec-Driven Development
- ThoughtWorks: Spec-Driven Development
- GitHub Spec-Kit
- OpenSpec
- Amazon Kiro — Spec-Driven AI IDE
- Andrej Karpathy on Spec-Driven Development
Foundations
- Curse of Knowledge (Wikipedia)
- Test-Driven Development (Wikipedia)
- Martin Fowler: Given-When-Then
- Behavior-Driven Development (Wikipedia)
- Gherkin Specs (Cucumber)
Baruch’s books
One more thing
- AINative DevCon London — June 1–2, 2026. Use code BARUCH50 for 50% off