The Right 300 Tokens Beat 100k Noisy Ones

Devoxx UK 2026 · Video coming soon
A presentation by Baruch Sadogursky at Devoxx UK, May 2026, London, UK

Abstract

Your agent has 100k tokens of context. It still forgets what you told it two messages ago. Prompt engineering taught us to craft the perfect instruction. Context engineering asks a different question: what does your model need to see, and what should it never see at all? It’s the shift from writing prompts to designing context. In this talk, we dissect four antipatterns killing your agents and the architectural fixes that actually work:

  • The Stuffed Prompt – You crammed everything upfront and hoped for the best. Static context doesn’t scale. We explore dynamic loading and context refinement: fetching what’s needed when it’s needed.
  • The Wrong Tool for the Job – You picked one retrieval method and used it everywhere. RAG isn’t always the answer. We break down four wrong tools (similarity for correctness, static docs for process, LLM for determinism, scripts for reasoning) and the four right tools that match them.
  • The Goldfish Agent – Your agent forgets everything between sessions. Built-in memory gives you no versioning, no backup, and no portability across agents. We explore external memory you control.
  • The Vibes Eval – You shipped because it “felt right.” We build eval strategies that prove your context choices work or expose the tokens you’re wasting – and the things to watch out for: bleeding, leaking, and negative scenarios.

We use a coding agent to explain these patterns so you learn how they work under the hood – but everything also applies to AI agents in general.

Resources

Talk Materials

  • Context Engineering

Tools Used in Demos

  • Claude Code – The coding agent used in all demos
  • Pidge Plugin – Context plugin for the pidge notification library used in demos
  • presenterm – Terminal-based presentation tool

Mentioned in the Talk

Cheat Sheet

| Antipattern | Fix |
| --- | --- |
| Stuffed Prompt | Skills. Lazy-load context on demand. Treat them as context artifacts (sketch below). |
| Wrong Tool | Four wrong tools, four right tools. Skills route to docs, scripts, rules. |
| Goldfish Agent | External memory you control. Versioned, backed up, portable across agents. |
| Vibes Eval | LLM generates rubrics. You review (watch for bleeding, leaking, negative scenarios). LLM judges. You assess the delta. |
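
A minimal sketch of the lazy-loading move behind the Stuffed Prompt fix (Python; the skills/ layout and file names are hypothetical, not Claude Code’s actual skill format): keep a cheap index of one-line skill descriptions in context, and read a skill’s full instructions from disk only when a task calls for it.

```python
from pathlib import Path

# Hypothetical layout: skills/<name>/DESCRIPTION.txt + skills/<name>/SKILL.md
SKILLS_DIR = Path("skills")

def skill_index() -> str:
    """Cheap, always-in-context part: one line per skill."""
    lines = [
        f"- {skill.name}: {(skill / 'DESCRIPTION.txt').read_text().strip()}"
        for skill in sorted(SKILLS_DIR.iterdir())
        if skill.is_dir()
    ]
    return "Available skills (load on demand):\n" + "\n".join(lines)

def load_skill(name: str) -> str:
    """Expensive part: loaded only for the one skill the task needs."""
    return (SKILLS_DIR / name / "SKILL.md").read_text()
```

The index costs tens of tokens per skill; the full instructions cost thousands. Only the skill the task actually needs ever enters the context window – the right 300 tokens instead of 100k noisy ones.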

The Four Wrong Tools

| Wrong tool | Why it fails |
| --- | --- |
| Similarity when you need correctness | RAG returns “topically similar” results; it’s version-blind |
| Static docs when you need process | Context7-style services give you reference, not how-to |
| LLM reasoning when determinism would do | Re-deriving the same math every turn, slightly differently (sketch below) |
| Script when you need reasoning | The regex trap: open-ended input, brittle code |
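
To make the third wrong tool concrete, a hedged sketch (the function and numbers are illustrative, not from the talk): asking the model to figure out how much context room is left re-derives the same arithmetic every turn, slightly differently. A pure function answers it the same way every time, for zero tokens.

```python
def remaining_budget(context_window: int, system_tokens: int,
                     history_tokens: int, reserve_for_output: int) -> int:
    """Pure function: same inputs, same answer, every turn."""
    return max(context_window - system_tokens - history_tokens - reserve_for_output, 0)

# Deterministic: 100_000 - 2_000 - 60_000 - 4_000 == 34_000, always
print(remaining_budget(100_000, 2_000, 60_000, 4_000))
```

The fourth row is the same mistake inverted: point a script like this at open-ended, ambiguous input and you get the regex trap.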

The Four Right Tools

| Right tool | When to use it |
| --- | --- |
| Versioned docs | When you need correctness: exact version, exact API |
| Skills | When you need process: procedural how-to wrapped around docs |
| Scripts | When you need determinism: pure functions, deterministic output |
| Reasoning | When you need judgment: open-ended classification, ambiguous input |
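
Read together, the two tables form a dispatch rule. A minimal sketch of that routing (the enum and stub handlers are hypothetical, added for illustration):

```python
from enum import Enum, auto

class Need(Enum):
    CORRECTNESS = auto()   # exact version, exact API
    PROCESS = auto()       # procedural how-to
    DETERMINISM = auto()   # pure computation
    JUDGMENT = auto()      # ambiguous, open-ended input

# Stub handlers standing in for the four right tools.
def lookup_versioned_docs(q: str) -> str: return f"docs[pinned version] for {q!r}"
def run_skill(q: str) -> str: return f"skill steps for {q!r}"
def run_script(q: str) -> str: return f"script output for {q!r}"
def ask_model(q: str) -> str: return f"model judgment on {q!r}"

ROUTES = {
    Need.CORRECTNESS: lookup_versioned_docs,  # not similarity search
    Need.PROCESS: run_skill,                  # not a static reference page
    Need.DETERMINISM: run_script,             # not re-derived by the LLM
    Need.JUDGMENT: ask_model,                 # reasoning earns its tokens here
}

def route(need: Need, request: str) -> str:
    return ROUTES[need](request)

print(route(Need.DETERMINISM, "compute the remaining token budget"))
```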

Context Artifact

The shippable bundle: Docs + Skills + Scripts + Rules.

Versioned. Tested. Distributed.
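
One hypothetical layout for such a bundle (directory names are illustrative, not a format prescribed by the talk):

```
context-artifact/
├── docs/      # versioned API docs, pinned to the release you ship against
├── skills/    # procedural how-tos that route into docs, scripts, and rules
├── scripts/   # deterministic helpers: pure functions, unit-tested
├── rules/     # always-on constraints for the agent
└── VERSION    # the bundle itself is versioned, tested, and distributed
```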