The Right 300 Tokens Beat 100k Noisy Ones
Abstract
Your agent has 100k tokens of context. It still forgets what you told it two messages ago. Prompt engineering taught us to craft the perfect instruction. Context engineering asks a different question: what does your model need to see, and what should it never see at all? It’s the shift from writing prompts to designing context.

In this talk, we dissect four antipatterns killing your agents and the architectural fixes that actually work:

- The Stuffed Prompt – You crammed everything upfront and hoped for the best. Static context doesn’t scale. We explore dynamic loading and context refinement: fetching what’s needed when it’s needed.
- The Wrong Tool for the Job – You picked one retrieval method and used it everywhere. RAG isn’t always the answer. We break down four wrong tools (similarity for correctness, static docs for process, LLM for determinism, scripts for reasoning) and the four right tools that match them.
- The Goldfish Agent – Your agent forgets everything between sessions. Built-in memory gives you no versioning, no backup, and no portability across agents. We explore external memory you control.
- The Vibes Eval – You shipped because it “felt right.” We build eval strategies that prove your context choices work or expose the tokens you’re wasting – and the things to watch out for: bleeding, leaking, and negative scenarios.

We use a coding agent to explain these patterns so you learn how they work under the hood – but everything also applies to AI agents in general.
Resources
Talk Materials
- Demo Recordings (asciinema) – TBD repo
- Order Service Skill (.claude) – TBD repo
- Pidge Plugin (docs + rules + skills + scripts) – TBD repo
- Pidge Plugin on Tessl Registry
- Four-Antipattern Cheat Sheet
Context Engineering
Tools Used in Demos
- Claude Code – The coding agent used in all demos
- Pidge Plugin – Context plugin for the pidge notification library used in demos
- presenterm – Terminal-based presentation tool
Mentioned in the Talk
- Context7 – Static-docs retrieval service (the Wrong Tool antipattern’s “static docs when you need process” jab)
- Memento (Christopher Nolan) – Visual anchor for the Goldfish Agent antipattern
- Back to the Future – Mr. Fusion – Cold-open metaphor: 100k tokens of energy in, garbage out
- Anthropic Agent Skills announcement
- Model Context Protocol (MCP) – Plumbing, not a channel
Cheat Sheet
| Antipattern | Fix |
|---|---|
| Stuffed Prompt | Skills. Lazy-load context on demand. Treat them as context artifacts. |
| Wrong Tool | Four wrong tools, four right tools. Skills route to docs, scripts, rules. |
| Goldfish Agent | External memory you control. Versioned, backed up, portable across agents. |
| Vibes Eval | LLM generates rubrics. You review (watch for bleeding, leaking, negative scenarios). LLM judges. You assess the delta. |
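The Vibes Eval row can be made concrete without a real model call. A minimal sketch, assuming per-scenario scores from an LLM judge (the scenario names, score values, and `assess_delta` helper below are invented for illustration): the human step is the delta assessment, checking whether a context change actually moved the scores and whether any negative scenario regressed.

```python
def assess_delta(baseline: dict, candidate: dict, negative: set) -> dict:
    """Compare judge scores before/after a context change.

    `negative` names scenarios where the agent must keep refusing or
    doing nothing; any score drop there fails the whole eval.
    """
    deltas = {name: candidate[name] - baseline[name] for name in baseline}
    regressed = sorted(n for n in negative if deltas.get(n, 0) < 0)
    return {
        "deltas": deltas,
        "improved": sum(1 for d in deltas.values() if d > 0),
        "negative_regressions": regressed,
        "ship": not regressed and sum(deltas.values()) > 0,
    }

# Invented scores: a skill was added, two scenarios improved,
# the negative scenario (refusing an unsafe request) held steady.
baseline  = {"happy_path": 0.7, "wrong_version_api": 0.4, "refuses_unsafe": 1.0}
candidate = {"happy_path": 0.9, "wrong_version_api": 0.8, "refuses_unsafe": 1.0}
print(assess_delta(baseline, candidate, negative={"refuses_unsafe"}))
```

The point of the structure is that "ship" is a computed decision, not a vibe: improvement on positive scenarios only counts if no negative scenario got worse.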
The Four Wrong Tools
| Wrong | Why it fails |
|---|---|
| Similarity when you need correctness | RAG returns “topically similar” — version-blind |
| Static docs when you need process | Context7-style services give you reference, not how-to |
| LLM reasoning when determinism would do | Re-deriving the same math every turn, slightly differently |
| Script when you need reasoning | The regex trap — open-ended input, brittle code |
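The regex trap in the last row is easy to demonstrate. A minimal sketch (the intent, phrases, and pattern are invented for illustration): a hand-rolled pattern handles the inputs you tested, silently misses a rephrasing of the same intent, and matches the opposite intent, which is exactly where reasoning, not a script, is the right tool.

```python
import re

# Brittle script: a hand-written pattern for "user wants to cancel an order".
CANCEL_RE = re.compile(r"\bcancel\b.*\border\b", re.IGNORECASE)

def wants_cancellation(message: str) -> bool:
    """Deterministic and fast, and wrong the moment input is open-ended."""
    return bool(CANCEL_RE.search(message))

print(wants_cancellation("Please cancel my order #42"))     # True: the case you tested
print(wants_cancellation("I'd like to stop the shipment"))  # False: same intent, no keyword
print(wants_cancellation("Don't cancel the order!"))        # True: opposite intent, still matches
```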
The Four Right Tools
| Right | When to use it |
|---|---|
| Versioned docs | When you need correctness — exact version, exact API |
| Skills | When you need process — procedural how-to wrapped around docs |
| Scripts | When you need determinism — pure functions, deterministic output |
| Reasoning | When you need judgment — open-ended classification, ambiguous input |
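The Scripts row is the mirror image of the regex trap: when the model re-derives the same arithmetic every turn, slightly differently, extract it into a pure function the agent calls instead. A minimal sketch with an invented example calculation (a token-budget split; the fractions and field names are illustrative, not from the talk):

```python
def context_budget(total_tokens: int, system_frac: float = 0.1,
                   history_frac: float = 0.3) -> dict:
    """Pure function: same input, same split, every single turn.

    The agent invokes this script rather than redoing the math in-context.
    """
    system = int(total_tokens * system_frac)
    history = int(total_tokens * history_frac)
    return {
        "system": system,
        "history": history,
        # Remainder goes to retrieved context.
        "retrieval": total_tokens - system - history,
    }

print(context_budget(100_000))  # {'system': 10000, 'history': 30000, 'retrieval': 60000}
```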
Context Artifact
The shippable bundle: Docs + Skills + Scripts + Rules.
Versioned. Tested. Distributed.