Skill Issue: How to Write Skills That Actually Work
Abstract
A skill is software — so write it like software. You already wrote skills; the reason they don’t actually work is that you stopped at the prompt. This talk fixes that across five practices: kill the non-determinism with scripts (tested green and red), make the mandatory behavior certain with rules, test the fuzzy parts with evals that isolate the context’s own contribution (bleeding, leaking, and lift — not vibes), review them across models with versioning and rollback, scan them as both code AND prompts, and distribute them through a registry with discovery and telemetry that measures install and activation. And because the policy that governs all of this is itself a context artifact, it gets the same treatment — reviewed, tested, and gated in CI. The talk itself is a plugin: every prescription on stage is a real rule shipping on the Tessl registry.
Resources
The Repos & Plugins
- jbaruch/skill-issue-brickbox — the live demo project (a tiny Lego brick-inventory FastAPI). The “tickets” are issues against this repo; DEMO 01–03 run here.
- jbaruch/skill-issue-policy — the fix-the-ticket plugin (skill + script + rule) authored, packaged, and published live on stage across the five practices.
- jbaruch/skill-issue-policy on Tessl Registry — the published plugin. Install: tessl install jbaruch/skill-issue-policy
- jbaruch/coding-policy — the production policy plugin. Cross-family gh-aw reviewers, eval-gated, auto-published to the registry on every PR across every repo — including itself.
- jbaruch/coding-policy on Tessl Registry — the talk’s prescriptions, packaged and shipping. Install: tessl install jbaruch/coding-policy
- jbaruch/kotlin-tutor — a companion teaching plugin (idiomatic Kotlin rules + skill + verification script), also on the Tessl Registry.
Skills — Standard & Platform
- Agent Skills Standard — the de-facto standard for giving agents instructions (yes skills, no prompts)
- Anthropic Agent Skills announcement
- Plugins for Claude Code and Cowork — Anthropic
- Agent Skills — OpenAI Codex — Codex’s skill packaging mechanism
- Model Context Protocol (MCP) — plumbing for tools and context
Tessl — Platform & Registry
- Tessl — Agent Enablement Platform — versioned, distributed plugins for AI agents
- Tessl Registry — the package manager for agent skills
Tessl Blog — Skills
- Announcing Skills on Tessl: the package manager for agent skills — skills as software with a lifecycle (versioned, tested, reusable, composable)
- What Are Agent Skills? (And Why You’ll Never Want to Push Code Without One Again)
- My Coding Agent Needed a Package Manager for Its Own Brain (And I Gave It One Using a Skills Registry)
- Do Agent Skills Actually Help? A Controlled Experiment — the lift-not-attainment proof
Tessl Blog — Evals
- Your AGENTS.md file isn’t the problem. Your lack of AI Agent Evaluations is. — unvalidated context is useless and often harmful
- If agents use your tool, you need evals
- Three Context Eval Methodologies at Tessl — Skill Review, Task and Repo Evals — maps onto the talk’s skill / plugin / project tiers
- Introducing Task Evals: Measure Whether Your Skills Actually Work — baseline vs with-skill, the lift methodology
- Improving your skills with Tessl evals — tessl skill lint, tessl skill review, tessl skill eval
- Evaluate skill quality using scenarios — Tessl Docs
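Several of the posts above contrast a baseline run with a with-skill run. A minimal sketch of that arithmetic follows, assuming lift is simply the with-skill pass rate minus the baseline pass rate over the same scenarios. The ScenarioResult type, the scenario names, and the counts are hypothetical illustrations and are not taken from Tessl's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    """Hypothetical record: pass counts for one scenario, run with and without the skill."""
    name: str
    baseline_passed: int
    with_skill_passed: int
    runs: int


def pass_rate(passed: int, runs: int) -> float:
    """Fraction of runs that passed; rejects impossible counts."""
    if runs <= 0 or not 0 <= passed <= runs:
        raise ValueError("need runs > 0 and 0 <= passed <= runs")
    return passed / runs


def lift(result: ScenarioResult) -> float:
    """Assumed definition: with-skill pass rate minus baseline pass rate."""
    return (pass_rate(result.with_skill_passed, result.runs)
            - pass_rate(result.baseline_passed, result.runs))


def mean_lift(results: list[ScenarioResult]) -> float:
    """Average lift across scenarios."""
    if not results:
        raise ValueError("no scenarios to average")
    return sum(lift(r) for r in results) / len(results)


# Made-up numbers, for illustration only.
EXAMPLE_RESULTS = [
    ScenarioResult("commit-message-format", baseline_passed=2, with_skill_passed=8, runs=10),
    ScenarioResult("runs-the-test-suite", baseline_passed=9, with_skill_passed=9, runs=10),
]

if __name__ == "__main__":
    for r in EXAMPLE_RESULTS:
        print(f"{r.name}: lift {lift(r):+.2f}")
    print(f"mean lift: {mean_lift(EXAMPLE_RESULTS):+.2f}")
```

In this made-up data the second scenario has the higher with-skill pass rate but zero lift: the agent already passes it without the skill, so the skill contributes nothing there. That is the difference between attainment and lift.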
Tessl Blog — Context Engineering
- The Context Development Lifecycle (CDLC): Better Context for AI Coding Agents — context as an engineering artifact: generate, distribute, test, observe
- CI/CD for Context in Agentic Coding: Same Pipeline, Different Rules — evals are to context what tests are to code
Determinism in the Demos
- Conventional Commits — the commit format the script-backed rule enforces
- commitlint — the deterministic gate that rejects a sloppy commit and forces a retry
- Semantic Versioning — the versioning the package-management story builds on
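To make the idea of a deterministic gate concrete, here is a minimal sketch of a commit-header check in Python. It covers only a simplified subset of the Conventional Commits header format (type, optional scope, optional "!", colon, description) and is not commitlint's implementation or the script used in the demos. The ALLOWED_TYPES set and the function name are assumptions made for illustration.

```python
import re

# Assumed set of allowed types, for illustration; a real project configures its own.
ALLOWED_TYPES = {"feat", "fix", "docs", "refactor", "test", "chore"}

# Simplified header shape: type, optional (scope), optional "!", then ": description".
HEADER = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def check_commit_header(message: str) -> list[str]:
    """Return a list of problems with the first line of a commit message.

    An empty list means the header passes this simplified check.
    """
    lines = message.splitlines()
    header = lines[0] if lines else ""
    match = HEADER.match(header)
    if match is None:
        return ["header must look like 'type(scope): description'"]
    commit_type = match["type"]
    if commit_type not in ALLOWED_TYPES:
        return [f"unknown type '{commit_type}'"]
    return []


if __name__ == "__main__":
    # Green case and red case: the gate should accept one and reject the other.
    for candidate in ("fix(inventory): handle empty brick list", "fixed stuff"):
        problems = check_commit_header(candidate)
        print(candidate, "->", "ok" if not problems else problems)
```

The same input always produces the same verdict, which is why a check like this belongs in a script rather than in prompt text. Testing it with one passing and one failing message mirrors the green and red testing described in the abstract.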
Agents Referenced
- Claude Code — the agent used in all demos
Speaker
- Baruch Sadogursky — @jbaruch — Context Sommelier (self-certified)