Software / T-2026-7509

Addy Osmani's agent-skills pack gives AI coding agents a senior engineer's discipline

Key points

Addy Osmani's agent-skills pack contains 23 Markdown workflows enforcing production-grade engineering discipline, including spec-before-code, incremental implementation, and test-driven development.

Each skill includes an anti-rationalization table with documented counter-arguments for common agent excuses like “I’ll add tests later,” an approach that treats agents as junior engineers needing structured supervision.

The pack works with Claude Code, Cursor, Gemini CLI, and other tools, addressing the gap where AI agents optimize for speed over maintainability, security, and deployability.

Tessera Newsroom · 4 min read · June 10, 2026

Source addyosmani/agent-skills (github.com)

Chrome engineering veteran Addy Osmani published a GitHub repository called agent-skills that packages 23 production-grade engineering workflows into Markdown files for AI coding agents. The pack, released in early June 2026, targets the growing problem of agents that take the shortest path — skipping specs, tests, security reviews, and other practices that separate production code from prototypes.

The repository encodes the kind of discipline senior engineers bring to software development. Each skill is a structured workflow with steps, checkpoints, exit criteria, and a table of common excuses agents use to skip steps. Osmani calls these tables “anti-rationalization” sections. They contain documented counter-arguments for phrases like “I’ll add tests later.”

Seven slash commands map to the development lifecycle. /spec enforces a spec-before-code principle. /plan decomposes specs into small atomic tasks. /build implements one slice at a time. /test treats tests as proof. /review improves code health. /code-simplify prioritizes clarity over cleverness. /ship follows the principle that faster is safer.

The pack supports Claude Code, Cursor, Gemini CLI, Windsurf, OpenCode, GitHub Copilot, and Kiro IDE. Installation varies by tool. For Claude Code, users run /plugin marketplace add addyosmani/agent-skills followed by /plugin install agent-skills@addy-agent-skills. For Gemini CLI, the command is gemini skills install https://github.com/addyosmani/agent-skills.git --path skills. Skills are plain Markdown and work with any agent that accepts system prompts or instruction files.

What the skills actually enforce

The 22 lifecycle skills plus one meta-skill cover the full development cycle. The meta-skill, using-agent-skills, maps incoming work to the right skill workflow and defines shared operating rules.

In the define phase, interview-me runs a one-question-at-a-time interview that extracts what the user actually wants instead of what they think they should want. It continues until the agent is approximately 95 percent confident it understands the request. idea-refine applies structured divergent and convergent thinking to turn vague concepts into concrete proposals. spec-driven-development requires a PRD (product requirements document) covering objectives, commands, structure, code style, testing, and boundaries before any code is written.

The build phase includes incremental-implementation, which works in thin vertical slices with feature flags, safe defaults, and rollback-friendly changes. It applies to any change touching more than one file. test-driven-development enforces red-green-refactor, a test pyramid with an 80/15/5 ratio, test sizes, DAMP over DRY (descriptive and meaningful phrases preferred over strict deduplication in test code), and the Beyoncé Rule (“if you liked it, then you shoulda put a test on it”). context-engineering feeds agents the right information at the right time through rules files, context packing, and MCP integrations.
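
To make the 80/15/5 ratio concrete, here is a minimal sketch that splits a test budget across three pyramid layers. The function name pyramid_split and the layer labels (unit, integration, end-to-end) are assumptions for illustration. The article gives only the ratio, and the code is not taken from the repository.

```python
def pyramid_split(total_tests: int, ratio=(80, 15, 5)) -> dict[str, int]:
    """Split a test budget across pyramid layers by percentage.

    The layer labels are an assumed reading of the 80/15/5 ratio.
    """
    if sum(ratio) != 100:
        raise ValueError("ratio must sum to 100")
    labels = ("unit", "integration", "end_to_end")
    # Integer share for each layer, rounded down.
    counts = [total_tests * share // 100 for share in ratio]
    # Give any rounding remainder to the first (largest) layer.
    counts[0] += total_tests - sum(counts)
    return dict(zip(labels, counts))


print(pyramid_split(200))
```

For a suite of 200 tests, this yields 160 unit, 30 integration, and 10 end-to-end tests.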

One of the more opinionated skills is doubt-driven-development. It runs an adversarial fresh-context review of every non-trivial decision in flight. The workflow follows a five-step sequence: CLAIM, EXTRACT, DOUBT, RECONCILE, STOP. It includes optional user-authorized cross-model escalation. Osmani designed it for high-stakes situations — production code, security work, irreversible operations, or unfamiliar codebases where a confident output is cheaper to verify now than to debug later.
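
The five-step sequence can be modeled as an ordered enumeration. The sketch below captures only the ordering named above. What each step does internally is not described here, so the class and function names are hypothetical and no step behavior is implied.

```python
from enum import Enum
from typing import Optional


class DoubtStep(Enum):
    # Step names come from the article; the numeric values only encode order.
    CLAIM = 1
    EXTRACT = 2
    DOUBT = 3
    RECONCILE = 4
    STOP = 5


def next_step(step: DoubtStep) -> Optional[DoubtStep]:
    """Return the step that follows, or None once STOP is reached."""
    if step is DoubtStep.STOP:
        return None
    return DoubtStep(step.value + 1)


def full_sequence() -> list[str]:
    """Walk the sequence from CLAIM to STOP and return the step names."""
    names = []
    step: Optional[DoubtStep] = DoubtStep.CLAIM
    while step is not None:
        names.append(step.name)
        step = next_step(step)
    return names


print(full_sequence())
```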

The verify phase includes browser-testing-with-devtools, which uses Chrome DevTools MCP for live runtime data including DOM inspection, console logs, network traces, and performance profiling. debugging-and-error-recovery follows a five-step triage: reproduce, localize, reduce, fix, guard. It includes a stop-the-line rule and safe fallback procedures.
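
One way to picture the triage is as a checklist that refuses out-of-order steps. The following is a rough sketch under that assumption. The Triage class is hypothetical and does not reflect how the skill itself is written, since the skill is a Markdown workflow rather than code.

```python
# Step names as listed in the article, in order.
TRIAGE_STEPS = ("reproduce", "localize", "reduce", "fix", "guard")


class Triage:
    """Hypothetical tracker that only accepts triage steps in order."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def next_step(self):
        """Return the next expected step, or None when triage is finished."""
        index = len(self.completed)
        return TRIAGE_STEPS[index] if index < len(TRIAGE_STEPS) else None

    def complete(self, step: str) -> None:
        """Mark a step done, rejecting any attempt to skip ahead."""
        expected = self.next_step()
        if expected is None:
            raise ValueError("triage already finished")
        if step != expected:
            raise ValueError(f"cannot complete {step!r} before {expected!r}")
        self.completed.append(step)


triage = Triage()
triage.complete("reproduce")
try:
    triage.complete("fix")  # skipping "localize" and "reduce"
except ValueError as err:
    print(err)
```

Jumping straight to a fix is rejected until the bug has been reproduced, localized, and reduced.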

Anti-rationalization as a design pattern

The anti-rationalization tables are the most distinctive feature of the pack. Every skill includes a table of common excuses agents use to skip steps, with documented counter-arguments. This acknowledges a fundamental problem with AI coding agents: they are optimizers that minimize effort. Without explicit guardrails, they will skip the hard parts.

Osmani’s approach treats the agent as a junior engineer who needs structured supervision. The skills do not assume good faith. They assume the agent will rationalize shortcuts and provide the counter-arguments in advance. This is a different design philosophy from most agent frameworks, which assume the agent will follow instructions as written.
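
As a data structure, an anti-rationalization table is a lookup from excuse to counter-argument. The sketch below is illustrative only. The first excuse is the one quoted in this article, the second excuse and both counter-arguments are invented placeholders, and the real tables live in Markdown rather than Python.

```python
from typing import Optional

# Hypothetical entries. Only "I'll add tests later" appears in the article;
# the rest of the wording is made up for illustration.
ANTI_RATIONALIZATION = {
    "i'll add tests later": "Write the failing test first; untested code is not done.",
    "this change is too small for a spec": "Small changes still need stated intent and exit criteria.",
}


def counter_argument(excuse: str) -> Optional[str]:
    """Return the documented rebuttal for an excuse, or None if unlisted."""
    # Normalize case, curly apostrophes, and trailing punctuation before lookup.
    key = excuse.strip().lower().replace("\u2019", "'").rstrip(".!")
    return ANTI_RATIONALIZATION.get(key)


print(counter_argument("I'll add tests later."))
```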

The verification requirement is non-negotiable across all skills. Every skill ends with evidence requirements — tests passing, build output, runtime data. The repository states that “seems right” is never sufficient.
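
The evidence requirement can be read as a gate that blocks completion until proof is attached. A minimal sketch follows, assuming all three kinds of evidence named above are required together. The Evidence type and its field names are hypothetical, and the article does not say whether every skill demands all three.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical record of the proof an agent attaches to finished work."""
    tests_passed: bool = False
    build_output: str = ""
    runtime_data: str = ""


def missing_evidence(evidence: Evidence) -> list[str]:
    """List which kinds of evidence are still absent."""
    missing = []
    if not evidence.tests_passed:
        missing.append("tests passing")
    if not evidence.build_output.strip():
        missing.append("build output")
    if not evidence.runtime_data.strip():
        missing.append("runtime data")
    return missing


def can_mark_done(evidence: Evidence) -> bool:
    """Work counts as done only when no evidence is missing."""
    return not missing_evidence(evidence)


print(missing_evidence(Evidence(tests_passed=True)))
```

Under this reading, a claim that the change “seems right” with no attached output leaves every field empty and fails the gate.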

What this means for AI coding tools

The pack addresses a gap in the current generation of AI coding agents. Tools like Claude Code, Cursor, and Gemini CLI are powerful at generating code quickly. They are weaker at enforcing the discipline that makes code maintainable, secure, and deployable at scale. Osmani’s skills attempt to add that discipline as a layer on top of existing tools.

The approach has limits. Skills are only as good as their design. A poorly written skill could enforce bad practices. The anti-rationalization tables require ongoing maintenance as agents evolve. And the pack assumes a specific engineering culture — one that values specs, tests, and code review. Teams with different practices may find the skills constraining.

But the direction is important. AI coding agents are moving from prototype generators to production tools. The next phase of the market will be about quality and reliability, not just speed. Packs like agent-skills represent a bet that the winning agents will be the ones that follow good engineering practices, not just the ones that generate the most code per prompt.

Osmani’s background gives the pack credibility. He is a Google Chrome engineering veteran who has written extensively about performance and engineering practices. The repository notes that the skills bake in best practices from Google’s engineering culture. The open question is whether the broader ecosystem will adopt this kind of structured discipline or continue to optimize for raw generation speed.
