More Posts

Seeing the Forest for the Trees: Why AI-Generated Code Needs Better Task Decomposition

AI writes fast but decomposes work badly — producing massive PRs and drifting from the original objective. Here's what I've learned about front-loading decomposition, avoiding the retrofit problem, and building reconciliation into agentic workflows to keep tasks aligned with the broader epic.

Going Dark: Why I'm Automating My Software Factory From Ticket to Merge

I've reached the point where I trust my AI agent orchestration system more than I trust myself to stay consistent with repetitive code reviews. Here's why I'm implementing a fully automated development pipeline — and why going dark is the right move for the right projects.

Back to Sandstorm: Why I'm Abandoning Claude Directly for Multi-Agent Orchestration

After a month-long experiment working directly in Claude instead of my own agent orchestration system, I'd become a serial bottleneck in my own workflow — and the quality guardrails I'd built into Sandstorm quietly disappeared along the way. Here's why I'm recommitting to multi-agent orchestration.

The Future is Here: Automatically Optimizing API Endpoints with AI

What if your slowest API endpoints could fix themselves while you sleep? AI agents that monitor observability data, investigate slow endpoints, implement multiple optimization strategies, and A/B test them in production are only a few months away — and they'll fundamentally change how we think about performance work.

The Future of Bug Fixes: When Exception Remediation Costs Approach Zero

Today we ration our bug-fixing attention because investigation is expensive. When AI agents can triage, diagnose, and patch exceptions end-to-end, that cost collapses — and so does our tolerance for the long tail of unfixed bugs we've learned to live with.

The Orchestration Era Is Transitional: What Happens When AI No Longer Needs Our Scaffolding?

Planner agents, reviewer loops, memory systems, decomposition workflows — every team is reinventing the same patterns because models can't yet internalize them. But each model release absorbs more of that scaffolding, and the long-term advantage shifts from orchestration to context, evaluation, and integration.

When Your LLM Does Too Much: Moving from AI-Everything to Deterministic Workflows

After months of funneling every workflow step through Claude, telemetry showed my orchestration layer was drowning in unnecessary context. Moving ticket creation, PR generation, and stack startup into deterministic scripts dropped token usage sharply and kept the LLM where it actually shines.

The Hidden Cost of Token Bloat: What Telemetry Taught Me About AI Tool Optimization

After adding telemetry to Sandstorm, I discovered a compound MCP problem was silently ballooning context to 350,000 tokens per task. Migrating to skills cut it to 95,000, a reduction of roughly 73%, with no loss of functionality.

Back to Being a Bottleneck: Why I'm Okay With Slowing Down My AI Software Factory

After chasing parallel agent throughput, I slowed down to chase quality and discovered the real bottleneck wasn't planning but context discovery. Here's why the harness, not the model, is the next problem to solve.

Token Limits: The Hidden Cost of Building Production-Grade AI Workflows

I built a sophisticated agentic AI workflow that produces amazing code, then hit token limits hard. Here's how I'm solving the economics of AI-assisted development with observability, model selection, and quality gates.

Quality Gates: Why Your AI Agents Are Only As Good As Your Tickets

After analyzing dozens of AI-generated pull requests, I discovered two critical quality gates that dramatically improve agent output — and built them into my workflow.

Building at the Frontier: How I Built My Own AI Agent Orchestrator and Finished a Two-Week Sprint in Three Days

I built Sandstorm Desktop, a cross-platform Electron app that orchestrates multiple AI agents through Docker containers — and used it to complete a two-week sprint in three days.

Sandstorm: Multi-Agent Containerization for the Rest of Us

How I'm building an open source tool that lets individual developers safely run multiple AI agents in parallel using Docker-based isolation — no cloud infrastructure required.

How I Rebuilt My i18n Libraries to Scale Across Four Programming Languages

What started as updating some stale Ruby gems evolved into a complete architectural rethink — separating translation data from implementation across Ruby, JavaScript, Go, and Rust.

The Death of Hand-Coding: Why 2026 Is the Year of AI Builder Factories

I haven't written code in two months. And I'm more productive than ever. We've entered the AI builder factory era — your job isn't writing code anymore, it's running the factory that produces it.

The Script Solution: Why I Make LLMs Write Code Instead of Doing the Work

LLMs aren't deterministic. Ask the same question twice and you may get different answers. Here's a simple pattern that fixes that: stop asking the LLM to do the analysis and make it write the script instead.

How I Built a Battle Card Game in 30 Days Using Agentic AI

AI wrote 100% of the code. I did everything else. Here's what agentic AI development actually looks like in practice.

What I Mean When I Say Agentic AI Changes Everything

Most developers are using AI wrong. They're hand-holding it through every step. Agentic AI is different, and it's transforming what it means to be a software engineer.
