[{"data":1,"prerenderedAt":115},["ShallowReactive",2],{"blog-deterministic-workflow":3},{"id":4,"title":5,"body":6,"date":106,"description":107,"extension":108,"meta":109,"navigation":110,"path":111,"seo":112,"stem":113,"__hash__":114},"blog/blog/deterministic-workflow.md","When Your LLM Does Too Much: Moving from AI-Everything to Deterministic Workflows",{"type":7,"value":8,"toc":96},"minimark",[9,13,20,23,26,31,34,37,40,44,47,50,53,57,60,63,67,70,73,77,80,83,86,90,93],[10,11,12],"p",{},"After months of building an LLM-powered development workflow, I discovered something counterintuitive: my AI assistant was doing too much.",[10,14,15],{},[16,17],"img",{"alt":18,"src":19},"Illustration of an overworked AI assistant juggling workflow steps that should be handled by deterministic scripts","/blog/deterministic-workflow.png",[10,21,22],{},"I spent weeks optimizing my development process, integrating Claude deeply into every aspect of my workflow. The promise was compelling: just ramble your thoughts, and the LLM handles everything from ticket creation to pull requests. But when I finally ran telemetry and analyzed the actual output, the data told a story I didn't expect. Token usage was through the roof, sessions were bloated with context that served no purpose, and I was paying in both performance and cost for the convenience of speaking naturally to my AI. The culprit wasn't the LLM itself—it was how I was using it.",[10,24,25],{},"I had fallen into a trap that many developers building AI-powered workflows encounter: treating the LLM as a universal solution. Every interaction, every task, every mundane workflow step was being funneled through Claude. Creating tickets? LLM handles it. Refining tickets? LLM. Starting a development stack? You guessed it—LLM. The problem is that large language models excel at interpretation and reasoning, not at deterministic routing and data manipulation. 
I was using a sledgehammer to push in thumbtacks.",[27,28,30],"h2",{"id":29},"the-weight-of-workflow-context","The Weight of Workflow Context",[10,32,33],{},"Here's what was actually happening in my system. Every time I created a ticket by rambling my thoughts into the interface, that entire passage got shoved into the active Claude session. When I refined a ticket, those interactions persisted in context. Every tool call, every intermediate step, every piece of workflow orchestration accumulated like sediment in the session memory. The LLM wasn't just doing the work—it was carrying the weight of every decision, every routing choice, every context switch throughout my entire development cycle.",[10,35,36],{},"The core workflow was actually quite consistent: create new tickets, refine existing tickets, and start working on them. These aren't complex decision trees requiring deep reasoning. They're standardized processes with predictable inputs and outputs. Yet I was asking Claude to juggle all of this cognitive load, session after session, when most of it could be handled by simple, deterministic functions. The LLM was spending tokens figuring out routing logic that never changed, parsing workflow commands that followed the same patterns, and maintaining context that would never be referenced again.",[10,38,39],{},"This realization forced me to step back and ask a fundamental question: what actually requires an LLM, and what's just workflow glue? The answer reshaped my entire approach.",[27,41,43],{"id":42},"splitting-intelligence-from-orchestration","Splitting Intelligence from Orchestration",[10,45,46],{},"The breakthrough came when I started moving workflow steps back into deterministic scripts. Creating a ticket doesn't require natural language understanding or reasoning—it's just data capture and submission. Now when I ramble about a ticket idea, a simple script takes that input and creates the ticket directly. No LLM involvement. No bloated session context. 
The rambling goes straight into the ticket description where it belongs, and that's the end of it. The outer orchestration layer doesn't need to know or care about that content.",[10,48,49],{},"Ticket refinement, however, is different—this actually benefits from LLM capabilities. When I refine a ticket, I now spin up an ephemeral Claude session specifically for that purpose. This isolated instance grabs the ticket details, runs them through quality checks I've defined, and then enters a focused dialogue with me to improve the ticket. It asks clarifying questions. I answer them. It validates against quality criteria again. If it passes, it writes the improved ticket back to the system and then—crucially—the ephemeral session is discarded. That conversational context never pollutes the main orchestration layer. It served its purpose and disappeared.",[10,51,52],{},"This pattern extends to starting development work on tickets. Previously, I'd ramble something like \"take this ticket and let's start it,\" and Claude would interpret my intent, make tool calls, grab ticket data, and fire up a development stack. All of that orchestration bloat lived in the main session indefinitely. Now, I have a deterministic script that handles the command \"start this ticket.\" The script grabs the ticket data, passes it to the stack environment, and starts the container. Inside that isolated stack, there's a Claude instance that executes on the actual ticket work—but the process of starting the ticket required zero LLM involvement in the orchestration layer.",[27,54,56],{"id":55},"the-pull-request-problem","The Pull Request Problem",[10,58,59],{},"Pull requests highlighted this issue perfectly. I used to ramble: \"All right, take my work and make me a pull request.\" Claude would figure out what I meant, sometimes using certain skills and sometimes not, making various tool calls to gather context and generate the PR. 
Each interaction added more weight to the orchestration session—weight that would be carried through every subsequent interaction in my workflow. But creating a pull request from completed work is a deterministic process. The work exists in the stack. The PR format is standardized. The steps never vary.",[10,61,62],{},"Now a script handles it. It packages the work, formats the PR according to my standards, and submits it. No interpretation needed. No session bloat. The LLM instances that did the actual development work inside the stacks are already isolated and ephemeral—they don't need to communicate their entire context back to the orchestration layer. The orchestration layer just needs to know: work completed, create PR, done.",[27,64,66],{"id":65},"what-this-actually-achieves","What This Actually Achieves",[10,68,69],{},"This architectural shift fundamentally changed the economics and performance of my development workflow. The outer Claude session—the main orchestration layer—now does remarkably little. It's not juggling context from ticket creation, refinement conversations, stack management, and PR generation. It's not burning tokens on workflow routing that never changes. When I do need LLM capabilities, I spin up ephemeral instances for specific tasks: refining a ticket, executing within a development stack, or analyzing code quality. These instances live for exactly as long as they're needed, then disappear with their context.",[10,71,72],{},"Token usage in my Sandstorm app is dropping significantly because I'm no longer paying the LLM tax on deterministic operations. The outer session remains lean and focused. When I need reasoning and interpretation, I get it from purpose-built ephemeral sessions that don't pollute the broader workflow. The system is faster because it's not processing unnecessary context. It's more predictable because deterministic operations behave deterministically. 
And it's more maintainable because I can modify workflow logic without worrying about how prompt engineering might affect LLM interpretation.",[27,74,76],{"id":75},"the-right-tool-for-the-right-job","The Right Tool for the Right Job",[10,78,79],{},"This experience taught me something crucial about building with LLMs: they're incredible tools for interpretation, reasoning, and generation, but terrible tools for workflow orchestration and deterministic logic. When you ask an LLM to route commands, manage state, and remember every interaction in your development process, you're misusing a powerful but expensive resource. You're also introducing unnecessary variability into processes that should be rock-solid reliable.",[10,81,82],{},"The path forward isn't \"AI for everything\" or \"no AI at all\"—it's intentional architecture. Use LLMs where their capabilities shine: understanding ambiguous input, generating creative solutions, engaging in focused dialogue, analyzing complex patterns. Use deterministic scripts for everything else: routing, data transformation, state management, workflow orchestration. Keep LLM sessions ephemeral and purpose-specific whenever possible. Don't let context accumulate unless that context genuinely serves future interactions.",[10,84,85],{},"After 25 years of building software and leading remote teams, I've learned that the best solutions aren't always the most technologically impressive ones—they're the ones that use each component for what it does best. LLMs are transformative technology, but they're still just tools in a larger system. The developers who'll build the most effective AI-powered workflows aren't the ones who use AI for everything—they're the ones who know when not to use it.",[27,87,89],{"id":88},"rethinking-your-own-llm-integration","Rethinking Your Own LLM Integration",[10,91,92],{},"If you're building LLM-powered tools or workflows, I encourage you to run your own telemetry pass. Look at your token usage. 
Examine what context is persisting in your sessions. Ask yourself which operations actually require natural language understanding and which are just deterministic steps dressed up in conversational interfaces. You might be surprised by how much weight your LLM is carrying unnecessarily.",[10,94,95],{},"The most efficient LLM workflow isn't the one that uses AI for everything—it's the one that uses AI for exactly what it's good at, and uses simpler, faster, cheaper solutions for everything else. Your token bills will thank you, your performance will improve, and your architecture will be cleaner. Sometimes the best optimization is recognizing that not everything needs to be optimized through AI.",{"title":97,"searchDepth":98,"depth":98,"links":99},"",2,[100,101,102,103,104,105],{"id":29,"depth":98,"text":30},{"id":42,"depth":98,"text":43},{"id":55,"depth":98,"text":56},{"id":65,"depth":98,"text":66},{"id":75,"depth":98,"text":76},{"id":88,"depth":98,"text":89},"2026-04-24","After months of funneling every workflow step through Claude, telemetry showed my orchestration layer was drowning in unnecessary context. Moving ticket creation, PR generation, and stack startup into deterministic scripts dropped token usage sharply and kept the LLM where it actually shines.","md",{},true,"/blog/deterministic-workflow",{"title":5,"description":107},"blog/deterministic-workflow","dP4jmMS1j0yOpffiRys-CWHONs67vDrZE7w8Wq6UAT4",1777043095437]