Sandstorm: Multi-Agent Containerization for the Rest of Us

The most frustrating problems often hide in plain sight, and the best solutions come from scratching your own itch. I'm currently building something to solve a problem I kept running into with AI-assisted development: how do you safely run multiple AI agents on different tasks without them stepping on each other's toes or accidentally destroying your codebase? My answer is Sandstorm, an open source project that makes multi-agent containerization simple enough for individual developers and small teams.

The problem I'm solving isn't just theoretical—it's something I've wrestled with across multiple projects in my software factory harness. When you're working with AI agents that can execute code, you want them running in "dangerous mode" where they can do what they need to do without constantly asking for permission. But how do you give them that freedom without risking your actual codebase? The traditional answer involves pushing containerized stacks up into virtualized cloud environments, which works great if you have enterprise-level resources. For individual developers or small teams like the ones I've led throughout my career, that approach is prohibitively expensive and unnecessarily complex.

The Multi-Agent Problem Nobody's Talking About

Here's the core challenge: you want AI agents to work on multiple tasks in parallel, but you need complete isolation between them. If Agent A is refactoring your authentication system while Agent B is adding a new API endpoint, they can't share the same codebase instance. They'll create conflicts, overwrite each other's work, and leave a mess that's harder to untangle than if you'd just done the work yourself. Beyond code isolation, each agent also needs a full development environment: not just the code, but the entire stack, so you can run tests, verify code quality, and confirm that what the agent produces actually works.

The existing solutions I've seen fall into two camps, and neither quite fits the bill for most developers. The enterprise approach involves spinning up separate cloud-based virtualized environments for each agent, complete with full containerization and orchestration through Kubernetes or similar tools. This works, but it requires significant infrastructure investment and ongoing cloud costs that make it impractical for solo developers or small consultancies. The alternative is to run agents sequentially on a single environment, which defeats the entire purpose of having multiple agents—you lose all the parallel processing benefits that make AI-assisted development compelling in the first place. I needed something in between: lightweight enough to run on local development machines, robust enough to provide true isolation, and simple enough that you don't need a DevOps team to configure it.

How Sandstorm Solves the Isolation Problem

Sandstorm takes a different approach that prioritizes simplicity without sacrificing safety. The core concept is straightforward: when you need multiple agents to work on different tasks, Sandstorm spins up a separate Docker stack for each one, and each stack contains its own isolated copy of the codebase and a Claude instance that works on the assigned task. Think of these as temporary, disposable development environments that exist only for the duration of a specific task. Each stack has everything it needs (the full application code, all dependencies, database instances, whatever your Docker Compose setup defines), but it's completely isolated from every other stack.
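To make the "one isolated copy per task" idea concrete, here is a minimal sketch in Python. It is not Sandstorm's actual code: the helper names `stack_name` and `make_workspace` and the `sandstorm` prefix are hypothetical, and a plain directory copy stands in for however the real tool prepares each stack's codebase.

```python
import re
import shutil
from pathlib import Path


def stack_name(task: str, prefix: str = "sandstorm") -> str:
    """Turn a task description into a name that is safe to use as a
    Docker Compose project name (lowercase letters, digits, dashes)."""
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    return f"{prefix}-{slug}"


def make_workspace(repo: Path, task: str, root: Path) -> Path:
    """Copy the repository into a per-task directory (hypothetical layout).

    The agent assigned to `task` only ever sees this copy, so nothing it
    does can modify the original checkout at `repo`.
    """
    dest = root / stack_name(task)
    shutil.copytree(repo, dest)
    return dest
```

Calling `make_workspace(Path("my-app"), "Refactor auth system", Path("/tmp/stacks"))` would produce a copy under `/tmp/stacks/sandstorm-refactor-auth-system`, and that directory name can double as the Compose project name for the stack.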

The architecture uses an outer orchestrator pattern that I've refined through years of building distributed systems. You have one main Claude instance acting as the orchestrator, and when you say "work on these five tasks," it spins up five new containerized stacks and distributes the tasks to the Claude instances running within each stack. Each of those inner instances can operate in dangerous mode because they're working on isolated copies of your codebase—there's no way for them to affect your main repository or step on each other's work. This isolation extends to the entire runtime environment, which means you can run full test suites within each stack to verify code quality before anything gets promoted out of the sandbox.
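The fan-out step of that orchestrator pattern can be sketched in a few lines of Python. This is an illustration under assumptions rather than Sandstorm's implementation: `dispatch` and `run_in_stack` are hypothetical names, and `run_in_stack` stands for whatever starts a stack, hands the task to the inner agent, and returns its result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def dispatch(
    tasks: list[str],
    run_in_stack: Callable[[str], str],
    max_parallel: int = 5,
) -> dict[str, str]:
    """Run every task through `run_in_stack` in parallel.

    Each call is expected to work inside its own isolated stack, so the
    calls share no state and can safely run at the same time.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order, so results line up with tasks.
        results = pool.map(run_in_stack, tasks)
        return dict(zip(tasks, results))
```

Threads are enough here because the orchestrator mostly waits on the containers. Passing the worker in as a parameter also makes the fan-out easy to try with a stand-in function before any real stack is involved.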

What makes this practical for individual developers is that it all runs locally using Docker Compose, which many developers already have installed and understand. You're not paying for cloud compute time or learning a new orchestration system. You're just using Docker's native containerization to create the isolation you need for parallel agent work. The resource footprint stays manageable because the stacks run only while tasks are being processed.
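Here is one way the "stacks exist only while a task runs" lifecycle could look in Python, using standard Docker Compose commands (`-p` for the project name, `-f` for the compose file, `up -d`, and `down -v`). The function names are hypothetical, and this is a sketch of the idea, not how Sandstorm itself is written.

```python
import subprocess
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator, Sequence

Runner = Callable[[Sequence[str]], None]


def compose_cmd(project: str, compose_file: Path, *args: str) -> list[str]:
    """Build a `docker compose` command scoped to one named project."""
    return ["docker", "compose", "-p", project, "-f", str(compose_file), *args]


def run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(cmd, check=True)


@contextmanager
def temporary_stack(
    project: str, compose_file: Path, runner: Runner = run
) -> Iterator[str]:
    """Start a stack for the duration of a `with` block, then remove it."""
    runner(compose_cmd(project, compose_file, "up", "-d"))
    try:
        yield project
    finally:
        # `down -v` also removes the stack's volumes, so nothing lingers.
        runner(compose_cmd(project, compose_file, "down", "-v"))
```

Because each stack gets its own project name, Compose keeps the containers, networks, and volumes of different tasks apart even when they come from the same compose file. The `finally` clause means the stack is torn down even if the task inside the `with` block fails.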

Built for Real-World Development Workflows

The development of Sandstorm has been driven by my actual projects, which has kept it grounded in practical needs rather than theoretical possibilities. I'm testing it across different repository structures because real development doesn't fit into neat boxes—some of my projects are monorepos, others use multi-repo structures, and each has its own Docker Compose configuration. The goal is for Sandstorm to be flexible enough to work with whatever setup you already have, rather than forcing you to restructure your projects around the tool.

The installation and usage workflow reflects the simplicity principle I've tried to maintain throughout my career. You run Sandstorm, it checks your Docker Compose setup, builds what it needs, and you're ready to go. The security model is intentionally conservative: you provide a read-only GitHub token, so the agents running in the sandboxed environments cannot write directly to your repository. Even if something goes wrong inside a container, the damage stays within that disposable environment. Your actual codebase remains untouched until you explicitly choose to integrate the work an agent has completed.
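As a rough illustration of what a "check your Docker Compose setup" step might involve, the Python sketch below looks for a compose file under the default names Docker Compose recognizes and builds the command that validates it (`docker compose config --quiet`). The function names are hypothetical, and Sandstorm's real checks may differ.

```python
from pathlib import Path
from typing import Optional

# Default file names Docker Compose looks for, in order of preference.
COMPOSE_FILENAMES = (
    "compose.yaml",
    "compose.yml",
    "docker-compose.yaml",
    "docker-compose.yml",
)


def find_compose_file(repo: Path) -> Optional[Path]:
    """Return the first compose file found at the repository root, if any."""
    for name in COMPOSE_FILENAMES:
        candidate = repo / name
        if candidate.is_file():
            return candidate
    return None


def validate_command(compose_file: Path) -> list[str]:
    """Command that asks Compose to validate the file without printing it."""
    return ["docker", "compose", "-f", str(compose_file), "config", "--quiet"]
```

Running the validation command before building anything lets a broken compose file surface as a clear error up front rather than as a failure partway through starting a stack.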

Where Sandstorm Stands Today

I want to be transparent about the current state of the project: this is early-stage open source software that's actively under development. It's functional and I'm using it successfully across different side projects right now, but it's not a polished, production-ready product. I'm still working through how to abstract the concept of multi-agent containerization in a way that works elegantly across different project types and development environments. Some workflows are smooth, others need refinement, and I'm actively iterating based on what I learn from using it in real development scenarios.

The code is available on my GitHub account in a repository called Sandstorm, and I'm developing it in the open because I think this problem affects more people than just me. If you've ever wished you could safely run multiple AI agents in parallel without risking your codebase or paying for expensive cloud infrastructure, this project might be relevant to your work. I'm particularly interested in feedback from developers working with different tech stacks and repository structures, because that real-world testing is what will make Sandstorm truly useful beyond my specific use cases.

The Bigger Picture: Making AI Development Accessible

Beyond the technical implementation, Sandstorm represents a broader principle I believe in: powerful development tools shouldn't require enterprise budgets. The companies pushing virtualized cloud-based agent environments are solving a real problem, but their solutions aren't accessible to the individual developers and small teams that make up most of the software development community. My experience founding a small software consultancy and leading remote teams taught me that the best tools are the ones that meet developers where they are—working with familiar technologies, fitting into existing workflows, and solving concrete problems without requiring massive infrastructure changes or additional service costs.

That's what I'm aiming for with Sandstorm. It's not trying to be an enterprise orchestration platform or compete with the big cloud-based solutions. It's trying to give individual developers and small teams a practical way to leverage multi-agent AI development safely and effectively. The containerization approach provides the isolation you need, the Docker-based architecture keeps it familiar and lightweight, and the orchestrator pattern creates a mental model that makes sense if you've ever worked with distributed systems.

Contributing and What's Next

The roadmap for Sandstorm centers on making it work reliably across more diverse development environments and smoothing out the rough edges in the workflow. I'm focusing on improving how it handles different repository structures, streamlining the setup process, and making the orchestration layer more robust. There's also work to be done on error handling, logging, and giving developers better visibility into what's happening inside each containerized stack.

If you're interested in this problem space, I'd encourage you to check out the Sandstorm repository and see if it might be useful for your projects. Even more valuable would be trying it out and letting me know what works, what doesn't, and what use cases I haven't considered. Open source projects get better when they're tested against real-world scenarios by developers with different needs and perspectives. I've built this to solve my own problems, but I suspect many developers are encountering similar challenges as AI-assisted development becomes more prevalent.