Larson built a fully functional two-agent AI system as a 678 KB Zig binary using ~1 MB of RAM on a $7/month VPS, with IRC as the transport layer. His architecture demonstrates that mature, decades-old protocols like IRC — text-native, stateless per message, near-zero overhead over raw TCP — can replace modern WebSocket/JSON/SDK stacks for agent communication without sacrificing functionality.
The editorial argues that the AI infrastructure discourse has spent two years scaling up with vector databases, orchestration frameworks, and GPU clusters, when for many agent use cases the infrastructure needed already existed in 1993. Larson's project is framed as a pointed counterargument to the assumption that running AI agents requires serious infrastructure.
Larson uses Haiku 4.5 for sub-second conversational responses and only escalates to Sonnet 4.6 when tool use is required, enforcing a hard cap of $2/day on inference costs. This tiered approach, combined with the $7/month VPS hosting, demonstrates that personal AI agent systems can run at hobby-project budgets rather than requiring enterprise-scale spending.
The editorial highlights that where most agent projects reach for Python (convenient but slow and heavy) or TypeScript, Larson's choice of Zig produced a 678 KB binary with ~1 MB RAM usage. This architectural choice deserves attention as it challenges the default language assumptions in the AI agent ecosystem and shows that systems-level languages can deliver dramatically smaller resource footprints.
Larson's architecture splits agents across separate machines with a clear security boundary: the public-facing nullclaw on a VPS handles visitor conversations, while the private ironclaw handles sensitive email and scheduling tasks, reachable only over a Tailscale mesh network via Google's Agent-to-Agent protocol. This demonstrates A2A as a practical inter-agent communication standard for real deployments, not just a spec.
George Larson posted a Show HN this week that stopped the usual scrolling: an AI agent system running on a $7/month VPS, using IRC as its transport layer, built as a 678 KB Zig binary consuming roughly 1 MB of RAM. The project, called nullclaw, attracted 160+ points on Hacker News — not because of what it does (conversational AI assistant), but because of the architectural choices behind it.
The setup is a two-agent system split across separate machines. The public-facing agent, nullclaw, sits on the cheap VPS connected to an Ergo IRC server. Visitors interact with it through a gamja web client embedded directly in Larson's personal site. The private agent, ironclaw, handles email and scheduling tasks and is reachable only over a Tailscale mesh network using Google's Agent-to-Agent (A2A) protocol.
The entire public-facing binary is 678 KB and uses ~1 MB of RAM at runtime — numbers that would make most agent framework authors quietly close their laptops.
The AI infrastructure discourse has spent the last two years scaling up. Vector databases, orchestration frameworks, GPU clusters, observability platforms — the default assumption is that running AI agents requires serious infrastructure. Larson's project is a pointed counterargument: for many agent use cases, the infrastructure you need already existed in 1993.
IRC is, by modern standards, a dead protocol. But that's exactly what makes it interesting as an agent transport layer. It's text-native, stateless per message, has mature client libraries in every language, supports channels and direct messages as natural routing primitives, and adds essentially zero overhead. There's no WebSocket upgrade negotiation, no JSON parsing of envelope formats, no SDK dependency — just raw TCP with a well-understood line protocol.
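The whole "boring transport" argument fits in a few dozen lines. Here's a sketch in Go (not Larson's Zig, and with all names invented for illustration) of essentially the only protocol work an IRC-based agent has to do: answer PING and decide whether to answer a PRIVMSG.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is the subset of an IRC line an agent cares about.
type Message struct {
	Command string
	Target  string // channel or nick the message was addressed to
	Text    string // trailing parameter
}

// parseLine splits one raw IRC line. The wire format is line-oriented:
//   "[:prefix] COMMAND params [:trailing]\r\n"
func parseLine(raw string) Message {
	raw = strings.TrimRight(raw, "\r\n")
	// Drop the optional ":prefix " (sender info).
	if strings.HasPrefix(raw, ":") {
		if i := strings.Index(raw, " "); i >= 0 {
			raw = raw[i+1:]
		}
	}
	// Split off the trailing ":..." parameter, if any.
	text := ""
	if i := strings.Index(raw, " :"); i >= 0 {
		text = raw[i+2:]
		raw = raw[:i]
	}
	parts := strings.Fields(raw)
	msg := Message{Text: text}
	if len(parts) > 0 {
		msg.Command = parts[0]
	}
	if len(parts) > 1 {
		msg.Target = parts[1]
	}
	return msg
}

// respond produces the wire reply for one inbound line, or "" if none.
// PING/PONG keepalive is the only hard protocol obligation; everything
// else is the agent choosing whether to answer a PRIVMSG. (A real bot
// would reply to the sender's nick for direct messages, not Target.)
func respond(raw string, reply func(text string) string) string {
	msg := parseLine(raw)
	switch msg.Command {
	case "PING":
		token := msg.Text
		if token == "" {
			token = msg.Target
		}
		return "PONG :" + token + "\r\n"
	case "PRIVMSG":
		return fmt.Sprintf("PRIVMSG %s :%s\r\n", msg.Target, reply(msg.Text))
	}
	return ""
}
```

Everything else — reconnects, NICK/USER registration — is a handful of additional writes on the same TCP connection, which is the point.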
The choice of Zig for the binary deserves attention too. Where most agent projects reach for Python (convenient, slow, heavy) or TypeScript (convenient, moderate overhead), Zig produces a single static binary with no runtime dependencies. That 678 KB includes the IRC client, the API call logic, and whatever glue holds the agent behavior together. Deploying it means copying one file. Updating it means copying one file. There is no `node_modules`, no virtual environment, no container image to build.
Larson's tiered inference strategy is where the economics get interesting: Haiku 4.5 handles routine conversation at sub-second latency for pennies, while Sonnet 4.6 is invoked only when tool use is actually required. This isn't a novel idea — anyone running production AI has figured out model routing — but implementing it with a hard $2/day cap on a personal project demonstrates a discipline that most startups burning through API credits could learn from.
The Hacker News discussion predictably split into two camps. One group appreciated the minimalism and saw it as a template for personal AI assistants that don't require cloud-scale infrastructure. The other questioned why you'd use IRC instead of a more "modern" protocol, missing the point that IRC's simplicity *is* the feature, not a limitation to be apologized for.
The more architecturally significant detail is the inter-agent communication. Nullclaw (public) and ironclaw (private) talk to each other over Tailscale using Google's A2A protocol. This is a clean separation of concerns: the public agent handles conversation and can delegate tasks — email drafting, calendar queries — to the private agent without exposing the private agent to the internet.
Tailscale provides the encrypted mesh network (zero config, WireGuard under the hood), and A2A provides the agent-to-agent message format. This two-agent split — public-facing and private-side, connected by a secure tunnel — is a pattern worth stealing for anyone building agent systems that need to touch both public and internal resources.
Google's A2A protocol has been quietly gaining traction as an alternative to building custom agent communication layers. It defines a standard way for agents to discover each other's capabilities, exchange tasks, and handle long-running operations. Using it here, over Tailscale rather than the public internet, is a pragmatic choice: you get the interoperability benefits of A2A without the security headaches of exposing agent endpoints publicly.
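For a sense of what crosses the tunnel: A2A rides JSON-RPC 2.0 over HTTP(S). The fragment below sketches the rough shape of a task handoff — the method and field names follow early A2A drafts, and the task text is invented; treat it as illustrative rather than a spec-exact payload.

```json
{
  "jsonrpc": "2.0",
  "id": "req-1",
  "method": "tasks/send",
  "params": {
    "id": "task-42",
    "message": {
      "role": "user",
      "parts": [
        { "type": "text",
          "text": "Draft a reply to the latest email from Alice." }
      ]
    }
  }
}
```

Because ironclaw's endpoint resolves only inside the tailnet, the same message shape works unchanged while the private agent stays unreachable from the public internet.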
If you're building agent-based systems, this project is worth studying not for its specific technology choices but for its constraints-first approach.
Resource budgeting matters more than framework choice. A hard $2/day API cap forces you to think about model routing, conversation pruning, and when inference is actually necessary versus when a regex or lookup table would suffice. Most agent frameworks encourage you to throw everything at the most capable model. Larson's approach asks: what's the cheapest model that handles 90% of interactions adequately?
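That last question — when is inference actually necessary? — can often be answered before any API call. A minimal sketch in Go, with the canned entries entirely hypothetical:

```go
package main

import "strings"

// canned answers are handled without any model call; the contents here
// are invented examples, not Larson's actual responses.
var canned = map[string]string{
	"help":  "Commands: ask me anything, or say 'email' to reach ironclaw.",
	"ping":  "pong",
	"hours": "The agent is online 24/7.",
}

// answerCheaply returns a canned reply and true when no inference is
// needed, so the model is invoked only for genuinely open-ended input.
func answerCheaply(prompt string) (string, bool) {
	reply, ok := canned[strings.ToLower(strings.TrimSpace(prompt))]
	return reply, ok
}
```

A lookup table in front of a $2/day budget is unglamorous, and that's exactly why it works.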
Transport layers should be boring. IRC works here because it's a solved problem. The agent doesn't need real-time streaming, binary payloads, or complex authentication flows. If your agent's transport layer requires its own ops team, you've made a category error. HTTP, WebSockets, IRC, even SMTP — pick the simplest protocol that meets your actual requirements.
The single-binary deployment model is underrated. In a world of Docker images, Helm charts, and Terraform modules, being able to `scp` a 678 KB binary to a $7 VPS and have it running in seconds is a competitive advantage for personal and small-team projects. Zig, Go, and Rust all enable this — and for agent workloads where the heavy computation happens on the API provider's side, your local binary barely needs to do anything.
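For concreteness, the entire deploy story for a binary like this can be one `scp` and one systemd unit. Everything below — paths, unit name, limits — is a hypothetical sketch, not Larson's actual setup:

```ini
# /etc/systemd/system/nullclaw.service (hypothetical unit, invented paths)
# Deploy: scp zig-out/bin/nullclaw vps:/usr/local/bin/ && systemctl restart nullclaw
[Unit]
Description=nullclaw public IRC agent
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/nullclaw
Restart=on-failure
# No dedicated user to manage; systemd allocates one per run.
DynamicUser=yes
# A generous ceiling for a process that idles around 1 MB.
MemoryMax=32M

[Install]
WantedBy=multi-user.target
```

Compare that to the Dockerfile, registry, and orchestrator a typical framework deployment assumes.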
Split your trust boundaries. The nullclaw/ironclaw split is good security hygiene. The public-facing agent has limited capabilities; the private agent with access to email and scheduling is never internet-exposed. This is defense in depth applied to agent architecture, and it costs nothing beyond running a second process.
Larson's project is a single-developer side project, not a production system serving thousands. But the architectural patterns — tiered inference, protocol minimalism, trust-boundary separation, hard cost caps — scale in ways that heavyweight agent frameworks often don't. As agent deployments move from demos to production, the teams that survive their API bills will be the ones that learned to ask "what's the minimum infrastructure that actually works?" before reaching for the orchestration framework. Sometimes the answer is a Zig binary, an IRC server, and protocols that were already battle-tested before most of today's AI engineers were born.