Larson built a fully functional two-agent AI system as a 678 KB Zig binary using ~1 MB of RAM on a $7/month VPS, with IRC as the transport layer. His architecture demonstrates that mature, decades-old protocols like IRC — text-native, stateless per message, near-zero overhead over raw TCP — can replace modern WebSocket/JSON/SDK stacks for agent communication without sacrificing functionality.
The editorial argues that the AI infrastructure discourse has spent two years scaling up with vector databases, orchestration frameworks, and GPU clusters, when for many agent use cases the infrastructure needed already existed in 1993. Larson's project is framed as a pointed counterargument to the assumption that running AI agents requires serious infrastructure.
Larson uses Haiku 4.5 for sub-second conversational responses and only escalates to Sonnet 4.6 when tool use is required, enforcing a hard cap of $2/day on inference costs. This tiered approach, combined with the $7/month VPS hosting, demonstrates that personal AI agent systems can run at hobby-project budgets rather than requiring enterprise-scale spending.
The editorial highlights that where most agent projects reach for Python (convenient but slow and heavy) or TypeScript, Larson's choice of Zig produced a 678 KB binary with ~1 MB RAM usage. This architectural choice deserves attention as it challenges the default language assumptions in the AI agent ecosystem and shows that systems-level languages can deliver dramatically smaller resource footprints.
Larson's architecture splits agents across separate machines with a clear security boundary: the public-facing nullclaw on a VPS handles visitor conversations, while the private ironclaw handles sensitive email and scheduling tasks, reachable only over a Tailscale mesh network via Google's Agent-to-Agent protocol. This demonstrates A2A as a practical inter-agent communication standard for real deployments, not just a spec.
George Larson posted a Show HN this week that stopped the usual scrolling: an AI agent system running on a $7/month VPS, using IRC as its transport layer, built as a 678 KB Zig binary consuming roughly 1 MB of RAM. The project, called nullclaw, attracted 160+ points on Hacker News — not because of what it does (conversational AI assistant), but because of the architectural choices behind it.
The setup is a two-agent system split across separate machines. The public-facing agent, nullclaw, sits on the cheap VPS connected to an Ergo IRC server. Visitors interact with it through a gamja web client embedded directly in Larson's personal site. The private agent, ironclaw, handles email and scheduling tasks and is reachable only over a Tailscale mesh network using Google's Agent-to-Agent (A2A) protocol.
The entire public-facing binary is 678 KB and uses ~1 MB of RAM at runtime — numbers that would make most agent framework authors quietly close their laptops.
The AI infrastructure discourse has spent the last two years scaling up. Vector databases, orchestration frameworks, GPU clusters, observability platforms — the default assumption is that running AI agents requires serious infrastructure. Larson's project is a pointed counterargument: for many agent use cases, the infrastructure you need already existed in 1993.
IRC is, by modern standards, a dead protocol. But that's exactly what makes it interesting as an agent transport layer. It's text-native, stateless per message, has mature client libraries in every language, supports channels and direct messages as natural routing primitives, and adds essentially zero overhead. There's no WebSocket upgrade negotiation, no JSON parsing of envelope formats, no SDK dependency — just raw TCP with a well-understood line protocol.
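The whole "boring transport" argument fits in a few dozen lines. Here's a sketch in Go (not Larson's Zig, and with all names invented for illustration) of essentially the only protocol work an IRC-based agent has to do: answer PING and decide whether to answer a PRIVMSG.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is the subset of an IRC line an agent cares about.
type Message struct {
	Command string
	Target  string // channel or nick the message was addressed to
	Text    string // trailing parameter
}

// parseLine splits one raw IRC line. The wire format is line-oriented:
//   "[:prefix] COMMAND params [:trailing]\r\n"
func parseLine(raw string) Message {
	raw = strings.TrimRight(raw, "\r\n")
	// Drop the optional ":prefix " (sender info).
	if strings.HasPrefix(raw, ":") {
		if i := strings.Index(raw, " "); i >= 0 {
			raw = raw[i+1:]
		}
	}
	// Split off the trailing ":..." parameter, if any.
	text := ""
	if i := strings.Index(raw, " :"); i >= 0 {
		text = raw[i+2:]
		raw = raw[:i]
	}
	parts := strings.Fields(raw)
	msg := Message{Text: text}
	if len(parts) > 0 {
		msg.Command = parts[0]
	}
	if len(parts) > 1 {
		msg.Target = parts[1]
	}
	return msg
}

// respond produces the wire reply for one inbound line, or "" if none.
// PING/PONG keepalive is the only hard protocol obligation; everything
// else is the agent choosing whether to answer a PRIVMSG. (A real bot
// would reply to the sender's nick for direct messages, not Target.)
func respond(raw string, reply func(text string) string) string {
	msg := parseLine(raw)
	switch msg.Command {
	case "PING":
		token := msg.Text
		if token == "" {
			token = msg.Target
		}
		return "PONG :" + token + "\r\n"
	case "PRIVMSG":
		return fmt.Sprintf("PRIVMSG %s :%s\r\n", msg.Target, reply(msg.Text))
	}
	return ""
}
```

Everything else — reconnects, NICK/USER registration — is a handful of additional writes on the same TCP connection, which is the point.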
The choice of Zig for the binary deserves attention too. Where most agent projects reach for Python (convenient, slow, heavy) or TypeScript (convenient, moderate overhead), Zig produces a single static binary with no runtime dependencies. That 678 KB includes the IRC client, the API call logic, and whatever glue holds the agent behavior together. Deploying it means copying one file. Updating it means copying one file. There is no `node_modules`, no virtual environment, no container image to build.
Larson's tiered inference strategy is where the economics get interesting: Haiku 4.5 handles routine conversation at sub-second latency for pennies, while Sonnet 4.6 is invoked only when tool use is actually required. This isn't a novel idea — anyone running production AI has figured out model routing — but implementing it with a hard $2/day cap on a personal project demonstrates a discipline that most startups burning through API credits could learn from.
The Hacker News discussion predictably split into two camps. One group appreciated the minimalism and saw it as a template for personal AI assistants that don't require cloud-scale infrastructure. The other questioned why you'd use IRC instead of a more "modern" protocol, missing the point that IRC's simplicity *is* the feature, not a limitation to be apologized for.
The more architecturally significant detail is the inter-agent communication. Nullclaw (public) and ironclaw (private) talk to each other over Tailscale using Google's A2A protocol. This is a clean separation of concerns: the public agent handles conversation and can delegate tasks — email drafting, calendar queries — to the private agent without exposing the private agent to the internet.
Tailscale provides the encrypted mesh network (zero config, WireGuard under the hood), and A2A provides the agent-to-agent message format. This two-agent split — public-facing and private-side, connected by a secure tunnel — is a pattern worth stealing for anyone building agent systems that need to touch both public and internal resources.
Google's A2A protocol has been quietly gaining traction as an alternative to building custom agent communication layers. It defines a standard way for agents to discover each other's capabilities, exchange tasks, and handle long-running operations. Using it here, over Tailscale rather than the public internet, is a pragmatic choice: you get the interoperability benefits of A2A without the security headaches of exposing agent endpoints publicly.
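For a sense of what crosses the tunnel: A2A rides JSON-RPC 2.0 over HTTP(S). The fragment below sketches the rough shape of a task handoff — the method and field names follow early A2A drafts, and the task text is invented; treat it as illustrative rather than a spec-exact payload.

```json
{
  "jsonrpc": "2.0",
  "id": "req-1",
  "method": "tasks/send",
  "params": {
    "id": "task-42",
    "message": {
      "role": "user",
      "parts": [
        { "type": "text",
          "text": "Draft a reply to the latest email from Alice." }
      ]
    }
  }
}
```

Because ironclaw's endpoint resolves only inside the tailnet, the same message shape works unchanged while the private agent stays unreachable from the public internet.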
If you're building agent-based systems, this project is worth studying not for its specific technology choices but for its constraints-first approach.
Resource budgeting matters more than framework choice. A hard $2/day API cap forces you to think about model routing, conversation pruning, and when inference is actually necessary versus when a regex or lookup table would suffice. Most agent frameworks encourage you to throw everything at the most capable model. Larson's approach asks: what's the cheapest model that handles 90% of interactions adequately?
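That last question — when is inference actually necessary? — can often be answered before any API call. A minimal sketch in Go, with the canned entries entirely hypothetical:

```go
package main

import "strings"

// canned answers are handled without any model call; the contents here
// are invented examples, not Larson's actual responses.
var canned = map[string]string{
	"help":  "Commands: ask me anything, or say 'email' to reach ironclaw.",
	"ping":  "pong",
	"hours": "The agent is online 24/7.",
}

// answerCheaply returns a canned reply and true when no inference is
// needed, so the model is invoked only for genuinely open-ended input.
func answerCheaply(prompt string) (string, bool) {
	reply, ok := canned[strings.ToLower(strings.TrimSpace(prompt))]
	return reply, ok
}
```

A lookup table in front of a $2/day budget is unglamorous, and that's exactly why it works.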
Transport layers should be boring. IRC works here because it's a solved problem. The agent doesn't need real-time streaming, binary payloads, or complex authentication flows. If your agent's transport layer requires its own ops team, you've made a category error. HTTP, WebSockets, IRC, even SMTP — pick the simplest protocol that meets your actual requirements.
The single-binary deployment model is underrated. In a world of Docker images, Helm charts, and Terraform modules, being able to `scp` a 678 KB binary to a $7 VPS and have it running in seconds is a competitive advantage for personal and small-team projects. Zig, Go, and Rust all enable this — and for agent workloads where the heavy computation happens on the API provider's side, your local binary barely needs to do anything.
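For concreteness, the entire deploy story for a binary like this can be one `scp` and one systemd unit. Everything below — paths, unit name, limits — is a hypothetical sketch, not Larson's actual setup:

```ini
# /etc/systemd/system/nullclaw.service (hypothetical unit, invented paths)
# Deploy: scp zig-out/bin/nullclaw vps:/usr/local/bin/ && systemctl restart nullclaw
[Unit]
Description=nullclaw public IRC agent
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/nullclaw
Restart=on-failure
# No dedicated user to manage; systemd allocates one per run.
DynamicUser=yes
# A generous ceiling for a process that idles around 1 MB.
MemoryMax=32M

[Install]
WantedBy=multi-user.target
```

Compare that to the Dockerfile, registry, and orchestrator a typical framework deployment assumes.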
Split your trust boundaries. The nullclaw/ironclaw split is good security hygiene. The public-facing agent has limited capabilities; the private agent with access to email and scheduling is never internet-exposed. This is defense in depth applied to agent architecture, and it costs nothing beyond running a second process.
Larson's project is a single-developer side project, not a production system serving thousands. But the architectural patterns — tiered inference, protocol minimalism, trust-boundary separation, hard cost caps — scale in ways that heavyweight agent frameworks often don't. As agent deployments move from demos to production, the teams that survive their API bills will be the ones that learned to ask "what's the minimum infrastructure that actually works?" before reaching for the orchestration framework. Sometimes the answer is a Zig binary, an IRC server, and protocols that were already battle-tested before most of today's AI engineers were born.