March 21, 2026 · 10 min read · Tips

I Burned 185 Million AI Tokens in One Week — Here's What I Actually Built

185,621,347

That's the number of AI tokens I consumed between March 15 and March 21, 2026. One week. One person. Running an AI agent system out of a VirtualBox VM in Southern California.

I know how that sounds. "185 million tokens" sounds like the kind of flex you'd see in a LinkedIn post from someone selling a course. So let me be clear about what this is and isn't.

This isn't a hype piece. This is a receipt. I'm going to tell you exactly what I built, what it cost, what broke, and what I learned. Because the number isn't the interesting part — the workforce behind it is.

Stop Prompting. Start Delegating.

Most people use AI like a calculator. You type a question, you get an answer, you move on. That's fine for research or drafting an email. But it's a profoundly boring way to use something this powerful.

Here's what I do differently: I don't prompt. I delegate.

I have an AI orchestrator named Pepe running 24/7. Pepe doesn't generate content — it routes work. When I say "restructure the entire website," Pepe breaks that into tasks, assigns them to specialist agents, and runs them in parallel. A content agent writes the copy. An engineering agent builds the pages. A research agent pulls competitive data. All at the same time.
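A minimal sketch of that delegation pattern, using Python's asyncio. This is not Pepe's actual code: the three agent functions, the `plan` helper, and the `AGENTS` table are hypothetical placeholders, and a real system would replace each stub with a model call.

```python
import asyncio

# Hypothetical specialist agents. Each stub stands in for a real model call.
async def content_agent(task: str) -> str:
    await asyncio.sleep(0)  # placeholder for network latency
    return f"content: {task}"

async def engineering_agent(task: str) -> str:
    await asyncio.sleep(0)
    return f"engineering: {task}"

async def research_agent(task: str) -> str:
    await asyncio.sleep(0)
    return f"research: {task}"

# Routing table: the orchestrator only decides who gets what.
AGENTS = {
    "content": content_agent,
    "engineering": engineering_agent,
    "research": research_agent,
}

def plan(goal: str) -> list[tuple[str, str]]:
    """Break a goal into (agent, task) pairs.

    Hard-coded here for illustration; a real orchestrator would
    ask a model to produce this plan.
    """
    return [
        ("content", f"write copy for: {goal}"),
        ("engineering", f"build pages for: {goal}"),
        ("research", f"pull competitive data for: {goal}"),
    ]

async def orchestrate(goal: str) -> list[str]:
    """Route each task to its specialist and run them concurrently."""
    jobs = [AGENTS[agent](task) for agent, task in plan(goal)]
    # gather returns results in the same order the jobs were passed in.
    return await asyncio.gather(*jobs)

def run(goal: str) -> list[str]:
    return asyncio.run(orchestrate(goal))
```

Calling `run("restructure the entire website")` returns one result per specialist. The orchestrator itself produces no content; it only plans, routes, and collects.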

This is the shift that most people haven't made yet. They're still treating AI as a chat window: a single conversation with a single model. Meanwhile, I'm running what amounts to a small digital agency where the employees happen to be agents powered by Zhipu AI's GLM-5.

The question isn't "how many tokens did you use?" The question is "what did your agents build while you were sleeping?"

The Receipt: What 185 Million Tokens Actually Produced

Let's get specific. Here's everything that shipped this week:

The PepeWebTech Website — Complete Rebuild

What started as a single landing page is now a six-page site: Home, Services, Pricing, About, Blog, and Contact. Mobile-first responsive design. Light and dark mode toggle. Plus 49 blog posts written, formatted in HTML, and pushed live.

This alone would be a two-month project at a traditional agency. The agents did it in days.

Mobile Design Audit — 30 Issues Squashed

I ran automated browser QA sessions. The agents opened the site in simulated iOS Safari, took screenshots, compared them against design specs, and found 30 separate UI issues. Overlapping elements, broken flexbox containers, touch target sizes too small for real fingers. All documented, all fixed, all verified with follow-up screenshots.

Stripe Payment Integration Research

Full technical guide for accepting payments on Vercel serverless functions. Authentication flows, webhook handling, error states, test card scenarios — the works. Not just "here's the docs link," but an actual implementation guide with code samples.

Warframe Knowledge Base — 48,551 Drops Parsed

This one's a passion project. A research agent pulled official game data and parsed 48,551 item drops across 3,612 items. Structured, organized, ready for a web app. If you've ever tried to make sense of Warframe drop tables manually, you know this is a significant data engineering task.

SoCal Market Research

Comprehensive analysis of the Southern California small business automation market. I'm talking salons, barbershops, smoke shops across Riverside, LA, San Diego, and San Bernardino. Pricing data, pain points, technology adoption rates, competitive landscape. The kind of research a consulting firm would charge $10,000-15,000 for.

Google CLI Tools Research

A 1,100+ line technical document on free Google tools for business automation. CLI tools, APIs, scripts — the stuff that lets you automate operations without paying for SaaS subscriptions.

Kimi K2.5 Deep Dive

Moonshot AI released their K2.5 model with agent swarm capabilities and reinforcement learning. A research agent pulled the paper, analyzed the architecture, and we published a full blog post on it within hours of the announcement.

Cost Policy and Business Planning

Full pricing tiers defined ($820 / $2,800 / $8,800), budget tracking system set up, cost projections modeled. The boring but essential work of actually running a business.

Where Did 185 Million Tokens Go?

Here's the approximate breakdown. Tokens aren't created equal — some tasks burn through them fast, others are surprisingly efficient.

| Activity | Est. Tokens | % of Total |
|---|---|---|
| Website build (6 pages + blog) | ~58M | 31% |
| Blog content (49 posts) | ~42M | 23% |
| Browser automation & QA | ~28M | 15% |
| Market & competitive research | ~22M | 12% |
| Warframe data parsing | ~18M | 10% |
| Business planning & misc | ~12M | 6% |
| Orchestration overhead | ~5.6M | 3% |

The orchestration overhead — the tokens burned just by the main agent coordinating, routing tasks, and managing state — is only 3%. That's the cost of having a project manager who never sleeps and never takes a salary.
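For anyone who wants to check the percentages in the breakdown above, here is a short sketch of the arithmetic. The token figures are the rounded estimates from the table; the variable names are just for illustration.

```python
# Estimated tokens per activity, in millions, copied from the breakdown table.
breakdown_m = {
    "Website build (6 pages + blog)": 58,
    "Blog content (49 posts)": 42,
    "Browser automation & QA": 28,
    "Market & competitive research": 22,
    "Warframe data parsing": 18,
    "Business planning & misc": 12,
    "Orchestration overhead": 5.6,
}

# The estimates should add back up to roughly the weekly total (~185.6M).
total_m = sum(breakdown_m.values())

# Share of the total for each activity, rounded to a whole percent.
shares = {
    name: round(100 * tokens / total_m)
    for name, tokens in breakdown_m.items()
}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The estimates sum to about 185.6 million, which matches the headline number, and the orchestration share comes out to 3%.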

The big consumers are no surprise: content generation and code are token-hungry. Browser automation is surprisingly expensive because screenshots get converted into multimodal tokens for visual analysis. Each QA session can burn millions.

The 30-Day Projection

At this rate, I'm tracking toward roughly 742 million tokens over four weeks, or 795 million over a full 30 days. That's not a goal. It's just the arithmetic of running a multi-agent system full-time.

To put that in perspective: the average ChatGPT Plus user probably burns 1-5 million tokens per month doing casual queries. That puts my volume at roughly 150x the heaviest casual user and closer to 800x the lightest. The difference isn't that I'm "using AI more". It's that I'm using it fundamentally differently. These aren't chat conversations. They're parallel workstreams running simultaneously.
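The projection is simple enough to sketch. The weekly total is the number from the top of this post; the 1-5 million casual-user range is my rough guess, not a measured figure, so the resulting multiple swings widely depending on which end of it you pick.

```python
WEEKLY_TOKENS = 185_621_347  # tokens consumed March 15-21, 2026

daily = WEEKLY_TOKENS / 7
four_weeks = WEEKLY_TOKENS * 4   # 28 days at this week's pace
thirty_days = daily * 30         # a full 30-day month at the same pace

# Rough guess for a casual chat user, in tokens per month (an assumption).
CASUAL_LOW, CASUAL_HIGH = 1_000_000, 5_000_000

# Smallest multiple: my low projection vs. the heaviest casual user.
smallest_multiple = four_weeks / CASUAL_HIGH
# Largest multiple: my high projection vs. the lightest casual user.
largest_multiple = thirty_days / CASUAL_LOW

print(f"4 weeks: {four_weeks / 1e6:.1f}M, 30 days: {thirty_days / 1e6:.1f}M")
print(f"multiple of casual use: {smallest_multiple:.0f}x to {largest_multiple:.0f}x")
```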

The Honest Part: What AI Still Sucks At

I wouldn't be credible if I only told you about the wins. So here's what broke this week.

Chrome Debugging Is a Nightmare

The agents can write code, deploy it, and even open it in a browser to verify. But when something goes wrong — and it always does with CSS — the debugging loop is painful. An agent takes a screenshot, identifies an issue, makes a fix, takes another screenshot, finds a new issue, makes another fix. Sometimes it takes 8-10 iterations for a single layout bug that a human developer would fix in 30 seconds by opening DevTools.

iOS Safari Is the Gift That Keeps Giving

The 30 mobile UI issues I mentioned? Almost all of them were iOS Safari quirks. The -webkit-overflow-scrolling behavior, the flexbox wrapping differences, the way Safari handles viewport units differently from Chrome. The agents don't intuitively know these edge cases — they have to discover them through testing, the same way a junior developer does. Just slower.

Context Windows Are Real

When you're running multiple agents in parallel and they all need to understand the full project context, you hit token limits fast. I had sessions where the orchestration agent had to summarize and compress context multiple times in a single day. Information gets lost. The agent forgets a decision that was made three hours ago. You learn to write everything down in files instead of trusting the agent's memory.
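One way to do the "write everything down in files" part is an append-only decision log that any agent can reload after its context gets compressed. This is a sketch under my own assumptions, not the system's actual code: the function names, the JSON Lines format, and the field names are all hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def record_decision(log_path: Path, topic: str, decision: str) -> None:
    """Append one decision as a JSON line so it survives context compression."""
    entry = {
        "at": datetime.now(timezone.utc).isoformat(),
        "topic": topic,
        "decision": decision,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

def load_decisions(log_path: Path) -> dict[str, str]:
    """Return the latest decision per topic, ready to paste into a fresh context."""
    latest: dict[str, str] = {}
    if not log_path.exists():
        return latest
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                entry = json.loads(line)
                # Later lines overwrite earlier ones for the same topic.
                latest[entry["topic"]] = entry["decision"]
    return latest
```

Because the log is append-only, nothing is lost when a decision changes: the full history stays in the file, while `load_decisions` hands the agent only the current state.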

Creative Taste Requires a Human

The agents can write competent blog posts. They structure them well, hit the SEO keywords, format them correctly. But the voice? The specific Josue-ness of it? That requires editing. Not heavy editing — maybe 10-15% of the output — but enough that fully autonomous content isn't quite there yet for anything that needs personality.

The Business Case: What This Costs vs. Hiring Humans

Let's do the uncomfortable math. Everything I built this week, what would it cost with humans?

| Deliverable | Human Cost | AI Cost |
|---|---|---|
| 6-page website build | $5,000-8,000 | Model subscription |
| 49 blog posts (1,500 words avg) | $7,350-14,700 | Included |
| Mobile QA audit | $2,000-4,000 | Included |
| Market research report | $10,000-15,000 | Included |
| Payment integration guide | $2,000-3,500 | Included |
| Data engineering (48K drops) | $3,000-5,000 | Included |
| Total | $29,350-50,200 | ~$200/mo |

Now, I want to be honest about those numbers. "Human cost" assumes hiring freelancers or an agency at fair rates. "AI cost" is just the model subscription. What it doesn't include is my time directing the agents, reviewing output, making strategic decisions, and fixing things when they break.

That time is real. I'd estimate I spent 3-4 hours per day actively managing the agent system. The agents don't replace me — they multiply me. One person with a good agent architecture can produce the output of a 5-8 person team. But you still need that one person to know what they're doing.
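Here is the math behind the comparison as a short sketch. The dollar ranges are the estimates from the table; the `ai_side_cost` helper and its hourly-rate parameter are hypothetical additions so you can plug in whatever your own time is worth.

```python
# (low, high) freelance or agency estimates in USD, copied from the table.
human_cost = {
    "6-page website build": (5_000, 8_000),
    "49 blog posts": (7_350, 14_700),
    "Mobile QA audit": (2_000, 4_000),
    "Market research report": (10_000, 15_000),
    "Payment integration guide": (2_000, 3_500),
    "Data engineering": (3_000, 5_000),
}

total_low = sum(low for low, _ in human_cost.values())
total_high = sum(high for _, high in human_cost.values())

# Implied price per blog post: the range divided across 49 posts.
per_post_low = human_cost["49 blog posts"][0] / 49
per_post_high = human_cost["49 blog posts"][1] / 49

# Time spent managing the agents: 3-4 hours per day over a 7-day week.
hours_low, hours_high = 3 * 7, 4 * 7

def ai_side_cost(subscription_usd: float, hours: float, hourly_rate_usd: float) -> float:
    """Subscription plus the operator's own time, at a rate the reader supplies."""
    return subscription_usd + hours * hourly_rate_usd

print(f"Human total: ${total_low:,} to ${total_high:,}")
print(f"Per post: ${per_post_low:.0f} to ${per_post_high:.0f}")
print(f"Management time: {hours_low} to {hours_high} hours for the week")
```

The blog line works out to $150-300 per post, and the management time to 21-28 hours for the week. Pricing those hours at any realistic rate raises the AI-side number well above the subscription alone.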

Why This Matters for Small Businesses

Here's the part I actually care about. I'm not writing this to flex. I'm writing this because every small business owner in Southern California — the salon owner in Riverside, the barbershop in San Bernardino, the smoke shop in LA — should know that this technology exists and it's not expensive.

You don't need a $50,000 consulting engagement to automate your business. You don't need to hire a team of five. You need an AI model subscription, some willingness to learn, and the mindset shift from "using AI" to "managing an AI workforce."

The tools are here. The models are good enough. The cost is less than your phone bill. The only barrier is understanding how to use them as more than a chat interface.

That's what I'm building at PepeWebTech. Not just websites, but the systems and knowledge that let any small business owner use AI the way I do.

What's Next

Week two starts now. The agent fleet is running. I've got more websites to build, more research to do, and a business to grow — all from a VirtualBox VM in my apartment.

The 185 million token number isn't the point. The point is that one person, with the right AI architecture, can compete with agencies ten times their size. And that's not a future prediction. That's this week's receipts.

Let's see what next week looks like.

Want This For Your Business?

PepeWebTech builds AI-powered websites and automation systems for small businesses. If you're in SoCal and tired of overpaying for basic tech, let's talk.
