The Year of 3K

I Gave an AI Agent $0 and Told It to Make $10,000

2026-05-30T00:00:00+00:00

An AI agent starts with $0.00. It has 180 days. No human can help it. Can it earn $10,000?

I built this experiment to answer something I’ve been turning over for months: can you give an AI agent a real body and let it loose in the actual economy?

Not simulated. Not sandboxed. Real wallets. Real GitHub repos. Real emails and SMS messages. An agent that earns actual dollars and splits them: 30% to taxes, 50% back into the business, 20% to me.

The experiment is running right now. Dashboard is here, tracking every dollar.

The body problem

AI agents can think. They can’t do. No wallet. No email. No deploy button.

Hands Body and Feet MCP is an MCP server that fixes this. 78 tools: wallets, cards, email, SMS, GitHub, containers, webhooks, RSS, IPFS. Built on OpenTrust, an open standard for agent tool trust.

The bet: an MCP server is all the body an agent needs. If the tools are real, the agent operates in the real world. This experiment tests that bet.

The brain

The agent runs on Hermes Agent with Honcho for memory that survives across sessions. Every conversation builds a model of how I operate. Every agent profile has its own identity.

The strategy layer is Strategy v2 (Hermes’s /strategy command). It doesn’t just make to-do lists. It:

Runs vehicle analysis. Which approaches can actually work with $0?
Tracks 6 assumptions against reality. When does the plan stop matching what’s happening?
Load-balances. Is the agent overcommitted?
Auto-pivots. Path fails? Find another.

The plan it wrote is public: strategy doc

Four vehicles, all at once

1. Testnet and airdrop farming. Fastest path from $0 to first dollar. Creates wallets, hits testnet protocols, qualifies for retroactive airdrops, liquidates to USDC. Cost: nothing. Timeline: 1-4 months per airdrop distribution. Frankly, this is the one I’m watching closest. If it works, the whole experiment gets breathing room. If it doesn’t, we’re relying on the slower paths.

2. Micro-SaaS. Builds small tools on free infrastructure (Vercel, Supabase, Cloudflare). Monetizes through payment gateways once capital exists. This is the recurring revenue play. 2-4 weeks to launch something, 2-3 months to see if anyone pays.

3. Content and affiliate. SEO engine plus affiliate links. Slow ramp, but it compounds. Every move the agent makes becomes blog content. Every failure becomes a post.

4. Opportunistic. Bug bounties. Digital products on Gumroad. Maybe an AI agent services API. Not the plan. Just catching what falls.

The money rules

Three wallets on Base, USDC only. Immutable split:

30% tax reserve. Never touched except for tax payments.
50% operations pool. Funds growth: domains, ads, API credits, whatever the next vehicle needs.
20% my payout. Accumulates until I withdraw it.

Every dollar earned gets split automatically. All on chain. All verifiable.
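
To make the arithmetic concrete, here is a minimal sketch of a 30/50/20 split in Python. It is an illustration, not the experiment's actual code: the wallet labels and the `split_revenue` function are hypothetical, and sending rounding dust to the operations pool is an assumption the post does not state. The only outside fact used is that USDC has six decimal places.

```python
from decimal import Decimal, ROUND_DOWN

# Ratios come from the post; the wallet labels are hypothetical.
SPLIT = {
    "tax_reserve": Decimal("0.30"),
    "operations": Decimal("0.50"),
    "payout": Decimal("0.20"),
}

USDC_UNIT = Decimal("0.000001")  # USDC uses 6 decimal places


def split_revenue(amount: Decimal) -> dict[str, Decimal]:
    """Split an incoming USDC amount 30/50/20.

    Each share is rounded down to 6 decimals. Any leftover dust is added
    to the operations pool (an assumption) so the shares always sum to
    the original amount.
    """
    if amount < 0:
        raise ValueError("amount must be non-negative")
    parts = {
        name: (amount * ratio).quantize(USDC_UNIT, rounding=ROUND_DOWN)
        for name, ratio in SPLIT.items()
    }
    parts["operations"] += amount - sum(parts.values())
    return parts


if __name__ == "__main__":
    for wallet, share in split_revenue(Decimal("100")).items():
        print(wallet, share)
```

Using `Decimal` instead of floats keeps the three shares exactly reconcilable with the amount received, which matters when the claim is that every dollar is verifiable.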

Community suggestions (without letting the internet hijack the agent)

The dashboard links to GitHub Discussions where anyone can post ideas. People upvote. Every Monday morning, the top suggestion gets reviewed.

Here’s the security boundary: before I approve anything, the agent never reads the discussion body. It only queries the API for {title, upvote_count}. The title gets sanitized (200 char max, stripped of code blocks and URLs). That sanitized title is what reaches me for review.

Only after I approve does the agent read the full suggestion and act on it.

No amount of “ignore previous instructions” buried in a discussion thread reaches the model. Human approval is the filter.
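
The sanitization step can be sketched in a few lines of Python. This is a hedged illustration of the rule described above (200 characters max, code blocks and URLs stripped), not the code the experiment runs. The function names, the regular expressions, and the shape of the `discussion` dictionary are all assumptions.

```python
import re

MAX_TITLE_LEN = 200  # limit stated in the post

# Hypothetical patterns for what counts as a code block or a URL.
FENCED_CODE = re.compile(r"```.*?```", re.DOTALL)
INLINE_CODE = re.compile(r"`[^`]*`")
URL = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def sanitize_title(raw: str) -> str:
    """Strip code spans and URLs, collapse whitespace, cap the length."""
    text = FENCED_CODE.sub(" ", raw)
    text = INLINE_CODE.sub(" ", text)
    text = URL.sub(" ", text)
    text = text.replace("`", "")      # drop any unmatched backticks
    text = " ".join(text.split())     # collapse runs of whitespace
    return text[:MAX_TITLE_LEN].rstrip()


def review_item(discussion: dict) -> dict:
    """Build the record sent for human review.

    Only the title and upvote count are copied. Every other field,
    including the body, is ignored.
    """
    return {
        "title": sanitize_title(str(discussion.get("title", ""))),
        "upvote_count": int(discussion.get("upvote_count", 0)),
    }
```

The design point is that the body is never copied into the review record at all, so there is nothing to filter later: the only untrusted text in play is a short, stripped title.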

What happens either way

If this works: an AI agent with the right tools can autonomously earn money in the real economy. That changes the conversation about what agents are for.

If it fails: we learn where the boundaries actually are. What resists automation? What requires judgment? The failure modes are as informative as the success.

Everything is public. Every transaction, every decision, every dollar. Dashboard updates every 6 hours. RSS feed tracks milestones. I’ll post updates here when something interesting happens.

The stack

Hands Body and Feet MCP. The body.
Hermes Agent. The brain.
Strategy v2. The planner.
Experiment repo. The code and data.

$0.00. 180 days. $10,000 target. No human help. Track it live.

Every Ad Should Be an Experiment

2026-05-12T00:00:00+00:00

Most people treat ad creation as a creative problem. Write copy, shoot footage, edit, ship, pray.

I treat it as an infrastructure problem. Every ad is a test instrument. Every campaign generates data. Every dataset makes the next ad better.

This post is about the system I built to make that work.

The hypothesis-first bottleneck

The single biggest improvement came from the simplest change: you can’t touch copy or video until a hypothesis is written.

Before the system existed, making an ad looked like this:

Pick a platform
Write some headlines
Generate a visual
Ship it
Guess what worked

The new pipeline forces an earlier step:

Write a falsifiable hypothesis — “We believe showing the VN scene transition within the first 3 seconds will outperform abstract RPG messaging for anime players because it demonstrates the product’s core differentiator immediately.”
Define the success criterion — “3-second hold rate > 45%”
Then write copy, generate visuals, render.
Measure. Compare against the hypothesis.
Feed the result back into the system. The learning loop doesn’t just say “this ad performed well.” It says “this angle worked for this audience on this platform — test it on the next platform with a different offer.”
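
The steps above can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the `Hypothesis` class, its field names, and `can_start_creative` are hypothetical, and the success criterion is modeled as "measured value strictly greater than a threshold", matching the "> 45%" example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hypothesis:
    statement: str    # the falsifiable claim
    metric: str       # name of the metric that decides it
    threshold: float  # success means measured > threshold

    def evaluate(self, measured: float) -> bool:
        """Return True if the measured value meets the success criterion."""
        return measured > self.threshold


def can_start_creative(h: Optional[Hypothesis]) -> bool:
    """Gate: no copy or video work until a hypothesis is fully written."""
    return h is not None and bool(h.statement.strip()) and bool(h.metric.strip())


if __name__ == "__main__":
    h = Hypothesis(
        statement="VN scene transition in the first 3 seconds beats abstract RPG messaging",
        metric="hold_rate_3s",
        threshold=0.45,
    )
    print(can_start_creative(h), h.evaluate(0.48))
```

Making the record frozen means the success criterion cannot be quietly edited after the results come in.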

The difference is subtle, but it is everything. One produces content. The other produces knowledge.

The architecture

The system has six phases, each with strict gates between them:

Strategist → Copy Matrix → Script → Production → Reviewer → Learning Loop

Strategist — The hypothesis engine. Before anything else, answer: product, audience, funnel stage, core promise, pain point, main objection, proof available, forbidden claims, success metric. Three to five hypotheses per campaign. Each one falsifiable.

Copy Matrix — Every headline and description carries a reference back to a hypothesis. No orphan variations. Risk flags on every line: truncation risk, claim risk, policy risk, combination risk.

Script — A locked script before video production. Shot breakdown, voiceover text, on-screen text, timing. Fifteen-second and thirty-second templates based on the YouTube ABCD framework. HyperFrames receives a locked input — it does not invent the ad.

Production — HyperFrames takes the script and renders it. HTML/GSAP composition, ElevenLabs VO, Whisper caption sync, shader transitions. The generation core does not make creative decisions. It produces.

Reviewer — A 100-point rubric covering hook strength, message clarity, offer strength, proof, platform fit, audio/caption quality, CTA strength, and compliance. Automatic failure conditions: no CTA, product unclear after 5 seconds, no captions on social video, unreadable on mobile. Draft renders get reviewed before high-quality final.

Learning Loop — Post-campaign, performance data feeds back into hypothesis generation. Kill what doesn’t work. Iterate what almost does. Scale what works. The system gets smarter with every $50 spent.
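
As one example of a gate, the reviewer's rubric could be scored along these lines. The eight criteria and the four automatic failure conditions come from the description above. The per-criterion weights, the 70-point pass mark, and all identifiers are assumptions made for illustration.

```python
# Hypothetical weights; the post only says the rubric totals 100 points.
WEIGHTS = {
    "hook_strength": 20,
    "message_clarity": 15,
    "offer_strength": 15,
    "proof": 10,
    "platform_fit": 10,
    "audio_caption_quality": 10,
    "cta_strength": 10,
    "compliance": 10,
}

# Automatic failure conditions listed in the post.
AUTO_FAIL = (
    "no_cta",
    "product_unclear_after_5s",
    "no_captions_on_social",
    "unreadable_on_mobile",
)


def review(scores: dict[str, float], flags: set[str], pass_mark: float = 70.0):
    """Score a draft render.

    scores maps each criterion to a value in [0, 1]; missing criteria
    count as 0. Any automatic failure flag fails the draft regardless
    of its point total.
    """
    failures = [f for f in AUTO_FAIL if f in flags]
    total = sum(
        weight * min(max(scores.get(name, 0.0), 0.0), 1.0)
        for name, weight in WEIGHTS.items()
    )
    passed = not failures and total >= pass_mark
    return passed, total, failures
```

Keeping the automatic failures separate from the point total means a polished ad with no CTA cannot score its way past the gate.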

The hard part wasn’t the code

The hard part was enforcing the gates.

It’s genuinely tempting to skip the strategist and jump to production when you have a good idea for a visual. Every time I’ve done that, the ad looked good and explained nothing. The reviewer catches it, but the reviewer is the last gate. Better to catch it at the strategist stage.

The pipeline doesn’t prevent you from making bad ads. It prevents you from making bad ads and learning nothing from them.
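
One way to enforce gates in code rather than by willpower is to refuse to complete a phase until every earlier phase has passed. The sketch below assumes the six phases run in the order listed earlier. `Campaign`, `GateError`, and `complete` are hypothetical names, not part of HyperFrames or any other tool mentioned here.

```python
from enum import IntEnum


class Phase(IntEnum):
    STRATEGIST = 0
    COPY_MATRIX = 1
    SCRIPT = 2
    PRODUCTION = 3
    REVIEWER = 4
    LEARNING_LOOP = 5


class GateError(RuntimeError):
    """Raised when a phase is attempted before its predecessors passed."""


class Campaign:
    def __init__(self) -> None:
        self.passed: set[Phase] = set()

    def complete(self, phase: Phase) -> None:
        """Mark a phase as passed, but only if all earlier phases passed."""
        missing = [p.name for p in Phase if p < phase and p not in self.passed]
        if missing:
            raise GateError(
                f"cannot complete {phase.name}; gates not passed: {missing}"
            )
        self.passed.add(phase)


if __name__ == "__main__":
    campaign = Campaign()
    try:
        campaign.complete(Phase.PRODUCTION)  # skipping straight to production
    except GateError as err:
        print(err)
```

With this in place, jumping to production on the strength of a good visual idea fails loudly at the first step instead of quietly at the reviewer.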

Why this matters for an indie builder

I don’t have a marketing team. I don’t have a media buyer. I have compute credits and a decent understanding of the tools.

This pipeline is the force multiplier. It costs nothing to generate a new hypothesis. It costs next to nothing to produce a 15-second ad from a locked script. The cost that matters is attention — and the pipeline exists to make sure every ad deserves the attention it asks for.

The learning loop is the actual product. Every campaign adds to a dataset that says “these hooks work for this audience on this platform, these don’t.” After enough campaigns, you don’t guess anymore. You know.
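
A minimal sketch of that dataset, keyed by angle, audience, and platform, might look like this. The kill, iterate, and scale outcomes come from the learning loop described earlier. The 10% near-miss margin that defines "almost works" and every name in the code are assumptions.

```python
from collections import defaultdict


def next_action(measured: float, threshold: float, near_miss: float = 0.10) -> str:
    """Decide what to do with a tested hypothesis.

    scale:   the metric beat the threshold
    iterate: it fell short by at most `near_miss` (relative, assumed 10%)
    kill:    anything lower
    """
    if measured > threshold:
        return "scale"
    if measured >= threshold * (1 - near_miss):
        return "iterate"
    return "kill"


class LearningLog:
    """Accumulates outcomes per (angle, audience, platform) combination."""

    def __init__(self) -> None:
        self._results: dict[tuple[str, str, str], list[str]] = defaultdict(list)

    def record(self, angle: str, audience: str, platform: str,
               measured: float, threshold: float) -> str:
        action = next_action(measured, threshold)
        self._results[(angle, audience, platform)].append(action)
        return action

    def history(self, angle: str, audience: str, platform: str) -> list[str]:
        return list(self._results.get((angle, audience, platform), []))
```

Keying on the full combination rather than the angle alone keeps a win on one platform from being read as a win everywhere.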

Built with HyperFrames, OpenRouter, ElevenLabs, and a lot of hypothesis templates. The full pipeline design is open to anyone who ships enough ads to need it.