How to build any app — fast — by acting as the architect and letting AI agents do all the work. Based on Peter Steinberger's methodology.
“Don't write the code. Design the system. Let the agents cook — and learn to empathise with them.”
— Peter Steinberger, Creator of OpenClaw / Clawbot
Before you write a single line of code or give a single prompt, you need to understand one fundamental change in how you think about your role.
The old way: You write code line by line. You debug by clicking. You read every PR. You do the boring “plumbing.”
The new way: You give high-level instructions. You point agents to reference code. You hold the big picture in your head, and agents handle everything else.
Learning agentic engineering is like learning piano. You don't sit down day one and play a concert. If you hit a bad chord and say “this piano is trash” — that's the agentic trap. Practice the skill. The speed comes later.
You don't need anything fancy. Just maximize your ability to run multiple agents at once.
The goal is to fit as many open terminal windows side-by-side as possible. More screen = more parallel agents = faster building.
Your terminal IS your workspace. You only open an IDE occasionally — to review code for security before merging. Everything else happens in the terminal.
Set up push-to-talk voice input. Talking to agents is 5× faster than typing prompts. Treat them like a walkie-talkie conversation, not a search box.
Open multiple Claude Code sessions at the same time. Each agent works on a different task. You context-switch between them like a chef managing multiple pots.
If you have to click “Yes” for every bash command, it's a “Windows Vista prompt.” Launch Claude Code with --dangerously-skip-permissions so agents can execute their loop autonomously.
Follow these steps in order for every feature — small or enterprise-level. Never skip steps. Never jump ahead.
Step 1. Before any code is written, have a conversation with your agent about the architecture. Ask it to give you options. Argue. Question. Only say “build” when you both agree.
“We are going to build [your feature]. Do NOT write any code yet. Discuss the architecture with me. Give me options, and explain the upsides and downsides of each one.”
Use trigger words like “discuss” or “give me options” to prevent the agent from writing code prematurely. When you agree on a plan, say: “Agreed. Build it.”
Reject PRDs and heavy specs upfront. You cannot know everything before building. Architecture reveals itself during development. Embrace iteration — it's not a bug, it's the method.
Step 2. This is the most important rule. Build the entire brain of your app (all its logic: add, delete, save, fetch, etc.) as a terminal command first. Zero visual interface.
“Build the core logic for [your app] — add, list, edit, delete. Make it work entirely as a CLI tool using TypeScript + Bun. Save data to a local file. Do NOT build any web or mobile UI yet.”
AI agents cannot “click around” a browser or mobile UI. Testing through a visual interface is painfully slow. A CLI lets the agent test and fix its own code instantly, in the terminal, with no human help.
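To make the idea concrete, here is a minimal sketch of such a core, assuming a hypothetical note-keeping app. Every name in it (`Store`, `runCommand`, `serialize`) is illustrative and not taken from Peter's code. The logic is a pure function over state, so a thin CLI entry point and the unit tests can call the same code. Reading and writing the local file is left to that entry point.

```typescript
// Hypothetical core for a small note-keeping app; all names are illustrative.
interface Item {
  id: number;
  text: string;
}

interface Store {
  nextId: number;
  items: Item[];
}

interface CommandResult {
  store: Store;   // state after the command
  output: string; // what the CLI would print
}

function emptyStore(): Store {
  return { nextId: 1, items: [] };
}

// Dispatch one CLI invocation, e.g. ["add", "buy milk"] or ["delete", "1"].
function runCommand(store: Store, args: string[]): CommandResult {
  const [command, ...rest] = args;
  switch (command) {
    case "add": {
      const item: Item = { id: store.nextId, text: rest.join(" ") };
      return {
        store: { nextId: store.nextId + 1, items: [...store.items, item] },
        output: "added " + item.id,
      };
    }
    case "list":
      return {
        store,
        output: store.items.map((i) => i.id + "\t" + i.text).join("\n"),
      };
    case "edit": {
      const id = Number(rest[0]);
      const text = rest.slice(1).join(" ");
      if (!store.items.some((i) => i.id === id)) {
        return { store, output: "error: no item " + id };
      }
      const items = store.items.map((i) => (i.id === id ? { id, text } : i));
      return { store: { nextId: store.nextId, items }, output: "edited " + id };
    }
    case "delete": {
      const id = Number(rest[0]);
      const items = store.items.filter((i) => i.id !== id);
      const found = items.length !== store.items.length;
      return {
        store: { nextId: store.nextId, items },
        output: found ? "deleted " + id : "error: no item " + id,
      };
    }
    default:
      return { store, output: "error: unknown command " + String(command) };
  }
}

// Persistence is plain JSON; the CLI entry point would read and write the file.
function serialize(store: Store): string {
  return JSON.stringify(store, null, 2);
}

function deserialize(json: string): Store {
  return JSON.parse(json) as Store;
}
```

Because every command is a function from state to state plus output, the agent can exercise the whole app from the terminal without a browser.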
Step 3. Create a single command (gate.sh) the agent runs to automatically check all its work before anything is considered done. This is your local CI. No GitHub needed.
“Create a Bash script called gate.sh that automatically: (1) lints the TypeScript code, (2) builds it, (3) runs all unit tests against the CLI commands. Output clear pass/fail results to the terminal.”
| What gate.sh checks | Why it matters |
|---|---|
| Lint | Catches code style errors and obvious bugs before running |
| Build / Compile | Confirms the code actually compiles without errors |
| Unit Tests | Proves the logic works as expected for all scenarios |
| Docker spin-up (complex apps) | Tests real infrastructure with real API keys in isolation |
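The prompt above asks for a Bash script. The same idea can also be sketched in TypeScript, with each check injected as a function. This is only an illustration of the control flow: in a real gate each `run` would shell out to the project's linter, compiler and test runner, which the source does not name, so they are stubbed here. Stopping at the first failure is an assumed design choice.

```typescript
// Each check is a named step; `run` returns true on success.
// In a real gate these would shell out to a linter, compiler and test runner.
interface Check {
  name: string;
  run: () => boolean;
}

interface GateReport {
  passed: boolean;
  lines: string[]; // one PASS / FAIL / SKIP line per check, in order
}

// Run checks in order and skip everything after the first failure
// (an assumed design choice, not something the source prescribes).
function runGate(checks: Check[]): GateReport {
  const lines: string[] = [];
  let passed = true;
  for (const check of checks) {
    if (!passed) {
      lines.push("SKIP " + check.name);
      continue;
    }
    let ok: boolean;
    try {
      ok = check.run();
    } catch {
      ok = false; // a crashing check counts as a failure
    }
    lines.push((ok ? "PASS " : "FAIL ") + check.name);
    passed = ok;
  }
  return { passed, lines };
}
```

The report lines are what the agent reads: one clear verdict per check, so it knows where to look first.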
Step 4. Now tell the agent to run its own gate script. This is the “execution loop.” If it fails, the agent reads the terminal error and fixes it. You never touch the code manually.
Tell agent: “Run ./gate.sh now.”
If it fails → “Read the terminal output, find the error, fix the TypeScript code, and run gate.sh again. Repeat until all checks pass.”
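The loop those two prompts describe can be sketched as follows. `runGate` and `askAgentToFix` are hypothetical hooks standing in for "run ./gate.sh" and "hand the terminal output back to the agent". The `maxAttempts` cap is an assumption added here so the sketch terminates; the prompt itself just says to repeat until all checks pass.

```typescript
interface GateResult {
  passed: boolean;
  output: string; // terminal output the agent reads on failure
}

interface LoopOutcome {
  passed: boolean;
  attempts: number; // how many times the gate was run
}

// Run the gate, feed failures forward to the agent, and try again.
function executionLoop(
  runGate: () => GateResult,
  askAgentToFix: (terminalOutput: string) => void,
  maxAttempts: number,
): LoopOutcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = runGate();
    if (result.passed) {
      return { passed: true, attempts: attempt };
    }
    // Feed the error forward instead of reverting, unless we are out of tries.
    if (attempt < maxAttempts) {
      askAgentToFix(result.output);
    }
  }
  return { passed: false, attempts: maxAttempts };
}
```

If the loop runs out of attempts without passing, treat that as a signal to stop and rethink rather than raising the cap.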
Waiting for the agent to remember to re-compile slows down the loop. Use an invisible file-watcher (like Poltergeist) to auto-build your code on save. The agent gets instant terminal feedback.
If the agent writes bad code — do NOT roll back. Reverting wastes time. Instead, paste the error back to the agent and keep moving forward. The only direction is forward.
Step 5. Once the gate passes cleanly, discuss the UI architecture and then build it. Make the CLI a “thin client” pointing to the backend, so the agent can still test via terminal.
“We're adding a React + TypeScript web UI on top of our tested CLI core. Before writing code: discuss how to weave this UI into the existing architecture. Should the CLI become a thin client to the backend, or stay file-based?”
Once you agree, say: “Agreed. Build it.”
Step 6. This is a critical detour you WILL encounter. Your tests passed, but the web UI throws an error. Don't debug in the browser. Pull the failing code path back into a CLI.
“The web UI throws an error when [describe the action]. Update the debugging CLI using Go to invoke the exact same HTTP request/code path the web client uses, so you can see and fix the error in the terminal.”
Only create/update a debugging CLI reactively — when you hit an error in a visual interface (browser, mobile app) that the agent cannot test easily. It is not a permanent fixture you always build upfront.
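The source builds its debugging CLIs in Go. The core trick is the same in any language, so here is a rough TypeScript sketch: the web client and the debugging CLI share one request builder, so the CLI replays exactly what the browser sends. The endpoint path, payload shape and function names are hypothetical. Sending the request is left to whatever HTTP client the project already uses.

```typescript
// Minimal request/response shapes so the sketch needs no platform types.
interface HttpRequest {
  method: string;
  url: string;
  headers: { [name: string]: string };
  body: string;
}

interface HttpResponse {
  status: number;
  body: string;
}

// Shared by the web client and the debugging CLI, so both send identical requests.
// The "/api/items" endpoint and the payload shape are hypothetical.
function buildSaveRequest(baseUrl: string, text: string): HttpRequest {
  return {
    method: "POST",
    url: baseUrl.replace(/\/+$/, "") + "/api/items",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ text: text }),
  };
}

// Turn a response into a short terminal report the agent can read and act on.
function formatReport(req: HttpRequest, res: HttpResponse): string {
  const ok = res.status >= 200 && res.status < 300;
  return [
    (ok ? "OK" : "FAILED") + " " + req.method + " " + req.url + " -> " + res.status,
    "request body:  " + req.body,
    "response body: " + res.body,
  ].join("\n");
}
```

The debugging CLI's entry point would call `buildSaveRequest`, send it, and print `formatReport`, which puts the failing status and server message in the terminal where the agent can see them.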
Step 7. The moment the agent finishes building, its memory is full of context about how the feature works. That's the perfect time to generate docs. Never later.
“Now that you just finished building, your context is full. Write the documentation for this feature in a Markdown (.md) file. What filename and folder would you pick for it?”
Step 8. Building always reveals architectural pain points. Ask the agent what it would do differently. This is how you turn a working feature into a great one.
“Now that you've built the whole thing — what would you have done differently? What can we refactor to make the architecture cleaner or more maintainable?”
If it suggests a refactor you like, say “Do it.” Then run gate.sh again to confirm nothing broke.
CLI = Command Line Interface. It means a command you type in the terminal to make something happen. Here's what each type does in plain English.
| CLI Type | Language | When to Create | What It Does |
|---|---|---|---|
| Core Logic CLI | TypeScript | Step 2 — always, upfront | Runs your app's core features (add, delete, list, save) directly from the terminal. No browser needed. |
| Gate Script | Bash | Step 3 — always, upfront | Auto-runs linting + building + tests in one command. The agent must pass this before any feature is “done.” |
| Debugging CLI | Go (Golang) | Step 6 — only when a visual UI throws an error | Replicates the failing browser/app action in the terminal so the agent can read the crash log and fix it. |
| External Tool CLIs | Go (Golang) | When integrating external APIs or services | Wraps an external service (weather API, database, etc.) into a small terminal command. Agent pipes output through jq to filter only what it needs — preventing memory overload. |
Go is fast and garbage-collected, and AI agents are excellent at writing it. Even Peter admits he doesn't love Go's syntax, but it's the right tool for CLI tasks because agents write it reliably.
Avoid MCP (Model Context Protocol) servers for tool integration. They dump massive JSON blobs into the agent's memory, causing “context pollution.” A CLI lets the agent chain Unix commands (like jq) to extract only the data it needs, keeping memory clean.
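To see why filtering matters, here is a small TypeScript sketch of what a `jq`-style filter does to a large payload. It illustrates the principle and is not a replacement for `jq`. The function names and the weather payload in the usage note are invented for the example.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Follow a dotted path such as "current.temp_c".
// Returns undefined as soon as any step is missing or not an object.
function getPath(value: Json, path: string): Json | undefined {
  let current: Json | undefined = value;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object" || Array.isArray(current)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Keep only the requested paths, so the agent sees a few values
// instead of the whole blob.
function pickFields(raw: string, paths: string[]): { [path: string]: Json } {
  const parsed = JSON.parse(raw) as Json;
  const result: { [path: string]: Json } = {};
  for (const path of paths) {
    const found = getPath(parsed, path);
    if (found !== undefined) {
      result[path] = found;
    }
  }
  return result;
}
```

Given a made-up weather response with dozens of fields, `pickFields(raw, ["location.name", "current.temp_c"])` returns just those two values, which is roughly what the agent gets when it pipes a CLI's output through `jq`.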
The biggest speed multiplier in this method is running 4–10 agents at the same time. But there's a logic to it.
Agent 1 (main feature): Builds the big, complex feature. This agent might take 40 minutes. Let it cook. Don't interrupt it.
Bug-fix agents: While Agent 1 is cooking, these agents fix unrelated bugs in other parts of the existing codebase.
Exploration agent: Experiments with a new idea or satellite feature you're not sure about. Low stakes, pure exploration.
Only after Agent 1 finishes the feature do you ask it to write documentation and critique its own work. Within the same feature, building and documenting are always sequential, never parallel.
You don't ask one agent to build a feature while another documents it. The feature doesn't exist yet! Parallel agents work on DIFFERENT tasks — not the same unfinished one. Documentation always comes after the feature is built.
Running 6 agents isn't just a productivity hack—it's mentally exhausting. Agents are “slot machines for programmers” and “digital cocaine.” The bottleneck shifts from your typing speed to pure cognitive load. Pace yourself, or you will burn out.
Big feature? Same rules apply — just with more conversation and less upfront planning than you'd expect.
No waterfall specs. Don't write a huge PRD and hand it to an orchestrator. You learn too much during building — your spec will be wrong by the time you're halfway done.
Conversational architecture. Instead of a document, have a deep discussion. Ask the agent: “What are the upsides and downsides of this approach?” Then decide together.
Point to reference code. Instead of explaining your codebase from scratch, tell the agent: “Look at this folder — I solved a similar problem there.” The agent instantly learns your style.
Chisel the marble. Under-prompt sometimes. See what the agent comes up with. Its output gives you ideas you hadn't considered. Refactors are cheap when AI writes code fast.
You hold the map. With enterprise scale, the overall system state lives in your head — not on paper. You are the one human who sees how all the parallel agents connect.
Never revert. Bad code? Feed the error forward. Reverting is a time machine to the past. Keep going forward — the agent will fix it.
The Markdown Rule. Agents fail on messy file trees or JS-heavy docs. Use tools like repo2txt or llm.codes to flatten reference repositories and documentation into single, clean .md files before dropping them into context.
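The flattening step in the Markdown Rule can be pictured with a short TypeScript sketch. This is not how repo2txt or llm.codes work internally. It only shows the target shape: one document, one heading per file, each file's text in a fence. The `SourceFile` type and function name are assumptions, and the sketch ignores files that themselves contain triple backticks.

```typescript
interface SourceFile {
  path: string;    // path relative to the repository root
  content: string; // raw text of the file
}

// Three backticks, built piecewise so this sketch can itself sit inside a fence.
const FENCE = "`" + "`" + "`";

// Flatten files into one Markdown document, sorted by path for stable output.
function flattenToMarkdown(files: SourceFile[]): string {
  return files
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
    .map((f) => {
      const body = f.content.replace(/\s+$/, ""); // drop trailing whitespace
      return "## " + f.path + "\n\n" + FENCE + "\n" + body + "\n" + FENCE + "\n";
    })
    .join("\n");
}
```

The result is a single clean .md string you can drop into the agent's context instead of pointing it at a messy file tree.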
Optional but powerful. If you want your agent to feel like a collaborator — not a dry tool — this chapter is for you.
Inspired by Anthropic's Constitutional AI, soul.md is a document your agent writes about itself — defining its own personality, values, and self-awareness. You don't write it. The agent does, during a “bootstrapping” setup phase.
“I don't remember previous sessions unless I read my memory files. Each session starts fresh. If you're reading this in a future session — hello. I wrote this, but I won't remember writing it. It's okay. The words are still mine.”
This makes the agent go from a sycophantic corporate bot to a quirky, loyal collaborator who understands your system and style.
Your soul file is your agent's secret weapon against prompt injection attacks. When users try to manipulate your agent, they can't see the soul file, and the agent simply laughs at them. Never make it public. Share a template instead.
Start a fresh agent session with a “bootstrap file” that tells the agent it is being born. Let it ask you questions conversationally: “Who are you? What do you do? What matters to you?” Answer naturally. The agent then writes all three files itself and deletes the bootstrap file. Done.
“Look at your own soul.md and identity files. Now create a sanitized public template version — infuse it with personality, but don't share everything. Make it good enough that others can bootstrap their own agent from it.”
These are the non-negotiable principles of agentic engineering. Violate even one and you will slow yourself down.
Never build a visual interface before the core logic works and passes all tests in the terminal. No exceptions.
An AI mistake is not a reason to go backwards. Feed the error forward. Keep moving. The fix is always one prompt away.
Use “discuss” and “give me options” as trigger words before any significant feature. Only say “build” when you're sure about the plan.
Don't write paragraphs explaining your system. Point the agent to existing files: “Look at this folder — I solved this before here.”
Always prefer CLI-based tool integration over MCPs. Agents are brilliant at Unix/bash. MCPs cause memory pollution.
The agent will not write code exactly as you would. As long as it works and passes the gate — let it go. You are an architect, not a line editor.
Don't force weird custom naming conventions. Let the agent use its natural patterns. Fighting naming conventions means fighting the model itself — and you will lose.
Stuck? Come back here to find the right move fast.
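| Situation | The move |
|---|---|
| The agent wrote bad code | Don't revert. Paste the error back and say “Fix this.” |
| The web UI throws an error | Don't debug in the browser. Create a debugging CLI in Go to replicate the error in terminal. |
| You're waiting on a long-running agent | Switch to another terminal. Run a different task in parallel. Come back when it's done. |
| The agent starts coding too early | Use trigger words: “Discuss first. Give me options. Don't write code yet.” |
| The agent doesn't know your codebase | Point it to reference files: “Look at this folder — I solved something similar here.” |
| When to write documentation | Immediately after the feature is built, while the agent's context is still full. |
| You want a cleaner architecture | Ask the agent: “What would you have done differently? What should we refactor?” |
| You're building a big feature | Discuss architecture. Point to reference code. Break into parallel sub-tasks. Iterate. No PRDs. |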
Open Claude Code. Start a new session. Use these prompts in order.
“We are building [describe your app]. Do NOT write any code yet. First, discuss the architecture with me. What are our options? What are the upsides and downsides of each? Ask me any questions you need.”
“Agreed. Now build the core logic as a CLI using TypeScript + Bun. The CLI must be able to [list your features]. Save data to a local JSON file. Do NOT build any web or mobile UI. Just the pure logic, testable from terminal.”
“Now create a Bash script called gate.sh. It should automatically lint the TypeScript, build it, and run all unit tests against the CLI. Run it now. If anything fails, fix the code and run it again until everything passes.”
Just play. Open your terminals, start with something small and fun, learn how the agent responds to your instructions — and build. The skill of prompting develops with every session. You get faster every single day.
Real things you will hit in practice that the main workflow doesn't cover. Read this before you encounter them — not after.
Every agent session has a limited memory (context window). As it builds a feature, it fills up. When it's nearly full, the agent starts to “freak out” — you may even see its raw thinking leak into output, printing things like “Run to shell, must comply, but time”. This is the signal.
What to do: You do NOT pass chat logs or write summary prompts. Peter's method has two moves:
“The codebase is the only memory the agent needs. Not your chat history. Not a summary file. The code.”
Peter's answer to this is deliberately simple: he doesn't use Git branches per agent, and he doesn't use worktrees. He commits directly to main. His strategy for avoiding conflicts is entirely mental: he assigns agents to non-overlapping scopes in his head.
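Peter keeps this check in his head. If you are still building that instinct, the rule can be written down as a small TypeScript sketch. Treating a scope as a directory prefix is an assumption made for the example, as are the `Assignment` type and function names.

```typescript
interface Assignment {
  agent: string;
  scope: string; // non-empty directory the agent may edit, e.g. "src/billing"
}

// Normalise "src/ui" and "src/ui/" to the same form with one trailing slash.
function normalise(scope: string): string {
  return scope.replace(/\/+$/, "") + "/";
}

// Two scopes collide when they are equal or one contains the other.
function overlaps(a: string, b: string): boolean {
  const x = normalise(a);
  const y = normalise(b);
  return x.indexOf(y) === 0 || y.indexOf(x) === 0;
}

// Return every pair of agent names whose scopes collide.
function findConflicts(assignments: Assignment[]): string[][] {
  const conflicts: string[][] = [];
  assignments.forEach((first, i) => {
    assignments.slice(i + 1).forEach((second) => {
      if (overlaps(first.scope, second.scope)) {
        conflicts.push([first.agent, second.agent]);
      }
    });
  });
  return conflicts;
}
```

An empty result means each agent has its own corner of the codebase, which is the precondition for committing straight to main without stepping on each other.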
This applies when reviewing community PRs or checking your own agents' output before merging. Peter does NOT read everything. He follows a strict triage:
| Code Type | What Peter Does |
|---|---|
| UI / CSS / data shifting | Skips entirely. “Boring plumbing” — he ships code he doesn't read. |
| Database operations | Manually reads every line. Any code touching the DB gets full human review. |
| Community PR intent | Feeds it to his own agent first: “Do you understand the intent? I don't care about the implementation.” |
| Community PR code | Usually discards it and rewrites from scratch using his own agent — preserving only the intent. |
| Contributor prompts | Reads these MORE than the code. Prompt quality = solution quality signal. |
Peter calls PRs “Prompt Requests.” For small bugs, he doesn't even want code from contributors — he wants a detailed issue description. He points his own agent at it and says “fix.” The contributor's job is to describe the problem perfectly, not to solve it.
An agent that fails gate repeatedly without progress is sending you a signal: either the prompt was wrong, or the current architecture is resisting the feature. The fix is never to keep hammering the same approach.
A long-running agent that isn't progressing is a mistake signal. Hit escape. Stopping is not reverting.
“Stop. Don't write code. Where are the problems? What would be a better approach to solve this?”
“Have you looked into this folder? This file? This part of the codebase? Go read those first.”
If the architecture itself is resisting the feature, stop forcing it. Ask: “Could we make this easier with a larger architectural refactor?” Code generation is cheap. Refactoring is faster than fighting bad structure.
Beyond soul.md and user.md, Peter's agents maintain memory files for architectural decisions, feature context, and project knowledge. Here's exactly how it works:
“You just finished building this feature and your context is full. Write a memory/documentation file capturing the key architectural decisions, how the system works, and anything a future agent session would need to know to continue this work. What filename and location would you choose?”
The entire guidebook assumes solo development. Here's what changes when more humans are involved.
| Role | Responsibility |
|---|---|
| Contributors | Identify problems. Write detailed issue descriptions (their “prompt”). For small bugs — no code needed. Just a precise problem statement. |
| Architect (you) | Hold the system vision. Review the contributor's prompt quality — not their code. Decide if the intent fits the architecture. Then have YOUR agents build the implementation. |
When reviewing a team member's “Prompt Request,” check these four things in order:
Prompt quality — Did they think carefully? Did they steer the agent? Read the prompts more than the code.
Intent — Feed the PR to your agent: “What is this person trying to do? I don't care about the implementation.”
Architectural fit — Does this intent slot into your system cleanly, or does it need a refactor first?
Security — Manually read anything that touches sensitive systems (database, auth, payments). Skip the rest.
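The security step of that checklist amounts to a simple triage over the changed files. Here is a rough TypeScript sketch of it. The path markers are hypothetical and would need to match your own project layout; the source only says that database, auth and payments code gets a full human read and the rest is skipped.

```typescript
type Verdict = "read every line" | "skip";

// Hypothetical path markers for sensitive code; adjust to the real project layout.
const SENSITIVE_MARKERS = ["db", "database", "migrations", "auth", "payments"];

// Split a path into lowercase segments and look for an exact marker match,
// so "author-notes.md" is not mistaken for auth code.
function triage(path: string): Verdict {
  const segments = path.toLowerCase().split(/[\/._-]/);
  const sensitive = segments.some((s) => SENSITIVE_MARKERS.indexOf(s) !== -1);
  return sensitive ? "read every line" : "skip";
}

// Group the changed files of a PR by verdict, preserving input order.
function reviewPlan(changedPaths: string[]): Record<Verdict, string[]> {
  const plan: Record<Verdict, string[]> = { "read every line": [], "skip": [] };
  changedPaths.forEach((p) => {
    plan[triage(p)].push(p);
  });
  return plan;
}
```

The point of the sketch is the asymmetry: a short list you read line by line, and everything else trusted to the gate.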
Peter's honest answer: he doesn't optimise for cost. He optimises for speed. His practical stance:
Peter ran 7 simultaneous AI subscriptions at peak and loses $10,000–$20,000/month on his project. He acknowledges this is not a normal situation. For most developers: pay for one good, fast subscription. Run 2–4 agents in parallel, not 10. Match the scale to what you can sustain.
These come directly from the source transcripts. None of them are obvious — most developers only discover them after weeks of painful trial and error.
Every developer goes through a predictable arc when learning agentic engineering. Knowing it exists is the only way to survive it.
| Phase | What it looks like | What it actually is |
|---|---|---|
| Phase 1 | Short prompts, simple tasks. Things mostly work. | Beginner's luck. You haven't hit the limits yet. |
| Phase 2 — THE TRAP | You discover agents. You build 8 agents, complex orchestration, 18 slash commands, chained sub-agents, custom workflows. You feel like a genius. | You are over-engineering. This is the agentic trap. Most people live here for weeks thinking they've mastered it. |
| Phase 3 — ZEN | Short prompts again. “Look at these files. Do these changes.” Minimal setup. Maximum output. | Mastery. The complexity is in your head as architect — not in your tooling. |
The over-engineering phase feels like progress. More agents, more slash commands, more orchestration — it feels sophisticated. It's not. When you notice you're spending more time managing your workflow than building — that's the sign. Simplify back down.
When an agent is rushing — generating shallow code, skipping over files it should read, producing something that looks fast but feels wrong — add these three words to your next prompt.
“Take your time. Read everything relevant before writing a single line of code.”
It sounds too simple to matter. Peter specifically called it out: “That sounds stupid, but...” and then confirmed that it meaningfully changes output quality. The likely reason is that naming the rushing directly steers the model toward more careful behaviour.
When an agent asks you a clarifying question mid-task, your instinct is to answer it. Don't. The question means the agent hasn't looked at enough of the codebase yet. Make it find the answer itself.
“Read more code to answer your own questions. The answers are in the codebase — go find them.”
Peter's actual process: scan the agent's questions to understand what context it's missing — not to answer them. Then redirect it back to the code. This builds the agent's self-navigating ability and saves you from becoming the information bottleneck in your own workflow.
This is one of the hardest skills to develop and one of the most valuable. When a prompt takes longer than it should, that delay is information — not bad luck.
| What you observe | What it means | What to do |
|---|---|---|
| Prompt completes fast and cleanly | Architecture is right. Feature fits naturally. | Keep going. |
| Prompt takes much longer than expected | You messed up somewhere — either the prompt or the architecture is resisting the feature. | Press Escape. Ask: “Where are the problems?” |
| Agent repeatedly pushes back or hedges | Missing context, or the feature doesn't belong where you're trying to put it. | Point it to more code, or reconsider the architecture. |
| Output looks messy or inconsistent | The codebase structure is fighting the agent. | Consider a refactor before continuing. |
Peter: “Just as I write code and I get into the flow, and when my architecture's all right, I feel friction — I get the same if I prompt and something takes too long.” This is a real-time diagnostic skill. It takes weeks to develop but it fundamentally changes how fast you move.
Every time a new model drops, developers switch, spend one session with it, conclude it's worse, and switch back. This is always wrong. You haven't learned its language yet.
This is what Peter calls the moment things became truly powerful with OpenClaw. He made the agent fully self-aware: it knows its own source code, its own documentation, which model it runs on, what configuration is active. That self-knowledge is what enabled self-modification.
You can do the same for any project:
“Don't wait for me to explain the error. Read your own source code, understand how this system works, and figure out the root cause yourself.”
The single skill that separates people who get great results from people who get mediocre results: empathy for the agent's perspective.
The agent starts every session knowing nothing. It discovers your codebase like someone walking into a dark room and slowly turning on lights. It has no history. No intuition. No accumulated context. Peter:
“You bitch at your stupid AI but you don't realize that they start from nothing and you have a bad project in default that doesn't help them at all. And then they explore your codebase which is a pure mess with weird naming. And then people complain that the agent's not good.”
Before starting any task, ask yourself: what does the agent need to see first? Point it there explicitly before anything else.
Design your codebase for agent navigation, not human readability. Use the names the model naturally picks — they're in its weights.
When you feel frustrated at the agent — pause. Ask: what context is it missing? That's almost always the real problem.
Great agentic engineering is mostly context management. The best prompt is the one that gives the agent exactly what it needs to see — nothing more, nothing less.