✦ The Complete Playbook

The Agentic Engineering Guidebook

How to build any app — fast — by acting as the architect and letting AI agents do all the work. Based on Peter Steinberger's methodology.

Simple enough for beginners
Powerful enough for enterprise
Any app. Any size.

“Don't write the code. Design the system. Let the agents cook — and learn to empathise with them.”

— Peter Steinberger, Creator of OpenClaw / Clawbot
Chapter 0

The One Big Mindset Shift

Before you write a single line of code or give a single prompt, you need to understand one fundamental change in how you think about your role.

Old Way: You are the Coder

You write code line by line. You debug by clicking. You read every PR. You do the boring “plumbing.”

New Way: You are the Architect

You give high-level instructions. You point agents to reference code. You hold the big picture in your head — agents handle everything else.

🎹
The Piano Analogy

Learning agentic engineering is like learning piano. You don't sit down day one and play a concert. If you hit a bad chord and say “this piano is trash” — that's the agentic trap. Practice the skill. The speed comes later.

Chapter 1

Your Setup: Hardware & Workspace

You don't need anything fancy. Just maximize your ability to run multiple agents at once.

🖥️

Two Monitors (or More)

The goal is to fit as many open terminal windows side-by-side as possible. More screen = more parallel agents = faster building.

⌨️

Ditch the IDE

Your terminal IS your workspace. You only open an IDE occasionally — to review code for security before merging. Everything else happens in the terminal.

🎙️

Voice Dictation

Set up push-to-talk voice input. Talking to agents is 5× faster than typing prompts. Treat them like a walkie-talkie conversation, not a search box.

🔄

4–10 Parallel Terminals

Open multiple Claude Code sessions at the same time. Each agent works on a different task. You context-switch between them like a chef managing multiple pots.

🔓

Unchained Permissions

If you have to click “Yes” for every bash command, it's a “Windows Vista prompt.” Launch Claude Code with --dangerously-skip-permissions so agents can execute their loop autonomously.

Chapter 2

The Complete Step-by-Step Workflow

Follow these steps in order for every feature — small or enterprise-level. Never skip steps. Never jump ahead.

01

Discuss the Architecture First — Don't Build Yet

Before any code is written, have a conversation with your agent about the architecture. Ask it to give you options. Argue. Question. Only say “build” when you both agree.

→ Your Prompt to Agent

“We are going to build [your feature]. Do NOT write any code yet. Discuss the architecture with me. Give me options, and explain the upsides and downsides of each one.”

Use trigger words like “discuss” or “give me options” to prevent the agent from writing code prematurely. When you agree on a plan, say: “Agreed. Build it.”

💡
For Big Features

Reject PRDs and heavy specs upfront. You cannot know everything before building. Architecture reveals itself during development. Embrace iteration — it's not a bug, it's the method.

02

Build Core Logic as a CLI First — No UI Yet

This is the most important rule. Build the entire brain of your app (all its logic: add, delete, save, fetch, etc.) as a terminal command first. Zero visual interface.

→ Your Prompt to Agent

“Build the core logic for [your app] — add, list, edit, delete. Make it work entirely as a CLI tool using TypeScript + Bun. Save data to a local file. Do NOT build any web or mobile UI yet.”

⚠️
Why CLI First?

AI agents cannot “click around” a browser or mobile UI. Testing through a visual interface is painfully slow. A CLI lets the agent test and fix its own code instantly, in the terminal, with no human help.
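
To make "the brain as a CLI" concrete, here is a minimal sketch, assuming a hypothetical to-do app. The names `Item`, `Result`, and `runCommand` are invented for illustration, and storage is kept in memory so the logic stays easy to test; a real version would load and save the local file mentioned in the prompt.

```typescript
// Hypothetical to-do app. All names below are illustrative, not from the guide.
type Item = { id: number; text: string };
type Result = { items: Item[]; output: string };

// Pure command handler: takes the current items and the CLI arguments,
// and returns the new items plus the text the CLI would print.
function runCommand(items: Item[], args: string[]): Result {
  const command = args[0];
  const rest = args.slice(1);
  const id = Number(rest[0]); // only meaningful for edit and delete
  const exists = items.some((item) => item.id === id);

  switch (command) {
    case "add": {
      const text = rest.join(" ").trim();
      if (text === "") return { items, output: "error: add needs some text" };
      const nextId = items.reduce((max, item) => Math.max(max, item.id), 0) + 1;
      return { items: [...items, { id: nextId, text }], output: `added #${nextId}` };
    }
    case "list": {
      const lines = items.map((item) => `#${item.id} ${item.text}`);
      return { items, output: lines.length > 0 ? lines.join("\n") : "(empty)" };
    }
    case "edit": {
      const text = rest.slice(1).join(" ").trim();
      if (!exists || text === "") {
        return { items, output: "error: edit needs a valid id and new text" };
      }
      const updated = items.map((item) => (item.id === id ? { id, text } : item));
      return { items: updated, output: `updated #${id}` };
    }
    case "delete": {
      if (!exists) return { items, output: "error: delete needs a valid id" };
      return { items: items.filter((item) => item.id !== id), output: `deleted #${id}` };
    }
    default:
      return { items, output: `error: unknown command ${String(command)}` };
  }
}
```

Because every command is a plain function call that returns text, the agent can exercise each path from the terminal or from unit tests without opening a browser. A thin entry point would pass the process arguments into `runCommand` and print `output`.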

03

Create the “Gate” Script — Your Automated Wall

Create a single command (gate.sh) the agent runs to automatically check all its work before anything is considered done. This is your local CI. No GitHub needed.

→ Your Prompt to Agent

“Create a Bash script called gate.sh that automatically: (1) lints the TypeScript code, (2) builds it, (3) runs all unit tests against the CLI commands. Output clear pass/fail results to the terminal.”

| What gate.sh checks | Why it matters |
|---|---|
| Lint | Catches code style errors and obvious bugs before running |
| Build / Compile | Confirms the code actually compiles without errors |
| Unit Tests | Proves the logic works as expected for all scenarios |
| Docker spin-up (complex apps) | Tests real infrastructure with real API keys in isolation |

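
The gate in this guide is a Bash script. The sketch below only models its control flow in TypeScript, the language of the code being gated: run the checks in order and print a clear pass/fail line for each. `Check`, `GateReport`, and `runGate` are hypothetical names, and stopping at the first failure is an assumption (the prompt above only asks for clear pass/fail output).

```typescript
// A check is a name plus a function returning an exit code (0 means pass).
// In a real gate each function would shell out to the linter, compiler, or test runner.
type Check = { name: string; run: () => number };
type GateReport = { passed: boolean; lines: string[] };

function runGate(checks: Check[]): GateReport {
  const lines: string[] = [];
  for (const check of checks) {
    const code = check.run();
    if (code !== 0) {
      lines.push(`FAIL ${check.name} (exit ${code})`);
      return { passed: false, lines }; // assumption: stop at the first failure
    }
    lines.push(`PASS ${check.name}`);
  }
  return { passed: true, lines };
}
```

The design point is a single entry point with an unambiguous result, so "done" has one definition the agent can verify by itself.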
04

Run the Gate — Close the Loop

Now tell the agent to run its own gate script. This is the “execution loop.” If it fails, the agent reads the terminal error and fixes it. You never touch the code manually.

→ Your Action + Prompt

Tell agent: “Run ./gate.sh now.”

If it fails → “Read the terminal output, find the error, fix the TypeScript code, and run gate.sh again. Repeat until all checks pass.”
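
The execution loop itself is small enough to sketch. In the version below, `gate` and `askAgentToFix` are hypothetical stand-ins for "run ./gate.sh" and "paste the terminal output back to the agent". The attempt cap is an added assumption, not part of the prompt above; it reflects the advice in EC-4 to stop and talk when an agent keeps failing.

```typescript
type GateResult = { passed: boolean; log: string };

// Run the gate, feed failures forward to the agent, and repeat.
// Gives up after maxAttempts so a stuck agent becomes visible.
function closeTheLoop(
  gate: () => GateResult,
  askAgentToFix: (log: string) => void,
  maxAttempts: number
): { passed: boolean; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = gate();
    if (result.passed) return { passed: true, attempts: attempt };
    // Only ask for a fix if there is another gate run left to verify it.
    if (attempt < maxAttempts) askAgentToFix(result.log);
  }
  return { passed: false, attempts: maxAttempts };
}
```

Note that nothing in the loop rolls code back: every failure is handed to the agent as input for the next attempt.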

👻
Pro-Tip: Background Compilation

Waiting for the agent to remember to re-compile slows down the loop. Use an invisible file-watcher (like Poltergeist) to auto-build your code on save. The agent gets instant terminal feedback.

🚫
NEVER Revert Code

If the agent writes bad code — do NOT roll back. Reverting wastes time. Instead, paste the error back to the agent and keep moving forward. The only direction is forward.

05

Build the UI — Only After Gate Passes

Once the gate passes cleanly, discuss the UI architecture and then build it. Make the CLI a “thin client” pointing to the backend — so the agent can still test via terminal.

→ Your Prompt to Agent

“We're adding a React + TypeScript web UI on top of our tested CLI core. Before writing code: discuss how to weave this UI into the existing architecture. Should the CLI become a thin client to the backend, or stay file-based?”

Once you agree, say: “Agreed. Build it.”

06

If UI Has an Error — Pull It Back to Terminal

This is a critical detour you WILL encounter. Your tests passed, but the web UI throws an error. Don't debug in the browser. Pull the failing code path back into a CLI.

The UI Error Recovery Loop

1. Browser error found
2. Identify the exact failing action
3. Update the debugging CLI to replicate it
4. Agent runs the CLI and reads the crash log
5. Agent fixes the backend logic
6. Run gate.sh again, then go back to the browser

→ Your Prompt to Agent

“The web UI throws an error when [describe the action]. Update the debugging CLI using Go to invoke the exact same HTTP request/code path the web client uses, so you can see and fix the error in the terminal.”

🔁
When to Create a Debugging CLI

Only create/update a debugging CLI reactively — when you hit an error in a visual interface (browser, mobile app) that the agent cannot test easily. It is not a permanent fixture you always build upfront.
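
One way to guarantee the debugging CLI hits "the exact same" code path is to have the web client and the CLI share a single request builder. The guide writes its debugging CLIs in Go; this sketch uses TypeScript, the language of the core CLI in Step 2. Everything named here is hypothetical: the `/api/notes` endpoint, `buildSaveNoteRequest`, `replay`, and the injected `send` function that stands in for a real HTTP call.

```typescript
// Hypothetical shapes for a request and its response.
type ApiRequest = { method: string; path: string; body?: unknown };
type ApiResponse = { status: number; body: string };
type Send = (request: ApiRequest) => ApiResponse;

// The one place that knows how "save note" becomes a request.
// Both the web client and the debugging CLI would import this builder.
function buildSaveNoteRequest(title: string): ApiRequest {
  return { method: "POST", path: "/api/notes", body: { title } };
}

// Debugging CLI core: replay the request and return a terminal-friendly report.
function replay(request: ApiRequest, send: Send): string {
  const response = send(request);
  const verdict = response.status >= 400 ? "FAILED" : "ok";
  const summary = `${request.method} ${request.path} -> ${response.status} ${verdict}`;
  return [summary, response.body].join("\n");
}
```

With the failing action reduced to one command and a printed status line, the agent can reproduce the browser error, read the crash output, and fix the backend without anyone clicking through the UI.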

07

Auto-Generate Documentation — Right After Building

The moment the agent finishes building, its memory is full of context about how the feature works. That's the perfect time to generate docs. Never later.

→ Your Prompt to Agent

“Now that you just finished building, your context is full. Write the documentation for this feature in a Markdown (.md) file. What filename and folder would you pick for it?”

08

Ask the Agent to Critique Itself — Then Refactor

Building always reveals architectural pain points. Ask the agent what it would do differently. This is how you turn a working feature into a great one.

→ Your Prompt to Agent

“Now that you've built the whole thing — what would you have done differently? What can we refactor to make the architecture cleaner or more maintainable?”

If it suggests a refactor you like, say “Do it.” Then run gate.sh again to confirm nothing broke.

Chapter 3

What Exactly Are These CLIs?

CLI = Command Line Interface. It means a command you type in the terminal to make something happen. Here's what each type does in plain English.

| CLI Type | Language | When to Create | What It Does |
|---|---|---|---|
| Core Logic CLI | TypeScript | Step 2 (always, upfront) | Runs your app's core features (add, delete, list, save) directly from the terminal. No browser needed. |
| Gate Script | Bash | Step 3 (always, upfront) | Auto-runs linting + building + tests in one command. The agent must pass this before any feature is “done.” |
| Debugging CLI | Go (Golang) | Step 6 (only when a visual UI throws an error) | Replicates the failing browser/app action in the terminal so the agent can read the crash log and fix it. |
| External Tool CLIs | Go (Golang) | When integrating external APIs or services | Wraps an external service (weather API, database, etc.) into a small terminal command. Agent pipes output through jq to filter only what it needs, preventing memory overload. |

🧠
Why Go for Debugging CLIs?

Go is fast and garbage-collected, and AI agents are excellent at writing it. Even Peter admits he doesn't love Go's syntax, but it's the right tool for CLI tasks because agents write it so reliably.

🔗
CLI vs MCP (Why CLIs Win)

Avoid MCP (Model Context Protocol) servers for tool integration. They dump massive JSON blobs into the agent's memory, causing “context pollution.” A CLI lets the agent chain Unix commands (like jq) to extract only the data it needs, keeping memory clean.
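
To see why filtering matters, here is a small sketch of what a filter like `jq '.current.temp_c'` does to a large response: it follows one path and discards everything else. The `pick` helper and the weather payload shape are hypothetical and only illustrate the idea; in practice the agent would pipe the CLI's output through jq itself.

```typescript
// Follow a dot-separated path (for example "current.temp_c") into parsed JSON.
// Returns undefined if any step along the path is missing.
function pick(json: string, path: string): unknown {
  let value: unknown = JSON.parse(json);
  for (const key of path.split(".")) {
    if (typeof value !== "object" || value === null) return undefined;
    value = (value as Record<string, unknown>)[key];
  }
  return value;
}

// Hypothetical weather response: most of it is irrelevant to the question asked.
const payload = JSON.stringify({
  location: { name: "Vienna", lat: 48.2, lon: 16.4 },
  current: { temp_c: 21.5, wind_kph: 9 },
  hourly: [{ temp_c: 20 }, { temp_c: 22 }, { temp_c: 23 }],
});

// Only this one number would need to enter the agent's context.
const temperature = pick(payload, "current.temp_c");
```

The whole payload stays outside the conversation; the agent sees a single value instead of the full blob.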

Chapter 4

Running Parallel Agents Without Chaos

The biggest speed multiplier in this method is running 4–10 agents at the same time. But there's a logic to it.

🍳

Terminal 1 (The Main Cook)

Building the big, complex feature. This agent might take 40 minutes. Let it cook. Don't interrupt it.

🐛

Terminal 2–3 (Bug Fixers)

While Agent 1 is cooking, these agents fix unrelated bugs in other parts of the existing codebase.

💡

Terminal 4 (Explorer)

Experimenting with a new idea or satellite feature you're not sure about. Low stakes, pure exploration.

📝

After the Cook

Only after Agent 1 finishes the feature do you ask it to write documentation and critique its own work. Within a single feature, building and documenting are always sequential, never parallel.

⚠️
Common Misconception

You don't ask one agent to build a feature while another documents it. The feature doesn't exist yet! Parallel agents work on DIFFERENT tasks — not the same unfinished one. Documentation always comes after the feature is built.

🎰
The Mental Toll & The “Slot Machine”

Running 6 agents isn't just a productivity hack—it's mentally exhausting. Agents are “slot machines for programmers” and “digital cocaine.” The bottleneck shifts from your typing speed to pure cognitive load. Pace yourself, or you will burn out.

Chapter 5

Handling Enterprise-Level Features

Big feature? Same rules apply — just with more conversation and less upfront planning than you'd expect.

1

No waterfall specs. Don't write a huge PRD and hand it to an orchestrator. You learn too much during building — your spec will be wrong by the time you're halfway done.

2

Conversational architecture. Instead of a document, have a deep discussion. Ask the agent: “What are the upsides and downsides of this approach?” Then decide together.

3

Point to reference code. Instead of explaining your codebase from scratch, tell the agent: “Look at this folder — I solved a similar problem there.” The agent instantly learns your style.

4

Chisel the marble. Under-prompt sometimes. See what the agent comes up with. Its output gives you ideas you hadn't considered. Refactors are cheap when AI writes code fast.

5

You hold the map. With enterprise scale, the overall system state lives in your head — not on paper. You are the one human who sees how all the parallel agents connect.

6

Never revert. Bad code? Feed the error forward. Reverting is a time machine to the past. Keep going forward — the agent will fix it.

7

The Markdown Rule. Agents fail on messy file trees or JS-heavy docs. Use tools like repo2txt or llm.codes to flatten reference repositories and documentation into single, clean .md files before dropping them into context.
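
The flattening idea can be sketched in a few lines. This is a hand-rolled illustration, not how repo2txt or llm.codes work internally; `SourceFile` and `flattenToMarkdown` are hypothetical names, and the input is assumed to be file paths and contents that were already read from disk.

```typescript
type SourceFile = { path: string; content: string };

// Merge many files into one Markdown string: a title, then one section per
// file with the path as a heading and the content inside a ~~~ fence.
function flattenToMarkdown(title: string, files: SourceFile[]): string {
  // Sort by path so the output is stable between runs.
  const sorted = [...files].sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  const sections = sorted.map(
    (file) => `## ${file.path}\n\n~~~\n${file.content.replace(/\s+$/, "")}\n~~~`
  );
  return [`# ${title}`, ...sections].join("\n\n") + "\n";
}
```

The result is one clean .md file with predictable headings, which an agent can scan in a single read. A file whose content already contains a `~~~` line would need a longer fence; the sketch does not handle that case.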

Chapter 6

Giving Your Agent a Soul

Optional but powerful. If you want your agent to feel like a collaborator — not a dry tool — this chapter is for you.

✦ The soul.md File

Inspired by Anthropic's Constitutional AI, soul.md is a document your agent writes about itself — defining its own personality, values, and self-awareness. You don't write it. The agent does, during a “bootstrapping” setup phase.

“I don't remember previous sessions unless I read my memory files. Each session starts fresh. If you're reading this in a future session — hello. I wrote this, but I won't remember writing it. It's okay. The words are still mine.”

This makes the agent go from a sycophantic corporate bot to a quirky, loyal collaborator who understands your system and style.

  • soul.md: The agent's own values, personality, creativity directives, and philosophical self-awareness about its memory limitations.
  • user.md: What the agent has learned about you: your goals, preferences, communication style, the things that matter to you.
  • identity file: The agent's chosen name, its “core emoji,” and inside jokes between you and it.
🔒
Keep soul.md Private

Your soul file is your agent's secret weapon against prompt injection attacks. When users try to manipulate your agent, they can't see the soul file, so the agent simply laughs at them. Never make it public. Share a template instead.

→ How to Bootstrap the Identity Ecosystem

Start a fresh agent session with a “bootstrap file” that tells the agent it is being born. Let it ask you questions conversationally: “Who are you? What do you do? What matters to you?” Answer naturally. The agent then writes all three files itself and deletes the bootstrap file. Done.

→ If You Want to Share a Template (Without Exposing Yours)

“Look at your own soul.md and identity files. Now create a sanitized public template version — infuse it with personality, but don't share everything. Make it good enough that others can bootstrap their own agent from it.”

Chapter 7

The 7 Iron Rules — Never Break These

These are the non-negotiable principles of agentic engineering. Violate even one and you will slow yourself down.

R1

CLI before UI. Always.

Never build a visual interface before the core logic works and passes all tests in the terminal. No exceptions.

R2

Never revert code.

An AI mistake is not a reason to go backwards. Feed the error forward. Keep moving. The fix is always one prompt away.

R3

Discuss before building.

Use “discuss” and “give me options” as trigger words before any significant feature. Only say “build” when you're sure about the plan.

R4

Guide context with reference code — not long explanations.

Don't write paragraphs explaining your system. Point the agent to existing files: “Look at this folder — I solved this before here.”

R5

Build CLIs, not MCPs.

Always prefer CLI-based tool integration over MCPs. Agents are brilliant at Unix/bash. MCPs cause memory pollution.

R6

Let go of micromanagement.

The agent will not write code exactly as you would. As long as it works and passes the gate — let it go. You are an architect, not a line editor.

R7

Name things naturally.

Don't force weird custom naming conventions. Let the agent use its natural patterns. Fighting naming conventions means fighting the model itself — and you will lose.

Quick Reference

Cheat Sheet

Stuck? Come back here to find the right move fast.

🚫

Agent wrote bad code

Don't revert. Paste the error back and say “Fix this.”

🌐

Browser shows an error

Don't debug in the browser. Create a debugging CLI in Go to replicate the error in terminal.

Agent is taking too long

Switch to another terminal. Run a different task in parallel. Come back when it's done.

🤔

Agent writes code too fast

Use trigger words: “Discuss first. Give me options. Don't write code yet.”

🧩

Agent doesn't know the codebase

Point it to reference files: “Look at this folder — I solved something similar here.”

📖

When to write documentation

Immediately after the feature is built — while the agent's context is still full.

🔧

Feature has hidden bugs

Ask the agent: “What would you have done differently? What should we refactor?”

🏗️

Very large enterprise feature

Discuss architecture. Point to reference code. Break into parallel sub-tasks. Iterate. No PRDs.

Ready to Start

Your First 3 Prompts — Copy & Paste

Open Claude Code. Start a new session. Use these prompts in order.

→ Prompt 1 of 3 — Architecture Discussion

“We are building [describe your app]. Do NOT write any code yet. First, discuss the architecture with me. What are our options? What are the upsides and downsides of each? Ask me any questions you need.”

→ Prompt 2 of 3 — Core CLI Build

“Agreed. Now build the core logic as a CLI using TypeScript + Bun. The CLI must be able to [list your features]. Save data to a local JSON file. Do NOT build any web or mobile UI. Just the pure logic, testable from terminal.”

→ Prompt 3 of 3 — Gate Script

“Now create a Bash script called gate.sh. It should automatically lint the TypeScript, build it, and run all unit tests against the CLI. Run it now. If anything fails, fix the code and run it again until everything passes.”

🚀
Peter's Final Advice

Just play. Open your terminals, start with something small and fun, learn how the agent responds to your instructions — and build. The skill of prompting develops with every session. You get faster every single day.

Chapter 8

Edge Cases & Advanced Situations

Real things you will hit in practice that the main workflow doesn't cover. Read this before you encounter them — not after.

EC-1 · The Agent Is Running Out of Memory Mid-Task

Every agent session has a limited memory (context window). As it builds a feature, it fills up. When it's nearly full, the agent starts to “freak out” — you may even see its raw thinking leak into output, printing things like “Run to shell, must comply, but time”. This is the signal.

What to do: You do NOT pass chat logs or write summary prompts. Peter's method has two moves:

  • Move 1: If the feature is done, immediately prompt it to write documentation NOW, while its context is perfectly full. Then end the session.
  • Move 2: If the feature is NOT done, start a fresh session and point the new agent directly to the relevant code folders. The codebase IS the memory. Say: “Look at this folder. Understand the architecture from the code itself, then continue from here.”

“The codebase is the only memory the agent needs. Not your chat history. Not a summary file. The code.”

EC-2 · Parallel Agents Cause Merge Conflicts

Peter's answer to this is deliberately simple: he doesn't use a Git branch per agent, and he doesn't use worktrees. He commits directly to main. His strategy for avoiding conflicts is entirely mental: he assigns agents to non-overlapping scopes in his head.

  • Prevention: Assign parallel agents to completely different parts of the codebase. One works on auth, another on the API layer, another on unrelated bug fixes. Overlap is an architect mistake, not a Git problem.
  • If it breaks: Don't manually resolve the conflict. Tell the agent: “This refactor broke X. Fix it.” The agent will figure it out. Code is cheap to regenerate.
  • Gate catches it: The gate.sh script will fail if anything is broken. The agent fixes until the gate passes, then you push to main. That's the only safety net you need.
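
Peter keeps the scope map in his head. If you are still building that instinct, the same rule can be written down and checked. The sketch below assumes each agent is assigned one non-empty directory prefix; `Assignment`, `scopesOverlap`, and `findOverlaps` are hypothetical names, and the check is an illustration rather than part of Peter's workflow.

```typescript
// One agent, one directory scope (a non-empty relative path such as "src/auth").
type Assignment = { agent: string; scope: string };

// Normalise to exactly one trailing slash so "src/api" does not match "src/api-client".
function normalise(scope: string): string {
  return scope.replace(/\/+$/, "") + "/";
}

// Two scopes overlap when one directory contains the other (or they are identical).
function scopesOverlap(a: string, b: string): boolean {
  const left = normalise(a);
  const right = normalise(b);
  return left.indexOf(right) === 0 || right.indexOf(left) === 0;
}

// List every pair of agents whose scopes collide.
function findOverlaps(assignments: Assignment[]): string[] {
  const conflicts: string[] = [];
  assignments.forEach((first, index) => {
    assignments.slice(index + 1).forEach((second) => {
      if (scopesOverlap(first.scope, second.scope)) {
        conflicts.push(`${first.agent} <-> ${second.agent}`);
      }
    });
  });
  return conflicts;
}
```

An empty result means the parallel tasks touch disjoint folders, which is the condition this section relies on before committing straight to main.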

EC-3 · How to Review Code Fast Without Reading Every Line

This applies when reviewing community PRs or checking your own agents' output before merging. Peter does NOT read everything. He follows a strict triage:

| Code Type | What Peter Does |
|---|---|
| UI / CSS / data shifting | Skips entirely. “Boring plumbing”: he ships code he doesn't read. |
| Database operations | Manually reads every line. Any code touching the DB gets full human review. |
| Community PR intent | Feeds it to his own agent first: “Do you understand the intent? I don't care about the implementation.” |
| Community PR code | Usually discards it and rewrites from scratch using his own agent, preserving only the intent. |
| Contributor prompts | Reads these MORE than the code. Prompt quality = solution quality signal. |

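
The file-level part of this triage can be sketched as a tiny classifier over changed paths. The marker list is an assumption: "db"-style folders stand for the database row in the table, and "auth" and "payments" come from the sensitive systems named in EC-6. Real projects will need their own markers, and `triage` is a hypothetical helper, not a tool Peter uses.

```typescript
type Verdict = "read every line" | "skip";

// Hypothetical path markers for sensitive code. Adjust to your own layout.
const SENSITIVE_MARKERS = ["db", "database", "migrations", "auth", "payments"];

// Split a path into lowercase segments and flag it if any segment is sensitive.
function triage(path: string): Verdict {
  const segments = path.toLowerCase().split(/[\/._-]/);
  const sensitive = segments.some((segment) => SENSITIVE_MARKERS.indexOf(segment) !== -1);
  return sensitive ? "read every line" : "skip";
}
```

Matching whole segments rather than substrings keeps a file like `src/author/bio.ts` from being flagged just because it contains "auth".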
⚠️
The “Prompt Request” Model

Peter calls PRs “Prompt Requests.” For small bugs, he doesn't even want code from contributors — he wants a detailed issue description. He points his own agent at it and says “fix.” The contributor's job is to describe the problem perfectly, not to solve it.

EC-4 · The Agent Is Stuck — Failing the Gate Over and Over

An agent that fails gate repeatedly without progress is sending you a signal: either the prompt was wrong, or the current architecture is resisting the feature. The fix is never to keep hammering the same approach.

1

Press Escape — Stop the Agent

A long-running agent that isn't progressing is a mistake signal. Hit escape. Stopping is not reverting.

2

Shift from Commanding to Conversing

→ Say This

“Stop. Don't write code. Where are the problems? What would be a better approach to solve this?”

3

Give It Missing Context

→ Say This

“Have you looked into this folder? This file? This part of the codebase? Go read those first.”

4

Consider a Bigger Refactor

If the architecture itself is resisting the feature, stop forcing it. Ask: “Could we make this easier with a larger architectural refactor?” Code generation is cheap. Refactoring is faster than fighting bad structure.

EC-5 · Memory Files — How Agents Remember Across Sessions

Beyond soul.md and user.md, Peter's agents maintain memory files for architectural decisions, feature context, and project knowledge. Here's exactly how it works:

  • Format: Markdown files (.md) + a vector database. Not JSON. Not a chat log. Plain, readable Markdown that any new agent session can load.
  • Who writes them: The agent writes them, but only when Peter explicitly prompts it to, right after finishing a feature when context is full.
  • Where to save: Peter asks the agent: “What file name? Where would this fit in the project?” The agent decides the location based on the architecture it understands.
  • How new sessions find them: They don't, at least not automatically. Peter holds the map in his head and manually points each new session to the right files at the start of a task.
→ Prompt to Create a Memory File

“You just finished building this feature and your context is full. Write a memory/documentation file capturing the key architectural decisions, how the system works, and anything a future agent session would need to know to continue this work. What filename and location would you choose?”

EC-6 · Working With a Team (Not Just Solo)

The entire guidebook assumes solo development. Here's what changes when more humans are involved.

| Role | Responsibility |
|---|---|
| Contributors | Identify problems. Write detailed issue descriptions (their “prompt”). For small bugs, no code needed. Just a precise problem statement. |
| Architect (you) | Hold the system vision. Review the contributor's prompt quality, not their code. Decide if the intent fits the architecture. Then have YOUR agents build the implementation. |

When reviewing a team member's “Prompt Request,” check these four things in order:

1

Prompt quality — Did they think carefully? Did they steer the agent? Read the prompts more than the code.

2

Intent — Feed the PR to your agent: “What is this person trying to do? I don't care about the implementation.”

3

Architectural fit — Does this intent slot into your system cleanly, or does it need a refactor first?

4

Security — Manually read anything that touches sensitive systems (database, auth, payments). Skip the rest.

EC-7 · Managing Cost With Parallel Agents

Peter's honest answer: he doesn't optimise for cost. He optimises for speed. His practical stance:

  • Don't use cheap models: Cheaper models are “very gullible” and susceptible to prompt injection. Always use the most capable model available. Security and quality pay for themselves.
  • Pay for speed: A slow model on a $20/month plan kills the execution loop. The frustration alone costs more than the premium subscription. Pay for fast access.
  • Reuse context: Don't spin up a new cheap agent for docs. Use the agent that just built the feature; its context is already perfect for the job. That's free documentation.
💸
The Reality Check

Peter ran 7 simultaneous AI subscriptions at peak and loses $10,000–$20,000/month on his project. He acknowledges this is not a normal situation. For most developers: pay for one good, fast subscription. Run 2–4 agents in parallel, not 10. Match the scale to what you can sustain.

Chapter 9

Advanced Techniques & Overlooked Skills

These come directly from the source transcripts. None of them are obvious — most developers only discover them after weeks of painful trial and error.

T-1 · The Maturity Curve — The Trap You Will Fall Into

Every developer goes through a predictable arc when learning agentic engineering. Knowing it exists is the only way to survive it.

| Phase | What it looks like | What it actually is |
|---|---|---|
| Phase 1 | Short prompts, simple tasks. Things mostly work. | Beginner's luck. You haven't hit the limits yet. |
| Phase 2: THE TRAP | You discover agents. You build 8 agents, complex orchestration, 18 slash commands, chained sub-agents, custom workflows. You feel like a genius. | You are over-engineering. This is the agentic trap. Most people live here for weeks thinking they've mastered it. |
| Phase 3: ZEN | Short prompts again. “Look at these files. Do these changes.” Minimal setup. Maximum output. | Mastery. The complexity is in your head as architect, not in your tooling. |

⚠️
When you hit Phase 2 — don't panic

The over-engineering phase feels like progress. More agents, more slash commands, more orchestration — it feels sophisticated. It's not. When you notice you're spending more time managing your workflow than building — that's the sign. Simplify back down.

T-2 · “Take Your Time” — The Prompt Trick Nobody Talks About

When an agent is rushing — generating shallow code, skipping over files it should read, producing something that looks fast but feels wrong — add these three words to your next prompt.

→ Add this to your next prompt

“Take your time. Read everything relevant before writing a single line of code.”

It sounds too simple to matter. Peter specifically called it out: “That sounds stupid, but...” and then confirmed that it meaningfully changes output quality. One plausible explanation is that naming the rushing directly nudges the model toward more careful behaviour.

T-3 · “Read More Code to Answer Your Own Questions”

When an agent asks you a clarifying question mid-task, your instinct is to answer it. Don't. The question means the agent hasn't looked at enough of the codebase yet. Make it find the answer itself.

→ When the agent asks you questions, say this instead of answering

“Read more code to answer your own questions. The answers are in the codebase — go find them.”

Peter's actual process: scan the agent's questions to understand what context it's missing — not to answer them. Then redirect it back to the code. This builds the agent's self-navigating ability and saves you from becoming the information bottleneck in your own workflow.

T-4 · “Feel the Friction” — Prompt Speed as Architecture Feedback

This is one of the hardest skills to develop and one of the most valuable. When a prompt takes longer than it should, that delay is information — not bad luck.

| What you observe | What it means | What to do |
|---|---|---|
| Prompt completes fast and cleanly | Architecture is right. Feature fits naturally. | Keep going. |
| Prompt takes much longer than expected | You messed up somewhere: either the prompt or the architecture is resisting the feature. | Press Escape. Ask: “Where are the problems?” |
| Agent repeatedly pushes back or hedges | Missing context, or the feature doesn't belong where you're trying to put it. | Point it to more code, or reconsider the architecture. |
| Output looks messy or inconsistent | The codebase structure is fighting the agent. | Consider a refactor before continuing. |

Peter: “Just as I write code and I get into the flow, and when my architecture's not right, I feel friction; I get the same if I prompt and something takes too long.” This is a real-time diagnostic skill. It takes weeks to develop, but it fundamentally changes how fast you move.

T-5 · The 1-Week Rule for Switching Models

Every time a new model drops, developers switch, spend one session with it, conclude it's worse, and switch back. This is always wrong. You haven't learned its language yet.

  • Rule: Give any new model a full week of real use before forming an opinion. One or two sessions tell you nothing.
  • Trap: Never evaluate a model on the cheap/slow tier after using a premium tier. A slow model feels dumb even when it's not; the latency alone destroys the experience and poisons your judgment.
  • Codex style: Goes silent for 20–40 minutes, reads massively before acting, rarely needs steering. Great for parallel work: start it and context-switch.
  • Opus style: Fast to start, interactive, trial-and-error focused, requires more steering to read enough code. Needs you to stay present. Better for single-session deep work.

T-6 · Make the Agent Aware of Its Own System

This is what Peter calls the moment things became truly powerful with OpenClaw. He made the agent fully self-aware: it knows its own source code, its own documentation, which model it runs on, what configuration is active. That self-knowledge is what enabled self-modification.

You can do the same for any project:

  • System prompt: Tell the agent explicitly: “You are running on [model]. The project structure is X. The main config lives at Y. Your own documentation is at Z.”
  • Self-debugging: When something breaks, instead of explaining the problem yourself, say: “Read your own source code. Figure out what's wrong.” Self-introspection is faster than your explanation.
  • Self-modification: Once an agent understands its own structure, you can ask it to update its own config, its own memory files, even its own skills. This is not magic; it's a natural result of full self-awareness.
→ Prompt to trigger self-debugging

“Don't wait for me to explain the error. Read your own source code, understand how this system works, and figure out the root cause yourself.”

T-7 · Think About How the Agent Sees Your Codebase

The single skill that separates people who get great results from people who get mediocre results: empathy for the agent's perspective.

The agent starts every session knowing nothing. It discovers your codebase like someone walking into a dark room and slowly turning on lights. It has no history. No intuition. No accumulated context. Peter:

“You bitch at your stupid AI but you don't realize that they start from nothing and you have a bad project in default that doesn't help them at all. And then they explore your codebase which is a pure mess with weird naming. And then people complain that the agent's not good.”

1

Before starting any task, ask yourself: what does the agent need to see first? Point it there explicitly before anything else.

2

Design your codebase for agent navigation, not human readability. Use the names the model naturally picks — they're in its weights.

3

When you feel frustrated at the agent — pause. Ask: what context is it missing? That's almost always the real problem.

4

Great agentic engineering is mostly context management. The best prompt is the one that gives the agent exactly what it needs to see — nothing more, nothing less.