Agentic Software Delivery: How I Ship Features Without Writing Code 🤖
Over the past several months, I've been developing a method for delivering software that fundamentally changes the role of the engineer. Instead of writing code line by line, I describe what needs to be built, and an orchestrated system of AI agents handles the how — from generating requirements documents, to creating implementation plans with reviewable milestones, to executing tasks in parallel with built-in quality gates.
I call this approach Agentic Software Delivery, and in this post I'm going to walk you through every step of how it works. This isn't theory. This is what I actually do, every day, to ship production software.
The Big Picture
At a high level, here's the lifecycle:

1. Brainstorm a messy idea into a clear product vision
2. Formalize the vision into a PRD
3. Transform the PRD into a peer-reviewed implementation plan of milestones, phases, and tasks
4. Execute tasks in parallel, with built-in quality gates
5. Review, course-correct, and fold learnings back into the plan
That's the simple version. Each of those steps contains real depth — product thinking, architectural reasoning, quality gates, and hard-won lessons about what actually works when you hand execution to AI agents. Before I show you the tooling, let me walk you through the philosophy behind each step.
From Messy Ideas to Product Vision
The best features don't start with a spec. They start with a conversation.
Sometimes I'm talking out loud — pacing around my office, working through an idea in real time with Claude. Sometimes it's a voice memo from a brainstorm with a colleague where we riffed on what could be possible. Sometimes it's me typing furiously into a Claude prompt at 11pm, brain-dumping fragments of thought as fast as they come — half-formed user stories, architectural hunches, competitive observations, things that annoy me about existing solutions.
None of this is structured. None of it is "ready." And that's the point.
What I've learned is that the messy, fragmented beginning is where the real insight lives. The trick is having a thinking partner who can take that raw material and help you find the signal. Claude doesn't just accept my brain dump and start coding. Instead, it pushes back. It asks clarifying questions. It finds the contradictions in my thinking. It surfaces assumptions I didn't realize I was making.
"You mentioned this is for coaches, but you also described a parent-facing dashboard. Are these the same user or two different personas with different needs?"
"You said performance is critical, but you're describing real-time updates across hundreds of concurrent users. Have you thought about whether this is a WebSocket architecture or if polling with smart caching would be simpler and sufficient?"
This back-and-forth — sometimes 10 or 15 rounds — is where a vague idea crystallizes into a real product vision. Not a feature list. A vision: the problem we're solving, the opportunity in front of us, and the value we're delivering to actual humans.

Formalizing the PRD
Once the vision is clear, we formalize it into a Product Requirements Document. But even here, the PRD isn't just a list of features to build. It's a document that captures the why alongside the what.
The product principles matter more than most people realize. When I'm building something, I need the downstream agents — the ones writing code, designing interfaces, making architectural choices — to understand what kind of product this is. Is it a power-user tool where speed and information density matter? Is it a consumer app where visual delight and simplicity are non-negotiable? These principles cascade through every decision that follows, from the tech stack to the component library to the error handling patterns.
Before we move on from the PRD, we make sure we haven't lost the thread. The problem we're solving, the opportunity we're seizing, the value we're delivering to customers — all of that needs to be front and center, not buried under feature lists.
Turning Vision Into an Executable Plan
Now comes the part where most AI-assisted development goes wrong.
Most people take a vague idea, paste it into a prompt, and say "build this." What you get back is code that technically runs but doesn't cohere — no architecture, no phasing, no understanding of what matters first and what can wait. It's like handing someone a pile of lumber and saying "make a house" without blueprints.

The implementation plan is the blueprint.
We take the PRD — with its vision, requirements, and principles — and transform it into a plan that could be picked up by a team of developers at any moment. To get there, I ask Claude to look at the problem from multiple perspectives:
- A technical architect who understands how to design the systems, APIs, and data models needed to realize the product
- A product designer who can translate our product principles into concrete UX decisions that deliver real delight to customers
- A tech lead who ensures the plan is thorough and clear enough for her team to execute successfully — adding a sprinkle of low-level reality to a plan that's currently living at a high level
The result is an implementation plan organized into milestones, each containing phases, each containing tasks. And every task reads like a well-composed Jira story: not just the what, but — critically — the why, and a set of acceptance criteria that can prove the task has been completed correctly and successfully.
Delivering Incremental Value
This is where careful planning pays off. Each task in the plan should deliver one of two kinds of value:
- Intrinsic value — foundational work that benefits the architecture or developers. Standing up infrastructure, creating interfaces, implementing a framework, establishing patterns.
- Extrinsic value — features a user can actually experience. Screens they can see, workflows they can complete, capabilities they can use.
We plan phases so that each one delivers something meaningful. Not "Phase 1: set up everything, Phase 2: build everything, Phase 3: test everything." Instead, each phase produces a working increment that someone could look at and say "yes, this is moving in the right direction."
Staying in the Loop
This incremental delivery model unlocks something critical: I can interject at any moment.
Between any task, any phase, any milestone — I have an opportunity to course correct. "That design isn't quite right, let me spend some time giving you feedback so you can adjust." "Looks like I described this incorrectly, let's fix it." "Check the test coverage for me." "Explain the architecture so I can make sure it's correct."

And those course corrections aren't lost. Depending on the complexity and blast radius, they can be folded back into the implementation plan as formal updates, or handled on the fly as ad-hoc adjustments. A typo in a label? Fix it inline and move on. A fundamental misunderstanding of the data model? That goes back into the plan so every downstream task reflects the correction.
This is one of the most important checkpoints in the entire process. I'm still in control when I need to be: watching the instruments, monitoring outputs, validating decisions. Claude provides the autopilot, but I'm the pilot. I decide when to take the stick.
Dependencies are mapped explicitly, and we optimize aggressively for parallelization. If Task A and Task B don't depend on each other, they should be executable at the same time. This is where agentic delivery starts to pull away from traditional development — we can actually act on that parallelism.
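Turning an explicit dependency map into parallel batches is a standard scheduling problem. Here's a minimal sketch, assuming a simple task-to-dependencies dictionary (the task IDs echo the example task in this post; the data format is my illustration, not Synthex's internal representation):

```python
def parallel_waves(deps: dict[str, set[str]]) -> list[set[str]]:
    """Group tasks into waves: every task in a wave has all of its
    dependencies satisfied by earlier waves, so each wave can run
    fully in parallel."""
    remaining = {task: set(d) for task, d in deps.items()}
    done: set[str] = set()
    waves: list[set[str]] = []
    while remaining:
        # A task is ready once all of its dependencies are done.
        ready = {t for t, d in remaining.items() if d <= done}
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done |= ready
        for t in ready:
            del remaining[t]
    return waves

deps = {
    "2.1.3": set(),               # WebSocket infrastructure
    "2.2.1": set(),               # User preferences
    "2.3.1": {"2.1.3", "2.2.1"},  # Notifications need both
    "2.3.2": {"2.3.1"},           # Email digest builds on notifications
    "2.3.3": {"2.3.1"},           # Mobile push builds on notifications
}
waves = parallel_waves(deps)  # three waves; tasks within a wave run concurrently
```

If Task A and Task B land in the same wave, an agent can be dispatched for each at the same time.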
Why a Single Markdown Document
I keep the implementation plan as a single markdown document inside the repository where the product is being delivered. This is a deliberate choice.
A single document represents a product roadmap that can be analyzed in a single operation. An agent can read the entire plan, understand where we are, what's done, what's next, and what's blocked — all in one context load. Compare that with Jira or Linear, where loading the equivalent roadmap requires dozens of API calls to fetch boards, sprints, stories, subtasks, and comments before the agent can even begin to reason about what to do next.
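To make "analyzed in a single operation" concrete, here's a rough sketch of pulling every task's status out of such a plan in one pass; the `Task N:` / `- Status:` conventions mirror the task example in this post, and the parser is my illustration, not Synthex's actual reader:

```python
import re

# A miniature plan in the same shape as the task example in this post.
PLAN = """\
Task 2.3.1: Real-Time Notification System
- Status: IN PROGRESS
Task 2.3.2: Email Digest
- Status: NOT STARTED
Task 2.3.3: Mobile Push
- Status: DONE
"""

def task_statuses(plan: str) -> dict[str, str]:
    """One pass over the whole document: no API calls, no pagination."""
    statuses: dict[str, str] = {}
    current = None
    for line in plan.splitlines():
        if m := re.match(r"Task ([\d.]+):", line):
            current = m.group(1)
        elif current and (m := re.match(r"- Status: (.+)", line)):
            statuses[current] = m.group(1)
    return statuses
```

A single file read gives an agent (or a script) the full roadmap state, which is exactly what makes the one-document approach context-efficient.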
That said, Jira and Linear exist for good reason — they're where the human overlords live. Product managers need burndown charts. Leadership wants dashboards. Teammates need to know what's in flight.
So we pair the implementation plan with Jira or Linear stories by referencing the external ticket directly in each task within the plan. This gives us the best of both worlds: an agent-optimized document for execution, and a business-friendly view for everyone else. And yes, we can use agents to keep the two in sync — but that's a topic for another post.
Here's what a single task looks like inside the plan:
```markdown
Task 2.3.1: Real-Time Notification System

- Story: ENG-247
- Status: IN PROGRESS
- Priority: P0
- Size: L (5-7 days)
- Dependencies: Task 2.1.3 (WebSocket infrastructure), Task 2.2.1 (User preferences)
- Parallel: Can run alongside Task 2.3.2 (Email digest) and Task 2.3.3 (Mobile push)
- Requirement: PRD §4.2 — Users receive immediate feedback when teammates interact with their content

Description: Build the server-side notification pipeline and client-side
notification center that delivers real-time updates when collaborators comment,
share, or modify shared documents. This is the foundation that email digest
(2.3.2) and mobile push (2.3.3) build on top of.

Implementation Details:
- Notification Data Model: Create `Notification` table with polymorphic
  `event_type` (comment, share, edit, mention), recipient, sender, read/unread
  status, and timestamps. Index on recipient + unread for fast queries.
- Event Pipeline: Listen for domain events (`CommentCreated`, `DocumentShared`,
  `UserMentioned`) and fan out `Notification` records to all collaborators,
  excluding the actor who triggered the event.
- WebSocket Delivery: Push new notifications to connected clients via existing
  WebSocket infrastructure (Task 2.1.3). Include catch-up mechanism for
  notifications missed while disconnected.
- Notification Center UI: Slide-out panel (desktop) or bottom sheet (mobile)
  with grouped notifications, unread badge, mark-as-read, and virtual scrolling
  for users with hundreds of notifications.

Acceptance Criteria:
- `CommentCreated` produces notification for document owner and all
  collaborators (excluding the commenter)
- `UserMentioned` produces notification for the mentioned user, even if they
  aren't a collaborator yet
- Notifications delivered via WebSocket within 2s of the triggering event
- Catch-up delivers missed notifications on reconnect (max 100, paginated)
- Notification center displays unread count badge (caps display at "99+")
- "Mark all read" completes in <500ms for 1,000+ notifications
- Duplicate notifications prevented (idempotency key on event_type + target + recipient)
- Fan-out for a 50-person document completes in <5s
- Mobile bottom sheet has 44px+ touch targets

Testing:
- Unit tests for event handlers and fan-out logic (90% target)
- Integration tests for WebSocket delivery and reconnection
- Load test: 50 concurrent recipients, verify <2s delivery
- E2E: Create comment → notification appears in collaborator's panel within 2s

Observability:
- Track notification pipeline latency (event → delivery)
- Monitor WebSocket delivery success rate (target >99.5%)
- Alert on fan-out failures or delivery latency >5s
```
There's a lot to unpack there, but the key things to notice are:
- Every task traces back to a requirement in the PRD and a ticket in your project tracker. The agent knows why it's building this, not just what.
- Implementation details give the agent enough architectural direction to make sound decisions without over-constraining it.
- Acceptance criteria are concrete, testable, and provable — including performance targets and edge cases. There's no ambiguity about when a task is "done."
- Testing and observability are first-class concerns, not afterthoughts. The agent knows it needs to write tests and instrument the code as part of the task itself.
- Dependencies and parallelization are explicit, so the agent (and the orchestration layer) can reason about what's safe to run concurrently.
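One of those acceptance criteria, duplicate prevention via an idempotency key, is concrete enough to sketch. The key construction mirrors the criterion's `event_type + target + recipient` triple; the in-memory set is a stand-in for whatever store the real task would use:

```python
import hashlib

def idempotency_key(event_type: str, target_id: str, recipient_id: str) -> str:
    """Identical (event_type, target, recipient) triples produce the
    same key, so re-delivered events can't create duplicate notifications."""
    raw = f"{event_type}:{target_id}:{recipient_id}"
    return hashlib.sha256(raw.encode()).hexdigest()

seen: set[str] = set()  # stand-in for a uniqueness constraint in the database

def should_notify(event_type: str, target_id: str, recipient_id: str) -> bool:
    key = idempotency_key(event_type, target_id, recipient_id)
    if key in seen:
        return False  # duplicate: suppress
    seen.add(key)
    return True
```

Because the criterion names the exact key composition, an agent can prove it satisfied the requirement with a test rather than a claim.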
Parallel Execution: Where It Gets Fun
Now we have a plan. Time to build.

I have to give Geoffrey Huntley a lot of credit here. Reading about his Ralph Wiggum-style approach to agentic coding — spinning up independent agents that each chew through a piece of work — was a major inspiration. You'll see a ton of similarities in what follows, and his writing on the subject is well worth your time.
The execution phase works like this: the orchestrator reads the plan, picks the highest-priority unblocked tasks, spins up an independent agent for each one in its own isolated workspace, and then routes every result through automated review gates before anything merges.

The key insight is that final review step. We're not just automating the creation of code. We're automating the safety and quality gates of software development that engineers have come to trust over the last couple of decades. Code reviews. Security reviews. Architectural consistency checks. Test coverage analysis. These aren't optional nice-to-haves — they're the guardrails that keep agentic delivery from becoming "AI-generated code that nobody checked."
And we're constantly improving those gates. Every time we find a pattern of mistakes, we add a check. Every time a security issue slips through, we tighten the review. The system gets smarter over time because the prompts and review criteria evolve with what we learn.
From Philosophy to Practice
That's the method — discovery, planning, execution — with human oversight woven throughout. Now you have a feel for the depth behind each phase.
The question you're probably asking at this point: "This sounds great in theory, but how do you actually do all of this consistently, project after project, without burning out on prompt engineering?"
That's exactly the problem I ran into. For months, this workflow lived as a collection of carefully crafted prompts. I was copy-pasting the same patterns, adjusting the same review criteria, re-establishing the same agent roles at the start of every session. The philosophy was solid, but the operational overhead was real.
So I turned it into a tool.
Enter Synthex
Synthex is a Claude Code plugin that encodes this entire process — every phase, every agent role, every review gate — into reusable commands and specialized AI agents. It's open source, part of the LumenAI marketplace, and it evolved alongside the method itself.
Synthex didn't appear fully formed. It grew as I perfected each prompt, and it accelerated as Anthropic shipped new capabilities — sub-agents for parallel work, skills for reusable commands, plugins for distribution, and agent teams for orchestrating specialist roles. Each feature unlocked a new level of sophistication in the workflow.
The Agent Organization
At the heart of Synthex is a structured organization of 15 specialized AI agents — a virtual engineering team where each agent has a clearly defined role and domain expertise.
The Orchestration Layer coordinates execution. The Tech Lead, Lead Frontend Engineer, and Product Manager break down work, delegate to specialists, and roll up results.
The Specialist Layer provides deep domain expertise. The Architect reviews system design. The Security Reviewer checks for vulnerabilities. The Quality Engineer ensures test coverage.
The Research & Analysis Layer drives continuous improvement. UX Researchers inform product decisions. The Metrics Analyst tracks delivery health with DORA metrics. The Retrospective Facilitator helps the team learn.
How Synthex Handles Each Phase
Let's walk through how each phase maps to actual Synthex commands.
Phase 1: The PRD Process
The Product Manager agent conducts the structured interview. It doesn't accept a brief description and start generating requirements autonomously — it asks targeted questions across multiple dimensions, typically 3-5 per round.
The output lands at `docs/reqs/main.md` — version controlled, reviewable, ready for planning.
Phase 2: Peer-Reviewed Planning
The `write-implementation-plan` command transforms the PRD into a peer-reviewed plan.
The peer reviewers include an Architect, a Designer, and a Tech Lead. A key design decision: each review cycle spawns fresh agent instances to prevent context exhaustion.
Phase 3: Parallel Execution
The `next-priority` command drives execution, reading the plan and dispatching the highest-priority unblocked tasks in parallel.
Each Tech Lead orchestrates a team of specialists.
Quality Gates: Not Optional
Code review in Synthex isn't a single pass by a single agent. The `review-code` command runs a multi-perspective review in parallel.
The verdict follows strict rules: if any reviewer reports FAIL, the overall verdict is FAIL. Security review is a mandatory quality gate — the Tech Lead cannot bypass it.
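That aggregation rule is simple enough to pin down in a few lines. This is a sketch of the policy as described, not Synthex's actual implementation, and the reviewer names are illustrative:

```python
def overall_verdict(reviews: dict[str, str]) -> str:
    """Strict gate: any FAIL fails the whole review, and because the
    security review is mandatory, its absence is itself a failure."""
    if "security" not in reviews:
        return "FAIL"  # the gate cannot be bypassed by skipping the reviewer
    return "FAIL" if "FAIL" in reviews.values() else "PASS"
```

The useful property is that the gate is monotone: adding more reviewers can only make it harder to pass, never easier.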
The Full Command Reference
Synthex provides 11 commands spanning the entire delivery lifecycle:
| Command | Phase | What It Does |
|---|---|---|
| `init` | Setup | Scaffolds project structure and configuration |
| `write-implementation-plan` | Plan | Creates peer-reviewed implementation plans from PRDs |
| `next-priority` | Build | Executes highest-priority tasks in parallel |
| `review-code` | Build | Multi-perspective code review with fix loop |
| `write-adr` | Build | Documents architectural decisions |
| `write-rfc` | Build | Creates Requests for Comments for proposals |
| `test-coverage-analysis` | Build | Analyzes and improves test coverage |
| `design-system-audit` | Build | Audits UI compliance with design system |
| `performance-audit` | Ship | Full-stack performance analysis |
| `reliability-review` | Operate | SLO compliance and operational readiness |
| `retrospective` | Learn | Structured team retrospectives |
These map to the five phases of the delivery lifecycle: Plan, Build, Ship, Operate, and Learn.
Why This Works
I've been using this approach for months, and there are a few things that make it fundamentally different from just "asking ChatGPT to write code":
Structure creates quality. By separating requirements, planning, and execution into distinct phases with review gates between them, you avoid the biggest pitfall of AI-assisted coding: generating code without understanding what you're building or why.
Specialization beats generalization. A single AI agent asked to "build a login page" will produce something that works. Fifteen specialized agents — one for architecture, one for security, one for testing, one for frontend, one for code review — will produce something that's production-ready.
Parallel execution changes the game. With git worktrees isolating each task, you can implement three features simultaneously without conflict. What used to take a sprint takes an afternoon.
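The worktree mechanics here are plain git. A minimal sketch, run in a throwaway sandbox repository (branch and directory names are illustrative):

```shell
# Throwaway sandbox so the demo doesn't touch a real repo.
sandbox=$(mktemp -d) && cd "$sandbox"
git init -q repo && cd repo
git -c user.name=demo -c user.email=demo@example.com \
    commit --allow-empty -qm "initial commit"

# One worktree (and branch) per parallel task: agents never share a
# working tree, so simultaneous edits can't conflict on disk.
git worktree add ../task-2.3.1 -b task/2.3.1-notifications
git worktree add ../task-2.3.2 -b task/2.3.2-email-digest

git worktree list   # main checkout plus one checkout per in-flight task
```

Each agent works in its own directory on its own branch; integration happens later through ordinary merges and the review gates.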
Review loops catch what you miss. The peer review on implementation plans catches ambiguity before code is written. The code review catches bugs before they ship. The security review catches vulnerabilities before they become incidents.
Everything is traceable. Every task maps to a requirement. Every code change maps to a task. Every architectural decision is recorded.
What This Changes About Your Role
This approach doesn't eliminate the need for engineering skill. It transforms it. Instead of spending your time writing `for` loops and debugging CSS, you spend it on:
- Product thinking — What should we build and why?
- Architecture decisions — How should the system be structured?
- Quality judgment — Is this output good enough for production?
- Strategic prioritization — What matters most right now?
These are the highest-leverage activities an engineer can do. Agentic Software Delivery lets you spend all your time on them.
Try It Yourself
Everything I've described in this post is available today, for free, as an open-source Claude Code plugin. If any of this resonated, the best way to understand it is to try it on a real project.
Synthex
Part of the LumenAI plugin marketplace
15 specialized AI agents. 11 commands. Full delivery lifecycle coverage — from brainstorming your first idea to shipping validated, reviewed, production-ready code.
```
/plugin marketplace add bluminal/lumenai
/synthex:init
/synthex:next-priority
```

Start small — pick your next feature, write the PRD with Claude, generate the plan, and let `next-priority` handle the rest. I think you'll be surprised at how quickly you get from idea to working code.
In future posts, I'll go deeper into each phase — how to write PRDs that produce great plans, how to tune the review loop for your team's needs, advanced patterns for managing complex multi-milestone projects, and how to keep your implementation plan in sync with Jira and Linear.