last month, i watched a team spend three weeks building an agent that could have been built in three days. why? they kept throwing tools at the problem instead of stepping back to understand the underlying design pattern they actually needed.
(i've made this exact mistake. multiple times.)
this post breaks down the core agentic design patterns that show up in every successful agent system i've worked on: reflection, planning, tool use, multi-agent coordination, and memory-augmented systems. these aren't academic concepts—they're practical frameworks that determine whether your agent works in production or fails spectacularly.
why design patterns matter (more than you think)
here's the thing about building agents: you can get a demo working in an afternoon. but getting that same agent to handle 1,000 real user requests without falling apart? that requires understanding design patterns.
as the research from Hypermode on agentic design patterns points out, these patterns provide repeatable solutions to common challenges in agent development. they define clear interfaces between agents, tools, models, and data sources.
more importantly, they give you:
- scalability - reusable structures that don't become unmaintainable as complexity grows
- reliability - standardized interactions lead to predictable behavior
- modularity - clear interfaces let you expand functionality without technical debt
- team communication - shared vocabulary for discussing complex agent behaviors
pattern 1: reflection (the agent that critiques itself)
reflection is the pattern where your agent evaluates and improves its own output before finalizing an answer.
sounds simple. changes everything.
how reflection actually works
the basic flow:
generate initial output → critique that output → improve based on critique → repeat until satisfied
according to DataKnobs' guide on agent design patterns, this self-refinement mechanism significantly enhances output quality without requiring human intervention.
when i use reflection
when i built a code generation agent last quarter, the initial outputs were... rough. syntax errors, missing imports, incomplete logic. classic first-draft problems.
added a reflection step:
- agent generates code
- agent runs linter and tests
- agent reviews errors and warnings
- agent fixes issues
- repeat until tests pass
results: code quality improved 3x. more importantly, the agent caught its own mistakes before users saw them.
reflection patterns in practice
i've seen reflection work well in:
- content generation - draft, review tone and clarity, revise
- code writing - generate, lint, test, fix
- data analysis - produce results, validate calculations, correct errors
- research synthesis - summarize findings, check for contradictions, refine conclusions
the reflection pitfall (i learned the hard way)
reflection isn't free. each iteration costs tokens and time.
early version of my code agent got stuck in reflection loops—making minor tweaks 10-15 times before deciding it was "good enough." users waited 2+ minutes for simple tasks.
solution: set clear stopping criteria. either the output meets specific quality thresholds (tests pass, no linter errors) or you hit a maximum iteration count. don't let agents philosophize endlessly about perfection.
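here's what that loop looks like in code. a minimal sketch, assuming hypothetical `generate`, `critique`, and `revise` callables wired to your model:

```python
from typing import Callable

def reflect_and_refine(
    task: str,
    generate: Callable[[str], str],           # produces the initial draft
    critique: Callable[[str], list[str]],     # returns a list of issues; empty means good
    revise: Callable[[str, list[str]], str],  # fixes the listed issues
    max_iterations: int = 3,                  # hard cap so the agent can't loop forever
) -> str:
    output = generate(task)
    for _ in range(max_iterations):
        issues = critique(output)             # e.g. run linter and tests
        if not issues:                        # quality threshold met: stop early
            return output
        output = revise(output, issues)
    return output                             # best effort after hitting the cap
```

both stopping criteria are in there: the empty-issues check (tests pass, no linter errors) and the iteration cap.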
pattern 2: planning (think before you act)
planning patterns make agents create a strategy before executing actions.
the alternative? agents that jump straight to execution and then backtrack when they realize they're solving the wrong problem.
two approaches to planning
1. plan-and-execute
as described in agent design pattern research, this approach has the agent create a full plan before execution, often using sub-agents or task chains.
analyze task → break into steps → create detailed plan → execute plan sequentially
when to use: complex, multi-step tasks where order matters
example from last month: built an agent to migrate a codebase from JavaScript to TypeScript. without planning, it started converting files randomly, breaking imports everywhere.
with plan-and-execute:
- agent analyzed dependency graph
- created conversion order (leaf dependencies first)
- planned type definitions for shared utilities
- executed conversions in correct sequence
result: zero broken imports. migration completed in 2 hours instead of 2 days of manual debugging.
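the skeleton is simple. a sketch, where `make_plan` and `execute_step` are hypothetical stand-ins for your model calls:

```python
from typing import Callable

def plan_and_execute(
    task: str,
    make_plan: Callable[[str], list[str]],          # one call: task -> ordered list of steps
    execute_step: Callable[[str, list[str]], str],  # runs one step, sees all prior results
) -> list[str]:
    plan = make_plan(task)    # e.g. ["analyze dependency graph", "convert leaf modules", ...]
    results: list[str] = []
    for step in plan:         # execute strictly in plan order
        results.append(execute_step(step, results))
    return results
```

the key property: the full plan exists before any step runs, so ordering mistakes surface at planning time, not halfway through execution.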
2. ReAct (reason + act)
ReAct combines reasoning and action in a step-by-step loop. instead of planning everything upfront, the agent reasons about the current state, takes one action, observes the result, then reasons again.
thought: what do i need to know? → action: search database → observation: found X → thought: now i should... → repeat
when to use: tasks where you need to adapt based on intermediate results
built a customer support agent using ReAct:
- thought: i need to look up this customer's order history
- action: query orders database
- observation: customer has 3 orders, most recent was delivered yesterday
- thought: they're asking about a missing item, should check the specific order details
- action: get order line items
- observation: order contains the item they're asking about
- thought: delivery was successful, this might be a packaging issue...
ReAct shines when you can't predict what information you'll need until you see intermediate results.
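the loop itself is small. a sketch, assuming an `llm` callable that returns either a tool invocation or a final answer (the dict shape is my assumption, not any specific framework's API):

```python
from typing import Callable

def react_loop(
    question: str,
    llm: Callable[[str], dict],              # returns {"thought", "action", "input"} or {"answer"}
    tools: dict[str, Callable[[str], str]],  # tool name -> callable
    max_steps: int = 10,                     # same lesson as reflection: always cap the loop
) -> str:
    transcript = f"question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)               # reason about the current state
        if "answer" in step:                 # model decided it has enough information
            return step["answer"]
        observation = tools[step["action"]](step["input"])  # act
        transcript += (                      # observe, then reason again next iteration
            f"thought: {step['thought']}\n"
            f"action: {step['action']}({step['input']})\n"
            f"observation: {observation}\n"
        )
    return "max steps reached without an answer"
```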
planning vs. ReAct: which pattern when?
after building both types extensively:
- use plan-and-execute when:
- task structure is clear upfront
- order of operations matters
- you can't afford trial-and-error
- example: code refactoring, data migrations, multi-step workflows
- use ReAct when:
- you need to adapt based on what you find
- information gathering is exploratory
- optimal path depends on intermediate results
- example: research tasks, customer support, debugging
pattern 3: tool use (giving agents capabilities)
tool use patterns determine how agents interact with external systems, APIs, and functions.
this sounds basic until you realize: tool design directly impacts agent reliability.
the tool selection problem
building a productivity agent last year, i made 23 different tools (createTask, updateTask, completeTask, deleteTask, etc.).
agent got confused. kept choosing wrong tools. high latency from analyzing all options.
refactored to 5 core tools with parameters. problem solved.
two approaches to tool use
1. explicit tool calling
agent explicitly decides which tool to call and when. most common pattern. works well for:
- well-defined APIs
- operations with clear inputs/outputs
- scenarios where you need audit trails
example: email agent with tools like `send_email(to, subject, body)`, `search_inbox(query)`, `schedule_meeting(attendees, time)`
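in function-calling APIs, each of those tools is declared as a JSON schema the model picks from. a sketch of the `send_email` declaration (the key names vary by provider: Anthropic uses `input_schema`, OpenAI uses `parameters`; the parameter details here are illustrative):

```python
send_email_tool = {
    "name": "send_email",
    "description": (
        "Send an email to a recipient. "
        "Example: send_email(to='jane@acme.com', subject='Q3 review', body='Hi Jane, ...')"
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "to": {"type": "string", "description": "recipient email address"},
            "subject": {"type": "string", "description": "one-line subject"},
            "body": {"type": "string", "description": "plain-text email body"},
        },
        "required": ["to", "subject", "body"],
    },
}
```

note the usage example baked into the description. that's a design principle we'll come back to below.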
2. Toolformer pattern
as research on Toolformer patterns shows, this trains the model to decide when and which tool to call during its reasoning process.
the agent learns:
- when it needs external information
- which tool will provide that information
- how to interpret tool results
more flexible but requires more sophisticated training/prompting.
practical tool design principles
after building dozens of agents with different tool architectures:
- fewer tools, richer parameters
instead of: `getUserById`, `getUserByEmail`, `getUserByName`
use: `getUser(id?, email?, name?)`
- clear, descriptive tool names
bad: `fetch_data()`
good: `search_customer_orders(customer_id, date_range)`
- include examples in tool descriptions
agents perform significantly better when tool descriptions include usage examples. show the agent exactly how to call each tool.
- return structured, parseable results
returning clean JSON beats returning formatted strings. makes it easier for agents to extract specific fields.
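concretely, the difference looks like this (hypothetical order lookup):

```python
# formatted string: the agent has to re-parse prose to find any field
bad = "Order #1042 for Jane Doe shipped 2024-03-02 via UPS, 3 items, total $84.50"

# structured JSON: the agent reads fields directly, no parsing guesswork
good = {
    "order_id": 1042,
    "customer": "Jane Doe",
    "shipped_at": "2024-03-02",
    "carrier": "UPS",
    "item_count": 3,
    "total_usd": 84.50,
}
```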
the tool use mistake that cost me a week
built an agent with a `search_documents` tool that could return hundreds of results.
agent would call it, get 200 documents back, hit context limits, crash.
solution: tools should return summaries, not everything. give the agent just enough information to decide if it needs to drill deeper. then provide a `get_document_details` tool for specific documents.
progressive disclosure. works for users, works for agents.
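a sketch of the two-tool split, with a toy in-memory corpus so it's self-contained (swap in your real search index and document store):

```python
# toy corpus standing in for a real index + document store
DOCS = {
    "doc-1": {"title": "Q3 shipping report", "text": "Full report text..."},
    "doc-2": {"title": "Returns policy", "text": "Full policy text..."},
}

def search_documents(query: str, limit: int = 10) -> list[dict]:
    """Return lightweight summaries only: enough to decide what to open."""
    hits = [(doc_id, d) for doc_id, d in DOCS.items()
            if query.lower() in d["text"].lower()]
    return [{"doc_id": doc_id, "title": d["title"], "snippet": d["text"][:200]}
            for doc_id, d in hits[:limit]]

def get_document_details(doc_id: str) -> dict:
    """Full content, fetched only for documents the agent chose to drill into."""
    d = DOCS[doc_id]
    return {"doc_id": doc_id, "title": d["title"], "content": d["text"]}
```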
pattern 4: multi-agent orchestration (when one agent isn't enough)
multi-agent patterns split tasks across specialized agents that coordinate to solve complex problems.
powerful. also expensive and complex.
when to actually use multi-agent
i see people jumping to multi-agent way too early. you don't need multiple agents just because your task has multiple steps.
use multi-agent when:
- different parts of the task require genuinely different capabilities
- you benefit from parallel processing
- context separation improves quality
- specialized expertise matters
the orchestration design framework
according to research on agentic orchestration patterns, there are four fundamental ways to coordinate multiple agents. choosing the right orchestration pattern determines whether your multi-agent system is elegant or a tangled mess.
1. sequential orchestration
agents process tasks in a linear sequence, each building on the previous agent's output.
agent A → agent B → agent C → final output
when to use: workflows with natural progression where each step depends on the previous one
example: content creation pipeline
- research agent: gathers information and sources
- writing agent: creates draft based on research
- editing agent: refines tone and clarity
- fact-checking agent: validates all claims
pros: simple to understand and debug, clear dependencies, easy error isolation
cons: can be slow (no parallelism), bottleneck if one agent is slow
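the wiring is just function composition. a sketch where each agent is a plain callable taking the previous stage's output:

```python
from functools import reduce
from typing import Callable

Agent = Callable[[str], str]  # takes the previous stage's output, returns its own

def sequential(agents: list[Agent], task: str) -> str:
    """Pipe the task through each agent in order."""
    return reduce(lambda output, agent: agent(output), agents, task)

# hypothetical usage:
# report = sequential([research_agent, writing_agent, editing_agent, fact_check_agent], topic)
```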
2. parallel orchestration
multiple agents work simultaneously on different aspects of a problem, results combined at the end.
agent A ┐
agent B ├→ aggregator → final output
agent C ┘
when to use: tasks decomposable into independent sub-tasks
example: comprehensive market analysis
- agent 1: analyzes competitor pricing
- agent 2: reviews customer sentiment
- agent 3: examines market trends
- aggregator: synthesizes all findings into unified report
built exactly this last quarter. processing time dropped from 45 minutes (sequential) to 8 minutes (parallel).
pros: fast execution, scales horizontally, efficient use of resources
cons: coordination complexity, potential for inconsistent outputs, harder to debug
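fan-out/fan-in maps naturally to `asyncio`. a sketch, assuming each agent is an async callable:

```python
import asyncio
from typing import Awaitable, Callable

AsyncAgent = Callable[[str], Awaitable[str]]

async def parallel(
    agents: list[AsyncAgent],
    task: str,
    aggregate: Callable[[list[str]], str],  # the aggregator from the diagram above
) -> str:
    """Run all agents concurrently, then combine their outputs."""
    results = await asyncio.gather(*(agent(task) for agent in agents))
    return aggregate(list(results))
```

the coordination complexity lives in `aggregate`: that's where inconsistent outputs have to be reconciled.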
3. hierarchical orchestration (supervisor-worker)
higher-level supervisor agent coordinates and directs lower-level worker agents in a tree structure.
supervisor
├── research agent
├── writing agent
├── editing agent
└── fact-checking agent
when to use: complex problems requiring both strategic planning and detailed execution
built a content creation system using this pattern:
- supervisor analyzes topic and creates execution plan
- assigns research to specialized research agent
- passes research to writing agent with specific requirements
- delegates editing to editing agent
- coordinates fact-checking in parallel
- synthesizes all outputs into final content
the supervisor makes high-level decisions (what needs to be done, in what order, with what constraints) while workers focus on execution.
result: output quality improved 2.5x compared to single-agent approach
pros: clear responsibilities, easy to add specialized agents, supervisor optimizes workflow
cons: supervisor becomes bottleneck, added coordination overhead, can be overkill for simple tasks
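structurally, the supervisor is a planner whose plan steps name workers. a minimal sketch (the planner, worker registry, and synthesizer are all hypothetical callables):

```python
from typing import Callable

def supervise(
    task: str,
    plan: Callable[[str], list[tuple[str, str]]],    # task -> [(worker_name, subtask), ...]
    workers: dict[str, Callable[[str, dict], str]],  # worker name -> callable
    synthesize: Callable[[dict], str],               # combines all worker outputs
) -> str:
    context: dict[str, str] = {}
    for worker_name, subtask in plan(task):  # supervisor decides what runs, in what order
        context[worker_name] = workers[worker_name](subtask, context)  # workers see prior results
    return synthesize(context)
```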
4. network-based orchestration
agents communicate in flexible, dynamic networks where the interaction structure adapts to the task.
as research on orchestration patterns shows, this is the most flexible pattern—agents dynamically determine who to collaborate with based on the problem.
when to use: complex, unpredictable problems where optimal collaboration can't be predetermined
example: multi-agent research system
- research agents explore different information sources
- when agent A finds relevant info, it shares with agents B and C
- agents dynamically form sub-groups around emerging insights
- collaboration patterns evolve as understanding deepens
this is powerful but complex. only use when the problem genuinely benefits from dynamic collaboration.
pros: highly adaptive, can handle complex/unknown problems, emergent intelligence
cons: hardest to design and debug, unpredictable behavior, requires sophisticated coordination
choosing the right orchestration pattern
quick decision guide:
- sequential: clear step-by-step workflow, each step depends on previous
- parallel: independent sub-tasks that can run simultaneously
- hierarchical: need central planning and specialized workers
- network: complex problems where best collaboration emerges dynamically
in my experience: 70% of multi-agent systems use sequential or parallel orchestration. only use hierarchical or network when you have clear evidence simpler patterns won't work.
the multi-agent cost problem
here's what nobody tells you: multi-agent is expensive.
as i mentioned in my context engineering guide, Anthropic reported that their multi-agent researcher used up to 15x more tokens than single-agent approaches.
my experience mirrors this. expect 10-15x token usage and 3-5x latency.
multi-agent is worth it when:
- quality improvements justify the cost
- parallel processing provides meaningful speedup
- specialization actually matters
but start with a single agent. add more agents only when you have clear evidence it'll help.
pattern 5: memory-augmented agents (learning from the past)
memory-augmented patterns give agents persistent context across sessions, enabling personalization and learning over time.
this is the difference between an agent that forgets everything after each conversation and one that actually remembers you, your preferences, and past interactions.
why memory matters for agents
most agents are stateless—they start fresh every time. this works for simple tasks but fails for:
- personalized assistance that adapts to user preferences
- long-running projects that span multiple sessions
- learning from past successes and failures
- building context about recurring tasks
as covered in my LangGraph memory guide, there are three types of memory that matter:
three types of agent memory
1. episodic memory (remembering what happened)
stores specific past interactions and experiences.
example: customer support agent that remembers:
- "this customer had a shipping issue last month"
- "we already tried solution A, it didn't work"
- "user prefers email over phone contact"
implementation: store conversation summaries, key events, and outcomes in a vector database. retrieve relevant episodes when similar situations arise.
2. semantic memory (remembering facts and knowledge)
stores general knowledge and facts learned over time.
example: coding agent that remembers:
- "this project uses React 18 with TypeScript"
- "team prefers functional components over class components"
- "API endpoints are documented in /docs/api.md"
implementation: maintain a knowledge base of project facts, conventions, and learned information. update as new information is discovered.
3. procedural memory (remembering how to do things)
stores learned procedures, workflows, and successful patterns.
example: writing agent that remembers:
- "user always wants headlines in sentence case"
- "blog posts need 3 examples minimum"
- "avoid jargon, explain technical terms"
implementation: extract successful patterns from past interactions. codify user preferences and proven workflows into reusable procedures.
memory retrieval strategies
storing memory is easy. retrieving the right memories at the right time is the hard part.
recency-based retrieval
prioritize recent memories. simple but effective for most use cases.
"what did we discuss in the last 3 conversations?"
relevance-based retrieval
use semantic search to find memories similar to current context.
"find past conversations about shipping issues"
importance-based retrieval
score and prioritize memories by importance. surface critical information first.
"user explicitly stated this is a hard requirement" → high importance
hybrid retrieval
combine multiple strategies. most production systems use this approach.
example retrieval logic:
- always include last 2 conversations (recency)
- search for semantically similar past interactions (relevance)
- surface any high-importance facts or preferences (importance)
- combine and rank by composite score
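a sketch of the composite score. the weights are made up; in practice you'd tune them against your own retrieval quality metrics:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    recency: float      # 0..1, e.g. exponential decay on age (see memory decay below)
    similarity: float   # 0..1, cosine similarity to the current context
    importance: float   # 0..1, scored when the memory was written

def rank_memories(
    memories: list[Memory], k: int = 5,
    w_rec: float = 0.3, w_sim: float = 0.5, w_imp: float = 0.2,
) -> list[Memory]:
    """Rank by weighted composite score, keep the top-k."""
    return sorted(
        memories,
        key=lambda m: w_rec * m.recency + w_sim * m.similarity + w_imp * m.importance,
        reverse=True,
    )[:k]
```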
real example: personalized email agent
built an email drafting agent with memory last quarter. here's what it remembers:
episodic memory:
- past email threads with each contact
- previous meeting notes and outcomes
- follow-up commitments and deadlines
semantic memory:
- relationship context (client, colleague, vendor)
- project details and current status
- organizational knowledge (team structure, processes)
procedural memory:
- user's writing style and tone preferences
- email structure preferences (greeting, sign-off)
- which types of emails need which level of formality
result: drafts now match my writing style 90%+ of the time. agent remembers context from months ago without me having to explain it.
memory management challenges
building memory-augmented agents isn't just about storing everything. you need to manage:
memory decay
old information becomes stale. should an agent remember a preference from 2 years ago?
solution: implement decay scores. recent memories have higher weight. periodically archive or forget low-relevance old memories.
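exponential decay is the usual starting point. a sketch (the 30-day half-life is an assumption to tune, not a recommendation):

```python
import time

def recency_score(created_at: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: a memory loses half its weight every half_life_days."""
    age_days = (time.time() - created_at) / 86400  # created_at is a unix timestamp
    return 0.5 ** (age_days / half_life_days)
```

this feeds directly into the `recency` field of the hybrid retrieval sketch above.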
memory conflicts
what if stored memories contradict each other or current information?
solution: timestamp all memories. when conflicts arise, prefer recent information. flag contradictions for user clarification.
privacy and security
storing user data creates privacy obligations and security risks.
solution: implement proper data governance:
- clear retention policies
- user controls for viewing/deleting memories
- encryption for sensitive information
- compliance with data protection regulations
memory retrieval cost
searching large memory stores adds latency and token costs.
solution: implement efficient indexing (vector databases), limit retrieval to top-k most relevant memories, cache frequently accessed memories.
when to use memory-augmented patterns
add persistent memory when:
- personalization matters - agent needs to adapt to individual users
- long-term context is valuable - tasks span multiple sessions
- learning improves performance - agent gets better from past experience
- consistency is important - need to maintain coherent behavior over time
skip memory when:
- tasks are completely independent
- privacy concerns outweigh benefits
- single-session context is sufficient
- added complexity isn't justified by use case
in my experience: memory transforms good agents into great ones—but only when the use case actually benefits from persistent context.
combining patterns (where it gets interesting)
the real magic happens when you combine these patterns.
example: production research agent
built a research agent that combines multiple patterns:
- planning (ReAct): adapts research strategy based on findings
- tool use: searches databases, web, internal docs
- multi-agent: parallel research across different sources
- reflection: validates findings and checks for contradictions
- memory: remembers previous research topics and key findings to avoid redundant work
the flow:
- receives research question
- uses ReAct to explore information sources
- spins up parallel agents for deep dives into promising areas
- each agent uses tools to gather specific information
- supervisor synthesizes findings
- reflection step validates conclusions and checks sources
- final report generation
results: produces research reports in 15 minutes that previously took analysts 4-6 hours
example: customer support automation
different pattern combination:
- planning (ReAct): dynamically gathers customer context
- tool use: accesses CRM, order systems, knowledge base
- reflection: reviews responses for tone and accuracy
- memory: remembers customer history and previous interactions
- single agent: no multi-agent overhead needed
kept it simple. handles 300+ tickets daily with 85% resolution rate without human intervention. memory of past interactions reduces repeated questions and improves personalization.
the pattern selection framework
according to research on pattern selection frameworks, choosing the right agentic design pattern depends on understanding your task characteristics. here's the structured approach i use:
step 1: analyze task characteristics
before choosing a pattern, answer these questions:
- complexity: simple → single-step vs complex → multi-step?
- predictability: deterministic path vs exploratory discovery?
- dependencies: sequential steps vs independent sub-tasks?
- expertise: single domain vs multiple specialized domains?
- quality needs: first-pass acceptable vs iteration required?
- time constraints: can wait for sequential vs need parallel processing?
step 2: match task to pattern
based on your answers, here's the decision tree:
for single-step or simple tasks:
- direct tool use - if it's just calling an API or executing a function
- ReAct + reflection - if you need some reasoning and quality checking
for sequential decision-making tasks:
- ReAct pattern - when you need to adapt based on intermediate results
- example: research, debugging, customer support, data exploration
for structured, multi-step tasks:
- plan-and-execute - when you can map out steps upfront
- sequential orchestration - if steps depend on each other
- example: code migrations, data pipelines, content workflows
for complex, multi-domain tasks:
- parallel orchestration - if sub-tasks are independent
- hierarchical orchestration - if you need central coordination
- example: market analysis, comprehensive reports, system audits
for unpredictable, evolving tasks:
- network orchestration - when collaboration patterns need to emerge
- example: open-ended research, creative problem-solving
for personalized, long-running tasks:
- memory-augmented agents - when agents need to learn and adapt over time
- example: personal assistants, customer support, ongoing projects
step 3: add reflection when quality matters
reflection improves output quality across all patterns. add it when:
- first-pass outputs often need refinement
- you have clear quality criteria to check against
- the cost of iteration is less than cost of poor output
practical decision framework
after two years of building agents, here's my simplified approach:
start here: single agent with ReAct + reflection
this handles 70% of use cases. seriously.
- ReAct for dynamic information gathering
- basic tool use for necessary capabilities
- reflection for quality improvement
build this first. only add complexity if you have specific evidence it'll help.
add memory when:
- personalization improves user experience
- agents need to learn from past interactions
- tasks span multiple sessions
- consistency over time matters
upgrade to plan-and-execute when:
- tasks have clear multi-step structure
- order of operations matters
- you need to coordinate multiple actions upfront
upgrade to multi-agent orchestration when:
- parallel processing provides measurable speedup
- specialized expertise significantly improves quality
- context separation is necessary for performance
- cost increase (10-15x) is justified by results
anti-patterns to avoid
patterns i see people get wrong:
- using plan-and-execute for exploratory tasks - makes agents rigid when they need flexibility
- multi-agent for simple linear workflows - adds complexity without benefit
- reflection without clear quality criteria - endless loops without improvement
- network orchestration for predictable problems - complexity that provides no value
match the pattern to the problem. not the other way around.
common mistakes (that i definitely didn't make... multiple times)
1. overengineering from the start
my first production agent had:
- sophisticated planning system
- five specialized sub-agents
- complex memory architecture
- elaborate reflection loops
it was slow, expensive, and constantly broke in weird ways.
rewrote it as a simple ReAct agent with reflection. worked better, cost 1/10th as much.
start simple. add complexity only when you have evidence it'll help.
2. wrong pattern for the task
used plan-and-execute for a customer support agent. made it rigid and slow.
customer questions are dynamic. you can't predict what information you'll need upfront. ReAct was the obvious better choice.
match the pattern to task characteristics, not to what sounds impressive.
3. not measuring pattern effectiveness
added reflection to an agent. assumed it helped. never measured.
turns out: for that specific use case, reflection added latency but didn't improve quality. the first-pass outputs were already good enough.
measure everything. patterns should be justified by data, not intuition.
practical implementation advice
1. prototype with minimal patterns
start with the simplest possible implementation:
- single agent
- basic ReAct loop
- 2-3 essential tools
- no reflection initially
get this working first. then add patterns based on where it fails.
2. measure pattern impact
track:
- task success rate
- output quality scores
- token usage
- latency
- cost per task
add patterns incrementally. measure impact each time.
3. build pattern libraries
create reusable implementations of each pattern. makes it faster to try different combinations.
my current setup has modular components for:
- ReAct loop implementation
- reflection wrapper
- planning system
- multi-agent coordination
can compose these into new agents quickly.
4. test with real data
synthetic test data lies. it's too clean, too well-formatted.
patterns that work perfectly in testing often fail on real data. test with actual production data as early as possible.
key takeaways
- patterns beat frameworks - understanding core patterns matters more than picking the "right" framework
- five core patterns - reflection, planning (ReAct vs plan-and-execute), tool use, multi-agent orchestration, and memory-augmented agents
- orchestration design framework - sequential, parallel, hierarchical, and network patterns for multi-agent coordination
- pattern selection framework - analyze task characteristics (complexity, predictability, dependencies) to match the right pattern
- start simple - 70% of use cases need just ReAct + reflection. add complexity only when justified by data
- reflection improves quality - agents that critique their own work are fundamentally more reliable
- planning depends on task type - use plan-and-execute for structured tasks, ReAct for dynamic exploration
- tool design matters - fewer, well-designed tools beat many specialized tools
- memory transforms agents - persistent context enables personalization and learning, but adds complexity
- hierarchical orchestration - layered abstraction helps manage complexity at scale through supervisor-worker patterns
- multi-agent is expensive - expect 10-15x cost increase. only use when parallel processing or specialization justifies it
- orchestration patterns scale differently - sequential is simplest, network is most complex. choose based on need
- combine patterns thoughtfully - the real power comes from combining patterns appropriately for your task
- measure everything - patterns should be justified by data, not intuition or what sounds impressive
resources worth reading
- Leapfrogger's orchestration design patterns - detailed guide to sequential, parallel, hierarchical, and network orchestration
- Analytics Vidhya's pattern selection framework - task-based approach to choosing agentic patterns
- Hypermode's guide to agentic design patterns - comprehensive overview of core patterns and implementation
- DataKnobs agent AI design patterns - detailed breakdown of ReAct, planning, and tool use patterns
- Multi-agent coordination patterns - practical guide to multi-agent architectures
- AI agents: the complete guide - foundational concepts and architecture
- context engineering for agents - managing context in production agents
- building agents with claude agent SDK - practical implementation with a specific framework
struggling with agent architecture decisions? i help companies choose and implement the right agentic design patterns for their specific use cases. let's discuss your agent system