last month, i watched a team spend three weeks building an agent that could have been built in three days. why? they kept throwing tools at the problem instead of stepping back to understand the underlying design pattern they actually needed.
(i've made this exact mistake. multiple times.)
this post breaks down the core agentic design patterns that show up in every successful agent system i've worked on: reflection, planning, tool use, multi-agent coordination, and memory-augmented systems. these aren't academic concepts—they're practical frameworks that determine whether your agent works in production or fails spectacularly.
why design patterns matter (more than you think)
here's the thing about building agents: you can get a demo working in an afternoon. but getting that same agent to handle 1,000 real user requests without falling apart? that requires understanding design patterns.
as the research from Hypermode on agentic design patterns points out, these patterns provide repeatable solutions to common challenges in agent development. they define clear interfaces between agents, tools, models, and data sources.
more importantly, they give you:
- scalability - reusable structures that don't become unmaintainable as complexity grows
- reliability - standardized interactions lead to predictable behavior
- modularity - clear interfaces let you expand functionality without technical debt
- team communication - shared vocabulary for discussing complex agent behaviors
pattern 1: reflection (the agent that critiques itself)
reflection is the pattern where your agent evaluates and improves its own output before finalizing an answer.
sounds simple. changes everything.
how reflection actually works
the basic flow:
generate initial output → critique that output → improve based on critique → repeat until satisfied
according to DataKnobs' guide on agent design patterns, this self-refinement mechanism significantly enhances output quality without requiring human intervention.
when i use reflection
when i built a code generation agent last quarter, the initial outputs were... rough. syntax errors, missing imports, incomplete logic. classic first-draft problems.
added a reflection step:
- agent generates code
- agent runs linter and tests
- agent reviews errors and warnings
- agent fixes issues
- repeat until tests pass
results: code quality improved 3x. more importantly, the agent caught its own mistakes before users saw them.
reflection patterns in practice
i've seen reflection work well in:
- content generation - draft, review tone and clarity, revise
- code writing - generate, lint, test, fix
- data analysis - produce results, validate calculations, correct errors
- research synthesis - summarize findings, check for contradictions, refine conclusions
the reflection pitfall (i learned the hard way)
reflection isn't free. each iteration costs tokens and time.
early version of my code agent got stuck in reflection loops—making minor tweaks 10-15 times before deciding it was "good enough." users waited 2+ minutes for simple tasks.
solution: set clear stopping criteria. either the output meets specific quality thresholds (tests pass, no linter errors) or you hit a maximum iteration count. don't let agents philosophize endlessly about perfection.
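here's what that loop looks like in code. a minimal sketch, assuming hypothetical `generate`, `critique`, and `revise` callables wired to your model:

```python
from typing import Callable

def reflect_and_refine(
    task: str,
    generate: Callable[[str], str],           # produces the initial draft
    critique: Callable[[str], list[str]],     # returns a list of issues; empty means good
    revise: Callable[[str, list[str]], str],  # fixes the listed issues
    max_iterations: int = 3,                  # hard cap so the agent can't loop forever
) -> str:
    output = generate(task)
    for _ in range(max_iterations):
        issues = critique(output)             # e.g. run linter and tests
        if not issues:                        # quality threshold met: stop early
            return output
        output = revise(output, issues)
    return output                             # best effort after hitting the cap
```

both stopping criteria are in there: the empty-issues check (tests pass, no linter errors) and the iteration cap.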
pattern 2: planning (think before you act)
planning patterns make agents create a strategy before executing actions.
the alternative? agents that jump straight to execution and then backtrack when they realize they're solving the wrong problem.
two approaches to planning
1. plan-and-execute
as described in agent design pattern research, this approach has the agent create a full plan before execution, often using sub-agents or task chains.
analyze task → break into steps → create detailed plan → execute plan sequentially
when to use: complex, multi-step tasks where order matters
example from last month: built an agent to migrate a codebase from JavaScript to TypeScript. without planning, it started converting files randomly, breaking imports everywhere.
with plan-and-execute:
- agent analyzed dependency graph
- created conversion order (leaf dependencies first)
- planned type definitions for shared utilities
- executed conversions in correct sequence
result: zero broken imports. migration completed in 2 hours instead of 2 days of manual debugging.
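the skeleton is simple. a sketch, where `make_plan` and `execute_step` are hypothetical stand-ins for your model calls:

```python
from typing import Callable

def plan_and_execute(
    task: str,
    make_plan: Callable[[str], list[str]],          # one call: task -> ordered list of steps
    execute_step: Callable[[str, list[str]], str],  # runs one step, sees all prior results
) -> list[str]:
    plan = make_plan(task)    # e.g. ["analyze dependency graph", "convert leaf modules", ...]
    results: list[str] = []
    for step in plan:         # execute strictly in plan order
        results.append(execute_step(step, results))
    return results
```

the key property: the full plan exists before any step runs, so ordering mistakes surface at planning time, not halfway through execution.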
2. ReAct (reason + act)
ReAct combines reasoning and action in a step-by-step loop. instead of planning everything upfront, the agent reasons about the current state, takes one action, observes the result, then reasons again.
thought: what do i need to know? → action: search database → observation: found X → thought: now i should... → repeat
when to use: tasks where you need to adapt based on intermediate results
built a customer support agent using ReAct:
- thought: i need to look up this customer's order history
- action: query orders database
- observation: customer has 3 orders, most recent was delivered yesterday
- thought: they're asking about a missing item, should check the specific order details
- action: get order line items
- observation: order contains the item they're asking about
- thought: delivery was successful, this might be a packaging issue...
ReAct shines when you can't predict what information you'll need until you see intermediate results.
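the loop itself is small. a sketch, assuming an `llm` callable that returns either a tool invocation or a final answer (the dict shape is my assumption, not any specific framework's API):

```python
from typing import Callable

def react_loop(
    question: str,
    llm: Callable[[str], dict],              # returns {"thought", "action", "input"} or {"answer"}
    tools: dict[str, Callable[[str], str]],  # tool name -> callable
    max_steps: int = 10,                     # same lesson as reflection: always cap the loop
) -> str:
    transcript = f"question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)               # reason about the current state
        if "answer" in step:                 # model decided it has enough information
            return step["answer"]
        observation = tools[step["action"]](step["input"])  # act
        transcript += (                      # observe, then reason again next iteration
            f"thought: {step['thought']}\n"
            f"action: {step['action']}({step['input']})\n"
            f"observation: {observation}\n"
        )
    return "max steps reached without an answer"
```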
planning vs. ReAct: which pattern when?
after building both types extensively:
- use plan-and-execute when:
- task structure is clear upfront
- order of operations matters
- you can't afford trial-and-error
- example: code refactoring, data migrations, multi-step workflows
- use ReAct when:
- you need to adapt based on what you find
- information gathering is exploratory
- optimal path depends on intermediate results
- example: research tasks, customer support, debugging
pattern 3: tool use (giving agents capabilities)
tool use patterns determine how agents interact with external systems, APIs, and functions.
this sounds basic until you realize: tool design directly impacts agent reliability.
the tool selection problem
building a productivity agent last year, i made 23 different tools (createTask, updateTask, completeTask, deleteTask, etc.).
agent got confused. kept choosing wrong tools. high latency from analyzing all options.
refactored to 5 core tools with parameters. problem solved.
two approaches to tool use
1. explicit tool calling
agent explicitly decides which tool to call and when. most common pattern. works well for:
- well-defined APIs
- operations with clear inputs/outputs
- scenarios where you need audit trails
example: email agent with tools like `send_email(to, subject, body)`, `search_inbox(query)`, `schedule_meeting(attendees, time)`
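in function-calling APIs, each of those tools is declared as a JSON schema the model picks from. a sketch of the `send_email` declaration (the key names vary by provider: Anthropic uses `input_schema`, OpenAI uses `parameters`; the parameter details here are illustrative):

```python
send_email_tool = {
    "name": "send_email",
    "description": (
        "Send an email to a recipient. "
        "Example: send_email(to='jane@acme.com', subject='Q3 review', body='Hi Jane, ...')"
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "to": {"type": "string", "description": "recipient email address"},
            "subject": {"type": "string", "description": "one-line subject"},
            "body": {"type": "string", "description": "plain-text email body"},
        },
        "required": ["to", "subject", "body"],
    },
}
```

note the usage example baked into the description. that's a design principle we'll come back to below.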
2. Toolformer pattern
as research on Toolformer patterns shows, this trains the model to decide when and which tool to call during its reasoning process.
the agent learns:
- when it needs external information
- which tool will provide that information
- how to interpret tool results
more flexible but requires more sophisticated training/prompting.
practical tool design principles
after building dozens of agents with different tool architectures:
- fewer tools, richer parameters
instead of: `getUserById`, `getUserByEmail`, `getUserByName`
use: `getUser(id?, email?, name?)`
- clear, descriptive tool names
bad: `fetch_data()`
good: `search_customer_orders(customer_id, date_range)`
- include examples in tool descriptions
agents perform significantly better when tool descriptions include usage examples. show the agent exactly how to call each tool.
- return structured, parseable results
returning clean JSON beats returning formatted strings. makes it easier for agents to extract specific fields.
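concretely, the difference looks like this (hypothetical order lookup):

```python
# formatted string: the agent has to re-parse prose to find any field
bad = "Order #1042 for Jane Doe shipped 2024-03-02 via UPS, 3 items, total $84.50"

# structured JSON: the agent reads fields directly, no parsing guesswork
good = {
    "order_id": 1042,
    "customer": "Jane Doe",
    "shipped_at": "2024-03-02",
    "carrier": "UPS",
    "item_count": 3,
    "total_usd": 84.50,
}
```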
the tool use mistake that cost me a week
built an agent with a `search_documents` tool that could return hundreds of results.
agent would call it, get 200 documents back, hit context limits, crash.
solution: tools should return summaries, not everything. give the agent just enough information to decide if it needs to drill deeper. then provide a `get_document_details` tool for specific documents.
progressive disclosure. works for users, works for agents.
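a sketch of the two-tool split, with a toy in-memory corpus so it's self-contained (swap in your real search index and document store):

```python
# toy corpus standing in for a real index + document store
DOCS = {
    "doc-1": {"title": "Q3 shipping report", "text": "Full report text..."},
    "doc-2": {"title": "Returns policy", "text": "Full policy text..."},
}

def search_documents(query: str, limit: int = 10) -> list[dict]:
    """Return lightweight summaries only: enough to decide what to open."""
    hits = [(doc_id, d) for doc_id, d in DOCS.items()
            if query.lower() in d["text"].lower()]
    return [{"doc_id": doc_id, "title": d["title"], "snippet": d["text"][:200]}
            for doc_id, d in hits[:limit]]

def get_document_details(doc_id: str) -> dict:
    """Full content, fetched only for documents the agent chose to drill into."""
    d = DOCS[doc_id]
    return {"doc_id": doc_id, "title": d["title"], "content": d["text"]}
```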
pattern 4: multi-agent orchestration (when one agent isn't enough)
multi-agent patterns split tasks across specialized agents that coordinate to solve complex problems.
powerful. also expensive and complex.
when to actually use multi-agent
i see people jumping to multi-agent way too early. you don't need multiple agents just because your task has multiple steps.
use multi-agent when:
- different parts of the task require genuinely different capabilities
- you benefit from parallel processing
- context separation improves quality
- specialized expertise matters
the orchestration design framework
according to research on agentic orchestration patterns, there are four fundamental ways to coordinate multiple agents. choosing the right orchestration pattern determines whether your multi-agent system is elegant or a tangled mess.
1. sequential orchestration
agents process tasks in a linear sequence, each building on the previous agent's output.
agent A → agent B → agent C → final output
when to use: workflows with natural progression where each step depends on the previous one
example: content creation pipeline
- research agent: gathers information and sources
- writing agent: creates draft based on research
- editing agent: refines tone and clarity
- fact-checking agent: validates all claims
pros: simple to understand and debug, clear dependencies, easy error isolation
cons: can be slow (no parallelism), bottleneck if one agent is slow
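the wiring is just function composition. a sketch where each agent is a plain callable taking the previous stage's output:

```python
from functools import reduce
from typing import Callable

Agent = Callable[[str], str]  # takes the previous stage's output, returns its own

def sequential(agents: list[Agent], task: str) -> str:
    """Pipe the task through each agent in order."""
    return reduce(lambda output, agent: agent(output), agents, task)

# hypothetical usage:
# report = sequential([research_agent, writing_agent, editing_agent, fact_check_agent], topic)
```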
2. parallel orchestration
multiple agents work simultaneously on different aspects of a problem, results combined at the end.
agent A ┐
agent B ├→ aggregator → final output
agent C ┘
when to use: tasks decomposable into independent sub-tasks
example: comprehensive market analysis
- agent 1: analyzes competitor pricing
- agent 2: reviews customer sentiment
- agent 3: examines market trends
- aggregator: synthesizes all findings into unified report
built exactly this last quarter. processing time dropped from 45 minutes (sequential) to 8 minutes (parallel).
pros: fast execution, scales horizontally, efficient use of resources
cons: coordination complexity, potential for inconsistent outputs, harder to debug
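fan-out/fan-in maps naturally to `asyncio`. a sketch, assuming each agent is an async callable:

```python
import asyncio
from typing import Awaitable, Callable

AsyncAgent = Callable[[str], Awaitable[str]]

async def parallel(
    agents: list[AsyncAgent],
    task: str,
    aggregate: Callable[[list[str]], str],  # the aggregator from the diagram above
) -> str:
    """Run all agents concurrently, then combine their outputs."""
    results = await asyncio.gather(*(agent(task) for agent in agents))
    return aggregate(list(results))
```

the coordination complexity lives in `aggregate`: that's where inconsistent outputs have to be reconciled.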
3. hierarchical orchestration (supervisor-worker)
higher-level supervisor agent coordinates and directs lower-level worker agents in a tree structure.
supervisor
├── research agent
├── writing agent
├── editing agent
└── fact-checking agent
when to use: complex problems requiring both strategic planning and detailed execution
built a content creation system using this pattern:
- supervisor analyzes topic and creates execution plan
- assigns research to specialized research agent
- passes research to writing agent with specific requirements
- delegates editing to editing agent
- coordinates fact-checking in parallel
- synthesizes all outputs into final content
the supervisor makes high-level decisions (what needs to be done, in what order, with what constraints) while workers focus on execution.
result: output quality improved 2.5x compared to single-agent approach
pros: clear responsibilities, easy to add specialized agents, supervisor optimizes workflow
cons: supervisor becomes bottleneck, added coordination overhead, can be overkill for simple tasks
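structurally, the supervisor is a planner whose plan steps name workers. a minimal sketch (the planner, worker registry, and synthesizer are all hypothetical callables):

```python
from typing import Callable

def supervise(
    task: str,
    plan: Callable[[str], list[tuple[str, str]]],    # task -> [(worker_name, subtask), ...]
    workers: dict[str, Callable[[str, dict], str]],  # worker name -> callable
    synthesize: Callable[[dict], str],               # combines all worker outputs
) -> str:
    context: dict[str, str] = {}
    for worker_name, subtask in plan(task):  # supervisor decides what runs, in what order
        context[worker_name] = workers[worker_name](subtask, context)  # workers see prior results
    return synthesize(context)
```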
4. network-based orchestration
agents communicate in flexible, dynamic networks where the interaction structure adapts to the task.
as research on orchestration patterns shows, this is the most flexible pattern—agents dynamically determine who to collaborate with based on the problem.
when to use: complex, unpredictable problems where optimal collaboration can't be predetermined
example: multi-agent research system
- research agents explore different information sources
- when agent A finds relevant info, it shares with agents B and C
- agents dynamically form sub-groups around emerging insights
- collaboration patterns evolve as understanding deepens
this is powerful but complex. only use when the problem genuinely benefits from dynamic collaboration.
pros: highly adaptive, can handle complex/unknown problems, emergent intelligence
cons: hardest to design and debug, unpredictable behavior, requires sophisticated coordination
choosing the right orchestration pattern
quick decision guide:
- sequential: clear step-by-step workflow, each step depends on previous
- parallel: independent sub-tasks that can run simultaneously
- hierarchical: need central planning and specialized workers
- network: complex problems where best collaboration emerges dynamically
in my experience: 70% of multi-agent systems use sequential or parallel orchestration. only use hierarchical or network when you have clear evidence simpler patterns won't work.
the multi-agent cost problem
here's what nobody tells you: multi-agent is expensive.
as i mentioned in my context engineering guide, Anthropic reported that their multi-agent researcher used up to 15x more tokens than single-agent approaches.
my experience mirrors this. expect 10-15x token usage and 3-5x latency.
multi-agent is worth it when:
- quality improvements justify the cost
- parallel processing provides meaningful speedup
- specialization actually matters
but start with a single agent. add more agents only when you have clear evidence it'll help.
pattern 5: memory-augmented agents (learning from the past)
memory-augmented patterns give agents persistent context across sessions, enabling personalization and learning over time.
this is the difference between an agent that forgets everything after each conversation and one that actually remembers you, your preferences, and past interactions.
why memory matters for agents
most agents are stateless—they start fresh every time. this works for simple tasks but fails for:
- personalized assistance that adapts to user preferences
- long-running projects that span multiple sessions
- learning from past successes and failures
- building context about recurring tasks
as covered in my LangGraph memory guide, there are three types of memory that matter:
three types of agent memory
1. episodic memory (remembering what happened)
stores specific past interactions and experiences.
example: customer support agent that remembers:
- "this customer had a shipping issue last month"
- "we already tried solution A, it didn't work"
- "user prefers email over phone contact"
implementation: store conversation summaries, key events, and outcomes in a vector database. retrieve relevant episodes when similar situations arise.
2. semantic memory (remembering facts and knowledge)
stores general knowledge and facts learned over time.
example: coding agent that remembers:
- "this project uses React 18 with TypeScript"
- "team prefers functional components over class components"
- "API endpoints are documented in /docs/api.md"
implementation: maintain a knowledge base of project facts, conventions, and learned information. update as new information is discovered.
3. procedural memory (remembering how to do things)
stores learned procedures, workflows, and successful patterns.
example: writing agent that remembers:
- "user always wants headlines in sentence case"
- "blog posts need 3 examples minimum"
- "avoid jargon, explain technical terms"
implementation: extract successful patterns from past interactions. codify user preferences and proven workflows into reusable procedures.
memory retrieval strategies
storing memory is easy. retrieving the right memories at the right time is the hard part.
recency-based retrieval
prioritize recent memories. simple but effective for most use cases.
"what did we discuss in the last 3 conversations?"
relevance-based retrieval
use semantic search to find memories similar to current context.
"find past conversations about shipping issues"
importance-based retrieval
score and prioritize memories by importance. surface critical information first.
"user explicitly stated this is a hard requirement" → high importance
hybrid retrieval
combine multiple strategies. most production systems use this approach.
example retrieval logic:
- always include last 2 conversations (recency)
- search for semantically similar past interactions (relevance)
- surface any high-importance facts or preferences (importance)
- combine and rank by composite score
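a sketch of the composite score. the weights are made up; in practice you'd tune them against your own retrieval quality metrics:

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    recency: float      # 0..1, e.g. exponential decay on age (see memory decay below)
    similarity: float   # 0..1, cosine similarity to the current context
    importance: float   # 0..1, scored when the memory was written

def rank_memories(
    memories: list[Memory], k: int = 5,
    w_rec: float = 0.3, w_sim: float = 0.5, w_imp: float = 0.2,
) -> list[Memory]:
    """Rank by weighted composite score, keep the top-k."""
    return sorted(
        memories,
        key=lambda m: w_rec * m.recency + w_sim * m.similarity + w_imp * m.importance,
        reverse=True,
    )[:k]
```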
real example: personalized email agent
built an email drafting agent with memory last quarter. here's what it remembers:
episodic memory:
- past email threads with each contact
- previous meeting notes and outcomes
- follow-up commitments and deadlines
semantic memory:
- relationship context (client, colleague, vendor)
- project details and current status
- organizational knowledge (team structure, processes)
procedural memory:
- user's writing style and tone preferences
- email structure preferences (greeting, sign-off)
- which types of emails need which level of formality
result: drafts now match my writing style 90%+ of the time. agent remembers context from months ago without me having to explain it.
memory management challenges
building memory-augmented agents isn't just about storing everything. you need to manage:
memory decay
old information becomes stale. should an agent remember a preference from 2 years ago?
solution: implement decay scores. recent memories have higher weight. periodically archive or forget low-relevance old memories.
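exponential decay is the usual starting point. a sketch (the 30-day half-life is an assumption to tune, not a recommendation):

```python
import time

def recency_score(created_at: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: a memory loses half its weight every half_life_days."""
    age_days = (time.time() - created_at) / 86400  # created_at is a unix timestamp
    return 0.5 ** (age_days / half_life_days)
```

this feeds directly into the `recency` field of the hybrid retrieval sketch above.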
memory conflicts
what if stored memories contradict each other or current information?
solution: timestamp all memories. when conflicts arise, prefer recent information. flag contradictions for user clarification.
privacy and security
storing user data creates privacy obligations and security risks.
solution: implement proper data governance:
- clear retention policies
- user controls for viewing/deleting memories
- encryption for sensitive information
- compliance with data protection regulations
memory retrieval cost
searching large memory stores adds latency and token costs.
solution: implement efficient indexing (vector databases), limit retrieval to top-k most relevant memories, cache frequently accessed memories.
when to use memory-augmented patterns
add persistent memory when:
- personalization matters - agent needs to adapt to individual users
- long-term context is valuable - tasks span multiple sessions
- learning improves performance - agent gets better from past experience
- consistency is important - need to maintain coherent behavior over time
skip memory when:
- tasks are completely independent
- privacy concerns outweigh benefits
- single-session context is sufficient
- added complexity isn't justified by use case
in my experience: memory transforms good agents into great ones—but only when the use case actually benefits from persistent context.
combining patterns (where it gets interesting)
the real magic happens when you combine these patterns.
example: production research agent
built a research agent that combines multiple patterns:
- planning (ReAct): adapts research strategy based on findings
- tool use: searches databases, web, internal docs
- multi-agent: parallel research across different sources
- reflection: validates findings and checks for contradictions
- memory: remembers previous research topics and key findings to avoid redundant work
the flow:
- receives research question
- uses ReAct to explore information sources
- spins up parallel agents for deep dives into promising areas
- each agent uses tools to gather specific information
- supervisor synthesizes findings
- reflection step validates conclusions and checks sources
- final report generation
results: produces research reports in 15 minutes that previously took analysts 4-6 hours
example: customer support automation
different pattern combination:
- planning (ReAct): dynamically gathers customer context
- tool use: accesses CRM, order systems, knowledge base
- reflection: reviews responses for tone and accuracy
- memory: remembers customer history and previous interactions
- single agent: no multi-agent overhead needed
kept it simple. handles 300+ tickets daily with 85% resolution rate without human intervention. memory of past interactions reduces repeated questions and improves personalization.
the pattern selection framework
according to research on pattern selection frameworks, choosing the right agentic design pattern depends on understanding your task characteristics. here's the structured approach i use:
step 1: analyze task characteristics
before choosing a pattern, answer these questions:
- complexity: simple → single-step vs complex → multi-step?
- predictability: deterministic path vs exploratory discovery?
- dependencies: sequential steps vs independent sub-tasks?
- expertise: single domain vs multiple specialized domains?
- quality needs: first-pass acceptable vs iteration required?
- time constraints: can wait for sequential vs need parallel processing?
step 2: match task to pattern
based on your answers, here's the decision tree:
for single-step or simple tasks:
- direct tool use - if it's just calling an API or executing a function
- ReAct + reflection - if you need some reasoning and quality checking
for sequential decision-making tasks:
- ReAct pattern - when you need to adapt based on intermediate results
- example: research, debugging, customer support, data exploration
for structured, multi-step tasks:
- plan-and-execute - when you can map out steps upfront
- sequential orchestration - if steps depend on each other
- example: code migrations, data pipelines, content workflows
for complex, multi-domain tasks:
- parallel orchestration - if sub-tasks are independent
- hierarchical orchestration - if you need central coordination
- example: market analysis, comprehensive reports, system audits
for unpredictable, evolving tasks:
- network orchestration - when collaboration patterns need to emerge
- example: open-ended research, creative problem-solving
for personalized, long-running tasks:
- memory-augmented agents - when agents need to learn and adapt over time
- example: personal assistants, customer support, ongoing projects
step 3: add reflection when quality matters
reflection improves output quality across all patterns. add it when:
- first-pass outputs often need refinement
- you have clear quality criteria to check against
- the cost of iteration is less than cost of poor output
practical decision framework
after two years of building agents, here's my simplified approach:
start here: single agent with ReAct + reflection
this handles 70% of use cases. seriously.
- ReAct for dynamic information gathering
- basic tool use for necessary capabilities
- reflection for quality improvement
build this first. only add complexity if you have specific evidence it'll help.
add memory when:
- personalization improves user experience
- agents need to learn from past interactions
- tasks span multiple sessions
- consistency over time matters
upgrade to plan-and-execute when:
- tasks have clear multi-step structure
- order of operations matters
- you need to coordinate multiple actions upfront
upgrade to multi-agent orchestration when:
- parallel processing provides measurable speedup
- specialized expertise significantly improves quality
- context separation is necessary for performance
- cost increase (10-15x) is justified by results
anti-patterns to avoid
patterns i see people get wrong:
- using plan-and-execute for exploratory tasks - makes agents rigid when they need flexibility
- multi-agent for simple linear workflows - adds complexity without benefit
- reflection without clear quality criteria - endless loops without improvement
- network orchestration for predictable problems - complexity that provides no value
match the pattern to the problem. not the other way around.
common mistakes (that i definitely didn't make... multiple times)
1. overengineering from the start
my first production agent had:
- sophisticated planning system
- five specialized sub-agents
- complex memory architecture
- elaborate reflection loops
it was slow, expensive, and constantly broke in weird ways.
rewrote it as a simple ReAct agent with reflection. worked better, cost 1/10th as much.
start simple. add complexity only when you have evidence it'll help.
2. wrong pattern for the task
used plan-and-execute for a customer support agent. made it rigid and slow.
customer questions are dynamic. you can't predict what information you'll need upfront. ReAct was the obvious better choice.
match the pattern to task characteristics, not to what sounds impressive.
3. not measuring pattern effectiveness
added reflection to an agent. assumed it helped. never measured.
turns out: for that specific use case, reflection added latency but didn't improve quality. the first-pass outputs were already good enough.
measure everything. patterns should be justified by data, not intuition.
practical implementation advice
1. prototype with minimal patterns
start with the simplest possible implementation:
- single agent
- basic ReAct loop
- 2-3 essential tools
- no reflection initially
get this working first. then add patterns based on where it fails.
2. measure pattern impact
track:
- task success rate
- output quality scores
- token usage
- latency
- cost per task
add patterns incrementally. measure impact each time.
3. build pattern libraries
create reusable implementations of each pattern. makes it faster to try different combinations.
my current setup has modular components for:
- ReAct loop implementation
- reflection wrapper
- planning system
- multi-agent coordination
can compose these into new agents quickly.
4. test with real data
synthetic test data lies. it's too clean, too well-formatted.
patterns that work perfectly in testing often fail on real data. test with actual production data as early as possible.
key takeaways
- patterns beat frameworks - understanding core patterns matters more than picking the "right" framework
- five core patterns - reflection, planning (ReAct vs plan-and-execute), tool use, multi-agent orchestration, and memory-augmented agents
- orchestration design framework - sequential, parallel, hierarchical, and network patterns for multi-agent coordination
- pattern selection framework - analyze task characteristics (complexity, predictability, dependencies) to match the right pattern
- start simple - 70% of use cases need just ReAct + reflection. add complexity only when justified by data
- reflection improves quality - agents that critique their own work are fundamentally more reliable
- planning depends on task type - use plan-and-execute for structured tasks, ReAct for dynamic exploration
- tool design matters - fewer, well-designed tools beat many specialized tools
- memory transforms agents - persistent context enables personalization and learning, but adds complexity
- hierarchical orchestration - layered abstraction helps manage complexity at scale through supervisor-worker patterns
- multi-agent is expensive - expect 10-15x cost increase. only use when parallel processing or specialization justifies it
- orchestration patterns scale differently - sequential is simplest, network is most complex. choose based on need
- combine patterns thoughtfully - the real power comes from combining patterns appropriately for your task
- measure everything - patterns should be justified by data, not intuition or what sounds impressive
resources worth reading
- Leapfrogger's orchestration design patterns - detailed guide to sequential, parallel, hierarchical, and network orchestration
- Analytics Vidhya's pattern selection framework - task-based approach to choosing agentic patterns
- Hypermode's guide to agentic design patterns - comprehensive overview of core patterns and implementation
- DataKnobs agent AI design patterns - detailed breakdown of ReAct, planning, and tool use patterns
- Multi-agent coordination patterns - practical guide to multi-agent architectures
- AI agents: the complete guide - foundational concepts and architecture
- context engineering for agents - managing context in production agents
- building agents with claude agent SDK - practical implementation with a specific framework
struggling with agent architecture decisions? i help companies choose and implement the right agentic design patterns for their specific use cases. let's discuss your agent system