Lesson 6: planning and reasoning - how agents tackle complex tasks
What you will learn
Section titled “What you will learn”- Why planning is essential for agents handling complex tasks
- The agentic problem-solving loop: Get Mission, Scan Scene, Think, Act, Observe
- Plan-then-execute vs. reactive approaches and the trade-offs of each
- Hierarchical planning for breaking big tasks into manageable pieces
- Re-planning: how agents adapt when things go wrong
- Reasoning techniques: Chain-of-Thought, Tree-of-Thoughts, and self-consistency
- The orchestration layer’s role in managing plans
- Common failure modes and how to avoid them
Prerequisites
Section titled “Prerequisites”ELI5: Planning is like packing for a trip
Section titled “ELI5: Planning is like packing for a trip”Imagine you are packing for a two-week trip. You could just start grabbing stuff and shoving it in a suitcase. Maybe you will get lucky and have everything you need. More likely, you will forget your toothbrush and pack three jackets you never wear.
A better approach: think about where you are going, what the weather will be like, what activities you have planned, and then make a packing list. Check items off as you pack them. If you find out halfway through that you do not have enough room, you re-prioritize - drop the “just in case” items and keep the essentials.
That is what planning does for an AI agent. Instead of reacting to each step blindly, the agent thinks ahead, makes a plan, and works through it systematically. And when the plan hits a snag, a good agent adjusts rather than plowing forward with a broken approach.
Why planning matters
Section titled “Why planning matters”Simple tasks do not need planning. If someone asks “What time is it?”, the agent just checks the clock. No plan required.
But consider a request like: “Research the top 5 competitors in the cloud database market, compare their pricing and features, and write a recommendation for which one we should use for our new project.”
This task requires:
- Identifying who the competitors are
- Researching each one (pricing, features, limitations)
- Understanding the user’s project requirements
- Comparing options against those requirements
- Synthesizing everything into a recommendation
Without planning, the agent might start researching one competitor in extreme detail, run out of context space, and forget to cover the other four. Or it might jump straight to a recommendation without doing thorough research.
Planning gives the agent a roadmap. It knows where it is going, what steps come next, and how to allocate its time and resources.
What planning buys you
Section titled “What planning buys you”| Benefit | Without planning | With planning |
|---|---|---|
| Task completion | May miss steps or get stuck | Systematic coverage of all steps |
| Resource efficiency | Wastes tokens on tangents | Allocates effort where it matters |
| Transparency | Hard to see what the agent is doing | Clear progress tracking |
| Error recovery | Gets lost when things go wrong | Can identify where the plan broke and adjust |
| Parallelism | Everything is sequential | Independent steps can run concurrently |
The agentic problem-solving loop
Section titled “The agentic problem-solving loop”At the heart of every planning agent is a problem-solving loop. Different frameworks describe this differently, but the core steps are consistent:
Get mission -> scan scene -> think -> act -> observe
Section titled “Get mission -> scan scene -> think -> act -> observe”+-------------------------------------------------------------------+| || +------------+ +------------+ +---------+ || | Get Mission| --> | Scan Scene | --> | Think | || +------------+ +------------+ +----+----+ || | || v || +---------+ +-----+-----+ || | Observe | <-- | Act | || +----+----+ +-----------+ || | || v || +------+------+ || | Goal met? | || | Yes -> Done | || | No -> Think | || +-------------+ || |+-------------------------------------------------------------------+Step by step
Section titled “Step by step”1. Get Mission
The agent receives its task. This could be a user request, a triggered event, or a delegated subtask from another agent.
Mission: "Create a summary report of our team's GitHub activity this week."2. Scan Scene
The agent gathers context about its current situation. What tools are available? What information does it already have? What constraints exist?
Available tools: github_api, document_writer, calendar_apiKnown context: Team = "platform-eng", Current date = March 15Constraints: Report should be under 2 pages3. Think
The agent reasons about what to do next. This is where planning happens - the agent considers its options and decides on a course of action.
Thought: "I need to: 1. Get the list of team members 2. For each member, fetch their commits, PRs, and reviews this week 3. Aggregate the data 4. Write a summary highlighting key contributions I will start by getting the team member list."4. Act
The agent executes an action - calling a tool, generating text, or making a decision.
Action: github_api.get_team_members(team="platform-eng")5. Observe
The agent examines the result of its action. Did it succeed? What information did it provide? Does the plan need to change?
Observation: Team has 8 members: [alice, bob, carol, dave, eve, frank, grace, hank]The loop then returns to Think, where the agent decides the next step based on what it observed. This continues until the mission is complete.
Plan-then-execute vs. reactive approaches
Section titled “Plan-then-execute vs. reactive approaches”There are two fundamental philosophies for how agents tackle tasks. Most real agents blend both, but understanding the pure forms helps you make design decisions.
Plan-then-execute
Section titled “Plan-then-execute”How it works: The agent creates a complete (or near-complete) plan before taking any action. Then it executes the plan step by step.
User: "Migrate our application from Python 2 to Python 3."
Planning phase: Step 1: Audit codebase for Python 2-specific syntax Step 2: Identify incompatible dependencies Step 3: Update dependencies to Python 3-compatible versions Step 4: Convert print statements to print functions Step 5: Fix string/unicode handling Step 6: Update integer division operators Step 7: Run test suite and fix failures Step 8: Update CI/CD pipeline to use Python 3
Execution phase: [Execute steps 1-8 in order]Strengths:
- Clear structure and progress tracking
- Can identify dependencies between steps upfront
- Enables parallel execution of independent steps
- Easy to estimate total effort and communicate progress
Weaknesses:
- Plan may be wrong or incomplete (you do not know what you do not know)
- Rigid - hard to accommodate surprises
- Upfront planning takes time and tokens
- Can lead to over-planning for simple tasks
Reactive approach
Section titled “Reactive approach”How it works: The agent takes things one step at a time. It looks at the current situation, decides the best next action, takes it, and reassesses. This is essentially the ReAct pattern from Lesson 4.
User: "Migrate our application from Python 2 to Python 3."
Step 1: Think: "First, I should understand the codebase. Let me scan for Python 2 syntax." Act: Search for Python 2 patterns Observe: Found 47 print statements, 12 unicode issues, 3 deprecated imports
Step 2: Think: "The print statements are the most common issue. Let me start there." Act: Convert print statements Observe: 47 print statements converted
Step 3: Think: "Now let me tackle the unicode issues..." [continues step by step]Strengths:
- Highly adaptive - responds to what it actually finds
- No wasted planning effort
- Works well for exploratory tasks
- Naturally handles surprises
Weaknesses:
- No big-picture view of progress
- May miss important steps that upfront planning would catch
- Hard to parallelize - each step depends on the previous observation
- Can wander off track without a guiding plan
The pragmatic middle ground
Section titled “The pragmatic middle ground”Most production agents use a hybrid approach:
- Light upfront planning: Create a rough, high-level plan (3-7 steps)
- Reactive execution: Use ReAct-style reasoning within each step
- Periodic re-planning: After major steps, reassess and adjust the plan
High-level plan (light): 1. Audit the codebase 2. Fix syntax issues 3. Update dependencies 4. Run tests and fix failures
Executing step 1 (reactive): Think -> Act -> Observe -> Think -> Act -> Observe -> Step complete
Re-plan after step 1: "The audit revealed more issues than expected. Adding step 2b for database migration code that uses a deprecated library."This gives you the structure of planning with the adaptability of reactive execution.
When to favor each approach
Section titled “When to favor each approach”| Situation | Favor planning | Favor reactive |
|---|---|---|
| Task is well-understood | Yes | |
| Task is exploratory | Yes | |
| Multiple independent subtasks | Yes | |
| High uncertainty about what you will find | Yes | |
| Need to communicate progress | Yes | |
| Speed matters more than thoroughness | Yes | |
| Failure is costly | Yes | |
| Task is simple (fewer than 3 steps) | Yes |
Hierarchical planning
Section titled “Hierarchical planning”Breaking big tasks into smaller ones
Section titled “Breaking big tasks into smaller ones”Some tasks are too complex to plan in a single flat list. Hierarchical planning breaks a big goal into sub-goals, which are further broken into sub-sub-goals - like a work breakdown structure in project management.
The project management analogy
Section titled “The project management analogy”Think about how a software project is organized:
Epic: Launch new payment system | +-- Story: Design payment API | +-- Task: Define endpoints | +-- Task: Design data models | +-- Task: Write API specification | +-- Story: Implement payment processing | +-- Task: Integrate with payment gateway | +-- Task: Handle error cases | +-- Task: Add retry logic | +-- Story: Add monitoring and alerts +-- Task: Set up logging +-- Task: Define SLOs +-- Task: Configure alert rulesAn agent uses the same structure. For example, “Write a blog post about Kubernetes networking” breaks into:
Sub-goal 1: Research Task 1.1: Identify key networking concepts Task 1.2: Find recent developments and best practices
Sub-goal 2: Outline and Write Task 2.1: Create section structure Task 2.2: Write each section with code examples
Sub-goal 3: Review Task 3.1: Check technical accuracy Task 3.2: Verify code examples workBenefits of hierarchical planning
Section titled “Benefits of hierarchical planning”- Manageable chunks. Each task is small enough to execute without losing focus.
- Clear dependencies. You can see which tasks depend on others and which can run in parallel.
- Progress tracking. You know exactly where you are in the overall plan.
- Delegation. In multi-agent systems, different sub-goals can be assigned to specialized agents.
Levels of abstraction
Section titled “Levels of abstraction”| Level | What it describes | Example |
|---|---|---|
| Goal | What success looks like | ”Deploy the application to production” |
| Sub-goal | Major phases of work | ”Prepare the environment” |
| Task | Specific actions | ”Create the Cloud Run service” |
| Step | Atomic operations | ”Run gcloud run deploy” |
Agents typically plan at the sub-goal and task level. Steps are handled by ReAct-style execution within each task.
Re-planning: adapting when things go wrong
Section titled “Re-planning: adapting when things go wrong”No plan survives first contact with reality. Good agents detect when a plan is failing and adjust.
When to re-plan
Section titled “When to re-plan”| Trigger | Example | Response |
|---|---|---|
| Task failure | API returns an error | Try alternative approach or skip and revisit |
| New information | Discovered the database uses a different schema than expected | Update the plan to account for the actual schema |
| Changed requirements | User adds a new requirement mid-task | Incorporate the new requirement into the plan |
| Resource constraints | Running low on context space or API quota | Simplify remaining steps |
| Blocked dependency | A required service is down | Reorder tasks to work on unblocked items first |
Re-planning strategies
Section titled “Re-planning strategies”1. Local adjustment
Fix the immediate problem without changing the overall plan.
Original plan step: "Query the users table for active accounts"Failure: "Table 'users' does not exist"Adjustment: "Query the 'accounts' table instead (it has the same data)"2. Step insertion
Add new steps to handle an unexpected situation.
Original plan: 1. Fetch data 2. Process data 3. Generate report
After discovering data needs cleaning: 1. Fetch data 1b. Clean and validate data <-- inserted 2. Process data 3. Generate report3. Plan revision
Significantly restructure the remaining plan based on what you have learned.
Original plan: "Migrate the monolith to microservices"After analysis: "The codebase is too tightly coupled for a direct migration."Revised plan: "First, introduce module boundaries within the monolith (strangler fig pattern), then gradually extract services."4. Goal modification
In extreme cases, the agent recognizes the original goal is not achievable and proposes a revised goal. For example, if the target was “reduce latency to under 50ms” but the database query alone takes 80ms, the agent might propose targeting 100ms with caching for frequently accessed endpoints.
The key principle: try a local fix first, and only restructure the remaining plan if the local fix is not possible.
Reasoning techniques
Section titled “Reasoning techniques”Planning requires reasoning - the ability to think through problems logically. Several techniques help agents reason more effectively.
Chain-of-Thought (CoT)
Section titled “Chain-of-Thought (CoT)”What it is: The agent works through a problem step by step, showing its reasoning at each stage. Instead of jumping to an answer, it “thinks out loud.”
Analogy: Showing your work on a math test. Instead of writing just the answer, you write each step of the calculation.
Without Chain-of-Thought: Q: "If a server handles 100 requests/second and we need to handle 10,000 requests/second, how many servers do we need?" A: "100 servers"
With Chain-of-Thought: Q: [same question] A: "Let me work through this: - One server handles 100 requests/second - We need 10,000 requests/second total - 10,000 / 100 = 100 servers for the raw capacity - But we should account for redundancy. With N+1 redundancy, we need 101. - And servers do not run at 100% efficiency. At 80% target utilization: 10,000 / (100 * 0.8) = 125 servers - With N+1 redundancy: 126 servers Answer: 126 servers for production, assuming 80% target utilization and N+1 redundancy."The step-by-step reasoning catches nuances (redundancy, utilization) that a snap answer misses.
When to use CoT:
- Math and logic problems
- Multi-step reasoning
- Tasks where intermediate steps matter
- Debugging and root cause analysis
Tree-of-Thoughts (ToT)
Section titled “Tree-of-Thoughts (ToT)”What it is: Instead of following a single chain of reasoning, the agent explores multiple possible paths and evaluates which one is most promising. Think of it as brainstorming several approaches before committing to one.
Analogy: A chess player considering several possible moves, thinking a few steps ahead for each one, and then choosing the best path.
Problem: "Our API is timing out under load. How should we fix it?"
Branch 1: Add caching -> Evaluate: "Would reduce database load by ~60%. Implementation takes 2 days. Risk: cache invalidation complexity." Score: 7/10
Branch 2: Optimize database queries -> Evaluate: "Could improve query time by ~40%. Implementation takes 3 days. Risk: low, well-understood approach." Score: 6/10
Branch 3: Add horizontal scaling -> Evaluate: "Handles any load level. Implementation takes 1 day with Kubernetes. Risk: increases infrastructure cost." Score: 8/10
Decision: Start with horizontal scaling (fastest win), then add caching for long-term efficiency.When to use ToT:
- Problems with multiple valid approaches
- Strategic decisions
- Tasks where the first idea might not be the best
- Architecture and design decisions
Self-consistency
Section titled “Self-consistency”What it is: The agent solves the same problem multiple times using different reasoning paths, then checks if the answers agree. If most paths lead to the same conclusion, confidence is high. If they disagree, more investigation is needed.
Analogy: Asking three different mechanics about a car problem. If they all say “bad alternator,” you can be confident. If they each say something different, you need more diagnostics.
Problem: "Is this code change safe to deploy on a Friday?"
Reasoning path 1 (risk analysis): "The change modifies the payment flow. Payment changes on Friday mean weekend incidents. Verdict: No."
Reasoning path 2 (scope analysis): "The change is 5 lines and has 95% test coverage. Small, well-tested changes are low risk. Verdict: Yes, with monitoring."
Reasoning path 3 (historical analysis): "Similar changes in the past have caused issues 15% of the time. That is above our 10% threshold. Verdict: No."
Consensus: 2 out of 3 say No. Recommendation: Wait until Monday.When to use self-consistency:
- High-stakes decisions
- Tasks where errors are costly
- Ambiguous problems with no clear single answer
- Validation of critical reasoning
Comparing reasoning techniques
Section titled “Comparing reasoning techniques”| Technique | Approach | Strength | Cost | Best for |
|---|---|---|---|---|
| Chain-of-Thought | Step-by-step linear | Thoroughness | 1x (single pass) | Most reasoning tasks |
| Tree-of-Thoughts | Explore multiple paths | Finds best approach | 3-5x (multiple branches) | Strategic decisions |
| Self-consistency | Multiple independent attempts | Confidence calibration | 3-5x (multiple passes) | High-stakes validation |
The orchestration layer
Section titled “The orchestration layer”The orchestration layer is the control system that manages how an agent plans, reasons, and executes. Think of it as the conductor of an orchestra - it does not play any instruments, but it decides who plays what and when.
What the orchestration layer does
Section titled “What the orchestration layer does”+----------------------------------------------------------+| ORCHESTRATION LAYER || || +----------+ +-----------+ +----------+ +--------+ || | Plan | | Execute | | Monitor | | Re-plan| || | Manager | | Engine | | & Eval | | Logic | || +----------+ +-----------+ +----------+ +--------+ || || Responsibilities: || - Break goals into executable steps || - Decide execution order and parallelism || - Route steps to the right tools or sub-agents || - Track progress and state || - Detect failures and trigger re-planning || - Manage context window usage || - Enforce guardrails and safety checks |+----------------------------------------------------------+Orchestration in practice
Section titled “Orchestration in practice”In Google Cloud’s Vertex AI Agent Engine, the orchestration layer handles:
- Routing: Directing user requests to the right agent or tool
- State management: Tracking where the agent is in its plan
- Tool execution: Calling tools and processing results
- Error handling: Catching failures and deciding how to recover
- Context management: Keeping the context window organized
The Agent Development Kit (ADK) gives you building blocks for customizing orchestration behavior. You define the agent’s tools, instructions, and behavior - the framework handles the execution loop.
Key orchestration decisions
Section titled “Key orchestration decisions”| Decision | Options | Trade-off |
|---|---|---|
| How much to plan upfront | Full plan vs. next step only | Thoroughness vs. flexibility |
| When to re-plan | After every step vs. only on failure | Adaptability vs. overhead |
| How to handle failure | Retry, skip, abort, or re-plan | Resilience vs. cost |
| Sequential vs. parallel | One step at a time vs. concurrent | Simplicity vs. speed |
| How much reasoning to show | Internal only vs. exposed to user | Transparency vs. noise |
Common failure modes
Section titled “Common failure modes”Even well-designed planning agents can fail in predictable ways. Knowing these failure modes helps you build defenses against them.
1. Infinite loops
Section titled “1. Infinite loops”What happens: The agent gets stuck repeating the same action or re-planning endlessly without making progress.
Example:
Think: "I need to find the user's email. Let me search the database."Act: search_database(query="user email")Observe: No results found.Think: "I need to find the user's email. Let me search the database."Act: search_database(query="user email")Observe: No results found.[repeats forever]Prevention:
- Set a maximum iteration count for any loop
- Track actions taken and detect repetition
- After N failed attempts at the same action, force the agent to try a different approach
- Include a “give up gracefully” option - ask the user for help
2. Plan drift
Section titled “2. Plan drift”What happens: The agent gradually wanders away from the original goal, following interesting tangents instead of staying on track.
Example:
Original goal: "Write a summary of Q4 sales performance"Step 1: Fetch Q4 sales data [on track]Step 2: Notice an anomaly in November data [slightly off track]Step 3: Deep dive into November anomaly [drifting]Step 4: Research industry trends that might explain the anomaly [lost]Step 5: Write a report about industry trends [completely off track]Prevention:
- Periodically check: “Is what I am doing aligned with the original goal?”
- Include the original goal in every reasoning step
- Set scope boundaries in the plan
- Use a separate evaluation step to check relevance
3. Over-planning simple tasks
Section titled “3. Over-planning simple tasks”What happens: The agent spends more time planning than it would take to just do the task.
Example:
User: "What is 2 + 2?"
Agent Plan: Step 1: Identify the mathematical operation (addition) Step 2: Identify the operands (2 and 2) Step 3: Verify the operands are valid numbers Step 4: Perform the addition Step 5: Verify the result Step 6: Format the response
[Just say "4"!]Prevention:
- Estimate task complexity before deciding to plan
- Simple tasks (single-step, clear answer) should skip planning entirely
- Set a planning time budget proportional to task complexity
4. Cascading failures
Section titled “4. Cascading failures”What happens: One failed step causes subsequent steps to fail in a chain reaction, and the agent does not detect the root cause.
Example:
Step 1: Fetch user profile -> Returns error (auth expired)Step 2: Process user preferences -> Fails (no profile data)Step 3: Generate recommendations -> Fails (no preferences)Step 4: Format output -> Fails (no recommendations)Agent: "I was unable to complete the task due to formatting errors." [Wrong! The real problem was expired authentication in step 1]Prevention:
- Treat each step’s output as input validation for the next step
- On failure, trace back to find the root cause
- Fail fast on critical dependencies rather than trying to proceed without them
- Report the actual root cause, not just the latest symptom
5. Context exhaustion
Section titled “5. Context exhaustion”What happens: The plan and its execution history fill up the context window, leaving no room for the agent to reason about the current step.
Prevention:
- Summarize completed steps instead of keeping full details
- Store intermediate results externally and reference them by ID
- Budget context space: reserve a fixed amount for reasoning about the current step
- See Lesson 5: Memory and Context for detailed strategies
Failure mode summary
Section titled “Failure mode summary”| Failure mode | Symptom | Key prevention |
|---|---|---|
| Infinite loops | Same action repeated | Max iteration count + repetition detection |
| Plan drift | Agent wanders off topic | Periodic goal alignment check |
| Over-planning | Excessive planning for simple tasks | Complexity estimation before planning |
| Cascading failures | Wrong root cause reported | Trace back on failure + fail fast |
| Context exhaustion | Agent loses coherence | Context budgeting + summarization |
Putting it together: a practical example
Section titled “Putting it together: a practical example”Here is a condensed example showing how planning, reasoning, and re-planning combine:
User: "Analyze our error logs from the past week and create Jira tickets for the top issues."
PLAN: Phase 1: Query error logs (past 7 days, ERROR + FATAL severity) Phase 2: Group by type, rank by frequency, identify top 5 Phase 3: For each issue, check for existing tickets, create if new Phase 4: Summarize findings
EXECUTE (ReAct within each phase): Act: log_query(severity=["ERROR","FATAL"], days=7) Observe: 12,847 entries returned Act: log_analyze(group_by="error_message", sort="count_desc") Observe: Top issues identified (connection timeouts, null pointer, rate limits)
Act: jira_search(query="Connection timeout payment-service") Observe: No existing ticket -> Create new ticket PLATFORM-1234
RE-PLAN (mid-execution): Observe: PLATFORM-999 already exists for NullPointerException Adjust: Update existing ticket with new occurrence count instead of creating duplicate
DELIVER: "Analyzed 12,847 errors. Created 4 new tickets, updated 1 existing one."Notice how the agent uses hierarchical planning upfront, ReAct-style execution within each phase, and re-planning when it discovers an existing ticket. All the patterns working together.
Key takeaways
Section titled “Key takeaways”-
Planning transforms complex tasks into manageable steps. Without it, agents either miss important steps or waste effort on tangents.
-
The problem-solving loop (Mission, Scene, Think, Act, Observe) is the engine of every planning agent. Understanding this loop helps you design and debug agent behavior.
-
Plan-then-execute and reactive approaches are two ends of a spectrum. Most practical agents use a hybrid: light upfront planning with reactive execution.
-
Hierarchical planning handles complex tasks by breaking them into goals, sub-goals, tasks, and steps. This mirrors how project management works.
-
Re-planning is essential. Plans will break. Good agents detect failures early and adjust. The ability to adapt separates useful agents from fragile ones.
-
Reasoning techniques (CoT, ToT, self-consistency) make planning more effective. Chain-of-Thought for step-by-step logic, Tree-of-Thoughts for exploring alternatives, self-consistency for validating critical decisions.
-
Watch out for common failure modes. Infinite loops, plan drift, over-planning, cascading failures, and context exhaustion are predictable problems with known solutions.
-
The orchestration layer ties it all together. It manages the plan lifecycle, routes execution, handles errors, and keeps the agent on track.
Further reading
Section titled “Further reading”Next lesson: Multi-Agent Systems