
Planning Algorithms for Agents

09. 02. 2026 · 4 min read · intermediate

Planning algorithms represent a key component of modern AI agents, enabling them to strategically plan and execute complex tasks. These algorithms combine search, optimization, and decision-making to create efficient action sequences.

Planning Algorithms for AI Agents: From Theory to Practice

Planning algorithms represent a key component of modern AI agents that need to make decisions about action sequences to achieve defined goals. While large language models (LLMs) excel at text generation, their systematic planning capability is often limited. Therefore, production agent development combines LLMs with dedicated planning algorithms.

Basic Principles of Planning Algorithms

Planning in the context of AI agents involves finding a sequence of actions that leads from the current state to the desired target state. Key components include:

  • State space - representation of possible world states
  • Action space - set of available actions
  • Transition model - rules for transitions between states
  • Goal specification - definition of target state
  • Cost function - evaluation of individual actions
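The five components above can be bundled into a single problem definition. The following sketch uses illustrative names (`PlanningProblem` is not a standard API) and a toy number-line domain just to make the pieces concrete:

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class PlanningProblem:
    """Bundles the five planning components; names are illustrative."""
    initial_state: tuple                            # a point in the state space
    actions: Callable[[tuple], Iterable[str]]       # action space per state
    transition: Callable[[tuple, str], tuple]       # transition model
    is_goal: Callable[[tuple], bool]                # goal specification
    cost: Callable[[tuple, str], float]             # cost function

# Toy domain: walk along a number line from 0 to 3.
problem = PlanningProblem(
    initial_state=(0,),
    actions=lambda s: ["inc", "dec"],
    transition=lambda s, a: (s[0] + 1,) if a == "inc" else (s[0] - 1,),
    is_goal=lambda s: s[0] == 3,
    cost=lambda s, a: 1.0,
)
```

Any of the planners below can be expressed against such a bundle, which keeps the domain logic separate from the search strategy.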

Forward vs Backward Planning

Forward planning starts from the current state and proceeds toward the goal, while backward planning starts from the goal and works backwards. For practical deployment with LLMs, forward planning is more intuitive:

class ForwardPlanner:
    def __init__(self, llm_client, max_steps=10):
        self.llm = llm_client
        self.max_steps = max_steps

    def plan(self, current_state, goal, available_actions):
        plan = []
        state = current_state

        for step in range(self.max_steps):
            if self.is_goal_reached(state, goal):
                break

            # Using LLM to select the best action
            prompt = f"""
            Current state: {state}
            Goal: {goal}
            Available actions: {available_actions}

            Choose the best action to achieve the goal:
            """

            # In practice, validate/parse the LLM output against available_actions
            action = self.llm.generate(prompt)
            plan.append(action)
            state = self.apply_action(state, action)

        return plan
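The planner above calls two helpers, `is_goal_reached` and `apply_action`, that are domain-specific and left abstract. To show an end-to-end run, the following self-contained sketch supplies trivial versions plus a stubbed LLM client (`StubLLM` is a stand-in for a real client, not an actual API):

```python
class StubLLM:
    """Deterministic stand-in for a real LLM client (demo assumption)."""
    def generate(self, prompt):
        return "inc"  # always proposes the same action

class ForwardPlanner:
    def __init__(self, llm_client, max_steps=10):
        self.llm = llm_client
        self.max_steps = max_steps

    def is_goal_reached(self, state, goal):
        return state == goal

    def apply_action(self, state, action):
        # Toy transition model on integers
        return state + 1 if action == "inc" else state - 1

    def plan(self, current_state, goal, available_actions):
        plan, state = [], current_state
        for _ in range(self.max_steps):
            if self.is_goal_reached(state, goal):
                break
            action = self.llm.generate(
                f"State: {state}, goal: {goal}, actions: {available_actions}"
            )
            plan.append(action)
            state = self.apply_action(state, action)
        return plan

planner = ForwardPlanner(StubLLM(), max_steps=5)
plan = planner.plan(0, 3, ["inc", "dec"])  # three "inc" steps reach the goal
```

Swapping `StubLLM` for a real client changes only the `generate` call; the control loop stays identical.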

Hierarchical Task Network (HTN) Planning

HTN planning is particularly suitable for complex agents because it allows decomposition of high-level tasks into elementary actions. This approach combines well with LLM capabilities:

class HTNPlanner:
    def __init__(self, llm_client):
        self.llm = llm_client
        self.methods = {}  # Dictionary of methods for task decomposition

    def decompose_task(self, task, context):
        """Decompose complex task into subtasks"""
        if task in self.methods:
            return self.methods[task](context)

        # Using LLM for dynamic decomposition
        prompt = f"""
        Break down the following task into specific steps:
        Task: {task}
        Context: {context}

        Return list of subtasks in JSON format:
        """

        response = self.llm.generate(prompt)
        return self.parse_subtasks(response)

    def plan_recursive(self, task, context, depth=0):
        if self.is_primitive(task):
            return [task]  # Primitive action

        subtasks = self.decompose_task(task, context)
        plan = []

        for subtask in subtasks:
            subplan = self.plan_recursive(subtask, context, depth + 1)
            plan.extend(subplan)

        return plan
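To see the recursive decomposition in action without an LLM in the loop, the registered-methods path alone is enough. The sketch below is a simplified, self-contained variant of the class above (static methods only, no LLM fallback; task names are made up for the demo):

```python
class HTNPlannerDemo:
    """Simplified HTN sketch: static decomposition methods, no LLM fallback."""
    def __init__(self):
        self.methods = {}      # task name -> decomposition function
        self.primitives = set()  # tasks that are directly executable

    def is_primitive(self, task):
        return task in self.primitives

    def decompose_task(self, task, context):
        return self.methods[task](context)

    def plan_recursive(self, task, context, depth=0):
        if self.is_primitive(task):
            return [task]
        plan = []
        for subtask in self.decompose_task(task, context):
            plan.extend(self.plan_recursive(subtask, context, depth + 1))
        return plan

htn = HTNPlannerDemo()
htn.primitives = {"fetch_data", "clean_data", "train", "evaluate"}
htn.methods["build_model"] = lambda ctx: ["prepare_data", "train", "evaluate"]
htn.methods["prepare_data"] = lambda ctx: ["fetch_data", "clean_data"]

plan = htn.plan_recursive("build_model", context={})
# plan == ["fetch_data", "clean_data", "train", "evaluate"]
```

The LLM-backed `decompose_task` from the full class slots in exactly where the dictionary lookup fails, which is what makes the hybrid approach attractive: known tasks stay deterministic, novel ones fall through to the model.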

Monte Carlo Tree Search (MCTS) for Planning

MCTS combines exploration and exploitation when searching for optimal plans. It’s valuable for agents working with incomplete information:

import random
import math

class MCTSNode:
    def __init__(self, state, action=None, parent=None):
        self.state = state
        self.action = action
        self.parent = parent
        self.children = []
        self.visits = 0
        self.reward = 0.0

    def ucb_score(self, exploration_weight=1.4):
        if self.visits == 0:
            return float('inf')

        exploitation = self.reward / self.visits
        exploration = exploration_weight * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )
        return exploitation + exploration

class MCTSPlanner:
    def __init__(self, llm_client, simulations=1000):
        self.llm = llm_client
        self.simulations = simulations

    def search(self, root_state, goal):
        """Run the MCTS loop; is_terminal, simulate and the state helpers are domain-specific."""
        root = MCTSNode(root_state)

        for _ in range(self.simulations):
            node = self.select(root)
            if not self.is_terminal(node.state):
                node = self.expand(node)
            reward = self.simulate(node.state, goal)
            self.backpropagate(node, reward)

        return self.get_best_action(root)

    def expand(self, node):
        """Expand node with new possible actions"""
        actions = self.get_possible_actions(node.state)
        action = random.choice(actions)
        new_state = self.apply_action(node.state, action)
        child = MCTSNode(new_state, action, node)
        node.children.append(child)
        return child
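The `select` and `backpropagate` steps that `search` calls are not shown above. One conventional shape for them, built on the same `MCTSNode` and its `ucb_score` (reproduced here so the sketch runs standalone), is:

```python
import math

class MCTSNode:
    def __init__(self, state, action=None, parent=None):
        self.state, self.action, self.parent = state, action, parent
        self.children, self.visits, self.reward = [], 0, 0.0

    def ucb_score(self, exploration_weight=1.4):
        if self.visits == 0:
            return float("inf")
        exploitation = self.reward / self.visits
        exploration = exploration_weight * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )
        return exploitation + exploration

def select(node):
    """Descend the tree, always following the child with the highest UCB score."""
    while node.children:
        node = max(node.children, key=lambda ch: ch.ucb_score())
    return node

def backpropagate(node, reward):
    """Propagate the simulation reward from a leaf back up to the root."""
    while node is not None:
        node.visits += 1
        node.reward += reward
        node = node.parent

# Minimal walk-through on a two-node tree
root = MCTSNode("s0")
child = MCTSNode("s1", action="a1", parent=root)
root.children.append(child)
backpropagate(child, 1.0)  # updates child and root statistics
```

`get_best_action` is then typically the root child with the most visits (not the highest UCB), since visit counts are the more robust estimate after the simulation budget is spent.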

Reactive Planning with LLMs

In dynamic environments, it’s often necessary to react to changes during plan execution. Reactive planning combines pre-prepared plans with adaptation capability:

class ReactivePlanner:
    def __init__(self, llm_client):
        self.llm = llm_client
        self.current_plan = []
        self.execution_history = []

    def execute_with_monitoring(self, initial_plan, environment):
        self.current_plan = initial_plan.copy()

        while self.current_plan:
            action = self.current_plan.pop(0)

            # Monitor environment before action
            current_state = environment.get_state()
            if self.detect_plan_failure(current_state):
                self.replan(current_state, environment.goal)
                continue

            # Execute action
            result = environment.execute(action)
            self.execution_history.append((action, result))

            # Check after action
            if result.success:
                continue
            else:
                # React to failure
                recovery_action = self.generate_recovery(
                    action, result, current_state
                )
                if recovery_action:
                    self.current_plan.insert(0, recovery_action)

    def replan(self, current_state, goal):
        """Dynamic replanning when conditions change"""
        prompt = f"""
        Original plan failed. Current situation:
        State: {current_state}
        Goal: {goal}
        History: {self.execution_history[-3:]}

        Suggest a new plan from current state:
        """

        new_plan = self.llm.generate(prompt)
        self.current_plan = self.parse_plan(new_plan)
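The monitoring hook `detect_plan_failure` is left abstract above. A simple rule-based version (argument names and the dict-based state model are illustrative, not part of the class) compares the preconditions the next action relies on against the observed state:

```python
def detect_plan_failure(current_state, expected_preconditions):
    """Flag failure when any precondition of the next action no longer holds.

    Both arguments are plain dicts here -- an illustrative simplification;
    a real agent would check against the environment's state model.
    """
    return any(current_state.get(key) != value
               for key, value in expected_preconditions.items())

# Example: the next action assumes the door is open.
state = {"door": "closed", "battery": 55}
preconds = {"door": "open"}
# detect_plan_failure(state, preconds) returns True, which triggers replan()
```

Rule-based checks like this are cheap enough to run before every action, reserving the (slow, expensive) LLM call in `replan` for the cases where a rule actually fires.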

Integration with Production Systems

When deploying planning algorithms in production, performance optimization and reliability are crucial. Recommendations include:

  • Caching - storing frequently used plans
  • Timeouts - limiting planning time
  • Fallback mechanisms - backup simple strategies
  • Monitoring - tracking plan success rates

import asyncio

class ProductionPlanner:
    def __init__(self, llm_client, cache_size=1000):
        self.llm = llm_client
        self.plan_cache = {}  # unbounded in this sketch; cache_size eviction is omitted
        self.performance_metrics = {
            'planning_time': [],
            'success_rate': 0.0,
            'cache_hits': 0
        }

    async def plan_with_timeout(self, state, goal, timeout=30):
        cache_key = self.get_cache_key(state, goal)

        if cache_key in self.plan_cache:
            self.performance_metrics['cache_hits'] += 1
            return self.plan_cache[cache_key]

        try:
            plan = await asyncio.wait_for(
                self.generate_plan(state, goal),
                timeout=timeout
            )
            self.plan_cache[cache_key] = plan
            return plan

        except asyncio.TimeoutError:
            # Fallback to simple heuristic plan
            return self.simple_heuristic_plan(state, goal)
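The timeout branch above falls back to `simple_heuristic_plan`, which is not defined in the class. One possible shape, purely illustrative (the action names are made up; a real fallback would encode domain-specific safe defaults such as retry, notify, or roll back):

```python
def simple_heuristic_plan(state, goal):
    """Fallback on planning timeout: a fixed, conservative action sequence.

    Illustrative sketch -- the point is that the fallback must be cheap,
    deterministic, and safe, not that it makes progress toward the goal.
    """
    return [
        {"action": "log_incident", "detail": f"planning timed out in state {state}"},
        {"action": "notify_operator", "goal": goal},
    ]
```

Keeping the fallback free of LLM calls is deliberate: if planning timed out, the model or its provider may be the bottleneck, so the degraded path should not depend on it.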

Summary

Planning algorithms are essential for creating reliable AI agents capable of complex decision-making. The combination of traditional approaches like HTN planning or MCTS with LLM capabilities opens new possibilities for adaptive and intelligent agents. The key to success is appropriate algorithm selection based on specific use cases and careful implementation considering performance and reliability in production environments.

Tags: planning, AI agents, algorithms

CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience in enterprise IT.