Hernando Abella

How AI Agents Use Tools, Memory, and Reasoning

Explore how modern AI agents combine tool use, memory systems, and reasoning loops to solve complex multi-step problems autonomously.


AI agents are no longer science fiction. They are here, they are useful, and they are transforming how we interact with software — from autonomous coding assistants to research agents that can browse the web, analyze documents, and write reports.

At their core, AI agents are language models wrapped in a loop: they perceive their environment, reason about what to do, take actions using tools, observe the results, and iterate until they achieve their goal. This article breaks down the three pillars that make agents work: tools, memory, and reasoning.


1. What Makes an Agent?

An agent is more than a chatbot. A chatbot responds. An agent acts. The defining characteristic of an AI agent is its ability to use tools, maintain memory, and apply reasoning to decide what to do next.

User Interface: Chat / API / Webhook
Orchestrator: LangChain / Custom Loop
Agent Core: Reasoning + Memory + Tool Router
Tools Layer: Search / DB / File / Code / API
Memory Store: Buffer / Summary / Vector DB

Each layer in this architecture serves a specific purpose. The orchestrator runs the reasoning loop. The agent core decides what tool to call and what to remember. The tools layer provides the agent with capabilities beyond text generation. The memory store keeps the agent grounded in context.
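
As a rough illustration of how these layers fit together, the sketch below wires a tool registry, a memory list, and an orchestration loop around a scripted stand-in for the model. The names `TOOLS`, `scripted_policy`, and `run_agent` are hypothetical, and a real agent core would call a language model where the script is.

```python
from typing import Callable

# Tools layer: plain functions the agent core may request by name.
TOOLS: dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
    "length": lambda text: str(len(text)),
}

def scripted_policy(goal: str, memory: list[str]) -> tuple[str, str]:
    """Stand-in for the agent core: choose the next (action, argument).

    A real core would ask a language model; this one follows a fixed
    script so the example runs offline.
    """
    if not memory:
        return ("upper", goal)       # first step: request a tool
    return ("finish", memory[-1])    # then answer from what was observed

def run_agent(goal: str, max_steps: int = 5) -> str:
    """Orchestrator: run the decide -> act -> observe loop."""
    memory: list[str] = []           # memory store: observations so far
    for _ in range(max_steps):
        action, arg = scripted_policy(goal, memory)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)
        memory.append(observation)
    return "stopped: step limit reached"

print(run_agent("hello agents"))
```

Swapping `scripted_policy` for a model call is the only change needed to turn this skeleton into the loop described in the rest of the article.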


2. Tools: Extending Beyond Text

Language models are very good at generating text, but on their own they cannot browse the web, query databases, send emails, or run code. Tools bridge this gap. Each tool is a function the agent can call, described in a format the model understands: typically a JSON schema with a name, description, and parameters.

Tool Definition (OpenAI Function Calling)
from openai import OpenAI
import json

client = OpenAI()

tools = [
  {
    "type": "function",
    "function": {
      "name": "search_web",
      "description": "Search the web for current information",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "The search query"
          }
        },
        "required": ["query"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "calculate",
      "description": "Perform a mathematical calculation",
      "parameters": {
        "type": "object",
        "properties": {
          "expression": {
            "type": "string",
            "description": "Math expression to evaluate"
          }
        },
        "required": ["expression"]
      }
    }
  }
]

When the model decides it needs to use a tool, it returns a structured JSON object with the tool name and arguments instead of a text response. The application then executes the tool, passes the result back to the model as an observation, and the model continues its reasoning loop.
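
To make that round trip concrete, here is a minimal offline sketch of one request and one observation. The assistant message is written as a plain dict for illustration (the SDK returns objects with the same fields), and the `call_1` id and the `add` helper are made up for the example.

```python
import json

# Assumed shape of an assistant message that requests a tool.
assistant_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",  # hypothetical id
        "type": "function",
        "function": {
            "name": "calculate",
            "arguments": '{"expression": "2 + 3"}',  # arguments arrive as a JSON string
        },
    }],
}

def add(expression: str) -> str:
    """Toy stand-in for a calculator tool: only handles 'a + b'."""
    left, right = expression.split("+")
    return str(float(left) + float(right))

def to_observation(call: dict) -> dict:
    """Run the requested tool and wrap the result as a tool message."""
    args = json.loads(call["function"]["arguments"])
    result = add(args["expression"])
    # The tool_call_id ties this observation back to the request it answers.
    return {"role": "tool", "tool_call_id": call["id"], "content": result}

observation = to_observation(assistant_msg["tool_calls"][0])
print(observation)
```

The observation is appended to the message list and sent back, at which point the model either requests another tool or answers in text.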

Common Tool Categories

🔍 Web Search: Search the internet for current information
📄 File Reader: Read and parse documents (PDF, CSV, text)
🧮 Calculator: Perform precise mathematical operations
📧 Email Sender: Compose and send emails via API
🗄️ Database Query: Run SQL queries against a database
🐍 Code Executor: Run Python code in a sandboxed environment

Executing Tool Calls in Python
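def execute_tool_call(tool_call):
    """Handle the tool call returned by the model and return a string result."""
    fn_name = tool_call.function.name
    # Arguments arrive as a JSON-encoded string, so decode them first.
    args = json.loads(tool_call.function.arguments)

    if fn_name == "search_web":
        return search_web(args["query"])
    elif fn_name == "calculate":
        # WARNING: eval() runs arbitrary Python. It is shown here for brevity
        # only; use a restricted arithmetic parser for model-supplied input.
        return str(eval(args["expression"]))
    elif fn_name == "read_file":
        # Assumes a read_file tool is also declared in `tools`.
        return read_file(args["path"])
    else:
        raise ValueError(f"Unknown tool: {fn_name}")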

The key insight: the model never executes tools directly. It requests tool execution, and your application decides whether to honor that request. This gives you full control over security, rate limiting, and error handling.
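
A calculator tool is a natural place to exercise that control, since evaluating a model-supplied expression with `eval()` would run arbitrary code. The sketch below is one possible approach, not the only one: it walks the expression's syntax tree and allows only basic arithmetic, then reports failures as text the model can read. `safe_calculate`, `ALLOWED_TOOLS`, and `dispatch` are hypothetical names.

```python
import ast
import operator

# Arithmetic operators the calculator is allowed to apply.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_calculate(expression: str) -> float:
    """Evaluate +, -, *, / on numbers by walking the AST instead of eval()."""
    def ev(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

ALLOWED_TOOLS = {"calculate": safe_calculate}  # anything else is refused

def dispatch(name: str, argument: str) -> str:
    """Honor a tool request only if it is allowlisted; report errors as text."""
    if name not in ALLOWED_TOOLS:
        return f"error: tool '{name}' is not permitted"
    try:
        return str(ALLOWED_TOOLS[name](argument))
    except (ValueError, SyntaxError, ZeroDivisionError) as exc:
        return f"error: {exc}"

print(dispatch("calculate", "2 * (3 + 4)"))
print(dispatch("calculate", "__import__('os').getcwd()"))
print(dispatch("delete_files", "/"))
```

Returning errors as strings rather than raising lets the model see what went wrong and try a different request.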


3. Memory: Remembering What Matters

Without memory, every agent interaction starts from scratch. The agent has no idea what it did five turns ago, what the user's name is, or what previous calculations produced. Memory solves this — and there are multiple types, each suited for different scenarios.

💬 Conversation Memory

Short-term recall of the current chat history within context window limits.

📦 Summary Memory

Compress past conversations into summaries to retain key information beyond context limits.

🗃️ Vector Memory

Store embeddings of past interactions in a vector database for semantic retrieval.

💾 Entity Memory

Extract and store named entities (people, places, facts) in a structured knowledge graph.

Conversation Buffer Memory (LangChain)
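from langchain.memory import ConversationBufferMemory

# return_messages=False makes load_memory_variables() return the history as
# plain "Human: ... / AI: ..." text, which can be dropped into a string prompt.
# (With return_messages=True the f-string below would embed object reprs.)
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=False
)

# After each interaction:
memory.chat_memory.add_user_message(user_input)
memory.chat_memory.add_ai_message(agent_response)

# Inject into the prompt:
chat_history = memory.load_memory_variables({})["chat_history"]
prompt = f"""
You are a helpful assistant with memory.
Chat history:
{chat_history}

User: {user_input}
Assistant:"""
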
Vector Memory with Pinecone
import hashlib

import pinecone
from openai import OpenAI

client = OpenAI()  # embeddings client; reads OPENAI_API_KEY from the environment

# Initialize Pinecone
pc = pinecone.Pinecone(api_key="your-api-key")
index = pc.Index("agent-memory")

def embed(text: str) -> list[float]:
    """Embed text with the same model for both storing and querying."""
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

def store_memory(text: str, metadata: dict | None = None):
    """Store a piece of information as a vector."""
    # A content hash gives a stable ID across runs; the built-in hash() is
    # salted per process, so it would create duplicates after a restart.
    memory_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
    index.upsert([(
        memory_id,
        embed(text),
        {"text": text, **(metadata or {})}
    )])

def recall_memory(query: str, top_k: int = 5):
    """Retrieve relevant memories."""
    results = index.query(
        vector=embed(query),
        top_k=top_k,
        include_metadata=True
    )
    return [r.metadata["text"] for r in results.matches]

In practice, most production agents use a hybrid approach: conversation buffer for recent context, summary memory for compressed history, and vector memory for long-term semantic retrieval. Entity memory is optional but powerful for agents that need to remember facts about people, projects, or domain concepts.
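
The hybrid idea can be sketched without any external services. In the example below, a bag-of-words counter stands in for a real embedding model, a plain list stands in for the vector database, and the "summarizer" simply keeps the first five words of an evicted turn. All of these, along with the `HybridMemory` class itself, are simplifying assumptions for illustration.

```python
import math
from collections import Counter, deque

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class HybridMemory:
    """Recent turns verbatim, older turns summarized, everything searchable."""

    def __init__(self, window: int = 2):
        self.recent = deque(maxlen=window)  # conversation buffer
        self.summary = []                   # compressed history
        self.store = []                     # stand-in for a vector DB: (vector, text)

    def add(self, turn: str) -> None:
        if len(self.recent) == self.recent.maxlen:
            evicted = self.recent[0]
            # Placeholder summarizer: keep the first five words of the evicted turn.
            self.summary.append(" ".join(evicted.split()[:5]))
        self.recent.append(turn)            # deque drops the oldest turn itself
        self.store.append((embed(turn), turn))

    def recall(self, query: str, top_k: int = 1) -> list:
        """Return the stored turns most similar to the query."""
        q = embed(query)
        ranked = sorted(self.store, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]

memory = HybridMemory(window=2)
for turn in ["My name is Ada", "I work on compilers", "What is the weather today"]:
    memory.add(turn)

print(list(memory.recent))
print(memory.summary)
print(memory.recall("what do I work on"))
```

When building a prompt, the three parts would typically be concatenated: the summary first, then any recalled items, then the recent turns verbatim.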


4. Reasoning: How Agents Decide What to Do

Reasoning is the engine that drives agent behavior. Without explicit reasoning patterns, agents default to a single-shot response — they guess and hope. With structured reasoning, they think, act, observe, and iterate. This is the difference between a smart chatbot and a true agent.

The ReAct Pattern

ReAct (Reasoning + Acting) is one of the most widely adopted agent loops. At every step, the model outputs a "Thought" explaining its reasoning, then an "Action" specifying which tool to call and with what arguments. After receiving the tool's result ("Observation"), it repeats the cycle until it has enough information to produce a "Final Answer".

Thought: I need to find the latest stock price for AAPL
Action: Call search_web('AAPL stock price today')
Observation: $198.50 (up 1.2%)
Thought: Got the price. Now I can answer the user.
Final Answer: AAPL is trading at $198.50, up 1.2% today.

ReAct Loop Implementation
def react_loop(user_input: str, max_steps: int = 10):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_input}
    ]

    for step in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=AVAILABLE_TOOLS,
            tool_choice="auto"
        )

        msg = response.choices[0].message
        messages.append(msg)

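        # No tool calls means the model has produced its final answer.
        # (Checking tool_calls first avoids iterating over None below.)
        if not msg.tool_calls:
            return msg.content

        # Execute each requested tool and feed the result back as an observation
        for tool_call in msg.tool_calls:
            result = execute_tool_call(tool_call)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result)  # tool message content must be a string
            })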

    return "Max steps reached without final answer."

Reasoning Patterns Compared

🔄 ReAct (Reason + Act)

Iterative loop: Think → Act → Observe → Repeat until the goal is reached.

🌳 Tree of Thought (ToT)

Explore multiple reasoning paths simultaneously and evaluate the most promising ones.

📋 Plan & Execute

Create a step-by-step plan first, then execute each step while monitoring progress.

🔁 Reflexion

Self-critique and learn from past mistakes by evaluating agent actions and outcomes.
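
To contrast with ReAct, where the next action is chosen only after each observation, here is a minimal Plan & Execute skeleton. The planner and executor are hypothetical stubs (`make_plan`, `execute_step`); in a real agent both would be model or tool calls.

```python
def make_plan(goal: str) -> list[str]:
    """Stand-in planner: a real agent would ask the model to produce these steps."""
    return [f"search: {goal}", "summarize findings", "write report"]

def execute_step(step: str, context: list[str]) -> str:
    """Stand-in executor: records the step instead of calling tools.

    `context` holds the results of earlier steps, which a real executor
    could use when carrying out the current one.
    """
    return f"done({step})"

def plan_and_execute(goal: str, planner=make_plan, executor=execute_step) -> list[str]:
    plan = planner(goal)                  # 1. plan everything up front
    results: list[str] = []
    for step in plan:                     # 2. execute step by step
        results.append(executor(step, results))
    return results

for line in plan_and_execute("quantum computing"):
    print(line)
```

The trade-off is that the plan is fixed before any observation arrives, so this pattern is often paired with a re-planning step when an execution result invalidates the remaining plan.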


5. Building a Research Agent

Let's put it all together. A research agent that can search the web, read Wikipedia articles, analyze content, and produce a structured report. This agent uses tools for web search and content extraction, conversation memory to track what it has already explored, and the ReAct reasoning pattern to decide what to do next.

Research Agent
import openai
import json
import requests
from bs4 import BeautifulSoup

SYSTEM_PROMPT = """
You are a research agent. Your goal is to answer the user's
question thoroughly by searching the web and reading sources.

For each step, output:
THOUGHT: your reasoning about what to do next
ACTION: the tool to call and its arguments

Available tools:
- search_web(query): Search the web
- read_url(url): Extract text content from a URL

When you have enough information, output your final answer.
"""

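def search_web(query):
    """Search using the DuckDuckGo Instant Answer API.

    Note: this endpoint often returns few or no entries; a full search
    API can be substituted without changing the rest of the agent.
    """
    response = requests.get(
        "https://api.duckduckgo.com/",
        params={"q": query, "format": "json"},
        timeout=10
    )
    response.raise_for_status()
    results = response.json().get("Results", [])
    # Each entry carries "Text" and "FirstURL"; .get() guards against missing fields
    return json.dumps([
        {"title": r.get("Text", ""), "url": r.get("FirstURL", "")}
        for r in results[:5]
    ])

def read_url(url):
    """Extract readable text from a URL."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    paragraphs = soup.find_all("p")
    text = " ".join(p.get_text() for p in paragraphs[:20])
    return text[:3000]  # truncate so the observation stays small

# Reuse the ReAct loop from above, with AVAILABLE_TOOLS set to RESEARCH_TOOLS
# (function-calling schemas for search_web and read_url) and
# execute_tool_call extended to dispatch read_url.
answer = react_loop("What are the latest developments in quantum computing?")
print(answer)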

This agent can autonomously search for information, follow links, extract content, synthesize findings, and produce a well-researched answer. It handles the tedious parts of research — searching, reading, cross-referencing — while you focus on evaluating and applying the results.
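
The listing refers to `RESEARCH_TOOLS` without defining it. One plausible definition, following the function-calling format shown earlier, is sketched below together with a small dispatcher. The `tool_schema` and `run_research_tool` helpers are hypothetical, and the registry uses offline stubs in place of the real `search_web` and `read_url` so the example runs without network access.

```python
import json

def tool_schema(name: str, description: str, param: str, param_desc: str) -> dict:
    """Build a tool definition with a single required string parameter."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {param: {"type": "string", "description": param_desc}},
                "required": [param],
            },
        },
    }

RESEARCH_TOOLS = [
    tool_schema("search_web", "Search the web", "query", "The search query"),
    tool_schema("read_url", "Extract text content from a URL", "url", "Page URL to read"),
]

def run_research_tool(name: str, arguments: str, registry: dict) -> str:
    """Look up the requested tool and call it with the decoded JSON arguments."""
    args = json.loads(arguments)
    return registry[name](**args)

# Offline stubs stand in for the real search_web / read_url functions.
registry = {
    "search_web": lambda query: json.dumps([{"title": "stub", "url": "https://example.com"}]),
    "read_url": lambda url: f"text of {url}",
}

print(run_research_tool("read_url", '{"url": "https://example.com"}', registry))
```

Keeping the schema list and the registry keyed by the same names makes it harder for the two to drift apart as tools are added.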


6. Popular Agent Frameworks

You don't have to build everything from scratch. These frameworks provide battle-tested implementations of agent loops, memory systems, and tool integration:

LangChain

A widely used framework for building agentic applications with tool integration.

AutoGPT

Autonomous agent that self-generates goals and tasks to achieve complex objectives.

CrewAI

Multi-agent framework where specialized agents collaborate on shared tasks.

Semantic Kernel

Microsoft's lightweight SDK for AI orchestration with native .NET integration.

Haystack

Open-source framework for building search-augmented NLP pipelines.

Dify

Open-source LLMOps platform with visual agent workflow builder.


7. Common Challenges & Solutions

🎭 Hallucination in Tool Output

Agents sometimes hallucinate tool results, especially when a tool fails silently. Always validate tool outputs before passing them back to the model. Implement retry logic and explicit error messages.
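
One way to apply this advice is a thin wrapper around every tool call, sketched below under simple assumptions: empty output counts as a failure, any exception triggers a retry, and the final failure is returned as an explicit `TOOL_ERROR` string. The `call_with_retry` name and the `TOOL_ERROR` prefix are illustrative choices, not a standard.

```python
import time

def call_with_retry(tool, *args, retries: int = 2, delay: float = 0.0) -> str:
    """Call a tool, retry on failure, and always return an explicit string."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            result = tool(*args)
            # Treat silent failures (None or blank output) as errors.
            if result is None or str(result).strip() == "":
                raise ValueError("tool returned empty output")
            return str(result)
        except Exception as exc:
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # back off before the next attempt
    return f"TOOL_ERROR: {type(last_error).__name__}: {last_error}"

calls = {"n": 0}

def flaky_search(query: str) -> str:
    """Fails on the first call, succeeds afterwards."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("upstream timed out")
    return f"results for {query}"

print(call_with_retry(flaky_search, "agents"))
print(call_with_retry(lambda: "", retries=1))
```

Because the model receives an unambiguous error string instead of an empty observation, it has less room to invent a plausible-looking result.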

🔄 Infinite Loops

An agent can get stuck calling the same tool repeatedly. Enforce a maximum number of steps (e.g., 10-25), implement progressive timeouts, and detect repetitive tool calls.
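
Detecting repetitive tool calls can be as simple as counting identical requests. The `LoopGuard` class below is a hypothetical helper that flags a call once the same tool has been requested with the same arguments more than `max_repeats` times.

```python
import json
from collections import Counter

class LoopGuard:
    """Track (tool, arguments) pairs and flag when one repeats too often."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def should_stop(self, name: str, args: dict) -> bool:
        # sort_keys normalizes argument order so equivalent calls compare equal
        key = (name, json.dumps(args, sort_keys=True))
        self.seen[key] += 1
        return self.seen[key] > self.max_repeats

guard = LoopGuard(max_repeats=2)
decisions = [guard.should_stop("search_web", {"query": "AAPL price"}) for _ in range(3)]
print(decisions)
```

Inside an agent loop, a `True` result could either end the run or inject a message telling the model that the call has already been tried.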

🔒 Security & Permissions

Agents with tool access are powerful — and dangerous. Never give agents unrestricted access to databases, file systems, or external APIs. Use scoped permissions, read-only access where possible, and human-in-the-loop approval for destructive actions.
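
A human-in-the-loop gate can be sketched as a small policy table. In the example below, each tool is marked as destructive or not, and destructive tools run only if an `approve` callback returns `True`. The `ToolPolicy` class, the tool names, and the callbacks are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class ToolPolicy:
    func: Callable[..., str]
    destructive: bool = False  # destructive tools need explicit approval

def run_with_policy(name: str, args: dict, policies: dict, approve: Callable) -> str:
    """Run a tool only if it is registered and, when destructive, approved."""
    policy = policies.get(name)
    if policy is None:
        return "denied: unknown tool"
    if policy.destructive and not approve(name, args):
        return "denied: approval required"
    return policy.func(**args)

POLICIES = {
    "read_record": ToolPolicy(lambda record_id: f"record {record_id}"),
    "delete_record": ToolPolicy(lambda record_id: f"deleted {record_id}", destructive=True),
}

def deny_all(name: str, args: dict) -> bool:
    """Stand-in for a human reviewer who declines every request."""
    return False

print(run_with_policy("read_record", {"record_id": "42"}, POLICIES, deny_all))
print(run_with_policy("delete_record", {"record_id": "42"}, POLICIES, deny_all))
```

In a real deployment the `approve` callback might post to a review queue or prompt an operator, and read-only database credentials would back up the policy at the infrastructure level.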

💰 Cost Management

Each reasoning step costs tokens, and every step re-sends the growing message history. A complex task with 15 steps can therefore cost tens of times more than a single prompt. Monitor token usage, implement budget limits, and use cheaper models for routine subtasks.
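
The arithmetic behind multi-step cost is easy to sketch. The example below assumes a 500-token starting prompt that grows by 200 tokens per step (both numbers are made up for illustration) and pairs the estimate with a hypothetical `TokenBudget` guard that stops a run once a cap is crossed.

```python
class BudgetExceeded(RuntimeError):
    """Raised when accumulated token usage passes the configured cap."""

class TokenBudget:
    """Accumulate token usage across steps and stop once a cap is crossed."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, prompt_tokens: int, completion_tokens: int) -> None:
        self.used += prompt_tokens + completion_tokens
        if self.used > self.max_tokens:
            raise BudgetExceeded(f"used {self.used} of {self.max_tokens} tokens")

def simulate(steps: int, base_prompt: int = 500, growth: int = 200) -> int:
    """Total prompt tokens when each step re-sends a history that grows by `growth`."""
    return sum(base_prompt + growth * step for step in range(steps))

# Under these assumed numbers, 15 steps cost 57 times one step in prompt tokens.
print(simulate(1), simulate(15))

budget = TokenBudget(max_tokens=1000)
budget.charge(prompt_tokens=400, completion_tokens=100)
try:
    budget.charge(prompt_tokens=600, completion_tokens=50)
except BudgetExceeded as exc:
    print("stopping:", exc)
```

In a loop built on the chat completions API, the values passed to `charge` could come from `response.usage.prompt_tokens` and `response.usage.completion_tokens` after each call.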


Key Takeaways

  • AI agents combine tools, memory, and reasoning to solve complex multi-step problems autonomously.
  • Tools extend model capabilities beyond text — search, compute, databases, and APIs are all fair game.
  • Memory systems range from simple conversation buffers to semantic vector stores for long-term recall.
  • The ReAct pattern (Thought → Action → Observation) is the foundation of many production agents.
  • Frameworks like LangChain and CrewAI provide ready-made agent infrastructure you can customize.
  • Always implement safeguards: max steps, cost limits, permission scoping, and human oversight.

AI agents represent a fundamental shift from passive chatbots to active problem-solvers. By mastering tools, memory, and reasoning, you can build systems that don't just talk — they get things done.

