AI agents go beyond chat. They reason about a task, decide which tools to call, execute multi-step workflows, and adapt when things don't go as planned. LangChain is one of the most widely adopted frameworks for building them, with solid abstractions for tool use, memory, and orchestration.
This guide covers building production-ready agents with LangChain, from basic tool-calling implementations to autonomous systems with persistent memory, error recovery, and deployment patterns that hold up under real traffic. If you're just getting started, my intro to agentic AI systems with LangChain walks through the fundamentals first.
What Are AI Agents?
AI agents are systems built on top of large language models that can reason about a problem, break it into steps, call external tools and APIs to act on the world, keep context across interactions, and adapt their next move based on what just happened.
Unlike chatbots that respond to a query and then forget about it, agents plan sequences of actions, call tools, read the results, and decide what to do next. That makes them useful for business processes, support workflows, and analysis tasks where the path to the answer is not fixed in advance.
Why LangChain
You could build agents with raw OpenAI/Anthropic API calls. Some people prefer that. But LangChain handles the plumbing that every agent needs, and that plumbing is more work than most people expect.
Agent patterns that work. ReAct (reason, act, observe, repeat) is the default and handles most use cases. Plan-and-Execute breaks complex tasks into subtasks first. You get these out of the box instead of implementing the control loop yourself.
Tool integration in minutes. The @tool decorator turns a Python function into an agent-callable tool. The function's docstring becomes the tool description, which is what the agent reads to decide when to use it. Adding a new capability to your agent means writing a function with a good docstring.
Memory that scales. ConversationBufferMemory stores everything (fine for short conversations). ConversationSummaryMemory compresses long conversations to stay within token limits. VectorStoreRetrieverMemory does semantic search over past interactions. Pick the strategy that fits your context window constraints.
Production utilities. Callback handlers for logging, token counters for cost tracking, retry logic for API failures, streaming for real-time responses. The stuff you don't think about until you deploy and realize you need all of it.
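To make the ReAct loop mentioned above less abstract, here is a minimal sketch of the reason, act, observe cycle in plain Python. It is not LangChain's implementation: `scripted_policy` is a hypothetical stand-in for the LLM, and the `lookup` tool is a stub, both invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the LLM. It returns either
# ("act", tool_name, tool_input) or ("finish", answer, "").
def scripted_policy(question: str, observations: List[str]) -> Tuple[str, str, str]:
    if not observations:
        return ("act", "lookup", question)
    return ("finish", f"Answer based on: {observations[-1]}", "")


def react_loop(
    question: str,
    policy: Callable[[str, List[str]], Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 5,
) -> str:
    observations: List[str] = []
    for _ in range(max_iterations):
        # Reason: ask the policy what to do given everything observed so far
        kind, name_or_answer, tool_input = policy(question, observations)
        if kind == "finish":
            return name_or_answer
        tool = tools.get(name_or_answer)
        if tool is None:
            # Feed the mistake back as an observation instead of crashing
            observations.append(f"Unknown tool: {name_or_answer}")
            continue
        # Act, then observe the result
        observations.append(tool(tool_input))
    return "Stopped: iteration limit reached"


tools = {"lookup": lambda q: f"stub result for '{q}'"}
answer = react_loop("capital of France", scripted_policy, tools)
print(answer)
```

A framework replaces the scripted policy with a model call plus output parsing, which is most of the plumbing you would otherwise write yourself.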
Building Your First AI Agent
Let's build a research assistant agent that can search the web, analyze content, and provide synthesized answers. This example demonstrates core agent capabilities including tool usage and reasoning.
First, install the required dependencies:
pip install langchain langchain-openai langchain-community duckduckgo-search python-dotenv
Now create a basic agent with web search capabilities:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain import hub
from duckduckgo_search import DDGS

# Load environment variables
load_dotenv()

# Initialize the LLM
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    api_key=os.getenv("OPENAI_API_KEY")
)

# Create a web search tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    try:
        results = DDGS().text(query, max_results=3)
        formatted_results = []
        for r in results:
            formatted_results.append(f"Title: {r['title']}\nContent: {r['body']}\nURL: {r['href']}\n")
        return "\n".join(formatted_results)
    except Exception as e:
        return f"Search failed: {str(e)}"

# Define tools available to the agent
tools = [
    Tool(
        name="WebSearch",
        func=search_web,
        description="Useful for searching current information on the internet. Input should be a search query string."
    )
]

# Get the ReAct prompt template from LangChain hub
prompt = hub.pull("hwchase17/react")

# Create the agent
agent = create_react_agent(llm, tools, prompt)

# Create an executor to run the agent
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5
)

# Test the agent
if __name__ == "__main__":
    response = agent_executor.invoke({
        "input": "What are the latest developments in AI agent frameworks released in October 2025?"
    })
    print("\nFinal Answer:")
    print(response["output"])
This agent shows the ReAct pattern in its simplest form: it reads the query, reasons about what it needs, calls the web search tool, and synthesizes an answer. Setting verbose=True exposes the reasoning chain step by step, which is the single most useful thing you can do when an agent starts behaving oddly.
Building Advanced Multi-Tool Agents
Production AI agents typically need access to multiple tools and data sources. Let's build an agent that combines web search, calculations, and data analysis:
import os
import json
import re
import statistics
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain import hub
from duckduckgo_search import DDGS

load_dotenv()

# Initialize the LLM with a low temperature for consistent behavior
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.1,
    api_key=os.getenv("OPENAI_API_KEY")
)

# Web search tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    try:
        results = DDGS().text(query, max_results=3)
        formatted_results = []
        for r in results:
            formatted_results.append(
                f"Title: {r['title']}\n"
                f"Content: {r['body'][:200]}...\n"
                f"URL: {r['href']}\n"
            )
        return "\n".join(formatted_results) if formatted_results else "No results found"
    except Exception as e:
        return f"Search error: {str(e)}"

# Calculator tool for numerical operations
def calculate(expression: str) -> str:
    """Perform mathematical calculations. Input should be a valid Python math expression."""
    try:
        expression = expression.strip()
        # Basic safety check: only allow digits, whitespace, and basic operators
        if not re.match(r'^[\d\s\+\-\*\/\(\)\.\%]+$', expression):
            return "Invalid expression. Only numbers and basic math operators allowed."
        # Reject exponentiation so inputs like 9**9**9 cannot stall the process
        if "**" in expression:
            return "Invalid expression. Exponentiation is not supported."
        # eval is still a blunt instrument; empty builtins narrow what it can reach
        result = eval(expression, {"__builtins__": {}}, {})
        return f"Result: {result}"
    except Exception as e:
        return f"Calculation error: {str(e)}"

# Data analysis tool
def analyze_data(data_string: str) -> str:
    """Analyze numerical data and provide statistics. Input should be comma-separated numbers."""
    try:
        numbers = [float(x.strip()) for x in data_string.split(',')]
        if not numbers:
            return "No valid numbers provided"
        avg = sum(numbers) / len(numbers)
        # statistics.median averages the two middle values for even-length input
        median = statistics.median(numbers)
        analysis = {
            "count": len(numbers),
            "min": min(numbers),
            "max": max(numbers),
            "average": round(avg, 2),
            "median": round(median, 2),
            "sum": sum(numbers)
        }
        return json.dumps(analysis, indent=2)
    except Exception as e:
        return f"Analysis error: {str(e)}"

# Define all tools
tools = [
    Tool(
        name="WebSearch",
        func=search_web,
        description="Search the internet for current information. Use this when you need up-to-date data or facts. Input: search query string"
    ),
    Tool(
        name="Calculator",
        func=calculate,
        description="Perform mathematical calculations. Use for any numerical operations. Input: math expression like '(100 * 1.5) + 50'"
    ),
    Tool(
        name="DataAnalyzer",
        func=analyze_data,
        description="Analyze a list of numbers and get statistics. Input: comma-separated numbers like '10, 20, 30, 40, 50'"
    )
]

# Get ReAct prompt template
prompt = hub.pull("hwchase17/react")

# Create agent with multiple tools
agent = create_react_agent(llm, tools, prompt)

# Create executor with custom configuration
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=6,
    max_execution_time=60,  # 60 second timeout
    return_intermediate_steps=True
)

# Example usage
if __name__ == "__main__":
    # Complex query requiring multiple tools
    query = """
    Research the current pricing of OpenAI GPT-4 API per 1M tokens.
    Then calculate the cost for processing 5 million input tokens and 2 million output tokens.
    Finally, analyze these monthly costs over 6 months: 850, 920, 1100, 1050, 980, 1200
    """
    result = agent_executor.invoke({"input": query})

    print("\n" + "="*50)
    print("FINAL ANSWER:")
    print("="*50)
    print(result["output"])

    print("\n" + "="*50)
    print("AGENT STEPS:")
    print("="*50)
    # Each step is an (action, observation) pair
    for i, step in enumerate(result["intermediate_steps"], 1):
        print(f"\nStep {i}:")
        print(f"Action: {step[0].tool}")
        print(f"Input: {step[0].tool_input}")
        print(f"Result: {str(step[1])[:100]}...")
This agent can handle queries that need multiple tools chained together. It picks which tools to use and in what order based on the query, so the same agent code works for very different tasks.
Implementing Memory Systems
Agents without memory forget everything between turns. LangChain provides several memory types for different use cases. For workflows with branching, retries, and checkpointed state, LangGraph state machines give you primitives that go beyond conversation buffers. The example here sticks to the simple case, a conversation buffer attached to a ReAct agent:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain.memory import ConversationBufferMemory, ConversationSummaryMemory
from langchain.prompts import PromptTemplate

load_dotenv()

# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.7,
    api_key=os.getenv("OPENAI_API_KEY")
)

# Simple note-taking tool for demonstration (in-process only, lost on restart)
notes_storage = {}

def save_note(note_content: str) -> str:
    """Save a note for later reference. Input format: 'key: content'"""
    try:
        if ':' not in note_content:
            return "Format should be 'key: content'"
        key, content = note_content.split(':', 1)
        notes_storage[key.strip()] = content.strip()
        return f"Note saved with key: {key.strip()}"
    except Exception as e:
        return f"Error saving note: {str(e)}"

def retrieve_note(key: str) -> str:
    """Retrieve a previously saved note by key."""
    key = key.strip()
    if key in notes_storage:
        return f"Note content: {notes_storage[key]}"
    return f"No note found for key: {key}"

# Define tools
tools = [
    Tool(
        name="SaveNote",
        func=save_note,
        description="Save information for later reference. Input: 'key: content'"
    ),
    Tool(
        name="RetrieveNote",
        func=retrieve_note,
        description="Get a previously saved note. Input: the key name"
    )
]

# Initialize memory - stores full conversation history.
# The prompt below is a plain string template, so the history is kept as a
# string (the default) rather than as a list of message objects.
memory = ConversationBufferMemory(
    memory_key="chat_history"
)

# For longer conversations, use ConversationSummaryMemory instead:
# memory = ConversationSummaryMemory(
#     llm=llm,
#     memory_key="chat_history"
# )

# Create custom prompt that includes memory
template = """Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Previous conversation:
{chat_history}

Begin!

Question: {input}
Thought:{agent_scratchpad}"""

prompt = PromptTemplate(
    input_variables=["input", "chat_history", "agent_scratchpad", "tools", "tool_names"],
    template=template
)

# Create agent with memory
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5
)

# Scripted conversation demonstrating memory
if __name__ == "__main__":
    print("Agent with Memory - Scripted Demo")
    print("Press Enter after each answer to run the next query\n")

    # Queries run in order so later ones can rely on earlier turns
    test_queries = [
        "Save this note - project_deadline: The AI agent project is due on November 15, 2025",
        "Save another note - budget: The project budget is $50,000",
        "What is the project deadline?",
        "What's the budget we discussed?",
        "How many days until the deadline from today (October 17, 2025)?"
    ]

    for query in test_queries:
        print(f"\n{'='*60}")
        print(f"Query: {query}")
        print('='*60)
        response = agent_executor.invoke({"input": query})
        print(f"\nAnswer: {response['output']}\n")
        input("Press Enter to continue...")
This example shows how memory lets the agent refer back to earlier turns in the same session. The agent can answer from what was already said without requiring the user to repeat themselves. Both the buffer and the notes dictionary live in process memory, so nothing survives a restart unless you back them with persistent storage.
Building Domain-Specific Agent Tools
Custom tools are where agents start being useful for actual business systems. Here's how to wire them up:
import os
import json
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool, StructuredTool
# Recent LangChain releases expect Pydantic v2 models; older releases used
# the langchain.pydantic_v1 shim instead. Match this import to your version.
from pydantic import BaseModel, Field
from langchain import hub

load_dotenv()

# Initialize LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0, api_key=os.getenv("OPENAI_API_KEY"))

# Mock database for demonstration
customer_database = {
    "CUST001": {"name": "Acme Corp", "status": "active", "balance": 15000, "tier": "premium"},
    "CUST002": {"name": "TechStart Inc", "status": "active", "balance": 5000, "tier": "standard"},
    "CUST003": {"name": "Global Systems", "status": "suspended", "balance": -2000, "tier": "premium"}
}

order_database = {
    "ORD001": {"customer_id": "CUST001", "amount": 3000, "status": "shipped", "items": 5},
    "ORD002": {"customer_id": "CUST002", "amount": 1200, "status": "processing", "items": 2},
    "ORD003": {"customer_id": "CUST001", "amount": 5500, "status": "delivered", "items": 8}
}

# Define input schemas for structured tools
class CustomerLookupInput(BaseModel):
    customer_id: str = Field(description="The customer ID to lookup, format: CUST###")

class OrderLookupInput(BaseModel):
    order_id: str = Field(description="The order ID to lookup, format: ORD###")

class CustomerSearchInput(BaseModel):
    query: str = Field(description="Search term for customer name")

# Tool functions
def lookup_customer(customer_id: str) -> str:
    """Get detailed customer information by ID."""
    customer_id = customer_id.upper().strip()
    if customer_id in customer_database:
        customer = customer_database[customer_id]
        return json.dumps(customer, indent=2)
    return f"Customer {customer_id} not found"

def lookup_order(order_id: str) -> str:
    """Get order details by order ID."""
    order_id = order_id.upper().strip()
    if order_id in order_database:
        # Copy the record so the enrichment below does not mutate the database
        order = dict(order_database[order_id])
        # Enrich with customer name
        customer = customer_database.get(order["customer_id"], {})
        order["customer_name"] = customer.get("name", "Unknown")
        return json.dumps(order, indent=2)
    return f"Order {order_id} not found"

def search_customers(query: str) -> str:
    """Search for customers by name."""
    query = query.lower()
    results = []
    for cust_id, data in customer_database.items():
        if query in data["name"].lower():
            results.append(f"{cust_id}: {data['name']} ({data['status']}, {data['tier']})")
    if results:
        return "\n".join(results)
    return "No customers found matching the search"

def get_customer_orders(customer_id: str) -> str:
    """Get all orders for a specific customer."""
    customer_id = customer_id.upper().strip()
    orders = []
    for order_id, order_data in order_database.items():
        if order_data["customer_id"] == customer_id:
            orders.append(f"{order_id}: ${order_data['amount']} - {order_data['status']}")
    if orders:
        return "\n".join(orders)
    return f"No orders found for customer {customer_id}"

# Create structured tools with defined schemas
tools = [
    StructuredTool.from_function(
        func=lookup_customer,
        name="CustomerLookup",
        description="Look up detailed customer information by customer ID. Returns name, status, balance, and tier.",
        args_schema=CustomerLookupInput
    ),
    StructuredTool.from_function(
        func=lookup_order,
        name="OrderLookup",
        description="Get order details including amount, status, and customer information by order ID.",
        args_schema=OrderLookupInput
    ),
    StructuredTool.from_function(
        func=search_customers,
        name="CustomerSearch",
        description="Search for customers by name. Returns matching customer IDs with basic info.",
        args_schema=CustomerSearchInput
    ),
    Tool(
        name="CustomerOrders",
        func=get_customer_orders,
        description="Get all orders for a specific customer ID. Input: customer ID (CUST###)"
    )
]

# Create agent
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=6
)

# Example business queries
if __name__ == "__main__":
    queries = [
        "Find information about Acme Corp and show me their orders",
        "What is the total value of all orders for customer CUST001?",
        "Which customers have suspended status?",
        "Show me details for order ORD002"
    ]

    for query in queries:
        print(f"\n{'='*70}")
        print(f"Query: {query}")
        print('='*70)
        result = agent_executor.invoke({"input": query})
        print(f"\nAnswer: {result['output']}\n")
        input("Press Enter for next query...")
This pattern shows the shape of tools that talk to business systems. The dictionaries here are mocks, so swap the lookups for real database or API calls. Using StructuredTool with Pydantic schemas gives you validated inputs and clearer error messages, which is what you want once the agent is in production and you cannot babysit every call.
What I've learned shipping these
Reliable agents come from boring, careful work in a few specific areas: error handling, monitoring, and cost.
Catch errors at the tool boundary
Wrap every tool function in try-except and return informative error messages. Agents should handle API failures, timeouts, and invalid inputs without crashing the whole run. A tool that returns "API timeout, retry in 30s" is far more useful to an agent than a raw stack trace.
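One way to avoid repeating the same try-except in every tool is a small decorator. This is a sketch using only the standard library. `safe_tool` and `parse_quantity` are hypothetical names, and the exception types you catch should match what your own tools raise.

```python
import functools
from typing import Callable


def safe_tool(func: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a tool so failures come back as short, readable strings."""

    @functools.wraps(func)
    def wrapper(tool_input: str) -> str:
        try:
            return func(tool_input)
        except TimeoutError:
            return f"{func.__name__} timed out. Try again later or use another tool."
        except ValueError as exc:
            return f"{func.__name__} rejected the input: {exc}"
        except Exception as exc:  # last resort so the agent run keeps going
            return f"{func.__name__} failed: {type(exc).__name__}: {exc}"

    return wrapper


@safe_tool
def parse_quantity(text: str) -> str:
    """Hypothetical tool: double an integer quantity."""
    return str(int(text) * 2)


print(parse_quantity("21"))
print(parse_quantity("not a number"))
```

The agent sees the error text as an ordinary observation, so it can rephrase the input or pick a different tool on the next step.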
Cap iterations and time
Set max_iterations (typically 5-10) and max_execution_time (30-120 seconds) so a confused agent cannot burn through your budget. Without these caps, a stuck agent will quietly loop until something else stops it.
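For intuition, here is roughly what the two caps amount to, written as a standalone loop. It is a sketch, not AgentExecutor's code. `run_with_limits` and `step` are hypothetical names, and the time check only runs between steps, so a single slow step can still overrun the limit.

```python
import time
from typing import Callable


def run_with_limits(
    step: Callable[[int], bool],
    max_iterations: int = 5,
    max_execution_time: float = 30.0,
) -> str:
    """Call step(i) until it returns True or one of the caps is hit."""
    deadline = time.monotonic() + max_execution_time
    for i in range(max_iterations):
        if time.monotonic() >= deadline:
            return "stopped: time limit"
        if step(i):
            return f"finished after {i + 1} iteration(s)"
    return "stopped: iteration limit"


# A step that succeeds on its third call, and one that never succeeds
print(run_with_limits(lambda i: i == 2))
print(run_with_limits(lambda i: False, max_iterations=4))
```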
Pick temperature with intent
Use temperature 0-0.1 when you need near-deterministic behavior (even at 0, outputs can vary slightly between runs). Use 0.3-0.7 for creative tasks. Anything above 0.8 in production is asking for trouble.
Track tokens and cost per interaction
Implement callback handlers to count tokens per request. Set alerts for unusual spending. The number that actually matters is cost per completed task, not cost per API call. A cheap-per-call model that needs several loops or retries can end up more expensive than a pricier model that gets it right on the first try.
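The arithmetic behind cost per completed task is simple enough to sketch. All prices, call counts, and success rates here are made-up numbers for illustration, not measurements of any real model.

```python
def cost_per_completed_task(
    price_per_call: float, calls_per_attempt: float, success_rate: float
) -> float:
    """Expected spend to get one successful task, counting failed attempts."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return price_per_call * calls_per_attempt / success_rate


# Hypothetical figures: the cheap model loops more and fails half the time
cheap = cost_per_completed_task(price_per_call=0.002, calls_per_attempt=6, success_rate=0.5)
pricey = cost_per_completed_task(price_per_call=0.01, calls_per_attempt=2, success_rate=1.0)
print(f"cheap model:  ${cheap:.3f} per completed task")
print(f"pricey model: ${pricey:.3f} per completed task")
```

With these invented inputs, the model that costs five times more per call comes out cheaper per completed task.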
Log the reasoning, not just the result
Store intermediate steps and reasoning chains for debugging and audit. When an agent makes a bad decision in production, you need to be able to see what it was thinking, not just what it did.
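A low-effort way to keep that trail is one JSON line per step. The sketch below assumes each step has already been reduced to a (tool, tool_input, observation) tuple of strings. `steps_to_jsonl` and the field names are hypothetical.

```python
import json
from typing import List, Tuple


def steps_to_jsonl(
    run_id: str, steps: List[Tuple[str, str, str]], max_chars: int = 500
) -> str:
    """Serialize agent steps as JSON Lines, truncating long observations."""
    lines = []
    for index, (tool, tool_input, observation) in enumerate(steps, start=1):
        record = {
            "run_id": run_id,
            "step": index,
            "tool": tool,
            "tool_input": tool_input,
            "observation": observation[:max_chars],
        }
        lines.append(json.dumps(record, sort_keys=True))
    return "\n".join(lines)


example_steps = [
    ("WebSearch", "gpt-4o pricing", "Title: ...\nContent: ..."),
    ("Calculator", "(5 * 2.5) + (2 * 10)", "Result: 32.5"),
]
print(steps_to_jsonl("run-001", example_steps))
```

Because each line is self-contained JSON, the output can be appended to a file or shipped to whatever log store you already use.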
Retry with backoff
LLM API calls fail. Implement retries with increasing delays so a transient blip does not cascade into a failed run, and so you do not hammer a struggling upstream provider.
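A minimal sketch of exponential backoff with jitter, using only the standard library. The retryable exception types, the delays, and the `flaky_call` stand-in are assumptions. Substitute whatever your provider's client actually raises for rate limits and timeouts.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures, doubling the wait each time up to max_delay."""
    for attempt in range(max_attempts):
        try:
            return call()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from many clients hitting the same API
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")


# Hypothetical flaky call: fails twice, then succeeds
attempts = {"count": 0}

def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

waits = []
result = retry_with_backoff(flaky_call, sleep=waits.append)
print(result, attempts["count"], len(waits))
```

Passing `sleep` as a parameter keeps the function testable without real waiting.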
Version your prompts
Store prompts in version control and treat them as code. Small wording changes can significantly change agent behavior, and you want a paper trail when something starts misbehaving.
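Alongside version control, it can help to stamp each logged run with a short fingerprint of the prompt it used, so a behavior change can be tied to a specific prompt revision. This is one possible approach, and `prompt_fingerprint` is a hypothetical helper.

```python
import hashlib


def prompt_fingerprint(template: str) -> str:
    """Short, stable hash of a prompt, ignoring trailing whitespace noise."""
    normalized = "\n".join(line.rstrip() for line in template.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


v1 = "Answer the question as best you can.\nQuestion: {input}"
v2 = "Answer the question concisely.\nQuestion: {input}"

print(prompt_fingerprint(v1))
print(prompt_fingerprint(v2))
```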
Deployment
Scaling
Deploy agents as containers and run them on Kubernetes (or an equivalent) for horizontal scaling. Use request queuing to absorb traffic spikes without hitting LLM rate limits. FastAPI works well for concurrent request handling.
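Inside a single worker, the simplest form of request queuing is a concurrency cap: excess requests wait their turn instead of all hitting the LLM API at once. The sketch below uses asyncio.Semaphore with a short sleep standing in for the agent call. The function names and the limit of 3 are illustrative assumptions.

```python
import asyncio


async def run_all(n_requests: int, max_concurrent: int) -> int:
    """Run n_requests fake agent calls and return the peak concurrency seen."""
    limiter = asyncio.Semaphore(max_concurrent)
    active = 0
    peak = 0

    async def handle(request_id: int) -> str:
        nonlocal active, peak
        async with limiter:  # requests beyond the cap wait here
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for the real agent call
            active -= 1
        return f"done-{request_id}"

    results = await asyncio.gather(*(handle(i) for i in range(n_requests)))
    assert len(results) == n_requests
    return peak


peak = asyncio.run(run_all(n_requests=10, max_concurrent=3))
print(f"peak concurrent calls: {peak}")
```

Across multiple containers you would need a shared queue or rate limiter instead, since a semaphore only protects one process.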
Cost
Monitor LLM cost per interaction. Cache repeated queries. Use cheaper models for simple tasks and reserve the expensive ones for the reasoning steps that need them. Set daily and monthly spending limits at the provider level, not just in your own code.
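Caching repeated queries can start as a dictionary keyed on a normalized form of the question. In this sketch, `CachedAnswerer` and `normalize` are hypothetical, and the lambda stands in for an agent call. A real deployment would likely add expiry and a shared store.

```python
from typing import Callable, Dict


def normalize(query: str) -> str:
    """Collapse case and whitespace so trivially different queries share a key."""
    return " ".join(query.lower().split())


class CachedAnswerer:
    def __init__(self, answer_fn: Callable[[str], str]) -> None:
        self._answer_fn = answer_fn
        self._cache: Dict[str, str] = {}
        self.misses = 0  # number of times the expensive call actually ran

    def ask(self, query: str) -> str:
        key = normalize(query)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._answer_fn(query)
        return self._cache[key]


answerer = CachedAnswerer(lambda q: f"answer to: {normalize(q)}")
first = answerer.ask("What is our refund policy?")
second = answerer.ask("  what is our   REFUND policy? ")
print(first == second, answerer.misses)
```

Exact-match caching only pays off for queries that really do repeat, so measure the hit rate before relying on it for savings.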
Security
Never put API keys in code. Use environment variables or a secret manager. Authenticate agent endpoints, validate all user input before it reaches an agent, and add content filters to make prompt injection harder.
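As a starting point for input validation, the sketch below strips control characters and enforces a length cap before text reaches the agent. The limit and the `validate_user_input` name are assumptions, and this kind of check does not stop prompt injection on its own.

```python
import re

MAX_INPUT_CHARS = 2000  # hypothetical limit, tune for your use case
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def validate_user_input(text: str) -> str:
    """Return cleaned text, or raise ValueError if it should be rejected."""
    cleaned = _CONTROL_CHARS.sub("", text).strip()
    if not cleaned:
        raise ValueError("Input is empty")
    if len(cleaned) > MAX_INPUT_CHARS:
        raise ValueError(f"Input exceeds {MAX_INPUT_CHARS} characters")
    return cleaned


print(validate_user_input("  Show me order ORD002\x00  "))
```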
Monitoring
Track response time, error rate, token usage, cost per interaction, and user satisfaction. Alert on sudden cost increases or rising error rates. LangSmith handles a lot of this out of the box, or you can wire your own logging if you need something custom.
Where these agents are showing up
A few patterns where I keep seeing production agents working well:
- Customer support. Agents search knowledge bases, pull customer data, update tickets, and escalate when they hit something they cannot handle.
- Data analysis and reporting. Agents query databases, run calculations, build visualizations, and write summaries from raw data.
- Research and content workflows. Agents pull information from multiple sources, reconcile it, and produce structured reports or article drafts.
- DevOps and IT. Agents watch system health, diagnose problems, run remediation scripts, and write up incidents.
- Sales and CRM. Agents qualify leads, update records, schedule meetings, and draft outreach based on customer history.
Wrapping up
LangChain agents are a real step up from scripted automation. You get LLM reasoning plus tool integration plus memory in a single package, and that combination handles multi-step workflows that used to need a human.
The framework covers the plumbing, from agent types to error handling, so the hard part becomes the design decisions: which tools the agent gets, what success looks like, and how much autonomy you actually want to grant. Clear tool descriptions, sensible execution limits, and proper monitoring make far more difference than fancy architecture.
Start with a single-tool agent. Validate it on real-looking inputs. Add tools and memory only when you have evidence you need them. Done with care, agents can take a meaningful chunk of work off your team's plate without sacrificing the quality bar.
Where to go next
- Set up your dev environment with LangChain and an LLM provider API key (OpenAI or Anthropic).
- Build a simple single-tool agent so the ReAct execution flow stops being abstract.
- Add custom tools that connect to your actual business systems or data.
- Add memory once multi-turn context starts mattering.
- Deploy with monitoring, error handling, and cost controls before real users hit it.
- Iterate against actual user feedback and the metrics that matter for your use case.