LangGraph is a framework built on top of LangChain for creating AI agents that need state management and non-linear control flow. If your agent needs to loop, branch, retry failed steps, or pause for human approval, LangGraph gives you the primitives to build that without writing a custom state machine from scratch.
This guide covers building autonomous agents with LangGraph, from basic state machines to cyclical workflows with human-in-the-loop patterns and persistent checkpointing. For a concrete application of these patterns, see autonomous AI agents for CI/CD pipeline automation.
What LangGraph is
LangGraph is a framework for building stateful, multi-actor applications with LLMs. It extends LangChain with a few specific things:
- Graph-based workflow definitions, where nodes are actions and edges define transitions
- Built-in state management that persists across agent execution steps and survives long-running workflows
- Cyclical execution, so agents can loop back to previous steps based on conditions or outcomes
- Human-in-the-loop patterns for approval steps or interventions at specific decision points
- Checkpointing and persistence to save agent state and resume after interruptions
Traditional LangChain agents run a single fixed ReAct loop: reason, act, observe, repeat. That's fine for simple tool-calling. LangGraph agents can implement actual state machines with conditional branching, parallel execution, and cycles that you define yourself. That's the difference between an agent that calls a tool and one that runs a multi-stage approval workflow, an iterative refinement loop, or a long-running business process.
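To make the "nodes are actions, edges define transitions" idea concrete, here is a toy graph runner in plain Python. It is a sketch of the concept only, not how LangGraph is implemented: `run_graph`, the `nodes` and `routers` dictionaries, and the `END` sentinel are all names made up for this illustration.

```python
from typing import Callable, Dict

END = "__end__"  # illustrative sentinel, named after LangGraph's END

State = dict
Node = Callable[[State], State]   # a node returns the keys it wants to update
Router = Callable[[State], str]   # a router returns the name of the next node


def run_graph(nodes: Dict[str, Node], routers: Dict[str, Router],
              entry: str, state: State, max_steps: int = 25) -> State:
    """Run nodes until a router returns END or max_steps is exhausted."""
    current = entry
    for _ in range(max_steps):
        state = {**state, **nodes[current](state)}  # merge the node's update
        current = routers[current](state)           # follow the outgoing edge
        if current == END:
            return state
    raise RuntimeError(f"no END reached after {max_steps} steps")


# Toy workflow: keep drafting until the draft has at least three words.
nodes = {
    "draft": lambda s: {"draft": (s["draft"] + " word").strip(),
                        "attempts": s["attempts"] + 1},
    "review": lambda s: {"ok": len(s["draft"].split()) >= 3},
}
routers = {
    "draft": lambda s: "review",                      # fixed edge
    "review": lambda s: END if s["ok"] else "draft",  # conditional edge (cycle)
}

result = run_graph(nodes, routers, "draft",
                   {"draft": "", "attempts": 0, "ok": False})
print(result)
```

The cycle between `draft` and `review` is the part a plain tool-calling loop cannot express directly; the `max_steps` guard plays the same role as a recursion limit.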
What LangGraph adds over plain LangChain
LangChain agents work fine for simple tool-calling. But try building an agent that needs to loop back to a previous step based on output quality, or pause for human approval, or run two tasks in parallel and merge the results. That's where plain LangChain agents fall apart.
Explicit state. Every node reads from and writes to a shared state object. No guessing what the previous step produced. You can inspect state at any point, which makes debugging much less painful.
Non-linear flow. Conditional branches, parallel execution, loops with exit conditions, fallback paths. You define the graph structure and LangGraph handles the execution. Traditional agent frameworks force you into a linear loop. LangGraph lets you build actual workflows.
Checkpointing. When the graph is compiled with a checkpointer, state is saved after each step. If the agent crashes at step 7 of a 10-step workflow, you resume from the last saved checkpoint instead of starting over. Important for anything that takes more than a few minutes to run.
Human interrupts. Add an interrupt point before or after any node. The agent pauses, surfaces the current state to a human, waits for approval or input, then continues. This relies on checkpointing, since the paused state has to be stored somewhere. Simple to implement, and required for any workflow touching sensitive operations.
Building a first LangGraph agent
Let's build a research and writing agent that can gather information, create content drafts, and iterate based on quality checks. This demonstrates core LangGraph concepts including state management and cyclical workflows.
First, install the required dependencies:
pip install langgraph langchain langchain-openai langchain-community python-dotenv
Now create a basic LangGraph agent with multiple states:
import os
from typing import TypedDict

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END

load_dotenv()

# Define the agent state structure
class AgentState(TypedDict):
    task: str
    research_data: str
    draft_content: str
    quality_score: int
    iteration_count: int
    final_output: str


# Initialize LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0.7, api_key=os.getenv("OPENAI_API_KEY"))


# Node 1: Research node that gathers information
def research_node(state: AgentState) -> AgentState:
    """Gather information about the topic."""
    task = state["task"]
    # In production, this would call actual research tools
    prompt = f"Provide 3-4 key facts about: {task}"
    response = llm.invoke(prompt)
    return {
        **state,
        "research_data": response.content,
        "iteration_count": state.get("iteration_count", 0)
    }


# Node 2: Writing node that creates content
def writing_node(state: AgentState) -> AgentState:
    """Create content based on research data."""
    research = state["research_data"]
    task = state["task"]
    prompt = f"""Based on this research:
{research}
Write a concise 2-paragraph article about: {task}"""
    response = llm.invoke(prompt)
    return {
        **state,
        "draft_content": response.content,
        "iteration_count": state.get("iteration_count", 0) + 1
    }


# Node 3: Quality check node
def quality_check_node(state: AgentState) -> AgentState:
    """Evaluate the quality of the content."""
    content = state["draft_content"]
    prompt = f"""Rate this content from 1-10 for clarity and completeness.
Respond with ONLY a number.
Content:
{content}"""
    response = llm.invoke(prompt)
    try:
        score = int(response.content.strip())
    except ValueError:
        score = 5  # Default if parsing fails
    return {
        **state,
        "quality_score": score
    }


# Node 4: Finalization node
def finalize_node(state: AgentState) -> AgentState:
    """Prepare the final output."""
    return {
        **state,
        "final_output": state["draft_content"]
    }


# Conditional edge: Decide whether to iterate or finalize
def should_iterate(state: AgentState) -> str:
    """Determine if we should iterate or finalize."""
    quality_score = state.get("quality_score", 0)
    iteration_count = state.get("iteration_count", 0)
    # Iterate if quality is low and we haven't tried too many times
    if quality_score < 7 and iteration_count < 3:
        return "iterate"
    else:
        return "finalize"

# Build the graph
workflow = StateGraph(AgentState)

# Add nodes
workflow.add_node("research", research_node)
workflow.add_node("writing", writing_node)
workflow.add_node("quality_check", quality_check_node)
workflow.add_node("finalize", finalize_node)

# Define edges (flow between nodes)
workflow.add_edge("research", "writing")
workflow.add_edge("writing", "quality_check")

# Add conditional edge based on quality
workflow.add_conditional_edges(
    "quality_check",
    should_iterate,
    {
        "iterate": "research",   # Loop back to improve
        "finalize": "finalize"   # Move to finalization
    }
)
workflow.add_edge("finalize", END)

# Set entry point
workflow.set_entry_point("research")

# Compile the graph
app = workflow.compile()

# Execute the agent
if __name__ == "__main__":
    initial_state = {
        "task": "explain how LangGraph enables autonomous AI agents",
        "research_data": "",
        "draft_content": "",
        "quality_score": 0,
        "iteration_count": 0,
        "final_output": ""
    }
    print("Starting LangGraph Agent...\n")
    # Run the agent
    final_state = app.invoke(initial_state)
    print("="*70)
    print("FINAL OUTPUT:")
    print("="*70)
    print(final_state["final_output"])
    print(f"\nIterations: {final_state['iteration_count']}")
    print(f"Final Quality Score: {final_state['quality_score']}")
This agent demonstrates a cyclical workflow: it researches, writes, checks quality, and if the quality is insufficient, loops back to research again. The StateGraph manages state transitions, and conditional edges enable dynamic routing based on agent decisions.
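One fragile spot in the example is the quality check: `int(response.content.strip())` falls back to the default of 5 for any reply that is not a bare integer, such as "8/10" or "Score: 7". A more forgiving parser could look like the sketch below. `parse_score` is a hypothetical helper, not part of LangGraph or LangChain, and it simply takes the first integer it finds, so the prompt should still ask for a number only.

```python
import re


def parse_score(text: str, default: int = 5, low: int = 1, high: int = 10) -> int:
    """Extract the first integer from a model reply and clamp it to [low, high].

    Returns `default` when the reply contains no digits at all.
    """
    match = re.search(r"\d+", text)
    if match is None:
        return default
    return max(low, min(high, int(match.group())))


for reply in ["8", "Score: 7/10", "Looks great!", "15"]:
    print(repr(reply), "->", parse_score(reply))
```

Clamping matters here because the routing function compares the score against a threshold; an out-of-range value would otherwise be treated as a valid rating.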
Multi-agent systems with LangGraph
Real applications often need multiple specialized agents working together. LangGraph handles this well because the graph structure makes it explicit which agent does what and when:
import os
from typing import TypedDict

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END

load_dotenv()

# Define shared state for multi-agent system
class MultiAgentState(TypedDict):
    user_request: str
    research_findings: str
    technical_analysis: str
    business_recommendation: str
    approval_status: str
    final_report: str


# Initialize specialized LLMs (could use different models/prompts)
llm = ChatOpenAI(model="gpt-4o", temperature=0, api_key=os.getenv("OPENAI_API_KEY"))


# Agent 1: Research Specialist
def research_agent(state: MultiAgentState) -> MultiAgentState:
    """Research agent gathers information and facts."""
    request = state["user_request"]
    prompt = f"""You are a research specialist. Gather key information about:
{request}
Provide 4-5 factual findings with sources."""
    response = llm.invoke(prompt)
    return {
        **state,
        "research_findings": response.content
    }


# Agent 2: Technical Analyst
def technical_agent(state: MultiAgentState) -> MultiAgentState:
    """Technical agent analyzes implementation details."""
    request = state["user_request"]
    research = state["research_findings"]
    prompt = f"""You are a technical analyst. Based on this request and research:
Request: {request}
Research:
{research}
Provide technical analysis: feasibility, implementation approach, and technical requirements."""
    response = llm.invoke(prompt)
    return {
        **state,
        "technical_analysis": response.content
    }


# Agent 3: Business Advisor
def business_agent(state: MultiAgentState) -> MultiAgentState:
    """Business agent provides strategic recommendations."""
    request = state["user_request"]
    research = state["research_findings"]
    technical = state["technical_analysis"]
    prompt = f"""You are a business advisor. Based on:
Request: {request}
Research: {research}
Technical Analysis: {technical}
Provide business recommendation: ROI, risks, timeline, and go/no-go decision."""
    response = llm.invoke(prompt)
    return {
        **state,
        "business_recommendation": response.content
    }


# Agent 4: Report Compiler
def report_agent(state: MultiAgentState) -> MultiAgentState:
    """Compile all findings into a final report."""
    research = state["research_findings"]
    technical = state["technical_analysis"]
    business = state["business_recommendation"]
    prompt = f"""Compile this information into a concise executive summary:
RESEARCH FINDINGS:
{research}
TECHNICAL ANALYSIS:
{technical}
BUSINESS RECOMMENDATION:
{business}
Create a structured 3-paragraph summary."""
    response = llm.invoke(prompt)
    return {
        **state,
        "final_report": response.content,
        "approval_status": "pending"
    }


# Human approval node (simulated)
def human_approval_node(state: MultiAgentState) -> MultiAgentState:
    """Pause for human approval."""
    print("\n" + "="*70)
    print("FINAL REPORT FOR APPROVAL:")
    print("="*70)
    print(state["final_report"])
    print("\n" + "="*70)
    # In production, this would integrate with your approval system
    approval = input("\nApprove this report? (yes/no): ").lower().strip()
    return {
        **state,
        "approval_status": "approved" if approval == "yes" else "rejected"
    }


# Conditional edge: Check approval status
def check_approval(state: MultiAgentState) -> str:
    """Route based on approval status."""
    if state["approval_status"] == "approved":
        return "approved"
    else:
        return "rejected"

# Build multi-agent workflow
workflow = StateGraph(MultiAgentState)

# Add agent nodes
workflow.add_node("research", research_agent)
workflow.add_node("technical", technical_agent)
workflow.add_node("business", business_agent)
workflow.add_node("report", report_agent)
workflow.add_node("approval", human_approval_node)

# Define sequential flow
workflow.add_edge("research", "technical")
workflow.add_edge("technical", "business")
workflow.add_edge("business", "report")
workflow.add_edge("report", "approval")

# Add conditional routing after approval
workflow.add_conditional_edges(
    "approval",
    check_approval,
    {
        "approved": END,
        "rejected": "research"  # Loop back to start if rejected
    }
)

# Set entry point
workflow.set_entry_point("research")

# Compile
app = workflow.compile()

# Execute multi-agent system
if __name__ == "__main__":
    initial_state = {
        "user_request": "Should we implement LangGraph for our customer support automation?",
        "research_findings": "",
        "technical_analysis": "",
        "business_recommendation": "",
        "approval_status": "",
        "final_report": ""
    }
    print("Starting Multi-Agent Analysis System...\n")
    # A rejection routes back to "research", so invoke() normally returns only
    # after approval. Repeated rejections eventually exceed LangGraph's
    # recursion limit (25 steps by default) and raise an error.
    final_state = app.invoke(initial_state)
    if final_state["approval_status"] == "approved":
        print("\n✅ Report Approved!")
        print("\nFINAL REPORT:")
        print(final_state["final_report"])
    else:
        print("\n❌ Report Rejected - System will iterate")
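This multi-agent system demonstrates how specialized agents can collaborate through shared state. Each agent adds its expertise, and the human approval step acts as a gate before any decision is finalized. The approval node here blocks on input(), which only works in an interactive script; a deployed service would pause the graph and resume it once the approval arrives.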
Persistent state and checkpoints
For long-running workflows or agents that need to survive restarts, LangGraph has built-in checkpointing:
import os
import sqlite3
from typing import TypedDict

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
# In recent LangGraph releases this module ships as a separate package:
#   pip install langgraph-checkpoint-sqlite
from langgraph.checkpoint.sqlite import SqliteSaver

load_dotenv()


# Define agent state
class PersistentAgentState(TypedDict):
    task: str
    progress: str
    steps_completed: list
    current_step: int
    total_steps: int
    result: str


llm = ChatOpenAI(model="gpt-4o", temperature=0, api_key=os.getenv("OPENAI_API_KEY"))


# Simulate long-running steps
def step_one(state: PersistentAgentState) -> PersistentAgentState:
    """Execute first step of long workflow."""
    print("Executing Step 1: Data Collection")
    prompt = f"Simulate data collection for: {state['task']}"
    response = llm.invoke(prompt)  # output is discarded in this demo
    # Build a new list instead of mutating the one held in state
    steps = state.get("steps_completed", []) + ["Step 1: Data Collection Complete"]
    return {
        **state,
        "steps_completed": steps,
        "current_step": 1,
        "progress": "Collected initial data"
    }


def step_two(state: PersistentAgentState) -> PersistentAgentState:
    """Execute second step."""
    print("Executing Step 2: Data Analysis")
    prompt = f"Analyze data for: {state['task']}"
    response = llm.invoke(prompt)  # output is discarded in this demo
    steps = state.get("steps_completed", []) + ["Step 2: Analysis Complete"]
    return {
        **state,
        "steps_completed": steps,
        "current_step": 2,
        "progress": "Analysis completed"
    }


def step_three(state: PersistentAgentState) -> PersistentAgentState:
    """Execute final step."""
    print("Executing Step 3: Report Generation")
    prompt = f"Generate final report for: {state['task']}"
    response = llm.invoke(prompt)
    steps = state.get("steps_completed", []) + ["Step 3: Report Generated"]
    return {
        **state,
        "steps_completed": steps,
        "current_step": 3,
        "progress": "Complete",
        "result": response.content
    }


# Build workflow with checkpointing
workflow = StateGraph(PersistentAgentState)
workflow.add_node("step1", step_one)
workflow.add_node("step2", step_two)
workflow.add_node("step3", step_three)
workflow.add_edge("step1", "step2")
workflow.add_edge("step2", "step3")
workflow.add_edge("step3", END)
workflow.set_entry_point("step1")

# Initialize checkpoint saver.
# ":memory:" keeps the demo self-contained but is lost when the process exits;
# pass a file path instead to survive restarts.
# Handing SqliteSaver a connection avoids relying on the return value of
# from_conn_string(), which differs between LangGraph versions.
conn = sqlite3.connect(":memory:", check_same_thread=False)
checkpointer = SqliteSaver(conn)

# Compile with checkpointing enabled
app = workflow.compile(checkpointer=checkpointer)

# Execute with thread ID for state persistence
if __name__ == "__main__":
    initial_state = {
        "task": "Quarterly sales analysis",
        "progress": "",
        "steps_completed": [],
        "current_step": 0,
        "total_steps": 3,
        "result": ""
    }
    # Create a thread ID for this workflow instance
    thread_id = "workflow-123"
    config = {"configurable": {"thread_id": thread_id}}
    print("Starting Persistent Workflow...\n")
    # Execute workflow
    final_state = app.invoke(initial_state, config=config)
    print("\n" + "="*70)
    print("WORKFLOW COMPLETE")
    print("="*70)
    print(f"Progress: {final_state['progress']}")
    print(f"Steps Completed: {len(final_state['steps_completed'])}")
    print("\nSteps:")
    for step in final_state["steps_completed"]:
        print(f"  ✓ {step}")
    # Demonstrate checkpoint retrieval
    print("\n" + "="*70)
    print("CHECKPOINTS SAVED:")
    print("="*70)
    # Get all checkpoints for this thread (most recent first)
    checkpoint_history = app.get_state_history(config)
    for i, checkpoint in enumerate(checkpoint_history):
        if i < 3:  # Show the 3 most recent checkpoints
            print(f"\nCheckpoint {i + 1}:")
            print(f"  Current Step: {checkpoint.values.get('current_step', 0)}")
            print(f"  Progress: {checkpoint.values.get('progress', 'N/A')}")
A few things checkpointing makes possible:
- Resume from failure. If the agent crashes at step 2, restart from that checkpoint instead of replaying step 1.
- Long-running workflows. Anything that runs for hours or days without losing state.
- Time-travel debugging. Inspect agent state at any point in execution history.
- Audit trails. Complete records of decisions and state transitions, useful for compliance or just for figuring out what happened.
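The resume-from-failure idea is easier to see in a stripped-down form. The sketch below uses only the standard library: it saves state after each step and, on the next run with the same thread ID, skips every step that already has a checkpoint. The `checkpoints` table layout and the `run_with_checkpoints` helper are invented for this illustration and are not LangGraph's storage format or API.

```python
import json
import sqlite3
from typing import Callable, List


def run_with_checkpoints(conn: sqlite3.Connection, thread_id: str,
                         steps: List[Callable[[dict], dict]],
                         initial_state: dict) -> dict:
    """Run steps in order, saving state after each; skip steps already saved."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints "
        "(thread_id TEXT, step_index INTEGER, state TEXT, "
        "PRIMARY KEY (thread_id, step_index))"
    )
    # Find the latest checkpoint for this thread, if any.
    row = conn.execute(
        "SELECT step_index, state FROM checkpoints "
        "WHERE thread_id = ? ORDER BY step_index DESC LIMIT 1",
        (thread_id,),
    ).fetchone()
    if row:
        next_index, state = row[0] + 1, json.loads(row[1])
    else:
        next_index, state = 0, dict(initial_state)

    for index in range(next_index, len(steps)):
        state = {**state, **steps[index](state)}  # merge the step's update
        conn.execute("INSERT INTO checkpoints VALUES (?, ?, ?)",
                     (thread_id, index, json.dumps(state)))
        conn.commit()  # the checkpoint is durable before the next step starts
    return state
```

If a step raises, everything before it is already committed, so calling the function again with the same `thread_id` picks up at the failed step. State has to be JSON-serializable in this sketch, which is a simplification.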
Tool integration
LangGraph agents can use external tools while maintaining state across tool calls:
import os
import operator
from typing import TypedDict, Annotated

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.tools import Tool
from langgraph.graph import StateGraph, END

load_dotenv()


# Define state with tool tracking
class ToolAgentState(TypedDict):
    messages: Annotated[list, operator.add]
    query: str
    tool_calls: list
    final_answer: str


llm = ChatOpenAI(model="gpt-4o", temperature=0, api_key=os.getenv("OPENAI_API_KEY"))


# Define tools
def calculate(expression: str) -> str:
    """Perform calculations. Input: math expression."""
    # WARNING: eval() runs arbitrary code. Acceptable for a local demo only;
    # never pass model or user input to eval() in production.
    try:
        result = eval(expression)
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"


def get_weather(location: str) -> str:
    """Get weather for a location. Input: city name."""
    # Mock weather data
    return f"Weather in {location}: 72°F, Sunny"


tools = [
    Tool(name="Calculator", func=calculate, description="Perform math calculations"),
    Tool(name="WeatherTool", func=get_weather, description="Get current weather")
]
# Look tools up by name. The ToolExecutor/ToolInvocation helpers from
# langgraph.prebuilt are deprecated and absent from recent releases.
tools_by_name = {tool.name: tool for tool in tools}


# Agent decision node
def agent_node(state: ToolAgentState) -> dict:
    """Agent decides what to do next."""
    query = state["query"]
    # Tell the model which tool names it may use
    tool_list = ", ".join(f"{t.name} ({t.description})" for t in tools)
    prompt = f"""Given this query: {query}
Available tools: {tool_list}
Previous tool calls: {state.get('tool_calls', [])}
Decide: Do you need to call a tool, or can you provide the final answer?
If you need a tool, specify: TOOL: tool_name, INPUT: input_value
If you have the answer, specify: ANSWER: your response"""
    response = llm.invoke(prompt)
    content = response.content
    if "TOOL:" in content and "INPUT:" in content:
        # Parse tool call
        tool_name = content.split("TOOL:")[1].split(",")[0].strip()
        tool_input = content.split("INPUT:")[1].strip()
        tool_calls = list(state.get("tool_calls", []))
        tool_calls.append({"tool": tool_name, "input": tool_input})
        # Return only the changed key: "messages" uses an operator.add reducer,
        # so echoing it back via **state would append the list to itself.
        return {"tool_calls": tool_calls}
    else:
        # Final answer
        answer = content.split("ANSWER:")[1].strip() if "ANSWER:" in content else content
        return {"final_answer": answer}


# Tool execution node
def tool_node(state: ToolAgentState) -> dict:
    """Execute the most recent tool call."""
    tool_calls = [dict(call) for call in state["tool_calls"]]  # copy, don't mutate state
    if not tool_calls:
        return {"tool_calls": tool_calls}
    last_call = tool_calls[-1]
    tool = tools_by_name.get(last_call["tool"])
    # Execute tool
    if tool is None:
        result = f"Error: unknown tool {last_call['tool']!r}"
    else:
        result = tool.invoke(last_call["input"])
    # Update state with result
    last_call["result"] = result
    return {"tool_calls": tool_calls}


# Routing function
def should_continue(state: ToolAgentState) -> str:
    """Decide next step based on state."""
    if state.get("final_answer"):
        return "end"
    elif state.get("tool_calls") and "result" not in state["tool_calls"][-1]:
        return "execute_tool"
    else:
        return "agent"


# Build graph
workflow = StateGraph(ToolAgentState)
workflow.add_node("agent", agent_node)
workflow.add_node("tool", tool_node)
workflow.add_conditional_edges(
    "agent",
    should_continue,
    {
        "execute_tool": "tool",
        "agent": "agent",
        "end": END
    }
)
workflow.add_edge("tool", "agent")
workflow.set_entry_point("agent")
app = workflow.compile()

# Test the tool-using agent
if __name__ == "__main__":
    test_query = "What's the weather in San Francisco, and if it's above 70°F, calculate 70 * 1.5"
    initial_state = {
        "messages": [],
        "query": test_query,
        "tool_calls": [],
        "final_answer": ""
    }
    print(f"Query: {test_query}\n")
    print("="*70)
    final_state = app.invoke(initial_state)
    print("\nTOOL CALLS:")
    for call in final_state["tool_calls"]:
        print(f"  • {call['tool']}: {call['input']}")
        print(f"    Result: {call.get('result', 'N/A')}")
    print(f"\nFINAL ANSWER:")
    print(final_state["final_answer"])
This pattern creates agents that can dynamically decide when to use tools, track all tool executions in state, and make decisions based on tool results.
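The calculator tool in this example hands the model's text to `eval()`, which will execute any Python expression it is given. For arithmetic, one safer option is to parse the expression and walk the syntax tree, allowing only numeric literals and a fixed set of operators. The sketch below shows that approach; `safe_calculate` is a hypothetical replacement for the body of `calculate`, and exponentiation is left out on purpose because it makes very expensive inputs easy to write.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv,
           ast.Mod: operator.mod}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> Number:
    """Evaluate +, -, *, /, % on numeric literals; reject everything else."""
    def evaluate(node: ast.AST) -> Number:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"unsupported expression: {expression!r}")

    return evaluate(ast.parse(expression, mode="eval").body)


print(safe_calculate("70 * 1.5"))
```

Function calls, attribute access, and names raise `ValueError` instead of running, so an input like `__import__('os')` is rejected before anything executes.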
What works in production
Reliable LangGraph agents come down to a handful of practical habits:
Design clear state schemas. Use TypedDict to define what state looks like. Include every field an agent might need and document what each one represents. A TypedDict cannot hold default values, so set defaults in the initial state you pass to the graph. This is the difference between a workflow you can debug and one you can't.
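One way to keep schema and initial values in a single place is a small factory function next to the TypedDict. The sketch below reuses the field names from the first example; `new_agent_state` is a hypothetical helper, not something LangGraph provides.

```python
from typing import TypedDict


class AgentState(TypedDict):
    task: str             # what the agent was asked to do
    research_data: str    # notes produced by the research node
    draft_content: str    # latest draft from the writing node
    quality_score: int    # 1-10 rating from the quality check
    iteration_count: int  # completed write/review cycles
    final_output: str     # set only by the finalize node


def new_agent_state(task: str) -> AgentState:
    """Build a fully populated initial state so no node sees a missing key."""
    return AgentState(
        task=task,
        research_data="",
        draft_content="",
        quality_score=0,
        iteration_count=0,
        final_output="",
    )


state = new_agent_state("explain how LangGraph enables autonomous AI agents")
print(sorted(state))
```

With the factory in place, adding a field means touching the class and one function, rather than every script that builds an initial state by hand.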
Bound your loops. Every cyclical workflow needs a maximum iteration count. Put a counter in state, check it in conditional edges. Without it, the first time the model gets stuck, your agent runs forever and your bill notices.
Handle errors. Wrap node functions in try/except. Put error info into state so downstream nodes can react. For anything user-facing, add a dedicated recovery node so failures don't just propagate.
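A decorator is one way to apply this consistently. In the sketch below, the `error` state key and the `recover` / `continue` route names are assumptions made for illustration; any real graph would need a matching field in its state schema and matching entries in its conditional-edge mapping.

```python
import functools
from typing import Callable


def capture_errors(node: Callable[[dict], dict]) -> Callable[[dict], dict]:
    """Wrap a node so an exception becomes a state update instead of a crash."""
    @functools.wraps(node)
    def wrapper(state: dict) -> dict:
        try:
            return {**node(state), "error": None}
        except Exception as exc:  # broad on purpose: every failure gets routed
            return {"error": f"{node.__name__}: {type(exc).__name__}: {exc}"}
    return wrapper


def route_on_error(state: dict) -> str:
    """Conditional-edge function: send failed runs to a recovery node."""
    return "recover" if state.get("error") else "continue"


@capture_errors
def ratio_node(state: dict) -> dict:
    """Example node that fails when the denominator is zero."""
    return {"ratio": 1 / state["denominator"]}


print(ratio_node({"denominator": 0}))
```

Clearing `error` on success keeps a stale failure from an earlier step from triggering the recovery path later in the run.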
Use checkpointing for anything that takes more than a few minutes. Save after expensive operations (API calls, database queries). If the agent crashes at step 7 of 10, you want to resume at 7, not start over.
Add observability from the start. Log state transitions, node executions, and decision points. Track execution time per node, iteration counts, success rates. Structured logging beats grep every time.
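As a starting point, a decorator can emit one structured log line per node execution. This sketch uses only the standard library; the logger name `agent.nodes` and the field names in the JSON payload are arbitrary choices, not a LangGraph convention.

```python
import functools
import json
import logging
import time
from typing import Callable

logger = logging.getLogger("agent.nodes")


def logged(node: Callable[[dict], dict]) -> Callable[[dict], dict]:
    """Log node name, outcome, duration, and updated keys as one JSON line."""
    @functools.wraps(node)
    def wrapper(state: dict) -> dict:
        start = time.perf_counter()
        status = "error"
        update: dict = {}
        try:
            update = node(state)
            status = "ok"
            return update
        finally:
            # Runs on success and on failure, so crashes are logged too.
            logger.info(json.dumps({
                "node": node.__name__,
                "status": status,
                "duration_ms": round((time.perf_counter() - start) * 1000, 2),
                "updated_keys": sorted(update),
            }))
    return wrapper
```

Logging only the updated keys, not their values, keeps prompts and model output out of the logs by default; values can be added selectively where that is acceptable.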
Test each node by itself. Unit tests with mock state objects. Test conditional edge functions across different state configurations. Get the pieces right before you assemble the graph.
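Routing functions are the easiest place to start because they are pure functions of state and need no LLM. The sketch below restates the `should_iterate` rule from the first example (quality below 7 and fewer than 3 iterations means iterate) and checks it with `unittest`; the test names are illustrative.

```python
import unittest


def should_iterate(state: dict) -> str:
    """Same rule as the routing function in the research/writing example."""
    if state.get("quality_score", 0) < 7 and state.get("iteration_count", 0) < 3:
        return "iterate"
    return "finalize"


class ShouldIterateTest(unittest.TestCase):
    def test_low_score_with_budget_left_iterates(self):
        state = {"quality_score": 4, "iteration_count": 1}
        self.assertEqual(should_iterate(state), "iterate")

    def test_good_score_finalizes(self):
        state = {"quality_score": 8, "iteration_count": 1}
        self.assertEqual(should_iterate(state), "finalize")

    def test_iteration_cap_wins_over_low_score(self):
        state = {"quality_score": 2, "iteration_count": 3}
        self.assertEqual(should_iterate(state), "finalize")

    def test_threshold_score_finalizes(self):
        state = {"quality_score": 7, "iteration_count": 0}
        self.assertEqual(should_iterate(state), "finalize")
```

Saved as a module, these run with `python -m unittest`. Nodes that call an LLM can be tested the same way once the model call is replaced with a stub.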
Version-control your graph definitions. They're code, treat them like code. Document each node and edge. Use descriptive commit messages so workflow changes are easy to audit later.
What to plan for before shipping
Scalability
Deploy LangGraph agents as containerized services. Docker and Kubernetes are fine. Queue requests so concurrent workflows don't hammer the LLM API. Stateless nodes scale horizontally. Shared state needs dedicated checkpoint storage.
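Within a single process, the simplest form of request queuing is a concurrency cap. The sketch below uses an `asyncio.Semaphore` so that at most `max_concurrent` workflows are in flight at once; `run_with_limit` and `fake_workflow` are placeholders, with the latter standing in for a real graph invocation.

```python
import asyncio
from typing import Awaitable, Iterable, List


async def run_with_limit(coros: Iterable[Awaitable], max_concurrent: int = 4) -> List:
    """Run workflow coroutines with at most max_concurrent active at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(coro: Awaitable):
        async with semaphore:  # waits here while the cap is reached
            return await coro

    # gather() returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_workflow(i: int) -> int:
    """Stand-in for one workflow run; sleeps instead of calling an LLM API."""
    await asyncio.sleep(0.01)
    return i * 2


results = asyncio.run(run_with_limit([fake_workflow(i) for i in range(10)],
                                     max_concurrent=3))
print(results)
```

Across multiple processes or machines this cap has to move into shared infrastructure such as a job queue, but the principle is the same.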
Cost
Track token usage per workflow and per node. Set cost limits and alerts for runaway workflows because you will have at least one. Cache repeated LLM calls when you can. Use cheaper models for simple nodes, reserve expensive models for decisions that actually matter.
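A per-workflow budget can be as simple as a counter that raises once a limit is crossed. The sketch below is framework-independent: `CostTracker` and `BudgetExceeded` are hypothetical names, the rate passed in is a placeholder rather than a real price, and the token counts would come from whatever usage data your model client reports.

```python
from collections import defaultdict


class BudgetExceeded(RuntimeError):
    """Raised when a workflow spends more than its configured limit."""


class CostTracker:
    """Accumulate token usage per node and stop the workflow past a budget."""

    def __init__(self, budget_usd: float, usd_per_1k_tokens: float):
        self.budget_usd = budget_usd
        self.usd_per_1k_tokens = usd_per_1k_tokens  # placeholder rate
        self.tokens_by_node = defaultdict(int)

    @property
    def total_usd(self) -> float:
        total_tokens = sum(self.tokens_by_node.values())
        return total_tokens / 1000 * self.usd_per_1k_tokens

    def record(self, node: str, tokens: int) -> None:
        """Add usage for a node; raise once the running total passes the budget."""
        self.tokens_by_node[node] += tokens
        if self.total_usd > self.budget_usd:
            raise BudgetExceeded(
                f"spent ${self.total_usd:.4f} of ${self.budget_usd:.4f} budget"
            )


tracker = CostTracker(budget_usd=0.01, usd_per_1k_tokens=0.005)
tracker.record("research", 1000)
tracker.record("writing", 500)
print(dict(tracker.tokens_by_node), round(tracker.total_usd, 4))
```

Keeping the totals per node also shows which nodes are worth moving to a cheaper model.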
Security
Sanitize user inputs before they touch agent state. Validate state transitions so they can't be manipulated externally. Encrypt checkpoint storage when it contains anything sensitive. Authenticate workflow triggers and human-approval endpoints.
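Validating state transitions can be done with an explicit table of allowed moves. The sketch below uses the `approval_status` values from the multi-agent example (empty, pending, approved, rejected); the transition table itself and the `apply_status` helper are assumptions made for illustration.

```python
# Which approval_status values may follow which. "approved" is terminal.
ALLOWED_TRANSITIONS = {
    "": {"pending"},
    "pending": {"approved", "rejected"},
    "rejected": {"pending"},
    "approved": set(),
}


def apply_status(state: dict, new_status: str) -> dict:
    """Return a copy of state with the new status, or raise if it is not allowed."""
    current = state.get("approval_status", "")
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current!r} -> {new_status!r}")
    return {**state, "approval_status": new_status}


state = apply_status({"approval_status": ""}, "pending")
state = apply_status(state, "approved")
print(state)
```

With a check like this at the approval endpoint, a request cannot mark a report as approved unless it is actually waiting for review.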
Monitoring
Track success rates, average execution time, and cost per workflow. Checkpoint storage grows; clean up old ones. Alert on failures, timeouts, and unexpected state transitions. For multi-agent workflows, distributed tracing is the only way to figure out where things went sideways.
Where this actually shows up
A few real workflows where LangGraph earns its keep:
- Multi-stage content creation: research, draft, review, revise, with quality gates and human approval before anything ships
- Financial analysis pipelines: data collection, analysis, risk assessment, recommendation, with compliance checkpoints along the way
- Customer onboarding: application processing, document verification, account creation, welcome sequence, with manual review where it matters
- Incident response: detection, diagnosis, remediation, verification, with escalation paths when the simple path doesn't work
- Supply chain: demand forecasting, inventory planning, order placement, supplier coordination, with approval gates
Wrap-up
LangGraph solves a specific problem. LangChain agents work fine for simple tool-calling. They fall apart the moment you need branching, looping, error recovery, or human approval. The graph-based approach gives you explicit control over flow without rolling your own state machine.
The tradeoff is complexity. A chatbot doesn't need LangGraph. Anything with multi-step workflows, conditional paths, or human-in-the-loop requirements does.
Start with a basic linear workflow to get comfortable with state management. Add conditional edges once that flow is clear. Then bring in checkpointing and human approval. Don't try to build everything at once because the debugging is much harder than it looks.
Next steps
- Install LangGraph and build a simple 3-4 node state machine to get a feel for state transitions
- Add conditional edges that route based on agent decisions or tool outputs
- Add checkpointing so workflows can survive a restart
- Build a human-in-the-loop workflow with approval gates for anything high-stakes
- Add monitoring (execution times, token costs, failure rates) before going to production