mirror of https://github.com/wshobson/agents.git
synced 2026-03-18 09:37:15 +00:00

feat(llm-application-dev): modernize to LangGraph and latest models v2.0.0

- Migrate from LangChain 0.x to LangChain 1.x/LangGraph patterns
- Update model references to Claude 4.5 and GPT-5.2
- Add Voyage AI as primary embedding recommendation
- Add structured outputs with Pydantic
- Replace deprecated initialize_agent() with StateGraph
- Fix security: use AST-based safe math instead of unsafe execution
- Add plugin.json and README.md for consistency
- Bump marketplace version to 1.3.3
---
name: langchain-architecture
description: Design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
---

# LangChain & LangGraph Architecture

Master modern LangChain 1.x and LangGraph for building sophisticated LLM applications with agents, state management, memory, and tool integration.

## When to Use This Skill

- Implementing document processing pipelines
- Building production-grade LLM applications

## Package Structure (LangChain 1.x)

```
langchain (1.2.x)      # High-level orchestration
langchain-core (1.2.x) # Core abstractions (messages, prompts, tools)
langchain-community    # Third-party integrations
langgraph              # Agent orchestration and state management
langchain-openai       # OpenAI integrations
langchain-anthropic    # Anthropic/Claude integrations
langchain-voyageai     # Voyage AI embeddings
langchain-pinecone     # Pinecone vector store
```

## Core Concepts

### 1. LangGraph Agents
LangGraph is the standard for building agents in 2026. It provides:

**Key Features:**
- **StateGraph**: Explicit state management with typed state
- **Durable Execution**: Agents persist through failures
- **Human-in-the-Loop**: Inspect and modify state at any point
- **Memory**: Short-term and long-term memory across sessions
- **Checkpointing**: Save and resume agent state
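The features above revolve around one idea: an agent is a graph of nodes that read and write a shared, typed state. A framework-free sketch of that idea in plain Python (hypothetical stand-in names, not the langgraph API):

```python
# Minimal sketch of the StateGraph idea: nodes are functions that
# receive the current state dict and return a partial update.
# Illustrative only -- not the langgraph API.

def retrieve(state: dict) -> dict:
    return {"context": f"docs about {state['question']}"}

def generate(state: dict) -> dict:
    return {"answer": f"Answer using {state['context']}"}

def run_graph(nodes, state: dict) -> dict:
    # Execute nodes in order, merging each partial update into state
    for node in nodes:
        state = {**state, **node(state)}
    return state

result = run_graph([retrieve, generate], {"question": "What is LangGraph?"})
print(result["answer"])
```

LangGraph adds typed state schemas, conditional edges, and checkpointing on top of this basic execute-and-merge loop.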

**Agent Patterns:**
- **ReAct**: Reasoning + Acting with `create_react_agent`
- **Plan-and-Execute**: Separate planning and execution nodes
- **Multi-Agent**: Supervisor routing between specialized agents
- **Tool-Calling**: Structured tool invocation with Pydantic schemas

### 2. State Management
LangGraph uses TypedDict for explicit state:

```python
from typing import Annotated, TypedDict
from langgraph.graph import MessagesState

# Simple message-based state
class AgentState(MessagesState):
    """Extends MessagesState with custom fields."""
    context: Annotated[list, "retrieved documents"]

# Custom state for complex agents
class CustomState(TypedDict):
    messages: Annotated[list, "conversation history"]
    context: Annotated[dict, "retrieved context"]
    current_step: str
    results: list
```

### 3. Memory Systems
Modern memory implementations:

- **ConversationBufferMemory**: Stores all messages (short conversations)
- **ConversationSummaryMemory**: Summarizes older messages (long conversations)
- **ConversationTokenBufferMemory**: Token-based windowing
- **VectorStoreRetrieverMemory**: Semantic similarity retrieval
- **LangGraph Checkpointers**: Persistent state across sessions
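The windowing strategies above share a simple core: cap what gets replayed to the model. A dependency-free sketch of a sliding-window buffer (illustrative only; the named classes above are the real implementations):

```python
from collections import deque

class WindowMemory:
    """Keep only the last k exchanges, like a buffer-window memory."""

    def __init__(self, k: int = 5):
        self.buffer = deque(maxlen=k)  # older entries fall off automatically

    def save(self, user_msg: str, ai_msg: str) -> None:
        self.buffer.append((user_msg, ai_msg))

    def load(self) -> str:
        # Render the retained window as prompt context
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.buffer)

memory = WindowMemory(k=2)
for i in range(4):
    memory.save(f"question {i}", f"answer {i}")
print(memory.load())  # only the last 2 exchanges survive
```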

### 4. Document Processing
Loading, transforming, and storing documents:

**Components:**
- **Document Loaders**: Load from various sources
- **Text Splitters**: Chunk documents intelligently
- **Vector Stores**: Store and retrieve embeddings
- **Retrievers**: Fetch relevant documents
- **Indexes**: Organize documents for efficient access
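Of these components, text splitters have the most tunable behavior; the core mechanic is fixed-size chunks with overlap so context is not lost at chunk boundaries. A plain-Python illustration of that mechanic (LangChain's splitters add separator-aware recursion on top of this):

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Naive character splitter: fixed-size windows with overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # final window reached the end of the text
    return chunks

chunks = split_text("abcdefghij" * 50, chunk_size=100, chunk_overlap=20)
print(len(chunks), len(chunks[0]))  # 6 chunks of 100 chars each
```

Note how each chunk's last 20 characters repeat as the next chunk's first 20, which is exactly what `chunk_overlap=200` does in the RAG example later in this document.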

### 5. Callbacks & Tracing
LangSmith is the standard for observability:

**Use Cases:**
- Request/response logging
- Token usage tracking
- Latency monitoring
- Error tracking
- Trace visualization

## Quick Start

### Modern ReAct Agent with LangGraph

```python
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool
import ast
import operator

# Initialize LLM (Claude Sonnet 4.5 recommended)
llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0)

# Define tools with Pydantic schemas
@tool
def search_database(query: str) -> str:
    """Search internal database for information."""
    return f"Results for: {query}"

@tool
def calculate(expression: str) -> str:
    """Safely evaluate a mathematical expression.

    Supports: +, -, *, /, **, %, parentheses
    Example: '(2 + 3) * 4' returns '20'
    """
    # Safe math evaluation using ast
    allowed_operators = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Pow: operator.pow,
        ast.Mod: operator.mod,
        ast.USub: operator.neg,
    }

    def _eval(node):
        if isinstance(node, ast.Constant):
            return node.value
        elif isinstance(node, ast.BinOp):
            left = _eval(node.left)
            right = _eval(node.right)
            return allowed_operators[type(node.op)](left, right)
        elif isinstance(node, ast.UnaryOp):
            operand = _eval(node.operand)
            return allowed_operators[type(node.op)](operand)
        else:
            raise ValueError(f"Unsupported operation: {type(node)}")

    try:
        tree = ast.parse(expression, mode='eval')
        return str(_eval(tree.body))
    except Exception as e:
        return f"Error: {e}"

tools = [search_database, calculate]

# Create checkpointer for memory persistence
checkpointer = MemorySaver()

# Create ReAct agent
agent = create_react_agent(
    llm,
    tools,
    checkpointer=checkpointer
)

# Run agent with thread ID for memory
config = {"configurable": {"thread_id": "user-123"}}
result = await agent.ainvoke(
    {"messages": [("user", "Search for Python tutorials and calculate 25 * 4")]},
    config=config
)
```

## Architecture Patterns

### Pattern 1: RAG with LangGraph

```python
from langgraph.graph import StateGraph, START, END
from langchain_anthropic import ChatAnthropic
from langchain_voyageai import VoyageAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from typing import TypedDict, Annotated

class RAGState(TypedDict):
    question: str
    context: Annotated[list[Document], "retrieved documents"]
    answer: str

# Initialize components
llm = ChatAnthropic(model="claude-sonnet-4-5")
embeddings = VoyageAIEmbeddings(model="voyage-3-large")
vectorstore = PineconeVectorStore(index_name="docs", embedding=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Define nodes
async def retrieve(state: RAGState) -> RAGState:
    """Retrieve relevant documents."""
    docs = await retriever.ainvoke(state["question"])
    return {"context": docs}

async def generate(state: RAGState) -> RAGState:
    """Generate answer from context."""
    prompt = ChatPromptTemplate.from_template(
        """Answer based on the context below. If you cannot answer, say so.

Context: {context}

Question: {question}

Answer:"""
    )
    context_text = "\n\n".join(doc.page_content for doc in state["context"])
    response = await llm.ainvoke(
        prompt.format(context=context_text, question=state["question"])
    )
    return {"answer": response.content}

# Build graph
builder = StateGraph(RAGState)
builder.add_node("retrieve", retrieve)
builder.add_node("generate", generate)
builder.add_edge(START, "retrieve")
builder.add_edge("retrieve", "generate")
builder.add_edge("generate", END)

rag_chain = builder.compile()

# Use the chain
result = await rag_chain.ainvoke({"question": "What is the main topic?"})
```

### Pattern 2: Custom Agent with Structured Tools

```python
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    """Input for database search."""
    query: str = Field(description="Search query")
    filters: dict = Field(default={}, description="Optional filters")

class EmailInput(BaseModel):
    """Input for sending email."""
    recipient: str = Field(description="Email recipient")
    subject: str = Field(description="Email subject")
    content: str = Field(description="Email body")

async def search_database(query: str, filters: dict = {}) -> str:
    """Search internal database for information."""
    # Your database search logic
    return f"Results for '{query}' with filters {filters}"

async def send_email(recipient: str, subject: str, content: str) -> str:
    """Send an email to specified recipient."""
    # Email sending logic
    return f"Email sent to {recipient}"

tools = [
    StructuredTool.from_function(
        coroutine=search_database,
        name="search_database",
        description="Search internal database",
        args_schema=SearchInput
    ),
    StructuredTool.from_function(
        coroutine=send_email,
        name="send_email",
        description="Send an email",
        args_schema=EmailInput
    )
]

agent = create_react_agent(llm, tools)
```

### Pattern 3: Multi-Step Workflow with StateGraph

```python
from langgraph.graph import StateGraph, START, END
from typing import TypedDict, Literal

class WorkflowState(TypedDict):
    text: str
    entities: list
    analysis: str
    summary: str
    current_step: str

async def extract_entities(state: WorkflowState) -> WorkflowState:
    """Extract key entities from text."""
    prompt = f"Extract key entities from: {state['text']}\n\nReturn as JSON list."
    response = await llm.ainvoke(prompt)
    return {"entities": response.content, "current_step": "analyze"}

async def analyze_entities(state: WorkflowState) -> WorkflowState:
    """Analyze extracted entities."""
    prompt = f"Analyze these entities: {state['entities']}\n\nProvide insights."
    response = await llm.ainvoke(prompt)
    return {"analysis": response.content, "current_step": "summarize"}

async def generate_summary(state: WorkflowState) -> WorkflowState:
    """Generate final summary."""
    prompt = f"""Summarize:
Entities: {state['entities']}
Analysis: {state['analysis']}

Provide a concise summary."""
    response = await llm.ainvoke(prompt)
    return {"summary": response.content, "current_step": "complete"}

def route_step(state: WorkflowState) -> Literal["analyze", "summarize", "end"]:
    """Route to next step based on current state."""
    step = state.get("current_step", "extract")
    if step == "analyze":
        return "analyze"
    elif step == "summarize":
        return "summarize"
    return "end"

# Build workflow
builder = StateGraph(WorkflowState)
builder.add_node("extract", extract_entities)
builder.add_node("analyze", analyze_entities)
builder.add_node("summarize", generate_summary)

builder.add_edge(START, "extract")
builder.add_conditional_edges("extract", route_step, {
    "analyze": "analyze",
    "summarize": "summarize",
    "end": END
})
builder.add_conditional_edges("analyze", route_step, {
    "summarize": "summarize",
    "end": END
})
builder.add_edge("summarize", END)

workflow = builder.compile()
```

### Pattern 4: Multi-Agent Orchestration

```python
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import create_react_agent
from typing import Literal, TypedDict

class MultiAgentState(TypedDict):
    messages: list
    next_agent: str

# Create specialized agents (tool lists assumed defined elsewhere)
researcher = create_react_agent(llm, research_tools)
writer = create_react_agent(llm, writing_tools)
reviewer = create_react_agent(llm, review_tools)

async def supervisor(state: MultiAgentState) -> MultiAgentState:
    """Route to appropriate agent based on task."""
    prompt = f"""Based on the conversation, which agent should handle this?

Options:
- researcher: For finding information
- writer: For creating content
- reviewer: For reviewing and editing
- FINISH: Task is complete

Messages: {state['messages']}

Respond with just the agent name."""

    response = await llm.ainvoke(prompt)
    return {"next_agent": response.content.strip().lower()}

def route_to_agent(state: MultiAgentState) -> Literal["researcher", "writer", "reviewer", "end"]:
    """Route based on supervisor decision."""
    next_agent = state.get("next_agent", "").lower()
    if next_agent == "finish":
        return "end"
    return next_agent if next_agent in ["researcher", "writer", "reviewer"] else "end"

# Build multi-agent graph
builder = StateGraph(MultiAgentState)
builder.add_node("supervisor", supervisor)
builder.add_node("researcher", researcher)
builder.add_node("writer", writer)
builder.add_node("reviewer", reviewer)

builder.add_edge(START, "supervisor")
builder.add_conditional_edges("supervisor", route_to_agent, {
    "researcher": "researcher",
    "writer": "writer",
    "reviewer": "reviewer",
    "end": END
})

# Each agent returns to supervisor
for agent in ["researcher", "writer", "reviewer"]:
    builder.add_edge(agent, "supervisor")

multi_agent = builder.compile()
```

## Memory Management

### Checkpointer-Based Memory with LangGraph

```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

# In-memory checkpointer (development)
checkpointer = MemorySaver()

# Create agent with persistent memory
agent = create_react_agent(llm, tools, checkpointer=checkpointer)

# Each thread_id maintains a separate conversation
config = {"configurable": {"thread_id": "session-abc123"}}

# Messages persist across invocations with the same thread_id
result1 = await agent.ainvoke({"messages": [("user", "My name is Alice")]}, config)
result2 = await agent.ainvoke({"messages": [("user", "What's my name?")]}, config)
# Agent remembers: "Your name is Alice"
```

### Production Memory with PostgreSQL

```python
from langgraph.checkpoint.postgres import PostgresSaver

# Production checkpointer
checkpointer = PostgresSaver.from_conn_string(
    "postgresql://user:pass@localhost/langgraph"
)

agent = create_react_agent(llm, tools, checkpointer=checkpointer)
```

### Vector Store Memory for Long-Term Context

```python
from langchain_community.vectorstores import Chroma
from langchain_voyageai import VoyageAIEmbeddings

embeddings = VoyageAIEmbeddings(model="voyage-3-large")
memory_store = Chroma(
    collection_name="conversation_memory",
    embedding_function=embeddings,
    persist_directory="./memory_db"
)

async def retrieve_relevant_memory(query: str, k: int = 5) -> list:
    """Retrieve relevant past conversations."""
    docs = await memory_store.asimilarity_search(query, k=k)
    return [doc.page_content for doc in docs]

async def store_memory(content: str, metadata: dict = {}):
    """Store conversation in long-term memory."""
    await memory_store.aadd_texts([content], metadatas=[metadata])
```

## Callback System & LangSmith

### LangSmith Tracing

```python
import os
from langchain_anthropic import ChatAnthropic

# Enable LangSmith tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-project"

# All LangChain/LangGraph operations are automatically traced
llm = ChatAnthropic(model="claude-sonnet-4-5")
```

### Custom Callback Handler

```python
from langchain_core.callbacks import BaseCallbackHandler
from typing import Any, Dict, List

class CustomCallbackHandler(BaseCallbackHandler):
    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs
    ) -> None:
        print(f"LLM started with {len(prompts)} prompts")

    def on_llm_end(self, response, **kwargs) -> None:
        print(f"LLM completed: {len(response.generations)} generations")

    def on_llm_error(self, error: Exception, **kwargs) -> None:
        print(f"LLM error: {error}")

    def on_tool_start(
        self, serialized: Dict[str, Any], input_str: str, **kwargs
    ) -> None:
        print(f"Tool started: {serialized.get('name')}")

    def on_tool_end(self, output: str, **kwargs) -> None:
        print(f"Tool completed: {output[:100]}...")

# Use callbacks
result = await agent.ainvoke(
    {"messages": [("user", "query")]},
    config={"callbacks": [CustomCallbackHandler()]}
)
```

## Streaming Responses

```python
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-5", streaming=True)

# Stream tokens
async for chunk in llm.astream("Tell me a story"):
    print(chunk.content, end="", flush=True)

# Stream agent events
async for event in agent.astream_events(
    {"messages": [("user", "Search and summarize")]},
    version="v2"
):
    if event["event"] == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="")
    elif event["event"] == "on_tool_start":
        print(f"\n[Using tool: {event['name']}]")
```

## Testing Strategies

```python
import pytest
from unittest.mock import AsyncMock, patch

@pytest.mark.asyncio
async def test_agent_tool_selection():
    """Test agent selects correct tool."""
    with patch.object(llm, 'ainvoke') as mock_llm:
        mock_llm.return_value = AsyncMock(content="Using search_database")

        result = await agent.ainvoke({
            "messages": [("user", "search for documents")]
        })

        # Verify tool was called
        assert "search_database" in str(result)

@pytest.mark.asyncio
async def test_memory_persistence():
    """Test memory persists across invocations."""
    config = {"configurable": {"thread_id": "test-thread"}}

    # First message
    await agent.ainvoke(
        {"messages": [("user", "Remember: the code is 12345")]},
        config
    )

    # Second message should remember
    result = await agent.ainvoke(
        {"messages": [("user", "What was the code?")]},
        config
    )

    assert "12345" in result["messages"][-1].content
```

## Performance Optimization

### 1. Caching with Redis

```python
from langchain_community.cache import RedisCache
from langchain_core.globals import set_llm_cache
import redis

redis_client = redis.Redis.from_url("redis://localhost:6379")
set_llm_cache(RedisCache(redis_client))
```

### 2. Async Batch Processing

```python
import asyncio
from langchain_core.documents import Document

async def process_documents(documents: list[Document]) -> list:
    """Process documents in parallel."""
    tasks = [process_single(doc) for doc in documents]
    return await asyncio.gather(*tasks)

async def process_single(doc: Document) -> dict:
    """Process a single document (text_splitter and embeddings_model defined elsewhere)."""
    chunks = text_splitter.split_documents([doc])
    embeddings = await embeddings_model.aembed_documents(
        [c.page_content for c in chunks]
    )
    return {"doc_id": doc.metadata.get("id"), "embeddings": embeddings}
```

### 3. Connection Pooling

```python
import os
from langchain_pinecone import PineconeVectorStore
from pinecone import Pinecone

# Reuse the Pinecone client
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("my-index")

# Create vector store with existing index
vectorstore = PineconeVectorStore(index=index, embedding=embeddings)
```

## Resources

- **references/agents.md**: Deep dive on agent architectures
- **references/memory.md**: Memory system patterns
- **references/chains.md**: Chain composition strategies
- **references/document-processing.md**: Document loading and indexing
- **references/callbacks.md**: Monitoring and observability
- **assets/agent-template.py**: Production-ready agent template
- **assets/memory-config.yaml**: Memory configuration examples
- **assets/chain-example.py**: Complex chain examples
- [LangChain Documentation](https://python.langchain.com/docs/)
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [LangSmith Platform](https://smith.langchain.com/)
- [LangChain GitHub](https://github.com/langchain-ai/langchain)
- [LangGraph GitHub](https://github.com/langchain-ai/langgraph)

## Common Pitfalls

1. **Using Deprecated APIs**: Use LangGraph for agents, not `initialize_agent`
2. **Memory Overflow**: Use checkpointers with TTL for long-running agents
3. **Poor Tool Descriptions**: Clear descriptions help the LLM select the correct tool
4. **Context Window Exceeded**: Use summarization or sliding-window memory
5. **No Error Handling**: Wrap tool functions with try/except
6. **Blocking Operations**: Use async methods (`ainvoke`, `astream`)
7. **Missing Observability**: Always enable LangSmith tracing in production
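Pitfall 5 is cheap to fix: returning an error string keeps the agent loop alive so the model can retry or explain, instead of the run crashing on an unhandled exception. A dependency-free sketch of such a wrapper (the decorator name is illustrative):

```python
import functools

def safe_tool(func):
    """Wrap a tool so failures come back as strings the LLM can react to."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as e:  # never let a tool error kill the agent loop
            return f"Tool '{func.__name__}' failed: {e}"
    return wrapper

@safe_tool
def divide(a: float, b: float) -> str:
    return str(a / b)

print(divide(10, 2))  # "5.0"
print(divide(10, 0))  # error string instead of an exception
```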

## Production Checklist

- [ ] Use LangGraph StateGraph for agent orchestration
- [ ] Implement async patterns throughout (`ainvoke`, `astream`)
- [ ] Add production checkpointer (PostgreSQL, Redis)
- [ ] Enable LangSmith tracing
- [ ] Implement structured tools with Pydantic schemas
- [ ] Add timeout limits for agent execution
- [ ] Add request/response logging
- [ ] Monitor token usage and costs
- [ ] Implement rate limiting
- [ ] Add input validation
- [ ] Test with edge cases
- [ ] Implement fallback strategies
- [ ] Add comprehensive error handling
- [ ] Set up health checks
- [ ] Version control prompts and configurations
- [ ] Write integration tests for agent workflows