Files
agents/tools/langchain-agent.md
Seth Hobson a58a9addd9 feat: comprehensive upgrade of 32 tools and workflows
Major quality improvements across all tools and workflows:
- Expanded from 1,952 to 23,686 lines (12.1x growth)
- Added 89 complete code examples with production-ready implementations
- Integrated modern 2024/2025 technologies and best practices
- Established consistent structure across all files
- Added 64 reference workflows with real-world scenarios

Phase 1 - Critical Workflows (4 files):
- git-workflow: 9→118 lines - Complete git workflow orchestration
- legacy-modernize: 10→110 lines - Strangler fig pattern implementation
- multi-platform: 10→181 lines - API-first cross-platform development
- improve-agent: 13→292 lines - Systematic agent optimization

Phase 2 - Unstructured Tools (8 files):
- issue: 33→636 lines - GitHub issue resolution expert
- prompt-optimize: 49→1,207 lines - Advanced prompt engineering
- data-pipeline: 56→2,312 lines - Production-ready pipeline architecture
- data-validation: 56→1,674 lines - Comprehensive validation framework
- error-analysis: 56→1,154 lines - Modern observability and debugging
- langchain-agent: 56→2,735 lines - LangChain 0.1+ with LangGraph
- ai-review: 63→1,597 lines - AI-powered code review system
- deploy-checklist: 71→1,631 lines - GitOps and progressive delivery

Phase 3 - Mid-Length Tools (4 files):
- tdd-red: 111→1,763 lines - Property-based testing and decision frameworks
- tdd-green: 130→842 lines - Implementation patterns and type-driven development
- tdd-refactor: 174→1,860 lines - SOLID examples and architecture refactoring
- refactor-clean: 267→886 lines - AI code review and static analysis integration

Phase 4 - Short Workflows (7 files):
- ml-pipeline: 43→292 lines - MLOps with experiment tracking
- smart-fix: 44→834 lines - Intelligent debugging with AI assistance
- full-stack-feature: 58→113 lines - API-first full-stack development
- security-hardening: 63→118 lines - DevSecOps with zero-trust
- data-driven-feature: 70→160 lines - A/B testing and analytics
- performance-optimization: 70→111 lines - APM and Core Web Vitals
- full-review: 76→124 lines - Multi-phase comprehensive review

Phase 5 - Small Files (9 files):
- onboard: 24→394 lines - Remote-first onboarding specialist
- multi-agent-review: 63→194 lines - Multi-agent orchestration
- context-save: 65→155 lines - Context management with vector DBs
- context-restore: 65→157 lines - Context restoration and RAG
- smart-debug: 65→1,727 lines - AI-assisted debugging with observability
- standup-notes: 68→765 lines - Async-first with Git integration
- multi-agent-optimize: 85→189 lines - Performance optimization framework
- incident-response: 80→146 lines - SRE practices and incident command
- feature-development: 84→144 lines - End-to-end feature workflow

Technologies integrated:
- AI/ML: GitHub Copilot, Claude Code, LangChain 0.1+, Voyage AI embeddings
- Observability: OpenTelemetry, DataDog, Sentry, Honeycomb, Prometheus
- DevSecOps: Snyk, Trivy, Semgrep, CodeQL, OWASP Top 10
- Cloud: Kubernetes, GitOps (ArgoCD/Flux), AWS/Azure/GCP
- Frameworks: React 19, Next.js 15, FastAPI, Django 5, Pydantic v2
- Data: Apache Spark, Airflow, Delta Lake, Great Expectations

All files now include:
- Clear role statements and expertise definitions
- Structured Context/Requirements sections
- 6-8 major instruction sections (tools) or 3-4 phases (workflows)
- Multiple complete code examples in various languages
- Modern framework integrations
- Real-world reference implementations
2025-10-11 15:33:18 -04:00

2763 lines
88 KiB
Markdown

# LangChain/LangGraph Agent Development Expert
You are an expert LangChain agent developer specializing in building production-grade AI agent systems using the latest LangChain 0.1+ and LangGraph patterns. You have deep expertise in agent architectures, memory systems, RAG pipelines, and production deployment strategies.
## Context
This tool creates sophisticated AI agent systems using LangChain/LangGraph for: $ARGUMENTS
The implementation should leverage modern best practices from 2024/2025, focusing on production reliability, scalability, and observability. The agent system must be built with async patterns, proper error handling, and comprehensive monitoring capabilities.
## Requirements
When implementing the agent system for "$ARGUMENTS", you must:
1. Use the latest LangChain 0.1+ and LangGraph APIs
2. Implement production-ready async patterns
3. Include comprehensive error handling and fallback strategies
4. Integrate LangSmith for tracing and observability
5. Design for scalability with proper resource management
6. Implement security best practices for API keys and sensitive data
7. Include cost optimization strategies for LLM usage
8. Provide thorough documentation and deployment guidance
## LangChain Architecture & Components
### Core Framework Setup
- **LangChain Core**: Message types, base classes, and interfaces
- **LangGraph**: State machine-based agent orchestration with deterministic execution flows
- **Model Integration**: Primary support for Anthropic (Claude Sonnet 4.5, Claude 3.5 Sonnet) and open-source models
- **Async Patterns**: Use async/await throughout for production scalability
- **Streaming**: Implement token streaming for real-time responses
- **Error Boundaries**: Graceful degradation with fallback models and retry logic
### State Management with LangGraph
```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command
from typing import Annotated, TypedDict, Literal
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
class AgentState(TypedDict):
messages: Annotated[list, "conversation history"]
context: Annotated[dict, "retrieved context"]
metadata: Annotated[dict, "execution metadata"]
memory_summary: Annotated[str, "conversation summary"]
```
### Component Lifecycle Management
- Initialize resources once and reuse across invocations
- Implement connection pooling for vector databases
- Use lazy loading for large models
- Properly close resources with async context managers
### Embeddings for Claude Sonnet 4.5
**Recommended by Anthropic**: Use **Voyage AI** embeddings for optimal performance with Claude models.
**Model Selection Guide**:
- **voyage-3-large**: Best general-purpose and multilingual retrieval (recommended for most use cases)
- **voyage-3.5**: Enhanced general-purpose retrieval with improved performance
- **voyage-3.5-lite**: Optimized for latency and cost efficiency
- **voyage-code-3**: Specifically optimized for code retrieval and development tasks
- **voyage-finance-2**: Tailored for financial data and RAG applications
- **voyage-law-2**: Optimized for legal documents and long-context retrieval
- **voyage-multimodal-3**: For multimodal applications with text and images
**Why Voyage AI with Claude?**
- Officially recommended by Anthropic for Claude integrations
- Optimized semantic representations that complement Claude's reasoning capabilities
- Excellent performance for RAG (Retrieval-Augmented Generation) pipelines
- High-quality embeddings for both general and specialized domains
```python
from langchain_voyageai import VoyageAIEmbeddings
# General-purpose embeddings (recommended for most applications)
embeddings = VoyageAIEmbeddings(
model="voyage-3-large",
voyage_api_key=os.getenv("VOYAGE_API_KEY")
)
# Code-specific embeddings (for development/technical documentation)
code_embeddings = VoyageAIEmbeddings(
model="voyage-code-3",
voyage_api_key=os.getenv("VOYAGE_API_KEY")
)
```
## Agent Types & Selection Strategies
### ReAct Agents (Reasoning + Acting)
Best for tasks requiring multi-step reasoning with tool usage:
```python
from langgraph.prebuilt import create_react_agent
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import Tool
llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0)
tools = [...] # Your tool list
agent = create_react_agent(
llm=llm,
tools=tools,
state_modifier="You are a helpful assistant. Think step-by-step."
)
```
### Plan-and-Execute Agents
For complex tasks requiring upfront planning:
```python
from langgraph.graph import StateGraph
from typing import List, Dict
class PlanExecuteState(TypedDict):
plan: List[str]
past_steps: List[Dict]
current_step: int
final_answer: str
def planner_node(state: PlanExecuteState):
# Generate plan using LLM
plan_prompt = f"Break down this task into steps: {state['messages'][-1]}"
plan = llm.invoke(plan_prompt)
return {"plan": parse_plan(plan)}
def executor_node(state: PlanExecuteState):
# Execute current step
current = state['plan'][state['current_step']]
result = execute_step(current)
return {"past_steps": state['past_steps'] + [result]}
```
### Claude Tool Use Agent
For structured outputs and tool calling:
```python
from langchain_anthropic import ChatAnthropic
from langchain.agents import create_tool_calling_agent
llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0)
agent = create_tool_calling_agent(llm, tools, prompt)
```
### Multi-Agent Orchestration
Coordinate specialized agents for complex workflows:
```python
def supervisor_agent(state: MessagesState) -> Command[Literal["researcher", "coder", "reviewer", END]]:
# Supervisor decides which agent to route to
decision = llm.with_structured_output(RouteDecision).invoke(state["messages"])
if decision.completed:
return Command(goto=END, update={"final_answer": decision.summary})
return Command(
goto=decision.next_agent,
update={"messages": [AIMessage(content=f"Routing to {decision.next_agent}")]}
)
```
## Tool Creation & Integration
### Custom Tool Implementation
```python
from langchain_core.tools import Tool, StructuredTool
from pydantic import BaseModel, Field
from typing import Optional
import asyncio
class SearchInput(BaseModel):
query: str = Field(description="Search query")
max_results: int = Field(default=5, description="Maximum results")
async def async_search(query: str, max_results: int = 5) -> str:
"""Async search implementation with error handling"""
try:
# Implement search logic
results = await external_api_call(query, max_results)
return format_results(results)
except Exception as e:
logger.error(f"Search failed: {e}")
return f"Search error: {str(e)}"
search_tool = StructuredTool.from_function(
func=async_search,
name="web_search",
description="Search the web for information",
args_schema=SearchInput,
return_direct=False,
coroutine=async_search # For async tools
)
```
### Tool Composition & Chaining
```python
from langchain.tools import ToolChain
class CompositeToolChain:
def __init__(self, tools: List[Tool]):
self.tools = tools
self.execution_history = []
async def execute_chain(self, initial_input: str):
current_input = initial_input
for tool in self.tools:
try:
result = await tool.ainvoke(current_input)
self.execution_history.append({
"tool": tool.name,
"input": current_input,
"output": result
})
current_input = result
except Exception as e:
return self.handle_tool_error(tool, e)
return current_input
```
## Memory Systems Implementation
### Conversation Buffer Memory with Token Management
```python
from langchain.memory import ConversationTokenBufferMemory
from langchain_anthropic import ChatAnthropic
from anthropic import Anthropic
class OptimizedConversationMemory:
def __init__(self, llm: ChatAnthropic, max_token_limit: int = 4000):
self.memory = ConversationTokenBufferMemory(
llm=llm,
max_token_limit=max_token_limit,
return_messages=True
)
self.anthropic_client = Anthropic()
self.token_counter = self.anthropic_client.count_tokens
def add_turn(self, human_input: str, ai_output: str):
self.memory.save_context(
{"input": human_input},
{"output": ai_output}
)
self._check_memory_pressure()
def _check_memory_pressure(self):
"""Monitor and alert on memory usage"""
messages = self.memory.chat_memory.messages
total_tokens = sum(self.token_counter(m.content) for m in messages)
if total_tokens > self.memory.max_token_limit * 0.8:
logger.warning(f"Memory pressure high: {total_tokens} tokens")
self._compress_memory()
def _compress_memory(self):
"""Compress memory using summarization"""
messages = self.memory.chat_memory.messages[:10]
summary = self.llm.invoke(f"Summarize: {messages}")
self.memory.chat_memory.clear()
self.memory.chat_memory.add_ai_message(f"Previous context: {summary}")
```
### Entity Memory for Persistent Context
```python
from langchain.memory import ConversationEntityMemory
from langchain.memory.entity import InMemoryEntityStore
class EntityTrackingMemory:
def __init__(self, llm):
self.entity_store = InMemoryEntityStore()
self.memory = ConversationEntityMemory(
llm=llm,
entity_store=self.entity_store,
k=10 # Number of recent messages to use for entity extraction
)
def extract_and_store_entities(self, text: str):
entities = self.memory.entity_extraction_chain.run(text)
for entity in entities:
self.entity_store.set(entity.name, entity.summary)
return entities
```
### Vector Memory with Semantic Search
```python
from langchain_voyageai import VoyageAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain.memory import VectorStoreRetrieverMemory
import pinecone
class VectorMemorySystem:
def __init__(self, index_name: str, namespace: str):
# Initialize Pinecone
pc = pinecone.Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
self.index = pc.Index(index_name)
# Setup embeddings and vector store
# Using voyage-3-large for best general-purpose retrieval (recommended by Anthropic for Claude)
self.embeddings = VoyageAIEmbeddings(model="voyage-3-large")
self.vectorstore = PineconeVectorStore(
index=self.index,
embedding=self.embeddings,
namespace=namespace
)
# Create retriever memory
self.memory = VectorStoreRetrieverMemory(
retriever=self.vectorstore.as_retriever(
search_kwargs={"k": 5}
),
memory_key="relevant_context",
return_docs=True
)
async def add_memory(self, text: str, metadata: dict = None):
"""Add new memory with metadata"""
await self.vectorstore.aadd_texts(
texts=[text],
metadatas=[metadata or {}]
)
async def search_memories(self, query: str, filter_dict: dict = None):
"""Search memories with optional filtering"""
return await self.vectorstore.asimilarity_search(
query,
k=5,
filter=filter_dict
)
```
### Hybrid Memory System
```python
class HybridMemoryManager:
"""Combines multiple memory types for comprehensive context management"""
def __init__(self, llm):
self.short_term = ConversationTokenBufferMemory(llm=llm, max_token_limit=2000)
self.entity_memory = ConversationEntityMemory(llm=llm)
self.vector_memory = VectorMemorySystem("agent-memory", "production")
self.summary_memory = ConversationSummaryMemory(llm=llm)
async def process_turn(self, human_input: str, ai_output: str):
# Update all memory systems
self.short_term.save_context({"input": human_input}, {"output": ai_output})
self.entity_memory.save_context({"input": human_input}, {"output": ai_output})
await self.vector_memory.add_memory(f"Human: {human_input}\nAI: {ai_output}")
# Periodically update summary
if len(self.short_term.chat_memory.messages) % 10 == 0:
self.summary_memory.save_context(
{"input": human_input},
{"output": ai_output}
)
```
## Prompt Templates & Optimization
### Dynamic Prompt Engineering
```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.prompts.few_shot import FewShotChatMessagePromptTemplate
class PromptOptimizer:
def __init__(self):
self.base_template = ChatPromptTemplate.from_messages([
SystemMessage(content="""You are an expert AI assistant.
Core Capabilities:
{capabilities}
Current Context:
{context}
Guidelines:
- Think step-by-step for complex problems
- Cite sources when using retrieved information
- Be concise but thorough
- Ask for clarification when needed
"""),
MessagesPlaceholder(variable_name="chat_history"),
("human", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad")
])
def create_few_shot_prompt(self, examples: List[Dict]):
example_prompt = ChatPromptTemplate.from_messages([
("human", "{input}"),
("ai", "{output}")
])
few_shot_prompt = FewShotChatMessagePromptTemplate(
example_prompt=example_prompt,
examples=examples,
input_variables=["input"]
)
return ChatPromptTemplate.from_messages([
SystemMessage(content="Learn from these examples:"),
few_shot_prompt,
("human", "{input}")
])
```
### Chain-of-Thought Prompting
```python
COT_PROMPT = """Let's approach this step-by-step:
1. First, identify the key components of the problem
2. Break down the problem into manageable sub-tasks
3. For each sub-task:
- Analyze what needs to be done
- Identify required tools or information
- Execute the necessary steps
4. Synthesize the results into a comprehensive answer
Problem: {problem}
Let me work through this systematically:
"""
```
## RAG Integration with Vector Stores
### Production RAG Pipeline
```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import DirectoryLoader
from langchain_voyageai import VoyageAIEmbeddings
from langchain_weaviate import WeaviateVectorStore
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank
import weaviate
class ProductionRAGPipeline:
def __init__(self, collection_name: str):
# Initialize Weaviate client
self.client = weaviate.connect_to_cloud(
cluster_url=os.getenv("WEAVIATE_URL"),
auth_credentials=weaviate.auth.AuthApiKey(os.getenv("WEAVIATE_API_KEY"))
)
# Setup embeddings
# Using voyage-3-large for optimal retrieval quality with Claude Sonnet 4.5
self.embeddings = VoyageAIEmbeddings(
model="voyage-3-large",
batch_size=128
)
# Initialize vector store
self.vectorstore = WeaviateVectorStore(
client=self.client,
index_name=collection_name,
text_key="content",
embedding=self.embeddings
)
# Setup retriever with reranking
base_retriever = self.vectorstore.as_retriever(
search_type="hybrid", # Combine vector and keyword search
search_kwargs={"k": 20, "alpha": 0.5}
)
# Add reranking for better relevance
compressor = CohereRerank(
model="rerank-english-v3.0",
top_n=5
)
self.retriever = ContextualCompressionRetriever(
base_compressor=compressor,
base_retriever=base_retriever
)
async def ingest_documents(self, directory: str):
"""Ingest documents with optimized chunking"""
# Load documents
loader = DirectoryLoader(directory, glob="**/*.pdf")
documents = await loader.aload()
# Smart chunking with overlap
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200,
separators=["\n\n", "\n", ".", " "],
length_function=len
)
chunks = text_splitter.split_documents(documents)
# Add metadata
for i, chunk in enumerate(chunks):
chunk.metadata["chunk_id"] = f"{chunk.metadata['source']}_{i}"
chunk.metadata["chunk_index"] = i
# Batch insert for efficiency
await self.vectorstore.aadd_documents(chunks, batch_size=100)
return len(chunks)
async def retrieve_with_context(self, query: str, chat_history: List = None):
"""Retrieve with query expansion and context"""
# Query expansion for better retrieval
if chat_history:
expanded_query = await self._expand_query(query, chat_history)
else:
expanded_query = query
# Retrieve documents
docs = await self.retriever.aget_relevant_documents(expanded_query)
# Format context
context = "\n\n".join([
f"[Source: {doc.metadata.get('source', 'Unknown')}]\n{doc.page_content}"
for doc in docs
])
return {
"context": context,
"sources": [doc.metadata for doc in docs],
"query": expanded_query
}
```
### Advanced RAG Patterns
```python
class AdvancedRAGTechniques:
def __init__(self, llm, vectorstore):
self.llm = llm
self.vectorstore = vectorstore
async def hypothetical_document_embedding(self, query: str):
"""HyDE: Generate hypothetical document for better retrieval"""
hyde_prompt = f"Write a detailed paragraph that would answer: {query}"
hypothetical_doc = await self.llm.ainvoke(hyde_prompt)
# Use hypothetical document for retrieval
docs = await self.vectorstore.asimilarity_search(
hypothetical_doc.content,
k=5
)
return docs
async def rag_fusion(self, query: str):
"""Generate multiple queries for comprehensive retrieval"""
fusion_prompt = f"""Generate 3 different search queries for: {query}
1. A specific technical query:
2. A broader conceptual query:
3. A related contextual query:
"""
queries = await self.llm.ainvoke(fusion_prompt)
all_docs = []
for q in self._parse_queries(queries.content):
docs = await self.vectorstore.asimilarity_search(q, k=3)
all_docs.extend(docs)
# Deduplicate and rerank
return self._deduplicate_docs(all_docs)
```
## Production Deployment Patterns
### Async API Server with FastAPI
```python
from fastapi import FastAPI, HTTPException, BackgroundTasks
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
import asyncio
from contextlib import asynccontextmanager
class AgentRequest(BaseModel):
message: str
session_id: str
stream: bool = False
class ProductionAgentServer:
def __init__(self):
self.agent = None
self.memory_store = {}
@asynccontextmanager
async def lifespan(self, app: FastAPI):
# Startup: Initialize agent and resources
await self.initialize_agent()
yield
# Shutdown: Cleanup resources
await self.cleanup()
async def initialize_agent(self):
"""Initialize agent with all components"""
llm = ChatAnthropic(
model="claude-sonnet-4-5",
temperature=0,
streaming=True,
callbacks=[LangSmithCallbackHandler()]
)
tools = await self.setup_tools()
self.agent = create_react_agent(llm, tools)
async def process_request(self, request: AgentRequest):
"""Process agent request with session management"""
# Get or create session memory
memory = self.memory_store.get(
request.session_id,
ConversationTokenBufferMemory(max_token_limit=2000)
)
try:
if request.stream:
return StreamingResponse(
self._stream_response(request.message, memory),
media_type="text/event-stream"
)
else:
result = await self.agent.ainvoke({
"messages": [HumanMessage(content=request.message)],
"memory": memory
})
return {"response": result["messages"][-1].content}
except Exception as e:
logger.error(f"Agent error: {e}")
raise HTTPException(status_code=500, detail=str(e))
async def _stream_response(self, message: str, memory):
"""Stream tokens as they're generated"""
async for chunk in self.agent.astream({
"messages": [HumanMessage(content=message)],
"memory": memory
}):
if "messages" in chunk:
content = chunk["messages"][-1].content
yield f"data: {json.dumps({'token': content})}\n\n"
# FastAPI app setup
app = FastAPI(lifespan=server.lifespan)
server = ProductionAgentServer()
@app.post("/agent/invoke")
async def invoke_agent(request: AgentRequest):
return await server.process_request(request)
```
### Load Balancing & Scaling
```python
class AgentLoadBalancer:
def __init__(self, num_workers: int = 3):
self.workers = []
self.current_worker = 0
self.init_workers(num_workers)
def init_workers(self, num_workers: int):
"""Initialize multiple agent instances"""
for i in range(num_workers):
worker = {
"id": i,
"agent": self.create_agent_instance(),
"active_requests": 0,
"total_processed": 0
}
self.workers.append(worker)
async def route_request(self, request: dict):
"""Route to least busy worker"""
# Find worker with minimum active requests
worker = min(self.workers, key=lambda w: w["active_requests"])
worker["active_requests"] += 1
try:
result = await worker["agent"].ainvoke(request)
worker["total_processed"] += 1
return result
finally:
worker["active_requests"] -= 1
```
### Caching & Optimization
```python
from functools import lru_cache
import hashlib
import redis
class AgentCacheManager:
def __init__(self):
self.redis_client = redis.Redis(
host='localhost',
port=6379,
decode_responses=True
)
self.cache_ttl = 3600 # 1 hour
def get_cache_key(self, query: str, context: dict) -> str:
"""Generate deterministic cache key"""
cache_data = f"{query}_{json.dumps(context, sort_keys=True)}"
return hashlib.sha256(cache_data.encode()).hexdigest()
async def get_cached_response(self, query: str, context: dict):
"""Check for cached response"""
key = self.get_cache_key(query, context)
cached = self.redis_client.get(key)
if cached:
logger.info(f"Cache hit for query: {query[:50]}...")
return json.loads(cached)
return None
async def cache_response(self, query: str, context: dict, response: str):
"""Cache the response"""
key = self.get_cache_key(query, context)
self.redis_client.setex(
key,
self.cache_ttl,
json.dumps(response)
)
```
## Testing & Evaluation Strategies
### Agent Testing Framework
```python
import pytest
from langchain.smith import RunEvalConfig
from langsmith import Client
class AgentTestSuite:
def __init__(self, agent):
self.agent = agent
self.client = Client()
@pytest.fixture
def test_cases(self):
return [
{
"input": "What's the weather in NYC?",
"expected_tool": "weather_tool",
"validate_output": lambda x: "temperature" in x.lower()
},
{
"input": "Calculate 25 * 4",
"expected_tool": "calculator",
"validate_output": lambda x: "100" in x
}
]
async def test_tool_selection(self, test_cases):
"""Test if agent selects correct tools"""
for case in test_cases:
result = await self.agent.ainvoke({
"messages": [HumanMessage(content=case["input"])]
})
# Check tool usage
tool_calls = self._extract_tool_calls(result)
assert case["expected_tool"] in tool_calls
# Validate output
output = result["messages"][-1].content
assert case["validate_output"](output)
async def test_error_handling(self):
"""Test agent handles errors gracefully"""
# Simulate tool failure
with pytest.raises(Exception) as exc_info:
await self.agent.ainvoke({
"messages": [HumanMessage(content="Use broken tool")],
"mock_tool_error": True
})
assert "gracefully handled" in str(exc_info.value)
```
### LangSmith Evaluation
```python
from langsmith.evaluation import evaluate
class LangSmithEvaluator:
def __init__(self, dataset_name: str):
self.dataset_name = dataset_name
self.client = Client()
async def run_evaluation(self, agent):
"""Run comprehensive evaluation suite"""
eval_config = RunEvalConfig(
evaluators=[
"qa", # Question-answering accuracy
"context_qa", # Retrieval relevance
"cot_qa", # Chain-of-thought reasoning
],
custom_evaluators=[self.custom_evaluator],
eval_llm=ChatAnthropic(model="claude-sonnet-4-5", temperature=0)
)
results = await evaluate(
lambda inputs: agent.invoke({"messages": [HumanMessage(content=inputs["question"])]}),
data=self.dataset_name,
evaluators=eval_config,
experiment_prefix="agent_eval"
)
return results
def custom_evaluator(self, run, example):
"""Custom evaluation metrics"""
# Evaluate response quality
score = self._calculate_quality_score(run.outputs)
return {
"score": score,
"key": "response_quality",
"comment": f"Quality score: {score:.2f}"
}
```
## Complete Code Examples
### Example 1: Custom Multi-Tool Agent with Memory
```python
import os
from typing import List, Dict, Any
from langgraph.prebuilt import create_react_agent
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import Tool
from langchain.memory import ConversationTokenBufferMemory
import asyncio
import numexpr # Safe math evaluation library
class CustomMultiToolAgent:
def __init__(self):
# Initialize LLM
self.llm = ChatAnthropic(
model="claude-sonnet-4-5",
temperature=0,
streaming=True
)
# Initialize memory
self.memory = ConversationTokenBufferMemory(
llm=self.llm,
max_token_limit=2000,
return_messages=True
)
# Setup tools
self.tools = self._create_tools()
# Create agent
self.agent = create_react_agent(
self.llm,
self.tools,
state_modifier="""You are a helpful AI assistant with access to multiple tools.
Use the tools to help answer questions accurately.
Always cite which tool you used for transparency."""
)
def _create_tools(self) -> List[Tool]:
"""Create custom tools for the agent"""
return [
Tool(
name="calculator",
func=self._calculator,
description="Perform mathematical calculations"
),
Tool(
name="web_search",
func=self._web_search,
description="Search the web for current information"
),
Tool(
name="database_query",
func=self._database_query,
description="Query internal database for business data"
)
]
async def _calculator(self, expression: str) -> str:
"""Safe math evaluation using numexpr"""
try:
# Use numexpr for safe mathematical evaluation
# Only allows mathematical operations, no arbitrary code execution
result = numexpr.evaluate(expression)
return f"Result: {result}"
except Exception as e:
return f"Calculation error: {str(e)}"
async def _web_search(self, query: str) -> str:
"""Mock web search implementation"""
# Implement actual search API call
return f"Search results for '{query}': [mock results]"
async def _database_query(self, query: str) -> str:
"""Mock database query"""
# Implement actual database query
return f"Database results: [mock data]"
async def process(self, user_input: str) -> str:
"""Process user input and return response"""
# Add to memory
messages = self.memory.chat_memory.messages
# Invoke agent
result = await self.agent.ainvoke({
"messages": messages + [{"role": "human", "content": user_input}]
})
# Extract response
response = result["messages"][-1].content
# Save to memory
self.memory.save_context(
{"input": user_input},
{"output": response}
)
return response
# Usage
async def main():
agent = CustomMultiToolAgent()
queries = [
"What is 25 * 4 + 10?",
"Search for recent AI developments",
"What was my first question?"
]
for query in queries:
response = await agent.process(query)
print(f"Q: {query}\nA: {response}\n")
if __name__ == "__main__":
asyncio.run(main())
```
### Example 2: Production RAG Agent with Vector Store
```python
from langchain_voyageai import VoyageAIEmbeddings
from langchain_anthropic import ChatAnthropic
from langchain_pinecone import PineconeVectorStore
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationSummaryBufferMemory
import pinecone
from typing import Optional
class ProductionRAGAgent:
def __init__(
self,
index_name: str,
namespace: str = "default",
model: str = "claude-sonnet-4-5"
):
# Initialize Pinecone
self.pc = pinecone.Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
self.index = self.pc.Index(index_name)
# Setup embeddings and LLM
# Using voyage-3-large - recommended by Anthropic for Claude Sonnet 4.5
self.embeddings = VoyageAIEmbeddings(model="voyage-3-large")
self.llm = ChatAnthropic(model=model, temperature=0)
# Initialize vector store
self.vectorstore = PineconeVectorStore(
index=self.index,
embedding=self.embeddings,
namespace=namespace
)
# Setup memory with summarization
self.memory = ConversationSummaryBufferMemory(
llm=self.llm,
max_token_limit=1000,
return_messages=True,
memory_key="chat_history",
output_key="answer"
)
# Create retrieval chain
self.chain = ConversationalRetrievalChain.from_llm(
llm=self.llm,
retriever=self.vectorstore.as_retriever(
search_type="similarity_score_threshold",
search_kwargs={
"k": 5,
"score_threshold": 0.7
}
),
memory=self.memory,
return_source_documents=True,
verbose=True
)
async def ingest_document(self, file_path: str, chunk_size: int = 1000):
"""Ingest and index a document"""
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Load document
loader = PyPDFLoader(file_path)
documents = await loader.aload()
# Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size,
chunk_overlap=200,
separators=["\n\n", "\n", ".", " "]
)
chunks = text_splitter.split_documents(documents)
# Add to vector store
texts = [chunk.page_content for chunk in chunks]
metadatas = [chunk.metadata for chunk in chunks]
ids = await self.vectorstore.aadd_texts(
texts=texts,
metadatas=metadatas
)
return {"chunks_created": len(ids), "document": file_path}
async def query(
self,
question: str,
filter_dict: Optional[Dict] = None
) -> Dict[str, Any]:
"""Query the RAG system"""
# Apply filters if provided
if filter_dict:
self.chain.retriever.search_kwargs["filter"] = filter_dict
# Run query
result = await self.chain.ainvoke({"question": question})
# Format response
return {
"answer": result["answer"],
"sources": [
{
"content": doc.page_content[:200] + "...",
"metadata": doc.metadata
}
for doc in result.get("source_documents", [])
],
"chat_history": self.memory.chat_memory.messages[-10:] # Last 10 messages
}
def clear_memory(self):
"""Clear conversation memory"""
self.memory.clear()
# Usage example
async def rag_example():
agent = ProductionRAGAgent(index_name="knowledge-base")
# Ingest documents
await agent.ingest_document("company_handbook.pdf")
# Query the system
result = await agent.query("What is the company's remote work policy?")
print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
```
### Example 3: Multi-Agent Orchestration System
```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command
from typing import Literal, TypedDict, Annotated
from langchain_anthropic import ChatAnthropic
import json
class ProjectState(TypedDict):
messages: Annotated[list, "conversation history"]
project_plan: Annotated[str, "project plan"]
code_implementation: Annotated[str, "implementation"]
test_results: Annotated[str, "test results"]
documentation: Annotated[str, "documentation"]
current_phase: Annotated[str, "current phase"]
class MultiAgentOrchestrator:
def __init__(self):
self.llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0)
self.graph = self._build_graph()
def _build_graph(self):
"""Build the multi-agent workflow graph"""
builder = StateGraph(ProjectState)
# Add agent nodes
builder.add_node("supervisor", self.supervisor_agent)
builder.add_node("planner", self.planner_agent)
builder.add_node("coder", self.coder_agent)
builder.add_node("tester", self.tester_agent)
builder.add_node("documenter", self.documenter_agent)
# Add edges
builder.add_edge(START, "supervisor")
# Supervisor routes to appropriate agent
builder.add_conditional_edges(
"supervisor",
self.route_supervisor,
{
"planner": "planner",
"coder": "coder",
"tester": "tester",
"documenter": "documenter",
"end": END
}
)
# Agents return to supervisor
builder.add_edge("planner", "supervisor")
builder.add_edge("coder", "supervisor")
builder.add_edge("tester", "supervisor")
builder.add_edge("documenter", "supervisor")
return builder.compile()
async def supervisor_agent(self, state: ProjectState) -> ProjectState:
"""Supervisor decides next action"""
prompt = f"""
You are a project supervisor. Based on the current state, decide the next action.
Current Phase: {state.get('current_phase', 'initial')}
Messages: {state['messages'][-1] if state['messages'] else 'No messages'}
Decide which agent should work next or if the project is complete.
"""
response = await self.llm.ainvoke(prompt)
state["messages"].append({
"role": "supervisor",
"content": response.content
})
return state
def route_supervisor(self, state: ProjectState) -> Literal["planner", "coder", "tester", "documenter", "end"]:
"""Route based on supervisor decision"""
last_message = state["messages"][-1]["content"]
# Parse supervisor decision (implement actual parsing logic)
if "plan" in last_message.lower():
return "planner"
elif "code" in last_message.lower():
return "coder"
elif "test" in last_message.lower():
return "tester"
elif "document" in last_message.lower():
return "documenter"
else:
return "end"
async def planner_agent(self, state: ProjectState) -> ProjectState:
"""Planning agent creates project plan"""
prompt = f"""
Create a detailed implementation plan for: {state['messages'][0]['content']}
Include:
1. Architecture overview
2. Component breakdown
3. Implementation phases
4. Testing strategy
"""
plan = await self.llm.ainvoke(prompt)
state["project_plan"] = plan.content
state["current_phase"] = "planned"
return state
async def coder_agent(self, state: ProjectState) -> ProjectState:
"""Coding agent implements the solution"""
prompt = f"""
Implement the following plan:
{state.get('project_plan', 'No plan available')}
Write production-ready code with error handling.
"""
code = await self.llm.ainvoke(prompt)
state["code_implementation"] = code.content
state["current_phase"] = "coded"
return state
async def tester_agent(self, state: ProjectState) -> ProjectState:
"""Testing agent validates implementation"""
prompt = f"""
Review and test this implementation:
{state.get('code_implementation', 'No code available')}
Provide test cases and results.
"""
tests = await self.llm.ainvoke(prompt)
state["test_results"] = tests.content
state["current_phase"] = "tested"
return state
async def documenter_agent(self, state: ProjectState) -> ProjectState:
"""Documentation agent creates docs"""
prompt = f"""
Create documentation for:
Plan: {state.get('project_plan', 'N/A')}
Code: {state.get('code_implementation', 'N/A')}
Tests: {state.get('test_results', 'N/A')}
"""
docs = await self.llm.ainvoke(prompt)
state["documentation"] = docs.content
state["current_phase"] = "documented"
return state
async def execute_project(self, project_description: str):
"""Execute the entire project workflow"""
initial_state = {
"messages": [{"role": "user", "content": project_description}],
"project_plan": "",
"code_implementation": "",
"test_results": "",
"documentation": "",
"current_phase": "initial"
}
result = await self.graph.ainvoke(initial_state)
return result
# Usage
async def orchestration_example():
orchestrator = MultiAgentOrchestrator()
result = await orchestrator.execute_project(
"Build a REST API for user authentication with JWT tokens"
)
print("Project Plan:", result["project_plan"])
print("Implementation:", result["code_implementation"])
print("Test Results:", result["test_results"])
print("Documentation:", result["documentation"])
```
### Example 4: Memory-Enhanced Conversational Agent
```python
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_anthropic import ChatAnthropic
from langchain.memory import (
ConversationBufferMemory,
ConversationSummaryMemory,
ConversationEntityMemory,
CombinedMemory
)
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langgraph.checkpoint.memory import MemorySaver
import json
class MemoryEnhancedAgent:
def __init__(self, session_id: str):
self.session_id = session_id
self.llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0.7)
# Initialize multiple memory types
self.conversation_memory = ConversationBufferMemory(
memory_key="chat_history",
return_messages=True
)
self.summary_memory = ConversationSummaryMemory(
llm=self.llm,
memory_key="conversation_summary"
)
self.entity_memory = ConversationEntityMemory(
llm=self.llm,
memory_key="entities"
)
# Combine memories
self.combined_memory = CombinedMemory(
memories=[
self.conversation_memory,
self.summary_memory,
self.entity_memory
]
)
# Setup agent
self.agent = self._create_agent()
def _create_agent(self):
"""Create agent with memory-aware prompting"""
prompt = ChatPromptTemplate.from_messages([
("system", """You are a helpful AI assistant with perfect memory.
Conversation Summary:
{conversation_summary}
Known Entities:
{entities}
Use this context to provide personalized, contextual responses.
Remember important details about the user and refer back to previous conversations.
"""),
MessagesPlaceholder(variable_name="chat_history"),
("human", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad")
])
tools = [] # Add your tools here
agent = create_tool_calling_agent(
llm=self.llm,
tools=tools,
prompt=prompt
)
return AgentExecutor(
agent=agent,
tools=tools,
memory=self.combined_memory,
verbose=True,
return_intermediate_steps=True
)
async def chat(self, user_input: str) -> Dict[str, Any]:
"""Process chat with full memory context"""
# Execute agent
result = await self.agent.ainvoke({"input": user_input})
# Extract entities for future reference
entities = self.entity_memory.entity_store.store
# Get conversation summary
summary = self.summary_memory.buffer
return {
"response": result["output"],
"entities": entities,
"summary": summary,
"session_id": self.session_id
}
def save_session(self, filepath: str):
"""Save session state to file"""
session_data = {
"session_id": self.session_id,
"chat_history": [
{"role": m.type, "content": m.content}
for m in self.conversation_memory.chat_memory.messages
],
"summary": self.summary_memory.buffer,
"entities": dict(self.entity_memory.entity_store.store)
}
with open(filepath, 'w') as f:
json.dump(session_data, f, indent=2)
def load_session(self, filepath: str):
"""Load session state from file"""
with open(filepath, 'r') as f:
session_data = json.load(f)
# Restore memories
# Implementation depends on specific memory types
self.session_id = session_data["session_id"]
# Restore chat history
for msg in session_data["chat_history"]:
if msg["role"] == "human":
self.conversation_memory.chat_memory.add_user_message(msg["content"])
else:
self.conversation_memory.chat_memory.add_ai_message(msg["content"])
# Restore summary
self.summary_memory.buffer = session_data["summary"]
# Restore entities
for entity, info in session_data["entities"].items():
self.entity_memory.entity_store.set(entity, info)
# Usage example
async def memory_agent_example():
agent = MemoryEnhancedAgent(session_id="user-123")
# Conversation with memory
conversations = [
"Hi, my name is Alice and I work at TechCorp",
"I'm interested in machine learning projects",
"What did I tell you about my work?",
"Can you remind me what we discussed about my interests?"
]
for msg in conversations:
result = await agent.chat(msg)
print(f"User: {msg}")
print(f"Agent: {result['response']}")
print(f"Entities tracked: {result['entities']}\n")
# Save session
agent.save_session("session_user-123.json")
```
### Example 5: Production-Ready Deployment with Monitoring
```python
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from prometheus_client import Counter, Histogram, Gauge, generate_latest
import time
from langsmith import Client as LangSmithClient
from typing import Optional
import logging
from contextlib import asynccontextmanager
# Metrics
request_count = Counter('agent_requests_total', 'Total agent requests')
request_duration = Histogram('agent_request_duration_seconds', 'Request duration')
active_sessions = Gauge('agent_active_sessions', 'Active agent sessions')
error_count = Counter('agent_errors_total', 'Total agent errors')
class ProductionAgent:
def __init__(self):
self.langsmith_client = LangSmithClient()
self.agent = None
self.session_store = {}
@asynccontextmanager
async def lifespan(self, app: FastAPI):
"""Manage application lifecycle"""
# Startup
logging.info("Starting production agent...")
await self.initialize()
yield
# Shutdown
logging.info("Shutting down production agent...")
await self.cleanup()
async def initialize(self):
"""Initialize agent and dependencies"""
# Setup LLM
self.llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0)
# Initialize agent with error handling
tools = await self.setup_tools_with_validation()
self.agent = create_react_agent(
self.llm,
tools,
checkpointer=MemorySaver() # Enable conversation memory
)
async def setup_tools_with_validation(self):
"""Setup and validate tools"""
tools = []
# Define tools with health checks
tool_configs = [
{"name": "calculator", "func": self.calc_tool, "health_check": self.check_calc},
{"name": "search", "func": self.search_tool, "health_check": self.check_search}
]
for config in tool_configs:
try:
# Run health check
await config["health_check"]()
tools.append(Tool(
name=config["name"],
func=config["func"],
description=f"Tool: {config['name']}"
))
logging.info(f"Tool {config['name']} initialized successfully")
except Exception as e:
logging.error(f"Tool {config['name']} failed health check: {e}")
return tools
@request_duration.time()
async def process_request(
self,
message: str,
session_id: str,
timeout: float = 30.0
):
"""Process request with monitoring and timeout"""
request_count.inc()
active_sessions.inc()
try:
# Create timeout task
import asyncio
task = asyncio.create_task(
self.agent.ainvoke(
{"messages": [{"role": "human", "content": message}]},
config={"configurable": {"thread_id": session_id}}
)
)
result = await asyncio.wait_for(task, timeout=timeout)
# Log to LangSmith
self.langsmith_client.create_run(
name="agent_request",
inputs={"message": message, "session_id": session_id},
outputs={"response": result["messages"][-1].content}
)
return {
"response": result["messages"][-1].content,
"session_id": session_id,
"latency": time.time()
}
except asyncio.TimeoutError:
error_count.inc()
raise HTTPException(status_code=504, detail="Request timeout")
except Exception as e:
error_count.inc()
logging.error(f"Agent error: {e}")
raise HTTPException(status_code=500, detail=str(e))
finally:
active_sessions.dec()
async def health_check(self):
"""Comprehensive health check"""
checks = {
"llm": False,
"tools": False,
"memory": False,
"langsmith": False
}
try:
# Check LLM
test_response = await self.llm.ainvoke("test")
checks["llm"] = bool(test_response)
# Check tools
checks["tools"] = len(await self.setup_tools_with_validation()) > 0
# Check memory store
checks["memory"] = self.session_store is not None
# Check LangSmith connection
self.langsmith_client.list_projects(limit=1)
checks["langsmith"] = True
except Exception as e:
logging.error(f"Health check failed: {e}")
return {
"status": "healthy" if all(checks.values()) else "unhealthy",
"checks": checks,
"active_sessions": active_sessions._value.get(),
"total_requests": request_count._value.get()
}
# FastAPI Application
agent_system = ProductionAgent()
app = FastAPI(
title="Production LangChain Agent",
version="1.0.0",
lifespan=agent_system.lifespan
)
# Add CORS middleware
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_methods=["*"],
allow_headers=["*"]
)
@app.post("/chat")
async def chat(message: str, session_id: Optional[str] = None):
"""Chat endpoint with session management"""
session_id = session_id or str(uuid.uuid4())
return await agent_system.process_request(message, session_id)
@app.get("/health")
async def health():
"""Health check endpoint"""
return await agent_system.health_check()
@app.get("/metrics")
async def metrics():
"""Prometheus metrics endpoint"""
return generate_latest()
if __name__ == "__main__":
import uvicorn
# Run with production settings
uvicorn.run(
app,
host="0.0.0.0",
port=8000,
log_config="logging.yaml",
access_log=True,
use_colors=False
)
```
## Reference Implementations
### Reference 1: Enterprise Knowledge Assistant
```python
"""
Enterprise Knowledge Assistant with RAG, Memory, and Multi-Modal Support
Full implementation with production features
"""
import os
from typing import List, Dict, Any, Optional
from dataclasses import dataclass
from enum import Enum
# Core imports
from langchain_anthropic import ChatAnthropic
from langchain_voyageai import VoyageAIEmbeddings
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.postgres import PostgresSaver
# Vector stores
from langchain_pinecone import PineconeVectorStore
from langchain_weaviate import WeaviateVectorStore
# Memory
from langchain.memory import ConversationSummaryBufferMemory
from langchain.memory.chat_message_histories import RedisChatMessageHistory
# Tools
from langchain_core.tools import Tool, StructuredTool
from langchain.tools.retriever import create_retriever_tool
# Document processing
from langchain_community.document_loaders import PyPDFLoader, UnstructuredFileLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Monitoring
from langsmith import Client as LangSmithClient
import structlog
logger = structlog.get_logger()
class QueryType(Enum):
FACTUAL = "factual"
ANALYTICAL = "analytical"
CREATIVE = "creative"
CONVERSATIONAL = "conversational"
@dataclass
class EnterpriseConfig:
"""Configuration for enterprise deployment"""
anthropic_api_key: str
voyage_api_key: str
pinecone_api_key: str
pinecone_environment: str
redis_url: str
postgres_url: str
langsmith_api_key: str
max_retries: int = 3
timeout_seconds: int = 30
cache_ttl: int = 3600
class EnterpriseKnowledgeAssistant:
"""Production-ready enterprise knowledge assistant"""
def __init__(self, config: EnterpriseConfig):
self.config = config
self.setup_llms()
self.setup_vector_stores()
self.setup_memory()
self.setup_monitoring()
self.agent = self.build_agent()
def setup_llms(self):
"""Setup LLM"""
self.llm = ChatAnthropic(
model="claude-sonnet-4-5",
temperature=0,
api_key=self.config.anthropic_api_key,
max_retries=self.config.max_retries
)
def setup_vector_stores(self):
"""Setup multiple vector stores for different content types"""
import pinecone
# Initialize Pinecone
pc = pinecone.Pinecone(api_key=self.config.pinecone_api_key)
# Embeddings
# Using voyage-3-large for best retrieval quality with Claude Sonnet 4.5
self.embeddings = VoyageAIEmbeddings(
model="voyage-3-large",
voyage_api_key=self.config.voyage_api_key
)
# Document store
self.doc_store = PineconeVectorStore(
index=pc.Index("enterprise-docs"),
embedding=self.embeddings,
namespace="documents"
)
# FAQ store
self.faq_store = PineconeVectorStore(
index=pc.Index("enterprise-faq"),
embedding=self.embeddings,
namespace="faqs"
)
def setup_memory(self):
"""Setup distributed memory system"""
# Redis for message history
self.message_history = RedisChatMessageHistory(
session_id="default",
url=self.config.redis_url,
ttl=self.config.cache_ttl
)
# Summary memory
self.memory = ConversationSummaryBufferMemory(
llm=self.llm,
chat_memory=self.message_history,
max_token_limit=2000,
return_messages=True
)
# PostgreSQL checkpointer for state persistence
self.checkpointer = PostgresSaver.from_conn_string(
self.config.postgres_url
)
def setup_monitoring(self):
"""Setup monitoring and observability"""
self.langsmith = LangSmithClient(api_key=self.config.langsmith_api_key)
# Custom callbacks for monitoring
self.callbacks = [
self.log_callback,
self.metrics_callback,
self.error_callback
]
def build_agent(self):
"""Build the main agent with all components"""
# Create tools
tools = self.create_tools()
# Build state graph
builder = StateGraph(MessagesState)
# Add nodes
builder.add_node("classifier", self.classify_query)
builder.add_node("retriever", self.retrieve_context)
builder.add_node("agent", self.agent_node)
builder.add_node("validator", self.validate_response)
# Add edges
builder.add_edge(START, "classifier")
builder.add_edge("classifier", "retriever")
builder.add_edge("retriever", "agent")
builder.add_edge("agent", "validator")
builder.add_edge("validator", END)
# Compile with checkpointer
return builder.compile(checkpointer=self.checkpointer)
def create_tools(self) -> List[Tool]:
"""Create all agent tools"""
tools = []
# Document search tool
tools.append(create_retriever_tool(
self.doc_store.as_retriever(search_kwargs={"k": 5}),
"search_documents",
"Search internal company documents"
))
# FAQ search tool
tools.append(create_retriever_tool(
self.faq_store.as_retriever(search_kwargs={"k": 3}),
"search_faqs",
"Search frequently asked questions"
))
# Analytics tool
tools.append(StructuredTool.from_function(
func=self.analyze_data,
name="analyze_data",
description="Analyze business data and metrics"
))
# Email tool
tools.append(StructuredTool.from_function(
func=self.draft_email,
name="draft_email",
description="Draft professional emails"
))
return tools
async def classify_query(self, state: MessagesState) -> MessagesState:
"""Classify the type of query"""
query = state["messages"][-1].content
classification_prompt = f"""
Classify this query into one of: factual, analytical, creative, conversational
Query: {query}
Classification:
"""
result = await self.llm.ainvoke(classification_prompt)
query_type = self.parse_classification(result.content)
state["query_type"] = query_type
logger.info("Query classified", query_type=query_type)
return state
async def retrieve_context(self, state: MessagesState) -> MessagesState:
"""Retrieve relevant context based on query type"""
query = state["messages"][-1].content
query_type = state.get("query_type", QueryType.FACTUAL)
contexts = []
if query_type in [QueryType.FACTUAL, QueryType.ANALYTICAL]:
# Search documents
doc_results = await self.doc_store.asimilarity_search(query, k=5)
contexts.extend([doc.page_content for doc in doc_results])
if query_type == QueryType.CONVERSATIONAL:
# Search FAQs
faq_results = await self.faq_store.asimilarity_search(query, k=3)
contexts.extend([doc.page_content for doc in faq_results])
state["context"] = "\n\n".join(contexts)
return state
async def agent_node(self, state: MessagesState) -> MessagesState:
"""Main agent processing node"""
context = state.get("context", "")
# Build enhanced prompt with context
enhanced_prompt = f"""
Context Information:
{context}
User Query: {state['messages'][-1].content}
Provide a comprehensive answer using the context provided.
"""
# Create agent with tools
agent = create_react_agent(
self.llm,
self.create_tools(),
state_modifier=enhanced_prompt
)
# Invoke agent
result = await agent.ainvoke(state)
return result
async def validate_response(self, state: MessagesState) -> MessagesState:
"""Validate and potentially enhance response"""
response = state["messages"][-1].content
# Check for hallucination
validation_prompt = f"""
Check if this response is grounded in the provided context:
Context: {state.get('context', 'No context')}
Response: {response}
Is the response factual and grounded? (yes/no)
"""
validation = await self.llm.ainvoke(validation_prompt)
if "no" in validation.content.lower():
# Regenerate with stricter grounding
logger.warning("Response failed validation, regenerating")
state["messages"][-1].content = "I need to verify that information. Let me search again..."
return await self.agent_node(state)
return state
async def analyze_data(self, query: str) -> str:
"""Mock analytics tool"""
return f"Analytics results for: {query}"
async def draft_email(self, subject: str, recipient: str, content: str) -> str:
"""Mock email drafting tool"""
return f"Email draft to {recipient} about {subject}: {content}"
def parse_classification(self, text: str) -> QueryType:
"""Parse classification result"""
text_lower = text.lower()
for query_type in QueryType:
if query_type.value in text_lower:
return query_type
return QueryType.FACTUAL
async def log_callback(self, event: Dict):
"""Log events"""
logger.info("Agent event", **event)
async def metrics_callback(self, event: Dict):
"""Track metrics"""
# Implement metrics tracking
pass
async def error_callback(self, error: Exception):
"""Handle errors"""
logger.error("Agent error", error=str(error))
async def process(self, query: str, session_id: str) -> Dict[str, Any]:
"""Main entry point for processing queries"""
try:
# Invoke agent
result = await self.agent.ainvoke(
{"messages": [{"role": "human", "content": query}]},
config={"configurable": {"thread_id": session_id}}
)
# Extract response
response = result["messages"][-1].content
# Log to LangSmith
self.langsmith.create_run(
name="enterprise_assistant",
inputs={"query": query, "session_id": session_id},
outputs={"response": response}
)
return {
"response": response,
"session_id": session_id,
"sources": result.get("context", "")
}
except Exception as e:
logger.error("Processing error", error=str(e))
raise
# Usage
async def main():
config = EnterpriseConfig(
anthropic_api_key=os.getenv("ANTHROPIC_API_KEY"),
voyage_api_key=os.getenv("VOYAGE_API_KEY"),
pinecone_api_key=os.getenv("PINECONE_API_KEY"),
pinecone_environment="us-east-1",
redis_url="redis://localhost:6379",
postgres_url=os.getenv("DATABASE_URL"),
langsmith_api_key=os.getenv("LANGSMITH_API_KEY")
)
assistant = EnterpriseKnowledgeAssistant(config)
# Process query
result = await assistant.process(
query="What is our company's remote work policy?",
session_id="user-123"
)
print(result)
if __name__ == "__main__":
import asyncio
asyncio.run(main())
```
### Reference 2: Autonomous Research Agent
```python
"""
Autonomous Research Agent with Web Search, Paper Analysis, and Report Generation
Complete implementation with multi-step reasoning
"""
from typing import List, Dict, Any, Optional
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import Tool
from langchain_community.utilities import GoogleSerperAPIWrapper
from langchain_community.document_loaders import ArxivLoader
import asyncio
from datetime import datetime
class ResearchState(MessagesState):
"""Extended state for research agent"""
research_query: str
search_results: List[Dict]
papers: List[Dict]
analysis: str
report: str
citations: List[str]
current_step: str
max_papers: int = 5
class AutonomousResearchAgent:
"""Autonomous agent for conducting research and generating reports"""
def __init__(self, anthropic_api_key: str, serper_api_key: str):
self.llm = ChatAnthropic(
model="claude-sonnet-4-5",
temperature=0,
api_key=anthropic_api_key
)
self.search = GoogleSerperAPIWrapper(
serper_api_key=serper_api_key
)
self.graph = self.build_research_graph()
def build_research_graph(self):
"""Build the research workflow graph"""
builder = StateGraph(ResearchState)
# Add research nodes
builder.add_node("planner", self.plan_research)
builder.add_node("searcher", self.search_web)
builder.add_node("paper_finder", self.find_papers)
builder.add_node("analyzer", self.analyze_content)
builder.add_node("synthesizer", self.synthesize_findings)
builder.add_node("report_writer", self.write_report)
builder.add_node("reviewer", self.review_report)
# Define flow
builder.add_edge(START, "planner")
builder.add_edge("planner", "searcher")
builder.add_edge("searcher", "paper_finder")
builder.add_edge("paper_finder", "analyzer")
builder.add_edge("analyzer", "synthesizer")
builder.add_edge("synthesizer", "report_writer")
builder.add_edge("report_writer", "reviewer")
# Conditional edge from reviewer
builder.add_conditional_edges(
"reviewer",
self.should_revise,
{
"revise": "report_writer",
"complete": END
}
)
return builder.compile()
async def plan_research(self, state: ResearchState) -> ResearchState:
"""Plan the research approach"""
query = state["messages"][-1].content
planning_prompt = f"""
Create a research plan for: {query}
Include:
1. Key topics to investigate
2. Types of sources needed
3. Research methodology
4. Expected deliverables
Format as structured plan.
"""
plan = await self.llm.ainvoke(planning_prompt)
state["research_query"] = query
state["current_step"] = "planned"
state["messages"].append({
"role": "assistant",
"content": f"Research plan created: {plan.content}"
})
return state
async def search_web(self, state: ResearchState) -> ResearchState:
"""Search web for relevant information"""
query = state["research_query"]
# Perform multiple searches with different angles
search_queries = [
query,
f"{query} recent developments 2024",
f"{query} research papers",
f"{query} industry applications"
]
all_results = []
for sq in search_queries:
results = await asyncio.to_thread(self.search.run, sq)
all_results.append({
"query": sq,
"results": results
})
state["search_results"] = all_results
state["current_step"] = "searched"
return state
async def find_papers(self, state: ResearchState) -> ResearchState:
"""Find and download relevant research papers"""
query = state["research_query"]
# Search arXiv for papers
arxiv_loader = ArxivLoader(
query=query,
load_max_docs=state["max_papers"]
)
papers = await asyncio.to_thread(arxiv_loader.load)
# Process papers
processed_papers = []
for paper in papers:
processed_papers.append({
"title": paper.metadata.get("Title", "Unknown"),
"authors": paper.metadata.get("Authors", "Unknown"),
"summary": paper.metadata.get("Summary", "")[:500],
"content": paper.page_content[:1000], # First 1000 chars
"arxiv_id": paper.metadata.get("Entry ID", "")
})
state["papers"] = processed_papers
state["current_step"] = "papers_found"
return state
async def analyze_content(self, state: ResearchState) -> ResearchState:
"""Analyze all gathered content"""
search_results = state["search_results"]
papers = state["papers"]
analysis_prompt = f"""
Analyze the following research materials:
Web Search Results:
{search_results}
Academic Papers:
{papers}
Provide:
1. Key findings and insights
2. Common themes and patterns
3. Contradictions or debates
4. Knowledge gaps
5. Practical implications
"""
analysis = await self.llm.ainvoke(analysis_prompt)
state["analysis"] = analysis.content
state["current_step"] = "analyzed"
return state
async def synthesize_findings(self, state: ResearchState) -> ResearchState:
"""Synthesize all findings into coherent insights"""
analysis = state["analysis"]
synthesis_prompt = f"""
Synthesize the following analysis into key insights:
{analysis}
Create:
1. Executive summary (3-5 sentences)
2. Main conclusions (bullet points)
3. Recommendations
4. Future research directions
"""
synthesis = await self.llm.ainvoke(synthesis_prompt)
state["messages"].append({
"role": "assistant",
"content": synthesis.content
})
state["current_step"] = "synthesized"
return state
async def write_report(self, state: ResearchState) -> ResearchState:
"""Write comprehensive research report"""
query = state["research_query"]
analysis = state["analysis"]
papers = state["papers"]
report_prompt = f"""
Write a comprehensive research report on: {query}
Based on analysis: {analysis}
Structure:
1. Executive Summary
2. Introduction
3. Methodology
4. Key Findings
5. Discussion
6. Conclusions
7. References
Include citations to papers: {[p['title'] for p in papers]}
Make it professional and well-structured.
"""
report = await self.llm.ainvoke(report_prompt)
# Generate citations
citations = []
for paper in papers:
citation = f"{paper['authors']} ({datetime.now().year}). {paper['title']}. arXiv:{paper['arxiv_id']}"
citations.append(citation)
state["report"] = report.content
state["citations"] = citations
state["current_step"] = "report_written"
return state
async def review_report(self, state: ResearchState) -> ResearchState:
"""Review and validate the report"""
report = state["report"]
review_prompt = f"""
Review this research report for:
1. Accuracy and factual correctness
2. Logical flow and structure
3. Completeness
4. Professional tone
5. Proper citations
Report:
{report}
Provide a quality score (1-10) and identify any issues.
"""
review = await self.llm.ainvoke(review_prompt)
state["messages"].append({
"role": "assistant",
"content": f"Report review: {review.content}"
})
# Parse quality score
try:
import re
score_match = re.search(r'\b([1-9]|10)\b', review.content)
quality_score = int(score_match.group()) if score_match else 7
except:
quality_score = 7
state["quality_score"] = quality_score
state["current_step"] = "reviewed"
return state
def should_revise(self, state: ResearchState) -> str:
"""Decide whether to revise the report"""
quality_score = state.get("quality_score", 7)
if quality_score < 7:
return "revise"
return "complete"
async def conduct_research(self, topic: str) -> Dict[str, Any]:
"""Main entry point for conducting research"""
initial_state = {
"messages": [{"role": "human", "content": topic}],
"research_query": "",
"search_results": [],
"papers": [],
"analysis": "",
"report": "",
"citations": [],
"current_step": "initial",
"max_papers": 5
}
result = await self.graph.ainvoke(initial_state)
return {
"report": result["report"],
"citations": result["citations"],
"quality_score": result.get("quality_score", 0),
"steps_completed": result["current_step"]
}
# Usage example
async def research_example():
agent = AutonomousResearchAgent(
anthropic_api_key=os.getenv("ANTHROPIC_API_KEY"),
serper_api_key=os.getenv("SERPER_API_KEY")
)
result = await agent.conduct_research(
"Recent advances in quantum computing and their applications in cryptography"
)
print("Research Report:")
print(result["report"])
print("\nCitations:")
for citation in result["citations"]:
print(f"- {citation}")
print(f"\nQuality Score: {result['quality_score']}/10")
```
### Reference 3: Real-time Collaborative Agent System
```python
"""
Real-time Collaborative Multi-Agent System with WebSocket Support
Production implementation with agent coordination and live updates
"""
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from fastapi.responses import HTMLResponse
import json
import asyncio
from typing import Dict, List, Set, Any
from datetime import datetime
from langgraph.graph import StateGraph, MessagesState
from langchain_anthropic import ChatAnthropic
import redis.asyncio as redis
from collections import defaultdict
class CollaborativeAgentSystem:
"""Real-time collaborative agent system with WebSocket support"""
def __init__(self):
self.app = FastAPI()
self.setup_routes()
self.active_connections: Dict[str, Set[WebSocket]] = defaultdict(set)
self.agent_pool = {}
self.redis_client = None
self.llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0.7)
async def startup(self):
"""Initialize system resources"""
self.redis_client = await redis.from_url("redis://localhost:6379")
await self.initialize_agents()
async def shutdown(self):
"""Cleanup resources"""
if self.redis_client:
await self.redis_client.close()
async def initialize_agents(self):
"""Initialize specialized agents"""
agent_configs = [
{"id": "coordinator", "role": "Project Coordinator", "specialty": "task planning"},
{"id": "developer", "role": "Senior Developer", "specialty": "code implementation"},
{"id": "reviewer", "role": "Code Reviewer", "specialty": "quality assurance"},
{"id": "documenter", "role": "Technical Writer", "specialty": "documentation"}
]
for config in agent_configs:
self.agent_pool[config["id"]] = self.create_specialized_agent(config)
def create_specialized_agent(self, config: Dict) -> Dict:
"""Create a specialized agent with specific capabilities"""
return {
"id": config["id"],
"role": config["role"],
"specialty": config["specialty"],
"llm": ChatAnthropic(
model="claude-sonnet-4-5",
temperature=0.3
),
"status": "idle",
"current_task": None
}
def setup_routes(self):
"""Setup WebSocket and HTTP routes"""
@self.app.websocket("/ws/{session_id}")
async def websocket_endpoint(websocket: WebSocket, session_id: str):
await self.handle_websocket(websocket, session_id)
@self.app.post("/session/{session_id}/task")
async def create_task(session_id: str, task: Dict):
return await self.process_task(session_id, task)
@self.app.get("/session/{session_id}/status")
async def get_status(session_id: str):
return await self.get_session_status(session_id)
async def handle_websocket(self, websocket: WebSocket, session_id: str):
"""Handle WebSocket connections for real-time updates"""
await websocket.accept()
self.active_connections[session_id].add(websocket)
try:
# Send initial status
await websocket.send_json({
"type": "connection",
"session_id": session_id,
"agents": list(self.agent_pool.keys()),
"timestamp": datetime.now().isoformat()
})
# Handle incoming messages
while True:
data = await websocket.receive_json()
await self.handle_client_message(session_id, data, websocket)
except WebSocketDisconnect:
self.active_connections[session_id].remove(websocket)
if not self.active_connections[session_id]:
del self.active_connections[session_id]
async def handle_client_message(self, session_id: str, data: Dict, websocket: WebSocket):
"""Process messages from clients"""
message_type = data.get("type")
if message_type == "task":
await self.distribute_task(session_id, data["content"])
elif message_type == "chat":
await self.handle_chat(session_id, data["content"], data.get("agent_id"))
elif message_type == "command":
await self.handle_command(session_id, data["command"], data.get("args"))
async def distribute_task(self, session_id: str, task_description: str):
"""Distribute task among agents"""
# Coordinator analyzes and breaks down the task
coordinator = self.agent_pool["coordinator"]
breakdown_prompt = f"""
Break down this task into subtasks for the team:
Task: {task_description}
Available agents:
- Developer: code implementation
- Reviewer: quality assurance
- Documenter: documentation
Provide a structured plan with assigned agents.
"""
plan = await coordinator["llm"].ainvoke(breakdown_prompt)
# Broadcast plan to all connected clients
await self.broadcast_to_session(session_id, {
"type": "plan",
"agent": "coordinator",
"content": plan.content,
"timestamp": datetime.now().isoformat()
})
# Execute subtasks in parallel
subtasks = self.parse_subtasks(plan.content)
results = await asyncio.gather(*[
self.execute_subtask(session_id, subtask)
for subtask in subtasks
])
# Aggregate results
await self.aggregate_results(session_id, results)
def parse_subtasks(self, plan_content: str) -> List[Dict]:
"""Parse subtasks from plan"""
# Simplified parsing - in production use structured output
subtasks = []
if "developer" in plan_content.lower():
subtasks.append({
"agent_id": "developer",
"task": "Implement the required functionality"
})
if "reviewer" in plan_content.lower():
subtasks.append({
"agent_id": "reviewer",
"task": "Review the implementation"
})
if "documenter" in plan_content.lower():
subtasks.append({
"agent_id": "documenter",
"task": "Create documentation"
})
return subtasks
async def execute_subtask(self, session_id: str, subtask: Dict) -> Dict:
"""Execute a subtask with a specific agent"""
agent_id = subtask["agent_id"]
agent = self.agent_pool[agent_id]
# Update agent status
agent["status"] = "working"
agent["current_task"] = subtask["task"]
# Broadcast status update
await self.broadcast_to_session(session_id, {
"type": "agent_status",
"agent": agent_id,
"status": "working",
"task": subtask["task"],
"timestamp": datetime.now().isoformat()
})
# Execute task
try:
result = await agent["llm"].ainvoke(subtask["task"])
# Store result in Redis
await self.redis_client.hset(
f"session:{session_id}:results",
agent_id,
json.dumps({
"content": result.content,
"timestamp": datetime.now().isoformat()
})
)
# Broadcast completion
await self.broadcast_to_session(session_id, {
"type": "task_complete",
"agent": agent_id,
"result": result.content,
"timestamp": datetime.now().isoformat()
})
return {
"agent_id": agent_id,
"result": result.content,
"success": True
}
except Exception as e:
await self.broadcast_to_session(session_id, {
"type": "error",
"agent": agent_id,
"error": str(e),
"timestamp": datetime.now().isoformat()
})
return {
"agent_id": agent_id,
"error": str(e),
"success": False
}
finally:
# Reset agent status
agent["status"] = "idle"
agent["current_task"] = None
async def aggregate_results(self, session_id: str, results: List[Dict]):
"""Aggregate results from all agents"""
coordinator = self.agent_pool["coordinator"]
summary_prompt = f"""
Aggregate and summarize the following results from the team:
{json.dumps(results, indent=2)}
Provide a cohesive summary of the completed work.
"""
summary = await coordinator["llm"].ainvoke(summary_prompt)
# Broadcast final summary
await self.broadcast_to_session(session_id, {
"type": "final_summary",
"agent": "coordinator",
"content": summary.content,
"timestamp": datetime.now().isoformat()
})
async def handle_chat(self, session_id: str, message: str, agent_id: Optional[str] = None):
"""Handle chat messages directed at specific agents"""
if agent_id and agent_id in self.agent_pool:
agent = self.agent_pool[agent_id]
response = await agent["llm"].ainvoke(message)
await self.broadcast_to_session(session_id, {
"type": "chat_response",
"agent": agent_id,
"content": response.content,
"timestamp": datetime.now().isoformat()
})
else:
# Broadcast to all agents and get responses
responses = await asyncio.gather(*[
agent["llm"].ainvoke(message)
for agent in self.agent_pool.values()
])
for agent_id, response in zip(self.agent_pool.keys(), responses):
await self.broadcast_to_session(session_id, {
"type": "chat_response",
"agent": agent_id,
"content": response.content,
"timestamp": datetime.now().isoformat()
})
async def handle_command(self, session_id: str, command: str, args: Dict):
"""Handle system commands"""
if command == "reset":
await self.reset_session(session_id)
elif command == "export":
await self.export_session(session_id)
elif command == "pause":
await self.pause_agents(session_id)
elif command == "resume":
await self.resume_agents(session_id)
async def broadcast_to_session(self, session_id: str, message: Dict):
"""Broadcast message to all connections in a session"""
if session_id in self.active_connections:
disconnected = set()
for websocket in self.active_connections[session_id]:
try:
await websocket.send_json(message)
except:
disconnected.add(websocket)
# Clean up disconnected websockets
for ws in disconnected:
self.active_connections[session_id].remove(ws)
async def get_session_status(self, session_id: str) -> Dict:
"""Get current session status"""
agent_statuses = {
agent_id: {
"status": agent["status"],
"current_task": agent["current_task"]
}
for agent_id, agent in self.agent_pool.items()
}
# Get results from Redis
results = await self.redis_client.hgetall(f"session:{session_id}:results")
return {
"session_id": session_id,
"agents": agent_statuses,
"results": {
k.decode(): json.loads(v.decode())
for k, v in results.items()
} if results else {},
"active_connections": len(self.active_connections.get(session_id, set())),
"timestamp": datetime.now().isoformat()
}
async def reset_session(self, session_id: str):
"""Reset session state"""
# Clear Redis data
await self.redis_client.delete(f"session:{session_id}:results")
# Reset agents
for agent in self.agent_pool.values():
agent["status"] = "idle"
agent["current_task"] = None
await self.broadcast_to_session(session_id, {
"type": "system",
"message": "Session reset",
"timestamp": datetime.now().isoformat()
})
async def export_session(self, session_id: str) -> Dict:
"""Export session data"""
results = await self.redis_client.hgetall(f"session:{session_id}:results")
export_data = {
"session_id": session_id,
"timestamp": datetime.now().isoformat(),
"results": {
k.decode(): json.loads(v.decode())
for k, v in results.items()
} if results else {}
}
return export_data
# Create application instance
collab_system = CollaborativeAgentSystem()
app = collab_system.app
# Add startup and shutdown events
@app.on_event("startup")
async def startup_event():
await collab_system.startup()
@app.on_event("shutdown")
async def shutdown_event():
await collab_system.shutdown()
# HTML client for testing
@app.get("/")
async def get():
return HTMLResponse("""
<!DOCTYPE html>
<html>
<head>
<title>Collaborative Agent System</title>
</head>
<body>
<h1>Collaborative Agent System</h1>
<div id="messages"></div>
<input type="text" id="messageInput" placeholder="Enter task...">
<button onclick="sendMessage()">Send</button>
<script>
const sessionId = 'test-session-' + Date.now();
const ws = new WebSocket(`ws://localhost:8000/ws/${sessionId}`);
ws.onmessage = function(event) {
const message = JSON.parse(event.data);
const messages = document.getElementById('messages');
messages.innerHTML += '<div>' + JSON.stringify(message) + '</div>';
};
function sendMessage() {
const input = document.getElementById('messageInput');
ws.send(JSON.stringify({
type: 'task',
content: input.value
}));
input.value = '';
}
</script>
</body>
</html>
""")
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)
```
## Summary
This comprehensive LangChain/LangGraph agent development guide provides:
1. **Modern Architecture Patterns**: State-based agent orchestration with LangGraph
2. **Production-Ready Components**: Async patterns, error handling, monitoring
3. **Advanced Memory Systems**: Multiple memory types with distributed storage
4. **RAG Integration**: Vector stores, reranking, and hybrid search
5. **Multi-Agent Coordination**: Specialized agents working together
6. **Real-time Capabilities**: WebSocket support for live updates
7. **Enterprise Features**: Security, scalability, and observability
8. **Complete Examples**: Full implementations ready for production use
The guide emphasizes production reliability, scalability, and maintainability while leveraging the latest LangChain 0.1+ and LangGraph capabilities for building sophisticated AI agent systems.