feat(llm-application-dev): modernize to LangGraph and latest models v2.0.0

- Migrate from LangChain 0.x to LangChain 1.x/LangGraph patterns
- Update model references to Claude 4.5 and GPT-5.2
- Add Voyage AI as primary embedding recommendation
- Add structured outputs with Pydantic
- Replace deprecated initialize_agent() with StateGraph
- Fix security: use AST-based safe math instead of unsafe execution
- Add plugin.json and README.md for consistency
- Bump marketplace version to 1.3.3
Seth Hobson
2026-01-19 15:43:25 -05:00
parent e827cc713a
commit 8be0e8ac7a
12 changed files with 1940 additions and 708 deletions

@@ -1,11 +1,11 @@
---
name: langchain-architecture
description: Design LLM applications using the LangChain framework with agents, memory, and tool integration patterns. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
description: Design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
---
# LangChain Architecture
# LangChain & LangGraph Architecture
Master the LangChain framework for building sophisticated LLM applications with agents, chains, memory, and tool integration.
Master modern LangChain 1.x and LangGraph for building sophisticated LLM applications with agents, state management, memory, and tool integration.
## When to Use This Skill
@@ -17,126 +17,100 @@ Master the LangChain framework for building sophisticated LLM applications with
- Implementing document processing pipelines
- Building production-grade LLM applications
## Package Structure (LangChain 1.x)
```
langchain (1.2.x) # High-level orchestration
langchain-core (1.2.x) # Core abstractions (messages, prompts, tools)
langchain-community # Third-party integrations
langgraph # Agent orchestration and state management
langchain-openai # OpenAI integrations
langchain-anthropic # Anthropic/Claude integrations
langchain-voyageai # Voyage AI embeddings
langchain-pinecone # Pinecone vector store
```
## Core Concepts
### 1. Agents
Autonomous systems that use LLMs to decide which actions to take.
### 1. LangGraph Agents
LangGraph is the recommended foundation for building agents with LangChain 1.x. It provides:
**Agent Types:**
- **ReAct**: Reasoning + Acting in interleaved manner
- **OpenAI Functions**: Leverages function calling API
- **Structured Chat**: Handles multi-input tools
- **Conversational**: Optimized for chat interfaces
- **Self-Ask with Search**: Decomposes complex queries
**Key Features:**
- **StateGraph**: Explicit state management with typed state
- **Durable Execution**: Agents persist through failures
- **Human-in-the-Loop**: Inspect and modify state at any point
- **Memory**: Short-term and long-term memory across sessions
- **Checkpointing**: Save and resume agent state
### 2. Chains
Sequences of calls to LLMs or other utilities.
**Agent Patterns:**
- **ReAct**: Reasoning + Acting with `create_react_agent`
- **Plan-and-Execute**: Separate planning and execution nodes
- **Multi-Agent**: Supervisor routing between specialized agents
- **Tool-Calling**: Structured tool invocation with Pydantic schemas
**Chain Types:**
- **LLMChain**: Basic prompt + LLM combination
- **SequentialChain**: Multiple chains in sequence
- **RouterChain**: Routes inputs to specialized chains
- **TransformChain**: Data transformations between steps
- **MapReduceChain**: Parallel processing with aggregation
### 2. State Management
LangGraph uses TypedDict for explicit state:
### 3. Memory
Systems for maintaining context across interactions.
```python
from typing import Annotated, TypedDict
from langgraph.graph import MessagesState
**Memory Types:**
- **ConversationBufferMemory**: Stores all messages
- **ConversationSummaryMemory**: Summarizes older messages
- **ConversationBufferWindowMemory**: Keeps last N messages
- **EntityMemory**: Tracks information about entities
- **VectorStoreMemory**: Semantic similarity retrieval
# Simple message-based state
class AgentState(MessagesState):
    """Extends MessagesState with custom fields."""
    context: Annotated[list, "retrieved documents"]
# Custom state for complex agents
class CustomState(TypedDict):
    messages: Annotated[list, "conversation history"]
    context: Annotated[dict, "retrieved context"]
    current_step: str
    results: list
```
### 3. Memory Systems
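Memory options. The `Conversation*Memory` classes are legacy LangChain APIs; LangGraph checkpointers are the preferred choice for new agents: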
- **ConversationBufferMemory**: Stores all messages (short conversations)
- **ConversationSummaryMemory**: Summarizes older messages (long conversations)
- **ConversationTokenBufferMemory**: Token-based windowing
- **VectorStoreRetrieverMemory**: Semantic similarity retrieval
- **LangGraph Checkpointers**: Persistent state across sessions
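To make the token-based windowing idea concrete, here is a minimal sketch in plain Python. It does not use LangChain: `count_tokens` is a hypothetical whitespace-based counter standing in for a real tokenizer, and `trim_to_token_budget` is an illustrative name, not a library function.

```python
def count_tokens(text: str) -> int:
    """Hypothetical counter: approximates tokens by whitespace-separated words."""
    return len(text.split())


def trim_to_token_budget(
    messages: list[tuple[str, str]], max_tokens: int
) -> list[tuple[str, str]]:
    """Keep the most recent (role, content) messages that fit in max_tokens."""
    kept: list[tuple[str, str]] = []
    used = 0
    # Walk from newest to oldest so recent context is preferred
    for role, content in reversed(messages):
        cost = count_tokens(content)
        if used + cost > max_tokens:
            break
        kept.append((role, content))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    ("user", "My name is Alice"),
    ("assistant", "Nice to meet you Alice"),
    ("user", "What is the capital of France"),
]
window = trim_to_token_budget(history, max_tokens=11)
```

With an 11-token budget the oldest message is dropped, which also shows the trade-off: facts stated early (here, the user's name) are lost unless they are summarized or stored elsewhere.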
### 4. Document Processing
Loading, transforming, and storing documents for retrieval.
Loading, transforming, and storing documents:
**Components:**
- **Document Loaders**: Load from various sources
- **Text Splitters**: Chunk documents intelligently
- **Vector Stores**: Store and retrieve embeddings
- **Retrievers**: Fetch relevant documents
- **Indexes**: Organize documents for efficient access
### 5. Callbacks
Hooks for logging, monitoring, and debugging.
### 5. Callbacks & Tracing
LangSmith is the standard for observability:
**Use Cases:**
- Request/response logging
- Token usage tracking
- Latency monitoring
- Error handling
- Custom metrics collection
- Error tracking
- Trace visualization
## Quick Start
### Modern ReAct Agent with LangGraph
```python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool
import ast
import operator
# Initialize LLM
llm = OpenAI(temperature=0)
# Load tools
tools = load_tools(["serpapi", "llm-math"], llm=llm)
# Add memory
memory = ConversationBufferMemory(memory_key="chat_history")
# Create agent
agent = initialize_agent(
tools,
llm,
agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
memory=memory,
verbose=True
)
# Run agent
result = agent.run("What's the weather in SF? Then calculate 25 * 4")
```
## Architecture Patterns
### Pattern 1: RAG with LangChain
```python
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
# Load and process documents
loader = TextLoader('documents.txt')
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_documents(documents)
# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(texts, embeddings)
# Create retrieval chain
qa_chain = RetrievalQA.from_chain_type(
llm=llm,
chain_type="stuff",
retriever=vectorstore.as_retriever(),
return_source_documents=True
)
# Query
result = qa_chain({"query": "What is the main topic?"})
```
### Pattern 2: Custom Agent with Tools
```python
from langchain.agents import Tool, AgentExecutor
from langchain.agents.react.base import ReActDocstoreAgent
from langchain.tools import tool
# Initialize LLM (Claude Sonnet 4.5 recommended)
llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0)
# Define tools with Pydantic schemas
@tool
def search_database(query: str) -> str:
    """Search internal database for information."""
@@ -144,195 +118,541 @@ def search_database(query: str) -> str:
    return f"Results for: {query}"
@tool
def send_email(recipient: str, content: str) -> str:
def calculate(expression: str) -> str:
    """Safely evaluate a mathematical expression.
    Supports: +, -, *, /, **, %, parentheses
    Example: '(2 + 3) * 4' returns '20'
    """
    # Safe math evaluation using ast
    allowed_operators = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Pow: operator.pow,
        ast.Mod: operator.mod,
        ast.USub: operator.neg,
    }
    def _eval(node):
        if isinstance(node, ast.Constant):
            # Accept numeric literals only; reject strings, bytes, booleans, None
            if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
                raise ValueError(f"Unsupported constant: {node.value!r}")
            return node.value
        elif isinstance(node, ast.BinOp):
            left = _eval(node.left)
            right = _eval(node.right)
            return allowed_operators[type(node.op)](left, right)
        elif isinstance(node, ast.UnaryOp):
            operand = _eval(node.operand)
            return allowed_operators[type(node.op)](operand)
        else:
            raise ValueError(f"Unsupported operation: {type(node)}")
    try:
        tree = ast.parse(expression, mode='eval')
        return str(_eval(tree.body))
    except Exception as e:
        return f"Error: {e}"
tools = [search_database, calculate]
# Create checkpointer for memory persistence
checkpointer = MemorySaver()
# Create ReAct agent
agent = create_react_agent(
llm,
tools,
checkpointer=checkpointer
)
# Run agent with thread ID for memory
config = {"configurable": {"thread_id": "user-123"}}
result = await agent.ainvoke(
{"messages": [("user", "Search for Python tutorials and calculate 25 * 4")]},
config=config
)
```
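The AST-based approach used by the `calculate` tool can be checked without an LLM or LangChain installed. The sketch below is a standalone variant: the name `safe_eval`, the numeric-only constant check, and the `MAX_EXPONENT` bound are assumptions added for illustration and are not part of the commit.

```python
import ast
import operator
from typing import Union

_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
    ast.USub: operator.neg,
}
MAX_EXPONENT = 1000  # assumed bound to avoid huge exponentiations


def safe_eval(expression: str) -> Union[int, float]:
    """Evaluate arithmetic by walking the AST; anything else raises ValueError."""

    def _eval(node):
        if isinstance(node, ast.Constant):
            # Only plain numbers are allowed as literals
            if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
                raise ValueError(f"Unsupported constant: {node.value!r}")
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            left, right = _eval(node.left), _eval(node.right)
            if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
                raise ValueError("Exponent too large")
            return _OPERATORS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_eval("(2 + 3) * 4"))  # 20
```

Function calls, attribute access, and names never reach an operator: they parse to node types the walker does not recognize, so an input like `__import__('os')` is rejected before anything runs.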
## Architecture Patterns
### Pattern 1: RAG with LangGraph
```python
from langgraph.graph import StateGraph, START, END
from langchain_anthropic import ChatAnthropic
from langchain_voyageai import VoyageAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from typing import TypedDict, Annotated
class RAGState(TypedDict):
    question: str
    context: Annotated[list[Document], "retrieved documents"]
    answer: str
# Initialize components
llm = ChatAnthropic(model="claude-sonnet-4-5")
embeddings = VoyageAIEmbeddings(model="voyage-3-large")
vectorstore = PineconeVectorStore(index_name="docs", embedding=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
# Define nodes
async def retrieve(state: RAGState) -> RAGState:
    """Retrieve relevant documents."""
    docs = await retriever.ainvoke(state["question"])
    return {"context": docs}
async def generate(state: RAGState) -> RAGState:
    """Generate answer from context."""
    prompt = ChatPromptTemplate.from_template(
        """Answer based on the context below. If you cannot answer, say so.
        Context: {context}
        Question: {question}
        Answer:"""
    )
    context_text = "\n\n".join(doc.page_content for doc in state["context"])
    response = await llm.ainvoke(
        prompt.format(context=context_text, question=state["question"])
    )
    return {"answer": response.content}
# Build graph
builder = StateGraph(RAGState)
builder.add_node("retrieve", retrieve)
builder.add_node("generate", generate)
builder.add_edge(START, "retrieve")
builder.add_edge("retrieve", "generate")
builder.add_edge("generate", END)
rag_chain = builder.compile()
# Use the chain
result = await rag_chain.ainvoke({"question": "What is the main topic?"})
```
### Pattern 2: Custom Agent with Structured Tools
```python
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field
class SearchInput(BaseModel):
    """Input for database search."""
    query: str = Field(description="Search query")
    filters: dict = Field(default_factory=dict, description="Optional filters")
class EmailInput(BaseModel):
    """Input for sending email."""
    recipient: str = Field(description="Email recipient")
    subject: str = Field(description="Email subject")
    content: str = Field(description="Email body")
async def search_database(query: str, filters: dict | None = None) -> str:
    """Search internal database for information."""
    # Your database search logic
    filters = filters or {}  # avoid a shared mutable default argument
    return f"Results for '{query}' with filters {filters}"
async def send_email(recipient: str, subject: str, content: str) -> str:
    """Send an email to specified recipient."""
    # Email sending logic
    return f"Email sent to {recipient}"
tools = [search_database, send_email]
tools = [
StructuredTool.from_function(
coroutine=search_database,
name="search_database",
description="Search internal database",
args_schema=SearchInput
),
StructuredTool.from_function(
coroutine=send_email,
name="send_email",
description="Send an email",
args_schema=EmailInput
)
]
agent = initialize_agent(
tools,
llm,
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True
)
agent = create_react_agent(llm, tools)
```
### Pattern 3: Multi-Step Chain
### Pattern 3: Multi-Step Workflow with StateGraph
```python
from langchain.chains import LLMChain, SequentialChain
from langchain.prompts import PromptTemplate
from langgraph.graph import StateGraph, START, END
from typing import TypedDict, Literal
# Step 1: Extract key information
extract_prompt = PromptTemplate(
input_variables=["text"],
template="Extract key entities from: {text}\n\nEntities:"
)
extract_chain = LLMChain(llm=llm, prompt=extract_prompt, output_key="entities")
class WorkflowState(TypedDict):
    text: str
    entities: list
    analysis: str
    summary: str
    current_step: str
# Step 2: Analyze entities
analyze_prompt = PromptTemplate(
input_variables=["entities"],
template="Analyze these entities: {entities}\n\nAnalysis:"
)
analyze_chain = LLMChain(llm=llm, prompt=analyze_prompt, output_key="analysis")
async def extract_entities(state: WorkflowState) -> WorkflowState:
    """Extract key entities from text."""
    prompt = f"Extract key entities from: {state['text']}\n\nReturn as JSON list."
    response = await llm.ainvoke(prompt)
    return {"entities": response.content, "current_step": "analyze"}
# Step 3: Generate summary
summary_prompt = PromptTemplate(
input_variables=["entities", "analysis"],
template="Summarize:\nEntities: {entities}\nAnalysis: {analysis}\n\nSummary:"
)
summary_chain = LLMChain(llm=llm, prompt=summary_prompt, output_key="summary")
async def analyze_entities(state: WorkflowState) -> WorkflowState:
    """Analyze extracted entities."""
    prompt = f"Analyze these entities: {state['entities']}\n\nProvide insights."
    response = await llm.ainvoke(prompt)
    return {"analysis": response.content, "current_step": "summarize"}
# Combine into sequential chain
overall_chain = SequentialChain(
chains=[extract_chain, analyze_chain, summary_chain],
input_variables=["text"],
output_variables=["entities", "analysis", "summary"],
verbose=True
)
async def generate_summary(state: WorkflowState) -> WorkflowState:
    """Generate final summary."""
    prompt = f"""Summarize:
    Entities: {state['entities']}
    Analysis: {state['analysis']}
    Provide a concise summary."""
    response = await llm.ainvoke(prompt)
    return {"summary": response.content, "current_step": "complete"}
def route_step(state: WorkflowState) -> Literal["analyze", "summarize", "end"]:
    """Route to next step based on current state."""
    step = state.get("current_step", "extract")
    if step == "analyze":
        return "analyze"
    elif step == "summarize":
        return "summarize"
    return "end"
# Build workflow
builder = StateGraph(WorkflowState)
builder.add_node("extract", extract_entities)
builder.add_node("analyze", analyze_entities)
builder.add_node("summarize", generate_summary)
builder.add_edge(START, "extract")
builder.add_conditional_edges("extract", route_step, {
"analyze": "analyze",
"summarize": "summarize",
"end": END
})
builder.add_conditional_edges("analyze", route_step, {
"summarize": "summarize",
"end": END
})
builder.add_edge("summarize", END)
workflow = builder.compile()
```
## Memory Management Best Practices
### Pattern 4: Multi-Agent Orchestration
### Choosing the Right Memory Type
```python
# For short conversations (< 10 messages)
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage
from typing import Literal, TypedDict
# For long conversations (summarize old messages)
from langchain.memory import ConversationSummaryMemory
memory = ConversationSummaryMemory(llm=llm)
class MultiAgentState(TypedDict):
    messages: list
    next_agent: str
# For sliding window (last N messages)
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=5)
# Create specialized agents
researcher = create_react_agent(llm, research_tools)
writer = create_react_agent(llm, writing_tools)
reviewer = create_react_agent(llm, review_tools)
# For entity tracking
from langchain.memory import ConversationEntityMemory
memory = ConversationEntityMemory(llm=llm)
async def supervisor(state: MultiAgentState) -> MultiAgentState:
    """Route to appropriate agent based on task."""
    prompt = f"""Based on the conversation, which agent should handle this?
# For semantic retrieval of relevant history
from langchain.memory import VectorStoreRetrieverMemory
memory = VectorStoreRetrieverMemory(retriever=retriever)
Options:
- researcher: For finding information
- writer: For creating content
- reviewer: For reviewing and editing
- FINISH: Task is complete
Messages: {state['messages']}
Respond with just the agent name."""
    response = await llm.ainvoke(prompt)
    return {"next_agent": response.content.strip().lower()}
def route_to_agent(state: MultiAgentState) -> Literal["researcher", "writer", "reviewer", "end"]:
    """Route based on supervisor decision."""
    next_agent = state.get("next_agent", "").lower()
    if next_agent == "finish":
        return "end"
    return next_agent if next_agent in ["researcher", "writer", "reviewer"] else "end"
# Build multi-agent graph
builder = StateGraph(MultiAgentState)
builder.add_node("supervisor", supervisor)
builder.add_node("researcher", researcher)
builder.add_node("writer", writer)
builder.add_node("reviewer", reviewer)
builder.add_edge(START, "supervisor")
builder.add_conditional_edges("supervisor", route_to_agent, {
"researcher": "researcher",
"writer": "writer",
"reviewer": "reviewer",
"end": END
})
# Each agent returns to supervisor
for agent in ["researcher", "writer", "reviewer"]:
    builder.add_edge(agent, "supervisor")
multi_agent = builder.compile()
```
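The supervisor in Pattern 4 routes on the raw text of the model's reply, so a response such as `Writer.` or `Next: researcher` falls through to `end`. One way to harden this is sketched below in plain Python. `normalize_agent_choice` is a hypothetical helper, not a LangGraph API, and the matching rule (the earliest known name found as a whole word wins) is an assumption.

```python
import re

VALID_AGENTS = ("researcher", "writer", "reviewer")


def normalize_agent_choice(raw: str) -> str:
    """Map a free-form supervisor reply to a routing key, defaulting to 'end'."""
    text = raw.strip().lower()
    hits = []
    # Record where each known keyword first appears as a whole word
    for name in (*VALID_AGENTS, "finish"):
        match = re.search(rf"\b{name}\b", text)
        if match:
            hits.append((match.start(), name))
    if not hits:
        return "end"  # unrecognized replies terminate instead of looping
    choice = min(hits)[1]  # earliest mention wins
    return "end" if choice == "finish" else choice


print(normalize_agent_choice("Writer."))  # writer
```

A helper like this could be applied inside the supervisor node before the value is stored in `next_agent`. Using structured output with a constrained schema would be a stricter alternative.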
## Callback System
## Memory Management
### Thread-Scoped Memory with LangGraph Checkpointers
```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent
# In-memory checkpointer (development)
checkpointer = MemorySaver()
# Create agent with persistent memory
agent = create_react_agent(llm, tools, checkpointer=checkpointer)
# Each thread_id maintains separate conversation
config = {"configurable": {"thread_id": "session-abc123"}}
# Messages persist across invocations with same thread_id
result1 = await agent.ainvoke({"messages": [("user", "My name is Alice")]}, config)
result2 = await agent.ainvoke({"messages": [("user", "What's my name?")]}, config)
# Agent remembers: "Your name is Alice"
```
### Production Memory with PostgreSQL
```python
from langgraph.checkpoint.postgres import PostgresSaver
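# Production checkpointer; from_conn_string is a context manager.
# For async invocation (ainvoke), use AsyncPostgresSaver from
# langgraph.checkpoint.postgres.aio with "async with" instead.
with PostgresSaver.from_conn_string(
    "postgresql://user:pass@localhost/langgraph"
) as checkpointer:
    checkpointer.setup()  # create the checkpoint tables on first run
    agent = create_react_agent(llm, tools, checkpointer=checkpointer)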
```
### Vector Store Memory for Long-Term Context
```python
from langchain_community.vectorstores import Chroma
from langchain_voyageai import VoyageAIEmbeddings
embeddings = VoyageAIEmbeddings(model="voyage-3-large")
memory_store = Chroma(
collection_name="conversation_memory",
embedding_function=embeddings,
persist_directory="./memory_db"
)
async def retrieve_relevant_memory(query: str, k: int = 5) -> list:
    """Retrieve relevant past conversations."""
    docs = await memory_store.asimilarity_search(query, k=k)
    return [doc.page_content for doc in docs]
async def store_memory(content: str, metadata: dict | None = None):
    """Store conversation in long-term memory."""
    # Use None as the default to avoid a shared mutable default argument
    await memory_store.aadd_texts([content], metadatas=[metadata or {}])
```
## Callback System & LangSmith
### LangSmith Tracing
```python
import os
from langchain_anthropic import ChatAnthropic
# Enable LangSmith tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-project"
# All LangChain/LangGraph operations are automatically traced
llm = ChatAnthropic(model="claude-sonnet-4-5")
```
### Custom Callback Handler
```python
from langchain.callbacks.base import BaseCallbackHandler
from langchain_core.callbacks import BaseCallbackHandler
from typing import Any, Dict, List
class CustomCallbackHandler(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        print(f"LLM started with prompts: {prompts}")
    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs
    ) -> None:
        print(f"LLM started with {len(prompts)} prompts")
    def on_llm_end(self, response, **kwargs):
        print(f"LLM ended with response: {response}")
    def on_llm_end(self, response, **kwargs) -> None:
        print(f"LLM completed: {len(response.generations)} generations")
    def on_llm_error(self, error, **kwargs):
    def on_llm_error(self, error: Exception, **kwargs) -> None:
        print(f"LLM error: {error}")
    def on_chain_start(self, serialized, inputs, **kwargs):
        print(f"Chain started with inputs: {inputs}")
    def on_tool_start(
        self, serialized: Dict[str, Any], input_str: str, **kwargs
    ) -> None:
        print(f"Tool started: {serialized.get('name')}")
    def on_agent_action(self, action, **kwargs):
        print(f"Agent taking action: {action}")
    def on_tool_end(self, output: str, **kwargs) -> None:
        print(f"Tool completed: {output[:100]}...")
# Use callback
agent.run("query", callbacks=[CustomCallbackHandler()])
# Use callbacks
result = await agent.ainvoke(
{"messages": [("user", "query")]},
config={"callbacks": [CustomCallbackHandler()]}
)
```
## Streaming Responses
```python
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-sonnet-4-5", streaming=True)
# Stream tokens
async for chunk in llm.astream("Tell me a story"):
    print(chunk.content, end="", flush=True)
# Stream agent events
async for event in agent.astream_events(
    {"messages": [("user", "Search and summarize")]},
    version="v2"
):
    if event["event"] == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="")
    elif event["event"] == "on_tool_start":
        print(f"\n[Using tool: {event['name']}]")
```
## Testing Strategies
```python
import pytest
from unittest.mock import Mock
from unittest.mock import AsyncMock, patch
def test_agent_tool_selection():
    # Mock LLM to return specific tool selection
    mock_llm = Mock()
    mock_llm.predict.return_value = "Action: search_database\nAction Input: test query"
@pytest.mark.asyncio
async def test_agent_tool_selection():
    """Test agent selects correct tool."""
    with patch.object(llm, 'ainvoke') as mock_llm:
        mock_llm.return_value = AsyncMock(content="Using search_database")
    agent = initialize_agent(tools, mock_llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
        result = await agent.ainvoke({
            "messages": [("user", "search for documents")]
        })
    result = agent.run("test query")
        # Verify tool was called
        assert "search_database" in str(result)
    # Verify correct tool was selected
    assert "search_database" in str(mock_llm.predict.call_args)
@pytest.mark.asyncio
async def test_memory_persistence():
    """Test memory persists across invocations."""
    config = {"configurable": {"thread_id": "test-thread"}}
def test_memory_persistence():
    memory = ConversationBufferMemory()
    # First message
    await agent.ainvoke(
        {"messages": [("user", "Remember: the code is 12345")]},
        config
    )
    memory.save_context({"input": "Hi"}, {"output": "Hello!"})
    # Second message should remember
    result = await agent.ainvoke(
        {"messages": [("user", "What was the code?")]},
        config
    )
    assert "Hi" in memory.load_memory_variables({})['history']
    assert "Hello!" in memory.load_memory_variables({})['history']
    assert "12345" in result["messages"][-1].content
```
## Performance Optimization
### 1. Caching
```python
from langchain.cache import InMemoryCache
import langchain
### 1. Caching with Redis
langchain.llm_cache = InMemoryCache()
```python
from langchain_community.cache import RedisCache
from langchain_core.globals import set_llm_cache
import redis
redis_client = redis.Redis.from_url("redis://localhost:6379")
set_llm_cache(RedisCache(redis_client))
```
### 2. Batch Processing
### 2. Async Batch Processing
```python
# Process multiple documents in parallel
from langchain.document_loaders import DirectoryLoader
from concurrent.futures import ThreadPoolExecutor
import asyncio
from langchain_core.documents import Document
loader = DirectoryLoader('./docs')
docs = loader.load()
async def process_documents(documents: list[Document]) -> list:
    """Process documents in parallel."""
    tasks = [process_single(doc) for doc in documents]
    return await asyncio.gather(*tasks)
def process_doc(doc):
    return text_splitter.split_documents([doc])
with ThreadPoolExecutor(max_workers=4) as executor:
    split_docs = list(executor.map(process_doc, docs))
async def process_single(doc: Document) -> dict:
    """Process a single document."""
    chunks = text_splitter.split_documents([doc])
    embeddings = await embeddings_model.aembed_documents(
        [c.page_content for c in chunks]
    )
    return {"doc_id": doc.metadata.get("id"), "embeddings": embeddings}
```
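`asyncio.gather` starts every task at once, which can exceed embedding or LLM rate limits on large document sets. A common remedy is to cap concurrency with a semaphore. The sketch below uses only the standard library; `gather_bounded` and `fake_embed` are hypothetical names, and `fake_embed` merely simulates an embedding call.

```python
import asyncio


async def gather_bounded(coros, limit: int = 4) -> list:
    """Await all coroutines with at most `limit` running concurrently."""
    semaphore = asyncio.Semaphore(limit)

    async def _run(coro):
        async with semaphore:  # blocks while `limit` coroutines are in flight
            return await coro

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(_run(c) for c in coros))


async def fake_embed(doc_id: int) -> dict:
    """Stand-in for a real embedding request."""
    await asyncio.sleep(0.01)
    return {"doc_id": doc_id, "embeddings": [0.0]}


results = asyncio.run(gather_bounded([fake_embed(i) for i in range(10)], limit=3))
```

In the batch example above, the same wrapper could be applied to the `process_single(doc)` coroutines in place of the bare `asyncio.gather(*tasks)` call.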
### 3. Streaming Responses
```python
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
### 3. Connection Pooling
llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()])
```python
import os
from langchain_pinecone import PineconeVectorStore
from pinecone import Pinecone
# Reuse Pinecone client
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("my-index")
# Create vector store with existing index
vectorstore = PineconeVectorStore(index=index, embedding=embeddings)
```
## Resources
- **references/agents.md**: Deep dive on agent architectures
- **references/memory.md**: Memory system patterns
- **references/chains.md**: Chain composition strategies
- **references/document-processing.md**: Document loading and indexing
- **references/callbacks.md**: Monitoring and observability
- **assets/agent-template.py**: Production-ready agent template
- **assets/memory-config.yaml**: Memory configuration examples
- **assets/chain-example.py**: Complex chain examples
- [LangChain Documentation](https://python.langchain.com/docs/)
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [LangSmith Platform](https://smith.langchain.com/)
- [LangChain GitHub](https://github.com/langchain-ai/langchain)
- [LangGraph GitHub](https://github.com/langchain-ai/langgraph)
## Common Pitfalls
1. **Memory Overflow**: Not managing conversation history length
2. **Tool Selection Errors**: Poor tool descriptions confuse agents
3. **Context Window Exceeded**: Exceeding LLM token limits
4. **No Error Handling**: Not catching and handling agent failures
5. **Inefficient Retrieval**: Not optimizing vector store queries
1. **Using Deprecated APIs**: Use LangGraph for agents, not `initialize_agent`
2. **Memory Overflow**: Use checkpointers with TTL for long-running agents
3. **Poor Tool Descriptions**: Clear descriptions help LLM select correct tools
4. **Context Window Exceeded**: Use summarization or sliding window memory
5. **No Error Handling**: Wrap tool functions with try/except
6. **Blocking Operations**: Use async methods (`ainvoke`, `astream`)
7. **Missing Observability**: Always enable LangSmith tracing in production
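
As an illustration of pitfall 5, a small decorator can turn tool exceptions into error strings the agent can read and react to, rather than crashing the run. This is a sketch in plain Python: `safe_tool` and `lookup_order` are hypothetical names. Stacking it beneath LangChain's `@tool` decorator is an assumption about usage, not something shown in the commit.

```python
import functools


def safe_tool(func):
    """Wrap a tool function so failures come back as a readable error string."""

    @functools.wraps(func)  # preserve the name and docstring used as the tool description
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return f"Error in {func.__name__}: {exc}"

    return wrapper


@safe_tool
def lookup_order(order_id: str) -> str:
    """Hypothetical tool: look up the status of an order."""
    orders = {"A1": "shipped"}
    return orders[order_id]  # raises KeyError for unknown IDs


print(lookup_order("A1"))  # shipped
print(lookup_order("B2"))  # Error in lookup_order: 'B2'
```

Returning the error as text lets the model retry with different arguments. Errors that should stop the run, such as authentication failures, are better re-raised.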
## Production Checklist
- [ ] Implement proper error handling
- [ ] Add request/response logging
- [ ] Monitor token usage and costs
- [ ] Set timeout limits for agent execution
- [ ] Use LangGraph StateGraph for agent orchestration
- [ ] Implement async patterns throughout (`ainvoke`, `astream`)
- [ ] Add production checkpointer (PostgreSQL, Redis)
- [ ] Enable LangSmith tracing
- [ ] Implement structured tools with Pydantic schemas
- [ ] Add timeout limits for agent execution
- [ ] Implement rate limiting
- [ ] Add input validation
- [ ] Test with edge cases
- [ ] Set up observability (callbacks)
- [ ] Implement fallback strategies
- [ ] Add comprehensive error handling
- [ ] Set up health checks
- [ ] Version control prompts and configurations
- [ ] Write integration tests for agent workflows
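
Two checklist items, timeout limits and fallback strategies, can be combined in a single wrapper. The following is a minimal sketch using only `asyncio`. `invoke_with_timeout`, `slow_agent`, and `quick_fallback` are hypothetical names; with a real agent, `primary` would presumably be something like `agent.ainvoke`.

```python
import asyncio


async def invoke_with_timeout(primary, fallback, payload, timeout_s: float = 30.0):
    """Call `primary(payload)`; if it exceeds `timeout_s`, use `fallback(payload)`."""
    try:
        return await asyncio.wait_for(primary(payload), timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the primary call before raising
        return await fallback(payload)


async def slow_agent(payload: str) -> str:
    """Stand-in for an agent call that takes too long."""
    await asyncio.sleep(1)
    return "primary"


async def quick_fallback(payload: str) -> str:
    """Stand-in for a cheaper model or a canned response."""
    return f"fallback: {payload}"


result = asyncio.run(
    invoke_with_timeout(slow_agent, quick_fallback, "hello", timeout_s=0.05)
)
print(result)  # fallback: hello
```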