feat(llm-application-dev): modernize to LangGraph and latest models v2.0.0

- Migrate from LangChain 0.x to LangChain 1.x/LangGraph patterns - Update model references to Claude 4.5 and GPT-5.2 - Add Voyage AI as primary embedding recommendation - Add structured outputs with Pydantic - Replace deprecated initialize_agent() with StateGraph - Fix security: use AST-based safe math instead of unsafe execution - Add plugin.json and README.md for consistency - Bump marketplace version to 1.3.3
2026-03-18 09:37:15 +00:00 · 2026-01-19 15:43:25 -05:00
parent e827cc713a
commit 8be0e8ac7a
12 changed files with 1940 additions and 708 deletions
--- a/plugins/llm-application-dev/skills/langchain-architecture/SKILL.md
+++ b/plugins/llm-application-dev/skills/langchain-architecture/SKILL.md
@@ -1,11 +1,11 @@
 ---
 name: langchain-architecture
-description: Design LLM applications using the LangChain framework with agents, memory, and tool integration patterns. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
+description: Design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
 ---

-# LangChain Architecture
+# LangChain & LangGraph Architecture

-Master the LangChain framework for building sophisticated LLM applications with agents, chains, memory, and tool integration.
+Master modern LangChain 1.x and LangGraph for building sophisticated LLM applications with agents, state management, memory, and tool integration.

 ## When to Use This Skill

@@ -17,126 +17,100 @@ Master the LangChain framework for building sophisticated LLM applications with
 - Implementing document processing pipelines
 - Building production-grade LLM applications

+## Package Structure (LangChain 1.x)
+
+```
+langchain (1.2.x)         # High-level orchestration
+langchain-core (1.2.x)    # Core abstractions (messages, prompts, tools)
+langchain-community       # Third-party integrations
+langgraph                 # Agent orchestration and state management
+langchain-openai          # OpenAI integrations
+langchain-anthropic       # Anthropic/Claude integrations
+langchain-voyageai        # Voyage AI embeddings
+langchain-pinecone        # Pinecone vector store
+```
+
 ## Core Concepts

-### 1. Agents
-Autonomous systems that use LLMs to decide which actions to take.
+### 1. LangGraph Agents
+LangGraph is the standard for building agents in 2026. It provides:

-**Agent Types:**
- **ReAct**: Reasoning + Acting in interleaved manner
- **OpenAI Functions**: Leverages function calling API
- **Structured Chat**: Handles multi-input tools
- **Conversational**: Optimized for chat interfaces
- **Self-Ask with Search**: Decomposes complex queries
+**Key Features:**
+- **StateGraph**: Explicit state management with typed state
+- **Durable Execution**: Agents persist through failures
+- **Human-in-the-Loop**: Inspect and modify state at any point
+- **Memory**: Short-term and long-term memory across sessions
+- **Checkpointing**: Save and resume agent state

-### 2. Chains
-Sequences of calls to LLMs or other utilities.
+**Agent Patterns:**
+- **ReAct**: Reasoning + Acting with `create_react_agent`
+- **Plan-and-Execute**: Separate planning and execution nodes
+- **Multi-Agent**: Supervisor routing between specialized agents
+- **Tool-Calling**: Structured tool invocation with Pydantic schemas

-**Chain Types:**
- **LLMChain**: Basic prompt + LLM combination
- **SequentialChain**: Multiple chains in sequence
- **RouterChain**: Routes inputs to specialized chains
- **TransformChain**: Data transformations between steps
- **MapReduceChain**: Parallel processing with aggregation
+### 2. State Management
+LangGraph uses TypedDict for explicit state:

-### 3. Memory
-Systems for maintaining context across interactions.
+```python
+from typing import Annotated, TypedDict
+from langgraph.graph import MessagesState

-**Memory Types:**
- **ConversationBufferMemory**: Stores all messages
- **ConversationSummaryMemory**: Summarizes older messages
- **ConversationBufferWindowMemory**: Keeps last N messages
- **EntityMemory**: Tracks information about entities
- **VectorStoreMemory**: Semantic similarity retrieval
+# Simple message-based state
+class AgentState(MessagesState):
+    """Extends MessagesState with custom fields."""
+    context: Annotated[list, "retrieved documents"]
+
+# Custom state for complex agents
+class CustomState(TypedDict):
+    messages: Annotated[list, "conversation history"]
+    context: Annotated[dict, "retrieved context"]
+    current_step: str
+    results: list
+```
+
+### 3. Memory Systems
+Modern memory implementations:
+
+- **ConversationBufferMemory**: Stores all messages (short conversations)
+- **ConversationSummaryMemory**: Summarizes older messages (long conversations)
+- **ConversationTokenBufferMemory**: Token-based windowing
+- **VectorStoreRetrieverMemory**: Semantic similarity retrieval
+- **LangGraph Checkpointers**: Persistent state across sessions

 ### 4. Document Processing
-Loading, transforming, and storing documents for retrieval.
+Loading, transforming, and storing documents:

 **Components:**
 - **Document Loaders**: Load from various sources
 - **Text Splitters**: Chunk documents intelligently
 - **Vector Stores**: Store and retrieve embeddings
 - **Retrievers**: Fetch relevant documents
- **Indexes**: Organize documents for efficient access

-### 5. Callbacks
-Hooks for logging, monitoring, and debugging.
+### 5. Callbacks & Tracing
+LangSmith is the standard for observability:

-**Use Cases:**
 - Request/response logging
 - Token usage tracking
 - Latency monitoring
- Error handling
- Custom metrics collection
+- Error tracking
+- Trace visualization

 ## Quick Start

+### Modern ReAct Agent with LangGraph
+
 ```python
-from langchain.agents import AgentType, initialize_agent, load_tools
-from langchain.llms import OpenAI
-from langchain.memory import ConversationBufferMemory
+from langgraph.prebuilt import create_react_agent
+from langgraph.checkpoint.memory import MemorySaver
+from langchain_anthropic import ChatAnthropic
+from langchain_core.tools import tool
+import ast
+import operator

-# Initialize LLM
-llm = OpenAI(temperature=0)
-
-# Load tools
-tools = load_tools(["serpapi", "llm-math"], llm=llm)
-
-# Add memory
-memory = ConversationBufferMemory(memory_key="chat_history")
-
-# Create agent
-agent = initialize_agent(
-    tools,
-    llm,
-    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
-    memory=memory,
-    verbose=True
-)
-
-# Run agent
-result = agent.run("What's the weather in SF? Then calculate 25 * 4")
-```
-
-## Architecture Patterns
-
-### Pattern 1: RAG with LangChain
-```python
-from langchain.chains import RetrievalQA
-from langchain.document_loaders import TextLoader
-from langchain.text_splitter import CharacterTextSplitter
-from langchain.vectorstores import Chroma
-from langchain.embeddings import OpenAIEmbeddings
-
-# Load and process documents
-loader = TextLoader('documents.txt')
-documents = loader.load()
-
-text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
-texts = text_splitter.split_documents(documents)
-
-# Create vector store
-embeddings = OpenAIEmbeddings()
-vectorstore = Chroma.from_documents(texts, embeddings)
-
-# Create retrieval chain
-qa_chain = RetrievalQA.from_chain_type(
-    llm=llm,
-    chain_type="stuff",
-    retriever=vectorstore.as_retriever(),
-    return_source_documents=True
-)
-
-# Query
-result = qa_chain({"query": "What is the main topic?"})
-```
-
-### Pattern 2: Custom Agent with Tools
-```python
-from langchain.agents import Tool, AgentExecutor
-from langchain.agents.react.base import ReActDocstoreAgent
-from langchain.tools import tool
+# Initialize LLM (Claude Sonnet 4.5 recommended)
+llm = ChatAnthropic(model="claude-sonnet-4-5", temperature=0)

+# Define tools with Pydantic schemas
@tool
 def search_database(query: str) -> str:
    """Search internal database for information."""
@@ -144,195 +118,541 @@ def search_database(query: str) -> str:
    return f"Results for: {query}"

@tool
-def send_email(recipient: str, content: str) -> str:
+def calculate(expression: str) -> str:
+    """Safely evaluate a mathematical expression.
+
+    Supports: +, -, *, /, **, %, parentheses
+    Example: '(2 + 3) * 4' returns '20'
+    """
+    # Safe math evaluation using ast
+    allowed_operators = {
+        ast.Add: operator.add,
+        ast.Sub: operator.sub,
+        ast.Mult: operator.mul,
+        ast.Div: operator.truediv,
+        ast.Pow: operator.pow,
+        ast.Mod: operator.mod,
+        ast.USub: operator.neg,
+    }
+
+    def _eval(node):
+        if isinstance(node, ast.Constant):
+            return node.value
+        elif isinstance(node, ast.BinOp):
+            left = _eval(node.left)
+            right = _eval(node.right)
+            return allowed_operators[type(node.op)](left, right)
+        elif isinstance(node, ast.UnaryOp):
+            operand = _eval(node.operand)
+            return allowed_operators[type(node.op)](operand)
+        else:
+            raise ValueError(f"Unsupported operation: {type(node)}")
+
+    try:
+        tree = ast.parse(expression, mode='eval')
+        return str(_eval(tree.body))
+    except Exception as e:
+        return f"Error: {e}"
+
+tools = [search_database, calculate]
+
+# Create checkpointer for memory persistence
+checkpointer = MemorySaver()
+
+# Create ReAct agent
+agent = create_react_agent(
+    llm,
+    tools,
+    checkpointer=checkpointer
+)
+
+# Run agent with thread ID for memory
+config = {"configurable": {"thread_id": "user-123"}}
+result = await agent.ainvoke(
+    {"messages": [("user", "Search for Python tutorials and calculate 25 * 4")]},
+    config=config
+)
+```
+
+## Architecture Patterns
+
+### Pattern 1: RAG with LangGraph
+
+```python
+from langgraph.graph import StateGraph, START, END
+from langchain_anthropic import ChatAnthropic
+from langchain_voyageai import VoyageAIEmbeddings
+from langchain_pinecone import PineconeVectorStore
+from langchain_core.documents import Document
+from langchain_core.prompts import ChatPromptTemplate
+from typing import TypedDict, Annotated
+
+class RAGState(TypedDict):
+    question: str
+    context: Annotated[list[Document], "retrieved documents"]
+    answer: str
+
+# Initialize components
+llm = ChatAnthropic(model="claude-sonnet-4-5")
+embeddings = VoyageAIEmbeddings(model="voyage-3-large")
+vectorstore = PineconeVectorStore(index_name="docs", embedding=embeddings)
+retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
+
+# Define nodes
+async def retrieve(state: RAGState) -> RAGState:
+    """Retrieve relevant documents."""
+    docs = await retriever.ainvoke(state["question"])
+    return {"context": docs}
+
+async def generate(state: RAGState) -> RAGState:
+    """Generate answer from context."""
+    prompt = ChatPromptTemplate.from_template(
+        """Answer based on the context below. If you cannot answer, say so.
+
+        Context: {context}
+
+        Question: {question}
+
+        Answer:"""
+    )
+    context_text = "\n\n".join(doc.page_content for doc in state["context"])
+    response = await llm.ainvoke(
+        prompt.format(context=context_text, question=state["question"])
+    )
+    return {"answer": response.content}
+
+# Build graph
+builder = StateGraph(RAGState)
+builder.add_node("retrieve", retrieve)
+builder.add_node("generate", generate)
+builder.add_edge(START, "retrieve")
+builder.add_edge("retrieve", "generate")
+builder.add_edge("generate", END)
+
+rag_chain = builder.compile()
+
+# Use the chain
+result = await rag_chain.ainvoke({"question": "What is the main topic?"})
+```
+
+### Pattern 2: Custom Agent with Structured Tools
+
+```python
+from langchain_core.tools import StructuredTool
+from pydantic import BaseModel, Field
+
+class SearchInput(BaseModel):
+    """Input for database search."""
+    query: str = Field(description="Search query")
+    filters: dict = Field(default={}, description="Optional filters")
+
+class EmailInput(BaseModel):
+    """Input for sending email."""
+    recipient: str = Field(description="Email recipient")
+    subject: str = Field(description="Email subject")
+    content: str = Field(description="Email body")
+
+async def search_database(query: str, filters: dict = {}) -> str:
+    """Search internal database for information."""
+    # Your database search logic
+    return f"Results for '{query}' with filters {filters}"
+
+async def send_email(recipient: str, subject: str, content: str) -> str:
    """Send an email to specified recipient."""
    # Email sending logic
    return f"Email sent to {recipient}"

-tools = [search_database, send_email]
+tools = [
+    StructuredTool.from_function(
+        coroutine=search_database,
+        name="search_database",
+        description="Search internal database",
+        args_schema=SearchInput
+    ),
+    StructuredTool.from_function(
+        coroutine=send_email,
+        name="send_email",
+        description="Send an email",
+        args_schema=EmailInput
+    )
+]

-agent = initialize_agent(
-    tools,
-    llm,
-    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
-    verbose=True
-)
+agent = create_react_agent(llm, tools)
 ```

-### Pattern 3: Multi-Step Chain
+### Pattern 3: Multi-Step Workflow with StateGraph
+
 ```python
-from langchain.chains import LLMChain, SequentialChain
-from langchain.prompts import PromptTemplate
+from langgraph.graph import StateGraph, START, END
+from typing import TypedDict, Literal

-# Step 1: Extract key information
-extract_prompt = PromptTemplate(
-    input_variables=["text"],
-    template="Extract key entities from: {text}\n\nEntities:"
-)
-extract_chain = LLMChain(llm=llm, prompt=extract_prompt, output_key="entities")
+class WorkflowState(TypedDict):
+    text: str
+    entities: list
+    analysis: str
+    summary: str
+    current_step: str

-# Step 2: Analyze entities
-analyze_prompt = PromptTemplate(
-    input_variables=["entities"],
-    template="Analyze these entities: {entities}\n\nAnalysis:"
-)
-analyze_chain = LLMChain(llm=llm, prompt=analyze_prompt, output_key="analysis")
+async def extract_entities(state: WorkflowState) -> WorkflowState:
+    """Extract key entities from text."""
+    prompt = f"Extract key entities from: {state['text']}\n\nReturn as JSON list."
+    response = await llm.ainvoke(prompt)
+    return {"entities": response.content, "current_step": "analyze"}

-# Step 3: Generate summary
-summary_prompt = PromptTemplate(
-    input_variables=["entities", "analysis"],
-    template="Summarize:\nEntities: {entities}\nAnalysis: {analysis}\n\nSummary:"
-)
-summary_chain = LLMChain(llm=llm, prompt=summary_prompt, output_key="summary")
+async def analyze_entities(state: WorkflowState) -> WorkflowState:
+    """Analyze extracted entities."""
+    prompt = f"Analyze these entities: {state['entities']}\n\nProvide insights."
+    response = await llm.ainvoke(prompt)
+    return {"analysis": response.content, "current_step": "summarize"}

-# Combine into sequential chain
-overall_chain = SequentialChain(
-    chains=[extract_chain, analyze_chain, summary_chain],
-    input_variables=["text"],
-    output_variables=["entities", "analysis", "summary"],
-    verbose=True
-)
+async def generate_summary(state: WorkflowState) -> WorkflowState:
+    """Generate final summary."""
+    prompt = f"""Summarize:
+    Entities: {state['entities']}
+    Analysis: {state['analysis']}
+
+    Provide a concise summary."""
+    response = await llm.ainvoke(prompt)
+    return {"summary": response.content, "current_step": "complete"}
+
+def route_step(state: WorkflowState) -> Literal["analyze", "summarize", "end"]:
+    """Route to next step based on current state."""
+    step = state.get("current_step", "extract")
+    if step == "analyze":
+        return "analyze"
+    elif step == "summarize":
+        return "summarize"
+    return "end"
+
+# Build workflow
+builder = StateGraph(WorkflowState)
+builder.add_node("extract", extract_entities)
+builder.add_node("analyze", analyze_entities)
+builder.add_node("summarize", generate_summary)
+
+builder.add_edge(START, "extract")
+builder.add_conditional_edges("extract", route_step, {
+    "analyze": "analyze",
+    "summarize": "summarize",
+    "end": END
+})
+builder.add_conditional_edges("analyze", route_step, {
+    "summarize": "summarize",
+    "end": END
+})
+builder.add_edge("summarize", END)
+
+workflow = builder.compile()
 ```

-## Memory Management Best Practices
+### Pattern 4: Multi-Agent Orchestration

-### Choosing the Right Memory Type
 ```python
-# For short conversations (< 10 messages)
-from langchain.memory import ConversationBufferMemory
-memory = ConversationBufferMemory()
+from langgraph.graph import StateGraph, START, END
+from langgraph.prebuilt import create_react_agent
+from langchain_core.messages import HumanMessage
+from typing import Literal

-# For long conversations (summarize old messages)
-from langchain.memory import ConversationSummaryMemory
-memory = ConversationSummaryMemory(llm=llm)
+class MultiAgentState(TypedDict):
+    messages: list
+    next_agent: str

-# For sliding window (last N messages)
-from langchain.memory import ConversationBufferWindowMemory
-memory = ConversationBufferWindowMemory(k=5)
+# Create specialized agents
+researcher = create_react_agent(llm, research_tools)
+writer = create_react_agent(llm, writing_tools)
+reviewer = create_react_agent(llm, review_tools)

-# For entity tracking
-from langchain.memory import ConversationEntityMemory
-memory = ConversationEntityMemory(llm=llm)
+async def supervisor(state: MultiAgentState) -> MultiAgentState:
+    """Route to appropriate agent based on task."""
+    prompt = f"""Based on the conversation, which agent should handle this?

-# For semantic retrieval of relevant history
-from langchain.memory import VectorStoreRetrieverMemory
-memory = VectorStoreRetrieverMemory(retriever=retriever)
+    Options:
+    - researcher: For finding information
+    - writer: For creating content
+    - reviewer: For reviewing and editing
+    - FINISH: Task is complete
+
+    Messages: {state['messages']}
+
+    Respond with just the agent name."""
+
+    response = await llm.ainvoke(prompt)
+    return {"next_agent": response.content.strip().lower()}
+
+def route_to_agent(state: MultiAgentState) -> Literal["researcher", "writer", "reviewer", "end"]:
+    """Route based on supervisor decision."""
+    next_agent = state.get("next_agent", "").lower()
+    if next_agent == "finish":
+        return "end"
+    return next_agent if next_agent in ["researcher", "writer", "reviewer"] else "end"
+
+# Build multi-agent graph
+builder = StateGraph(MultiAgentState)
+builder.add_node("supervisor", supervisor)
+builder.add_node("researcher", researcher)
+builder.add_node("writer", writer)
+builder.add_node("reviewer", reviewer)
+
+builder.add_edge(START, "supervisor")
+builder.add_conditional_edges("supervisor", route_to_agent, {
+    "researcher": "researcher",
+    "writer": "writer",
+    "reviewer": "reviewer",
+    "end": END
+})
+
+# Each agent returns to supervisor
+for agent in ["researcher", "writer", "reviewer"]:
+    builder.add_edge(agent, "supervisor")
+
+multi_agent = builder.compile()
 ```

-## Callback System
+## Memory Management
+
+### Token-Based Memory with LangGraph
+
+```python
+from langgraph.checkpoint.memory import MemorySaver
+from langgraph.prebuilt import create_react_agent
+
+# In-memory checkpointer (development)
+checkpointer = MemorySaver()
+
+# Create agent with persistent memory
+agent = create_react_agent(llm, tools, checkpointer=checkpointer)
+
+# Each thread_id maintains separate conversation
+config = {"configurable": {"thread_id": "session-abc123"}}
+
+# Messages persist across invocations with same thread_id
+result1 = await agent.ainvoke({"messages": [("user", "My name is Alice")]}, config)
+result2 = await agent.ainvoke({"messages": [("user", "What's my name?")]}, config)
+# Agent remembers: "Your name is Alice"
+```
+
+### Production Memory with PostgreSQL
+
+```python
+from langgraph.checkpoint.postgres import PostgresSaver
+
+# Production checkpointer
+checkpointer = PostgresSaver.from_conn_string(
+    "postgresql://user:pass@localhost/langgraph"
+)
+
+agent = create_react_agent(llm, tools, checkpointer=checkpointer)
+```
+
+### Vector Store Memory for Long-Term Context
+
+```python
+from langchain_community.vectorstores import Chroma
+from langchain_voyageai import VoyageAIEmbeddings
+
+embeddings = VoyageAIEmbeddings(model="voyage-3-large")
+memory_store = Chroma(
+    collection_name="conversation_memory",
+    embedding_function=embeddings,
+    persist_directory="./memory_db"
+)
+
+async def retrieve_relevant_memory(query: str, k: int = 5) -> list:
+    """Retrieve relevant past conversations."""
+    docs = await memory_store.asimilarity_search(query, k=k)
+    return [doc.page_content for doc in docs]
+
+async def store_memory(content: str, metadata: dict = {}):
+    """Store conversation in long-term memory."""
+    await memory_store.aadd_texts([content], metadatas=[metadata])
+```
+
+## Callback System & LangSmith
+
+### LangSmith Tracing
+
+```python
+import os
+from langchain_anthropic import ChatAnthropic
+
+# Enable LangSmith tracing
+os.environ["LANGCHAIN_TRACING_V2"] = "true"
+os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
+os.environ["LANGCHAIN_PROJECT"] = "my-project"
+
+# All LangChain/LangGraph operations are automatically traced
+llm = ChatAnthropic(model="claude-sonnet-4-5")
+```

 ### Custom Callback Handler
+
 ```python
-from langchain.callbacks.base import BaseCallbackHandler
+from langchain_core.callbacks import BaseCallbackHandler
+from typing import Any, Dict, List

 class CustomCallbackHandler(BaseCallbackHandler):
-    def on_llm_start(self, serialized, prompts, **kwargs):
-        print(f"LLM started with prompts: {prompts}")
+    def on_llm_start(
+        self, serialized: Dict[str, Any], prompts: List[str], **kwargs
+    ) -> None:
+        print(f"LLM started with {len(prompts)} prompts")

-    def on_llm_end(self, response, **kwargs):
-        print(f"LLM ended with response: {response}")
+    def on_llm_end(self, response, **kwargs) -> None:
+        print(f"LLM completed: {len(response.generations)} generations")

-    def on_llm_error(self, error, **kwargs):
+    def on_llm_error(self, error: Exception, **kwargs) -> None:
        print(f"LLM error: {error}")

-    def on_chain_start(self, serialized, inputs, **kwargs):
-        print(f"Chain started with inputs: {inputs}")
+    def on_tool_start(
+        self, serialized: Dict[str, Any], input_str: str, **kwargs
+    ) -> None:
+        print(f"Tool started: {serialized.get('name')}")

-    def on_agent_action(self, action, **kwargs):
-        print(f"Agent taking action: {action}")
+    def on_tool_end(self, output: str, **kwargs) -> None:
+        print(f"Tool completed: {output[:100]}...")

-# Use callback
-agent.run("query", callbacks=[CustomCallbackHandler()])
+# Use callbacks
+result = await agent.ainvoke(
+    {"messages": [("user", "query")]},
+    config={"callbacks": [CustomCallbackHandler()]}
+)
+```
+
+## Streaming Responses
+
+```python
+from langchain_anthropic import ChatAnthropic
+
+llm = ChatAnthropic(model="claude-sonnet-4-5", streaming=True)
+
+# Stream tokens
+async for chunk in llm.astream("Tell me a story"):
+    print(chunk.content, end="", flush=True)
+
+# Stream agent events
+async for event in agent.astream_events(
+    {"messages": [("user", "Search and summarize")]},
+    version="v2"
+):
+    if event["event"] == "on_chat_model_stream":
+        print(event["data"]["chunk"].content, end="")
+    elif event["event"] == "on_tool_start":
+        print(f"\n[Using tool: {event['name']}]")
 ```

 ## Testing Strategies

 ```python
 import pytest
-from unittest.mock import Mock
+from unittest.mock import AsyncMock, patch

-def test_agent_tool_selection():
-    # Mock LLM to return specific tool selection
-    mock_llm = Mock()
-    mock_llm.predict.return_value = "Action: search_database\nAction Input: test query"
+@pytest.mark.asyncio
+async def test_agent_tool_selection():
+    """Test agent selects correct tool."""
+    with patch.object(llm, 'ainvoke') as mock_llm:
+        mock_llm.return_value = AsyncMock(content="Using search_database")

-    agent = initialize_agent(tools, mock_llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
+        result = await agent.ainvoke({
+            "messages": [("user", "search for documents")]
+        })

-    result = agent.run("test query")
+        # Verify tool was called
+        assert "search_database" in str(result)

-    # Verify correct tool was selected
-    assert "search_database" in str(mock_llm.predict.call_args)
+@pytest.mark.asyncio
+async def test_memory_persistence():
+    """Test memory persists across invocations."""
+    config = {"configurable": {"thread_id": "test-thread"}}

-def test_memory_persistence():
-    memory = ConversationBufferMemory()
+    # First message
+    await agent.ainvoke(
+        {"messages": [("user", "Remember: the code is 12345")]},
+        config
+    )

-    memory.save_context({"input": "Hi"}, {"output": "Hello!"})
+    # Second message should remember
+    result = await agent.ainvoke(
+        {"messages": [("user", "What was the code?")]},
+        config
+    )

-    assert "Hi" in memory.load_memory_variables({})['history']
-    assert "Hello!" in memory.load_memory_variables({})['history']
+    assert "12345" in result["messages"][-1].content
 ```

 ## Performance Optimization

-### 1. Caching
-```python
-from langchain.cache import InMemoryCache
-import langchain
+### 1. Caching with Redis

-langchain.llm_cache = InMemoryCache()
+```python
+from langchain_community.cache import RedisCache
+from langchain_core.globals import set_llm_cache
+import redis
+
+redis_client = redis.Redis.from_url("redis://localhost:6379")
+set_llm_cache(RedisCache(redis_client))
 ```

-### 2. Batch Processing
+### 2. Async Batch Processing
+
 ```python
-# Process multiple documents in parallel
-from langchain.document_loaders import DirectoryLoader
-from concurrent.futures import ThreadPoolExecutor
+import asyncio
+from langchain_core.documents import Document

-loader = DirectoryLoader('./docs')
-docs = loader.load()
+async def process_documents(documents: list[Document]) -> list:
+    """Process documents in parallel."""
+    tasks = [process_single(doc) for doc in documents]
+    return await asyncio.gather(*tasks)

-def process_doc(doc):
-    return text_splitter.split_documents([doc])
-
-with ThreadPoolExecutor(max_workers=4) as executor:
-    split_docs = list(executor.map(process_doc, docs))
+async def process_single(doc: Document) -> dict:
+    """Process a single document."""
+    chunks = text_splitter.split_documents([doc])
+    embeddings = await embeddings_model.aembed_documents(
+        [c.page_content for c in chunks]
+    )
+    return {"doc_id": doc.metadata.get("id"), "embeddings": embeddings}
 ```

-### 3. Streaming Responses
-```python
-from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
+### 3. Connection Pooling

-llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()])
+```python
+from langchain_pinecone import PineconeVectorStore
+from pinecone import Pinecone
+
+# Reuse Pinecone client
+pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
+index = pc.Index("my-index")
+
+# Create vector store with existing index
+vectorstore = PineconeVectorStore(index=index, embedding=embeddings)
 ```

 ## Resources

- **references/agents.md**: Deep dive on agent architectures
- **references/memory.md**: Memory system patterns
- **references/chains.md**: Chain composition strategies
- **references/document-processing.md**: Document loading and indexing
- **references/callbacks.md**: Monitoring and observability
- **assets/agent-template.py**: Production-ready agent template
- **assets/memory-config.yaml**: Memory configuration examples
- **assets/chain-example.py**: Complex chain examples
+- [LangChain Documentation](https://python.langchain.com/docs/)
+- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
+- [LangSmith Platform](https://smith.langchain.com/)
+- [LangChain GitHub](https://github.com/langchain-ai/langchain)
+- [LangGraph GitHub](https://github.com/langchain-ai/langgraph)

 ## Common Pitfalls

-1. **Memory Overflow**: Not managing conversation history length
-2. **Tool Selection Errors**: Poor tool descriptions confuse agents
-3. **Context Window Exceeded**: Exceeding LLM token limits
-4. **No Error Handling**: Not catching and handling agent failures
-5. **Inefficient Retrieval**: Not optimizing vector store queries
+1. **Using Deprecated APIs**: Use LangGraph for agents, not `initialize_agent`
+2. **Memory Overflow**: Use checkpointers with TTL for long-running agents
+3. **Poor Tool Descriptions**: Clear descriptions help LLM select correct tools
+4. **Context Window Exceeded**: Use summarization or sliding window memory
+5. **No Error Handling**: Wrap tool functions with try/except
+6. **Blocking Operations**: Use async methods (`ainvoke`, `astream`)
+7. **Missing Observability**: Always enable LangSmith tracing in production

 ## Production Checklist

- [ ] Implement proper error handling
- [ ] Add request/response logging
- [ ] Monitor token usage and costs
- [ ] Set timeout limits for agent execution
+- [ ] Use LangGraph StateGraph for agent orchestration
+- [ ] Implement async patterns throughout (`ainvoke`, `astream`)
+- [ ] Add production checkpointer (PostgreSQL, Redis)
+- [ ] Enable LangSmith tracing
+- [ ] Implement structured tools with Pydantic schemas
+- [ ] Add timeout limits for agent execution
 - [ ] Implement rate limiting
- [ ] Add input validation
- [ ] Test with edge cases
- [ ] Set up observability (callbacks)
- [ ] Implement fallback strategies
+- [ ] Add comprehensive error handling
+- [ ] Set up health checks
 - [ ] Version control prompts and configurations
+- [ ] Write integration tests for agent workflows