- Migrate from LangChain 0.x to LangChain 1.x/LangGraph patterns
- Update model references to Claude 4.5 and GPT-5.2
- Add Voyage AI as primary embedding recommendation
- Add structured outputs with Pydantic
- Replace deprecated initialize_agent() with StateGraph
- Fix security: use AST-based safe math instead of unsafe execution (see the sketch after this list)
- Add plugin.json and README.md for consistency
- Bump marketplace version to 1.3.3
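For the security fix above, a minimal sketch of an AST-based safe math evaluator, assuming a simple arithmetic allow-list (the exact operator set and function names in the repo may differ):

```python
import ast
import operator

# Allow-listed arithmetic operators; anything else is rejected.
_BINOPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
           ast.Div: operator.truediv, ast.Pow: operator.pow, ast.Mod: operator.mod}
_UNARYOPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_eval(expression: str):
    """Evaluate arithmetic without eval()/exec(): walk the AST, allow-list nodes."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARYOPS:
            return _UNARYOPS[type(node.op)](walk(node.operand))
        raise ValueError(f"Disallowed expression node: {type(node).__name__}")
    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("2 ** 10 - (3 * 4)"))  # 1012
```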
- Avoid re-evaluating the current prompt when its metrics are already available from the previous iteration.
- Pass metrics from the best variation to the next iteration (see the sketch below).
- Saves N-1 expensive LLM calls in an N-iteration optimization loop.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
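A minimal sketch of the metric-reuse pattern, assuming a hypothetical optimizer API (`generate_variations` and the `accuracy` field are illustrative, not the actual interface):

```python
def optimize(optimizer, prompt, test_cases, iterations=5):
    best_prompt, best_metrics = prompt, None
    for _ in range(iterations):
        # Only evaluate the current best prompt when no metrics were carried
        # over from the previous iteration (saves one evaluation per loop).
        if best_metrics is None:
            best_metrics = optimizer.evaluate_prompt(best_prompt, test_cases)
        for variation in optimizer.generate_variations(best_prompt):
            metrics = optimizer.evaluate_prompt(variation, test_cases)
            # The winning variation's metrics carry into the next iteration.
            if metrics["accuracy"] > best_metrics["accuracy"]:
                best_prompt, best_metrics = variation, metrics
    return best_prompt, best_metrics
```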
* ⚡ Bolt: Reuse ThreadPoolExecutor in PromptOptimizer
💡 What:
Initialized `ThreadPoolExecutor` in `PromptOptimizer.__init__` and reused it in `evaluate_prompt`. Added a `shutdown` method and wrapped execution in `try...finally` for proper resource management (see the sketch below).
🎯 Why:
The previous implementation created a new `ThreadPoolExecutor` for every call to `evaluate_prompt`. Since `evaluate_prompt` is called repeatedly inside the `optimize` loop (and for every variation), this caused significant overhead from repeatedly creating and destroying thread pools.
📊 Impact:
Benchmark showed a reduction in execution time from ~5.36s to ~3.76s (~30% improvement) for 500 iterations with a mocked LLM.
🔬 Measurement:
Ran a benchmark script executing `evaluate_prompt` 500 times.
Before: 5.36s
After: 3.76s
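A minimal sketch of the reuse pattern described above; only `PromptOptimizer`, `evaluate_prompt`, and `shutdown` come from the commit, everything else (client interface, worker count) is an illustrative assumption:

```python
from concurrent.futures import ThreadPoolExecutor

class PromptOptimizer:
    def __init__(self, llm_client, max_workers: int = 8):
        self.llm_client = llm_client
        # One pool for the optimizer's lifetime, not one per evaluate_prompt call.
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def evaluate_prompt(self, prompt: str, test_cases: list) -> list:
        futures = [self.executor.submit(self._run_case, prompt, c) for c in test_cases]
        return [f.result() for f in futures]

    def _run_case(self, prompt, case):
        return self.llm_client.complete(prompt, case)  # hypothetical client call

    def shutdown(self) -> None:
        self.executor.shutdown(wait=True)

class _MockClient:  # stand-in for a real LLM client
    def complete(self, prompt, case):
        return {"case": case, "output": "ok"}

# Callers wrap usage in try...finally so the pool is always released:
optimizer = PromptOptimizer(_MockClient())
try:
    results = optimizer.evaluate_prompt("Summarize:", ["case 1", "case 2"])
finally:
    optimizer.shutdown()
```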
---------
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
* feat: Parallelize prompt evaluation in optimize-prompt.py
- Update `PromptOptimizer.evaluate_prompt` to use `ThreadPoolExecutor` for concurrent test case processing (see the sketch below)
- Significantly reduces total execution time with high-latency, network-I/O-bound LLM clients, preparing the script for real-world usage where sequential execution is the major bottleneck
- Maintain accurate metric aggregation (latency, accuracy, token count) from parallel results
- Ensure no generated artifacts (`optimization_results.json`) are committed
⚡ Bolt: Cuts wall-clock evaluation latency from O(n) sequential calls to roughly O(⌈n / max_workers⌉) rounds of concurrent requests.
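A hedged sketch of the parallel evaluation with metric aggregation; the test-case fields, response shape, and client call are assumptions, not the repo's actual code:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def evaluate_prompt(llm_client, prompt, test_cases, max_workers=8):
    def run_case(case):
        start = time.perf_counter()
        output = llm_client.complete(prompt, case["input"])  # network-bound call
        return {
            "latency": time.perf_counter() - start,
            "correct": output["text"].strip() == case["expected"],
            "tokens": output["tokens"],
        }

    # Futures preserve submission order, so results stay aligned with test_cases.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(run_case, c) for c in test_cases]
        results = [f.result() for f in futures]

    # Aggregate exactly as the sequential version did, just from parallel results.
    n = len(results)
    return {
        "avg_latency": sum(r["latency"] for r in results) / n,
        "accuracy": sum(r["correct"] for r in results) / n,
        "total_tokens": sum(r["tokens"] for r in results),
    }
```

The pool-reuse commits above move the executor into `__init__`; this sketch shows the per-call version that this commit introduced.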
---------
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
* Update GPT and Claude model references to the latest, cheaper, and better models
* Update more files to use GPT-5 and Claude Sonnet/Haiku 4.5, because they are the latest, cheaper, and better models