Commit Graph

9 Commits

Author SHA1 Message Date
Seth Hobson
47a5dbc3f9 fix(skills): remove phantom resource references and fix CoC links (#447)
Remove references to non-existent resource files (references/, assets/,
scripts/, examples/) from 115 skill SKILL.md files. These sections
pointed to directories and files that were never created, causing
confusion when users install skills.

Also fix broken Code of Conduct links in issue templates to use
absolute GitHub URLs instead of relative paths that 404.
2026-03-07 10:53:17 -05:00
Seth Hobson
086557180a chore: update model references to Claude 4.6 and GPT-5.2
- Claude Opus 4.5 → Opus 4.6, Claude Sonnet 4.5 → Sonnet 4.6 (Haiku stays 4.5)
- Update claude-sonnet-4-5 model IDs to claude-sonnet-4-6 in code examples
- Update SWE-bench stat from 80.9% to 80.8% for Opus 4.6
- Update GPT refs: GPT-5 → GPT-5.2, GPT-4o → gpt-5.2, GPT-4o-mini → GPT-5-mini
- Fix GPT-5.2-mini → GPT-5-mini (correct model name per OpenAI)
- Bump marketplace to v1.5.2 and affected plugin versions
2026-02-19 14:03:46 -05:00
Seth Hobson
56848874a2 style: format all files with prettier
2026-01-19 17:07:03 -05:00
Seth Hobson
8be0e8ac7a feat(llm-application-dev): modernize to LangGraph and latest models v2.0.0
- Migrate from LangChain 0.x to LangChain 1.x/LangGraph patterns
- Update model references to Claude 4.5 and GPT-5.2
- Add Voyage AI as primary embedding recommendation
- Add structured outputs with Pydantic
- Replace deprecated initialize_agent() with StateGraph
- Fix security: use AST-based safe math instead of unsafe execution
- Add plugin.json and README.md for consistency
- Bump marketplace version to 1.3.3
2026-01-19 15:43:25 -05:00
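The "AST-based safe math" fix in the commit above can be illustrated with a minimal sketch. The commit's actual code is not shown in this log, so the function name `safe_eval_math` and the operator whitelist below are assumptions. The idea is to parse the expression with Python's `ast` module and walk the tree, accepting only numeric literals and a fixed set of arithmetic operators, instead of passing the string to `eval()`.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelist of permitted operators (an assumption for this sketch).
# Exponentiation is left out on purpose so inputs like 9**9**9 cannot stall the process.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval_math(expression: str) -> Number:
    """Evaluate a purely arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> Number:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Only plain int/float literals are accepted (bool and str are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, and everything else end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))
```

Anything that is not on the whitelist, including function calls such as `__import__('os')`, raises `ValueError` rather than being executed.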
google-labs-jules[bot]
a86384334b Bolt: optimize prompt evaluation loop to skip redundant calls (#152)
- Avoid re-evaluating the current prompt if metrics are already available from the previous iteration.
- Pass metrics from the best variation to the next iteration.
- Saves N-1 expensive LLM calls in an N-iteration optimization loop.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
2025-12-21 19:02:37 -05:00
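The optimization described in the commit above can be sketched as follows. This is not the code from the repository: the `optimize` signature, the `evaluate` and `make_variations` callables, and the `"accuracy"` metric key are hypothetical names chosen for illustration. The point is that the current prompt is evaluated once up front, and afterwards its metrics are carried over from the winning variation of the previous iteration.

```python
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]


def optimize(
    prompt: str,
    evaluate: Callable[[str], Metrics],
    make_variations: Callable[[str], List[str]],
    iterations: int,
) -> Tuple[str, Metrics]:
    """Hill-climb over prompt variations, evaluating each prompt at most once."""
    # The starting prompt is evaluated exactly once; in later iterations the
    # current prompt's metrics are carried over instead of being recomputed.
    current_metrics = evaluate(prompt)

    for _ in range(iterations):
        best_prompt, best_metrics = prompt, current_metrics
        for candidate in make_variations(prompt):
            candidate_metrics = evaluate(candidate)
            if candidate_metrics["accuracy"] > best_metrics["accuracy"]:
                best_prompt, best_metrics = candidate, candidate_metrics
        # Pass the winner's metrics forward rather than re-evaluating it.
        prompt, current_metrics = best_prompt, best_metrics

    return prompt, current_metrics
```

A loop that re-evaluated the current prompt at the top of every iteration would make N such calls over N iterations; this version makes one, which matches the N-1 saving the commit message describes.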
google-labs-jules[bot]
fda45604b7 Bolt: Optimize PromptOptimizer thread pool usage (#147)
* Bolt: Reuse ThreadPoolExecutor in PromptOptimizer

💡 What:
Initialized `ThreadPoolExecutor` in `PromptOptimizer.__init__` and reused it in `evaluate_prompt`. Added a `shutdown` method and wrapped execution in `try...finally` for proper resource management.

🎯 Why:
The previous implementation created a new `ThreadPoolExecutor` for every call to `evaluate_prompt`. Since `evaluate_prompt` is called repeatedly inside the `optimize` loop (and for every variation), this caused significant overhead from repeatedly creating and destroying thread pools.

📊 Impact:
Benchmark showed a reduction in execution time from ~5.36s to ~3.76s (~30% improvement) for 500 iterations with a mocked LLM.

🔬 Measurement:
Ran a benchmark script executing `evaluate_prompt` 500 times.
Before: 5.36s
After: 3.76s

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
2025-12-20 21:28:39 -05:00
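A minimal sketch of the pattern this commit describes: the pool is created once in `__init__`, reused by every `evaluate_prompt` call, and released through a `shutdown` method called from `try...finally`. The class and method names come from the commit message; the constructor parameters, the method bodies, and the mocked LLM are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


class PromptOptimizer:
    """Sketch: one long-lived thread pool shared by every evaluate_prompt call."""

    def __init__(self, llm: Callable[[str], str], max_workers: int = 4) -> None:
        self.llm = llm  # hypothetical LLM client: takes a prompt, returns text
        # Created once here rather than inside evaluate_prompt.
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def evaluate_prompt(self, prompt: str, test_inputs: List[str]) -> List[str]:
        # map() preserves input order and reuses the pool's worker threads.
        return list(
            self.executor.map(lambda text: self.llm(f"{prompt}\n{text}"), test_inputs)
        )

    def shutdown(self) -> None:
        # Wait for in-flight work, then release the worker threads.
        self.executor.shutdown(wait=True)


optimizer = PromptOptimizer(llm=lambda text: text.upper())  # mocked LLM
try:
    outputs = [optimizer.evaluate_prompt("echo", ["a", "b"]) for _ in range(3)]
finally:
    optimizer.shutdown()  # runs even if an evaluation raises
```

Calling `evaluate_prompt` after `shutdown` raises `RuntimeError`, because a shut-down executor no longer accepts new work.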
google-labs-jules[bot]
70cf3f3682 Bolt: Parallelize Prompt Evaluation in optimize-prompt.py (#145)
* feat: Parallelize prompt evaluation in optimize-prompt.py

- Update `PromptOptimizer.evaluate_prompt` to use `ThreadPoolExecutor` for concurrent test case processing
- Significantly reduces total execution time when using high-latency LLM clients (network I/O bound)
- Maintain accurate metric aggregation (latency, accuracy, token count) from parallel results
- This prepares the script for real-world usage where sequential execution is a major bottleneck
- Ensure no generated artifacts (`optimization_results.json`) are committed

Bolt: Reduces wall-clock evaluation time for n test cases from roughly n sequential LLM round trips to about ceil(n / max_workers) rounds of concurrent requests.

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
2025-12-19 09:12:15 -05:00
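The parallel evaluation with metric aggregation described above might look like the following sketch. It is written as a standalone function rather than the repository's `PromptOptimizer` method, and the `(input, expected)` test case shape, the metric names, and the whitespace-based token count are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def evaluate_prompt(
    prompt: str,
    test_cases: List[Tuple[str, str]],  # (input, expected output)
    llm: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, float]:
    """Run every test case concurrently and aggregate per-case metrics."""
    if not test_cases:
        raise ValueError("at least one test case is required")

    def run_case(case: Tuple[str, str]) -> Tuple[bool, float, int]:
        text, expected = case
        start = time.perf_counter()
        response = llm(f"{prompt}\n{text}")
        latency = time.perf_counter() - start
        # A whitespace-split word count stands in for a real tokenizer here.
        return response.strip() == expected, latency, len(response.split())

    # Threads suit this workload because each call mostly waits on network I/O.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_case, test_cases))

    total = len(results)
    return {
        "accuracy": sum(ok for ok, _, _ in results) / total,
        "avg_latency": sum(latency for _, latency, _ in results) / total,
        "total_tokens": float(sum(tokens for _, _, tokens in results)),
    }
```

Latency is measured inside each worker, so the average reflects per-request time rather than the shorter wall-clock time of the whole batch.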
Kunal Shah
1305e48672 Replace GPT and Claude models to latest, better and cheaper models (#118)
* Updated GPT and Claude models to latest, better and cheaper models

* Updated more files to use GPT-5 and Sonnet/Haiku 4.5 because they are the latest, cheaper, and better models
2025-11-16 20:22:36 -05:00
Seth Hobson
65e5cb093a feat: add Agent Skills and restructure documentation
- Add 47 Agent Skills across 14 plugins following Anthropic's specification
  - Python (5): async patterns, testing, packaging, performance, UV package manager
  - JavaScript/TypeScript (4): advanced types, Node.js patterns, testing, modern JS
  - Kubernetes (4): manifests, Helm charts, GitOps, security policies
  - Cloud Infrastructure (4): Terraform, multi-cloud, hybrid networking, cost optimization
  - CI/CD (4): pipeline design, GitHub Actions, GitLab CI, secrets management
  - Backend (3): API design, architecture patterns, microservices
  - LLM Applications (4): LangChain, prompt engineering, RAG, evaluation
  - Blockchain/Web3 (4): DeFi protocols, NFT standards, Solidity security, Web3 testing
  - Framework Migration (4): React, Angular, database, dependency upgrades
  - Observability (4): Prometheus, Grafana, distributed tracing, SLO
  - Payment Processing (4): Stripe, PayPal, PCI compliance, billing
  - API Scaffolding (1): FastAPI templates
  - ML Operations (1): ML pipeline workflow
  - Security (1): SAST configuration

- Restructure documentation into /docs directory
  - agent-skills.md: Complete guide to all 47 skills
  - agents.md: All 85 agents with model configuration
  - plugins.md: Complete catalog of 63 plugins
  - usage.md: Commands, workflows, and best practices
  - architecture.md: Design principles and patterns

- Update README.md
  - Add Agent Skills banner announcement
  - Reduce length by ~75% with links to detailed docs
  - Add What's New section showcasing Agent Skills
  - Add Popular Use Cases with real examples
  - Improve navigation with Core Guides and Quick Links

- Update marketplace.json with skills arrays for 14 plugins

All 47 skills follow the Agent Skills Specification:
- Required YAML frontmatter (name, description)
- "Use when" activation clauses
- Progressive disclosure architecture
- Descriptions under 1024 characters
2025-10-16 20:33:27 -04:00
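The frontmatter requirements listed in the commit above (required `name` and `description`, a "Use when" clause, and a bounded description length) could be checked with a small script like the one below. This is a sketch under stated assumptions: it parses only flat `key: value` frontmatter by hand instead of using a YAML library, the function names are hypothetical, and a description of exactly 1024 characters is treated as acceptable, which is a guess about the boundary.

```python
from typing import Dict, List

MAX_DESCRIPTION_CHARS = 1024  # limit taken from the commit message


def parse_frontmatter(skill_md: str) -> Dict[str, str]:
    """Parse flat `key: value` frontmatter delimited by `---` lines."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing delimiter never found


def check_skill(skill_md: str) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    fields = parse_frontmatter(skill_md)
    problems = []
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing required field: {required}")
    description = fields.get("description", "")
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description exceeds 1024 characters")
    if description and "use when" not in description.lower():
        problems.append('description lacks a "Use when" activation clause')
    return problems
```

A real validator would likely use a proper YAML parser so that multi-line and quoted values are handled correctly.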