* feat: Parallelize prompt evaluation in optimize-prompt.py
- Update `PromptOptimizer.evaluate_prompt` to use `ThreadPoolExecutor` for concurrent test case processing
- Significantly reduces total execution time when using high-latency LLM clients (network IO bound)
- Maintain accurate metric aggregation (latency, accuracy, token count) from parallel results
- This prepares the script for real-world usage where sequential execution is a major bottleneck
⚡ Bolt: Reduces total evaluation time from O(n) to O(1) latency-wise (bounded by max_workers) for concurrent requests.
* feat: Parallelize prompt evaluation in optimize-prompt.py
- Update `PromptOptimizer.evaluate_prompt` to use `ThreadPoolExecutor` for concurrent test case processing
- Significantly reduces total execution time when using high-latency LLM clients (network IO bound)
- Maintain accurate metric aggregation (latency, accuracy, token count) from parallel results
- Ensure no generated artifacts (`optimization_results.json`) are committed
⚡ Bolt: Reduces total evaluation time from O(n) to O(1) latency-wise (bounded by max_workers) for concurrent requests.
---------
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
* Updated GPT and Claude models to latest, better and cheaper models
* updated more files to use GPT-5 and Sonnet/Haiku 4.5 because theu are the latest, cheaper and better models