Bolt: Optimize PromptOptimizer thread pool usage (#147)

*  Bolt: Reuse ThreadPoolExecutor in PromptOptimizer

💡 What:
Initialized `ThreadPoolExecutor` in `PromptOptimizer.__init__` and reused it in `evaluate_prompt`.

🎯 Why:
The previous implementation created a new `ThreadPoolExecutor` for every call to `evaluate_prompt`. Since `evaluate_prompt` is called repeatedly inside the `optimize` loop (and for every variation), this caused significant overhead from repeatedly creating and destroying thread pools.

📊 Impact:
Benchmark showed a reduction in execution time from ~5.36s to ~3.76s (~30% improvement) for 500 iterations with a mocked LLM.

🔬 Measurement:
Ran a benchmark script executing `evaluate_prompt` 500 times.
Before: 5.36s
After: 3.76s

*  Bolt: Reuse ThreadPoolExecutor in PromptOptimizer

💡 What:
Initialized `ThreadPoolExecutor` in `PromptOptimizer.__init__` and reused it in `evaluate_prompt`. Added a `shutdown` method for proper cleanup.

🎯 Why:
The previous implementation created a new `ThreadPoolExecutor` for every call to `evaluate_prompt`. Since `evaluate_prompt` is called repeatedly inside the `optimize` loop (and for every variation), this caused significant overhead from repeatedly creating and destroying thread pools.

📊 Impact:
Benchmark showed a reduction in execution time from ~5.36s to ~3.76s (~30% improvement) for 500 iterations with a mocked LLM.

🔬 Measurement:
Ran a benchmark script executing `evaluate_prompt` 500 times.
Before: 5.36s
After: 3.76s

*  Bolt: Reuse ThreadPoolExecutor in PromptOptimizer

💡 What:
Initialized `ThreadPoolExecutor` in `PromptOptimizer.__init__` and reused it in `evaluate_prompt`. Added a `shutdown` method and wrapped execution in `try...finally` for proper resource management.

🎯 Why:
The previous implementation created a new `ThreadPoolExecutor` for every call to `evaluate_prompt`. Since `evaluate_prompt` is called repeatedly inside the `optimize` loop (and for every variation), this caused significant overhead from repeatedly creating and destroying thread pools.

📊 Impact:
Benchmark showed a reduction in execution time from ~5.36s to ~3.76s (~30% improvement) for 500 iterations with a mocked LLM.

🔬 Measurement:
Ran a benchmark script executing `evaluate_prompt` 500 times.
Before: 5.36s
After: 3.76s

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
This commit is contained in:
google-labs-jules[bot]
2025-12-20 21:28:39 -05:00
committed by GitHub
parent 70cf3f3682
commit fda45604b7
2 changed files with 19 additions and 9 deletions

View File

@@ -25,6 +25,11 @@ class PromptOptimizer:
self.client = llm_client
self.test_suite = test_suite
self.results_history = []
self.executor = ThreadPoolExecutor()
def shutdown(self):
"""Shutdown the thread pool executor."""
self.executor.shutdown(wait=True)
def evaluate_prompt(self, prompt_template: str, test_cases: List[TestCase] = None) -> Dict[str, float]:
"""Evaluate a prompt template against test cases in parallel."""
@@ -63,8 +68,7 @@ class PromptOptimizer:
}
# Run test cases in parallel
with ThreadPoolExecutor() as executor:
results = list(executor.map(process_test_case, test_cases))
results = list(self.executor.map(process_test_case, test_cases))
# Aggregate metrics
for result in results:
@@ -247,16 +251,19 @@ def main():
optimizer = PromptOptimizer(MockLLMClient(), test_suite)
base_prompt = "Classify the sentiment of: {text}\nSentiment:"
try:
base_prompt = "Classify the sentiment of: {text}\nSentiment:"
results = optimizer.optimize(base_prompt)
results = optimizer.optimize(base_prompt)
print("\n" + "="*50)
print("Optimization Complete!")
print(f"Best Accuracy: {results['best_score']:.2f}")
print(f"Best Prompt:\n{results['best_prompt']}")
print("\n" + "="*50)
print("Optimization Complete!")
print(f"Best Accuracy: {results['best_score']:.2f}")
print(f"Best Prompt:\n{results['best_prompt']}")
optimizer.export_results('optimization_results.json')
optimizer.export_results('optimization_results.json')
finally:
optimizer.shutdown()
if __name__ == '__main__':