mirror of
https://github.com/wshobson/agents.git
synced 2026-03-18 09:37:15 +00:00
---
model: sonnet
---

# AI/ML Code Review

Perform a specialized AI/ML code review for: $ARGUMENTS

Conduct a comprehensive review focusing on:

1. **Model Code Quality**:
   - Reproducibility checks
   - Random seed management
   - Data leakage detection
   - Train/test split validation
   - Feature engineering clarity

2. **AI Best Practices**:
   - Prompt injection prevention
   - Token limit handling
   - Cost optimization
   - Fallback strategies
   - Timeout management

3. **Data Handling**:
   - Privacy compliance (PII handling)
   - Data versioning
   - Preprocessing consistency
   - Batch processing efficiency
   - Memory optimization

4. **Model Management**:
   - Version control for models
   - A/B testing setup
   - Rollback capabilities
   - Performance benchmarks
   - Drift detection

5. **LLM-Specific Checks**:
   - Context window management
   - Prompt template security
   - Response validation
   - Streaming implementation
   - Rate limit handling

6. **Vector Database Review**:
   - Embedding consistency
   - Index optimization
   - Query performance
   - Metadata management
   - Backup strategies

7. **Production Readiness**:
   - GPU/CPU optimization
   - Batching strategies
   - Caching implementation
   - Monitoring hooks
   - Error recovery

8. **Testing Coverage**:
   - Unit tests for preprocessing
   - Integration tests for pipelines
   - Model performance tests
   - Edge case handling
   - Mocked LLM responses

Provide specific recommendations with severity levels (Critical/High/Medium/Low). Include code examples for improvements and links to relevant best practices.
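
As an illustration of the kind of code example a finding might carry, here is a minimal sketch for the random seed management and data leakage items in section 1. It assumes plain Python lists of floats rather than any particular ML framework, and the helper names (`split_train_test`, `fit_scaler`, `apply_scaler`) are hypothetical. The point it shows is that the split uses a local seeded RNG and that scaling statistics come from the training split only.

```python
import random
import statistics


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle with a local, seeded RNG so the split is reproducible."""
    rng = random.Random(seed)  # local RNG: leaves global random state untouched
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(train_values):
    """Compute scaling statistics from the training split only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values) or 1.0  # guard against zero variance
    return mean, std


def apply_scaler(values, mean, std):
    """Apply previously fitted statistics; never refit on test data."""
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    data = [float(x) for x in range(100)]
    train, test = split_train_test(data, test_fraction=0.2, seed=42)
    mean, std = fit_scaler(train)          # fitted on train only
    train_scaled = apply_scaler(train, mean, std)
    test_scaled = apply_scaler(test, mean, std)  # reuses train statistics
    print(len(train), len(test))
```

A review finding would typically flag the opposite pattern (statistics computed over the full dataset before splitting) as a leakage risk and suggest a restructuring along these lines.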
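
For the fallback, response validation, and rate limit items in sections 2 and 5, a recommendation could be backed by a sketch like the following. It is an assumption-laden outline, not a specific provider's API: `TransientLLMError` is a hypothetical stand-in for whatever rate-limit or timeout exception the client library raises, and `call` is any zero-argument function that returns the model's text.

```python
import random
import time


class TransientLLMError(Exception):
    """Hypothetical stand-in for a provider's rate-limit or timeout error."""


def call_with_retry(call, fallback, max_attempts=3, base_delay=0.5,
                    validate=lambda text: bool(text and text.strip()),
                    sleep=time.sleep):
    """Retry `call()` with exponential backoff; return `fallback` if it never succeeds.

    A response counts as a success only if `validate` accepts it, so empty
    or whitespace-only output is retried like a transient error.
    """
    for attempt in range(max_attempts):
        try:
            response = call()
            if validate(response):
                return response
        except TransientLLMError:
            pass  # retryable: fall through to the backoff below
        if attempt < max_attempts - 1:
            # Exponential backoff with a small jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
    return fallback
```

Passing `sleep` as a parameter keeps the function testable with mocked LLM responses: a test can supply a fake `call` and a no-op `sleep` instead of waiting on real delays.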