---
model: claude-sonnet-4-0
---

# Data Pipeline Architecture

Design and implement a scalable data pipeline for: $ARGUMENTS

Create a production-ready data pipeline including:

1. **Data Ingestion**:
- Multiple source connectors (APIs, databases, files, streams)
- Schema evolution handling
- Incremental/batch loading
- Data quality checks at ingestion
- Dead letter queue for failures
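As a minimal sketch of the last two ingestion points, records that fail a quality check can be routed to a dead letter queue instead of being dropped or aborting the batch. The record shape, `validate_order`, and the in-memory lists are hypothetical stand-ins; a real pipeline would likely write dead letters to a durable queue or table.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable

Record = dict[str, Any]


@dataclass
class IngestResult:
    accepted: list[Record] = field(default_factory=list)
    dead_letters: list[dict[str, Any]] = field(default_factory=list)


def validate_order(record: Record) -> None:
    """Hypothetical quality check: raise ValueError when a record is unusable."""
    order_id = record.get("order_id")
    if not isinstance(order_id, str) or not order_id:
        raise ValueError("order_id must be a non-empty string")
    amount = record.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        raise ValueError("amount must be a non-negative number")


def ingest(records: Iterable[Record], validate: Callable[[Record], None]) -> IngestResult:
    """Validate each record; keep failures with their error for later replay."""
    result = IngestResult()
    for record in records:
        try:
            validate(record)
        except ValueError as exc:
            # The in-memory list stands in for a dead letter queue (assumption).
            result.dead_letters.append({"record": record, "error": str(exc)})
        else:
            result.accepted.append(record)
    return result


batch = [
    {"order_id": "A-1", "amount": 19.99},
    {"order_id": "", "amount": 5},
    {"order_id": "A-3", "amount": -2},
]
result = ingest(batch, validate_order)
print(len(result.accepted), "accepted,", len(result.dead_letters), "dead-lettered")
```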

2. **Transformation Layer**:
- ETL/ELT architecture decision
- Apache Beam/Spark transformations
- Data cleansing and normalization
- Feature engineering pipeline
- Business logic implementation
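The cleansing and normalization step might look like the following plain-Python sketch, which the chosen engine (Beam, Spark, or other) would apply per record. The field names and the rule that naive timestamps are treated as UTC are assumptions made for illustration.

```python
from datetime import datetime, timezone


def normalize_event(raw: dict) -> dict:
    """Trim and case-fold text fields, and convert the timestamp to UTC ISO 8601."""
    ts = datetime.fromisoformat(raw["ts"])
    if ts.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return {
        "user_email": raw["user_email"].strip().lower(),
        "country": raw["country"].strip().upper(),
        "ts": ts.astimezone(timezone.utc).isoformat(),
    }


def cleanse(rows: list[dict]) -> list[dict]:
    """Normalize every row and drop duplicates on (user_email, ts)."""
    seen = set()
    cleaned = []
    for raw in rows:
        row = normalize_event(raw)
        key = (row["user_email"], row["ts"])
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned


rows = [
    {"user_email": "  Ada@Example.com ", "country": "gb", "ts": "2024-01-05T10:00:00+02:00"},
    {"user_email": "ada@example.com", "country": "GB ", "ts": "2024-01-05T08:00:00"},
]
clean = cleanse(rows)
print(clean)
```

Both input rows describe the same instant for the same user once normalized, so only one row survives.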

3. **Orchestration**:
- Airflow/Prefect DAGs
- Dependency management
- Retry and failure handling
- SLA monitoring
- Dynamic pipeline generation
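Dependency management and retry handling can be illustrated without an orchestrator, using only the standard library. This is a sketch of the idea, not Airflow or Prefect code; the task names, the retry count, and the backoff delays are all hypothetical.

```python
import time
from graphlib import TopologicalSorter


def run_with_retry(task, retries=3, base_delay=0.01):
    """Run a task, retrying with exponential backoff; re-raise after the last attempt."""
    for attempt in range(1, retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


def run_pipeline(tasks, dependencies):
    """Run tasks in an order where every task follows the tasks it depends on."""
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        run_with_retry(tasks[name])
    return order


calls = {"extract": 0}
log = []


def extract():
    calls["extract"] += 1
    if calls["extract"] == 1:
        raise ConnectionError("transient failure")  # simulated flaky source
    log.append("extract")


tasks = {
    "extract": extract,
    "transform": lambda: log.append("transform"),
    "load": lambda: log.append("load"),
}
# Each key maps a task to the set of tasks it depends on.
order = run_pipeline(tasks, {"transform": {"extract"}, "load": {"transform"}})
print(order, calls)
```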

4. **Storage Strategy**:
- Data lake architecture
- Partitioning strategy
- Compression choices
- Retention policies
- Hot/cold storage tiers
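One common partitioning strategy is a Hive-style date layout keyed on event time in UTC. The sketch below builds such an object key; the `raw` prefix, dataset name, and file name are hypothetical, and the year/month/day granularity is an assumption that should follow actual query patterns.

```python
from datetime import datetime, timedelta, timezone


def partition_path(prefix: str, dataset: str, event_time: datetime, filename: str) -> str:
    """Build a Hive-style key such as prefix/dataset/year=YYYY/month=MM/day=DD/file."""
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    t = event_time.astimezone(timezone.utc)  # partition on UTC to avoid local-time drift
    return "/".join(
        [prefix, dataset, f"year={t:%Y}", f"month={t:%m}", f"day={t:%d}", filename]
    )


# 23:30 at UTC-5 falls on the next day in UTC, so the record lands in day=08.
local_time = datetime(2024, 3, 7, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
path = partition_path("raw", "orders", local_time, "part-0000.parquet")
print(path)
```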

5. **Streaming Pipeline**:
- Kafka/Kinesis integration
- Real-time processing
- Windowing strategies
- Late data handling
- Exactly-once semantics
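Windowing and late data handling can be sketched in plain Python as tumbling windows with a watermark. The window size, the allowed lateness, and the integer-second event times are assumptions; the sketch does not cover exactly-once delivery, which depends on the broker and sink in use.

```python
from collections import defaultdict

WINDOW_SECONDS = 60    # tumbling window size (assumed)
ALLOWED_LATENESS = 30  # how far behind the newest event a record may arrive (assumed)


def window_start(ts: int) -> int:
    """Start of the tumbling window that contains event time ts."""
    return ts - ts % WINDOW_SECONDS


def aggregate(events):
    """Sum values per window; events are (event_time_seconds, value) in arrival order."""
    totals = defaultdict(int)
    late = []
    watermark = 0  # newest event time seen so far, minus the allowed lateness
    for ts, value in events:
        watermark = max(watermark, ts - ALLOWED_LATENESS)
        window_end = window_start(ts) + WINDOW_SECONDS
        if window_end <= watermark:
            late.append((ts, value))  # window already closed: divert, do not drop silently
        else:
            totals[window_start(ts)] += value
    return dict(totals), late


events = [(5, 1), (50, 1), (65, 1), (58, 1), (130, 1), (40, 1)]
totals, late = aggregate(events)
print(totals, late)
```

The event at time 58 arrives out of order but within the allowed lateness, so it still counts toward the first window. The event at time 40 arrives after the watermark has passed that window's end and is diverted to `late`.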

6. **Data Quality**:
- Automated testing
- Data profiling
- Anomaly detection
- Lineage tracking
- Quality metrics and dashboards
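A simple form of anomaly detection is a z-score check on a daily metric such as row count. This is one possible technique, not the only one; the threshold of 3 standard deviations and the sample history are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], today: float, threshold: float = 3.0) -> bool:
    """Flag today's value when it is more than `threshold` standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold


row_counts = [1000, 1020, 980, 1010, 990]  # hypothetical daily row counts
print(is_anomalous(row_counts, 1005))
print(is_anomalous(row_counts, 400))
```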

7. **Performance & Scale**:
- Horizontal scaling
- Resource optimization
- Caching strategies
- Query optimization
- Cost management
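As one small example of a caching strategy, repeated dimension lookups during enrichment can be memoized in process. The lookup table, the function name, and the cache size are hypothetical; a shared cache would be needed once workers scale horizontally.

```python
from functools import lru_cache


@lru_cache(maxsize=10_000)
def region_for_store(store_id: str) -> str:
    """Stand-in for a slow dimension lookup (for example, a database query)."""
    return {"s1": "emea", "s2": "apac"}.get(store_id, "unknown")


def enrich(rows: list[dict]) -> list[dict]:
    """Attach a region to each row; repeated store IDs are served from the cache."""
    return [{**row, "region": region_for_store(row["store_id"])} for row in rows]


rows = [{"store_id": "s1"}, {"store_id": "s2"}, {"store_id": "s1"}, {"store_id": "s9"}]
enriched = enrich(rows)
info = region_for_store.cache_info()
print(info.hits, "cache hits,", info.misses, "misses")
```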

Include monitoring, alerting, and data governance considerations. Make it cloud-agnostic with specific implementation examples for AWS/GCP/Azure.