Mirror of https://github.com/wshobson/agents.git (synced 2026-03-18 17:47:16 +00:00)
Repository Restructure:
- Move all 83 agent .md files to agents/ subdirectory
- Add 15 workflow orchestrators from commands repo to workflows/
- Add 42 development tools from commands repo to tools/
- Update README for unified repository structure

The commands repository functionality is now fully integrated, providing complete workflow orchestration and development tooling alongside agents.

Directory Structure:
- agents/ - 83 specialized AI agents
- workflows/ - 15 multi-agent orchestration commands
- tools/ - 42 focused development utilities

No breaking changes to agent functionality - all agents remain accessible with same names and behavior. Adds workflow and tool commands for enhanced multi-agent coordination capabilities.
61 lines
1.5 KiB
Markdown
---
model: claude-sonnet-4-0
---

# Data Pipeline Architecture

Design and implement a scalable data pipeline for: $ARGUMENTS

Create a production-ready data pipeline including:

1. **Data Ingestion**:
- Multiple source connectors (APIs, databases, files, streams)
- Schema evolution handling
- Incremental/batch loading
- Data quality checks at ingestion
- Dead letter queue for failures

2. **Transformation Layer**:
- ETL/ELT architecture decision
- Apache Beam/Spark transformations
- Data cleansing and normalization
- Feature engineering pipeline
- Business logic implementation

3. **Orchestration**:
- Airflow/Prefect DAGs
- Dependency management
- Retry and failure handling
- SLA monitoring
- Dynamic pipeline generation

4. **Storage Strategy**:
- Data lake architecture
- Partitioning strategy
- Compression choices
- Retention policies
- Hot/cold storage tiers

5. **Streaming Pipeline**:
- Kafka/Kinesis integration
- Real-time processing
- Windowing strategies
- Late data handling
- Exactly-once semantics

6. **Data Quality**:
- Automated testing
- Data profiling
- Anomaly detection
- Lineage tracking
- Quality metrics and dashboards

7. **Performance & Scale**:
- Horizontal scaling
- Resource optimization
- Caching strategies
- Query optimization
- Cost management

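
As a rough illustration of three items from the list above (data quality checks at ingestion, a dead letter queue for failures, and retry handling), the following is a minimal in-memory sketch. Every name in it (`validate`, `with_retry`, `ingest`, `IngestResult`) is hypothetical and chosen for this example; a real pipeline would likely delegate retries to the orchestrator and send dead letters to a durable queue rather than a Python list. The default `base_delay` of `0.0` only keeps the example fast and is not a recommended setting.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable

Record = dict[str, Any]


@dataclass
class IngestResult:
    """Outcome of one ingestion run (hypothetical container for this sketch)."""
    accepted: list[Record] = field(default_factory=list)
    # Each dead letter keeps the original record and the failure reason.
    dead_letters: list[tuple[Record, str]] = field(default_factory=list)


def validate(record: Record, required: dict[str, type]) -> None:
    """Raise ValueError when a required field is missing or has the wrong type."""
    for name, expected in required.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        if not isinstance(record[name], expected):
            raise ValueError(f"bad type for {name}: {type(record[name]).__name__}")


def with_retry(fn: Callable[[], Any], attempts: int = 3, base_delay: float = 0.0) -> Any:
    """Call fn, retrying with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))


def ingest(
    records: Iterable[Record],
    required: dict[str, type],
    sink: Callable[[Record], Any],
    attempts: int = 3,
    base_delay: float = 0.0,
) -> IngestResult:
    """Validate each record, write it to sink with retries, and route failures
    (validation errors or exhausted retries) to the dead letter list."""
    result = IngestResult()
    for record in records:
        try:
            validate(record, required)
            with_retry(lambda r=record: sink(r), attempts=attempts, base_delay=base_delay)
            result.accepted.append(record)
        except Exception as exc:
            result.dead_letters.append((record, str(exc)))
    return result
```
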
Include monitoring, alerting, and data governance considerations. Make it cloud-agnostic with specific implementation examples for AWS/GCP/Azure.
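
One possible way to keep pipeline code cloud-agnostic is to write against a small storage interface and supply one backend per provider. The sketch below assumes a hypothetical `ObjectStore` interface with a local-filesystem backend standing in for S3, GCS, or Azure Blob Storage; it makes no cloud SDK calls. The `partition_key` helper is also an assumption for this example and uses Hive-style `key=value` path segments, one common partitioning layout among several.

```python
import tempfile
from abc import ABC, abstractmethod
from datetime import date
from pathlib import Path


class ObjectStore(ABC):
    """Hypothetical minimal storage interface; provider-specific backends
    (S3, GCS, Azure Blob) would implement the same two methods."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class LocalStore(ObjectStore):
    """Filesystem backend, useful for tests and local development."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        # Create the partition directories on first write.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


def partition_key(dataset: str, day: date, filename: str) -> str:
    """Build a date-partitioned object key with zero-padded segments."""
    return (
        f"{dataset}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store: ObjectStore = LocalStore(Path(tmp))
        key = partition_key("orders", date(2024, 3, 5), "part-0000.json")
        store.put(key, b'{"id": 1}')
        print(key, store.get(key))
```
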