Claude-skill-registry authoring-dags
Workflow and best practices for writing Apache Airflow DAGs. Use when the user wants to create a new DAG, write pipeline code, or asks about DAG patterns and conventions. For testing and debugging DAGs, see the testing-dags skill.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/authoring-dags" ~/.claude/skills/majiayu000-claude-skill-registry-authoring-dags && rm -rf "$T"
skills/data/authoring-dags/SKILL.md

DAG Authoring Skill
This skill guides you through creating and validating Airflow DAGs using best practices and MCP tools.
For testing and debugging DAGs, see the testing-dags skill which covers the full test → debug → fix → retest workflow.
⚠️ CRITICAL WARNING: Use MCP Tools, NOT CLI Commands ⚠️
STOP! Before running ANY Airflow-related command, read this.
You MUST use MCP tools for ALL Airflow interactions. CLI commands like `airflow dags`, `astro dev run`, or shell commands to read logs are FORBIDDEN.
Why? MCP tools provide structured, reliable output. CLI commands are fragile, produce unstructured text, and often fail silently.
CLI vs MCP Quick Reference
ALWAYS use Airflow MCP tools. NEVER use CLI commands.
| ❌ DO NOT USE | ✅ USE INSTEAD |
|---|---|
| `airflow dags list` | `list_dags` MCP tool |
| `airflow dags trigger` | `trigger_dag_and_wait` MCP tool |
| `airflow dags list-import-errors` | `list_import_errors` MCP tool |
| `airflow dags details` | `get_dag_details` MCP tool |
| `grep`/`tail` on Airflow logs | `get_task_logs` MCP tool |
| `ls` in dags folder | Glob or `list_dags` MCP tool |
| Any `airflow ...` command | Equivalent MCP tool |
| Any `astro dev run ...` CLI | Equivalent MCP tool |
| `cat` on `/usr/local/airflow/...` | Read tool or `explore_dag` MCP tool |
| `grep`/`jq` to filter MCP results | Read the JSON directly from MCP response |
Remember:
- ✅ Airflow is ALREADY running — the MCP server handles the connection
- ❌ Do NOT attempt to start, stop, or manage the Airflow environment
- ❌ Do NOT use shell commands to check DAG status, logs, or errors
- ❌ Do NOT use bash to parse or filter MCP tool results — read the JSON directly
- ❌ Do NOT use `ls`, `find`, or `cat` on Airflow container paths (`/usr/local/airflow/...`)
- ✅ ALWAYS use MCP tools — they return structured JSON you can read directly
Workflow Overview
```
┌─────────────────────────────────────┐
│ 1. DISCOVER                         │
│    Understand codebase & environment│
└─────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ 2. PLAN                             │
│    Propose structure, get approval  │
└─────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ 3. IMPLEMENT                        │
│    Write DAG following patterns     │
└─────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ 4. VALIDATE                         │
│    Check import errors, warnings    │
└─────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ 5. TEST (with user consent)         │
│    Trigger, monitor, check logs     │
└─────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ 6. ITERATE                          │
│    Fix issues, re-validate          │
└─────────────────────────────────────┘
```
Phase 1: Discover
Before writing code, understand the context.
Explore the Codebase
Use file tools to find existing patterns:
- Glob for `**/dags/**/*.py` to find existing DAGs
- Read similar DAGs to understand conventions
- Check `requirements.txt` for available packages
Query the Airflow Environment
Use MCP tools to understand what's available:
| Tool | Purpose |
|---|---|
| `list_connections` | What external systems are configured |
| `list_variables` | What configuration values exist |
| `list_providers` | What operator packages are installed |
| `get_airflow_version` | Version constraints and features |
| `list_dags` | Existing DAGs and naming conventions |
| `list_pools` | Resource pools for concurrency |
Example discovery questions:
- "Is there a Snowflake connection?" →
list_connections - "What Airflow version?" →
get_airflow_version - "Are S3 operators available?" →
list_providers
Phase 2: Plan
Based on discovery, propose:
- DAG structure - Tasks, dependencies, schedule
- Operators to use - Based on available providers
- Connections needed - Existing or to be created
- Variables needed - Existing or to be created
- Packages needed - Additions to requirements.txt
Get user approval before implementing.
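For example, a plan for a hypothetical S3-to-Snowflake pipeline might read: "`s3_to_snowflake_daily`, scheduled `@daily` with three tasks (wait for file → load → quality check), reusing the existing `aws_default` connection, creating a `snowflake_default` connection, and adding `apache-airflow-providers-snowflake` to requirements.txt."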
Phase 3: Implement
Write the DAG following best practices (see below). Key steps:
- Create DAG file in appropriate location
- Update `requirements.txt` if needed
- Save the file
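A minimal TaskFlow-style starting point might look like the sketch below (assuming Airflow 2.4+; the DAG id, schedule, and task bodies are illustrative placeholders, not project conventions):

```python
from __future__ import annotations

import pendulum
from airflow.decorators import dag, task


@dag(
    dag_id="example_pipeline",  # hypothetical id; follow your project's naming conventions
    schedule="@daily",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
    tags=["example"],
)
def example_pipeline():
    @task
    def extract() -> list[dict]:
        # Placeholder: pull rows from a source system here.
        return [{"id": 1}, {"id": 2}]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder: write rows to a destination here.
        print(f"loaded {len(rows)} rows")

    load(extract())


example_pipeline()
```

An explicit `start_date` with `catchup=False` avoids accidental backfills on first deploy.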
Phase 4: Validate
Use the Airflow MCP as a feedback loop. Do NOT use CLI commands.
Step 1: Check Import Errors
After saving, call the MCP tool (Airflow will have already parsed the file):
MCP tool: `list_import_errors`
- If your file appears in the error list → fix and retry
- If no errors → continue
Common causes: missing imports, syntax errors, missing packages.
Step 2: Verify DAG Exists
MCP tool: `get_dag_details(dag_id="your_dag_id")`
Check: DAG exists, schedule correct, tags set, paused status.
Step 3: Check Warnings
MCP tool: `list_dag_warnings`
Look for deprecation warnings or configuration issues.
Step 4: Explore DAG Structure
MCP tool: `explore_dag(dag_id="your_dag_id")`
Returns in one call: metadata, tasks, dependencies, source code.
Phase 5: Test
📘 See the testing-dags skill for comprehensive testing guidance.
Once validation passes, test the DAG using the workflow in the testing-dags skill:
- Get user consent — Always ask before triggering
- Trigger and wait — Use `trigger_dag_and_wait(dag_id, timeout=300)`
- Analyze results — Check success/failure status
- Debug if needed — Use `diagnose_dag_run` and `get_task_logs`
Quick Test (Minimal)
```
# Ask user first, then:
trigger_dag_and_wait(dag_id="your_dag_id", timeout=300)
```
For the full test → debug → fix → retest loop, see testing-dags.
Phase 6: Iterate
If issues found:
- Fix the code
- Check for import errors with the `list_import_errors` MCP tool
- Re-validate using MCP tools (Phase 4)
- Re-test using the testing-dags skill workflow (Phase 5)
Never use CLI commands to check status or logs. Always use MCP tools.
MCP Tools Quick Reference
| Phase | Tool | Purpose |
|---|---|---|
| Discover | `list_connections` | Available connections |
| Discover | `list_variables` | Configuration values |
| Discover | `list_providers` | Installed operators |
| Discover | `get_airflow_version` | Version info |
| Validate | `list_import_errors` | Parse errors (check first!) |
| Validate | `get_dag_details` | Verify DAG config |
| Validate | `list_dag_warnings` | Configuration warnings |
| Validate | `explore_dag` | Full DAG inspection |
Testing tools — See the testing-dags skill for `trigger_dag_and_wait`, `diagnose_dag_run`, `get_task_logs`, etc.
Best Practices & Anti-Patterns
For detailed code examples and patterns, see reference/best-practices.md.
Key topics covered:
- TaskFlow API usage
- Credentials management (connections, variables)
- Provider operators
- Idempotency patterns (see the sketch after this list)
- Data intervals
- Task groups
- Setup/Teardown patterns
- Data quality checks
- Anti-patterns to avoid
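To make the idempotency and data-interval items concrete, here is a hedged sketch (assuming Airflow 2.4+, the Postgres provider, and a hypothetical `warehouse` connection plus `events`/`events_daily` tables):

```python
from __future__ import annotations

import pendulum
from airflow.decorators import dag, task
from airflow.providers.postgres.hooks.postgres import PostgresHook


@dag(
    schedule="@daily",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
)
def idempotent_daily_load():
    @task
    def load_partition(data_interval_start=None, data_interval_end=None):
        # Credentials come from the (hypothetical) "warehouse" connection, not from code.
        hook = PostgresHook(postgres_conn_id="warehouse")
        # Idempotent delete-then-insert: rerunning the same data interval
        # rewrites the same partition instead of duplicating rows.
        hook.run(
            "DELETE FROM events_daily WHERE day = %(day)s",
            parameters={"day": data_interval_start.date()},
        )
        hook.run(
            """
            INSERT INTO events_daily (day, event_count)
            SELECT %(day)s, count(*) FROM events
            WHERE ts >= %(start)s AND ts < %(end)s
            """,
            parameters={
                "day": data_interval_start.date(),
                "start": data_interval_start,
                "end": data_interval_end,
            },
        )

    load_partition()


idempotent_daily_load()
```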
Related Skills
- testing-dags: For testing DAGs, debugging failures, and the test → fix → retest loop
- debugging-dags: For troubleshooting failed DAGs
- migrating-airflow-2-to-3: For migrating DAGs to Airflow 3