Claude-code-plugins-plus-skills databricks-debug-bundle
Install
Source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/databricks-pack/skills/databricks-debug-bundle" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-databricks-debug-bundle && rm -rf "$T"
Manifest: plugins/saas-packs/databricks-pack/skills/databricks-debug-bundle/SKILL.md
Databricks Debug Bundle
Current State
!databricks --version 2>/dev/null || echo 'CLI not installed'
!python3 -c "import databricks.sdk; print(f'SDK {databricks.sdk.__version__}')" 2>/dev/null || echo 'SDK not installed'
Overview
Collect all diagnostic information needed for Databricks support tickets: environment info, cluster state, cluster events, job run details, Spark driver logs, and Delta table history. Produces a redacted tar.gz bundle safe to share with support.
Prerequisites
- Databricks CLI installed and configured (see the preflight check below)
- Access to cluster logs (admin or cluster owner)
- Permission to access job run details
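A quick preflight check can confirm these prerequisites before you collect anything. This is a minimal sketch that reuses the same probes the bundle script runs later and assumes jq is installed:

```bash
#!/bin/bash
# Preflight: confirm CLI, SDK, and workspace authentication before running the bundle script.
set -euo pipefail

databricks --version || { echo "Databricks CLI not installed"; exit 1; }
python3 -c "import databricks.sdk" 2>/dev/null || echo "Python SDK not installed (only needed for Steps 5-6)"

# If authentication works, this prints the user the CLI is acting as.
databricks current-user me --output json | jq -r '.userName' \
  || { echo "Auth failed: check ~/.databrickscfg or DATABRICKS_HOST/DATABRICKS_TOKEN"; exit 1; }
```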
Instructions
Step 1: Create Debug Collection Script
#!/bin/bash
set -euo pipefail

# databricks-debug-bundle.sh [cluster_id] [run_id] [table_name]
BUNDLE_DIR="databricks-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BUNDLE_DIR"

CLUSTER_ID="${1:-}"
RUN_ID="${2:-}"
TABLE_NAME="${3:-}"

echo "=== Databricks Debug Bundle ===" | tee "$BUNDLE_DIR/summary.txt"
echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$BUNDLE_DIR/summary.txt"
echo "Workspace: ${DATABRICKS_HOST:-unset}" >> "$BUNDLE_DIR/summary.txt"
Step 2: Collect Environment Info
{
  echo ""
  echo "--- Environment ---"
  echo "CLI: $(databricks --version 2>&1)"
  echo "SDK: $(pip show databricks-sdk 2>/dev/null | grep Version || echo 'not installed')"
  echo "Python: $(python3 --version 2>&1)"
  echo "OS: $(uname -srm)"
  echo ""
  echo "--- Current User ---"
  databricks current-user me --output json 2>&1 | jq '{userName, active}' || echo "Auth failed"
} >> "$BUNDLE_DIR/summary.txt"
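Library versions often matter for support tickets about job or notebook failures. As an optional addition, assuming pip is on the PATH, you could also record the installed Databricks-related Python packages:

```bash
# Optional: record Databricks-related Python packages (SDK, connectors, pyspark, delta).
{
  echo ""
  echo "--- Python Packages ---"
  pip list 2>/dev/null | grep -iE 'databricks|pyspark|delta' || echo "none found"
} >> "$BUNDLE_DIR/summary.txt"
```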
Step 3: Collect Cluster Information
if [ -n "$CLUSTER_ID" ]; then echo "" >> "$BUNDLE_DIR/summary.txt" echo "--- Cluster: $CLUSTER_ID ---" >> "$BUNDLE_DIR/summary.txt" # Full cluster config databricks clusters get --cluster-id "$CLUSTER_ID" --output json \ > "$BUNDLE_DIR/cluster_config.json" 2>&1 # Key fields summary jq '{state, spark_version, node_type_id, num_workers, autotermination_minutes, termination_reason}' \ "$BUNDLE_DIR/cluster_config.json" >> "$BUNDLE_DIR/summary.txt" # Recent cluster events (state changes, errors, resizing) databricks clusters events --cluster-id "$CLUSTER_ID" --limit 30 --output json \ > "$BUNDLE_DIR/cluster_events.json" 2>&1 # Extract event timeline jq -r '.events[]? | "\(.timestamp): \(.type) — \(.details // "no details")"' \ "$BUNDLE_DIR/cluster_events.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null fi
Step 4: Collect Job Run Information
if [ -n "$RUN_ID" ]; then echo "" >> "$BUNDLE_DIR/summary.txt" echo "--- Run: $RUN_ID ---" >> "$BUNDLE_DIR/summary.txt" # Full run details databricks runs get --run-id "$RUN_ID" --output json \ > "$BUNDLE_DIR/run_details.json" 2>&1 # Run state summary jq '{state: .state, start_time, end_time, run_duration}' \ "$BUNDLE_DIR/run_details.json" >> "$BUNDLE_DIR/summary.txt" # Task-level breakdown jq -r '.tasks[]? | " Task \(.task_key): \(.state.result_state // "RUNNING") — \(.state.state_message // "ok")"' \ "$BUNDLE_DIR/run_details.json" >> "$BUNDLE_DIR/summary.txt" # Run output (error messages, stdout) databricks runs get-output --run-id "$RUN_ID" --output json \ > "$BUNDLE_DIR/run_output.json" 2>&1 jq '{error, error_trace: (.error_trace // "" | .[0:2000])}' \ "$BUNDLE_DIR/run_output.json" >> "$BUNDLE_DIR/summary.txt" 2>/dev/null fi
Step 5: Collect Spark Driver Logs
if [ -n "$CLUSTER_ID" ]; then echo "" >> "$BUNDLE_DIR/summary.txt" echo "--- Spark Driver Logs (last 500 lines) ---" >> "$BUNDLE_DIR/summary.txt" python3 << 'PYEOF' > "$BUNDLE_DIR/driver_logs.txt" 2>&1 from databricks.sdk import WorkspaceClient w = WorkspaceClient() try: content = w.dbfs.read("/cluster-logs/${CLUSTER_ID}/driver/log4j-active.log") # Take last 500 lines lines = content.data.decode().splitlines()[-500:] print("\n".join(lines)) except Exception as e: print(f"Could not fetch driver logs: {e}") print("Tip: Enable cluster log delivery in cluster config for persistent logs") PYEOF fi
Step 6: Collect Delta Table Diagnostics
if [ -n "$TABLE_NAME" ]; then echo "" >> "$BUNDLE_DIR/summary.txt" echo "--- Delta Table: $TABLE_NAME ---" >> "$BUNDLE_DIR/summary.txt" python3 << PYEOF > "$BUNDLE_DIR/delta_diagnostics.txt" 2>&1 from databricks.connect import DatabricksSession spark = DatabricksSession.builder.getOrCreate() print("=== Table Details ===") spark.sql("DESCRIBE DETAIL ${TABLE_NAME}").show(truncate=False) print("\n=== Recent History (last 20 operations) ===") spark.sql("DESCRIBE HISTORY ${TABLE_NAME} LIMIT 20").show(truncate=False) print("\n=== Schema ===") spark.sql("DESCRIBE ${TABLE_NAME}").show(truncate=False) print("\n=== File Stats ===") detail = spark.sql("DESCRIBE DETAIL ${TABLE_NAME}").first() print(f"Files: {detail.numFiles}, Size: {detail.sizeInBytes / 1024 / 1024:.1f} MB") PYEOF fi
Step 7: Package Bundle (Redacted)
# Redact sensitive data from config snapshot
echo "" >> "$BUNDLE_DIR/summary.txt"
echo "--- Config (redacted) ---" >> "$BUNDLE_DIR/summary.txt"
if [ -f ~/.databrickscfg ]; then
  sed 's/token = .*/token = ***REDACTED***/' \
    ~/.databrickscfg > "$BUNDLE_DIR/config-redacted.txt"
  sed -i 's/client_secret = .*/client_secret = ***REDACTED***/' \
    "$BUNDLE_DIR/config-redacted.txt"
fi

# Network connectivity test
echo "--- Network ---" >> "$BUNDLE_DIR/summary.txt"
echo -n "API reachable: " >> "$BUNDLE_DIR/summary.txt"
curl -s -o /dev/null -w "%{http_code}" \
  "${DATABRICKS_HOST}/api/2.0/clusters/list" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" >> "$BUNDLE_DIR/summary.txt"
echo "" >> "$BUNDLE_DIR/summary.txt"

# Create archive
tar -czf "$BUNDLE_DIR.tar.gz" "$BUNDLE_DIR"
rm -rf "$BUNDLE_DIR"

echo ""
echo "Bundle created: $BUNDLE_DIR.tar.gz"
echo "Contents: summary.txt, cluster_config.json, cluster_events.json,"
echo "          run_details.json, run_output.json, driver_logs.txt,"
echo "          delta_diagnostics.txt, config-redacted.txt"
Output
databricks-debug-YYYYMMDD-HHMMSS.tar.gz containing:
- summary.txt — Human-readable diagnostic summary
- cluster_config.json — Full cluster configuration
- cluster_events.json — State changes, errors, resizing events
- run_details.json — Job run with task-level breakdown
- run_output.json — Stdout/stderr and error traces
- driver_logs.txt — Last 500 lines of Spark driver log
- delta_diagnostics.txt — Table details, history, schema
- config-redacted.txt — CLI config with secrets removed
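To double-check what you are about to attach, you can list the archive contents without extracting it (the filename below is an example):

```bash
# Inspect the bundle contents before attaching it to a ticket.
tar -tzf databricks-debug-20250101-120000.tar.gz
```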
Error Handling
| Item | Included | Notes |
|---|---|---|
| Tokens/secrets | NEVER | Replaced with ***REDACTED*** during packaging |
| PII in logs | Review before sharing | Scan driver_logs.txt manually (see the scan sketch below) |
| Cluster IDs | Yes | Safe to share with support |
| Error traces | Yes | Check for embedded connection strings |
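For the manual review called out above, a quick scan of the staged bundle directory (run it before the tar step) can catch obvious leaks. The dapi prefix matches Databricks personal access tokens; the other patterns are illustrative, not exhaustive:

```bash
# Scan the staged bundle for likely secrets or email addresses before archiving.
grep -rniE 'dapi[a-f0-9]{16,}|client_secret|password|[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+' \
  "$BUNDLE_DIR" || echo "No obvious secrets or emails found"
```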
Examples
Usage
# Environment only
bash databricks-debug-bundle.sh

# With cluster diagnostics
bash databricks-debug-bundle.sh 0123-456789-abcde

# With cluster + job run
bash databricks-debug-bundle.sh 0123-456789-abcde 12345

# Full diagnostics including Delta table
bash databricks-debug-bundle.sh 0123-456789-abcde 12345 catalog.schema.table
Submit to Support
- Generate bundle: bash databricks-debug-bundle.sh [args]
- Review summary.txt for sensitive data
- Open ticket at help.databricks.com
- Attach the .tar.gz bundle
- Include workspace ID (found in workspace URL: adb-<workspace-id>)
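On Azure Databricks the workspace ID is embedded in the host name (adb-<workspace-id>.<n>.azuredatabricks.net), so you can pull it from DATABRICKS_HOST; this is a best-effort sketch and the URL pattern differs on other clouds:

```bash
# Extract the numeric workspace ID from a host like
# https://adb-1234567890123456.7.azuredatabricks.net (Azure URL pattern).
echo "${DATABRICKS_HOST:-}" | sed -nE 's#.*adb-([0-9]+)\..*#\1#p'
```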
Resources
Next Steps
For rate limit issues, see databricks-rate-limits.