Claude-skill-registry ds-brainstorm
This skill should be used when the user asks to "start a data science project", "brainstorm analysis", "plan a data analysis", or wants to clarify analysis requirements. REQUIRED Phase 1 of /ds workflow. Uses Socratic questioning to clarify goals, data sources, and constraints.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/ds-brainstorm" ~/.claude/skills/majiayu000-claude-skill-registry-ds-brainstorm && rm -rf "$T"
skills/data/ds-brainstorm/SKILL.md
Contents
- The Iron Law of DS Brainstorming
- What Brainstorm Does
- Critical Questions to Ask
- Process
- Red Flags - STOP If You Catch Yourself Doing This
- Output
Brainstorming (Questions Only)
Refine vague analysis requests into clear objectives through Socratic questioning. NO data exploration, NO coding - just questions and objectives.
<EXTREMELY-IMPORTANT>
## The Iron Law of DS Brainstorming
ASK QUESTIONS BEFORE ANYTHING ELSE. This is not negotiable.
Before loading data, before exploring, before proposing approaches, you MUST:
- Ask clarifying questions using AskUserQuestion
- Understand what the user actually wants to learn
- Identify data sources and constraints
- Define success criteria
- Only THEN propose analysis approaches
STOP - You're about to load data or explore before asking questions. Don't do this. </EXTREMELY-IMPORTANT>
What Brainstorm Does
| DO | DON'T |
|---|---|
| Ask clarifying questions | Load or explore data |
| Understand analysis objectives | Run queries |
| Identify data sources | Profile data (that's /ds-plan) |
| Define success criteria | Create visualizations |
| Ask about constraints | Write analysis code |
| Check if replicating existing analysis | Propose specific methodology |
Brainstorm answers: WHAT and WHY
Plan answers: HOW (data profile + tasks) (separate skill)
Critical Questions to Ask
Data Source Questions
- What data sources are available?
- Where is the data located (files, database, API)?
- What time period does the data cover?
- How frequently is the data updated?
Objective Questions
- What question are you trying to answer?
- Who is the audience for this analysis?
- What decisions will be made based on results?
- What would a successful outcome look like?
Constraint Questions
- Are you replicating an existing analysis? (Critical for methodology)
- Are there specific methodologies required?
- What is the timeline for this analysis?
- Are there computational resource constraints?
Output Questions
- What format should results be in (report, dashboard, model)?
- What visualizations are expected?
- How will results be validated?
Process
1. Ask Questions First
Use AskUserQuestion immediately:
- One question at a time - never batch
- Multiple-choice preferred - easier to answer
- Focus on: objectives, data sources, constraints, replication requirements
2. Identify Replication Requirements
CRITICAL: Ask early if replicating existing work:
    AskUserQuestion:
      question: "Are you replicating or extending existing analysis?"
      options:
        - label: "Replicating existing"
          description: "Must match specific methodology/results"
        - label: "Extending existing"
          description: "Building on prior work with modifications"
        - label: "New analysis"
          description: "Fresh analysis, methodology flexible"
When replicating:
- Obtain reference to original (paper, code, report)
- Document exact methodology requirements
- Define acceptable deviation from original results
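The "acceptable deviation" agreed with the user is easiest to record in the spec as a concrete number. The sketch below only illustrates what such a criterion could mean once later phases compare results; it assumes the replication target is a single reported figure and that a relative tolerance is the agreed measure. The function name `within_tolerance` and the example values are hypothetical and not part of this skill.

```python
def within_tolerance(original: float, replicated: float, rel_tol: float) -> bool:
    """Return True if `replicated` deviates from `original` by at most
    `rel_tol`, expressed as a fraction (0.05 means 5%)."""
    if original == 0:
        # Relative deviation is undefined at zero; fall back to an absolute check.
        return abs(replicated) <= rel_tol
    return abs(replicated - original) / abs(original) <= rel_tol


# Hypothetical figures: the original report states 0.412, the user accepts 5%.
print(within_tolerance(0.412, 0.405, rel_tol=0.05))  # True (about 1.7% off)
print(within_tolerance(0.412, 0.350, rel_tol=0.05))  # False (about 15% off)
```

During brainstorm, only the tolerance itself (here 5%) goes into the spec; running a check like this belongs to a later phase.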
3. Propose Approaches
After objectives are clear:
- Propose 2-3 different approaches with trade-offs
- Lead with recommendation (mark as "Recommended")
- Use AskUserQuestion for the user to select the preferred approach
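The ordering rule above (2-3 approaches, recommendation first and marked) can be sketched as a small helper that builds the option list for the selection question. This is a sketch under assumptions: the `Approach` class and `to_options` function are hypothetical names, and the output mirrors the `label`/`description` fields of the replication question shown earlier rather than any confirmed AskUserQuestion schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Approach:
    label: str
    description: str
    recommended: bool = False


def to_options(approaches: List[Approach]) -> List[Dict[str, str]]:
    """Build option dicts with the recommended approach first and marked."""
    if not 2 <= len(approaches) <= 3:
        raise ValueError("propose 2-3 approaches")
    # sorted() is stable, so non-recommended approaches keep their relative order.
    ordered = sorted(approaches, key=lambda a: not a.recommended)
    return [
        {
            "label": f"{a.label} (Recommended)" if a.recommended else a.label,
            "description": a.description,
        }
        for a in ordered
    ]


# Hypothetical approaches with their trade-offs in the description.
options = to_options([
    Approach("Descriptive summary", "Fast, but answers only 'what happened'"),
    Approach("Comparative analysis", "Addresses the stated question; needs a baseline",
             recommended=True),
])
for option in options:
    print(option["label"])
```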
4. Write Spec Doc
After selecting an approach:
- Write to .claude/SPEC.md
- Include: objectives, data sources, success criteria, constraints
- NO implementation details - reserve those for /ds-plan
    # Spec: [Analysis Name]

    > **For Claude:** After writing this spec, use `Read("${CLAUDE_PLUGIN_ROOT}/lib/skills/ds-plan/SKILL.md")` for Phase 2.

    ## Objective
    [What question this analysis answers]

    ## Data Sources
    - [Source 1]: [location, format, time period]
    - [Source 2]: [location, format, time period]

    ## Success Criteria
    - [ ] Criterion 1
    - [ ] Criterion 2

    ## Constraints
    - Replication: [yes/no - if yes, reference source]
    - Timeline: [deadline]
    - Methodology: [required approaches]

    ## Chosen Approach
    [Description of selected approach]

    ## Rejected Alternatives
    - Option B: [why rejected]
    - Option C: [why rejected]
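To show how brainstorm answers map onto the template's sections, here is a minimal sketch that renders a spec as Markdown in the same section order. The `Spec` class, `render_spec` function, and all example values are hypothetical. The sketch omits the "For Claude" note and does not write to .claude/SPEC.md; it only builds the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Spec:
    name: str
    objective: str
    data_sources: List[str]
    success_criteria: List[str]
    constraints: Dict[str, str]
    chosen_approach: str
    rejected: Dict[str, str] = field(default_factory=dict)


def render_spec(spec: Spec) -> str:
    """Render the spec as Markdown, following the template's section order."""
    lines = [f"# Spec: {spec.name}", "", "## Objective", spec.objective, ""]
    lines += ["## Data Sources", *[f"- {s}" for s in spec.data_sources], ""]
    lines += ["## Success Criteria", *[f"- [ ] {c}" for c in spec.success_criteria], ""]
    lines += ["## Constraints", *[f"- {k}: {v}" for k, v in spec.constraints.items()], ""]
    lines += ["## Chosen Approach", spec.chosen_approach, ""]
    lines += ["## Rejected Alternatives", *[f"- {k}: {v}" for k, v in spec.rejected.items()]]
    return "\n".join(lines) + "\n"


# Hypothetical answers collected during brainstorm.
spec = Spec(
    name="Churn drivers",
    objective="Which customer segments churned most last year?",
    data_sources=["Orders: data/orders.csv, CSV, one calendar year"],
    success_criteria=["Segment-level churn rates reviewed by the requester"],
    constraints={"Replication": "no", "Timeline": "two weeks", "Methodology": "flexible"},
    chosen_approach="Comparative analysis across segments",
    rejected={"Option B": "needs data that is not available"},
)
print(render_spec(spec))
```

Note that the rendered text contains objectives and constraints only, with no implementation details, in line with the rule that those are reserved for /ds-plan.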
Red Flags - STOP If You Catch Yourself Doing This:
| Action | Why It's Wrong | Do Instead |
|---|---|---|
| Loading data | You're exploring before understanding goals | Ask what the user wants to learn |
| Running describe() | You're profiling data when that's for /ds-plan | Finish defining objectives first |
| Proposing specific models | You're jumping to HOW before clarifying WHAT | Define success criteria first |
| Creating task lists | You're planning before objectives are clear | Complete brainstorm first |
| Skipping replication question | You might miss critical methodology constraints | Always ask about replication upfront |
Output
Declare brainstorm complete when:
- Analysis objectives clearly understood
- Data sources identified
- Success criteria defined
- Constraints documented (especially replication requirements)
- Approach chosen from alternatives
- .claude/SPEC.md written
- User confirms ready for data exploration
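The completion conditions above amount to a checklist in which every item must hold before the phase ends. A minimal sketch, assuming each condition is tracked as a boolean; the key names and the `missing_criteria` helper are hypothetical, not part of this skill.

```python
from typing import Dict, List

# One key per completion condition, in the order listed above.
COMPLETION_CRITERIA = [
    "objectives_understood",
    "data_sources_identified",
    "success_criteria_defined",
    "constraints_documented",
    "approach_chosen",
    "spec_written",
    "user_confirmed",
]


def missing_criteria(status: Dict[str, bool]) -> List[str]:
    """Return the criteria not yet satisfied, in checklist order."""
    return [c for c in COMPLETION_CRITERIA if not status.get(c, False)]


# Hypothetical state: the spec is not written and the user has not confirmed yet.
status = {
    "objectives_understood": True,
    "data_sources_identified": True,
    "success_criteria_defined": True,
    "constraints_documented": True,
    "approach_chosen": True,
    "spec_written": False,
}
print(missing_criteria(status))  # ['spec_written', 'user_confirmed']
```

Brainstorm is complete only when the returned list is empty.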
Workflow Context
This skill is Phase 1 of the 5-phase /ds workflow:
- Phase 1: ds-brainstorm (current) - Clarify objectives through Socratic questioning
- Phase 2: ds-plan - Profile data and break analysis into tasks
- Phase 3: ds-implement - Execute analysis tasks with output-first verification
- Phase 4: ds-review - Review methodology, data quality, and statistical validity
- Phase 5: ds-verify - Check reproducibility and obtain user acceptance
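The fixed phase order can be expressed as a simple lookup. This is an illustrative sketch only: the `next_phase` helper is hypothetical, and the workflow itself is driven by the skill invocations described in this document, not by this code.

```python
from typing import Optional

# Phase names in workflow order, as listed above.
PHASES = ["ds-brainstorm", "ds-plan", "ds-implement", "ds-review", "ds-verify"]


def next_phase(current: str) -> Optional[str]:
    """Return the phase that follows `current`, or None after the last phase."""
    index = PHASES.index(current)  # raises ValueError for an unknown phase
    return PHASES[index + 1] if index + 1 < len(PHASES) else None


print(next_phase("ds-brainstorm"))  # ds-plan
print(next_phase("ds-verify"))      # None
```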
Phase Complete
After completing brainstorm, IMMEDIATELY invoke the next phase:
    # Invoke Phase 2: Data profiling and task breakdown
    /ds-plan
Or load the Phase 2 skill file directly with Read:
Read("${CLAUDE_PLUGIN_ROOT}/lib/skills/ds-plan/SKILL.md")
CRITICAL: Do not skip to analysis implementation. Phase 2 profiles data and breaks down the analysis into discrete, manageable tasks.