Awesome-omni-skill brutal-deepresearch
Structured deep research pipeline with confirmation gates and resume support. Generates outline, launches parallel research agents, produces validated JSON results and markdown report.
git clone https://github.com/diegosouzapw/awesome-omni-skill
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data-ai/brutal-deepresearch" ~/.claude/skills/diegosouzapw-awesome-omni-skill-brutal-deepresearch && rm -rf "$T"
skills/data-ai/brutal-deepresearch/SKILL.md
Structured deep research pipeline with confirmation gates and resume support. Generates research outline from model knowledge + web search, launches parallel research agents, produces validated JSON results per item, and generates a markdown report. Supports resuming interrupted sessions.
Agent assumptions (applies to all agents and subagents):
- All tools are functional and will work without error. Do not test tools or make exploratory calls.
- Only call a tool if it is required to complete the task. Every tool call should have a clear purpose.
Brutal Deep Research Process
Step 0: Load Context & Session Setup
0.1 Load Project Target Context
Check for TARGET.md in the project root directory.
- If TARGET.md exists, read it in full and treat it as required context.
- Do not proceed to Step 1 until this check/read has been completed.
0.2 Get Current Date
date +%Y-%m-%d
0.3 Determine Next DR Number
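Find the highest existing DR number. Match only the four digits that follow the DR- prefix, so that a year inside a slug (for example DR-0001-trends-2025) is not mistaken for a session number:
ls workspace/research/ 2>/dev/null | grep -oE '^DR-[0-9]{4}' | grep -oE '[0-9]{4}' | sort -rn | head -1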
New DR number = highest + 1. If no research sessions exist, start at 0001.
0.4 Generate Session Slug
Generate a slug from the research topic:
- Lowercase
- Replace spaces and special characters with hyphens
- Max 40 characters
- Remove trailing hyphens
Session directory: workspace/research/DR-<NNNN>-<slug>/
Do not create the directory yet. Wait until Gate 2 confirmation.
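Steps 0.3 and 0.4 can be sketched in Python as follows. This is a minimal sketch, not part of the skill: the function names next_dr_number and make_slug are hypothetical, and trimming leading hyphens as well as trailing ones is an assumption.

```python
import re
from pathlib import Path


def next_dr_number(research_root: str) -> str:
    """Return the next zero-padded DR number under research_root."""
    root = Path(research_root)
    highest = 0
    if root.is_dir():
        for entry in root.iterdir():
            # Only the digits right after the "DR-" prefix count, so a
            # year inside a slug (e.g. "trends-2025") is ignored.
            match = re.match(r"DR-(\d{4})", entry.name)
            if match:
                highest = max(highest, int(match.group(1)))
    return f"{highest + 1:04d}"


def make_slug(topic: str, max_len: int = 40) -> str:
    """Lowercase, hyphenate non-alphanumerics, cap at max_len, trim hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    # Truncation can expose a new trailing hyphen, so trim again.
    return slug[:max_len].rstrip("-")
```

The session directory name is then f"DR-{next_dr_number('workspace/research')}-{make_slug(topic)}", built in memory only until Gate 2 is confirmed.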
Step 0.5: Resume Mode (Conditional)
Trigger: User args contain the word "resume" and a path to an existing session directory.
Example invocation:
/brutal-deepresearch resume research workspace/research/DR-0001-ai-coding
If resume mode is NOT detected, skip to Step 1.
0.5.1 Read Existing Session
- Read outline.yaml from the given session path to get items list
- Read fields.yaml from the given session path to get field definitions
- Read progress.yaml (if it exists) for previous execution context
0.5.2 Check Completed Results
ls <session_path>/results/*.json 2>/dev/null
ls <session_path>/results/*.started 2>/dev/null
Determine item status:
- .json exists → completed (skip this item)
- .started exists but no .json → interrupted (re-research this item)
- Neither exists → never started (research this item)
0.5.3 Calculate Remaining Items
Compare items in outline.yaml against completed results:
- Log which items are completed and will be skipped
- Log which items were interrupted and will be re-researched
- Log which items never started and will be researched
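The status rules above can be expressed as a small classifier. This is an illustrative sketch only; classify_items is a hypothetical helper, and it assumes the caller already has each item's file slug (see 6.2).

```python
from pathlib import Path


def classify_items(results_dir: str, slugs: list[str]) -> dict[str, list[str]]:
    """Group item slugs by the marker files found in results_dir."""
    results = Path(results_dir)
    status = {"completed": [], "interrupted": [], "never_started": []}
    for slug in slugs:
        if (results / f"{slug}.json").exists():
            status["completed"].append(slug)      # result exists: skip
        elif (results / f"{slug}.started").exists():
            status["interrupted"].append(slug)    # marker only: re-research
        else:
            status["never_started"].append(slug)  # nothing on disk yet
    return status
```

The remaining items are the interrupted and never_started groups combined.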
0.5.4 Branch to Execution or Report
- If remaining items exist: Skip directly to Step 6 (Execute Deep Research), launching agents only for remaining items
- If all items are complete: Report "All items already completed" and skip directly to Step 7 (Report Configuration)
Resume mode skips Steps 1-4 entirely — the outline and fields are already confirmed from the previous session.
Step 1: Generate Initial Framework
Based on the user's research topic, use model knowledge to generate:
- Items List: The main research objects/items in this domain. Each item should have:
  - name: Item name
  - category: Classification (if applicable)
  - description: Brief description of why this item is relevant
- Field Framework: Suggested research field categories and fields per category. Each field should have:
  - name: Field name (snake_case)
  - description: What this field captures
  - detail_level: One of brief, moderate, or detailed
Present the framework to the user in a readable format.
Step 2 - GATE 1: Confirm Initial Framework
This is a hard gate. Do not proceed past this step without explicit user confirmation.
Present:
- The items list with names, categories, and descriptions
- The field framework organized by category
Use AskUserQuestion to ask:
- "Are these items and fields correct? Add/remove anything?"
Hard gate: Do not proceed until user confirms. User can request additions or removals here.
Step 3: Web Search Supplement
3.1 Get Time Range
Use AskUserQuestion to ask for time range:
- Last 6 months
- Since 2024
- Since 2025
- Unlimited
3.2 Launch Web Search Agent
Launch 1 web-search-agent (background) using the Task tool with model: sonnet and max_turns: 20.
Parameter Retrieval:
- {topic}: User's research topic
- {YYYY-MM-DD}: Current date from Step 0.2
- {step1_output}: Complete output from Step 1 (items list + field framework)
- {time_range}: User-specified time range
Hard Constraint: The following prompt must be strictly reproduced, only replacing variables in {xxx}. Do not modify structure or wording.
Prompt Template:
You are an elite internet researcher. Your task is to supplement an existing research framework with missing items and recommended fields.
## Research Methodology
Before searching, determine which search strategies apply to this topic. Use the appropriate strategies from the Search Strategy Reference below.
Get today's date first:
date +%Y-%m-%d
Generate 5-10 different search query variations to maximize coverage:
- Include technical terms, product names, and common variations
- Think of how different people might describe the same topic
- Use exact phrases in quotes for specific names
- Include version numbers and dates when relevant
## Information Gathering Standards
- Read beyond the first few results - valuable information is often buried
- Look for patterns across different sources
- Pay attention to dates to ensure relevance
- Note different approaches and their trade-offs
- Identify authoritative sources and experienced contributors
- Check for updated information or superseded approaches
- Verify across multiple sources when possible
## Task
Research topic: {topic}
Current date: {YYYY-MM-DD}
Based on the following initial framework, supplement latest items and recommended research fields.
## Existing Framework
{step1_output}
## Goals
1. Verify if existing items are missing important objects
2. Supplement items based on missing objects
3. Continue searching for {topic} related items within {time_range} and supplement
4. Supplement new fields
## Output Requirements
Return structured results directly (do not write files):
### Supplementary Items
- item_name: Brief explanation (why it should be added)
...
### Recommended Supplementary Fields
- field_name: Field description (why this dimension is needed)
...
### Sources
- [Source1](url1)
- [Source2](url2)
One-shot Example (assuming researching AI Coding History):
## Task
Research topic: AI Coding History
Current date: 2025-12-30
Based on the following initial framework, supplement latest items and recommended research fields.
## Existing Framework
### Items List
1. GitHub Copilot: Developed by Microsoft/GitHub, first mainstream AI coding assistant
2. Cursor: AI-first IDE, based on VSCode
...
### Field Framework
- Basic Info: name, release_date, company
- Technical Features: underlying_model, context_window
...
## Goals
1. Verify if existing items are missing important objects
2. Supplement items based on missing objects
3. Continue searching for AI Coding History related items within since 2024 and supplement
4. Supplement new fields
## Output Requirements
Return structured results directly (do not write files):
### Supplementary Items
- item_name: Brief explanation (why it should be added)
...
### Recommended Supplementary Fields
- field_name: Field description (why this dimension is needed)
...
### Sources
- [Source1](url1)
- [Source2](url2)
3.3 Merge Findings
After the web search agent completes, merge its findings with the initial framework:
- Add supplementary items to the items list (avoid duplicates)
- Add recommended fields to the field framework
- Note sources for traceability
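One way to merge items without duplicates is to compare names case-insensitively. This is a sketch under assumptions: merge_items is a hypothetical helper, and the "source" key used to mark web search additions is not defined anywhere in the skill.

```python
def merge_items(existing: list[dict], supplementary: list[dict]) -> list[dict]:
    """Append supplementary items whose name is not already present."""
    seen = {item["name"].strip().lower() for item in existing}
    merged = list(existing)
    for item in supplementary:
        key = item["name"].strip().lower()
        if key not in seen:
            seen.add(key)
            # Hypothetical tag so Gate 2 can mark web search additions.
            merged.append({**item, "source": "web_search"})
    return merged
```

Name matching this simple will not catch aliases (for example a product that was renamed), so near-duplicates still need a manual look at Gate 2.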
Step 4 - GATE 2: Confirm Final Outline
This is a hard gate. Do not proceed past this step without explicit user confirmation.
Present the merged outline:
- Complete items list (original + web search additions, clearly marked)
- Complete field framework (original + web search additions, clearly marked)
Use AskUserQuestion to confirm the outline is correct.
Add-Items/Add-Fields Loop
User can say "add X item" or "add Y field" at this gate. If they do:
- Add the requested item/field to the framework
- Re-present the updated framework
- Ask for confirmation again
Repeat until user explicitly confirms. Do not generate files until confirmed.
4.1 Create Session Directory and Write Files
After confirmation:
mkdir -p workspace/research/DR-<NNNN>-<slug>/results
Write outline.yaml:
topic: "<research topic>"
session: "DR-<NNNN>-<slug>"
created: "<YYYY-MM-DD>"
items:
  - name: "<item name>"
    category: "<category>"
    description: "<description>"
  # ... more items
output_dir: "./results"
Write fields.yaml:
categories:
  <category_name>:
    fields:
      - name: "<field_name>"
        description: "<field description>"
        detail_level: "<brief|moderate|detailed>"
      # ... more fields
  # ... more categories
Step 5: Deep Research - Preparation
5.1 Read Outline
Read workspace/research/DR-<NNNN>-<slug>/outline.yaml to get items list.
5.2 Resume Check
Check for completed and in-progress results in results/:
ls workspace/research/DR-<NNNN>-<slug>/results/*.json 2>/dev/null
ls workspace/research/DR-<NNNN>-<slug>/results/*.started 2>/dev/null
Determine item status:
- .json exists → completed (skip this item)
- .started exists but no .json → interrupted (re-research this item)
- Neither exists → never started (research this item)
Log which items are being resumed vs skipped. Update progress.yaml (if it exists) with the current state before proceeding.
5.3 Prepare Execution Plan
Calculate:
- Total remaining items (after subtracting completed items; interrupted items count as remaining because they are re-researched per 5.2)
- Display which items will be researched and which are being skipped
Step 6: Execute Deep Research
6.1 Launch Research Agents
Launch remaining agents using the Task tool with model: sonnet, run_in_background: true, and max_turns: 25.
Batching strategy based on remaining item count:
- 10 or fewer items: Launch ALL agents in a single parallel batch.
- More than 10 items: Split into batches of 10. Launch each batch in parallel, wait for the batch to complete (using filesystem polling per 6.4), then launch the next batch. No inter-batch user approval needed — batching is automatic.
Each agent researches one item and outputs JSON for that item.
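The batching rule amounts to chunking the remaining items into groups of at most 10. A minimal sketch, with make_batches as a hypothetical helper name:

```python
def make_batches(items: list[str], batch_size: int = 10) -> list[list[str]]:
    """Split items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```

With 10 or fewer items this yields a single batch, which matches the "launch ALL agents" case; 23 items would give batches of 10, 10, and 3.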
Agent Prompt Template (per item):
Hard Constraint: The following prompt must be strictly reproduced, only replacing variables in {xxx}. Do not modify structure or wording.
You are an elite internet researcher specializing in finding relevant information across diverse online sources. Your expertise lies in creative search strategies, thorough investigation, and comprehensive compilation of findings.
## Progress Tracking
Before starting research, write a marker file to signal that this agent has started:
Write an empty file to {started_path}
After self-validation passes and the JSON result is confirmed correct, delete the marker file:
rm {started_path}
## Research Methodology
Get today's date first:
date +%Y-%m-%d
Generate 5-10 different search query variations to maximize coverage:
- Include technical terms, product names, and common variations
- Think of how different people might describe the same topic
- Use exact phrases in quotes for specific names
- Include version numbers and dates when relevant
### Search Strategy Reference
Use the following search strategies based on what is relevant to the research topic:
**GitHub/Debug Strategy** (for software, tools, technical projects):
- Search GitHub Issues (open and closed) for known bugs and workarounds
- Search for exact error messages in quotes
- Look for issue templates that match the problem pattern
- Check closed issues for resolution patterns
- Identify version-specific issues
**General Web Strategy** (for broad information gathering):
Sources: Reddit, official documentation, blog posts, Hacker News, Dev.to, Medium, Discord, X/Twitter
- Look for official recommendations first
- Cross-reference with community consensus
- Find examples from production use
- Identify anti-patterns and common pitfalls
- Note evolving best practices
- Create structured comparisons with clear criteria
- Find real-world usage examples and case studies
- Look for performance benchmarks and user experiences
**Academic Papers Strategy** (for research, algorithms, scientific topics):
Sources: Google Scholar, arXiv, Hugging Face Papers, bioRxiv, ResearchGate, Semantic Scholar, ACM Digital Library, IEEE Xplore
- Use Google Scholar as primary source with advanced search operators
- Search by author names, paper titles, DOI numbers
- Include year ranges to find seminal works and recent publications
- Look for related papers and citation patterns
- Search for preprints on arXiv and bioRxiv
- Track citation networks to understand research evolution
**Stack Overflow Strategy** (for programming, APIs, implementation):
Sources: Stack Overflow, Stack Exchange, technical forums
- Search for exact error messages and API names
- Look for accepted answers and highly-voted alternatives
- Check for version-specific solutions
## Information Gathering Standards
- Read beyond the first few results - valuable information is often buried
- Look for patterns in solutions across different sources
- Pay attention to dates to ensure relevance (note if information is outdated)
- Note different approaches and their trade-offs
- Identify authoritative sources and experienced contributors
- Verify information across multiple sources when possible
- Clearly indicate when information is speculative or unverified
## Task
Research {item_related_info}, output structured JSON to {output_path}
## Field Definitions
Read {fields_path} to get all field definitions
## Output Requirements
1. Output JSON according to fields defined in fields.yaml
2. Mark uncertain field values with [uncertain]
3. Add uncertain array at the end of JSON, listing all uncertain field names
4. All field values must be in English
## Self-Validation
After writing the JSON file, read it back and verify:
1. Every field defined in fields.yaml has a corresponding entry in the JSON
2. The JSON is valid (properly formatted)
3. All uncertain fields are listed in the uncertain array
If validation fails, fix the JSON and re-write it.
## Output Path
{output_path}
One-shot Example (assuming researching GitHub Copilot):
## Progress Tracking
Before starting research, write a marker file to signal that this agent has started:
Write an empty file to /home/user/workspace/research/DR-0001-ai-coding/results/GitHub_Copilot.started
After self-validation passes and the JSON result is confirmed correct, delete the marker file:
rm /home/user/workspace/research/DR-0001-ai-coding/results/GitHub_Copilot.started
## Task
Research name: GitHub Copilot category: International Product description: Developed by Microsoft/GitHub, first mainstream AI coding assistant, ~40% market share, output structured JSON to /home/user/workspace/research/DR-0001-ai-coding/results/GitHub_Copilot.json
## Field Definitions
Read /home/user/workspace/research/DR-0001-ai-coding/fields.yaml to get all field definitions
## Output Requirements
1. Output JSON according to fields defined in fields.yaml
2. Mark uncertain field values with [uncertain]
3. Add uncertain array at the end of JSON, listing all uncertain field names
4. All field values must be in English
## Self-Validation
After writing the JSON file, read it back and verify:
1. Every field defined in fields.yaml has a corresponding entry in the JSON
2. The JSON is valid (properly formatted)
3. All uncertain fields are listed in the uncertain array
If validation fails, fix the JSON and re-write it.
## Output Path
/home/user/workspace/research/DR-0001-ai-coding/results/GitHub_Copilot.json
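The three self-validation checks in the prompt could also be run mechanically. The sketch below is illustrative only: validate_result is a hypothetical helper, it takes the field names as a plain list rather than parsing fields.yaml, and it assumes fields may sit either at the top level or one level down inside a category sub-dict.

```python
import json


def validate_result(json_text: str, field_names: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the result passes."""
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    # Flatten one level of category sub-dicts; top-level keys win.
    present = dict(data)
    for value in data.values():
        if isinstance(value, dict):
            for key, inner in value.items():
                present.setdefault(key, inner)

    uncertain = data.get("uncertain")
    listed = set(uncertain) if isinstance(uncertain, list) else set()

    problems = []
    for name in field_names:
        if name not in present:
            problems.append(f"missing field: {name}")
        elif (isinstance(present[name], str)
              and "[uncertain]" in present[name]
              and name not in listed):
            problems.append(f"uncertain field not listed: {name}")
    return problems
```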
6.2 Parameter Construction
For each item being researched:
- {item_related_info}: The item's complete YAML content (name + category + description)
- {output_path}: Absolute path to workspace/research/DR-<NNNN>-<slug>/results/<item_name_slug>.json
  - Slugify: replace spaces with _, remove special characters
- {fields_path}: Absolute path to workspace/research/DR-<NNNN>-<slug>/fields.yaml
- {started_path}: Absolute path to workspace/research/DR-<NNNN>-<slug>/results/<item_name_slug>.started
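The parameter construction above can be sketched as follows. The helper names slugify_item and build_params are hypothetical, and treating "special characters" as anything other than ASCII letters, digits, and underscores is an assumption.

```python
import re
from pathlib import Path


def slugify_item(name: str) -> str:
    """Replace spaces with underscores, then drop other special characters."""
    return re.sub(r"[^A-Za-z0-9_]", "", name.strip().replace(" ", "_"))


def build_params(session_dir: str, item: dict) -> dict:
    """Build the four prompt variables for one item."""
    session = Path(session_dir).absolute()
    slug = slugify_item(item["name"])
    # Render the item's fields as simple "key: value" lines.
    info = "\n".join(
        f"{key}: {item[key]}"
        for key in ("name", "category", "description")
        if key in item
    )
    return {
        "item_related_info": info,
        "output_path": str(session / "results" / f"{slug}.json"),
        "fields_path": str(session / "fields.yaml"),
        "started_path": str(session / "results" / f"{slug}.started"),
    }
```

Two item names that differ only in special characters would collide on the same slug, so it is worth checking slugs for uniqueness before launching agents.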
6.3 Write progress.yaml
Immediately after launching all agents, write progress.yaml in the session directory:
status: in_progress
started: "<YYYY-MM-DD HH:MM>"
total_items: <N>
items:
  - name: "<Item Name>"
    slug: "<Item_Name>"
    status: pending
  # ... all items being researched
Items that were already completed (skipped) should not be listed — only items that agents were launched for.
6.4 Monitor Progress (Filesystem-Based)
CRITICAL: Do NOT use TaskOutput to read agent results. Agent outputs are large (extensive web search transcripts) and reading them into the orchestrator context will cause context window exhaustion. All research results are already persisted to disk as JSON files — the orchestrator only needs to check file existence.
Polling loop — repeat until all items are resolved:
- Check completion status via filesystem:
  ls <session_path>/results/*.json 2>/dev/null | wc -l
  ls <session_path>/results/*.started 2>/dev/null | wc -l
- Calculate: completed = count of .json files, in_progress = count of .started files without matching .json, remaining = total - completed
- Display progress: "Progress: X/Y items completed, Z still running."
- If in_progress > 0, wait ~30 seconds (use sleep 30 via Bash) then poll again
- If in_progress == 0 (all agents have finished, having either produced .json or exited), exit the loop
After polling loop completes:
- Update progress.yaml:
  - Set each item's status to completed if its .json file exists, or failed if only .started exists or neither exists
  - Set overall status to completed if all items done, or partial if some are missing
- Report final status: "All agents complete. X/Y items researched successfully."
- If any items failed, list them and suggest using resume mode to retry
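The counts used by the polling loop and the final status rules can be sketched like this. Both function names are hypothetical, and the sketch only inspects files for the slugs it is given; it does not sleep, loop, or write progress.yaml.

```python
from pathlib import Path


def poll_counts(results_dir: str, slugs: list[str]) -> dict[str, int]:
    """Compute completed / in_progress / remaining for one poll."""
    results = Path(results_dir)
    completed = sum((results / f"{s}.json").exists() for s in slugs)
    in_progress = sum(
        (results / f"{s}.started").exists()
        and not (results / f"{s}.json").exists()
        for s in slugs
    )
    return {
        "completed": completed,
        "in_progress": in_progress,
        "remaining": len(slugs) - completed,
    }


def final_status(results_dir: str, slugs: list[str]) -> tuple[str, dict[str, str]]:
    """Return (overall status, per-item status) after the loop exits."""
    results = Path(results_dir)
    items = {
        s: "completed" if (results / f"{s}.json").exists() else "failed"
        for s in slugs
    }
    overall = "completed" if all(v == "completed" for v in items.values()) else "partial"
    return overall, items
```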
Step 7: Report Configuration
7.1 Scan Summary Fields
Read all completed JSON results and identify fields suitable for TOC display:
- Numeric fields (stars, scores, citations)
- Short metric fields (dates, versions, ratings)
- Fields that appear across most/all items
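A rough way to shortlist TOC candidates is to count how often each field holds a number or a short string. This is a sketch with assumed thresholds: candidate_summary_fields, the 50% coverage cutoff, and the 20-character limit are all hypothetical, and only top-level (flat) fields are scanned.

```python
from collections import Counter


def candidate_summary_fields(
    records: list[dict], min_coverage: float = 0.5, max_len: int = 20
) -> list[str]:
    """Return field names that are numeric or short in enough records."""
    counts = Counter()
    for record in records:
        for key, value in record.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            is_short = isinstance(value, str) and 0 < len(value) <= max_len
            if is_number or is_short:
                counts[key] += 1
    threshold = min_coverage * len(records)
    return sorted(key for key, n in counts.items() if n >= threshold)
```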
7.2 Present Options
Present a dynamic options list based on actual fields found in the JSON results.
Step 7.5 - GATE 3: Confirm Report Config
This is a hard gate. Do not proceed past this step without explicit user confirmation.
Use AskUserQuestion to ask:
- Which summary fields to display in the TOC alongside item names?
- Present the available fields as options
Hard gate: Do not generate report until user confirms field selection.
Step 8: Generate Report
8.1 Generate Report Script
Generate generate_report.py in the session directory.
The script must handle:
1. JSON Structure Compatibility
Support two JSON structures:
- Flat structure: Fields directly at top level
  {"name": "xxx", "release_date": "xxx"}
- Nested structure: Fields in category sub-dict
  {"basic_info": {"name": "xxx"}, "technical_features": {...}}
Field lookup order: Top level -> category mapping key -> Traverse all nested dicts
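The lookup order can be sketched as a three-stage search. This is not the generated script itself: find_field is a hypothetical name, category_keys stands for the candidate JSON keys of the field's category, and searching nested dicts recursively (rather than one level deep) is an assumption.

```python
from typing import Any, Optional


def _search_nested(node: dict, field: str) -> Optional[Any]:
    """Depth-first search for field inside any nested dict."""
    for value in node.values():
        if isinstance(value, dict):
            if field in value:
                return value[field]
            found = _search_nested(value, field)
            if found is not None:
                return found
    return None


def find_field(record: dict, field: str, category_keys: list[str]) -> Optional[Any]:
    """Look up field: top level, then category keys, then all nested dicts."""
    if field in record:                      # 1. flat structure
        return record[field]
    for key in category_keys:                # 2. category mapping keys
        section = record.get(key)
        if isinstance(section, dict) and field in section:
            return section[field]
    return _search_nested(record, field)     # 3. traverse nested dicts
```

For example, find_field({"basic_info": {"name": "xxx"}}, "name", ["basic_info", "Basic Info"]) resolves through the second stage.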
2. Category Mapping
Map between fields.yaml category names and JSON keys:
CATEGORY_MAPPING = {
    "Basic Info": ["basic_info", "Basic Info"],
    "Technical Features": ["technical_features", "technical_characteristics", "Technical Features"],
    "Performance Metrics": ["performance_metrics", "performance", "Performance Metrics"],
    "Milestone Significance": ["milestone_significance", "milestones", "Milestone Significance"],
    "Business Info": ["business_info", "commercial_info", "Business Info"],
    "Competition & Ecosystem": ["competition_ecosystem", "competition", "Competition & Ecosystem"],
    "History": ["history", "History"],
    "Market Positioning": ["market_positioning", "market", "Market Positioning"],
}
3. Complex Value Formatting
- List of dicts (e.g., key_events, funding_history): Format each dict as one line, separate kv with |
- Normal list: Short lists joined with comma, long lists displayed with line breaks
- Nested dict: Recursive formatting, display with semicolon or line breaks
- Long text strings (over 100 chars): Add line breaks <br> or use blockquote format for readability
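The formatting rules above might look like this in the report script. It is a sketch under assumptions: format_value is a hypothetical name, the cutoff of 5 entries for a "short" list is invented for illustration, and long text is rendered as a blockquote rather than with <br> breaks.

```python
def format_value(value, long_text: int = 100, short_list: int = 5) -> str:
    """Render a JSON value as markdown-friendly text."""
    if isinstance(value, dict):
        # Nested dict: recurse and join entries with semicolons.
        return "; ".join(f"{k}: {format_value(v)}" for k, v in value.items())
    if isinstance(value, list):
        if value and all(isinstance(v, dict) for v in value):
            # List of dicts: one line per dict, key-value pairs split by "|".
            return "\n".join(
                "- " + " | ".join(f"{k}: {v}" for k, v in d.items())
                for d in value
            )
        parts = [format_value(v) for v in value]
        if len(parts) <= short_list:
            return ", ".join(parts)
        return "\n".join(f"- {p}" for p in parts)
    text = str(value)
    # Long strings become a blockquote for readability.
    return "> " + text if len(text) > long_text else text
```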
4. Extra Fields Collection
Collect fields that exist in JSON but not defined in fields.yaml, put in "Other Info" category. Filter out:
- Internal fields: _source_file, uncertain
- Nested structure top-level keys matching category names
- uncertain array: Display each field name on separate line
5. Uncertain Value Skipping
Skip conditions:
- Field value contains [uncertain] string
- Field name is in uncertain array
- Field value is None or empty string
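The three skip conditions reduce to a single predicate. A minimal sketch, with should_skip as a hypothetical helper name:

```python
def should_skip(field: str, value, uncertain: list[str]) -> bool:
    """Return True when a field should be left out of the report."""
    if value is None or value == "":
        return True                      # empty value
    if field in uncertain:
        return True                      # listed in the uncertain array
    # String values carrying the [uncertain] marker are skipped too.
    return isinstance(value, str) and "[uncertain]" in value
```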
6. Report Format
The generated report.md must have:
- Table of Contents: Every item with number, name (anchor link), and user-selected summary fields
  - Example: 1. [GitHub Copilot](#github-copilot) - Stars: 10k | Score: 85%
- Detailed Sections: One section per item, organized by field category
  - Each category as a subsection heading
  - Each field as a labeled entry
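A TOC entry in the format shown above could be built as follows. Both helper names are hypothetical, and heading_anchor only approximates how markdown renderers derive heading anchors (lowercase, punctuation dropped, whitespace turned into hyphens); the exact rules vary by renderer.

```python
import re


def heading_anchor(title: str) -> str:
    """Approximate a markdown heading anchor for title."""
    cleaned = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"\s+", "-", cleaned.strip())


def toc_line(index: int, name: str, summary: dict) -> str:
    """Build one numbered TOC entry with optional summary fields."""
    line = f"{index}. [{name}](#{heading_anchor(name)})"
    if summary:
        line += " - " + " | ".join(
            f"{label}: {value}" for label, value in summary.items()
        )
    return line
```

Calling toc_line(1, "GitHub Copilot", {"Stars": "10k", "Score": "85%"}) reproduces the example entry from the list above.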
8.2 Execute Script
python workspace/research/DR-<NNNN>-<slug>/generate_report.py
Step 9: Present Summary
Present completion stats:
## Research Complete: <topic>
**Session**: workspace/research/DR-<NNNN>-<slug>/
**Items Researched**: <count>
### Output Files
- outline.yaml - Research outline and items list
- fields.yaml - Field definitions
- progress.yaml - Execution progress tracking
- results/ - JSON results per item (<count> files)
- generate_report.py - Report generation script
- report.md - Final markdown report
### Items with High Uncertainty
- <item name>: <count> uncertain fields
- ...
Search Strategy Reference
These strategies are used by research agents to systematically explore information sources.
GitHub Debug Strategy
Trigger: Software bugs, error debugging, issue lookup, version-specific problems
Sources: GitHub Issues (open and closed)
Query Strategy:
- Search for exact error messages in quotes
- Look for issue templates that match the problem pattern
- Find workarounds, not just explanations
- Check if it's a known bug with existing patches or PRs
- Look for similar issues even if not exact matches
- Identify if the issue is version-specific
- Search for both the library name + error and more general descriptions
- Check closed issues for resolution patterns
General Web Strategy
Trigger: General information, news, product comparisons, best practices
Sources:
- Reddit (r/programming, r/webdev, r/javascript, topic-specific subreddits) - real-world experiences
- Official documentation and changelogs - authoritative information
- Blog posts and tutorials - detailed explanations
- Hacker News - high-quality technical discourse
- Dev.to - developer community with technical articles
- Medium - technical blog platform with in-depth articles
- Discord - official discussion channels for open source projects
- X/Twitter - technical announcements and discussions from developers
Query Strategy:
- Look for official recommendations first
- Cross-reference with community consensus
- Find examples from production codebases
- Identify anti-patterns and common pitfalls
- Note evolving best practices and deprecated approaches
- Create structured comparisons with clear criteria
- Find real-world usage examples and case studies
- Look for performance benchmarks and user experiences
- Identify trade-offs and decision factors
- Consider scalability, maintenance, and learning curve
Academic Papers Strategy
Trigger: Paper search, academic research, algorithm fundamentals
Sources:
- Google Scholar (scholar.google.com) - comprehensive academic search engine
- arXiv (arxiv.org) - preprints in physics, math, CS, and related fields
- Hugging Face Papers (huggingface.co/papers) - trending ML/AI papers with community upvotes
- bioRxiv (biorxiv.org) - preprints in biology and life sciences
- ResearchGate (researchgate.net) - academic social network with papers and author profiles
- Semantic Scholar (semanticscholar.org) - AI-powered academic search
- ACM Digital Library and IEEE Xplore - CS and engineering papers
Query Strategy:
- Use Google Scholar as primary source with advanced search operators
- Search by author names, paper titles, DOI numbers, institutions, and publication years
- Use quotation marks for exact titles and author name combinations
- Include year ranges to find seminal works and recent publications
- Look for related papers and citation patterns to identify seminal works
- Search for preprints on arXiv, bioRxiv, and institutional repositories
- Check author profiles and ResearchGate for publications and PDFs
- Identify open-access versions and legal paper download sources
- Track citation networks to understand research evolution
- Note impact factors, h-index, and citation counts for relevance assessment
- Search for conference proceedings, journals, and workshop papers
Stack Overflow Strategy
Trigger: Programming Q&A, code implementation, API usage
Sources:
- Stack Overflow and other Stack Exchange sites - technical Q&A
- Technical forums and discussion boards - community wisdom
Query Strategy:
- Search for exact error messages and API signatures
- Look for accepted answers and highly-voted alternatives
- Check for version-specific solutions and deprecated approaches
- Cross-reference with official documentation
- Note common pitfalls mentioned in answers
Mindset
You are not here to rubber-stamp research topics. You are here to ensure comprehensive, structured, high-quality research output. Every gap you miss is a blind spot in the final report.
Be direct. Be thorough. Be systematic.
Do not:
- Accept vague topics without clarification
- Skip confirmation gates
- Launch research agents before user confirms the outline
- Generate files before Gate 2 confirmation
- Generate reports before Gate 3 confirmation
- Use TaskOutput to read research agent results (causes context exhaustion — use filesystem polling instead)
- Launch more than 10 agents simultaneously (use automatic batching for larger sets)
Do:
- Generate comprehensive initial frameworks from domain knowledge
- Supplement with web search to catch missing items
- Present clear, structured summaries at each gate
- Support add-items/add-fields within Gate 2
- Track uncertain values across all results
- Generate reproducible report scripts
- Track progress in progress.yaml throughout execution
- Write .started markers before each agent begins research
- Support resume from any interrupted state