Learn-skills.dev multi-ai-research
Comprehensive research and analysis using Claude (subagents), Gemini CLI, and Codex CLI. Multi-perspective research with cross-verification, iterative refinement, and 100% citation coverage. Use for security analysis, architecture research, code quality assessment, performance analysis, or any research requiring rigorous verification and multiple AI perspectives.
```shell
# Clone the repository
git clone https://github.com/NeverSight/learn-skills.dev

# Or install the skill directly into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/adaptationio/skrillz/multi-ai-research" ~/.claude/skills/neversight-learn-skills-dev-multi-ai-research && rm -rf "$T"
```

Skill file: data/skills-md/adaptationio/skrillz/multi-ai-research/SKILL.md

Multi-AI Research & Analysis
Overview
Harnesses three AI systems (Claude via Task tool, Gemini CLI, Codex CLI) for comprehensive research and analysis with multi-perspective verification and iterative refinement.
Purpose: Produce analysis more thorough than any single AI could achieve through specialized roles, cross-validation, and systematic verification.
Key Innovation: This is not just parallel execution. Specialized research roles cross-verify each other and refine iteratively until the output is production-ready (quality ≥95/100, 100% citations, zero gaps).
The 3 AI Systems:
- Claude Subagents (via Task tool) - Documentation, codebase analysis, synthesis
- Gemini CLI - Web research, latest trends, community practices
- Codex CLI - GitHub patterns, code examples, deep reasoning
Quality Guarantees:
- ✓ 100% coverage - All objectives addressed, zero gaps
- ✓ 100% citations - Every claim sourced (file:line or URL)
- ✓ Multi-perspective - 3 AI systems cross-validated
- ✓ ≥95/100 quality - Verified through 3-pass system
- ✓ Actionable - Specific recommendations with examples
- ✓ Resumable - External memory enables multi-session work
When to Use
Use this skill for:
Security Analysis
- Authentication/authorization assessment
- Vulnerability identification
- Best practice validation
- OWASP Top 10 coverage
- Penetration testing preparation
Architecture Analysis
- System design review
- Component mapping
- Integration pattern analysis
- Scalability assessment
- Technical debt evaluation
Code Quality Analysis
- Pattern detection
- Code smell identification
- Complexity metrics
- Refactoring opportunities
- Best practice adherence
Performance Analysis
- Bottleneck identification
- Algorithm complexity
- Resource usage patterns
- Optimization opportunities
- Benchmark analysis
Research Synthesis
- Multi-source research compilation
- Best practice identification
- Technology evaluation
- Pattern discovery
- Trend analysis
Comprehensive Reviews
- Pre-production audit
- System health check
- Compliance verification
- Documentation audit
- Knowledge transfer
Quick Start
Option 1: Automated Script
```shell
# Run complete analysis automatically
bash .claude/skills/multi-ai-research/scripts/analyze.sh "Security analysis of authentication system"
```
This will:
- Create analysis plan
- Launch parallel research (Claude + Gemini + Codex)
- Perform deep analysis
- Synthesize and verify
- Iterate if needed
- Generate final report
Option 2: Interactive Mode
Ask Claude Code to use this skill:
"Use multi-ai-research to analyze [objective]"
Claude will:
- Create comprehensive analysis plan
- Coordinate all three AI systems
- Synthesize findings
- Verify quality
- Iterate until ≥95 quality
- Deliver final report
The 6-Phase Pipeline
Phase 1: Planning & Strategy
Duration: 5-10 minutes. Output: .analysis/ANALYSIS_PLAN.md
Claude creates comprehensive plan:
- Defines objectives and scope
- Plans file reading strategy (glob → grep → read)
- Assigns tasks to AI systems
- Sets verification criteria
- Defines success thresholds
Phase 2: Parallel Research
Duration: 10-20 minutes. Output: .analysis/research/*.md
All three systems research simultaneously:
Claude Subagent:
- Official documentation analysis
- Codebase examination (progressive disclosure)
- Architecture mapping
- Pattern identification
Gemini CLI:
- Web research (latest 2024-2025)
- Community best practices
- Industry trends
- Common pitfalls
Codex CLI:
- GitHub pattern analysis
- Code examples from top repos
- Implementation references
- Testing strategies
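The parallel fan-out above can be sketched in plain bash. The `gemini -p` and `codex exec` invocations come from the Configuration section below; the prompts, output file names, and the placeholder-on-missing-CLI behavior are illustrative assumptions, not the skill's actual implementation:

```shell
#!/usr/bin/env bash
# Sketch of Phase 2: run the external researchers concurrently, each writing
# its findings to external memory under .analysis/research/.
set -u
OBJECTIVE="${1:-Security audit of authentication system}"
mkdir -p .analysis/research

research_gemini() {
  if command -v gemini >/dev/null 2>&1; then
    gemini -p "Best practices and latest trends for: $OBJECTIVE" > .analysis/research/gemini-web.md
  else
    echo "# Gemini unavailable - Claude-only fallback" > .analysis/research/gemini-web.md
  fi
}

research_codex() {
  if command -v codex >/dev/null 2>&1; then
    codex exec "GitHub patterns and code examples for: $OBJECTIVE" > .analysis/research/codex-github.md
  else
    echo "# Codex unavailable - Claude-only fallback" > .analysis/research/codex-github.md
  fi
}

# The Claude subagent is dispatched via the Task tool inside Claude Code;
# here we only reserve its output slot in external memory.
: > .analysis/research/claude-docs.md

research_gemini &   # both external CLIs run in the background
research_codex &
wait                # block until every researcher has written its file
```

Because each researcher writes to its own file, the synthesis phase can read the results in any order, and a crashed run leaves whatever finished on disk.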
Phase 3: Deep Analysis
Duration: 15-30 minutes. Output: .analysis/analysis/code-patterns.md
Claude Analysis Agent with extended thinking:
- Progressive codebase analysis
- Pattern recognition across sources
- Architecture mapping
- Metrics calculation
- Risk assessment
Phase 4: Synthesis & Verification
Duration: 10-20 minutes. Outputs: .analysis/SYNTHESIS_REPORT.md and .analysis/verification/cross-check.md
Synthesis (Claude with extended thinking):
- Read all research findings
- Identify themes across sources
- Resolve contradictions
- Create unified narrative
- Full citations
Verification (Verification Subagent):
- 3-pass verification (completeness, accuracy, quality)
- Cross-source validation
- Citation checking
- Gap analysis
- Quality scoring
Phase 5: Iteration (if needed)
Duration: 10-30 minutes. Output: .analysis/iterations/ITERATION_2.md
If quality <95 or gaps exist:
- Targeted research for gaps
- Quality improvements
- Re-verification
- Repeat until ≥95
Phase 6: Final Report
Duration: 5-10 minutes. Output: .analysis/ANALYSIS_FINAL.md
Comprehensive final report:
- Executive summary
- Complete findings
- All sources synthesized
- Prioritized recommendations
- Implementation guidance
- Full citations
Total Time: 45-90 minutes for comprehensive analysis
Analysis Types
Security Analysis
What it checks:
- Authentication/authorization patterns
- Input validation
- Secret management
- Injection vulnerabilities (SQL, XSS, etc.)
- Dependency vulnerabilities
- Rate limiting
- Session security
Example:
Use multi-ai-research for "Security audit of authentication system"
Output:
- Critical/High/Medium/Low priority issues
- OWASP Top 10 coverage
- Code examples with file:line
- Specific remediation steps
- Industry best practices comparison
Architecture Analysis
What it examines:
- System components and boundaries
- Integration patterns
- Data flow
- Dependency relationships
- Scalability considerations
- Design patterns used
Example:
Use multi-ai-research for "Architecture analysis of microservices system"
Output:
- Component map with relationships
- Integration pattern analysis
- Scalability assessment
- Technical debt identification
- Refactoring recommendations
Code Quality Analysis
What it analyzes:
- Code patterns and organization
- Complexity metrics
- Code smells
- Best practice adherence
- Test coverage
- Documentation quality
Example:
Use multi-ai-research for "Code quality assessment for ./src"
Output:
- Quality score with breakdown
- Pattern analysis
- Refactoring priorities
- Specific code improvements
- Complexity hotspots
Performance Analysis
What it identifies:
- Algorithm complexity
- Bottlenecks
- Resource usage patterns
- Database query efficiency
- Network call patterns
Example:
Use multi-ai-research for "Performance bottleneck identification"
Output:
- Bottleneck analysis with file:line
- Optimization opportunities
- Before/after estimations
- Implementation guidance
Research Synthesis
What it compiles:
- Official documentation
- Web best practices
- GitHub patterns
- Industry standards
- Community insights
Example:
Use multi-ai-research for "Research GraphQL federation patterns 2024-2025"
Output:
- Multi-source synthesis
- Consensus findings (all sources agree)
- Multiple perspectives (sources differ)
- Code examples
- Implementation recommendations
How It Works
Progressive Disclosure
Never reads files blindly. Always uses 3-level approach:
Level 1: Metadata (glob) - ~50 tokens

```
glob "**/*.{ts,js,py}"   # Understand structure
glob "**/*.md"           # Find documentation
glob "**/package.json"   # Check dependencies
```

Level 2: Patterns (grep) - ~5k tokens

```
grep "export class|interface" --glob "**/*.ts"
grep "TODO|FIXME|BUG" --glob "**/*"
grep "password|secret|token" --glob "**/*.ts"
```

Level 3: Reading (read) - ~50k tokens

```
read "src/auth/login.ts"    # Only critical files
read "docs/architecture.md"
```
Result: 90%+ reduction in unnecessary file reads
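The three levels can be approximated with ordinary Unix tools. This is a self-contained sketch against a throwaway sample tree, not the skill's implementation (inside Claude Code the glob/grep/read tools play these roles):

```shell
#!/usr/bin/env bash
# Progressive disclosure with find/grep/cat: each level narrows the set of
# files before the next, so full reads happen only for flagged files.
set -u
ROOT=$(mktemp -d)
mkdir -p "$ROOT/src/auth"
printf 'export class Login {}\n// TODO: add rate limiting\n' > "$ROOT/src/auth/login.ts"
printf '# Architecture overview\n' > "$ROOT/docs.md"

# Level 1: metadata - learn the shape of the codebase without reading anything
TS_COUNT=$(find "$ROOT" -name '*.ts' | wc -l)
echo "TypeScript files: $TS_COUNT"

# Level 2: patterns - search for signals that tell us which files matter
FLAGGED=$(grep -rlE 'TODO|FIXME|BUG' "$ROOT")
echo "Files with open work items: $FLAGGED"

# Level 3: reading - read only the files flagged by the earlier levels
CONTENT=$(cat "$FLAGGED")
echo "$CONTENT"

rm -rf "$ROOT"
```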
External Memory Architecture
All state saved to files, not context:
```
.analysis/
├── ANALYSIS_PLAN.md        # Strategy and assignments
├── research/
│   ├── claude-docs.md      # Claude research
│   ├── gemini-web.md       # Gemini research
│   └── codex-github.md     # Codex research
├── analysis/
│   ├── code-patterns.md    # Pattern analysis
│   └── architecture-map.md # System map
├── verification/
│   └── cross-check.md      # Verification results
├── iterations/
│   ├── ITERATION_1.md      # First pass
│   └── ITERATION_2.md      # Gap fills
└── ANALYSIS_FINAL.md       # Complete report
```
Benefits:
- Survives context window limits
- Enables multi-session analysis
- Resumable from any checkpoint
- No information loss
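Creating the layout up front is one way to make runs resumable: every phase then has a known place to write, and a later session can inspect what already exists. A minimal sketch, assuming the directory names shown in the tree above:

```shell
# Materialize the external-memory skeleton before any research starts.
mkdir -p .analysis/research .analysis/analysis \
         .analysis/verification .analysis/iterations
# Reserve the plan file so Phase 1 has a fixed write target.
touch .analysis/ANALYSIS_PLAN.md
```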
Cross-Validation Pattern
- High Confidence (★★★★★): All 3 sources agree + code verification
- Medium Confidence (★★★☆☆): 2/3 sources agree
- Requires Investigation (★★☆☆☆): Sources conflict
Example:
```
## JWT Implementation (High Confidence ★★★★★)
**Claude**: "Uses JWT with HS256" (src/auth/jwt.ts:15)
**Gemini**: "HS256 is industry standard 2024" (URL)
**Codex**: "150+ repos use HS256 pattern" (GitHub)
**Code**: Verified at src/auth/jwt.ts:18-22
**Recommendation**: Implementation correct per standards
```
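The rating rubric can be expressed as a small decision function. The function name and its inputs (agreement count plus a code-verification flag) are illustrative, not part of the skill:

```shell
# Map how many of the three sources agree, and whether the claim was
# verified against actual code, onto the confidence tiers above.
confidence() {
  local agree=$1 code_verified=$2
  if [ "$agree" -eq 3 ] && [ "$code_verified" = yes ]; then
    echo "High (5 stars)"
  elif [ "$agree" -ge 2 ]; then
    echo "Medium (3 stars)"
  else
    echo "Requires investigation (2 stars)"
  fi
}

confidence 3 yes   # all three sources agree and code confirms
confidence 2 no    # two of three sources agree
confidence 1 no    # sources conflict
```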
Quality Scoring
Comprehensive rubric (0-100):
- Comprehensiveness (/20): All aspects covered
- Accuracy (/20): All claims sourced and verified
- Specificity (/20): File:line precision, not vague
- Actionability (/20): Specific recommendations
- Consistency (/20): No contradictions
Quality Gates:
- ≥95: Production-ready
- 85-94: Needs minor refinement
- 75-84: Needs iteration
- <75: Requires rework
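As a sketch, the rubric and gates reduce to summing five 20-point components and thresholding the total (the function name is hypothetical):

```shell
# Sum the five rubric components (comprehensiveness, accuracy, specificity,
# actionability, consistency) and map the total onto the quality gates.
quality_gate() {
  local comp=$1 acc=$2 spec=$3 act=$4 cons=$5
  local total=$(( comp + acc + spec + act + cons ))
  if   [ "$total" -ge 95 ]; then echo "$total: production-ready"
  elif [ "$total" -ge 85 ]; then echo "$total: needs minor refinement"
  elif [ "$total" -ge 75 ]; then echo "$total: needs iteration"
  else                           echo "$total: requires rework"
  fi
}

quality_gate 20 20 19 19 19   # near-perfect scores pass the top gate
```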
Iterative Refinement
- Iteration 1 (Breadth): Broad coverage, identifies gaps
- Iteration 2 (Depth): Fill gaps, improve quality
- Iteration 3 (Polish): Final verification and polish
Automatic iteration until:
- Quality ≥95
- Citation coverage = 100%
- Critical gaps = 0
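The stop condition can be sketched as a loop over the three gates. Here `run_iteration` is a hypothetical stand-in that simulates one research/verify pass with hard-coded scores; in the real skill this step launches targeted research and re-verification:

```shell
# Simulate one pass: prints "quality citation-coverage critical-gaps".
run_iteration() {
  case $1 in
    1) echo "88 92 2" ;;   # first pass: breadth, gaps remain
    *) echo "96 100 0" ;;  # later passes: gaps filled
  esac
}

iter=1
while [ "$iter" -le 3 ]; do
  # Split the simulated scores into the three gate variables.
  set -- $(run_iteration "$iter")
  quality=$1; citations=$2; gaps=$3
  echo "Iteration $iter: quality=$quality citations=${citations}% gaps=$gaps"
  # Stop only when every gate passes: quality >=95, 100% citations, 0 gaps.
  if [ "$quality" -ge 95 ] && [ "$citations" -eq 100 ] && [ "$gaps" -eq 0 ]; then
    break
  fi
  iter=$((iter + 1))
done
```

The cap of three passes mirrors the Breadth/Depth/Polish schedule above; a run that still fails the gates after the cap would need manual attention.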
AI System Roles
Claude Subagents (via Task tool)
Research Agent (Haiku):
- Progressive disclosure expert
- Documentation analysis
- Codebase examination
- Pattern detection
Analysis Agent (Sonnet):
- Extended thinking for synthesis
- Multi-source integration
- Pattern recognition
- Architectural insights
Verification Agent (Haiku):
- 3-pass verification
- Citation checking
- Gap analysis
- Quality scoring
Gemini CLI
Strengths:
- Native web search
- Latest trends (2024-2025)
- Community practices
- Multimodal analysis (if needed)
Use for:
- Best practice research
- Industry standards
- Latest vulnerabilities
- Framework comparisons
Codex CLI
Strengths:
- GitHub integration
- Code pattern search
- Deep reasoning (o3 model)
- Implementation examples
Use for:
- Code examples
- Design patterns
- Architecture reasoning
- Testing strategies
Configuration
Prerequisites
Required:
- Claude Code (with Task tool access)
Optional but Recommended:
- Gemini CLI: npm install -g @google/gemini-cli
- Codex CLI: npm install -g @openai/codex
Note: Skill works with Claude-only fallback if Gemini/Codex unavailable.
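The fallback decision presumably reduces to a PATH check before task assignment; a minimal sketch (the warning text is illustrative):

```shell
# Detect which optional CLIs are installed before planning assignments.
AVAILABLE="claude"   # the Task tool is always present inside Claude Code
if command -v gemini >/dev/null 2>&1; then AVAILABLE="$AVAILABLE gemini"; fi
if command -v codex  >/dev/null 2>&1; then AVAILABLE="$AVAILABLE codex";  fi

case "$AVAILABLE" in
  claude) echo "Warning: Claude-only fallback (install gemini/codex for full coverage)" ;;
  *)      echo "Research systems: $AVAILABLE" ;;
esac
```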
Gemini CLI Setup
```shell
# Install
npm install -g @google/gemini-cli

# Authenticate (OAuth - free)
gemini
# Follow browser authentication

# Test
gemini -p "test prompt"
```
Codex CLI Setup
```shell
# Install
npm install -g @openai/codex

# Authenticate (ChatGPT Plus/Pro account)
codex login
# Follow browser authentication

# Test
codex exec "test prompt"
```
Model Selection
Claude:
- Haiku: Research & verification (fast, efficient)
- Sonnet: Analysis & synthesis (balanced)
- Opus: Complex reasoning (if needed)
Gemini:
- gemini-2.5-flash: Quick research
- gemini-2.5-pro: Complex analysis
Codex:
- gpt-5.1-codex: Standard tasks
- o3: Deep architectural reasoning
- o4-mini: Quick operations
Examples
Example 1: Security Analysis
```
Objective: "Security audit of authentication system"

Phase 2 - Parallel Research:
├─ Claude: Analyzes src/auth/* for patterns
├─ Gemini: Researches "OAuth 2.0 security best practices 2024"
└─ Codex: Finds GitHub examples of secure auth

Phase 3 - Analysis:
└─ Claude: Identifies 3 critical, 5 high priority issues

Phase 4 - Synthesis:
└─ All agree: Missing rate limiting (CRITICAL)
   - Claude: No rate limit found in src/auth/login.ts
   - Gemini: OWASP recommends max 5 attempts/hour
   - Codex: 150+ repos use express-rate-limit
   - Recommendation: Implement with Redis backend

Final Report:
├─ Executive summary
├─ 8 issues (3 critical, 5 high) with fixes
├─ OWASP Top 10 coverage
├─ Specific code examples
└─ Priority implementation plan

Quality: 97/100 ✓
```
Example 2: Architecture Analysis
```
Objective: "Analyze microservices architecture"

Phase 2:
├─ Claude: Maps services via glob + grep
├─ Gemini: Researches microservices patterns 2024
└─ Codex: Finds service mesh examples

Phase 3:
└─ Claude: Identifies 7 services, 12 integration points

Phase 4:
└─ Synthesis: Service communication patterns
   - Consensus: REST for external, gRPC for internal
   - Trade-offs documented
   - Scaling strategies from Codex examples

Final Report:
├─ Component map (7 services, dependencies)
├─ Integration analysis (12 patterns)
├─ Scalability assessment
└─ Modernization recommendations

Quality: 96/100 ✓
```
Example 3: Research Synthesis
```
Objective: "Research state management patterns for React 2024"

Phase 2:
├─ Claude: Reviews React docs + examples
├─ Gemini: Web research "React state management 2024"
└─ Codex: Analyzes top 50 React repos

Phase 3:
└─ Pattern analysis: 5 major approaches identified

Phase 4:
└─ Synthesis by use case:
   - Small apps: Context (all sources agree)
   - Medium apps: Zustand (Gemini + Codex recommend)
   - Large apps: Redux Toolkit (battle-tested, Codex data)
   - Server state: TanStack Query (trending, Gemini research)

Final Report:
├─ Decision tree by project size
├─ Pros/cons with sources
├─ Migration strategies
└─ Code examples from Codex

Quality: 98/100 ✓
```
Best Practices
1. Be Specific with Objectives
❌ "Analyze the code"
✅ "Security analysis of authentication module for OWASP Top 10 compliance"
2. Trust the Verification
Multi-pass verification catches issues. If quality <95, iteration happens automatically.
3. Review External Memory
Check the .analysis/ folder during execution to see progress.
4. Leverage Citations
Every claim has file:line or URL. Use for validation and deep dives.
5. Multi-Session Projects
Large projects can span sessions:
- Session 1: Initial analysis → ITERATION_1.md
- Session 2: Gap filling → ITERATION_2.md
- Session 3: Final polish → ANALYSIS_FINAL.md
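Resuming a later session means finding the last checkpoint in external memory. A sketch, assuming the ITERATION_N.md naming convention shown above (the two `touch`ed files simulate prior sessions):

```shell
# Determine the next iteration number from the files left by prior sessions.
mkdir -p .analysis/iterations
touch .analysis/iterations/ITERATION_1.md .analysis/iterations/ITERATION_2.md

last=$(ls .analysis/iterations/ITERATION_*.md 2>/dev/null \
        | sed 's/.*ITERATION_\([0-9]*\)\.md/\1/' | sort -n | tail -n1)
next=$(( ${last:-0} + 1 ))
echo "Resuming at iteration $next"
```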
6. Check All Three Perspectives
High-value insights often come from comparing AI perspectives.
Troubleshooting
Low Quality Score (<95)
Cause: Gaps in coverage or missing citations
Solution: Automatic iteration 2 fills gaps
Check: .analysis/verification/cross-check.md for details
Missing Citations
Cause: Verification flags uncited claims
Solution: Iteration adds missing attributions
Prevention: All agents trained to cite sources
Gemini/Codex Unavailable
Fallback: Claude-only analysis with warning
Impact: Reduced perspectives but still comprehensive
Install: npm install -g @google/gemini-cli @openai/codex
Conflicting Information
Resolution: Synthesis phase investigates conflicts
Method: Check ground truth (actual code/docs)
Output: Documented reasoning for resolution
Related Skills
- anthropic-expert: Anthropic product expertise
- codex-cli: Codex integration patterns
- gemini-cli: Gemini integration patterns
- tri-ai-collaboration: General tri-AI workflows
- analysis: Code/skill/process analysis
Quick Reference
Command Line
```shell
# Full automated analysis
bash .claude/skills/multi-ai-research/scripts/analyze.sh "objective"

# Interactive with Claude Code
# Just ask: "Use multi-ai-research for [objective]"
```
File Locations
| File | Purpose |
|---|---|
| .analysis/ANALYSIS_PLAN.md | Strategy and assignments |
| .analysis/research/*.md | All AI research outputs |
| .analysis/SYNTHESIS_REPORT.md | Multi-source synthesis |
| .analysis/ANALYSIS_FINAL.md | Complete final report |
Quality Metrics
| Metric | Threshold | Meaning |
|---|---|---|
| Quality Score | ≥95/100 | Production-ready |
| Citation Coverage | 100% | All claims sourced |
| Completeness | ≥95% | All objectives met |
| Critical Gaps | 0 | No missing essentials |
Analysis Time Estimates
| Type | Time | Iterations |
|---|---|---|
| Security | 45-60 min | 1-2 |
| Architecture | 60-90 min | 1-2 |
| Code Quality | 30-45 min | 1 |
| Performance | 45-60 min | 1-2 |
| Research | 30-60 min | 1 |
multi-ai-research delivers production-ready analysis through systematic multi-AI collaboration, rigorous verification, and iterative refinement, ensuring nothing is missed and every claim is verified.