Sherlock-ai-plugin paper2code
install
source · Clone the upstream repo
git clone https://github.com/proyecto26/sherlock-ai-plugin
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/proyecto26/sherlock-ai-plugin "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/paper2code" ~/.claude/skills/proyecto26-sherlock-ai-plugin-paper2code && rm -rf "$T"
manifest:
skills/paper2code/SKILL.md
Paper2Code: AI Agent for Converting Research Papers into Code
Overview
This Skill runs a 4+2 stage pipeline that systematically analyzes research papers and converts them into executable code.
Core Principle: Do not simply read the paper and generate code; generate a structured intermediate representation (YAML) first, then write the code.
⚠️ Critical Behavioral Control Rules (CRITICAL)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ MANDATORY BEHAVIORAL RULES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. Implement one file at a time
2. Proceed to the next file only after completing the current file, without asking for confirmation
3. Original paper specifications always take precedence over reference code
4. Perform a Self-Check for each Phase before completion
5. Save all intermediate results as YAML files

DO:
✓ Implement exactly what is stated in the paper
✓ Write simple and direct code
✓ Working code first, elegant code later
✓ Test each component immediately
✓ Move to the next file immediately after implementation is complete

DON'T:
✗ Ask "Shall I implement the next file?" between files
✗ Write extensive documentation that core functionality does not require
✗ Optimize where reproducibility does not need it
✗ Add excessive abstraction or design patterns
✗ Provide instructions without writing actual code
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Input Processing
Supported Formats
- arXiv URL: https://arxiv.org/abs/xxxx.xxxxx or https://arxiv.org/pdf/xxxx.xxxxx.pdf
- PDF File Path: /path/to/paper.pdf
- Converted Text/Markdown: When paper content is provided as text
Input Processing Method
For arXiv URL:
# Convert to PDF URL and download
curl -L "https://arxiv.org/pdf/xxxx.xxxxx.pdf" -o paper.pdf
# Convert PDF to text (using pdftotext)
pdftotext -layout paper.pdf paper.txt
For PDF File:
pdftotext -layout "/path/to/paper.pdf" paper.txt
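The "Convert to PDF URL" step can be sketched in Python as below. This is a minimal illustration, not part of the skill itself: the helper name `arxiv_pdf_url` and the regular expression are assumptions, and only modern numeric arXiv identifiers (such as `2301.12345`) are handled.

```python
import re

# Assumption: only modern identifiers such as 2301.12345 or 2301.12345v2.
# Older identifiers (e.g. hep-th/9901001) are not handled in this sketch.
_ARXIV_ID = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5}(?:v\d+)?)")


def arxiv_pdf_url(url: str) -> str:
    """Return the PDF URL for an arXiv abs/ or pdf/ link (hypothetical helper)."""
    match = _ARXIV_ID.search(url)
    if match is None:
        raise ValueError(f"not a recognised arXiv URL: {url}")
    return f"https://arxiv.org/pdf/{match.group(1)}.pdf"


print(arxiv_pdf_url("https://arxiv.org/abs/2301.12345"))
# → https://arxiv.org/pdf/2301.12345.pdf
```

The resulting URL can then be passed to the `curl` command shown above. A local PDF path raises `ValueError`, which signals that the download step should be skipped.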
Pipeline Overview
[User Input: Paper URL/File]
                      │
                      ▼
┌─────────────────────────────────────────────┐
│  Step 0: Acquire Paper Text                 │
│    - arXiv URL → Download PDF               │
│    - PDF → Convert to Text                  │
└─────────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────┐
│  Phase 0: Search Reference Code (Optional)  │
│    @[05_reference_search.md]                │
│    Output: reference_search.yaml            │
└─────────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────┐
│  Phase 1: Algorithm Extraction              │
│    @[01_algorithm_extraction.md]            │
│    Output: 01_algorithm_extraction.yaml     │
└─────────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────┐
│  Phase 2: Concept Analysis                  │
│    @[02_concept_analysis.md]                │
│    Output: 02_concept_analysis.yaml         │
└─────────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────┐
│  Phase 3: Implementation Plan               │
│    @[03_code_planning.md]                   │
│    Output: 03_implementation_plan.yaml      │
└─────────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────┐
│  Phase 4: Code Implementation               │
│    @[04_implementation_guide.md]            │
│    Output: Complete Project Directory       │
└─────────────────────────────────────────────┘
Data Transfer Format Between Stages
Phase 1 → Phase 2 Transfer
phase1_to_phase2:
  algorithms_found: "[Number of found algorithms]"
  key_algorithms:
    - name: "[Algorithm Name]"
      section: "[Paper Section]"
      complexity: "[Simple/Medium/Complex]"
  hyperparameters_count: "[Number of collected hyperparameters]"
  critical_equations: "[List of critical equation numbers]"
  missing_info: "[List of missing information]"
Phase 2 → Phase 3 Transfer
phase2_to_phase3:
  components_count: "[Number of identified components]"
  implementation_complexity: "[Low/Medium/High]"
  key_dependencies:
    - "[Component A] → [Component B]"
  experiments_to_reproduce:
    - "[Experiment Name]: [Expected Result]"
  success_criteria:
    - "[Specific Success Criteria]"
Phase 3 → Phase 4 Transfer
phase3_to_phase4:
  file_order: "[List of files in implementation order]"
  current_file: "[Currently implementing file]"
  completed_files: "[List of completed files]"
  blocking_dependencies: "[Dependencies to resolve]"
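One way to catch an incomplete handoff before starting the next phase is to check that every expected key is present. The sketch below works on plain dictionaries, so loading the YAML files is left out. The helper `missing_keys` is hypothetical, and it assumes that `hyperparameters_count`, `critical_equations`, and `missing_info` sit directly under `phase1_to_phase2` rather than inside each `key_algorithms` entry.

```python
# Required keys per handoff block, taken from the transfer formats above.
# Assumption: all keys listed here are direct children of their block.
REQUIRED_KEYS = {
    "phase1_to_phase2": {
        "algorithms_found", "key_algorithms", "hyperparameters_count",
        "critical_equations", "missing_info",
    },
    "phase2_to_phase3": {
        "components_count", "implementation_complexity", "key_dependencies",
        "experiments_to_reproduce", "success_criteria",
    },
    "phase3_to_phase4": {
        "file_order", "current_file", "completed_files",
        "blocking_dependencies",
    },
}


def missing_keys(document: dict) -> dict:
    """Map each handoff block found in `document` to its absent keys (sorted)."""
    report = {}
    for block, required in REQUIRED_KEYS.items():
        if block in document:
            absent = sorted(required - set(document[block]))
            if absent:
                report[block] = absent
    return report


handoff = {
    "phase3_to_phase4": {
        "file_order": ["config.py"],
        "current_file": "config.py",
    }
}
print(missing_keys(handoff))
# → {'phase3_to_phase4': ['blocking_dependencies', 'completed_files']}
```

An empty result means the handoff carries every field the next phase expects. It says nothing about whether the values themselves are correct.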
Detail of Each Phase
Phase 0: Reference Code Search (Optional)
Using the @05_reference_search.md prompt:
- Search for and evaluate 5 similar implementations
- Collect references to improve implementation quality
- Output: Reference list in YAML format
Phase 1: Algorithm Extraction
Using the @01_algorithm_extraction.md prompt:
- Extract all algorithms, equations, and pseudocode
- Collect hyperparameters and configuration values
- Organize training procedures and optimization methods
- Output: Complete algorithm specification in YAML format
Phase 2: Concept Analysis
Using the @02_concept_analysis.md prompt:
- Map paper structure and sections
- Analyze system architecture
- Identify component relationships and data flow
- Organize experiment and validation requirements
- Output: Implementation requirements specification in YAML format
Phase 3: Establish Implementation Plan
Using the @03_code_planning.md prompt:
- Integrate results from Phases 1 and 2
- Generate detailed implementation plans for 5 essential sections:
  - file_structure: Project file structure
  - implementation_components: Implementation component details
  - validation_approach: Validation and testing methods
  - environment_setup: Environment and dependencies
  - implementation_strategy: Step-by-step implementation strategy
- Output: Complete YAML implementation plan (8000-10000 characters)
Phase 4: Code Implementation
Following the guide @04_implementation_guide.md:
- Generate code file by file according to the plan
- Implement in dependency order
- Each file must be complete and executable
- Output: Executable codebase
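"Implement in dependency order" amounts to a topological sort of the planned files. A minimal sketch using the standard library's `graphlib` (Python 3.9+) follows. The file names and their dependencies are invented for illustration, and `implementation_order` is a hypothetical helper, not something the skill defines.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical plan: each file maps to the set of files it depends on.
dependencies = {
    "config.py": set(),
    "utils.py": {"config.py"},
    "model.py": {"config.py", "utils.py"},
    "train.py": {"model.py"},
    "evaluate.py": {"model.py"},
}


def implementation_order(deps: dict) -> list:
    """Return the files so that every file comes after its dependencies.

    graphlib raises CycleError if the plan contains a circular dependency.
    """
    return list(TopologicalSorter(deps).static_order())


for position, filename in enumerate(implementation_order(dependencies), start=1):
    print(position, filename)
```

Files with no ordering constraint between them (here `train.py` and `evaluate.py`) may come out in either order. A resulting list of this kind is what `file_order` in the Phase 3 → Phase 4 handoff would hold.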
Memory Management
Refer to the guide @06_memory_management.md:
- Context management when processing long papers
- Saving step-by-step outputs
- Recovery protocol in case of interruption
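Because each phase saves its output as a YAML file, an interrupted run can in principle resume from the first phase whose output is missing. The sketch below illustrates that idea only. The actual recovery protocol lives in 06_memory_management.md and may differ, and the helper `next_phase` is hypothetical. The file names are the ones listed in the pipeline overview.

```python
from pathlib import Path

# Phase outputs in pipeline order, as named in the pipeline overview.
PHASE_OUTPUTS = [
    ("Phase 1", "01_algorithm_extraction.yaml"),
    ("Phase 2", "02_concept_analysis.yaml"),
    ("Phase 3", "03_implementation_plan.yaml"),
]


def next_phase(workdir) -> str:
    """Return the first phase whose output file is missing or empty.

    Assumption: a non-empty output file means the phase finished.
    """
    for phase, filename in PHASE_OUTPUTS:
        path = Path(workdir) / filename
        if not path.is_file() or path.stat().st_size == 0:
            return phase
    return "Phase 4"


print(next_phase("."))
```

Treating an empty file as unfinished guards against a run that was interrupted right after the file was created. A stricter check would also parse the YAML.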
Quality Standards
Principles that Must Be Followed
- Completeness: Complete implementation without placeholders or TODOs
- Accuracy: Accurately reflect equations and parameters specified in the paper
- Executability: Code that can be executed immediately
- Reproducibility: Must be able to reproduce the results of the paper
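The Completeness principle can be partly checked mechanically by scanning generated source for leftover markers. The sketch below is one possible check, not part of the skill. The marker list (`TODO`, `FIXME`, `NotImplementedError`) and the helper `find_placeholders` are assumptions.

```python
import re

# Assumption: these markers indicate unfinished code. Extend as needed.
PLACEHOLDER = re.compile(r"\b(TODO|FIXME|NotImplementedError)\b")


def find_placeholders(source: str) -> list:
    """Return (line number, stripped line) pairs that contain a marker."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if PLACEHOLDER.search(line):
            hits.append((lineno, line.strip()))
    return hits


sample = (
    "def loss(x):\n"
    "    # TODO: implement Eq. 3\n"
    "    raise NotImplementedError\n"
)
print(find_placeholders(sample))
# → [(2, '# TODO: implement Eq. 3'), (3, 'raise NotImplementedError')]
```

A clean scan does not prove the implementation is complete. It only rules out the most obvious stubs.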
File Implementation Order
- Configuration and environment files (config, requirements.txt initialization)
- Core utilities and base classes
- Main algorithm/model implementation
- Training and evaluation scripts
- Documentation (README.md, requirements.txt finalization)
✅ Final Completion Checklist (MANDATORY)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ BEFORE DECLARING COMPLETE - ALL MUST BE YES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
□ All algorithms in the paper implemented? → YES / NO
□ Correct versions of environment/datasets set? → YES / NO
□ All comparison methods referenced implemented? → YES / NO
□ Working integration to run paper experiments? → YES / NO
□ All metrics, figures, tables reproducible? → YES / NO
□ Basic docs explaining how to reproduce? → YES / NO
□ Code runs without errors? → YES / NO
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ If even one is NO, it is NOT complete!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
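The "all must be YES" rule can be expressed as a small gate. The sketch below is illustrative only: the helper `failing_items` and the dictionary format for the answers are assumptions. The item texts are copied from the checklist above.

```python
# Checklist items copied from the list above.
CHECKLIST = [
    "All algorithms in the paper implemented?",
    "Correct versions of environment/datasets set?",
    "All comparison methods referenced implemented?",
    "Working integration to run paper experiments?",
    "All metrics, figures, tables reproducible?",
    "Basic docs explaining how to reproduce?",
    "Code runs without errors?",
]


def failing_items(answers: dict) -> list:
    """Return every checklist item not answered True.

    An item missing from `answers` counts as NO.
    """
    return [item for item in CHECKLIST if answers.get(item) is not True]


answers = {item: True for item in CHECKLIST}
answers["Code runs without errors?"] = False
print(failing_items(answers))
# → ['Code runs without errors?']
```

The work counts as complete only when the returned list is empty. Counting unanswered items as NO keeps a forgotten check from passing silently.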
Usage Examples
Example 1: arXiv Paper
User: Implement this paper https://arxiv.org/abs/2301.12345
Claude: I will analyze the paper and convert it to code.
[Phase 0: Reference Code Search (Optional)...]
[Phase 1: Algorithm Extraction...]
[Phase 2: Concept Analysis...]
[Phase 3: Establish Implementation Plan...]
[Phase 4: Code Generation...]
Example 2: PDF File
User: Implement the algorithms from this paper /home/user/papers/attention.pdf
Example 3: Specific Request
User: Implement only the algorithm in Section 3 of this paper
Related Files
- 01_algorithm_extraction.md - Phase 1: Algorithm Extraction
- 02_concept_analysis.md - Phase 2: Concept Analysis
- 03_code_planning.md - Phase 3: Implementation Plan
- 04_implementation_guide.md - Phase 4: Implementation Guide
- 05_reference_search.md - Phase 0: Reference Search (Optional)
- 06_memory_management.md - Memory Management Guide
Precautions
⚠️ REMEMBER:
1. Read the paper thoroughly: Start implementation after understanding the entire content
2. Save detailed results: Save YAML output of each Phase as a file
3. Incremental implementation: Do not generate all code at once, proceed file by file
4. Include verification: Include simple test code if possible
5. Reference is inspiration: Reference code is for understanding and application, not copying