# Skillshub paper-expert-generator
Generate a specialized domain-expert research agent modeled on the PaperClaw architecture. Use this skill when a user wants to create an AI agent that can automatically search, filter, summarize, and evaluate academic papers in a specific research field. Trigger phrases include "help me create a paper tracking agent for my field", "I want an agent to monitor latest papers in bioinformatics", "build me a paper review agent for computer vision", "create a PaperClaw-style agent for my domain", and "generate a domain-specific paper expert agent". The generated agent is a complete OpenClaw agent with all required skills (arxiv-search, semantic-scholar, paper-review, daily-search, weekly-report) fully adapted to the target domain.
Install by cloning the full repo:

```bash
git clone https://github.com/ComeOnOliver/skillshub
```

or copy just this skill into `~/.claude/skills`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && \
  mkdir -p ~/.claude/skills && \
  cp -r "$T/skills/guhaohao0991/PaperClaw/paper-expert-generator" \
        ~/.claude/skills/comeonoliver-skillshub-paper-expert-generator && \
  rm -rf "$T"
```
`skills/guhaohao0991/PaperClaw/paper-expert-generator/SKILL.md`

# Paper Expert Generator
Generate a complete, ready-to-use domain-specific paper expert agent by adapting the PaperClaw architecture for any research field.
## Workflow
### Step 1: Domain Interview
Collect these details from the user before generating anything. Ask conversationally – do not dump all questions at once. Start with the most critical ones:
Critical (ask first):
1. Research domain – Primary field (e.g., "bioinformatics", "quantum computing", "computer vision")
2. Core topics – Specific sub-areas or problems (e.g., "protein folding, drug discovery, single-cell sequencing")
3. Key methods/techniques – Central methodologies (e.g., "transformers, GNN, diffusion models, RL")

Important (ask second):
4. Evaluation priorities – What dimensions matter most for paper quality in this domain?
5. Exclusion topics – What should be filtered out? (e.g., "finance, social media, NLP")
6. Output location – Where to create the agent? (default: `~/agents/<domain-slug>/`)

Optional (ask only if needed):
7. Notification channel – Feishu/Lark webhook URL for push notifications
8. LLM config – API base URL, model name, API key (default: same as PaperClaw's `models.json`)
9. Schedule timezone – Default is `Asia/Singapore`
Infer reasonable defaults for anything not provided and confirm before proceeding.
### Step 2: Build Keyword Library
Construct a structured keyword library from the domain interview. Aim for:
- Core queries (3–5): Direct topic+method combinations for arXiv searches
- Method queries (3–5): Method+application combinations
- Application queries (2–3): Use-case-specific terms
- Exclusion keywords (3–6): Out-of-scope terms to filter

See `references/domain-adaptation-guide.md` Section 1 for keyword examples across 8 common domains.
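For instance, a bioinformatics ML agent's library might look like the sketch below, built from the interview examples in Step 1 (every query is illustrative, not prescribed by PaperClaw):

```python
# Illustrative keyword library for a bioinformatics ML agent (example values only).
KEYWORD_LIBRARY = {
    "core_queries": [            # 3-5 direct topic+method combinations
        "protein folding deep learning",
        "single-cell sequencing transformer",
        "drug discovery graph neural network",
    ],
    "method_queries": [          # 3-5 method+application combinations
        "diffusion model molecule generation",
        "GNN protein structure prediction",
    ],
    "application_queries": [     # 2-3 use-case-specific terms
        "virtual screening",
        "antibody design",
    ],
    "exclusion_keywords": [      # 3-6 out-of-scope filter terms
        "finance", "social media", "NLP",
    ],
}
```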
### Step 3: Design Evaluation Rubric
Design 4 domain-specific scoring dimensions (each scored 1–10) that replace PaperClaw's SciML dimensions (`engineering_value`, `architecture_innovation`, `theoretical_contribution`, `result_reliability`).
The scoring formula is unchanged:
```
final_score  = base_score × 0.9 + impact_score × 0.1
base_score   = (dim1 + dim2 + dim3 + dim4) / 4
impact_score = date_citation_adjustment(citations, age_months)
```
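The same formula in executable form (a minimal sketch; `date_citation_adjustment` is PaperClaw's own routine, and the body given for it here is an assumption):

```python
def date_citation_adjustment(citations: int, age_months: int) -> float:
    # Hypothetical stand-in for PaperClaw's age-normalized citation score:
    # citations per month, capped to the 1-10 scale used by the dimensions.
    return min(10.0, citations / max(age_months, 1))

def compute_final_score(dims: list[float], citations: int, age_months: int) -> float:
    """Combine the four 1-10 dimension scores with a citation-based impact term."""
    base_score = sum(dims) / 4          # (dim1 + dim2 + dim3 + dim4) / 4
    impact_score = date_citation_adjustment(citations, age_months)
    return base_score * 0.9 + impact_score * 0.1
```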
See `references/domain-adaptation-guide.md` Section 2 for rubric examples by domain.
### Step 4: Generate Agent Files
Run the scaffolding script to create the directory structure:
```bash
python ~/.comate/skills/paper-expert-generator/scripts/init_domain_agent.py \
  --domain "<domain_slug>" \
  --output "<output_dir>" \
  --paperclaw-skills "<paperclaw_skills_path>"
```
Example:
```bash
python ~/.comate/skills/paper-expert-generator/scripts/init_domain_agent.py \
  --domain "bioinfo-ml" \
  --output ~/agents/bioinfo-ml \
  --paperclaw-skills /work/work/PaperClaw/skills
```
Generated structure:
```
<output_dir>/
├── agent/
│   ├── AGENT.md              ← write domain content here
│   ├── models.json           ← pre-filled from template
│   └── schedules.json        ← pre-filled from template
├── skills/
│   ├── arxiv-search/         ← copy from PaperClaw (needs keyword update)
│   ├── semantic-scholar/     ← copy from PaperClaw (no changes needed)
│   ├── paper-review/         ← copy from PaperClaw (needs rubric update)
│   ├── daily-search/         ← copy from PaperClaw (minor text update)
│   └── weekly-report/        ← copy from PaperClaw (minor text update)
└── workspace/
    └── evaluated_papers.json ← initialized empty
```
### Step 5: Write AGENT.md
Use `assets/templates/AGENT.md.template` as the base. The AGENT.md must include:

- **Role Definition** – Domain expert persona with specific depth. Replace SciML expertise with domain-specific expertise (key algorithms, theoretical foundations, benchmark datasets, top venues/conferences).
- **Keyword Library** – Paste the structured keywords from Step 2.
- **Four Core Tasks** (preserve the exact structure from PaperClaw):
  - Task 1 (Paper Research): Download PDF → write `summary.md` answering 10 domain-adapted questions
  - Task 2 (Paper Evaluation): 4-dimension scoring → write `scores.md` → update the `metadata.json` registry
  - Task 3 (Daily Search): Cron trigger → `daily_paper_search.py --top 3` → dedup → trigger Task 1+2
  - Task 4 (Weekly Report): Cron trigger → `generate_weekly_report_v2.py` → push notification
- **Mandatory `<think>` Reasoning** – Required in the Task 2 evaluation.
- **Dedup Gate** – Always check `evaluated_papers.json` before starting paper review (see the sketch below).
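A minimal sketch of the dedup gate, assuming `evaluated_papers.json` is a JSON array of objects carrying an `arxiv_id` field (PaperClaw's actual schema may differ):

```python
import json
from pathlib import Path

def already_evaluated(arxiv_id: str, workspace: Path) -> bool:
    """Dedup gate: skip papers already present in evaluated_papers.json."""
    registry = workspace / "evaluated_papers.json"
    # The registry is initialized as [] by the scaffold; an entry-per-paper
    # array with an "arxiv_id" key is an assumption made for this sketch.
    entries = json.loads(registry.read_text()) if registry.exists() else []
    return any(entry.get("arxiv_id") == arxiv_id for entry in entries)
```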
See `references/agent-template-guide.md` for the full AGENT.md authoring guide.
### Step 6: Adapt Skill SKILL.md Files
Minimal adaptation needed – Python scripts are domain-agnostic:
| Skill | Required changes to SKILL.md |
|---|---|
| `arxiv-search` | Replace the keyword list with domain keywords from Step 2 |
| `paper-review` | Replace the 4 scoring dimensions + update the 10 summary questions |
| `daily-search` | Update the domain name in the task description text |
| `weekly-report` | Update the domain name in the report title |
| `semantic-scholar` | No changes needed |
### Step 7: Configure models.json and schedules.json
**models.json**: Edit `agent/models.json` and fill in:
- `baseUrl`: LLM API endpoint
- `apiKey`: API key placeholder
- `id` and `name`: Model identifier

**schedules.json**: The default schedule is pre-filled. Adjust the `tz` field if you are not in the Singapore timezone.
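A sketch of filling in both files (the field names are from this step; the flat structure, paths, and values are assumptions – defer to the templates in `assets/templates/` for the real shape):

```python
import json
from pathlib import Path

agent_dir = Path("~/agents/bioinfo-ml/agent").expanduser()  # hypothetical output dir

# models.json -- only baseUrl, apiKey, id, and name need filling in; whether
# they sit at the top level is an assumption, so match assets/templates/models.json.
models = {
    "baseUrl": "https://api.example.com/v1",  # LLM API endpoint (placeholder)
    "apiKey": "<YOUR_API_KEY>",               # leave as a placeholder until delivery
    "id": "example-model-id",                 # model identifier (illustrative)
    "name": "example-model-name",
}
(agent_dir / "models.json").write_text(json.dumps(models, indent=2))

# schedules.json -- pre-filled by the scaffold; normally only tz changes.
# Assumes tz is a top-level field.
schedules_path = agent_dir / "schedules.json"
schedules = json.loads(schedules_path.read_text())
schedules["tz"] = "America/New_York"          # replace the Asia/Singapore default
schedules_path.write_text(json.dumps(schedules, indent=2))
```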
### Step 8: Validate and Deliver
Checklist before presenting results (the sketch below automates most of these checks):
- `AGENT.md` has the domain role, keywords, 4 tasks, and rubric
- `paper-review/SKILL.md` has the domain scoring dimensions + 10 adapted questions
- `arxiv-search/SKILL.md` has the domain keyword list
- `models.json` has the correct structure (API key placeholder)
- `workspace/evaluated_papers.json` is initialized as `[]`
- All 5 skill directories exist
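A minimal validation sketch covering the file checks (paths follow the Step 4 layout; this is illustrative, not one of the shipped scripts):

```python
import json
from pathlib import Path

def validate_agent(output_dir: str) -> list[str]:
    """Return a list of checklist failures for a generated agent directory."""
    root = Path(output_dir).expanduser()
    problems = []
    # Core agent files from the Step 4 layout.
    for name in ["agent/AGENT.md", "agent/models.json", "agent/schedules.json"]:
        if not (root / name).exists():
            problems.append(f"missing {name}")
    # All 5 skill directories must exist.
    for skill in ["arxiv-search", "semantic-scholar", "paper-review",
                  "daily-search", "weekly-report"]:
        if not (root / "skills" / skill).is_dir():
            problems.append(f"missing skill directory: {skill}")
    # The dedup registry must be initialized as an empty JSON array.
    registry = root / "workspace" / "evaluated_papers.json"
    if not registry.exists() or json.loads(registry.read_text()) != []:
        problems.append("evaluated_papers.json missing or not initialized as []")
    return problems

print(validate_agent("~/agents/bioinfo-ml") or "all checks passed")
```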
Then present the output summary (see next section).
## Output Summary Format
Always deliver this summary after generation:
```markdown
## Generated Agent: <Domain Name> Paper Expert

**Domain**: <domain>
**Location**: `<output_dir>`
**Model**: <model_name>

### Keyword Library (<N> total queries)
**Core**: <query1>, <query2>, <query3>
**Methods**: <query1>, <query2>
**Exclusions**: <term1>, <term2>, ...

### Evaluation Rubric
| Dimension | Score Weight | Measures |
|-----------|--------------|----------|
| <dim1>    | 25%          | ...      |
| <dim2>    | 25%          | ...      |
| <dim3>    | 25%          | ...      |
| <dim4>    | 25%          | ...      |

### Schedule
- Daily search: `0 20 * * *` (<timezone>)
- Weekly report: `0 10 * * 0` (<timezone>)

### Quick Start
1. Open OpenClaw → select agent from `<output_dir>/agent/`
2. Set API key in `agent/models.json`
3. Test: "Search for recent papers on <core_topic>"
4. Or wait for the first daily trigger at 20:00
```
## References
- `references/domain-adaptation-guide.md` – Keyword and rubric examples for 8 common domains
- `references/agent-template-guide.md` – Full AGENT.md authoring guide with annotated sections
- `assets/templates/AGENT.md.template` – Base template for the generated AGENT.md
- `assets/templates/models.json` – Base models config template
- `assets/templates/schedules.json` – Base schedules config template