Goose-skills review-site-scraper
install
source · Clone the upstream repo
git clone https://github.com/gooseworks-ai/goose-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/gooseworks-ai/goose-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/capabilities/review-site-scraper" ~/.claude/skills/gooseworks-ai-goose-skills-review-site-scraper && rm -rf "$T"
manifest:
skills/capabilities/review-site-scraper/SKILL.mdsource content
Review Site Scraper
Scrape product reviews from G2, Capterra, and Trustpilot using platform-specific Apify actors.
Quick Start
Requires
APIFY_API_TOKEN env var (or --token flag). No external dependencies needed (uses stdlib urllib).
# Trustpilot reviews python3 skills/capabilities/review-site-scraper/scripts/scrape_reviews.py \ --platform trustpilot \ --url "https://www.trustpilot.com/review/example.com" \ --max-reviews 10 --output summary # G2 reviews with keyword filter python3 skills/capabilities/review-site-scraper/scripts/scrape_reviews.py \ --platform g2 \ --url "https://www.g2.com/products/example/reviews" \ --keywords "pricing,support" # Capterra reviews (uses company name, not URL) python3 skills/capabilities/review-site-scraper/scripts/scrape_reviews.py \ --platform capterra \ --company-name "HubSpot CRM" \ --max-reviews 20
Supported Platforms
| Platform | Actor | Input | Cost |
|---|---|---|---|
| G2 | | (G2 product page URL) | Free tier available |
| Capterra | | (company name, not a URL) | Pay-per-result |
| Trustpilot | | (Trustpilot review page URL) | ~$0.20/1k reviews |
CLI Reference
| Flag | Default | Description |
|---|---|---|
| required | , , or |
| none | Product review page URL (required for G2 and Trustpilot) |
| none | Company name to search (Capterra only) |
| 50 | Max reviews to scrape |
| none | Keywords to filter (comma-separated, OR logic) |
| none | Only include reviews from last N days |
| json | Output format: or |
| env var | Apify token (prefer env var) |
| 300 | Max seconds for Apify run |
Normalized Output Schema
All platforms are normalized but each has platform-specific fields.
G2 output fields:
{ "platform": "g2", "id": "review-id", "product_name": "Product Name", "title": null, "text": "Review body text", "rating": 4, "author": "Reviewer Name", "author_title": "Job Title", "author_company": "Company Name", "author_company_size": "51-200", "author_industry": "Software", "date": "2026-02-18", "source": "organic", "url": "https://..." }
Capterra output fields:
{ "platform": "capterra", "title": "Review title", "text": "Review body text", "overall_rating": 4, "ease_of_use": 5, "customer_service": 3, "features": 4, "author": "Reviewer Name", "job_title": "Marketing Manager", "industry": "Marketing and Advertising", "usage_duration": "1-2 years", "date": "2026-02-18", "url": "https://..." }
Trustpilot output fields:
{ "platform": "trustpilot", "id": "review-id", "title": "Review title", "text": "Review body text", "rating": 4, "author": "Reviewer Name", "date": "2026-02-18T12:00:00.000Z", "experienced_date": "2026-02-15T00:00:00.000Z", "likes": 2, "input_source": "organic", "url": "https://..." }