Goose-skills reddit-post-finder
Scrape and search Reddit posts using Apify. Use when you need to find Reddit discussions, track competitor mentions, monitor product feedback, discover pain points, or analyze subreddit content. Supports keyword filtering, time-based searches, and subreddit-specific queries.
git clone https://github.com/gooseworks-ai/goose-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/gooseworks-ai/goose-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/capabilities/reddit-post-finder" ~/.claude/skills/gooseworks-ai-goose-skills-reddit-post-finder && rm -rf "$T"
skills/capabilities/reddit-post-finder/SKILL.md
Reddit Post Finder
Scrape Reddit posts and comments using the Apify trudax/reddit-scraper-lite actor.
Quick Start
Requires APIFY_API_TOKEN env var (or --token flag).
    # Top posts from r/growthhacking in last week
    python3 skills/reddit-post-finder/scripts/search_reddit.py \
      --subreddit growthhacking --days 7 --sort top --time week

    # Hot posts from multiple subreddits
    python3 skills/reddit-post-finder/scripts/search_reddit.py \
      --subreddit "growthhacking,gtmengineering" --days 7 --sort hot

    # Keyword-filtered competitor tracking
    python3 skills/reddit-post-finder/scripts/search_reddit.py \
      --subreddit LLMDevs \
      --keywords "Langfuse,Arize,Langsmith" \
      --days 30

    # Human-readable summary table
    python3 skills/reddit-post-finder/scripts/search_reddit.py \
      --subreddit growthhacking --days 7 --output summary
How the Script Works
- Builds full Reddit URLs for each subreddit (e.g. https://www.reddit.com/r/growthhacking/top/?t=week)
- Calls the Apify trudax/reddit-scraper-lite actor via REST API
- Polls until the run completes, then fetches the dataset
- Applies client-side keyword and date filtering
- Sorts by upvotes (descending) and outputs JSON or summary
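The steps above, minus the Apify call itself, can be sketched as a few pure functions. This is a minimal illustration and not the code in search_reddit.py: the helper names (build_subreddit_url, parse_created_at, filter_posts) are hypothetical, and matching keywords case-insensitively against the title and body is an assumption.

```python
from datetime import datetime, timedelta, timezone


def build_subreddit_url(subreddit: str, sort: str = "top", time: str = "week") -> str:
    """Build a listing URL such as https://www.reddit.com/r/growthhacking/top/?t=week."""
    return f"https://www.reddit.com/r/{subreddit}/{sort}/?t={time}"


def parse_created_at(value: str) -> datetime:
    """Parse an ISO timestamp like 2026-02-18T12:00:00.000Z into an aware datetime."""
    # Older Python versions reject a trailing "Z", so spell out the UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_posts(posts, keywords, days, now=None):
    """Keep posts from the last `days` days that mention any keyword (OR logic)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    lowered = [k.lower() for k in keywords]
    kept = []
    for post in posts:
        if parse_created_at(post["createdAt"]) < cutoff:
            continue  # older than the date window
        # Assumption: keywords are matched against title and body text.
        text = f"{post.get('title') or ''} {post.get('body') or ''}".lower()
        if lowered and not any(k in text for k in lowered):
            continue  # no keyword matched
        kept.append(post)
    # Highest-voted posts first.
    return sorted(kept, key=lambda p: p.get("upVotes", 0), reverse=True)
```

Passing an empty keyword list skips keyword filtering and leaves only the date filter, which matches a default of "none" for keywords.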
CLI Reference
| Flag | Default | Description |
|---|---|---|
| --subreddit | required | Subreddit name(s), comma-separated |
| --keywords | none | Keywords to filter (comma-separated, OR logic) |
| --days | 30 | Only include posts from the last N days |
|  | 50 | Max posts to scrape per subreddit |
| --sort | top | Sort order, e.g. top, hot, new |
| --time | week | Time window for sort, e.g. week, month |
| --output | json | Output format: json or summary |
| --token | env var | Apify token (prefer env var) |
|  | 300 | Max seconds to wait for the Apify run |
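As a rough illustration of how such flags could be parsed, here is an argparse sketch. It is not the script's actual parser: the flag names are taken from the Quick Start commands, split_csv and build_parser are hypothetical helpers, and the max-posts and timeout options are left out because their flag names are not given in this section.

```python
import argparse
import os


def split_csv(value: str) -> list:
    """Split a comma-separated flag value and drop empty entries."""
    return [part.strip() for part in value.split(",") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Search Reddit posts via Apify (sketch)")
    parser.add_argument("--subreddit", required=True, type=split_csv)
    parser.add_argument("--keywords", type=split_csv, default=[])
    parser.add_argument("--days", type=int, default=30)
    # Allowed values are not restricted here because the full set is not listed.
    parser.add_argument("--sort", default="top")
    parser.add_argument("--time", default="week")
    parser.add_argument("--output", choices=["json", "summary"], default="json")
    # Prefer the environment variable; --token overrides it when given.
    parser.add_argument("--token", default=os.environ.get("APIFY_API_TOKEN"))
    return parser
```

Reading the token default from the environment keeps it out of shell history, which is the reason to prefer the env var over the flag.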
Tips for Small Subreddits
Small or low-traffic subreddits (e.g. r/gtmengineering) may return zero posts with --sort hot because the hot feed is nearly empty. Use --sort top --time week (or month) instead; this scrapes the top-ranked posts over the time window and reliably returns results.
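If you wrap the script in your own tooling, the same advice can be automated. The sketch below assumes a caller-supplied fetch(subreddit, sort, time) function that returns a list of posts; both fetch and fetch_with_fallback are hypothetical names, not part of the skill.

```python
from typing import Callable


def fetch_with_fallback(
    subreddit: str,
    fetch: Callable[[str, str, str], list],
    sort: str = "hot",
    time: str = "week",
) -> list:
    """Try the requested sort first; retry with sort=top if it returns nothing."""
    posts = fetch(subreddit, sort, time)
    if posts or sort == "top":
        return posts
    # Empty hot/new feed on a small subreddit: fall back to the top listing.
    return fetch(subreddit, "top", time)
```

Each fallback costs a second Apify run, so it may be cheaper to request top directly for subreddits you already know are quiet.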
Direct API Usage
If calling the Apify API directly (e.g. via curl), note these required fields:
{ "startUrls": [{"url": "https://www.reddit.com/r/growthhacking/top/?t=week"}], "maxItems": 50 }
Key notes for trudax/reddit-scraper-lite:
- Uses startUrls with full Reddit URLs (not a searches array for subreddit browsing)
- Sort/time are controlled via the URL path (e.g. /top/?t=week), not separate input fields
- Only startUrls and maxItems are confirmed working input fields
- Does not support proxyConfiguration, scrollTimeout, or searchType
Output fields:
- dataType: "post" or "comment"
- title: Post title
- body: Post body text
- communityName: Subreddit name (without r/ prefix)
- upVotes: Number of upvotes
- numberOfComments: Comment count
- url: Full URL to the post
- createdAt: ISO timestamp of when the post was created
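For callers who prefer Python over curl, a direct call might look like the sketch below. It is based on my understanding of Apify's public v2 REST API (start a run, poll the run, read the default dataset) and is not taken from search_reddit.py; verify the endpoints, the Bearer-token header, and the status names against Apify's API reference before relying on it. build_actor_input and run_actor are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Optional

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "trudax~reddit-scraper-lite"  # assumption: API paths use "~" in place of "/"


def build_actor_input(urls: list, max_items: int = 50) -> dict:
    """Build the actor input using only startUrls and maxItems."""
    return {"startUrls": [{"url": u} for u in urls], "maxItems": max_items}


def _request(method: str, url: str, token: str, payload: Optional[dict] = None):
    """Send one authenticated JSON request and decode the JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def run_actor(token: str, actor_input: dict, timeout: int = 300, poll_every: int = 5) -> list:
    """Start a run, poll until it leaves READY/RUNNING, then return the dataset items."""
    run = _request("POST", f"{API_BASE}/acts/{ACTOR_ID}/runs", token, actor_input)["data"]
    deadline = time.monotonic() + timeout
    while run["status"] in ("READY", "RUNNING"):
        if time.monotonic() > deadline:
            raise TimeoutError(f"Apify run {run['id']} still {run['status']} after {timeout}s")
        time.sleep(poll_every)
        run = _request("GET", f"{API_BASE}/actor-runs/{run['id']}", token)["data"]
    if run["status"] != "SUCCEEDED":
        raise RuntimeError(f"Apify run ended with status {run['status']}")
    items_url = f"{API_BASE}/datasets/{run['defaultDatasetId']}/items?format=json"
    return _request("GET", items_url, token)
```

A call such as run_actor(token, build_actor_input(["https://www.reddit.com/r/growthhacking/top/?t=week"])) would then return a list of dictionaries with the output fields listed above.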
Common Workflows
1. Competitor Tracking
    python3 skills/reddit-post-finder/scripts/search_reddit.py \
      --subreddit "LLMDevs,MachineLearning,LocalLLaMA" \
      --keywords "Langfuse,Arize,Weights & Biases,Langsmith,Braintrust" \
      --days 30 --sort top --time month
2. Pain Point Discovery
    python3 skills/reddit-post-finder/scripts/search_reddit.py \
      --subreddit LLMDevs \
      --keywords "frustrating,difficult,hard to,wish there was,better way" \
      --days 30
3. Brand Monitoring
    python3 skills/reddit-post-finder/scripts/search_reddit.py \
      --subreddit "LLMDevs,MachineLearning" \
      --keywords "YourProductName" \
      --days 7 --sort new
Important: Always Include Post URLs
When presenting Reddit results to the user, always include the original post URL for every post so they can read the full discussion, comments, and context. Never return a summary table without links.
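One way to make this rule hard to break is to build the table from a function that cannot emit a row without a URL. The sketch below is illustrative only: format_summary is a hypothetical helper, and its column layout is an assumption rather than the script's own summary format.

```python
def format_summary(posts: list) -> str:
    """Render one table row per post, always ending with the post URL."""
    lines = [
        "| Upvotes | Comments | Subreddit | Title | URL |",
        "|---|---|---|---|---|",
    ]
    for post in posts:
        # Escape pipes so a title cannot break the table layout.
        title = (post.get("title") or "").replace("|", "\\|")
        # post["url"] is indexed directly on purpose: a missing link raises
        # KeyError instead of silently producing a row without one.
        lines.append(
            f"| {post.get('upVotes', 0)} | {post.get('numberOfComments', 0)} "
            f"| r/{post.get('communityName', '')} | {title} | {post['url']} |"
        )
    return "\n".join(lines)
```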
Output Format
Posts are returned as a JSON array sorted by upvotes. Each post has:

    {
      "dataType": "post",
      "title": "Post title",
      "body": "Post body...",
      "communityName": "growthhacking",
      "upVotes": 42,
      "numberOfComments": 15,
      "createdAt": "2026-02-18T12:00:00.000Z",
      "url": "https://reddit.com/r/..."
    }
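Downstream code may find it easier to work with typed records than raw dictionaries. The following sketch loads the JSON output into a small dataclass; RedditPost and load_posts are hypothetical names, and skipping items whose dataType is not "post" is an assumption about how you want to treat comments.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class RedditPost:
    title: str
    body: str
    community: str
    upvotes: int
    comments: int
    created_at: datetime
    url: str


def load_posts(raw_json: str) -> list:
    """Parse the JSON array and keep only items whose dataType is "post"."""
    posts = []
    for item in json.loads(raw_json):
        if item.get("dataType") != "post":
            continue  # skip comments and anything unexpected
        posts.append(
            RedditPost(
                title=item.get("title") or "",
                body=item.get("body") or "",
                community=item.get("communityName") or "",
                upvotes=int(item.get("upVotes") or 0),
                comments=int(item.get("numberOfComments") or 0),
                # Replace the trailing "Z" so older Python versions can parse it.
                created_at=datetime.fromisoformat(item["createdAt"].replace("Z", "+00:00")),
                url=item["url"],
            )
        )
    return posts
```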
Configuration
See references/apify-config.md for detailed API configuration, token setup, and rate limits.