git clone https://github.com/ComeOnOliver/skillshub
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/Harmeet10000/skills/google-serp-lead-scraper" ~/.claude/skills/comeonoliver-skillshub-google-serp-lead-scraper && rm -rf "$T"
Google SERP Lead Scraper
Scrapes Google search results for local businesses, fetches their websites, extracts contact information using GPT-5, and stores structured leads in Google Sheets.
When to Use
- Building lead lists for local service businesses (plumbers, electricians, roofers, etc.)
- Prospecting for outreach campaigns targeting specific geographic areas
- Populating CRM with enriched contact data
How to Call
Webhook URL (test mode):
GET https://nicksaraev.app.n8n.cloud/webhook-test/8aee83a4-ae72-4f96-a834-e1c6afd4d080
Production URL: TBD (deploy workflow to get production webhook)
No parameters are currently required; the query is hardcoded to "calgary plumber".
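A minimal sketch of triggering the test webhook from Python with only the standard library. The function names and the timeout are illustrative assumptions, and because the response format is not documented here, the body is returned as raw text rather than parsed. In n8n, test-mode webhook URLs normally respond only while the workflow is listening for a test event in the editor.

```python
import urllib.request

# Test-mode webhook URL from this document.
WEBHOOK_URL = (
    "https://nicksaraev.app.n8n.cloud/webhook-test/"
    "8aee83a4-ae72-4f96-a834-e1c6afd4d080"
)


def build_request(url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build a parameterless GET request for the webhook."""
    return urllib.request.Request(url, method="GET")


def trigger_workflow(timeout: float = 300.0) -> str:
    """Send the request and return the raw response body as text.

    The timeout is an assumed value; scraping and extraction may take a while.
    """
    with urllib.request.urlopen(build_request(), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `trigger_workflow()` performs the actual network request; `build_request()` can be inspected on its own without contacting the server.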
What It Does
- Google Search → Apify's `google-search-scraper` actor searches with:
  - Query: "calgary plumber" (hardcoded)
  - Country: Canada (`ca`)
  - Language: English
  - 5 pages × 10 results = up to 50 organic results
- Limit → Currently capped at 2 results (for testing). Remove or adjust the Limit node for full runs.
- Fetch & Convert → Each result URL is fetched and converted to markdown.
- GPT-5 Extraction → Extracts 100+ fields per lead including:
  - Company info (name, tagline, industry, keywords)
  - Owner/decision-maker details
  - Multiple emails with confidence scores and provenance
  - Phones normalized to E.164
  - Full address parsing
  - Social profiles (LinkedIn, Facebook, Instagram, Twitter, etc.)
  - Best contact method recommendation
  - Custom icebreaker line for outreach
- Google Sheets → Appends to Google SERP Scraping Database
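The search settings in the first step can be sketched as an actor input payload. Only the `queries` parameter is named in this document; the other key names (`countryCode`, `languageCode`, `maxPagesPerQuery`, `resultsPerPage`) and the value `"en"` are assumptions based on the actor's public input schema and should be checked against the Apify node in the workflow.

```python
import json

# Settings taken from the workflow description above.
QUERY = "calgary plumber"
PAGES_PER_QUERY = 5
RESULTS_PER_PAGE = 10


def build_actor_input(query: str = QUERY) -> dict:
    """Assemble an input payload for the google-search-scraper actor.

    Key names other than `queries` are assumed, not confirmed by this document.
    """
    return {
        "queries": query,
        "countryCode": "ca",      # Canada
        "languageCode": "en",     # English (assumed code)
        "maxPagesPerQuery": PAGES_PER_QUERY,
        "resultsPerPage": RESULTS_PER_PAGE,
    }


def max_organic_results(actor_input: dict) -> int:
    """Upper bound on organic results: pages multiplied by results per page."""
    return actor_input["maxPagesPerQuery"] * actor_input["resultsPerPage"]


payload = build_actor_input()
print(json.dumps(payload, indent=2))
print(max_organic_results(payload))  # 5 pages x 10 results = 50
```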
Output Schema (Key Fields)
| Field | Description |
|---|---|
| … | Business name |
| … | Decision-maker name (if found) |
| … | Highest-confidence email for outreach |
| … | Recommended phone number |
| …, …, … | All extracted emails |
| … | 0.0–1.0 confidence score |
| … | Source location (e.g., `/contact`) |
| … | Phone in E.164 format |
| … | Complete address string |
| … | Company LinkedIn page |
| … | Pre-formatted outreach opener |
See the full 100+ field schema in the n8n workflow's GPT prompt.
Icebreaker Format
The extraction generates icebreakers in this format:
Hey {FirstName}. I work with a $2M/yr plumber out of Calgary (NE-specific), pretty similar to {CompanyName}. Not sure if you have exposure to the NE, but wanted to run something by you.
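To illustrate how the two placeholders are filled, here is a small sketch using plain string formatting. In the workflow itself the icebreaker is produced by the GPT extraction step, so this is only a stand-in for that behavior; the function name and the sample values are hypothetical.

```python
# Template copied from the format above.
ICEBREAKER_TEMPLATE = (
    "Hey {FirstName}. I work with a $2M/yr plumber out of Calgary "
    "(NE-specific), pretty similar to {CompanyName}. Not sure if you have "
    "exposure to the NE, but wanted to run something by you."
)


def render_icebreaker(first_name: str, company_name: str) -> str:
    """Substitute the two placeholders in the icebreaker template."""
    return ICEBREAKER_TEMPLATE.format(
        FirstName=first_name, CompanyName=company_name
    )


# Hypothetical sample values, not real lead data.
print(render_icebreaker("Alex", "Example Plumbing"))
```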
Confidence Scoring
Extraction uses tiered confidence based on source:
- 1.0 — Schema.org structured data
- 0.9 — OpenGraph/meta tags
- 0.85 — /contact or /about pages
- 0.8 — Footer/header blocks
- 0.6 — Visible text near contact labels
- 0.4 — Inferred/heuristic values
Fields below 0.6 confidence are flagged for manual review.
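The tiers and the review rule above can be expressed as a lookup table plus a threshold check. This is a sketch under the assumption that each extracted value carries a source label; the label strings and function names are hypothetical and do not come from the workflow.

```python
# Confidence per source tier, as listed above. Keys are hypothetical labels.
SOURCE_CONFIDENCE = {
    "schema_org": 1.0,
    "opengraph_meta": 0.9,
    "contact_or_about_page": 0.85,
    "footer_or_header": 0.8,
    "text_near_label": 0.6,
    "inferred": 0.4,
}

# Values strictly below this score are flagged for manual review.
REVIEW_THRESHOLD = 0.6


def confidence_for(source: str) -> float:
    """Look up the confidence score for a source label."""
    return SOURCE_CONFIDENCE[source]


def needs_review(confidence: float) -> bool:
    """Return True when a value falls below the review threshold."""
    return confidence < REVIEW_THRESHOLD


for source, score in SOURCE_CONFIDENCE.items():
    print(f"{source}: {score} review={needs_review(score)}")
```

With the tiers as listed, only inferred or heuristic values (0.4) fall below the threshold; visible text near contact labels sits exactly at 0.6 and is not flagged.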
Current Limitations
- Hardcoded query: "calgary plumber" is baked into the workflow. To change:
  - Edit the Apify node's `queries` parameter in n8n
  - Or parameterize via webhook query string (requires workflow update)
- Test limit: Only processes 2 results. Remove the Limit node for production.
- No deduplication: Repeated runs may create duplicate rows. Consider adding a check against `source_url` or `domain`.
- Rate limits: Apify has usage limits; large batches may need pagination or scheduling.
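One way the suggested deduplication check could work is sketched below: compare each new lead's domain against the domains already stored before appending. It assumes each lead is a dict with a `source_url` key and that existing URLs have been read from the sheet beforehand; the function names are hypothetical, and this logic is not part of the current workflow.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL without a leading 'www.'.

    Falls back to the raw string when no host can be parsed.
    """
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host or url


def filter_new_leads(leads: list[dict], existing_urls: list[str]) -> list[dict]:
    """Keep only leads whose domain is not already present.

    Also drops duplicates within the incoming batch itself.
    """
    seen = {domain_of(u) for u in existing_urls}
    fresh = []
    for lead in leads:
        key = domain_of(lead["source_url"])
        if key not in seen:
            seen.add(key)
            fresh.append(lead)
    return fresh
```

Matching on domain rather than full URL treats `/contact` and `/about` pages of the same site as one business, which fits a one-row-per-company sheet but would merge distinct businesses hosted on a shared domain.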
Future Improvements
- Accept `query` and `location` as webhook parameters
- Add deduplication against existing sheet rows
- Batch processing with progress tracking
- Error handling for failed URL fetches
- Deploy to production webhook URL
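If the workflow is later updated to accept `query` and `location`, a caller might build the request URL as sketched here. The workflow does not read these parameters today, so this only shows the client side of the first proposed improvement; the function name and sample values are hypothetical.

```python
from urllib.parse import urlencode

# Test-mode webhook URL from this document.
WEBHOOK_URL = (
    "https://nicksaraev.app.n8n.cloud/webhook-test/"
    "8aee83a4-ae72-4f96-a834-e1c6afd4d080"
)


def build_webhook_url(query: str, location: str, base_url: str = WEBHOOK_URL) -> str:
    """Append URL-encoded `query` and `location` parameters to the webhook URL."""
    params = urlencode({"query": query, "location": location})
    return f"{base_url}?{params}"


# Hypothetical sample values.
print(build_webhook_url("plumber", "Calgary, AB"))
```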
Related Files
- Output: Google SERP Scraping Database
- Workflow platform: n8n Cloud (nicksaraev.app.n8n.cloud)
- Scraping service: Apify (google-search-scraper actor)