Memstack memstack-seo-site-audit

Use this skill when the user says 'SEO audit', 'site audit', 'check SEO', 'audit my site', 'SEO check', 'technical SEO', or is evaluating a website's search engine optimization health, meta tags, performance, or structured data. Do NOT use for keyword research or schema markup generation alone.

install
source · Clone the upstream repo
git clone https://github.com/cwinvestments/memstack
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/cwinvestments/memstack "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/seo-geo/site-audit" ~/.claude/skills/cwinvestments-memstack-memstack-seo-site-audit && rm -rf "$T"
manifest: skills/seo-geo/site-audit/SKILL.md
source content

🔎 Site Audit — Running comprehensive SEO audit...

Scans every page for meta tags, heading hierarchy, broken links, image optimization, Core Web Vitals, robots/sitemap, performance, and structured data — producing a prioritized fix list.

Activation

When this skill activates, output:

🔎 Site Audit — Scanning site for SEO issues...

Then execute the protocol below.

ContextStatus
User says "SEO audit" or "site audit" or "check SEO"ACTIVE
User says "technical SEO" or "audit my site"ACTIVE
Evaluating a website's overall search healthACTIVE
Optimizing specific meta tags (not full audit)DORMANT — use meta-tag-optimizer
Adding schema markup (not auditing)DORMANT — use schema-markup
Researching keywords (not auditing)DORMANT — use keyword-research

Anti-patterns

TrapReality Check
"The site looks fine in the browser"Crawlers and users see different things. Missing meta tags, broken links, and slow JS are invisible to humans.
"We'll optimize after launch"Technical debt compounds. Fixing 5 issues now prevents 50 issues later when Google has already indexed the problems.
"One audit is enough"Sites change with every deploy. Audit after major content or code changes, minimum quarterly.
"Performance doesn't affect SEO"Core Web Vitals are ranking signals. Slow sites rank lower regardless of content quality.
"Robots.txt is set and forget"One wrong rule can deindex your entire site. Review after every major structural change.

Protocol

For how each AI search engine ranks and cites content (ChatGPT, Perplexity, Google AI Overview, Copilot, Claude, and Google traditional as baseline), see

references/platform-ranking-factors.md
. Use it to prioritize fixes — a client whose audience is primarily on Perplexity benefits most from FAQ schema and PDF-friendly pages, while a ChatGPT-first audience needs backlinks and 30-day-fresh content.

Step 0: Live site snapshot (optional, for deployed URLs)

If the site is already deployed, run the bundled stdlib-only audit script for a one-shot snapshot before walking the source-tree checks below:

python scripts/live_audit.py https://example.com

No API keys, no pip installs, no external dependencies — pure Python 3 stdlib (

urllib
,
html.parser
,
json
). It fetches the page once and checks title, meta description, heading hierarchy, image alt attributes, JSON-LD structured data, canonical URL, Open Graph tags, robots meta, viewport,
robots.txt
(including AI bot rules),
sitemap.xml
, page load time, and internal vs. external link counts. Output uses the same severity format (🔴 Critical / 🟠 High / 🟡 Medium / 🔵 Low) as the scorecard in Step 9, and the script exit code reflects severity (
0
clean,
1
high,
2
critical) so it can also gate CI.

Use this for deployed-URL audits; use Steps 1-8 below for source-tree scans of unshipped code. The two are complementary — the live script confirms what actually reaches crawlers, the grep steps catch issues before they ship.


Step 1: Check Meta Tags

Scan every page for required meta tags:

# Find all page files
find app/ pages/ src/ -name "*.tsx" -o -name "*.jsx" -o -name "*.html" | grep -v node_modules

# Search for meta tag patterns
grep -rn "title\|<meta\|og:title\|og:description\|og:image\|canonical" --include="*.tsx" --include="*.jsx" --include="*.html" . | grep -v node_modules

Required meta tags per page:

TagRequiredRule
<title>
YesUnique per page, 50-60 chars, includes primary keyword
<meta name="description">
YesUnique per page, 150-155 chars, includes CTA
<link rel="canonical">
YesPoints to the canonical URL (prevents duplicate content)
<meta property="og:title">
YesSocial sharing title (can differ from page title)
<meta property="og:description">
YesSocial sharing description
<meta property="og:image">
YesSocial sharing image (1200x630px recommended)
<meta property="og:url">
YesCanonical URL
<meta name="robots">
Conditional
noindex
on admin, auth, and internal pages
<meta name="viewport">
Yes
width=device-width, initial-scale=1

Score per page:

  • ✅ All required tags present and valid
  • ⚠️ Missing optional tags
  • ❌ Missing required tags or invalid values

Step 2: Check Heading Hierarchy

# Find heading usage across all pages
grep -rn "<h[1-6]\|<H[1-6]" --include="*.tsx" --include="*.jsx" --include="*.html" . | grep -v node_modules

Heading rules:

RuleCheckWhy
Single H1 per pageCount H1 tagsMultiple H1s confuse crawlers about the page topic
H1 matches page title (roughly)Compare H1 to
<title>
Consistency signals relevance
No skipped levelsH1 → H2 → H3, not H1 → H3Broken hierarchy hurts accessibility and SEO
Keywords in H1 and H2Check for target keywordHeadings are high-weight ranking signals
No empty headingsCheck for
<h2></h2>
Empty headings are wasted signals
Headings aren't used for stylingCheck if headings are decorativeUse CSS for size, headings for structure

Step 3: Check for Broken Links

# Find all links in the codebase
grep -rn 'href="\|href={' --include="*.tsx" --include="*.jsx" --include="*.html" --include="*.md" . | grep -v node_modules | grep -v "mailto:\|tel:\|#"

Check each link category:

CategoryWhat to CheckCommon Issues
Internal linksPath exists in the appRenamed pages, deleted routes
External linksURL returns 200Dead links, moved pages, domain expired
Anchor linksTarget ID exists on pageRenamed sections, removed elements
Image linksImage file exists, loadsWrong path, deleted assets
API endpointsEndpoint is liveChanged API routes

Flag: Any link returning 404, 500, or timeout. Prioritize fixing internal broken links (you control these).

Step 4: Check Image Optimization

# Find all images
grep -rn '<img\|<Image\|background-image' --include="*.tsx" --include="*.jsx" --include="*.css" . | grep -v node_modules

# Check for alt text
grep -rn '<img\|<Image' --include="*.tsx" --include="*.jsx" . | grep -v 'alt=' | grep -v node_modules

Image optimization checklist:

CheckRuleImpact
Alt text presentEvery
<img>
has
alt
attribute
Accessibility + image SEO
Alt text descriptiveNot empty, not "image", includes contextScreen readers and image search
File sizeUnder 200KB for standard images, under 100KB for thumbnailsPage speed
Modern formatWebP or AVIF preferred over PNG/JPG25-50% smaller file size
Lazy loading
loading="lazy"
on below-fold images
Reduces initial page load
Dimensions
width
and
height
attributes set
Prevents Cumulative Layout Shift
Next.js ImageUsing
next/image
component
Automatic optimization, lazy loading, sizing
Responsive
srcset
or responsive sizing
Serves appropriate size per device

Step 5: Check Core Web Vitals Indicators

Scan code for patterns that affect CWV scores:

# Render-blocking resources
grep -rn '<link.*stylesheet\|<script(?!.*defer\|.*async)' --include="*.html" . | grep -v node_modules

# Layout shift indicators (missing dimensions)
grep -rn '<img\|<video\|<iframe' --include="*.tsx" --include="*.jsx" . | grep -v 'width\|height\|aspect' | grep -v node_modules

# Large bundle imports
grep -rn "import .* from ['\"]lodash['\"]" --include="*.ts" --include="*.tsx" . | grep -v node_modules

CWV targets:

MetricGoodNeeds ImprovementPoorWhat Causes Failure
LCP (Largest Contentful Paint)< 2.5s2.5-4.0s> 4.0sLarge hero images, slow server, render-blocking CSS
INP (Interaction to Next Paint)< 200ms200-500ms> 500msHeavy JS on main thread, long tasks
CLS (Cumulative Layout Shift)< 0.10.1-0.25> 0.25Images without dimensions, dynamic content injection, web fonts

Code-level checks:

  • ❌ Large JavaScript bundles without code splitting
  • ❌ Synchronous
    <script>
    tags (use
    defer
    or
    async
    )
  • ❌ CSS loaded without
    media
    queries for non-critical styles
  • ❌ Web fonts without
    font-display: swap
    (causes invisible text flash)
  • ❌ Third-party scripts loaded in
    <head>
    (move to end of
    <body>
    or lazy load)

Step 6: Check robots.txt and Sitemap

# Check for robots.txt
cat public/robots.txt 2>/dev/null

# Check for sitemap
cat public/sitemap.xml 2>/dev/null
ls public/sitemap*.xml 2>/dev/null

robots.txt checklist:

CheckExpectedIssue if Missing
File exists
public/robots.txt
Crawlers assume everything is allowed
Sitemap referenced
Sitemap: https://domain.com/sitemap.xml
Crawlers may not find your sitemap
Admin pages blocked
Disallow: /admin
Internal pages get indexed
API routes blocked
Disallow: /api
API endpoints appear in search
Auth pages blocked
Disallow: /login
,
/register
Auth pages waste crawl budget
Assets allowedCSS/JS not blockedCrawlers can't render the page

sitemap.xml checklist:

CheckExpectedIssue if Missing
File exists
public/sitemap.xml
Crawlers may miss pages
All public pages listedEvery indexable pageUnlisted pages get lower priority
<lastmod>
dates
Accurate last modified dateStale dates reduce crawl frequency
No noindex pages includedOnly indexable pagesContradictory signals confuse crawlers
Submitted to Search ConsoleVerified in GSCGoogle may not discover it

Step 7: Check Page Performance

# Check bundle size (Next.js)
ls -la .next/static/chunks/*.js 2>/dev/null | sort -k5 -n -r | head -10

# Check for render-blocking patterns
grep -rn "useEffect\|useState" --include="*.tsx" --include="*.jsx" . | grep -v node_modules | wc -l

Performance checklist:

CheckTargetHow to Fix
Total page weight< 1MB (initial load)Code split, lazy load, compress
JavaScript bundle< 200KB (main bundle)Tree shaking, dynamic imports
CSS< 50KB (critical CSS)Purge unused CSS, inline critical
Fonts< 100KB totalSubset fonts, use
font-display: swap
Images< 200KB eachCompress, use WebP, lazy load
Time to First Byte< 200msServer optimization, CDN
First Contentful Paint< 1.8sOptimize critical rendering path

Step 8: Check Structured Data

# Search for JSON-LD schema markup
grep -rn "application/ld+json\|schema.org" --include="*.tsx" --include="*.jsx" --include="*.html" . | grep -v node_modules

# Check for common schema types
grep -rn "Organization\|Product\|Article\|BreadcrumbList\|FAQPage\|WebSite" --include="*.tsx" --include="*.jsx" . | grep -v node_modules

Required schema by page type:

Page TypeRequired SchemaRich Result Eligibility
HomepageOrganization, WebSite, SearchActionSitelinks search box
Product pageProduct (with price, availability)Product rich results
Blog postArticle (with author, datePublished)Article rich results
FAQ pageFAQPageFAQ rich results
How-to pageHowToHow-to rich results
Local businessLocalBusiness (with address, hours)Local pack, knowledge panel
BreadcrumbsBreadcrumbListBreadcrumb trail in SERP

Flag if: No structured data found at all — this is a significant missed opportunity for rich results in search.

Step 9: Output Scorecard

🔎 Site Audit — Complete

Site: [domain]
Pages scanned: [count]
Overall score: [X/100]

Category scores:
  Meta tags:        [X/10] — [count] issues
  Heading hierarchy: [X/10] — [count] issues
  Broken links:      [X/10] — [count] issues
  Image optimization: [X/10] — [count] issues
  Core Web Vitals:   [X/10] — [count] issues
  Robots/sitemap:    [X/10] — [count] issues
  Performance:       [X/10] — [count] issues
  Structured data:   [X/10] — [count] issues

Priority fix list (by impact):
  🔴 Critical: [count]
    1. [Most impactful issue — e.g., "No sitemap.xml found"]
    2. [Second issue]
  🟠 High: [count]
    3. [Issue]
    4. [Issue]
  🟡 Medium: [count]
    5. [Issue]
  🔵 Low: [count]
    6. [Issue]

Page-by-page breakdown:
  / (homepage)           — Score: [X/10] — [issues summary]
  /about                 — Score: [X/10] — [issues summary]
  /blog/[slug]           — Score: [X/10] — [issues summary]
  /pricing               — Score: [X/10] — [issues summary]

Next steps:
1. Fix all Critical issues immediately
2. Run meta-tag-optimizer for tag-specific fixes
3. Run schema-markup to add missing structured data
4. Re-audit after fixes to verify improvement
5. Submit updated sitemap to Google Search Console

Level History

  • Lv.1 — Base: Meta tag scanning, heading hierarchy, broken link detection, image optimization, Core Web Vitals code patterns, robots.txt/sitemap.xml, page performance, structured data presence, scored page-by-page output with prioritized fix list. (Origin: MemStack Pro v3.2, Mar 2026)