Site Architecture Audit Claude Skill
This skill audits a website's architecture and crawlability and produces an actionable SEO report. It crawls the site and inspects robots.txt and XML sitemaps to find hostname and protocol duplication, URL structure problems, excessive crawl depth, weak directory taxonomy, blocked or conflicting crawl and index directives, and sitemap-to-crawl parity gaps, then recommends concrete fixes.
Quick Take
Point the skill at a live site URL. It probes the four hostname and protocol variants, parses robots.txt and every XML sitemap, crawls the site within the configured page and depth limits, runs severity-ranked architecture checks, and returns a prioritized report with copy-pasteable fixes.
What The Skill Checks
- High-severity structural problems: multiple indexable hostname and protocol variants (http and https, www and non-www) with no canonical host enforcement, important pages blocked by robots.txt, and conflicting crawl and index directives such as a noindex page that is also disallowed or canonicalized elsewhere.
- XML sitemap health, checked in both directions: sitemap URLs that are non-200, redirected, noindexed, or non-canonical, plus indexable pages missing from the sitemap entirely, along with robots.txt sitemap declaration and size-limit handling.
- URL and depth structure: uppercase, underscore, encoded, over-long, over-deep, or ID-only URLs, inconsistent trailing-slash, protocol, and www forms across internal links, crawlable tracking and session parameter spaces, click-depth distribution, and thin or inconsistent directory taxonomy.
How The Skill Is Packaged
The skill follows the standard Claude Agent Skill structure: a SKILL.md file with YAML frontmatter and workflow instructions, a references/ folder with the full audit check definitions and report template, and a scripts/ folder with Python scripts that crawl the site and gather architecture signals (host variants, robots.txt, and XML sitemaps) and audit the resulting inventory. The architecture skill runs against a live site, since robots.txt, sitemaps, redirects, and host-variant behavior only exist on a live server. Copy the skill folder into your Claude skills directory and Claude invokes it automatically when a request matches its description.
Skill Files
Every file in the skill is embedded below directly from the Claude-SEO-Skills repository, so you can review exactly what the skill instructs Claude to do before installing it.
SKILL.md
The skill definition: frontmatter, inputs to collect, the five-step audit workflow, and the resources the skill relies on.
Audit Checks Reference (references/audit-checks.md)
Full definitions, thresholds, and rationale for all fifteen audit checks the skill evaluates, from hostname canonicalization and directive conflicts to sitemap parity and URL structure.
Report Template (references/report-template.md)
The output structure the skill follows when writing the final audit report, including the host-variant table and the click-depth distribution.
Architecture Crawler Script (scripts/crawl_architecture.py)
Crawls a live site within the configured page and depth limits, probes the four hostname and protocol variants, parses robots.txt, and collects every XML sitemap URL (following sitemap index files) into a JSON inventory.
Architecture Auditor Script (scripts/audit_architecture.py)
Runs the audit checks against the architecture inventory JSON, with options for click-depth, URL-depth, and tracking-parameter thresholds, and outputs the structured audit report data.