AI-Friendly Checker

Score how well every page on your site can be read, understood and cited by AI crawlers and answer engines like ChatGPT, Perplexity and Google's AI Overviews.

Sitemap Scanner

Provide the URL to a valid sitemap.xml or scan a single page.

If you provide a website URL, we will automatically look for /sitemap.xml.

Skip sitemap parsing and check just this specific page.

Collect and check Title, Description, Canonical, Robots, and Open Graph.

Verify HTTP response codes (200, 404, etc.) and server response time (TTFB, time to first byte).

Duplicate finder (Title/H1), Missing Alts, Word Count, and Lang check.

Score how readable each page is to AI crawlers: structured data, extractable text, semantics, freshness.

Analysis Results

Heading structure and semantic errors per page. Expand a row to see its heading tree.


What "AI-friendly" actually means

Search is no longer just ten blue links. A growing share of people get answers from AI assistants that read your page, summarise it, and decide whether to cite you. The discipline of optimising for that is increasingly called GEO (Generative Engine Optimization) or AEO (Answer Engine Optimization). The good news: it overlaps heavily with classic technical SEO — clean structure, real text, and machine-readable metadata.

The catch is that most AI crawlers don't execute JavaScript. They read the raw HTML your server returns. If your content only appears after the browser runs a framework, an AI assistant may see a near-empty page — and you simply won't be in the answer. This checker fetches each page the same way an AI crawler does (raw HTML, no JS) and scores what it can actually extract.
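To make "what it can actually extract" concrete, here is a minimal sketch of pulling readable text out of raw HTML without running any JavaScript. It is not this checker's actual implementation; the names `VisibleTextExtractor` and `extract_visible_text` are hypothetical, and the choice to skip only `<script>`, `<style>` and `<template>` is an assumption. It uses Python's standard-library `html.parser` only.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects the text a non-JS crawler could read from raw HTML."""

    # Assumption: these elements never contribute readable content.
    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_visible_text(raw_html):
    """Return the text found in raw_html, joined with single spaces."""
    parser = VisibleTextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


# A client-rendered shell: the body is empty until a script runs.
client_rendered = (
    "<html><head><title>App</title><script>render()</script></head>"
    '<body><div id="root"></div></body></html>'
)
# A server-rendered page: the content is already in the HTML.
server_rendered = (
    "<html><body><main><h1>Pricing</h1>"
    "<p>Plans start at ten dollars.</p></main></body></html>"
)

client_words = len(extract_visible_text(client_rendered).split())
server_words = len(extract_visible_text(server_rendered).split())
```

In this sketch the client-rendered shell yields only its title text, while the server-rendered page yields its heading and paragraph. A very low word count on a page you know is content-rich is a hint that the content arrives via JavaScript.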

What the score is built from

  • Extractable content (25%). How much real text is in the raw HTML. A near-empty body is the single biggest red flag — it usually means the page is client-rendered and invisible to crawlers that don't run JavaScript.
  • Structured data (20%). JSON-LD (schema.org) is the most reliable way to hand an answer engine clean facts: what the page is, who wrote it, and when. Without it, the engine has to infer those facts from the page text.
  • Title & description (20%). The short summary an assistant is most likely to lift verbatim. Missing either weakens how your page gets represented.
  • Heading outline (15%). One clear H1 and no skipped levels gives the model an unambiguous document structure to follow.
  • Semantic landmark (10%). A <main> or <article> element marks where the real content is, so boilerplate doesn't drown it out.
  • Declared language (5%) and freshness signal (5%). A lang attribute and a visible modified/published date help assistants answer in the right language and favour current content.
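As a rough illustration of how the weights above could combine, the sketch below adds up per-signal results into a 0 to 100 score and shows one example signal, the heading outline rule (one H1, no skipped levels). The weights are the ones listed; everything else is an assumption: the names `WEIGHTS`, `heading_outline_ok` and `weighted_score` are hypothetical, each signal is assumed to be a fraction between 0.0 and 1.0, and the regex-based heading scan is a simplification that a real HTML parser would do more robustly.

```python
import re

# Weights taken from the list above; they sum to 100.
WEIGHTS = {
    "extractable_content": 25,
    "structured_data": 20,
    "title_description": 20,
    "heading_outline": 15,
    "semantic_landmark": 10,
    "declared_language": 5,
    "freshness_signal": 5,
}


def heading_outline_ok(raw_html):
    """True when there is exactly one <h1> and no heading level is skipped."""
    levels = [int(n) for n in re.findall(r"<h([1-6])\b", raw_html, flags=re.IGNORECASE)]
    if levels.count(1) != 1:
        return False
    previous = 0
    for level in levels:
        # Going deeper by more than one level (e.g. h1 -> h3) is a skip.
        if level > previous + 1:
            return False
        previous = level
    return True


def weighted_score(signals):
    """Combine per-signal results (assumed 0.0 to 1.0 each) into a 0-100 score."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)


# Example: a page that passes every check except extractable content.
page_signals = {name: 1.0 for name in WEIGHTS}
page_signals["extractable_content"] = 0.0
page_score = weighted_score(page_signals)
```

Under these assumptions, a page that is perfect on everything except extractable content tops out at 75, which is why a near-empty body weighs so heavily.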

Making your site discoverable to AI

Two site-wide files round out the picture, beyond what any single page contains:

  • robots.txt access. Make sure you aren't accidentally blocking the AI crawler tokens (GPTBot, ClaudeBot, PerplexityBot, Google-Extended and friends) that you actually want to reach your site.
  • llms.txt. A proposed standard (llmstxt.org): a curated Markdown map of your key pages for LLMs. Honest caveat: major crawlers don't officially consume it yet, so treat it as a cheap, low-risk signal rather than a guaranteed lever.
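Until a site-wide check is available, a quick way to spot an accidental block is to parse your robots.txt yourself. The sketch below is one possible approach using Python's standard-library `urllib.robotparser`; the function name `blocked_ai_agents`, the `AI_AGENTS` list and the `https://example.com/` URL are illustrative assumptions, and the parser's matching rules may differ in detail from how each crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in the list above; extend with any others you care about.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_ai_agents(robots_txt, url="https://example.com/"):
    """Return the AI tokens that this robots.txt text disallows for url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]


# Example robots.txt that blocks GPTBot but allows everyone else.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_ai_agents(sample_robots)
```

For the sample file, only GPTBot ends up in the blocked list. Run the same function against your own robots.txt content and your most important URLs to see which tokens are shut out.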

Per-page scoring lives here today; site-wide robots.txt and llms.txt checks are on the way.

AI-friendliness FAQ
