Content Analysis & Keyword Cannibalization Auditor
Analyze word count, find duplicate Titles or H1s, identify missing image Alt tags, and check language attributes.
Sitemap Scanner
Provide the URL to a valid sitemap.xml or scan a single page.
What this analyser checks
The Content Analyser looks at the parts of your pages that search engines and assistive technology actually read — not the styling, but the words, the images, and the language metadata. It pulls every URL from your sitemap and reports on four things that move the needle: word count, duplicate headings and titles, missing image alt text, and the lang attribute on the <html> tag.
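As a rough illustration of the first step, pulling URLs out of a sitemap, here is a minimal Python sketch. It is not the tool's actual code: the function name `sitemap_urls` and the sample URLs are made up for the example, and it assumes a standard sitemap that uses the sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


# Illustrative sitemap; the URLs are placeholders.
example = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/pricing </loc></url>
</urlset>"""

urls = sitemap_urls(example)
print(urls)
```

A sitemap index file also uses `<loc>` elements, so the same function would return the child sitemap URLs there; a real scanner would need to fetch and parse those in turn.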
The headline feature is the duplicate finder. After scanning every page, the analyser groups identical H1 texts and identical titles across the whole site and flags them — that's how you spot keyword cannibalisation in a few seconds instead of clicking through every page by hand.
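The grouping step can be sketched in a few lines of Python. This is only an assumption about how such a duplicate finder might work: the function name `find_duplicates` is hypothetical, and collapsing whitespace and ignoring case before comparing is a choice made for this example (the tool itself may compare texts exactly).

```python
from collections import defaultdict


def find_duplicates(text_by_url: dict[str, str]) -> dict[str, list[str]]:
    """Group URLs whose H1 (or title) text matches after normalisation."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url, text in text_by_url.items():
        # Assumption: collapse whitespace and ignore case before comparing.
        key = " ".join(text.split()).casefold()
        if key:
            groups[key].append(url)
    # Only groups with two or more URLs are duplicates.
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


# Illustrative input: URL path -> H1 text.
h1_by_url = {
    "/shoes": "Best Running Shoes",
    "/blog/shoes": "best  running shoes",
    "/trail": "Trail Shoes",
}
duplicates = find_duplicates(h1_by_url)
print(duplicates)
```

Running the same function once over H1 texts and once over titles gives the two duplicate reports described above.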
Common issues this analyser finds
- Keyword cannibalisation (duplicate H1 or Title). Two pages with the same H1 or title compete against each other in search results, splitting clicks and ranking signals. The analyser flags both pages so you can merge, redirect, or rewrite.
- Thin content. The word count is calculated from the actual page body (script, style, nav, header and footer are stripped first). As a rule of thumb, pages under about 300 words tend to underperform in search, and Google has said that low-quality content on part of a site can affect rankings for the whole site.
- Images without alt text. Missing alt is a problem for accessibility, for Google Images, and for resilience when an image fails to load. The analyser counts missing-alt images on every page so you can prioritise the pages with the most.
- Missing lang attribute. <html lang="en"> tells screen readers how to pronounce text, helps browsers and translation tools identify the page language, and triggers correct hyphenation in browsers. Every page should have it.
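The per-page checks in the list above (word count, missing alt, lang attribute) could be sketched with Python's standard `html.parser`. This is an illustrative sketch, not the analyser's implementation: the class name `PageAudit` and the sample HTML are invented, and it assumes that an empty `alt=""` counts as present, since that is the usual way to mark decorative images.

```python
from html.parser import HTMLParser

# Elements whose text is ignored when counting words.
SKIPPED = {"script", "style", "nav", "header", "footer"}


class PageAudit(HTMLParser):
    """Hypothetical single-page audit: words, missing alt, html lang."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.words = 0
        self.missing_alt = 0
        self.lang = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SKIPPED:
            self.skip_depth += 1
        elif tag == "html":
            self.lang = attrs.get("lang")
        elif tag == "body":
            self.in_body = True
        elif tag == "img" and "alt" not in attrs:
            # Assumption: alt="" is treated as present (decorative image).
            self.missing_alt += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "body":
            self.in_body = False

    def handle_data(self, data):
        # Count only body text that sits outside the skipped elements.
        if self.in_body and not self.skip_depth:
            self.words += len(data.split())


sample = """<html lang="en"><head><title>Demo</title><style>p{}</style></head>
<body><nav>Home About</nav>
<p>Three short words</p>
<img src="a.png"><img src="b.png" alt="">
<footer>Copyright</footer></body></html>"""

audit = PageAudit()
audit.feed(sample)
audit.close()
print(audit.words, audit.missing_alt, audit.lang)
```

Splitting on whitespace is a crude word count, but it is enough to rank pages from thinnest to fullest. A page with `audit.lang` equal to `None` would be reported as missing the lang attribute.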
How to fix keyword cannibalisation
Once you've spotted duplicates, you have three good options:
- Merge the two pages into one stronger article and 301-redirect the loser.
- Differentiate them by rewriting headings and intent so each targets a distinct query.
- Canonicalise the weaker page to the stronger one if they genuinely need to coexist.
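If you choose the canonical route, it is worth confirming that the weaker page really declares the stronger one. The sketch below reads the first `<link rel="canonical">` from a page's HTML. The names `CanonicalFinder` and `canonical_of` and the example URL are hypothetical, and this is one possible check rather than anything the tool is known to do.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Hypothetical helper: remember the first rel="canonical" link."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_of(html: str):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonical


# Illustrative markup for the weaker page; the URL is a placeholder.
weaker_page = (
    '<html><head><link rel="canonical" href="https://example.com/guide">'
    "</head><body></body></html>"
)
print(canonical_of(weaker_page))
```

Comparing the returned value with the URL of the page you want to rank tells you whether the canonical tag is in place.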