GEO Audit Checklist: 25 Checks for Generative Engine Optimization
By the SiteBeat team · Updated 3 July 2026 · 7 min read
Quick answer: a GEO audit checks whether AI answer engines can crawl, read, extract and trust your website. This checklist covers all 25 essential checks across four pillars: crawler access, structure & extractability, authority & freshness, and structured data. Every check is deterministic: it is either true of your site or it isn't, with no AI judgement required.
Pillar 1 — Crawler access (can AI fetch you?)
robots.txt allows citation crawlers — OAI-SearchBot, PerplexityBot, Claude-SearchBot, Googlebot are not disallowed.
No blanket disallow — no User-agent: * + Disallow: / left over from staging.
Content is server-rendered — your main content appears in the raw HTML. Most AI crawlers do not execute JavaScript, so content that only exists after client-side rendering is invisible to them.
No noindex — neither a meta robots tag nor an X-Robots-Tag header excludes the page.
No bot-challenge wall — crawlers get your content, not a "checking your browser" interstitial.
HTTPS — with a valid certificate.
XML sitemap — exists and is referenced from robots.txt.
No AI opt-out directives you didn't intend — noai meta tags, tdm-reservation, or Content-Usage headers explicitly tell AI not to use your content.
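The robots.txt checks in this pillar are easy to script. The following is a minimal sketch, not SiteBeat's implementation: it uses Python's standard urllib.robotparser to test the four crawler tokens named above against a robots.txt body. The function name crawler_access, the sample file, and the example.com URL are hypothetical, and the standard-library parser may not match every crawler's own matching rules exactly.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the checklist above.
CITATION_CRAWLERS = ["OAI-SearchBot", "PerplexityBot", "Claude-SearchBot", "Googlebot"]


def crawler_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return {crawler: allowed?} for one URL, given the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in CITATION_CRAWLERS}


# Hypothetical robots.txt: a blanket disallow left over from staging,
# plus an explicit allowance for a single crawler.
SAMPLE = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""

for bot, allowed in crawler_access(SAMPLE).items():
    print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```

With this sample, only Googlebot is reported as allowed; the other three fall through to the blanket rule, which is exactly the staging leftover the second check is meant to catch.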
Pillar 2 — Structure & extractability (can AI quote you?)
Exactly one H1 — the page's topic anchor.
No skipped heading levels — a jump such as H2 → H4 breaks the outline AI uses to segment content.
Semantic landmarks — <main> or <article> separates content from navigation.
Question-style subheadings — H2s phrased as the questions users ask.
Answer-first opening — a 40–80 word direct answer at the top, liftable verbatim.
Extractable chunk sizes — paragraphs of 40–100 self-contained words, not walls of text.
Lists and tables — structured formats are cited disproportionately often.
Image alt text — AI can't read or cite an image it can't describe.
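The two heading checks above (exactly one H1, no skipped levels) can be verified mechanically. Here is a rough sketch using only Python's standard html.parser; the names HeadingCollector and audit_headings and the sample markup are illustrative assumptions, and a "skip" is defined here as any step to a deeper level that jumps more than one level.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect heading levels (1-6) in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Matches h1..h6 only; other two-letter tags such as <hr> are ignored.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def audit_headings(html: str) -> dict:
    """Count H1s and list every pair of adjacent headings that skips a level."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    # Moving back up the outline (H4 -> H2) is fine; only deeper jumps count.
    skipped = [(a, b) for a, b in zip(levels, levels[1:]) if b - a > 1]
    return {"h1_count": levels.count(1), "skipped": skipped}


# Hypothetical page fragment with one H2 -> H4 jump.
SAMPLE = "<h1>Guide</h1><h2>What is GEO?</h2><h4>Details</h4><h2>Next steps</h2>"
print(audit_headings(SAMPLE))
```

A page passes both checks when h1_count is 1 and skipped is empty; the sample fails the second check because of the H2 to H4 pair.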
Pillar 3 — Authority & freshness (can AI trust you?)
Machine-readable dates — datePublished/dateModified in schema, kept current.
Outbound citations — links to authoritative sources (.gov, .edu, primary research) correlate with ~30% more AI citations (Princeton GEO study).
Statistics with attribution — concrete numbers are the most quotable sentences on any page.
Author attribution — a byline backed by a schema Person with a profile link.
Trust pages — About, Contact, Privacy, Terms all exist and are linked.
No keyword stuffing — over-repetition measurably reduces AI visibility.
Pillar 4 — Structured data (can AI identify you?)
Valid JSON-LD — present and parseable (malformed schema is ignored entirely).
Organization schema with sameAs — plus logo and contact point, so AI can ground your brand as an entity.
Content-type schema — Article (with author, dates, image, publisher) on posts; FAQPage schema on question pages; complete Open Graph and Twitter Card tags for link unfurling.
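The first two checks in this pillar (parseable JSON-LD, Organization with sameAs) can be approximated with the standard library alone. The sketch below is an assumption-laden illustration rather than a full validator: JsonLdExtractor and audit_jsonld are hypothetical names, the sample markup is invented, and the code does not unpack @graph containers or handle @type given as a list.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks = []
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def audit_jsonld(html: str) -> dict:
    """Count JSON-LD blocks, count malformed ones, and look for Organization.sameAs."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    items, malformed = [], 0
    for raw in extractor.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            malformed += 1
            continue
        # A block may hold a single object or a list of objects.
        items.extend(data if isinstance(data, list) else [data])
    orgs = [i for i in items if isinstance(i, dict) and i.get("@type") == "Organization"]
    return {
        "blocks": len(extractor.blocks),
        "malformed": malformed,
        "organization_has_sameas": any(o.get("sameAs") for o in orgs),
    }


# Hypothetical markup: one valid Organization block, one block with a trailing comma.
SAMPLE = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example Co", "sameAs": ["https://www.linkedin.com/company/example"]}
</script>
<script type="application/ld+json">{"@type": "Article",}</script>
"""
print(audit_jsonld(SAMPLE))
```

Counting malformed blocks separately matters because, as noted above, a block that fails to parse contributes nothing: the sample's Article markup would be ignored entirely even though it is present in the page.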
How do you run all 25 checks at once?
Manually, this checklist takes a couple of hours per site. A SiteBeat scan runs every check on this list (plus ~25 more) in about 20 seconds and returns an AI readiness grade from A+ to F, a per-crawler access matrix, the exact text AI extracts from your pages, and — in the full audit — copy-paste fixes: corrected robots.txt, generated llms.txt, and prefilled JSON-LD. The full audit is a one-time €29; see a sample report.