How We Audit GEO: Our Methodology, Step by Step

A GEO audit is a structured assessment of how likely AI answer engines are to retrieve, trust, and cite a website, and what to fix first to improve those odds. Our methodology scores a site across five dimensions and converts the findings into a prioritized action plan rather than a list of complaints. This is the framework we run on every engagement; we're publishing it because a transparent methodology is itself a trust signal, and because most "AI SEO audits" we see skip the steps that actually determine whether a model can use a page.

The five dimensions we score

Every audit produces a composite GEO score built from five weighted categories. We score each 0–100, weight by impact, and combine. The categories are deliberately ordered from "can the engine even see you" to "will the engine choose you."
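To make the arithmetic concrete, here is a minimal sketch of how a composite like this could be computed. It assumes a plain weighted average, and the weights and dimension keys below are illustrative placeholders, not the values used in a real audit.

```python
# Hypothetical weights for illustration only; the real weighting is not stated here.
WEIGHTS = {
    "crawler_access": 0.25,
    "technical": 0.20,
    "citability": 0.25,
    "content_quality": 0.20,
    "structured_data": 0.10,
}


def composite_score(scores: dict[str, float]) -> float:
    """Return the weighted average of per-dimension scores (each 0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    total_weight = sum(WEIGHTS.values())
    return sum(scores[name] * weight for name, weight in WEIGHTS.items()) / total_weight
```

Dividing by the total weight means the weights do not have to sum to exactly 1, which makes them easier to adjust.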

1. AI crawler access

Before anything else, we confirm AI crawlers aren't blocked. We check robots.txt, meta robots tags, and HTTP response headers such as X-Robots-Tag for GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, Bingbot, and Google-Extended (a robots.txt control token rather than a separate crawler). A surprising number of sites block these by accident, sometimes through an overzealous WAF rule or a blanket Disallow. If an engine can't fetch the page, no other optimization matters. We produce an access map showing exactly which engines can and cannot reach the site.
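The robots.txt part of that check can be sketched with Python's standard library. This is a simplified illustration, not the full audit: it evaluates robots.txt rules only (no meta tags, headers, or WAF behavior), and the sample file and URL are made up.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens to test; "bingbot" stands in for "Bing's crawler".
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "bingbot"]


def access_map(robots_txt: str, url: str) -> dict[str, bool]:
    """Map each AI agent token to whether robots.txt allows it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


# Hypothetical robots.txt that blocks one AI crawler outright.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

if __name__ == "__main__":
    for agent, allowed in access_map(SAMPLE_ROBOTS, "https://example.com/blog/post").items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Against a live site, the same parser can load the file with set_url() and read(), but a robots.txt pass alone will not catch blocks applied at the firewall.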

2. Technical foundation

We verify the things the retrieval layer depends on: server-side rendering or pre-rendering that puts the content in the initial HTML, so it is readable without JavaScript execution; valid HTTPS; reasonable performance; clean semantic HTML; a working sitemap.xml; and indexability. AI engines that rely on existing search indexes inherit those indexes' blind spots, so classic technical SEO health is a prerequisite, not an afterthought.
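One way to approximate the "readable without JavaScript" check is to look for key phrases in the visible text of the raw HTML response. The sketch below assumes you already have the HTML as a string; the function names and the sample pages are hypothetical.

```python
from html.parser import HTMLParser

# Elements whose text a non-rendering crawler would not treat as page content.
_HIDDEN_TAGS = ("script", "style", "template")


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside script, style, or template."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in _HIDDEN_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in _HIDDEN_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return whitespace-normalized text present in the HTML before any JS runs."""
    extractor = _TextExtractor()
    extractor.feed(html)
    return " ".join(" ".join(extractor.parts).split())


def missing_phrases(html: str, phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear in the initial HTML's visible text."""
    text = visible_text(html).lower()
    return [phrase for phrase in phrases if phrase.lower() not in text]
```

A phrase that only shows up inside a script tag, as in a client-rendered app shell, is reported as missing, which is the failure this check is meant to surface.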

3. AI citability

This is the dimension most audits miss. We evaluate, passage by passage, how extractable and quotable the content is. The signals we look for are grounded in published research on generative-engine visibility, which found that content with cited statistics, direct quotations, and authoritative references is significantly more likely to be surfaced by AI engines (Aggarwal et al., KDD 2024). Concretely, we score:

Whether key pages open with a direct, self-contained answer to the implied question.
Whether claims are backed by linked, primary sources with visible dates.
Whether content is chunked into clean, liftable units rather than long unbroken prose.
Whether the page answers the actual questions users ask (we test against real prompts).

4. Content quality and E-E-A-T

We assess experience, expertise, authoritativeness, and trustworthiness: named authors with real credentials and bios, original analysis versus rehashed summaries, accurate and current information, and clear sourcing. Models, and the human reviewers behind their guidelines, discount anonymous, unsourced, or stale content. We flag thin pages and identify where first-hand expertise should be made explicit.

5. Structured data

We inventory existing Schema.org JSON-LD and validate it, then identify gaps: missing Organization, Article, Person, FAQPage, or BreadcrumbList markup, broken entity links, and datePublished or dateModified values that don't match the page's real history. Valid, connected structured data removes ambiguity for engines about who you are and what your content means.
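The inventory step can be sketched as follows. This only checks that each JSON-LD block parses and which @type values are present; it does not validate against the Schema.org vocabulary or check entity links, and the helper names are hypothetical.

```python
import json
from html.parser import HTMLParser

# The types named above; adjust per site.
EXPECTED_TYPES = {"Organization", "Article", "Person", "FAQPage", "BreadcrumbList"}


class _JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _collect_types(node, found: set) -> None:
    """Walk nested JSON-LD (including @graph and nested entities) gathering @type."""
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            _collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            _collect_types(item, found)


def inventory(html: str) -> dict:
    """Report found types, missing expected types, and unparseable blocks."""
    collector = _JsonLdCollector()
    collector.feed(html)
    found, invalid_blocks = set(), 0
    for raw in collector.blocks:
        try:
            _collect_types(json.loads(raw), found)
        except json.JSONDecodeError:
            invalid_blocks += 1
    return {
        "found": found,
        "missing": EXPECTED_TYPES - found,
        "invalid_blocks": invalid_blocks,
    }
```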

How we actually run it

The audit is parallelized. We delegate each dimension to a focused workstream that pulls live data rather than relying on a single surface-level crawl: fetching robots.txt and response headers, rendering pages to check SSR, extracting and scoring passages against target prompts, and validating every JSON-LD block. Running the dimensions in parallel keeps a full audit to hours, not days, and means each category is assessed by logic specialized for it.
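The fan-out pattern itself is simple to sketch. The version below assumes each dimension is a Python callable that takes a site URL and returns a result; it uses a thread pool because the work is mostly network-bound. The names are hypothetical and the real workstreams are more involved.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_audit(site: str, checks: dict[str, Callable[[str], Any]]) -> dict[str, Any]:
    """Run every dimension check concurrently and collect results by name.

    A failing check is recorded as an error entry instead of aborting the audit.
    """
    results: dict[str, Any] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {name: pool.submit(check, site) for name, check in checks.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # keep the other dimensions' results
                results[name] = {"error": str(exc)}
    return results
```

Isolating failures per dimension matters in practice: a timeout while rendering pages should not discard a crawler-access result that already finished.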

We also test the brand directly inside the engines: we run a set of real, high-intent prompts through ChatGPT, Perplexity, Gemini, and Google AI Overviews and record whether the brand is cited, how it's described, and which competitors appear instead. That citation baseline is the closest thing GEO has to a rank check, and it's how we measure progress over time.
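A citation baseline like this reduces to a small calculation once the prompt runs are recorded. The sketch below assumes each run is stored as a dict with a "cited" list of brand names; that record shape and the brand names in the usage are invented for illustration.

```python
from collections import Counter


def citation_share(runs: list[dict], brand: str) -> float:
    """Fraction of recorded prompt runs in which the brand was cited."""
    if not runs:
        return 0.0
    hits = sum(1 for run in runs if brand in run["cited"])
    return hits / len(runs)


def competitors_cited_instead(runs: list[dict], brand: str) -> Counter:
    """Count which other brands appear in runs where the brand is absent."""
    counts: Counter = Counter()
    for run in runs:
        if brand not in run["cited"]:
            counts.update(run["cited"])
    return counts


if __name__ == "__main__":
    # Hypothetical runs: one record per (engine, prompt) pair.
    runs = [
        {"engine": "ChatGPT", "prompt": "best crm for startups", "cited": ["Acme", "Globex"]},
        {"engine": "Perplexity", "prompt": "best crm for startups", "cited": ["Globex"]},
    ]
    print(citation_share(runs, "Acme"))
    print(competitors_cited_instead(runs, "Acme").most_common())
```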

From findings to a plan

A score on its own is useless. Every audit ends with a prioritized action plan sorted by impact-to-effort: the few changes that unblock retrieval or fix broken entities come first, citability rewrites come next, and nice-to-haves like llms.txt come last. Each item names the page, the specific change, and the dimension it improves.
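Sorting by impact-to-effort can be sketched in a few lines. The 1-to-5 scales, field names, and sample items here are assumptions made for the example, not a published scoring rubric.

```python
from dataclasses import dataclass


@dataclass
class ActionItem:
    page: str        # the page the change applies to
    change: str      # the specific change to make
    dimension: str   # the audit dimension it improves
    impact: int      # hypothetical scale: 1 (minor) to 5 (unblocks retrieval)
    effort: int      # hypothetical scale: 1 (minutes) to 5 (weeks)

    def __post_init__(self):
        if self.effort <= 0:
            raise ValueError("effort must be positive")


def prioritize(items: list[ActionItem]) -> list[ActionItem]:
    """Order items by impact-to-effort ratio, highest first."""
    return sorted(items, key=lambda item: item.impact / item.effort, reverse=True)
```

With these scales, a one-line robots.txt fix (impact 5, effort 1) sorts above a citability rewrite (impact 4, effort 3), which in turn sorts above adding llms.txt (impact 1, effort 1).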

Why we re-audit monthly

The engines change their grounding behavior constantly, and competitors are optimizing too. A GEO score is a snapshot, not a finish line. We re-run the same audit every month, compare the deltas, and track citation share so clients can see movement that traditional rank tracking would never reveal. The methodology stays fixed precisely so the comparison stays honest.
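Because the methodology is held fixed, comparing two audits is a per-dimension subtraction. A minimal sketch, assuming each audit is stored as a dict of dimension scores (the keys and numbers in the usage are made up):

```python
def score_deltas(previous: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Return current minus previous for every dimension present in both audits."""
    return {dim: current[dim] - previous[dim] for dim in previous if dim in current}


if __name__ == "__main__":
    last_month = {"crawler_access": 60, "citability": 40, "structured_data": 55}
    this_month = {"crawler_access": 100, "citability": 45, "structured_data": 50}
    for dim, delta in score_deltas(last_month, this_month).items():
        print(f"{dim}: {delta:+}")
```

A negative delta is as useful as a positive one: it flags a regression, such as markup removed during a redesign, before it shows up in citation share.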

If you want to see where a site stands today, the five dimensions above are the right place to start. Run them in order, fix what blocks retrieval first, and measure citations, not just rankings.