---
title: "How AI search actually chooses what to cite — and the five layers that decide"
url: https://martech.llc/research/how-ai-search-chooses-citations
publishedAt: 2026-05-29
updatedAt: 2026-05-29
author: sundar
category: research-note
summary: "AI search engines rarely cite the page that ranks first. They rewrite a question into a fan of related queries, retrieve candidates, then attach a citation to each sentence a source can verify. Drawing on Google's patents and peer-reviewed retrieval research, this maps the five layers of citation selection."
soWhat: "AI engines fan one question into many searches, then cite the sources cheapest to verify. The work is to be the most verifiable answer — not the highest-ranked page."
tags: ["generative-engine-optimization","answer-engine-optimization","ai-search","citations","query-fan-out"]
keywords: ["how ai search chooses citations","query fan-out","generative engine optimization","answer engine optimization","ai overviews citations","how to get cited by ai","llm citations","perplexity citations"]
claims: [{"id":"claim-1","text":"Google's 'Search with stateful chat' patent (US20240289407A1) describes a first large language model generating additional, alternative, supplemental, rewritten, and drill-down queries from a single user query.","source":"https://patents.google.com/patent/US20240289407A1/en","sourceTitle":"Google LLC — Search with stateful chat (US20240289407A1)","sourceDate":"2024-08-29"},{"id":"claim-2","text":"The patent describes selecting documents responsive to both the original query and the LLM-generated additional queries as candidate search-result documents.","source":"https://patents.google.com/patent/US20240289407A1/en","sourceTitle":"Google LLC — Search with stateful chat (US20240289407A1)","sourceDate":"2024-08-29"},{"id":"claim-3","text":"The patent describes query-independent trustworthiness measures generated based on a document's author, domain, and inbound links.","source":"https://patents.google.com/patent/US20240289407A1/en","sourceTitle":"Google LLC — Search with stateful chat (US20240289407A1)","sourceDate":"2024-08-29"},{"id":"claim-4","text":"The patent describes attaching a source identifier before or after a portion of the summary to indicate that the source document verifies that portion.","source":"https://patents.google.com/patent/US20240289407A1/en","sourceTitle":"Google LLC — Search with stateful chat (US20240289407A1)","sourceDate":"2024-08-29"},{"id":"claim-5","text":"The ALCE benchmark found that on the ELI5 long-form QA dataset, even the best models lack complete citation support 50% of the time.","source":"https://arxiv.org/abs/2305.14627","sourceTitle":"Gao et al. — Enabling LLMs to Generate Text with Citations (ALCE)","sourceDate":"2023-05-23"},{"id":"claim-6","text":"RARR automatically finds attribution for the output of any text generation model and post-edits the output to fix unsupported content while preserving the original.","source":"https://arxiv.org/abs/2210.08726","sourceTitle":"Gao et al. 
— RARR: Researching and Revising for Attribution","sourceDate":"2022-10-17"},{"id":"claim-7","text":"Research on verifiable generation emphasizes the critical role of retrieval accuracy in citation generation, showing substantial room for improvement in current LLMs.","source":"https://arxiv.org/abs/2310.05634","sourceTitle":"Li et al. — Towards Verifiable Generation","sourceDate":"2023-10-09"},{"id":"claim-8","text":"The peer-reviewed GEO study demonstrated that Generative Engine Optimization can boost a source's visibility by up to 40% in generative-engine responses.","source":"https://arxiv.org/abs/2311.09735","sourceTitle":"Aggarwal et al. — GEO: Generative Engine Optimization (KDD 2024)","sourceDate":"2023-11-16"},{"id":"claim-9","text":"In the GEO study the top methods Cite Sources, Quotation Addition, and Statistics Addition achieved a relative improvement of 30-40% on Position-Adjusted Word Count, with the best methods beating baseline by 41%.","source":"https://arxiv.org/abs/2311.09735","sourceTitle":"Aggarwal et al. — GEO: Generative Engine Optimization (KDD 2024)","sourceDate":"2023-11-16"},{"id":"claim-10","text":"The GEO study found that keyword stuffing, the classic SEO tactic, did not perform well in generative engines while credibility methods did.","source":"https://arxiv.org/abs/2311.09735","sourceTitle":"Aggarwal et al. — GEO: Generative Engine Optimization (KDD 2024)","sourceDate":"2023-11-16"},{"id":"claim-11","text":"In the GEO study the Cite Sources method led to a 115.1% increase in visibility for websites ranked fifth, while the top-ranked website's visibility decreased by 30.3% on average.","source":"https://arxiv.org/html/2311.09735v3","sourceTitle":"Aggarwal et al. 
— GEO (arXiv HTML v3), Section 5.2","sourceDate":"2023-11-16"},{"id":"claim-12","text":"The GEO study validated its methods on Perplexity.ai, a real-world generative engine, and demonstrated visibility improvements up to 37%.","source":"https://arxiv.org/abs/2311.09735","sourceTitle":"Aggarwal et al. — GEO: Generative Engine Optimization (KDD 2024)","sourceDate":"2023-11-16"},{"id":"claim-13","text":"Google's guidance directs creators to produce people-first, helpful content and to demonstrate experience, expertise, authoritativeness, and trustworthiness.","source":"https://developers.google.com/search/docs/fundamentals/creating-helpful-content","sourceTitle":"Google Search Central — Creating helpful, reliable, people-first content"},{"id":"claim-14","text":"In February 2026 Bing introduced AI performance reporting in Bing Webmaster Tools, exposing how pages appear and are cited inside AI answers.","source":"https://blogs.bing.com/webmaster/February-2026/Introducing-AI-Performance-in-Bing-Webmaster-Tools-Public-Preview","sourceTitle":"Bing Webmaster Blog — AI Performance in Bing Webmaster Tools"}]
---

# How AI search actually chooses what to cite — and the five layers that decide

AI search engines rarely cite the page that ranks first. A Google patent application describes rewriting a single question into a fan of related queries, gathering the documents that answer the whole fan, ranking them by trust signals such as author and domain, then attaching a citation to each sentence a source can verify. Citation is earned at the sentence, not the keyword.

That distinction is the whole game, and most marketing teams are still playing the old one. They track a blue-link rank for a head term while the engine quietly runs five other searches the team never sees, reads the results, and decides — sentence by sentence — which sources to name. This piece traces how that decision is made, using the primary sources that describe it: Google's patent filings, the peer-reviewed retrieval-attribution literature, and the controlled study that measured which content changes actually move the needle. It closes with a single mental model — the Citation Selection Stack — and an honest account of what the evidence does *not* yet prove.

<Aside kind="fact" title="The short version">
An answer engine turns one question into many searches, retrieves a candidate set for each, ranks candidates by relevance and trust, writes an answer, and attaches a citation to every span it can verify against a source. You influence five different layers — and the page that wins is the one that is cheapest to verify.
</Aside>

## How does an AI engine turn one question into a search?

It does not search your exact question. It searches a *fan* of related questions and pools the results.

<Claim id="claim-1">Google's [&ldquo;Search with stateful chat&rdquo; patent (US20240289407A1)](https://patents.google.com/patent/US20240289407A1/en) describes a &ldquo;first&rdquo; large language model generating additional queries — &ldquo;alternative query suggestions, supplemental queries, rewritten versions of the user&rsquo;s query, and/or &lsquo;drill down&rsquo; queries&rdquo; — from one user query.</Claim> The SEO community calls this *query fan-out*; the term itself does not appear in the filing, and the patent is written in the permissive &ldquo;may be&rdquo; language of a disclosed method, not a confirmation of what ships in production. But the mechanism it documents is unambiguous, and it reorders everything downstream.

<QueryFanOut />

Why it matters: <Claim id="claim-2">the patent describes selecting documents that are responsive to [both the original query and the generated additional queries](https://patents.google.com/patent/US20240289407A1/en) for inclusion in the candidate set.</Claim> A page that answers the literal question but none of the adjacent ones competes for a single slice of the fan. A page that covers the intent — the comparison, the price, the reliability question, the alternative — shows up across the fan and accumulates more chances to be cited. You are no longer optimizing for a keyword. You are optimizing for *coverage of an intent*.

Watch it play out. Each synthetic query builds its own candidate set; the page that wins is the one present in the most candidate sets, not the one that ranks first for the words the user typed:

<FanCoverageRace />
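
The same race can be sketched in a few lines of Python. This is a toy illustration, not the engine's method: the fan queries, the URLs, and the `fan_coverage` helper are all invented here, and the only idea taken from the text is that a page collects one chance per candidate set it appears in.

```python
from collections import Counter

def fan_coverage(candidate_sets: dict[str, list[str]]) -> Counter:
    """Count, per URL, how many fan queries retrieved it at least once."""
    coverage = Counter()
    for urls in candidate_sets.values():
        # set() so a URL retrieved twice for one query still counts once
        coverage.update(set(urls))
    return coverage

# Hypothetical fan for one user question; every query and URL is made up.
fan = {
    "best project management tool": ["a.example/top10", "b.example/guide"],
    "project management tool pricing": ["b.example/guide", "c.example/pricing"],
    "project management tool reliability": ["b.example/guide"],
    "alternatives to leading project tools": ["b.example/guide", "a.example/top10"],
}

coverage = fan_coverage(fan)
best_url, hits = coverage.most_common(1)[0]
print(best_url, hits, "of", len(fan))  # prints: b.example/guide 4 of 4
```

Note that `a.example/top10` is listed first for the literal query yet appears in only two of the four candidate sets, while `b.example/guide` appears in all four.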

## Which sources clear the trust bar?

Retrieval produces too many candidates to cite. Something has to filter them, and the patent is specific about what.

<Claim id="claim-3">It describes &ldquo;query-independent measures&rdquo; that include a trustworthiness measure for a document &ldquo;generated based on an author thereof, a domain thereof, and/or inbound link(s) thereto.&rdquo;</Claim> Read that carefully. *Query-independent* means these signals attach to the page before anyone asks a question — they are properties of who published it, where, and who vouches for it. That is the patent-level shadow of the same idea Google states plainly in its public guidance: <Claim id="claim-13">create [people-first, helpful content](https://developers.google.com/search/docs/fundamentals/creating-helpful-content) that demonstrates experience, expertise, authoritativeness, and trustworthiness.</Claim> The patent says &ldquo;trustworthiness,&rdquo; not E-E-A-T; the labels are the industry's, the signal is the engine's.

> A source identifier appearing before and/or after a portion of the NL based summary can indicate that the SRD, corresponding to the source identifier, verifies the portion.

<Claim id="claim-4">That line — the patent's description of how [citations are attached](https://patents.google.com/patent/US20240289407A1/en) — is the most important sentence in the whole filing.</Claim> The engine is not citing a page because the page is &ldquo;good.&rdquo; It is attaching a source to a *portion of its own answer* because that source **verifies** the portion. Verification is the unit of citation. Authority gets you into the candidate set; verifiability gets your URL printed next to a sentence.

<TrustBar />
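
A minimal sketch of a query-independent trust filter follows. The patent names the three inputs (author, domain, inbound links) but does not disclose a formula, so the weighted mean, the weights, the 0 to 1 score scale, and the `bar` threshold below are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    author_score: float  # hypothetical 0..1 signal for the author
    domain_score: float  # hypothetical 0..1 signal for the domain
    link_score: float    # hypothetical 0..1 signal for inbound links

def trust(c: Candidate, weights=(0.3, 0.4, 0.3)) -> float:
    """Weighted mean of three query-independent signals (illustrative weights)."""
    w_author, w_domain, w_links = weights
    return (w_author * c.author_score
            + w_domain * c.domain_score
            + w_links * c.link_score)

def clear_trust_bar(candidates: list[Candidate], bar: float = 0.5) -> list[Candidate]:
    """Keep candidates at or above the bar, most trusted first."""
    kept = [c for c in candidates if trust(c) >= bar]
    return sorted(kept, key=trust, reverse=True)

candidates = [
    Candidate("b.example", 0.2, 0.3, 0.1),
    Candidate("c.example", 0.5, 0.6, 0.6),
    Candidate("a.example", 0.9, 0.8, 0.7),
]
kept = clear_trust_bar(candidates)
print([c.url for c in kept])  # prints: ['a.example', 'c.example']
```

Nothing in `trust` looks at a query, which is the point: the score exists before anyone asks a question.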

## How does the model decide what it can attribute?

Here the primary sources stop being patents and start being peer review — and they are humbling.

<Claim id="claim-5">The [ALCE benchmark](https://arxiv.org/abs/2305.14627), the first systematic evaluation of LLM citation quality, found that on long-form questions &ldquo;even the best models lack complete citation support 50% of the time.&rdquo;</Claim> Half. The models that power answer engines routinely write sentences their own cited sources do not fully support. It is one plausible reason the same question, asked twice about the same brand, can return two different citation sets: attribution is probabilistic, not deterministic.

<AttributionGap />
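
To make "citation support" concrete, here is a rough sketch of how a support rate could be computed over an answer. Benchmarks of this kind judge support with an entailment model; the token-overlap check below is a crude stand-in chosen so the example runs without any model, and the function names, the 0.8 threshold, and the sample sentences are all invented.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_supports(sentence: str, passages: list[str], threshold: float = 0.8) -> bool:
    """Stand-in for an entailment model: share of sentence tokens found in the cited passages."""
    sent = tokens(sentence)
    if not passages or not sent:
        return False
    cited = set().union(*(tokens(p) for p in passages))
    return len(sent & cited) / len(sent) >= threshold

def support_rate(answer: list[tuple[str, list[str]]], supports=overlap_supports) -> float:
    """Fraction of answer sentences that their own cited passages support."""
    if not answer:
        return 0.0
    return sum(supports(s, cited) for s, cited in answer) / len(answer)

passage = "The town library opened in 1990 after two years of construction."
answer = [
    ("The library opened in 1990.", [passage]),    # supported by its citation
    ("It holds forty thousand books.", [passage]), # cited, but the passage does not say this
    ("Admission is free.", []),                    # no citation at all
]
rate = support_rate(answer)
print(round(rate, 2))  # prints: 0.33
```

Swapping `overlap_supports` for a real entailment judge changes the verdicts but not the shape of the measurement: one decision per sentence, averaged over the answer.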

The research response has been to bolt verification onto generation. <Claim id="claim-6">[RARR](https://arxiv.org/abs/2210.08726) (Retrofit Attribution using Research and Revision) &ldquo;automatically finds attribution for the output of any text generation model&rdquo; and then &ldquo;post-edits the output to fix unsupported content while preserving the original output as much as possible.&rdquo;</Claim> And the bottleneck is not the writer; it is the retriever. <Claim id="claim-7">Work on [verifiable generation](https://arxiv.org/abs/2310.05634) emphasizes &ldquo;the critical role of retrieval accuracy&rdquo;: a model can only attribute to what retrieval surfaced.</Claim>

<Pullquote>If the retrieval layer never finds your page, no amount of authority earns the citation. You cannot be quoted from a room the engine never entered.</Pullquote>

This is the layer marketers most underestimate. They obsess over being authoritative and ignore being *retrievable and verifiable* — crawlable, fresh, structured, and written in sentences a machine can match to a claim.

## What does the evidence say actually moves visibility?

Mechanism is one thing; measured effect is another. The strongest causal evidence comes from the GEO study by a team at Princeton and IIT Delhi, [presented at KDD 2024](https://arxiv.org/abs/2311.09735), which ran controlled content modifications across a 10,000-query benchmark and a live engine.

<Claim id="claim-8">It reported that Generative Engine Optimization &ldquo;can boost visibility by up to 40% in generative engine responses.&rdquo;</Claim> But the *which* matters more than the headline. <Claim id="claim-9">The three top-performing methods (Cite Sources, Quotation Addition, and Statistics Addition) &ldquo;achieved a relative improvement of 30-40% on the Position-Adjusted Word Count metric,&rdquo; with the best methods beating baseline by 41%.</Claim> And one of the tested methods is the punchline for anyone still running a 2015 playbook: <Claim id="claim-10">classic [keyword stuffing](https://arxiv.org/abs/2311.09735) &ldquo;[did not] perform well,&rdquo; while statistics and quotations &ldquo;show strong performance improvements across all metrics.&rdquo;</Claim>

<CitationWorthiness />
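
For readers who want to see what a position-adjusted word count measures, here is a simplified sketch. It is not the study's code: it assumes each answer sentence is credited to exactly one source, and it discounts later sentences with an exponential decay, which is an assumption of this sketch. The sample sentences and the 0.20 to 0.28 figures are hypothetical.

```python
import math

def position_adjusted_word_count(sentences: list[tuple[str, str]]) -> dict[str, float]:
    """sentences: (text, cited_source) pairs in answer order.
    Returns each source's position-discounted share of the answer's words."""
    n = len(sentences)
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return {}
    scores: dict[str, float] = {}
    for pos, (text, source) in enumerate(sentences):
        weight = math.exp(-pos / n)  # assumed decay: earlier sentences count more
        scores[source] = scores.get(source, 0.0) + len(text.split()) * weight / total_words
    return scores

def relative_improvement(before: float, after: float) -> float:
    """Percentage change of a visibility score against its baseline."""
    return (after - before) / before * 100

answer = [
    ("one two three four", "A"),
    ("one two", "B"),
    ("one two three four", "A"),
]
scores = position_adjusted_word_count(answer)
print(scores["A"] > scores["B"])  # prints: True

# A hypothetical share moving from 0.20 to 0.28 is a 40% relative lift.
print(round(relative_improvement(0.20, 0.28), 1))  # prints: 40.0
```

A relative lift is measured against the source's own baseline, so a 40% improvement on a small share is still a small share.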

Two further findings reframe the strategy entirely. <Claim id="claim-11">The Cite Sources method &ldquo;led to a substantial [115.1% increase in visibility](https://arxiv.org/html/2311.09735v3) for websites ranked fifth in SERP, while on average, the visibility of the top-ranked website decreased by 30.3%.&rdquo;</Claim> Generative-engine visibility does not mirror classic ranking — it can invert it. And the lifts were not only simulated: <Claim id="claim-12">the study validated its methods on [Perplexity.ai](https://arxiv.org/abs/2311.09735), a real-world engine, and &ldquo;demonstrate[d] visibility improvements up to 37%.&rdquo;</Claim>

<Aside kind="warning" title="Read the numbers honestly">
The 40% and 37% figures are upper bounds for the best methods on favorable domains, not averages, and the benchmark ran largely on 2023–24 model versions. Treat them as direction and magnitude, not a guarantee for a 2026 engine. The mechanism is durable; the exact percentage is a snapshot.
</Aside>

## The Citation Selection Stack

Put the sources together and a single model falls out. Five layers sit between a question and a printed citation, and each has exactly one lever a publisher controls.

<CitationStack />

To make that concrete, take a single factual sentence you might publish and run it down the stack — what the engine does at each layer, and the one lever you control to survive it:

<CitationWorkedExample />

Most teams pour their effort into layer two — be indexed, be fast — and treat the rest as luck. The pages that get cited are built backwards from layer five: easy to verify, easy to quote, easy to attribute. The practical translation is almost rude in its simplicity. Write the sentence the engine would want to quote, make it true, and make it cheap to check.
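
The five layers can be strung together as a toy pipeline. Everything here is a stand-in under stated assumptions: `fan_out` uses fixed templates where an engine would use a language model, `retrieve` is a naive substring match, the trust scores are invented, the draft answer is passed in rather than generated, and "verification" is reduced to a verbatim match against the page text.

```python
# Toy corpus: url -> (trust score, page text). All values are invented.
CORPUS = {
    "a.example": (0.9, "Widget X weighs 2 kg. Widget X costs 40 dollars."),
    "b.example": (0.3, "Widget X weighs 2 kg. Widget X is the best ever."),
    "c.example": (0.8, "Widget Y weighs 5 kg."),
}

def fan_out(question: str) -> list[str]:
    """Layer 1: stub for the query generator (fixed templates, not a model)."""
    return [question, f"{question} price", f"{question} weight"]

def retrieve(query: str) -> set[str]:
    """Layer 2: naive substring match of any query term against page text."""
    terms = query.lower().split()
    return {url for url, (_, text) in CORPUS.items()
            if any(term in text.lower() for term in terms)}

def trusted(urls: set[str], bar: float = 0.5) -> set[str]:
    """Layer 3: drop candidates below a hypothetical trust bar."""
    return {url for url in urls if CORPUS[url][0] >= bar}

def cite(draft: list[str], urls: set[str]) -> list[tuple[str, list[str]]]:
    """Layers 4 and 5: take the drafted sentences as given, then attach
    every surviving source whose text contains the sentence verbatim."""
    return [(s, sorted(u for u in urls if s in CORPUS[u][1])) for s in draft]

def answer(question: str, draft: list[str]) -> list[tuple[str, list[str]]]:
    pool = set().union(*(retrieve(q) for q in fan_out(question)))
    return cite(draft, trusted(pool))

result = answer("widget x", ["Widget X weighs 2 kg.", "Widget X is the best ever."])
for sentence, sources in result:
    print(sentence, sources)
```

In this run `b.example` contains both sentences but never gets cited because it fails the trust bar, and `c.example` clears the bar but verifies nothing. Only `a.example` does both, and only for the sentence it can actually back.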

## What this means for the work

Three shifts follow directly from the evidence, in order of effort-to-impact.

**Write to be quoted, not to rank.** Every section should open with a self-contained, factual sentence a source could verify — the GEO study's strongest levers were statistics, quotations, and citations, not keywords. If a paragraph cannot be lifted out and still be true, the engine cannot lift it either.

**Make verification cheap.** Expose the structure that lets an engine map a sentence to its source without guessing. Schema.org `Claim` and `FAQPage` markup, a visible last-updated date, and one outbound citation per claim are meant to turn an expensive verification into a cheap one. This page does exactly that: every factual sentence here is wrapped as a machine-readable claim, along these lines:

```json
{
  "@context": "https://schema.org",
  "@type": "Claim",
  "text": "Keyword stuffing did not transfer to generative engines; statistics and quotations did.",
  "appearance": { "@type": "WebPage", "@id": "https://martech.llc/research/how-ai-search-chooses-citations" },
  "firstAppearance": { "@type": "CreativeWork", "url": "https://arxiv.org/abs/2311.09735" }
}
```
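
If the claims already live in front matter, the markup above can be generated rather than hand-written. The sketch below assumes a claim record with `text` and `source` keys, as in this page's front matter; the `claim_to_jsonld` helper is hypothetical, and it only mirrors the JSON shape shown above.

```python
import json

def claim_to_jsonld(claim: dict, page_url: str) -> dict:
    """Map one front-matter claim record to a schema.org Claim node."""
    return {
        "@context": "https://schema.org",
        "@type": "Claim",
        "text": claim["text"],
        "appearance": {"@type": "WebPage", "@id": page_url},
        "firstAppearance": {"@type": "CreativeWork", "url": claim["source"]},
    }

# One record copied from this page's front matter.
claim = {
    "id": "claim-10",
    "text": "The GEO study found that keyword stuffing, the classic SEO tactic, "
            "did not perform well in generative engines while credibility methods did.",
    "source": "https://arxiv.org/abs/2311.09735",
}
node = claim_to_jsonld(claim, "https://martech.llc/research/how-ai-search-chooses-citations")
print(json.dumps(node, indent=2))
```

Generating the nodes from one source of truth keeps the visible sentence, the outbound link, and the structured claim from drifting apart.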

**Measure the surface you can now see.** The engines are starting to report it. <Claim id="claim-14">In February 2026, [Bing introduced AI performance reporting](https://blogs.bing.com/webmaster/February-2026/Introducing-AI-Performance-in-Bing-Webmaster-Tools-Public-Preview) in Webmaster Tools, surfacing how pages appear inside AI answers.</Claim> Treat citation share like a rank: probe the questions your buyers ask, log which sources get named, and watch the set move.
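
Citation share is simple to compute once the probes are logged. A minimal sketch, assuming a hand-kept log where each probe records the question asked and the URLs the engine cited; the log format, field names, and URLs are invented, and no engine API is involved.

```python
from collections import Counter
from urllib.parse import urlparse

def citation_share(probe_log: list[dict]) -> dict[str, float]:
    """Share of all logged citations that went to each domain."""
    counts = Counter(
        urlparse(url).netloc.lower()
        for probe in probe_log
        for url in probe["cited_urls"]
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: n / total for domain, n in counts.items()}

# Hypothetical probe log: two buyer questions and what got cited for each.
log = [
    {"question": "best widget for small teams",
     "cited_urls": ["https://a.example/x", "https://b.example/y"]},
    {"question": "widget pricing compared",
     "cited_urls": ["https://a.example/z", "https://c.example/q"]},
]
shares = citation_share(log)
print(shares)  # prints: {'a.example': 0.5, 'b.example': 0.25, 'c.example': 0.25}
```

Because attribution varies from run to run, a single probe says little; rerunning the same questions on a schedule and comparing shares over time is what makes the number useful.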

## Where the evidence runs out

A serious reading has to mark its own edges. The Google patent is a *disclosed method* in permissive drafting language — it documents what the system can do, not a sworn description of production AI Overviews, and the terms &ldquo;query fan-out,&rdquo; &ldquo;AI Mode,&rdquo; and &ldquo;E-E-A-T&rdquo; are commentary, not patent text. The GEO results ran on 2023–24 engines and a Perplexity snapshot; whether the exact lifts hold on a 2026 frontier model is genuinely open. And no vendor — not Perplexity, not Bing, not Google — has published a precise, current account of how *its* live system chooses citations, so the engine-specific mechanics here lean on Google's patent and on academic RAG research applied by analogy.

What survives all of that is the shape of the thing. Engines fan out, retrieve, rank by trust, and cite by verification. The signals that win are credibility signals — statistics, quotations, sources — and they reward the verifiable page over the merely high-ranked one. Optimize for the question after the question, write sentences worth quoting, and make them cheap to check. That work compounds no matter which engine reads it next.

— Sundar Ramesh Kumar · martech.llc
