GEO · June 10, 2026 · 4 min read

LLM SEO — how language models choose the sources they cite

LLM SEO explained: the crawl-to-citation chain, llms.txt, structured data, mentions and trust signals, and how language models pick the sources they cite.

By Mediseo

LLM SEO is the work of becoming one of the sources that language models like ChatGPT, Gemini, Perplexity and Claude actually pull answers from. The mechanics behind it form a three-link chain: the model must be able to crawl your site, your content must sit in an index the model retrieves from, and the text must be worth citing. If one link fails, it doesn't matter how strong you are in the other two. This article walks through each link — and the signals that decide whether you, specifically, get mentioned.

The short version

LLM SEO is about being found, understood and cited by language models — not just ranked on Google.
The chain has three links: crawl → index → citation. All three have to work.
The toolkit: open access for AI crawlers, an llms.txt file, structured data, and mentions from sources the models trust.
Mentions count even without a link. Models learn who you are from your name appearing in the right context.
Nobody can guarantee a citation. What you can do is remove every obstacle standing in the way.

Link 1: Crawl — let the models in

The first link is mundane, and still the most common place things break: the model has to be able to read your page. AI companies run their own crawlers (at the time of writing GPTBot from OpenAI, ClaudeBot from Anthropic and PerplexityBot, among others), and Google offers Google-Extended, a robots.txt token that controls whether content fetched by its regular crawlers may be used for Gemini. Plenty of websites block these in robots.txt without knowing it, because an old "block all bots" rule is still hanging around.
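To see whether an inherited rule is locking the AI crawlers out, you can run your robots.txt through Python's standard-library parser. What follows is a minimal sketch, not a full audit: the `legacy_rules` file and the `example.com` URL are made up for illustration, and the list of user-agent tokens is simply the one named above, which will change over time.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the text above; extend the list as new ones appear.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_agents(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI user-agent tokens that robots_txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]


# Hypothetical inherited file: Googlebot is allowed, every other bot is shut out.
legacy_rules = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

print(blocked_agents(legacy_rules))
```

With this file every token in the list comes back as blocked, because none of them matches the Googlebot group and all of them fall through to the catch-all rule. An empty result means none of the listed agents is locked out of that URL.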

Check three things:

robots.txt. Are you letting the AI crawlers in, or did you inherit a rule that locks them out?
Rendering. Many AI crawlers run little or no JavaScript as of today. Does your key content sit in the HTML your server delivers, or is it assembled in the browser afterwards? The latter is like inviting guests over and serving dinner in a locked fridge drawer.
Speed and stability. Slow pages and frequent error codes make crawlers give up and return less often.

This is classic technical SEO work, which is why SEO and AI visibility belong together instead of being two separate projects.
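The rendering check can be approximated without a browser: take the HTML exactly as the server delivers it, ignore everything inside script and style tags, and see whether your key phrases are still there. The sketch below assumes you already have the raw HTML as a string. The sample page and the phrase "Free site audit" are hypothetical, and the match is a plain substring test, so a phrase split across several tags may be reported as missing.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside <script> and <style>, roughly what a non-JS crawler reads."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a script or style element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_phrases(html: str, phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear in the server-delivered text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [phrase for phrase in phrases if phrase.lower() not in text]


# Hypothetical page whose content only exists after JavaScript has run.
server_html = """<html><body>
<div id="app"></div>
<script>document.getElementById("app").textContent = "Free site audit";</script>
</body></html>"""

print(missing_phrases(server_html, ["Free site audit"]))
```

Here the phrase is reported as missing: it exists only inside the script, so a crawler that does not execute JavaScript never sees it.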

Link 2: Index — be where the models retrieve from

An AI answer draws on two sources: what the model learned during training, and what it fetches live when you ask the question. For anything involving businesses, prices and services it leans almost entirely on the latter: the model runs a search and reads the top results before answering.

And where does it search? In the same indexes as always: Gemini retrieves from Google, Copilot from Bing, Perplexity from its own index. That means old-fashioned indexing hygiene — a sitemap, indexable pages, decent rankings — is the entry ticket to the AI answer. If you're invisible in the search indexes, you're invisible to the models. We've written a separate breakdown of how Perplexity, Gemini and Copilot each pick sources.
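The sitemap part of that hygiene is easy to automate. As a rough sketch, assuming you can list your indexable URLs together with their last-modified dates, the standard library is enough to produce a valid sitemap.xml. The function name and the example URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, str]]) -> str:
    """Build a sitemap.xml string from (url, last-modified date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # W3C date, e.g. 2026-06-10
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages; in practice this list would come from your CMS or router.
print(build_sitemap([
    ("https://example.com/", "2026-06-10"),
    ("https://example.com/services/seo", "2026-05-02"),
]))
```

A sitemap only tells the search engines which pages exist. Whether those pages are actually indexed and ranked is a separate check in the search engines' own webmaster tools.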

Link 3: Citation — make the text worth clipping

Once the model has found its candidate pages, say ten of them, it chooses what actually ends up in the answer. Here, the winning text is the text that's easy to cite:

Answer first, explanation after. Sentences that stand on their own and can be clipped out without losing their meaning.
Concrete facts. Numbers, prices, dates and definitions — things that can be verified. "We're passionate about quality" can't be cited as a fact, because it isn't one.
Clear structure. Headings that look like questions, paragraphs that look like answers. A model hunting for an answer skips past fog with impressive efficiency.

llms.txt and schema — a fact box for machines

Two tools make your content explicitly machine-readable. llms.txt is a proposed convention: a short Markdown file at the root of your site summarising who you are, what you offer and where the key pages live. Think of it as a fact box the model can read instead of guessing. Structured data (JSON-LD) does the same per page: this is a business, this is a service with a price, this is a question with an answer. Neither guarantees anything, but both remove guesswork, and models prefer citing sources they don't have to interpret.
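To make the two formats concrete, here is a small sketch that generates both. Everything in it is illustrative: "Example Agency", the URLs and the price are invented, the llms.txt layout (a title, a quoted summary, a list of links) follows the publicly proposed format as we understand it, and the JSON-LD uses the schema.org types Service, Organization and Offer.

```python
import json


def service_jsonld(name: str, provider: str, price: str, currency: str) -> str:
    """Describe one service with a price as schema.org JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": name,
        "provider": {"@type": "Organization", "name": provider},
        "offers": {"@type": "Offer", "price": price, "priceCurrency": currency},
    }
    return json.dumps(data, indent=2)


def llms_txt(name: str, summary: str, pages: list[tuple[str, str, str]]) -> str:
    """Build a minimal llms.txt: title, one-line summary, then the key pages."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Key pages", ""]
    lines += [f"- [{title}]({url}): {note}" for title, url, note in pages]
    return "\n".join(lines) + "\n"


# Hypothetical business data, for illustration only.
print(service_jsonld("SEO retainer", "Example Agency", "5500", "NOK"))
print(llms_txt(
    "Example Agency",
    "SEO and AI search agency for small businesses.",
    [("Services", "https://example.com/services", "What we offer, with prices")],
))
```

The JSON-LD output goes inside a script tag of type application/ld+json on the page it describes. The llms.txt output is saved as a plain file at the site root.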

Mentions and trust signals — authority machines can see

The final layer is the hardest one to fake: other people talking about you. Language models pick up mentions — your name appearing in industry round-ups, directories, press and expert articles — even where there's no link at all. If you're consistently mentioned in the same context ("SEO agency in Sandefjord", "AI search specialists"), an association forms that the model can repeat.

The trust signals that matter look suspiciously like old-fashioned credibility: the same facts about your business everywhere on the web, a real address and registration number, reviews, named authors, and content with visible dates that actually gets maintained. Claims that contradict each other from page to page are the opposite of a trust signal.

What you can control — and what you can't

The honest summary: nobody controls what a language model cites, and anyone promising you "guaranteed mentions in ChatGPT" is selling something they don't own. The models change behaviour often, and what's true at the time of writing may be adjusted within a quarter. What you can control is the whole chain on your side: access, indexing, citable text, machine-readable facts and consistent signals. That's measurable work, and it's what we do as part of ongoing SEO and AI search work from NOK 5,500/month.

How AI is changing the SEO craft itself is covered in our article on AI SEO. And if you're wondering how visible your business is to the language models today, book a short call — or read more about how we work with AI search.