An AEO Best Practice, Debunked by Its Own Data
We told you to publish an llms.txt. So did most of the AI-readiness advice on the internet. It became a checklist item, a thing you ship to prove your site is ready for the agent era. Then Ahrefs went and counted.
They analyzed 137,210 domains against their May 2026 traffic data. Of the sites that had bothered to publish an llms.txt, 97% of those files received zero requests. Not low traffic. Zero. Only about 3% saw any measurable reads at all.
That is a best practice failing its own test. And the failure is instructive, because it reveals exactly which AI is touching your site and which is staying away.
The file nobody asked for
llms.txt was proposed as a discovery layer for language models. A single file at the root of your domain, listing what exists and where to find it. A sitemap for machines that read instead of crawl. The logic was clean: agents are coming, give them a map.
The flaw is in the premise. A map only helps if someone goes looking for it. Ahrefs puts it bluntly: zero AI bots go looking for llms.txt files that don’t exist, and almost none go looking for the ones that do. The standard assumed agents would adopt a convention nobody built them to use. They didn’t.
John Mueller of Google had already said the quiet part. He called llms.txt “a temporary crutch” and “not something non-developer sites need to worry about.” That assessment now has 137,000 domains of evidence behind it.
Note the asymmetry. Ahrefs found that 28% of domains in their sample publish a valid llms.txt, and they flag that as an upper bound because their data skews technical. Real adoption across the open web is lower. So you have a meaningful share of technically sophisticated sites spending effort on a file that, 97% of the time, no machine ever reads. That is wasted governance effort, pointed at a surface that does not exist for the readers it was meant to serve.
Who is actually reading your site
The same dataset answers the more useful question. If agents are not fetching llms.txt, what are they fetching, and which agents are they?
Ahrefs broke bot traffic into categories. AI-related activity totals 19.5% of bot traffic, and the breakdown matters more than the headline:
- AI agents and infrastructure: 10.5%. This is the largest slice by far.
- Training crawlers: 5.3%. The bots harvesting content for model training.
- AI assistants: 2.5%. This is Perplexity, ChatGPT browsing, the consumer-facing answer engines everyone optimizes for.
- AI retrieval bots: 1.1%. The fetch-on-demand bots that pull a page to answer a live query.
Stare at those last two numbers. The AI assistants and retrieval bots, the entire category that AEO advice is built around, account for 3.6% of bot traffic combined. The thing most marketing teams are optimizing for is a rounding error in their own logs.
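The arithmetic is worth doing by hand once. A minimal Python sketch, using only the four category shares quoted above: note that the rounded components sum to 19.4 rather than the 19.5 headline, a gap that is presumably rounding in the published figures (that explanation is an assumption, not something the source states).

```python
# Shares of total bot traffic, in percent, as quoted in the list above.
shares = {
    "AI agents and infrastructure": 10.5,
    "Training crawlers": 5.3,
    "AI assistants": 2.5,
    "AI retrieval bots": 1.1,
}

# The slice AEO advice targets: assistants plus retrieval bots.
aeo_target = round(shares["AI assistants"] + shares["AI retrieval bots"], 1)

# Sum of all four rounded components.
ai_total = round(sum(shares.values()), 1)

# How many times larger the agents-and-infrastructure slice is than the AEO slice.
ratio = round(shares["AI agents and infrastructure"] / aeo_target, 1)

print(aeo_target)  # 3.6
print(ai_total)    # 19.4
print(ratio)       # 2.9
```

On these numbers, the agents-and-infrastructure slice is roughly three times the size of the slice most AEO advice is written for.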
The agents-and-infrastructure slice is where the action is. And inside it, Ahrefs found that Claude Code out-fetched every assistant and retrieval bot except GPTBot. A coding agent, not a search engine, an answer engine, or a chatbot summarizing your page for a curious human. A tool that developers run from a terminal, reaching out to read your documentation while it writes code against your API.
That changes who your second audience is. The reader showing up in volume is the developer’s agent, pulling your reference docs to integrate against your API, while the consumer asking Perplexity about your product barely registers. Those two readers want completely different things, and you have probably been optimizing for the one that barely shows up.
The real lesson is about logs, not files
The llms.txt story is a specific failure with a general cause. A tactic spread as best practice before anyone validated it against traffic. The convention sounded right. It mapped cleanly to how humans think about discovery. It just lacked evidence that the readers it targeted would ever use it.
This happens constantly in AI readiness. A vendor names a technique, the technique sounds sophisticated, and it propagates through blog posts and conference talks until it feels mandatory. Nobody checks the logs, because checking the logs is boring and the technique already feels true.
Your server logs do not care what feels true. They record exactly which user-agents fetched which pages, how often, and whether they came back. That record is the only honest answer to “is our site working for AI.” Ahrefs built their entire finding on that kind of first-party data, and it overturned a widely-repeated practice in one study.
The operational discipline here is simple to state and rare to follow. Before you adopt an AI-readiness tactic, ask what evidence exists that it works. After you adopt it, check whether the traffic you expected actually arrived. If the file you published gets zero reads, stop publishing it and put the effort where the agents already are.
Do this now
Pull your access logs for the last 30 days and segment by user-agent. Three questions:
- Is anything fetching your llms.txt? Grep for it. If the request count is zero or near it, you have your answer about that file, and you can reclaim the maintenance effort.
- Which AI bots actually hit your site? Sort by volume. If coding agents like Claude Code dominate, your priority is clean, accurate, machine-readable documentation, not answer-engine SEO.
- Where is the real agent traffic landing? Those are the pages that deserve governance attention. The ones agents never touch do not need an AI strategy. They need to be left alone.
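Here is one way to run those three checks, as a minimal Python sketch and not a definitive tool. It assumes your server writes the Apache/Nginx combined log format. The user-agent substrings in `AI_AGENT_TOKENS` are placeholders: the exact token a given agent sends (the `claude-code` entry in particular) is an assumption, so confirm against your own logs before trusting the counts. The sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed combined log format:
# ip - user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical substrings; replace with the tokens that appear in your logs.
AI_AGENT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "claude-code")


def classify(agent):
    """Return the first matching token (case-insensitive), or None."""
    lowered = agent.lower()
    for token in AI_AGENT_TOKENS:
        if token.lower() in lowered:
            return token
    return None


def summarize(lines):
    """Count llms.txt requests, AI hits per agent, and AI hits per path."""
    llms_txt_hits = 0
    by_agent = Counter()
    by_path = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        path = match["path"].split("?", 1)[0]  # drop any query string
        if path == "/llms.txt":
            llms_txt_hits += 1  # question 1: counts requests from any client
        token = classify(match["agent"])
        if token is not None:
            by_agent[token] += 1  # question 2: which AI bots, by volume
            by_path[path] += 1    # question 3: where AI traffic lands
    return llms_txt_hits, by_agent, by_path


# Made-up sample lines; in practice pass an open log file instead,
# e.g. `with open("access.log") as f: summarize(f)` (hypothetical path).
SAMPLE_LINES = [
    '203.0.113.5 - - [01/Jun/2026:10:00:00 +0000] "GET /docs/api HTTP/1.1" 200 512 "-" "claude-code/1.0"',
    '203.0.113.5 - - [01/Jun/2026:10:00:02 +0000] "GET /docs/api?v=2 HTTP/1.1" 200 512 "-" "claude-code/1.0"',
    '203.0.113.9 - - [01/Jun/2026:10:01:00 +0000] "GET /pricing HTTP/1.1" 200 900 "-" "GPTBot/1.0"',
    '203.0.113.20 - - [01/Jun/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 1500 "-" "Mozilla/5.0"',
    'not a log line',
]

hits, by_agent, by_path = summarize(SAMPLE_LINES)
print("llms.txt requests:", hits)
print("AI hits by agent:", by_agent.most_common())
print("AI hits by path:", by_path.most_common(5))
```

Substring matching on user-agent strings is deliberately crude. Agents can omit or spoof their user-agent, so treat the counts as a floor on AI traffic, not a census.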
The teams that win the agent era will not be the ones who shipped every recommended file. They will be the ones who read their own logs and pointed their effort at the traffic that is actually there.
This analysis synthesizes "We Analyzed 137K Sites: 97% of llms.txt Files Never Get Read" (Ahrefs, June 2026).
Victorino Group helps teams validate AI-readiness tactics with real traffic data, not vendor hype.
All articles on The Thinking Wire are written with the assistance of Anthropic's Opus LLM. Each piece goes through multi-agent research to verify facts and surface contradictions, followed by human review and approval before publication. If you find any inaccurate information or wish to contact our editorial team, please reach out at editorial@victorinollc.com.