Your docs are now read by machines

November 25th, 2025 · 9 mins read


This is nothing new. Machines have always scouted the web for answers. Google has been crawling sites for decades (scanning metadata, structure, keywords, all the SEO things).

But what’s happening today is different from what we’ve experienced before. Back then, search engines mostly looked for matches and signals. Now, they go deeper.

Today’s AI tools read your pages (they don’t just scan). They break down your text, understand the structure, interpret your examples, and then synthesize everything into answers for other developers.

Your documentation, blog posts, SDK guides, and READMEs are now being read by both humans and machines. However, for the purpose of this article, let’s focus on documentation.

If your docs aren’t optimized for machines, they can easily be misinterpreted, ignored, or worse… excluded completely from the answers AI gives to developers. And that’s a big problem, especially as AI becomes a primary discovery tool.

Let me paint a scenario…

I shared this scenario while giving a talk at OSCAFEST'25.

Imagine your company provides a tool or SDK that developers can integrate into their dashboard. Everything is smooth, adoption is growing, and people love it. Then one day, your team introduces a new parameter, tweaks an existing one, changes a default value, or adds a new authentication flow.

Of course, you update your documentation. You push it to production. Internally, everyone knows the latest changes.

But there’s a problem. A developer somewhere is writing code with Cursor (or any AI-assisted IDE). The AI model Cursor is using hasn’t synced with your updates yet. It doesn’t know you changed anything. So what happens?

Cursor keeps suggesting the old parameters, the old structure, the old code samples, because it’s pulling from:

  • Outdated articles
  • Old examples on blogs
  • Cached information from before your update
  • Anything except your official docs (or the old version of your docs)

And the developer starts getting bugs. Not because they wrote anything wrong, but because the machine assistant guiding them is working with incomplete or outdated information.

This is exactly why your docs should be the first point of truth for machines. If AI tools can’t properly read, understand, or sync with your docs:

  • Developers get wrong suggestions
  • Your SDK feels unreliable
  • Your support tickets increase
  • And your product looks harder to use than it actually is

All because the machine audience wasn’t properly considered.

This is why we’re at a point where documentation needs to be optimized not just for humans, but for machines that learn from those docs and pass the knowledge on to other developers.

That’s just one scenario, but it shows what’s happening at a bigger scale. AI tools are becoming a second interpreter of our documentation, and if they misunderstand or can’t access the latest updates, developers get the wrong guidance.

And here’s where it gets even more interesting: AI isn’t just reading your docs now… It’s actually becoming a major traffic source for your product.

Let me explain.

AI is now a big traffic source

One thing we’re all starting to notice is that AI isn’t just helping developers write code. It’s quietly becoming a discovery engine.

People aren’t always “Googling” tools anymore. They’re asking ChatGPT. They’re asking Claude. They’re asking Cursor. They’re asking Perplexity.

And these AI tools point them to your product. A perfect example is when Guillermo Rauch (Vercel’s CEO) mentioned that 10% of Vercel’s signups now come from ChatGPT, and that number used to be less than 1%.

That’s a massive shift in how developers find tools. Think about what that means:

  • AI is recommending products.
  • AI is explaining how tools work.
  • AI is teaching developers how to use SDKs.
  • AI is onboarding users with your documentation as the source.

In other words, AI is becoming a real distribution channel. If AI can understand your docs, developers can discover your product through AI.

But the opposite is also true. If AI struggles to understand your docs, your product becomes invisible in this new discovery flow.

This is why optimizing documentation for machines isn’t just a “nice to have.” It directly affects:

  • Your product adoption
  • Developer experience
  • The suggestions AI assistants give
  • Whether your tool is recommended to developers at all

And this leads us perfectly into the next section.

GEO — Generative engine optimization

I’ve read a ton of posts on this subject. In fact, Gartner predicts that by 2026, traditional search engine volume will drop by 25%, as AI chatbots and virtual agents increasingly become the new answer engines instead of Google.

Now, with all this talk about AI reading your docs and recommending your tools, the big question becomes: How do you make sure AI actually understands your documentation?

This is where GEO, which stands for generative engine optimization, comes in.

Think of it like this:

  • SEO was about making your content easy for Google to find.
  • GEO is about making your content easy for AI models to understand.

What you should know is that AI tools don’t “browse” the way humans do. They don’t click links, look around, or skim pages. They consume your content. They read your README, your documentation, and your examples, breaking them down into tokens and connecting patterns, and then store that understanding so they can reuse it when developers ask questions.

So GEO is just preparing your content in a way that helps AI:

  • read it
  • parse it
  • interpret it
  • and not misunderstand it

Here are the core parts of GEO.

How GEO works

Now that we’ve talked about what GEO is, let’s break down how it actually works. I’ve seen different explanations online, but almost all of them boil down to four major ideas. When you understand these four, you basically understand the whole concept.

I’d say the first prerequisite for GEO is solid SEO. Everything you know or have done for SEO is still 100% valid here.

1. Establish entity presence

This is just a fancy way of saying: Make sure AI knows who you are, everywhere.

AI models rely heavily on "entity recognition." If your product or project appears consistently in multiple online locations, it becomes easier for AI to identify it correctly. This means:

  • Be present on your own website, GitHub, docs site, and community pages
  • Use consistent naming (very important)
  • Keep your social profiles, directories, and mentions up to date
  • Make sure your product is described the same way everywhere

Search Engine Land puts it this way: “The more reputable sources that reference you, the more AI systems recognize you as a legitimate entity.”

So, consistency builds trust for machines as well (not just for humans).

2. Feed the generative engines

AI models don’t magically know things. They learn from what’s available.

The more high-quality, consistent information you publish, the better AI understands your product. For example:

  • Your documentation
  • Your README
  • Blog posts
  • Guides
  • Changelogs
  • Tutorials
  • API references
  • Samples and examples

You’re basically leaving breadcrumbs everywhere so the model can form a full picture. And make sure everything aligns, because mismatched docs confuse AI (and humans, too).

3. Use structured, AI-readable content

This is where the real magic happens. You can have great documentation, but if it’s written in a chaotic way, AI will struggle to parse it. LLMs prefer docs that:

  • Have clear headings
  • Use short, focused sections
  • Don’t mix unrelated topics in the same section
  • Use predictable formatting
  • Include fenced code blocks for anything technical
  • Avoid vague pronouns (like “it”, “this”, “that”)

Also, formats like:

  • Markdown
  • HTML
  • schema.org structured data

…make your content even easier for machines to understand. This ties into tokenization too: predictable formats are easier for models to break into tokens and piece back together accurately.
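To make this concrete, here is a rough sketch of what an AI-friendly section might look like. The product, endpoint, and field names below are invented for illustration; the point is the shape: one topic per section, clear headings, a fenced code block, and no vague pronouns.

````markdown
<!-- Hypothetical example: the product, endpoint, and fields are illustrative -->

## Create a payment link

Use `POST /v1/payment-links` to generate a shareable checkout URL.

### Request

```bash
curl -X POST https://api.example.com/v1/payment-links \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"amount": 5000, "currency": "USD"}'
```

### Notes

- `amount` is in the smallest currency unit (5000 means $50.00).
- Payment links expire after 24 hours unless `expires_at` is set.
````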

4. Provide rich context

When your docs leave gaps, AI will try to infer meaning, and many times it infers incorrectly. That’s how you get wrong suggestions (that sound or look right) in Cursor, Copilot, or ChatGPT.

Rich context means:

  • Link related pages internally
  • Add examples (lots of them!) and data
  • Define key terms instead of assuming
  • Explain the “why,” not just the “how”
  • Include parameters, variations, notes, warnings

The more context you provide, the more accurate the AI’s understanding becomes, and the more accurate the answers it gives to developers.

You should make your content self-contained, so AI doesn’t have to guess.
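As a sketch of what that looks like in practice (the parameter, values, and linked page below are hypothetical), here is a reference entry that defines its terms, states defaults, links to a related page, and explains the why, not just the how:

```markdown
<!-- Hypothetical example: parameter names, defaults, and links are illustrative -->

## `retry_policy`

`retry_policy` controls how the SDK retries failed requests. A "failed request"
means any response with a 5xx status code or a network timeout.

| Value         | Behavior                         | Default |
| ------------- | -------------------------------- | ------- |
| `none`        | Never retry                      | no      |
| `exponential` | Retry up to 5 times with backoff | yes     |

**Why it matters:** retries are safe for idempotent reads, but they can create
duplicate charges on payment endpoints. Read [Idempotency keys](./idempotency.md)
before enabling retries on write operations.

> **Warning:** changing `retry_policy` at runtime does not affect requests that
> are already in flight.
```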

The llms.txt standard

Last year, Jeremy Howard proposed adding a file called /llms.txt to your site, which instructs large language models on how to read and utilize your content.

Think of it as:

  • robots.txt → for search engine crawlers
  • sitemap.xml → for listing all your URLs
  • llms.txt → for helping LLMs understand what’s important and where to find it

The proposal describes it as a markdown file at the root of your site that includes:

  • A bit of background about your project
  • Pointers to the most important docs
  • Links to clean, LLM-friendly versions of your content
  • Optional usage notes or guidance for AI tools

So instead of AI having to guess which pages matter, you hand it a map.
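Here is a minimal sketch of that map, following the shape described in the proposal: an H1 with the project name, a short blockquote summary, then sections of annotated links. The project name and URLs are invented for illustration.

```markdown
<!-- Hypothetical llms.txt: the project name and URLs are illustrative -->

# ExampleSDK

> ExampleSDK is a payments SDK for adding checkout to web dashboards.
> The links below are the canonical, up-to-date documentation.

## Docs

- [Quickstart](https://docs.example.com/quickstart.md): install the SDK and make a first request
- [API reference](https://docs.example.com/api.md): every endpoint, parameter, and default value
- [Changelog](https://docs.example.com/changelog.md): breaking changes and deprecations

## Optional

- [Blog](https://example.com/blog): announcements and deep-dive posts
```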

While digging through posts and examples, I found:

  • Mintlify automatically generates /llms.txt and /llms-full.txt for docs it hosts, so AI tools like ChatGPT and Perplexity can index docs more easily.
  • Anthropic, Perplexity, Zapier, ElevenLabs, Vercel, Cursor and others all have some form of llms.txt or llms-full.txt implementation in the wild. You can literally open one by appending the filename to their docs URL, for example:
    • Anthropic → https://docs.anthropic.com/llms.txt
    • Perplexity → https://docs.perplexity.ai/llms-full.txt
    • Zapier → https://docs.zapier.com/llms.txt and https://docs.zapier.com/llms-full.txt
    • FastHTML → https://www.fastht.ml/docs/llms.txt
    • A2A (open source) → https://github.com/a2aproject/A2A/blob/main/llms.txt
  • Ecommerce platforms like BigCommerce are writing guides showing how llms.txt can point AI directly to product feeds, so AI doesn’t have to scrape messy HTML product pages.

So it’s not just a “docs nerd” thing anymore. It’s docs, SaaS, ecommerce, and even marketing all trying to control how AI sees their content.

The “present” of documentation…

Machines are reading our docs. AI is onboarding our users. Developers are discovering tools through LLMs before they ever open Google.

We’re writing in a world where our documentation has two audiences: humans who need clarity, and machines that need structure.

And the funny thing is that optimizing for machines actually makes our docs better for humans too: clearer structure, better examples, consistent terminology, and fewer assumptions.

Everything I’ve shared in this article isn’t some far-off trend or “nice to have later.” It’s already affecting how your SDK, library, product, and project are perceived.

Your documentation isn’t just teaching people anymore. It’s teaching the machines that teach the people. And that’s the reality we’re writing into now.