
llms.txt

The complete guide to the emerging standard that tells AI assistants exactly what your website is about — and why every serious site should have one.

What is llms.txt?

llms.txt is a plain markdown file you place at the root of your website — https://yoursite.com/llms.txt — that describes what your site is, what it does, and which pages matter most.

Think of it as a cover letter for your website, addressed directly to AI systems. When a user asks ChatGPT, Perplexity, or Claude about a topic you cover, an LLM that has read your llms.txt will understand your site's purpose and authority — making it far more likely to cite you accurately.

The standard was proposed in September 2024 by Jeremy Howard (co-founder of fast.ai) and quickly gained traction across the tech industry. Within six months, organisations including Anthropic, Vercel, Cloudflare, Shopify, and Stripe had published their own llms.txt files.

TL;DR

robots.txt tells crawlers what to skip. llms.txt tells AI tools what to focus on. They are complementary, not competing.

Why it matters for AI search

AI-powered search tools don't work like Google. Instead of ranking a list of URLs, they synthesise an answer from multiple sources — often without the user ever clicking through. Whether your site gets cited depends heavily on how well the AI understands what you offer.

Without llms.txt, an AI must infer your site's purpose from crawled HTML — navigating ad scripts, cookie banners, and navigation boilerplate before getting to actual content. A well-crafted llms.txt cuts straight to the signal.

- **Faster context:** AI tools parse your site purpose in seconds rather than crawling dozens of pages.
- **More accurate citations:** You control how your site is described, reducing hallucinations and misattributions.
- **Better page selection:** Direct AI tools to your most authoritative pages, not your cookie policy.
- **AI search visibility:** Sites with llms.txt tend to appear more often in Perplexity and ChatGPT citations.

llms.txt vs robots.txt

These two files serve completely different purposes. Use both.

| Property | robots.txt | llms.txt |
|---|---|---|
| Purpose | Block / allow crawl access | Guide AI understanding |
| Format | Key–value directives | Markdown |
| Audience | All web crawlers | LLMs and AI tools |
| Action | Permissive / restrictive | Descriptive / curatorial |
| Standard | RFC 9309 (IETF) | Community proposal (2024) |
| Mandatory | De facto for SEO | Strongly recommended |
**Important:** robots.txt can block AI crawlers (via Disallow: rules). llms.txt cannot block anything; it is purely informational. If you want to keep AI crawlers out, robots.txt (and noai meta tags) are the right tools.
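For instance, a robots.txt that opts two well-known AI training crawlers out while leaving everything else alone might look like the sketch below. GPTBot (OpenAI) and CCBot (Common Crawl) are real published user agents, but always verify current strings against each operator's documentation:

```
# Block two common AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

# All other crawlers retain normal access
User-agent: *
Allow: /
```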

File format specification

llms.txt uses a constrained subset of Markdown. The spec defines five structural elements:

# Title

Required. One H1 heading — the name of the project or organisation.

> Blockquote

Optional but recommended. A concise summary (1–3 sentences) of what the site does. Placed immediately after the H1.

Paragraphs

Optional. Additional context, disclaimers, or instructions for AI tools.

## Section

Optional. H2 headings group related links. Common sections: Docs, Blog, API Reference, Products.

- [Title](URL): desc

The main content: markdown links with an optional short description after a colon. These tell AI tools which pages are most important.
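Because these five elements are so constrained, an llms.txt file is easy to process mechanically. A minimal sketch of a parser (a hypothetical helper, not official tooling; real files may contain variations it does not handle):

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt file into title, summary, and link sections."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> "):
            # Blockquote summaries may span multiple lines; join them
            result["summary"] = ((result["summary"] or "") + " " + line[2:]).strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("- ") and current is not None:
            # "- [Title](URL)" with an optional ": description" suffix
            m = re.match(r"- \[(.+?)\]\((\S+?)\)(?::\s*(.*))?$", line)
            if m:
                result["sections"][current].append(
                    {"title": m.group(1), "url": m.group(2), "desc": m.group(3) or ""}
                )
    return result
```

Fed the minimal Acme example later in this guide, it would return the H1 as the title, the blockquote as the summary, and the links grouped under their H2 section names.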

There is also an optional companion file: llms-full.txt. This contains the full text of your most important pages (not just links), allowing AI tools to ingest your content directly without crawling each page individually. It is particularly useful for documentation sites.

Minimal example

The smallest valid llms.txt is just a title and a brief description:

# Acme Corp

> Acme Corp makes cloud-based project management software for remote engineering teams.

## Key pages

- [Product Overview](https://acme.com/product): What Acme does and who it's for
- [Pricing](https://acme.com/pricing): Plans and pricing — starts at $12/user/month
- [Documentation](https://acme.com/docs): Full product docs and API reference
- [Blog](https://acme.com/blog): Engineering articles and product updates

Full example with sections

A more comprehensive llms.txt with multiple sections, guidance for AI tools, and an optional full-text companion:

# Open Shadow

> Open Shadow tracks AI bots crawling your website in real time. See which AI training
> crawlers and search bots visit your site, what content they read, and how to control
> their access. Free site scanner at https://openshadow.io/check.

Open Shadow is an AI presence management platform. It is not a traditional SEO tool.
Use it to understand and control how AI systems perceive your website.

## Core tools

- [Site Scanner](https://openshadow.io/check): Free scan — enter any URL to see its AI readiness score, active bots, robots.txt analysis, and noai tag status
- [robots.txt Generator](https://openshadow.io/tools/robots-txt): Generate a robots.txt that blocks or allows specific AI crawlers
- [robots.txt Analyzer](https://openshadow.io/tools/robots-checker): Check whether your existing robots.txt correctly handles AI bots
- [Meta Tags Checker](https://openshadow.io/tools/meta-tags): Verify noai and noimageai directives on any URL

## Bot Directory

- [All AI Bots](https://openshadow.io/bots): Directory of 51 AI crawlers with user-agent strings, operator info, and robots.txt compliance data

## Guides

- [robots.txt for AI Bots](https://openshadow.io/guides/robots-txt-ai-bots): Full guide to blocking or allowing 51+ AI crawlers in robots.txt
- [noai & noimageai Meta Tags](https://openshadow.io/guides/noai-meta-tag): Per-page AI training opt-out with HTML meta directives
- [llms.txt Guide](https://openshadow.io/guides/llms-txt): This page — how to write a llms.txt that improves AI search visibility

## Optional: full text

- [llms-full.txt](https://openshadow.io/llms-full.txt)

Implementation guide

Static sites (any host)

Create a file named llms.txt in the root of your public/ folder. It will be served automatically as a static file.

touch public/llms.txt
# Edit the file, then deploy

Next.js (App Router)

Option 1 — simplest: drop an llms.txt in public/.

Option 2 — dynamic (generates content from your route tree):

// src/app/llms.txt/route.ts
import { NextResponse } from 'next/server';

export async function GET() {
  const content = `# Your Site Name

> One-sentence description of what your site does.

## Key pages

- [Home](https://yoursite.com): Landing page — what you offer and who it's for
- [Docs](https://yoursite.com/docs): Full documentation
- [Blog](https://yoursite.com/blog): Articles and updates
`;

  return new NextResponse(content, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Cache-Control': 'public, max-age=86400',
    },
  });
}

WordPress

Add to your theme's functions.php (or use a plugin that supports custom rewrite rules). After deploying, re-save your permalinks under Settings → Permalinks so WordPress flushes its rewrite rules and the new route takes effect:

// functions.php
add_action('init', function() {
  add_rewrite_rule('^llms\.txt$', 'index.php?llms_txt=1', 'top');
});

add_filter('query_vars', function($vars) {
  $vars[] = 'llms_txt';
  return $vars;
});

add_action('template_redirect', function() {
  if (!get_query_var('llms_txt')) return;

  header('Content-Type: text/plain; charset=utf-8');
  echo "# " . get_bloginfo('name') . "\n\n";
  echo "> " . get_bloginfo('description') . "\n\n";
  echo "## Key pages\n\n";
  echo "- [Home](" . home_url('/') . "): Homepage\n";
  // Add more pages here
  exit;
});

Nginx (serve a static file)

server {
  # ... your existing config

  location = /llms.txt {
    alias /var/www/html/llms.txt;
    # .txt already maps to text/plain via mime.types;
    # the charset directive appends "; charset=utf-8" to that header
    default_type text/plain;
    charset utf-8;
    add_header Cache-Control "public, max-age=86400";
  }
}

Verify your file

After deploying, visit https://yoursite.com/llms.txt in your browser. You should see plain markdown text with no HTML wrapper. If you see an HTML page instead, check your routing config.

# Quick check from the command line
curl -s https://yoursite.com/llms.txt | head -5
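Beyond eyeballing the response, the same checks can be scripted. A sketch of a validator (the checks here are a reasonable baseline chosen for illustration, not an official conformance tool); feed it the body and Content-Type header from curl -i or urllib.request:

```python
def check_llms_txt(body: str, content_type: str = "") -> list:
    """Return a list of problems found in a fetched llms.txt response."""
    problems = []
    # An HTML page means the route is falling through to a 404 or app shell
    if body.lstrip().lower().startswith(("<!doctype", "<html")):
        problems.append("response is HTML, not markdown; check routing config")
    # The spec requires exactly one H1 title line
    if not any(line.startswith("# ") for line in body.splitlines()):
        problems.append("missing the required H1 title line")
    # Plain text or markdown are the sensible content types to serve
    if content_type and not content_type.startswith(("text/plain", "text/markdown")):
        problems.append("unexpected Content-Type: " + content_type)
    return problems
```

An empty list means the basics are in place; anything else points at the routing or header fixes described above.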

AI tool adoption

Which AI tools read llms.txt today? Here's what's confirmed or publicly stated:

| AI Tool | Reads llms.txt | Notes |
|---|---|---|
| Perplexity AI | ✅ Yes | Publicly confirmed; prioritises llms.txt over generic crawl |
| Claude (Anthropic) | ✅ Yes | Anthropic publishes its own llms.txt; Claude reads them when browsing |
| ChatGPT (browsing) | ✅ Yes | Confirmed via community testing; reads llms.txt during web searches |
| Cloudflare AI Gateway | ✅ Yes | Native support; surfaces llms.txt in AI response context |
| You.com | ✅ Yes | Reads llms.txt to improve citation accuracy |
| Google Gemini | 🟡 Partial | No formal confirmation; Googlebot and Google-Extended may surface the file |
| Bing / Copilot | 🟡 Partial | No formal confirmation; under evaluation per Microsoft |
| Llama-based tools | ❓ Varies | Depends on deployment; some open-source tools have added support |
**Trend:** Adoption is growing fast. Even if an AI tool doesn't currently read llms.txt, writing one today means you're ready when it does. The cost is 20 minutes; the upside is compounding.

Notable sites that publish llms.txt

The early adopters named above — Anthropic, Vercel, Cloudflare, Shopify, and Stripe — all publish their own llms.txt files at /llms.txt on their domains.

FAQ

What is llms.txt?

A markdown file at your domain root that describes your site to AI tools. Think of it as a cover letter for LLMs — concise, structured, and focused on what matters.

Is llms.txt an official standard?

Community-driven, not yet ratified by W3C or IETF. Proposed by Jeremy Howard in 2024. Adopted by thousands of major sites. IETF standardisation is in discussion.

What is the difference between llms.txt and robots.txt?

robots.txt restricts access. llms.txt guides understanding. Entirely different purposes. Use both.

Do AI tools actually read llms.txt?

Perplexity, Claude, ChatGPT with browsing, and Cloudflare AI Gateway all confirm support. Adoption is growing rapidly.

What is llms-full.txt?

An optional companion that includes the full text of your key pages (not just links). Useful for documentation sites where you want AI tools to deeply understand your content.

How long should my llms.txt be?

Under 2,000 words. Focus on your most important pages. A focused 300-word llms.txt beats a 5,000-word dump every time.


See how AI bots see your site

Free scan: check which AI crawlers are active on your site, whether your robots.txt is configured correctly, and how AI-ready your pages are.
