Docusaurus · React · Static Site · 9 min read

How to Block AI Bots on Docusaurus: Complete 2026 Guide

Docusaurus generates a static React site — like MkDocs, it has no server process. Bot blocking splits across the content layer (robots.txt in static/, noai meta via headTags in docusaurus.config.js) and the hosting platform layer (X-Robots-Tag headers, hard 403 via Edge Functions). The headTags config approach is the simplest — no theme swizzling required.

headTags is all you need for noai meta — no swizzling

Docusaurus v2/v3 supports a headTags array in docusaurus.config.js that injects arbitrary HTML tags into every page's <head>. This is simpler than maintaining a custom Root component or swizzling Layout. Reserve those approaches for per-page conditional logic that can't be expressed with a static config entry.

Methods at a glance

| Method | What it does | Where it lives |
| --- | --- | --- |
| static/robots.txt | Signals bots which paths are off-limits | static/ → build/ |
| headTags in docusaurus.config.js | noai meta on every page | Config file |
| <head> block in MDX files | noai meta on specific pages | Individual .md/.mdx files |
| netlify.toml / vercel.json / _headers | X-Robots-Tag on all responses | Hosting platform |
| Edge Function | Hard 403 on known AI User-Agents | Netlify / Cloudflare |
| Custom Root component | Per-page conditional meta | src/theme/ (advanced) |

1. robots.txt — static/ directory

Place robots.txt in the static/ directory at the root of your Docusaurus project. Docusaurus copies the entire static/ directory into build/ unchanged. No config needed.

# static/robots.txt
# Copied to build/robots.txt by Docusaurus — no config needed

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /

# Verify after build:
npm run build
ls build/robots.txt   # should exist

static/ vs docs/ vs src/

Only files in static/ are copied to the build output as-is. Files in docs/ are Markdown content processed by Docusaurus. Files in src/ are React components. robots.txt goes in static/ — not in docs/ or src/.
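Compliant crawlers pick the User-agent group that matches their name and fall back to the * group otherwise. The sketch below (a simplified, hypothetical matcher, not a spec-complete robots.txt parser) shows why GPTBot ends up fully disallowed while browsers and unlisted crawlers fall through to the permissive wildcard group:

```javascript
// Simplified sketch of robots.txt group matching: a compliant crawler
// uses the group whose User-agent token equals its name, falling back
// to the "*" group. Ignores Allow precedence, wildcards, and
// longest-match rules on purpose.
function isDisallowed(robotsTxt, botName, path) {
  const rules = { '*': [] };   // agent name -> list of Disallow prefixes
  let agents = ['*'];
  let groupClosed = true;      // next User-agent line starts a fresh group
  for (const raw of robotsTxt.split('\n')) {
    const line = raw.replace(/#.*$/, '').trim();
    const m = line.match(/^(user-agent|disallow|allow)\s*:\s*(.*)$/i);
    if (!m) continue;
    const field = m[1].toLowerCase();
    const value = m[2].trim();
    if (field === 'user-agent') {
      if (groupClosed) { agents = []; groupClosed = false; }
      const name = value.toLowerCase();
      agents.push(name);
      rules[name] = rules[name] || [];
    } else {
      groupClosed = true;
      if (field === 'disallow' && value) {
        for (const a of agents) rules[a].push(value);
      }
    }
  }
  const group = rules[botName.toLowerCase()] ?? rules['*'];
  return group.some((prefix) => path.startsWith(prefix));
}
```

With the robots.txt above, isDisallowed(txt, 'GPTBot', '/docs/intro') is true, while an unlisted agent hits the empty wildcard rule list and is allowed everywhere.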

2. noai meta tag — headTags config

The headTags array in docusaurus.config.js (or docusaurus.config.ts) injects HTML tags into the <head> of every generated page. This is the cleanest approach — no component swizzling required.

// docusaurus.config.ts — for a plain docusaurus.config.js, drop the type import and annotation
import type { Config } from '@docusaurus/types';

const config: Config = {
  title: 'My Documentation',
  url: 'https://docs.example.com',
  baseUrl: '/',

  // Inject meta tags into every page's <head>
  headTags: [
    {
      tagName: 'meta',
      attributes: {
        name: 'robots',
        content: 'noai, noimageai',
      },
    },
  ],

  // ... rest of config
};

export default config;

headTags is available in Docusaurus v2.4+ and v3.x. For older versions, use the theme swizzling approach (Section 4).
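Conceptually, each headTags entry is a plain-object description of one HTML tag. The helper below is purely illustrative (a sketch of the mapping, not Docusaurus internals) and shows the markup a given entry corresponds to:

```javascript
// Illustrative only: renders a headTags-style entry as an HTML string,
// mirroring the { tagName, attributes } shape Docusaurus accepts.
function renderHeadTag({ tagName, attributes = {}, innerHTML = '' }) {
  const attrs = Object.entries(attributes)
    .map(([name, value]) => ` ${name}="${value}"`)
    .join('');
  // meta/link/base are void elements with no closing tag
  const voidTags = ['meta', 'link', 'base'];
  return voidTags.includes(tagName)
    ? `<${tagName}${attrs}>`
    : `<${tagName}${attrs}>${innerHTML}</${tagName}>`;
}
```

The noai entry in the config above therefore corresponds to <meta name="robots" content="noai, noimageai"> in every page's head.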

3. Per-page meta — MDX head block

In any .md or .mdx file, add a <head> block directly in the file to inject page-specific meta tags. These override or supplement the global headTags config.

---
# docs/some-page.mdx
title: My Page
description: Page description
---

<head>
  {/* Block AI indexing on this specific page */}
  <meta name="robots" content="noindex, noai, noimageai" />
</head>

# My Page

Content here...

The <head> block in MDX uses JSX syntax, so self-closing tags need />. It is processed by Docusaurus's MDX pipeline, not parsed as raw HTML.

4. Custom Root component — conditional per-page meta

For conditional logic (e.g. different robots values based on doc category or path), provide a custom Root component that wraps every page and injects the correct meta tag. Root is a special case: it is not part of @docusaurus/theme-classic, so there is nothing to swizzle. Create the file yourself and Docusaurus picks it up automatically.

// src/theme/Root.tsx (created manually; Docusaurus detects it automatically)
import React from 'react';
import Head from '@docusaurus/Head';
import { useLocation } from '@docusaurus/router';

// Path prefixes that should not be indexed
const NOINDEX_PATHS = ['/internal/', '/draft/'];

export default function Root({ children }: { children: React.ReactNode }) {
  const { pathname } = useLocation();
  const isNoIndex = NOINDEX_PATHS.some((p) => pathname.startsWith(p));

  return (
    <>
      <Head>
        <meta
          name="robots"
          content={isNoIndex ? 'noindex, noai, noimageai' : 'noai, noimageai'}
        />
      </Head>
      {children}
    </>
  );
}

Always render {children}; otherwise every page comes out blank. Reserve npx docusaurus swizzle for actual theme components (Layout, Footer, DocItem). For Root, a plain file in src/theme/ is all you need.

5. X-Robots-Tag — hosting platform

Docusaurus produces a static site — HTTP headers come from your hosting platform.

Netlify — netlify.toml

# netlify.toml
[build]
  command = "npm run build"
  publish = "build"

[[headers]]
  for = "/*"
  [headers.values]
    X-Robots-Tag = "noai, noimageai"

Vercel — vercel.json

{
  "buildCommand": "npm run build",
  "outputDirectory": "build",
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        { "key": "X-Robots-Tag", "value": "noai, noimageai" }
      ]
    }
  ]
}

Cloudflare Pages — _headers file

# static/_headers — copied to build/_headers by Docusaurus
/*
  X-Robots-Tag: noai, noimageai

GitHub Pages — no custom headers

GitHub Pages does not support custom HTTP headers. Use the headTags noai meta approach (Section 2) as your only option, or migrate to Cloudflare Pages for header + edge function support.
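Whichever platform you use, the deployed header can be spot-checked programmatically. A sketch, assuming the response headers have been collected into a plain object keyed by lowercased names (e.g. via Object.fromEntries(response.headers) with Node 18+ fetch); hasNoAiHeader is a hypothetical helper:

```javascript
// Sketch: does a response carry the noai opt-out header?
// `headers` is a plain object keyed by lowercased header names.
function hasNoAiHeader(headers) {
  const value = headers['x-robots-tag'] || '';
  return value
    .split(',')
    .map((token) => token.trim().toLowerCase())
    .includes('noai');
}
```

Run it against the live site after each deploy; a missing header usually means the platform config file was not picked up (wrong filename or wrong directory).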

6. Hard 403 — edge functions

For hard User-Agent blocking before content is served:

Netlify Edge Function

// netlify/edge-functions/block-ai-bots.js
const BLOCKED_UA = /GPTBot|ChatGPT-User|ClaudeBot|Claude-Web|anthropic-ai|CCBot|Google-Extended|PerplexityBot|Amazonbot|Bytespider|YouBot|DuckAssistBot|meta-externalagent|MistralAI-Spider|oai-searchbot/i;

export default async function handler(request, context) {
  const ua = request.headers.get("user-agent") || "";
  const path = new URL(request.url).pathname;

  if (path !== "/robots.txt" && BLOCKED_UA.test(ua)) {
    return new Response("Forbidden", { status: 403 });
  }

  return context.next();
}

export const config = { path: "/*" };
# netlify.toml — alternative declaration; the in-code `export const config`
# above already registers the function, so one of the two is enough
[[edge_functions]]
  path = "/*"
  function = "block-ai-bots"

[build]
  command = "npm run build"
  publish = "build"
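The handler's decision can be factored into a pure function and unit-tested before deploying. A sketch with a shortened UA list; shouldBlock is a hypothetical helper mirroring the handler above:

```javascript
// Mirrors the edge function's decision: block known AI user agents
// everywhere except /robots.txt, which stays readable so bots can
// still fetch the Disallow rules.
const BLOCKED_UA = /GPTBot|ChatGPT-User|ClaudeBot|Claude-Web|anthropic-ai|CCBot|Google-Extended|PerplexityBot|Amazonbot|Bytespider/i;

function shouldBlock(pathname, userAgent) {
  return pathname !== '/robots.txt' && BLOCKED_UA.test(userAgent || '');
}
```

Keeping the logic pure makes the handler itself a thin wrapper: parse the URL, call shouldBlock, and return either the 403 Response or context.next().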

Cloudflare Pages — _middleware.js

// functions/_middleware.js — at the repo root; Cloudflare Pages reads
// the top-level functions/ directory directly, so do not put it in static/
const BLOCKED_UA = /GPTBot|ClaudeBot|CCBot|PerplexityBot|Amazonbot|Bytespider/i;

export async function onRequest(context) {
  const { request, next } = context;
  const ua = request.headers.get("user-agent") || "";
  const path = new URL(request.url).pathname;

  if (path !== "/robots.txt" && BLOCKED_UA.test(ua)) {
    return new Response("Forbidden", { status: 403 });
  }

  return next();
}

7. Full docusaurus.config.js example

Complete config with headTags for noai meta, robots.txt reference, and standard Docusaurus v3 structure.

// docusaurus.config.ts
import type { Config } from '@docusaurus/types';
import type * as Preset from '@docusaurus/preset-classic';

const config: Config = {
  title: 'My Documentation',
  tagline: 'Project docs',
  url: 'https://docs.example.com',
  baseUrl: '/',
  onBrokenLinks: 'throw',
  onBrokenMarkdownLinks: 'warn',
  favicon: 'img/favicon.ico',

  // ── AI training opt-out ────────────────────────────────────────────
  // Injects into <head> of every generated page
  headTags: [
    {
      tagName: 'meta',
      attributes: {
        name: 'robots',
        content: 'noai, noimageai',
      },
    },
  ],

  presets: [
    [
      'classic',
      {
        docs: {
          sidebarPath: './sidebars.ts',
          routeBasePath: '/',
        },
        blog: false,
        theme: {
          customCss: './src/css/custom.css',
        },
      } satisfies Preset.Options,
    ],
  ],

  themeConfig: {
    navbar: {
      title: 'My Docs',
      items: [
        { to: '/', label: 'Docs', position: 'left' },
      ],
    },
  } satisfies Preset.ThemeConfig,
};

export default config;

8. Deployment quick reference

| Platform | Build command | Publish dir | Headers |
| --- | --- | --- | --- |
| Netlify | npm run build | build | netlify.toml [[headers]] |
| Vercel | npm run build | build | vercel.json headers |
| Cloudflare Pages | npm run build | build | static/_headers file |
| GitHub Pages | npm run build | build | ❌ no custom headers |
| AWS S3 + CloudFront | npm run build | build | CloudFront response headers policy |

Frequently asked questions

How do I add robots.txt to a Docusaurus site?

Place robots.txt in the static/ directory — Docusaurus copies everything from static/ to build/ unchanged. Do not put it in docs/ or src/ — only static/ is copied as-is.

How do I add noai meta tags to Docusaurus?

Use headTags in docusaurus.config.js: headTags: [{ tagName: "meta", attributes: { name: "robots", content: "noai, noimageai" } }]. This injects the meta tag into every page with no swizzling. Available in Docusaurus v2.4+ and v3.

What is theme swizzling and when should I use it?

Swizzling copies a Docusaurus theme component into your project so you can modify it. For a global noai meta tag, use headTags in config; no swizzling needed. For per-page conditional logic (e.g. different robots values on /internal/ paths), create a custom Root component at src/theme/Root.tsx. Root is not shipped by theme-classic, so you create the file directly rather than swizzling it.

How do I add X-Robots-Tag headers to a Docusaurus site?

Headers come from your host, not Docusaurus. Netlify: [[headers]] in netlify.toml. Vercel: headers in vercel.json. Cloudflare Pages: _headers file in static/ (copied to build/). GitHub Pages: not supported — use noai meta tag.

Does Docusaurus support per-page robots meta tags?

Yes. In any MDX file, add a <head> block with <meta name="robots" content="noindex, noai, noimageai" />. For programmatic control across many pages, create a custom Root component in src/theme/ and conditionally set the robots value based on the current path.

Can I block AI bots on GitHub Pages with Docusaurus?

GitHub Pages doesn't support custom headers or edge functions. Use headTags for the noai meta tag as your primary defense. For hard 403 blocking, migrate to Netlify or Cloudflare Pages (both have free tiers and support edge functions or middleware). Docusaurus deploys cleanly to both with no config changes beyond the build command and publish directory.

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.