
How to Block AI Bots on Directus: Complete 2026 Guide

Directus is a headless CMS that wraps your database in a REST + GraphQL API and provides a no-code admin panel. Unlike traditional web frameworks, Directus doesn't render your frontend HTML — it serves data. Bot blocking is applied at two levels: the robots.txt static file (Layer 1) and a TypeScript hook extension that accesses Directus's underlying Express app via the init lifecycle hook (Layers 3–4). The noai meta tag (Layer 2) belongs in your frontend application, not in Directus.

Directus architecture

  • /admin — React SPA admin panel. Block AI bots here (no real content).
  • /items/<collection> — REST API, served at the web root (Directus has no /api prefix). Block AI scrapers here.
  • /graphql — GraphQL endpoint. Block bots attempting mass data extraction.
  • /robots.txt — served from public/. Must remain accessible to all bots.
  • Your frontend (Next.js, Nuxt, SvelteKit) is separate — add noai meta tags there.

Protection layers

1. robots.txt: public/robots.txt, served automatically, overrides the Directus default
2. noai meta tag: your frontend SPA (Next.js/Nuxt/SvelteKit), not in Directus itself
3. X-Robots-Tag header: hook extension → init → Express middleware → res.setHeader()
4. Hard 403 block: hook extension → init → Express middleware → res.status(403).send()

Layer 1: robots.txt

Place robots.txt in your Directus project's public/ directory. Directus serves this directory at the web root — your file takes precedence over Directus's built-in default.

# public/robots.txt

User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
User-agent: PerplexityBot
User-agent: Diffbot
User-agent: cohere-ai
User-agent: FacebookBot
User-agent: omgili
User-agent: omgilibot
User-agent: Amazonbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /

In Docker deployments, ensure the public/ directory is either COPY'd into the image or mounted as a volume alongside your .env and extensions/ directories.
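As a sketch, the mounts might look like this in docker-compose. This assumes the official directus/directus image, whose working directory is /directus; adjust the image tag and paths to your deployment:

```yaml
# docker-compose.yml (sketch)
services:
  directus:
    image: directus/directus:10
    env_file: .env
    ports:
      - "8055:8055"
    volumes:
      - ./public:/directus/public         # robots.txt lives here
      - ./extensions:/directus/extensions # hook extension lives here
```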

Layer 2: noai meta tag

Directus does not render your website's HTML — your frontend does. Add the noai meta tag in your frontend framework's base layout:

Next.js

// app/layout.tsx
export const metadata = {
  other: { 'robots': 'noai, noimageai' },
};

Nuxt

// nuxt.config.ts
export default defineNuxtConfig({
  app: { head: { meta: [
    { name: 'robots', content: 'noai, noimageai' }
  ]}}
})

SvelteKit

<!-- src/routes/+layout.svelte -->
<svelte:head>
  <meta name="robots" content="noai, noimageai">
</svelte:head>

Layers 3 & 4: hook extension

Directus extensions live in the extensions/ directory. A hook extension exposes Directus lifecycle events — including the init hook, which provides access to the underlying Express application before it starts serving requests.

Extension file structure

extensions/
└── hooks/
    └── ai-bot-blocker/
        ├── index.ts
        └── package.json

extensions/hooks/ai-bot-blocker/index.ts

import { defineHook } from '@directus/extensions-sdk';

const AI_BOT_PATTERNS = [
  'gptbot', 'chatgpt-user', 'oai-searchbot',
  'claudebot', 'anthropic-ai', 'claude-web',
  'google-extended', 'ccbot', 'bytespider',
  'applebot-extended', 'perplexitybot', 'diffbot',
  'cohere-ai', 'facebookbot', 'meta-externalagent',
  'omgili', 'omgilibot', 'amazonbot',
  'deepseekbot', 'mistralbot', 'xai-bot', 'ai2bot',
];

const EXEMPT_PATHS = new Set(['/robots.txt', '/sitemap.xml', '/favicon.ico']);

export default defineHook(({ init }, { logger }) => {
  // init('app.before') fires before Directus registers its own middleware
  // and routes; 'app' is the underlying Express application.
  init('app.before', ({ app }) => {
    app.use((req: any, res: any, next: any) => {
      // req.path is the URL path without the query string
      if (EXEMPT_PATHS.has(req.path)) return next();

      const ua = (req.headers['user-agent'] || '').toLowerCase();

      // Layer 4: hard 403 block
      if (AI_BOT_PATTERNS.some((pattern) => ua.includes(pattern))) {
        logger.debug(`Blocked AI bot: ${req.headers['user-agent']} at ${req.path}`);
        return res.status(403).type('text/plain').send('Forbidden');
      }

      // Layer 3: X-Robots-Tag on every request that passes through;
      // headers set here persist until the response is sent
      res.setHeader('X-Robots-Tag', 'noai, noimageai');
      next();
    });
  });
});

extensions/hooks/ai-bot-blocker/package.json

{
  "name": "directus-extension-ai-bot-blocker",
  "version": "1.0.0",
  "type": "module",
  "directus:extension": {
    "type": "hook",
    "path": "index.js",
    "source": "index.ts",
    "host": "^10.0.0"
  }
}

Key points

  • init('app.before', ...): the only supported way to access the raw Express app from a Directus extension. It fires before Directus registers its own middleware and routes, so middleware added here runs first on every request.
  • app.use(...) registers standard Express middleware. Calling res.status(403).send() ends the request immediately; calling next() hands it on to Directus.
  • res.setHeader('X-Robots-Tag', ...) before next() is sufficient: in Express, headers set by earlier middleware persist until the response is sent.
  • req.path excludes the query string, so it can be compared directly against EXEMPT_PATHS (req.url would include it).
  • Build the extension with the Directus CLI before Directus can load it: npx directus-extension build from the extension directory compiles TypeScript to JavaScript.
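The user-agent check is plain case-insensitive substring matching. The sketch below isolates it as a standalone function so the behavior is easy to reason about (isAiBot is a hypothetical helper name, and the pattern list is truncated for brevity):

```typescript
// Standalone sketch of the user-agent matching used by the hook.
// The pattern list is truncated; the extension uses the full list above.
const AI_BOT_PATTERNS = ['gptbot', 'claudebot', 'ccbot', 'bytespider'];

function isAiBot(userAgent: string | undefined): boolean {
  // Missing user-agent headers are treated as "not a known AI bot"
  const ua = (userAgent ?? '').toLowerCase();
  return AI_BOT_PATTERNS.some((pattern) => ua.includes(pattern));
}

console.log(isAiBot('Mozilla/5.0 (compatible; GPTBot/1.0)')); // true
console.log(isAiBot('Mozilla/5.0 (Windows NT 10.0; Win64; x64)')); // false
```

Substring matching deliberately errs toward over-blocking: any user agent that merely contains a pattern is refused. That is acceptable for scraper defense, but worth remembering before adding short, common patterns to the list.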

Building and loading the extension

# Install Directus extension SDK (dev dependency)
cd extensions/hooks/ai-bot-blocker
npm install --save-dev @directus/extensions-sdk

# Build the extension (TS → JS)
npx directus-extension build

# Restart Directus — it automatically loads extensions in extensions/hooks/
# The extension is ready when you see:
# [extensions] Loaded hook: ai-bot-blocker

Directus auto-loads all directories under extensions/hooks/ on startup. No registration step required. After building, restart the Directus process.

Nginx alternative (self-hosted)

For self-hosted Directus behind nginx, block bots at the proxy layer — they never reach Node.js at all:

# /etc/nginx/sites-available/directus.conf
map $http_user_agent $is_ai_bot {
    default         0;
    "~*GPTBot"      1;
    "~*ClaudeBot"   1;
    "~*anthropic-ai" 1;
    "~*Google-Extended" 1;
    "~*CCBot"       1;
    "~*Bytespider"  1;
    "~*PerplexityBot" 1;
    "~*Diffbot"     1;
    "~*cohere-ai"   1;
}

server {
    listen 443 ssl;
    server_name your-directus-domain.com;

    # Layer 1 — robots.txt passthrough
    location = /robots.txt {
        proxy_pass http://localhost:8055;
    }

    # Layer 4 — hard block for all other paths
    location / {
        if ($is_ai_bot) { return 403 "Forbidden"; }

        # Layer 3 — X-Robots-Tag on all responses
        add_header X-Robots-Tag "noai, noimageai" always;

        proxy_pass http://localhost:8055;
        proxy_set_header Host $host;
    }
}
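nginx's ~* operator is a case-insensitive regex match. The same check can be reproduced in shell with grep -qiE, a rough stand-in for the map block that is handy for sanity-checking patterns before touching the live config:

```shell
# Rough stand-in for the nginx map: case-insensitive match on the user agent
ua='Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)'
if printf '%s' "$ua" | grep -qiE 'GPTBot|ClaudeBot|CCBot|Bytespider|PerplexityBot'; then
  echo "blocked"
else
  echo "allowed"
fi
```

After editing the real config, validate and apply it with `sudo nginx -t && sudo systemctl reload nginx`.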

Verification

# Layer 1 — robots.txt
curl https://your-directus.com/robots.txt

# Layer 3 — X-Robots-Tag on an API response
curl -I https://your-directus.com/items/<collection>
# Expected: X-Robots-Tag: noai, noimageai

# Layer 4 — hard block on bot user-agent
curl -A "Mozilla/5.0 (compatible; GPTBot/1.0)" -I https://your-directus.com/items/<collection>
# Expected: HTTP/1.1 403 Forbidden

# robots.txt must be exempt
curl -A "GPTBot" -I https://your-directus.com/robots.txt
# Expected: HTTP/1.1 200 OK

FAQ

How do I access the Express app from a Directus hook extension?

Use init('app.before', ({ app }) => { ... }). The 'app.before' event fires before Directus registers its own middleware and routes and gives you the raw Express application. You can call app.use() to register middleware, or app.set() to adjust Express settings. This is the only supported way to access the underlying HTTP framework from a Directus extension; there is no official Directus API for request-level middleware.

Where does Directus serve static files like robots.txt?

Place the file in public/ in your Directus project root. Directus serves all files in public/ at the web root automatically — no route definition required. In Docker, COPY the public/ directory into the container alongside your extensions/ directory and .env file. If you use a volume mount, ensure robots.txt is present in the mounted directory when the container starts.

Where do noai meta tags go in a Directus-powered site?

In your frontend application — the Next.js, Nuxt, SvelteKit, or Astro site that fetches from the Directus API. Directus serves JSON, not your website's HTML. Add the noai meta tag to your frontend's base layout. The Directus admin panel (/admin) is a React SPA you don't control without forking Directus — but it's typically not indexed by search engines and AI bots won't find useful training data there.

Will the hook extension affect the /admin panel?

Yes: Express middleware registered via init('app.before') runs for all requests, including /admin, /graphql, and every REST endpoint. This is intentional: AI bots are blocked before they reach authentication, saving resources. EXEMPT_PATHS ensures /robots.txt and /sitemap.xml remain accessible. If you need to exclude /admin from blocking (e.g. for monitoring tools that use generic user agents), add it to EXEMPT_PATHS.

Should I use the hook extension or nginx for bot blocking?

Use nginx if you self-host Directus behind a reverse proxy — bots are blocked before reaching Node.js, with lower latency and resource usage. Use the hook extension for cloud-hosted Directus (Directus Cloud, Railway, Render, Fly.io) where you don't control the reverse proxy layer. Both approaches can coexist: nginx handles the majority of bots, the extension handles any that reach the application layer.
