
How to Block AI Bots on Netlify: Complete 2026 Guide

Netlify is a deployment platform that hosts any static site or SSR app — Next.js, SvelteKit, Nuxt, Astro, Gatsby, and more. Bot blocking spans two layers: Netlify CDN config (_headers file, netlify.toml) and Netlify Edge Functions — globally distributed, Deno-runtime interceptors that run before your site content is served.

Four protection layers

1. robots.txt: public/robots.txt in your publish directory — Netlify CDN serves it before any function runs
2. noai meta tag: set via your framework's meta API — Next.js Metadata, SvelteKit <svelte:head>, Nuxt useHead, Astro <head>
3. X-Robots-Tag header: _headers file (zero config) or netlify.toml [[headers]] block — applied at CDN edge
4. Hard 403 block: Netlify Edge Functions (Deno) — intercept all requests before site content, globally distributed

Layer 1: robots.txt

Netlify serves static files from your publish directory directly from its CDN — before any function or edge function runs. Your publish directory is configured in netlify.toml (or defaults to public).

# netlify.toml — confirm your publish directory
[build]
  publish = ".next"    # Next.js (Netlify's Next.js runtime)
  # publish = "out"    # Next.js with output: 'export'
  # publish = "dist"   # Astro
  # publish = "build"  # SvelteKit with adapter-static

Place robots.txt in your publish directory:

# public/robots.txt  (Next.js, Astro, Gatsby)
# static/robots.txt  (SvelteKit — copied to the build output)

User-agent: *
Allow: /

# AI training crawlers — disallow content and image use
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
User-agent: PerplexityBot
User-agent: Diffbot
User-agent: cohere-ai
User-agent: FacebookBot
User-agent: omgili
User-agent: omgilibot
Disallow: /

SvelteKit note

In SvelteKit, static/robots.txt is copied to the build output automatically. You can also generate it dynamically from src/routes/robots.txt/+server.ts.
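A minimal sketch of that dynamic route — the bot list here is abbreviated for illustration; use the full list from Layer 1:

```typescript
// src/routes/robots.txt/+server.ts — minimal sketch of a dynamic
// robots.txt endpoint (abbreviated bot list for illustration)
const AI_BOTS = ['GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider'];

export function GET(): Response {
  const body = [
    'User-agent: *',
    'Allow: /',
    '',
    // One group: several User-agent lines share a single Disallow
    ...AI_BOTS.map((bot) => `User-agent: ${bot}`),
    'Disallow: /',
    '',
  ].join('\n');

  return new Response(body, {
    headers: { 'Content-Type': 'text/plain' },
  });
}
```

The dynamic route is useful if you want to maintain the bot list in one TypeScript module and reuse it in your edge function.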

Layer 2: noai meta tag

The noai and noimageai directives are set in your framework's meta API, not in Netlify config. Examples for the most common frameworks deployed to Netlify:

Next.js (App Router)

// app/layout.tsx
export const metadata = {
  robots: 'noai, noimageai',
};

SvelteKit

<!-- src/routes/+layout.svelte -->
<svelte:head>
  <meta name="robots" content="noai, noimageai" />
</svelte:head>

Astro

---
// src/layouts/Base.astro
---
<html>
  <head>
    <meta name="robots" content="noai, noimageai" />
    <slot name="head" />
  </head>
  <body><slot /></body>
</html>

Nuxt

// nuxt.config.ts
export default defineNuxtConfig({
  app: {
    head: {
      meta: [
        { name: 'robots', content: 'noai, noimageai' }
      ]
    }
  }
})

Layer 3: X-Robots-Tag header

Netlify applies response headers at the CDN level — before your framework or functions run. Two equivalent approaches:

Option A: _headers file (simplest)

Create a _headers file in your publish directory. Netlify processes it automatically — no additional config required.

# public/_headers  (place in your publish directory)

/*
  X-Robots-Tag: noai, noimageai

# Override for specific paths (e.g. allow indexing on public API docs)
/api/*
  X-Robots-Tag: index, follow

Option B: netlify.toml [[headers]] (structured)

Lives at the repo root alongside your other Netlify config. Supports regex path matching and is version-controlled alongside build config.

# netlify.toml

[build]
  publish = "public"

# Apply X-Robots-Tag to all responses
[[headers]]
  for = "/*"
  [headers.values]
    X-Robots-Tag = "noai, noimageai"

# Override for API routes
[[headers]]
  for = "/api/*"
  [headers.values]
    X-Robots-Tag = "index, follow"

Precedence

When both _headers and netlify.toml [[headers]] define the same header for the same path, _headers takes precedence. Pick one approach and stick to it.

Layer 4: Hard 403 via Edge Functions

Netlify Edge Functions run on Netlify's global edge network using the Deno runtime — not Node.js. They intercept every request before your site content is served, making them ideal for bot blocking. The key method is context.next() — call it to pass the request through to your site, or return a Response to intercept.

Step 1: Create the edge function

// netlify/edge-functions/block-bots.ts

import type { Config, Context } from '@netlify/edge-functions';

const AI_BOTS = [
  'gptbot',
  'claudebot',
  'anthropic-ai',
  'google-extended',
  'ccbot',
  'bytespider',
  'applebot-extended',
  'perplexitybot',
  'diffbot',
  'cohere-ai',
  'facebookbot',
  'omgili',
  'omgilibot',
  'iaskspider',
  'petalbot',
  'youbot',
  'semrushbot-ai',
];

const EXEMPT_PATHS = [
  '/robots.txt',
  '/sitemap.xml',
  '/sitemap-index.xml',
  '/favicon.ico',
];

export default async function handler(
  request: Request,
  context: Context,
) {
  const url = new URL(request.url);
  const { pathname } = url;

  // Always allow access to these paths — bots need robots.txt
  if (EXEMPT_PATHS.some(path => pathname.startsWith(path))) {
    return context.next();
  }

  const ua = (request.headers.get('user-agent') ?? '').toLowerCase();
  const isAIBot = AI_BOTS.some(bot => ua.includes(bot));

  if (isAIBot) {
    return new Response('Forbidden', {
      status: 403,
      headers: { 'Content-Type': 'text/plain' },
    });
  }

  return context.next();
}

// Register for all paths (inline config — no netlify.toml entry needed)
export const config: Config = {
  path: '/*',
};

Deno runtime — what changes vs Node.js

  • No require() — use ES module import
  • No Node.js built-ins (fs, path, etc.) — use standard Web APIs
  • Request, Response, URL are global — no imports needed
  • TypeScript supported natively — no ts-node or build step
  • None of this matters for bot blocking — it only needs string matching on request headers
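Because the runtime exposes only standard Web APIs, the matching logic can be exercised outside Netlify entirely. A stand-alone sketch with context.next() stubbed out (abbreviated bot list):

```typescript
// Stand-alone check of the UA-matching logic — no Netlify involved.
// decide() returns a 403 Response for AI bots, or null to signal
// "fall through to context.next()".
const AI_BOTS = ['gptbot', 'claudebot', 'ccbot', 'bytespider'];

function decide(request: Request): Response | null {
  const ua = (request.headers.get('user-agent') ?? '').toLowerCase();
  return AI_BOTS.some((bot) => ua.includes(bot))
    ? new Response('Forbidden', { status: 403 })
    : null;
}

const bot = decide(
  new Request('https://example.com/', {
    headers: { 'user-agent': 'Mozilla/5.0; compatible; GPTBot/1.1' },
  }),
);
console.log(bot?.status); // 403 — blocked

const human = decide(
  new Request('https://example.com/', {
    headers: { 'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X)' },
  }),
);
console.log(human); // null — passes through
```

Because Request and Response are global in both Deno and Node 18+, this runs unchanged in either runtime.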

Step 2: Register in netlify.toml (alternative to inline config)

If you prefer to manage routing centrally in netlify.toml rather than using the inline config export:

# netlify.toml

[[edge_functions]]
  path = "/*"
  function = "block-bots"

Combined: 403 block + X-Robots-Tag in one function

// netlify/edge-functions/block-bots.ts — AI_BOTS, EXEMPT_PATHS, and the
// Context import are the same as in Step 1
export default async function handler(request: Request, context: Context) {
  const { pathname } = new URL(request.url);

  if (EXEMPT_PATHS.some(p => pathname.startsWith(p))) {
    return context.next();
  }

  const ua = (request.headers.get('user-agent') ?? '').toLowerCase();
  const isAIBot = AI_BOTS.some(bot => ua.includes(bot));

  if (isAIBot) {
    return new Response('Forbidden', { status: 403 });
  }

  // Add X-Robots-Tag to all legitimate responses
  const response = await context.next();
  response.headers.set('X-Robots-Tag', 'noai, noimageai');
  return response;
}

Note: await context.next() fetches the actual site response, then you can mutate headers before returning. This is slightly more overhead than the _headers approach — use the _headers file for static header injection and Edge Functions only for the 403 blocking logic.

Netlify Functions vs Edge Functions for bot blocking

Feature                    Edge Functions          Netlify Functions
Runtime                    Deno (edge)             Node.js (Lambda)
Cold start                 ~0ms                    100-500ms
Intercepts page requests   ✅ Yes — runs for /*    ❌ No — API routes only
Ideal for bot blocking     ✅ Yes                  ❌ No
Access to request UA       ✅ Yes                  ✅ Yes (but too late)
Mutate response headers    ✅ Yes                  ✅ Yes (own response)

Use Edge Functions for bot blocking. Netlify Functions (Lambda-style) are for building API endpoints — they cannot intercept page navigation requests.

Verify your setup

# Check robots.txt is served correctly
curl https://your-site.netlify.app/robots.txt

# Verify X-Robots-Tag header
curl -I https://your-site.netlify.app/

# Test AI bot gets 403
curl -I -A "GPTBot/1.1" https://your-site.netlify.app/

# Verify bots can still access robots.txt (should be 200)
curl -I -A "GPTBot/1.1" https://your-site.netlify.app/robots.txt

# Check edge function logs in Netlify dashboard
# Site → Functions → block-bots → Logs

After deploying, open Netlify Dashboard → Functions → block-bots → Logs to see real-time edge function invocations. Filter for 403 responses to audit which bots are being blocked.
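The curl checks above can also be scripted. A hypothetical helper (file name and structure are illustrative) that takes a fetch function as a parameter, so it can be exercised against a stub before pointing it at your real deployment:

```typescript
// verify-blocking.ts — hypothetical post-deploy smoke test.
// Returns a list of failures; an empty array means all checks passed.
type Fetcher = (url: string, init?: RequestInit) => Promise<Response>;

export async function checkBlocking(
  site: string,
  doFetch: Fetcher = fetch,
): Promise<string[]> {
  const cases = [
    { path: '/', ua: 'GPTBot/1.1', want: 403 },           // bot blocked on pages
    { path: '/robots.txt', ua: 'GPTBot/1.1', want: 200 }, // but allowed robots.txt
    { path: '/', ua: 'Mozilla/5.0', want: 200 },          // humans pass through
  ];
  const failures: string[] = [];
  for (const { path, ua, want } of cases) {
    const res = await doFetch(`${site}${path}`, {
      headers: { 'user-agent': ua },
    });
    if (res.status !== want) {
      failures.push(`${ua} ${path}: got ${res.status}, want ${want}`);
    }
  }
  return failures;
}

// Usage against a real deployment:
//   const failures = await checkBlocking('https://your-site.netlify.app');
//   if (failures.length > 0) console.error(failures);
```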

FAQ

What is the _headers file and how is it different from netlify.toml?

The _headers file is placed in your publish directory (e.g. public/_headers) and uses a plain path/header format. netlify.toml [[headers]] achieves the same result from the repo root. Both are processed by Netlify CDN before any function runs. Use _headers for simplicity, netlify.toml if you already use it for other Netlify config.

What is the difference between Netlify Edge Functions and Netlify Functions for blocking bots?

Netlify Functions are AWS Lambda-style — they handle API routes only (/api/*), not site-wide middleware. Edge Functions run at Netlify's global edge nodes before routing and can intercept any request. Use Edge Functions for bot blocking, not Netlify Functions.

Do Netlify Edge Functions use Node.js or Deno?

Netlify Edge Functions use the Deno runtime. You use standard Web APIs (Request, Response, URL) instead of Node.js modules. No require() — use ES module import. For bot blocking this makes no difference in practice.

Should I register my edge function in netlify.toml or use the config export?

Both work. The inline config export (export const config: Config = { path: "/*" }) in the edge function file is the newer approach — no netlify.toml [[edge_functions]] entry needed. The netlify.toml approach works if you prefer centralised routing config.

Will blocking AI bots in an Edge Function affect my site performance?

No meaningful impact. Edge Functions run at Netlify's global edge nodes with near-zero cold start, so the user-agent check typically adds only a few milliseconds for legitimate visitors. Blocked bots never reach your origin, which reduces load.

Can I use the same bot-blocking code on both Netlify and Vercel?

The core logic (UA matching, EXEMPT_PATHS) is identical. The wrapper differs: Vercel uses NextRequest/NextResponse in middleware.ts, Netlify uses Request/Response Web APIs with context.next() in netlify/edge-functions/. Extract the detection logic into a shared utility and wrap it platform-specifically.
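A sketch of that shared utility — module path and bot list are illustrative, and the list is abbreviated (use the full list from Layer 4):

```typescript
// shared/ai-bots.ts — hypothetical platform-neutral detection module;
// both the Netlify and Vercel wrappers import isAiBot() and keep only
// the request/response glue per platform.
export const AI_BOTS = [
  'gptbot',
  'claudebot',
  'anthropic-ai',
  'google-extended',
  'ccbot',
  'bytespider',
  'perplexitybot',
];

export function isAiBot(userAgent: string | null): boolean {
  const ua = (userAgent ?? '').toLowerCase();
  return AI_BOTS.some((bot) => ua.includes(bot));
}

// Netlify wrapper (netlify/edge-functions/block-bots.ts):
//   if (isAiBot(request.headers.get('user-agent'))) {
//     return new Response('Forbidden', { status: 403 });
//   }
//   return context.next();
//
// Vercel wrapper (middleware.ts):
//   if (isAiBot(request.headers.get('user-agent'))) {
//     return new NextResponse('Forbidden', { status: 403 });
//   }
//   return NextResponse.next();
```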

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.