
How to Block AI Bots on Vercel: Complete 2026 Guide

Vercel is a deployment platform, not a framework — which means your bot blocking strategy spans two layers: Vercel infrastructure (vercel.json, Edge Middleware, Vercel Firewall) and your framework code (Next.js Metadata API, SvelteKit hooks, etc.). This guide covers the Vercel-native mechanisms that work regardless of which framework you deploy.

Four protection layers

  1. robots.txt: public/robots.txt (CDN-served) or Next.js app/robots.ts programmatic route
  2. noai meta tag: Next.js Metadata API robots field, per-page or site-wide in layout.tsx
  3. X-Robots-Tag header: vercel.json headers array (CDN-level) or next.config.ts headers() function
  4. Hard 403 block: Edge Middleware (middleware.ts), which runs at the Vercel Edge before your app

Layer 1: robots.txt

Vercel serves files from your framework's public directory directly from the CDN edge — no Lambda invocation, no cold start. For Next.js, that is public/robots.txt. For SvelteKit it is static/robots.txt. For Astro it is public/robots.txt. Drop the file and Vercel handles the rest.

Option A: Static file (any framework)

Place in your framework's public/static directory. Vercel CDN serves it with no app code involved.

# public/robots.txt  (Next.js, Astro, Remix, Nuxt)
# static/robots.txt  (SvelteKit)

User-agent: *
Allow: /

# AI training crawlers
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
User-agent: PerplexityBot
User-agent: Diffbot
User-agent: cohere-ai
User-agent: FacebookBot
User-agent: omgili
User-agent: omgilibot
Disallow: /

Option B: Next.js app/robots.ts (programmatic)

Next.js 13.3+ generates robots.txt from a TypeScript file. Useful when the disallowed paths come from environment variables or a database.

// app/robots.ts
import type { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  const aiCrawlers = [
    'GPTBot', 'ClaudeBot', 'anthropic-ai', 'Google-Extended',
    'CCBot', 'Bytespider', 'Applebot-Extended', 'PerplexityBot',
    'Diffbot', 'cohere-ai', 'FacebookBot', 'omgili', 'omgilibot',
  ];

  return {
    rules: [
      { userAgent: '*', allow: '/' },
      ...aiCrawlers.map(agent => ({ userAgent: agent, disallow: '/' })),
    ],
    sitemap: 'https://example.com/sitemap.xml',
  };
}

Next.js renders this at build time (SSG) and serves it as a static file — no runtime overhead.
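
The environment-variable case mentioned above might look like the following sketch. The helper names (parseDisallowPaths, buildAiRules) and the variable AI_DISALLOW_PATHS are hypothetical, and the local RobotsRule type is a stand-in for the rule shape in MetadataRoute.Robots so the snippet runs without Next.js installed.

```typescript
// Stand-in for one entry of MetadataRoute.Robots['rules'] (assumption: only
// the fields used here are modelled).
type RobotsRule = {
  userAgent: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};

// Shortened list for illustration; use the full list from Option A in practice.
const AI_CRAWLERS = ['GPTBot', 'ClaudeBot', 'CCBot', 'Google-Extended'];

// Parse a comma-separated list such as "/blog,/docs" into clean paths.
// An empty or missing value falls back to blocking the whole site.
function parseDisallowPaths(raw: string | undefined): string[] {
  const paths = (raw ?? '')
    .split(',')
    .map(p => p.trim())
    .filter(p => p.startsWith('/'));
  return paths.length > 0 ? paths : ['/'];
}

// Build the rules array: every other agent is allowed, AI crawlers are not.
function buildAiRules(rawPaths: string | undefined): RobotsRule[] {
  return [
    { userAgent: '*', allow: '/' },
    { userAgent: AI_CRAWLERS, disallow: parseDisallowPaths(rawPaths) },
  ];
}
```

In app/robots.ts this would be used as `rules: buildAiRules(process.env.AI_DISALLOW_PATHS)`. Because the file is rendered at build time, the variable has to be available to the build, not only at runtime.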

Layer 2: noai meta tag

The noai and noimageai directives tell compliant AI crawlers not to use page content or images for training. Set this in your framework's meta API. The examples below use Next.js — see your framework's docs for SvelteKit (<svelte:head>), Nuxt (useHead), or Astro (<head>).

Site-wide (Next.js layout.tsx)

// app/layout.tsx
import type { Metadata } from 'next';

export const metadata: Metadata = {
  // Next.js injects this as <meta name="robots" content="noai, noimageai">
  robots: 'noai, noimageai',

  // ... your other metadata
  title: { default: 'My Site', template: '%s | My Site' },
};

Per-page override (Next.js page.tsx)

Page-level metadata merges with layout metadata. The page value wins for duplicate keys.

// app/blog/[slug]/page.tsx
import type { Metadata } from 'next';

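// getPost is your own data-fetching helper (not shown here).
// In Next.js 15+, params is a Promise: type it as Promise<{ slug: string }>
// and await it before reading slug.
export async function generateMetadata({
  params,
}: {
  params: { slug: string };
}): Promise<Metadata> {
  const post = await getPost(params.slug);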

  return {
    title: post.title,
    // Override to allow indexing for specific pages
    robots: post.isPublic ? 'index, follow' : 'noai, noimageai',
  };
}

Layer 3: X-Robots-Tag header

The X-Robots-Tag response header signals the same directives as the meta tag but applies to all content types — including PDFs, JSON API responses, and images. On Vercel you have three options.

Option A: vercel.json headers (recommended — no code)

Vercel's CDN adds these headers to every matching response without any application code. Works with any framework. Changing a header value needs no code change, only a new deployment.

// vercel.json
{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        {
          "key": "X-Robots-Tag",
          "value": "noai, noimageai"
        }
      ]
    }
  ]
}

Option B: next.config.ts headers() (Next.js — conditional)

Use this when you need conditional logic — different headers per path, or values loaded from environment variables.

// next.config.ts
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  async headers() {
    return [
      {
        // Apply to all routes
        source: '/:path*',
        headers: [
          {
            key: 'X-Robots-Tag',
            value: 'noai, noimageai',
          },
        ],
      },
      {
        // Override for API routes — allow AI to access public API
        source: '/api/:path*',
        headers: [
          {
            key: 'X-Robots-Tag',
            value: 'index, follow',
          },
        ],
      },
    ];
  },
};

export default nextConfig;

Option C: Edge Middleware

You can set X-Robots-Tag in middleware.ts alongside the 403 block. The combined approach is shown under Layer 4 below.

Layer 4: Hard 403 via Edge Middleware

Vercel Edge Middleware is the most powerful tool for bot blocking. It runs on Vercel's global Edge Network — not as a serverless Lambda — so there is no cold start and overhead is typically 1–5ms. The middleware intercepts the request before your framework (Next.js, SvelteKit, Nuxt, etc.) ever executes.

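Create middleware.ts at your project root (same level as package.json). If your Next.js project uses a src/ directory, put the file in src/ instead, at the same level as app/ or pages/:

// middleware.ts  (project root, or src/ if your project uses a src/ directory)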
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const AI_BOTS = [
  'gptbot',
  'claudebot',
  'anthropic-ai',
  'google-extended',
  'ccbot',
  'bytespider',
  'applebot-extended',
  'perplexitybot',
  'diffbot',
  'cohere-ai',
  'facebookbot',
  'omgili',
  'omgilibot',
  'iaskspider',
  'petalbot',
  'youbot',
  'semrushbot-ai',
];

const EXEMPT_PATHS = [
  '/robots.txt',
  '/sitemap.xml',
  '/sitemap-index.xml',
  '/favicon.ico',
];

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl;

  // Never block these paths: bots must be able to fetch robots.txt to see the rules
  if (EXEMPT_PATHS.some(path => pathname.startsWith(path))) {
    return NextResponse.next();
  }

  const ua = (request.headers.get('user-agent') ?? '').toLowerCase();
  const isAIBot = AI_BOTS.some(bot => ua.includes(bot));

  if (isAIBot) {
    return new Response('Forbidden', {
      status: 403,
      headers: { 'Content-Type': 'text/plain' },
    });
  }

  return NextResponse.next();
}

// Only run middleware on page requests, not static assets
export const config = {
  matcher: [
    /*
     * Match all request paths EXCEPT:
     * - _next/static  (Next.js static assets)
     * - _next/image   (Next.js image optimization)
     * - favicon.ico   (browser favicon request)
     */
    '/((?!_next/static|_next/image|favicon.ico).*)',
  ],
};
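
To see which paths the matcher above selects, the pattern can be approximated locally as an anchored regular expression. This is only a sketch: it assumes the matcher behaves like a full-path regex match, the real matching is done by Next.js, and shouldRunMiddleware is a hypothetical helper.

```typescript
// Approximation of the matcher string, anchored at both ends (assumption).
const MATCHER = /^\/((?!_next\/static|_next\/image|favicon.ico).*)$/;

// True when middleware would run for this pathname under that assumption.
function shouldRunMiddleware(pathname: string): boolean {
  return MATCHER.test(pathname);
}

const samples = ['/', '/blog/post', '/robots.txt', '/_next/static/app.js', '/favicon.ico'];
for (const path of samples) {
  console.log(path, shouldRunMiddleware(path));
}
```

Note that /robots.txt still passes through the matcher, which is why the EXEMPT_PATHS check inside the middleware is needed.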

Framework compatibility

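The same pattern works with any framework deployed to Vercel (Next.js, SvelteKit, Nuxt, Astro, Remix), but the file above is Next.js-specific as written: NextRequest, NextResponse, and request.nextUrl all come from the next package. For other frameworks, use the standard Web API Request and Response types, read the path with new URL(request.url).pathname, and replace NextResponse.next() with the pass-through mechanism described in Vercel's middleware docs.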
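
For a non-Next.js project, the same check can be written against the standard Request and Response types. The following is a minimal sketch: the shortened bot list is illustrative, isBlocked and blockAiBots are hypothetical names, and it assumes the caller treats an undefined return value as "continue to the app". How a pass-through is signalled depends on your setup, so check the relevant middleware docs.

```typescript
const AI_BOTS = ['gptbot', 'claudebot', 'ccbot', 'bytespider'];
const EXEMPT_PATHS = new Set(['/robots.txt', '/sitemap.xml', '/favicon.ico']);

// Pure decision function: easy to unit test without a server.
function isBlocked(pathname: string, userAgent: string | null): boolean {
  if (EXEMPT_PATHS.has(pathname)) return false;
  const ua = (userAgent ?? '').toLowerCase();
  return AI_BOTS.some(bot => ua.includes(bot));
}

// Returns a 403 for AI bots, or undefined to let the request continue
// (the undefined convention is an assumption of this sketch).
function blockAiBots(request: Request): Response | undefined {
  const { pathname } = new URL(request.url);
  if (isBlocked(pathname, request.headers.get('user-agent'))) {
    return new Response('Forbidden', {
      status: 403,
      headers: { 'Content-Type': 'text/plain' },
    });
  }
  return undefined;
}
```

Keeping the decision in a pure function means the blocking rules can be tested in isolation and reused from whichever middleware entry point your framework provides.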

Combined middleware: 403 + X-Robots-Tag

Set the header on legitimate traffic, block bots — all in one middleware function.

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl;

  if (EXEMPT_PATHS.some(path => pathname.startsWith(path))) {
    return NextResponse.next();
  }

  const ua = (request.headers.get('user-agent') ?? '').toLowerCase();
  const isAIBot = AI_BOTS.some(bot => ua.includes(bot));

  if (isAIBot) {
    return new Response('Forbidden', { status: 403 });
  }

  // Add X-Robots-Tag to all legitimate responses
  const response = NextResponse.next();
  response.headers.set('X-Robots-Tag', 'noai, noimageai');
  return response;
}

Vercel Firewall (Pro/Enterprise — no code)

Vercel Firewall is a WAF (Web Application Firewall) available on Pro and Enterprise plans. You create rules in the Vercel dashboard — no code, no redeploy. Rules run before Edge Middleware in the request pipeline, making them the earliest possible blocking point.

Create a Firewall rule

  1. Go to your project in the Vercel dashboard → Security → Firewall
  2. Click Add Rule
  3. Set condition: User-Agent contains GPTBot
  4. Set action: Block
  5. Repeat for each bot (or use regex for multiple bots in one rule)
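
For the single-rule regex approach in step 5, the pattern can be generated from your bot list rather than typed by hand. A sketch, assuming the Firewall's regex condition accepts a plain alternation; escapeRegex and buildBotPattern are hypothetical helpers, and the dashboard's own case-sensitivity behavior is not modelled here.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Join bot names into one alternation, e.g. "GPTBot|ClaudeBot|CCBot".
function buildBotPattern(bots: string[]): string {
  return bots.map(escapeRegex).join('|');
}

const pattern = buildBotPattern(['GPTBot', 'ClaudeBot', 'Google-Extended', 'CCBot']);
console.log(pattern);

// Local sanity check before pasting the pattern into a rule.
const matcher = new RegExp(pattern, 'i');
console.log(matcher.test('Mozilla/5.0 (compatible; GPTBot/1.1)'));
```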

Firewall rules via vercel.json (preview)

Vercel is rolling out firewall.rules in vercel.json for code-based WAF configuration. Check Vercel docs for current availability.

// vercel.json (firewall.rules — check Vercel docs for availability)
{
  "firewall": {
    "rules": [
      {
        "name": "Block AI training crawlers",
        "description": "Block known AI bot user agents",
        "conditionGroup": [
          {
            "conditions": [
              {
                "type": "user_agent",
                "op": "re",
                "value": "GPTBot|ClaudeBot|anthropic-ai|Google-Extended|CCBot|Bytespider"
              }
            ]
          }
        ],
        "action": { "type": "block" }
      }
    ]
  }
}

Dynamic bot list with Edge Config

To update your bot list without redeploying, store it in Vercel Edge Config. Updates propagate globally within seconds (Vercel's docs list the current propagation time), which is useful for quickly adding newly discovered bots.

// 1. Install the SDK
// npm install @vercel/edge-config

// 2. Add to your Edge Config via Vercel dashboard or CLI:
// Key: "aiBots"
// Value: ["gptbot","claudebot","google-extended","ccbot","bytespider"]

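// 3. In middleware.ts (requires the EDGE_CONFIG connection string env var)
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
import { get } from '@vercel/edge-config';

// AI_BOTS and EXEMPT_PATHS are the constants from the Layer 4 middleware

export async function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl;
  if (EXEMPT_PATHS.some(p => pathname.startsWith(p))) return NextResponse.next();

  // Fetch the bot list from Edge Config (fast, globally replicated).
  // Fall back to the hardcoded list if the key is missing or the read fails.
  let aiBots = AI_BOTS;
  try {
    aiBots = (await get<string[]>('aiBots')) ?? AI_BOTS;
  } catch {
    // Keep the hardcoded list so a config read error never breaks the site
  }

  const ua = (request.headers.get('user-agent') ?? '').toLowerCase();
  const isAIBot = aiBots.some(bot => ua.includes(bot));

  if (isAIBot) return new Response('Forbidden', { status: 403 });
  return NextResponse.next();
}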

Edge Config reads are billed per request on Vercel — for most sites, the cost is negligible. For very high-traffic sites, cache the bot list in middleware memory.
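
A module-level cache for the bot list could look like the sketch below. TtlCache, getBotList, and the 60-second TTL are assumptions rather than anything Vercel prescribes, and fetchList stands in for a call such as get<string[]>('aiBots'). Module state in middleware only survives while a given runtime instance stays warm, so treat this as a best-effort optimization.

```typescript
// Holds one value and forgets it after ttlMs milliseconds.
class TtlCache<T> {
  private value: T | undefined = undefined;
  private expiresAt = 0;
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached value, or undefined if nothing is cached or it expired.
  read(now: number): T | undefined {
    return now < this.expiresAt ? this.value : undefined;
  }

  write(value: T, now: number): void {
    this.value = value;
    this.expiresAt = now + this.ttlMs;
  }
}

const botListCache = new TtlCache<string[]>(60_000); // 60 s TTL (assumption)

// Serve the cached list while it is fresh; otherwise fetch and store it.
async function getBotList(
  fetchList: () => Promise<string[] | undefined>,
  fallback: string[],
  now: number = Date.now(),
): Promise<string[]> {
  const cached = botListCache.read(now);
  if (cached) return cached;
  const fresh = (await fetchList()) ?? fallback;
  botListCache.write(fresh, now);
  return fresh;
}
```

In the middleware above, the Edge Config read would become something like `await getBotList(() => get<string[]>('aiBots'), AI_BOTS)`. The trade-off is that a newly added bot can take up to one TTL longer to be blocked on warm instances.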

Verify your setup

# Check robots.txt is served correctly
curl https://your-site.vercel.app/robots.txt

# Verify X-Robots-Tag header on a page
curl -I https://your-site.vercel.app/

# Test that a known bot UA gets a 403
curl -I -A "GPTBot/1.1" https://your-site.vercel.app/

# Verify bots can still read robots.txt (should be 200, not 403)
curl -I -A "GPTBot/1.1" https://your-site.vercel.app/robots.txt

# Stream logs for a deployment and look for blocked requests
vercel logs your-site.vercel.app --json | grep 403

After deploying, check Vercel Dashboard → Logs to see blocked bot requests in real time. Filter by status code 403 to audit which bots are hitting your site.
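
The curl checks above can also be scripted, for example as a post-deploy smoke test. A sketch under stated assumptions: verifyBotBlocking, FetchLike, and CheckResult are hypothetical names, and the fetch implementation is injected so the logic can be exercised offline with a stub.

```typescript
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<Response>;

type CheckResult = { name: string; ok: boolean };

// Runs the same three checks as the curl commands and reports each result.
async function verifyBotBlocking(
  baseUrl: string,
  fetchImpl: FetchLike,
): Promise<CheckResult[]> {
  const botHeaders = { 'user-agent': 'GPTBot/1.1' };

  const page = await fetchImpl(`${baseUrl}/`);
  const botPage = await fetchImpl(`${baseUrl}/`, { headers: botHeaders });
  const botRobots = await fetchImpl(`${baseUrl}/robots.txt`, { headers: botHeaders });

  return [
    {
      name: 'X-Robots-Tag present on pages',
      ok: (page.headers.get('x-robots-tag') ?? '').includes('noai'),
    },
    { name: 'bot user agent gets 403', ok: botPage.status === 403 },
    { name: 'bot can still read robots.txt', ok: botRobots.status === 200 },
  ];
}
```

Against a live deployment you would pass the global fetch, for example `verifyBotBlocking('https://your-site.vercel.app', fetch)`, and fail the pipeline if any result has ok set to false.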

FAQ

Does vercel.json headers apply to all frameworks, not just Next.js?

Yes. vercel.json is Vercel platform config — it applies to any framework you deploy (Next.js, SvelteKit, Nuxt, Astro, Remix, etc.). The headers array is processed by Vercel infrastructure before your app code runs.

What is the difference between vercel.json headers and Edge Middleware for blocking bots?

vercel.json headers adds response headers (like X-Robots-Tag) but cannot block requests — it always returns your app response. Edge Middleware can intercept and return a 403 before your app executes. Use vercel.json headers for X-Robots-Tag signaling and Edge Middleware for hard 403 blocking.

Does Edge Middleware add latency to every request?

Vercel Edge Middleware runs on Vercel's global Edge Network with near-zero cold start. Typical overhead is 1-5ms. For legitimate visitors the overhead is negligible. For blocked bots they receive a 403 immediately without your app ever running.

Should I use the matcher config in middleware.ts?

Yes. Without matcher, middleware runs on every request including _next/static asset requests (JS bundles, CSS, images). The recommended matcher excludes _next/static, _next/image, and favicon.ico so static asset serving is unaffected.

What is Vercel Firewall and how does it differ from Edge Middleware?

Vercel Firewall (Pro/Enterprise) is a WAF configured in the Vercel dashboard — no code required. You create rules matching User-Agent, IP, country, etc. It runs before Edge Middleware in the request pipeline. Edge Middleware is code-based and available on all plans including Hobby.

Can I update the bot list without redeploying my app?

Yes, using Vercel Edge Config. Store your bot list as an Edge Config item and read it in middleware with @vercel/edge-config. Edge Config updates propagate globally within seconds, with no redeployment.
