
How to Block AI Bots on Next.js

Next.js 13+ App Router ships native app/robots.ts support, type-safe metadata for per-page control, and edge middleware for hard blocking, giving it one of the most complete sets of AI bot controls of any web framework. Here's every method, from a static file to server-level enforcement.

Quick fix — App Router: create app/robots.ts

Next.js auto-serves this at /robots.txt. No configuration needed.

import { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      { userAgent: '*', allow: '/' },
      { userAgent: 'GPTBot', disallow: '/' },
      { userAgent: 'ChatGPT-User', disallow: '/' },
      { userAgent: 'OAI-SearchBot', disallow: '/' },
      { userAgent: 'ClaudeBot', disallow: '/' },
      { userAgent: 'anthropic-ai', disallow: '/' },
      { userAgent: 'Google-Extended', disallow: '/' },
      { userAgent: 'Bytespider', disallow: '/' },
      { userAgent: 'CCBot', disallow: '/' },
      { userAgent: 'PerplexityBot', disallow: '/' },
      { userAgent: 'meta-externalagent', disallow: '/' },
      { userAgent: 'Diffbot', disallow: '/' },
    ],
    sitemap: 'https://yourdomain.com/sitemap.xml',
  };
}
Pages Router? Use public/robots.txt instead — see Method 2 below. app/robots.ts only works with the App Router.

All Methods

app/robots.ts — App Router (Recommended)

Easy

App Router (Next.js 13+)

app/robots.ts

Native Next.js convention — returns a MetadataRoute.Robots object, type-safe, auto-served at /robots.txt. Works on Vercel and self-hosted.

Next.js 13+ App Router only. For Pages Router, use public/robots.txt instead.

public/robots.txt — Static File

Easy

App Router + Pages Router

public/robots.txt

A plain text file served as-is by the Next.js static file handler. Works with both App Router and Pages Router. No TypeScript needed.

If both app/robots.ts and public/robots.txt exist, app/robots.ts takes precedence.

metadata.robots — Per-Page noai Tag

Easy

App Router (Next.js 13+)

app/layout.tsx or any page.tsx

Set noai/noimageai globally in root layout metadata, or per-page. Next.js renders the correct <meta name="robots"> tag automatically.

For Pages Router, use next/head inside each page or _app.tsx for global application.

next.config.js headers — X-Robots-Tag

Easy

App Router + Pages Router

next.config.js (or next.config.ts)

Add X-Robots-Tag: noai, noimageai as an HTTP response header on all routes. More authoritative than the HTML meta tag — bots that scrape without rendering HTML still see it.

Applies at the HTTP layer. Combine with robots.txt for belt-and-suspenders protection.

middleware.ts — Hard Blocking

Intermediate

App Router + Pages Router

middleware.ts (project root)

Inspect the User-Agent at the Edge and return a 403 before serving any content. This is the only Next.js method that stops bots that ignore robots.txt. Runs at the Vercel Edge or on a self-hosted Node.js server.

Most powerful method. Returns 403 before application code runs — zero performance impact on real users.

Method 1: app/robots.ts

App Router only — Next.js 13+

Create app/robots.ts in your App Router project. Next.js generates /robots.txt from this file at build time (static generation) or on-demand (dynamic). No extra configuration required.

// app/robots.ts
import { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // Allow standard search crawlers
      { userAgent: '*', allow: '/' },

      // Block AI training bots
      { userAgent: 'GPTBot', disallow: '/' },
      { userAgent: 'ChatGPT-User', disallow: '/' },
      { userAgent: 'OAI-SearchBot', disallow: '/' },
      { userAgent: 'ClaudeBot', disallow: '/' },
      { userAgent: 'anthropic-ai', disallow: '/' },
      { userAgent: 'Google-Extended', disallow: '/' },
      { userAgent: 'Bytespider', disallow: '/' },
      { userAgent: 'CCBot', disallow: '/' },
      { userAgent: 'PerplexityBot', disallow: '/' },
      { userAgent: 'meta-externalagent', disallow: '/' },
      { userAgent: 'Amazonbot', disallow: '/' },
      { userAgent: 'Applebot-Extended', disallow: '/' },
      { userAgent: 'xAI-Bot', disallow: '/' },
      { userAgent: 'DeepSeekBot', disallow: '/' },
      { userAgent: 'MistralBot', disallow: '/' },
      { userAgent: 'Diffbot', disallow: '/' },
      { userAgent: 'cohere-ai', disallow: '/' },
      { userAgent: 'AI2Bot', disallow: '/' },
      { userAgent: 'Ai2Bot-Dolma', disallow: '/' },
      { userAgent: 'YouBot', disallow: '/' },
      { userAgent: 'DuckAssistBot', disallow: '/' },
      { userAgent: 'omgili', disallow: '/' },
      { userAgent: 'omgilibot', disallow: '/' },
      { userAgent: 'webzio-extended', disallow: '/' },
      { userAgent: 'gemini-deep-research', disallow: '/' },
    ],
    sitemap: `${process.env.NEXT_PUBLIC_SITE_URL ?? 'https://yourdomain.com'}/sitemap.xml`,
  };
}
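Because app/robots.ts is ordinary TypeScript, the rule list can also be computed rather than hard-coded. A minimal sketch, assuming a Vercel deployment (the VERCEL_ENV check and the shortened bot list are illustrative; the MetadataRoute.Robots return type is omitted so the snippet stands alone):

```typescript
// app/robots.ts (sketch): block everything on preview deployments,
// block only the AI bot list in production.
// In a real project, annotate the return type as MetadataRoute.Robots.
const AI_BOTS = ['GPTBot', 'ClaudeBot', 'CCBot', 'PerplexityBot']; // shortened list

export default function robots() {
  // VERCEL_ENV is 'preview' on Vercel preview deployments (assumption:
  // you deploy to Vercel; self-hosted setups need their own flag)
  if (process.env.VERCEL_ENV === 'preview') {
    return { rules: [{ userAgent: '*', disallow: '/' }] };
  }
  return {
    rules: [
      { userAgent: '*', allow: '/' },
      ...AI_BOTS.map((userAgent) => ({ userAgent, disallow: '/' })),
    ],
  };
}
```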

Method 2: public/robots.txt

App Router + Pages Router

Place a plain text robots.txt file in your public/ directory. Next.js serves it at /robots.txt with no processing. Works with both App Router and Pages Router.

User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /
app/robots.ts vs public/robots.txt: if both exist, app/robots.ts takes precedence and public/robots.txt is ignored. Use one or the other.

Method 3: metadata.robots — Per-Page noai Tag

App Router — Next.js 13+

Use Next.js's metadata API to set robots directives globally (in root layout) or per-page.

Global — app/layout.tsx:

// app/layout.tsx
import type { Metadata } from 'next';

export const metadata: Metadata = {
  // The robots field accepts a raw string, which Next.js renders
  // verbatim as <meta name="robots" content="index, follow, noai, noimageai">.
  // The typed object form (index, follow, googleBot, ...) has no fields
  // for the non-standard noai/noimageai directives, so use the string form.
  robots: 'index, follow, noai, noimageai',
};

Or add the meta tag directly in your root layout <head>:

// app/layout.tsx
export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="en">
      <head>
        <meta name="robots" content="noai, noimageai" />
      </head>
      <body>{children}</body>
    </html>
  );
}

Pages Router — _app.tsx or per-page with next/head:

// pages/_app.tsx (global) or any page file
import Head from 'next/head';

export default function MyApp({ Component, pageProps }) {
  return (
    <>
      <Head>
        <meta name="robots" content="noai, noimageai" />
      </Head>
      <Component {...pageProps} />
    </>
  );
}

Method 4: next.config.js Headers

App Router + Pages Router

Set X-Robots-Tag: noai, noimageai as an HTTP header on all responses. More authoritative than the HTML meta tag — applies even when a bot fetches pages without rendering JavaScript.

// next.config.js (or next.config.ts)
/** @type {import('next').NextConfig} */
const nextConfig = {
  async headers() {
    return [
      {
        // Apply to all routes
        source: '/(.*)',
        headers: [
          {
            key: 'X-Robots-Tag',
            value: 'noai, noimageai',
          },
        ],
      },
    ];
  },
};

module.exports = nextConfig;
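Since headers() is ordinary code, the directive can also be scoped to a subset of routes rather than the whole site. A sketch in next.config.ts (the /blog/:path* pattern is an assumption about your content paths):

```typescript
// next.config.ts: apply X-Robots-Tag only under /blog
import type { NextConfig } from 'next';

const nextConfig: NextConfig = {
  async headers() {
    return [
      {
        // :path* matches zero or more path segments below /blog
        source: '/blog/:path*',
        headers: [{ key: 'X-Robots-Tag', value: 'noai, noimageai' }],
      },
    ];
  },
};

export default nextConfig;
```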

Method 5: middleware.ts — Hard Blocking

App Router + Pages Router — runs at Edge

The most powerful method. Middleware runs before any page rendering — on Vercel it executes at the Edge, globally distributed. Bots that ignore robots.txt still get a 403 and never see your content.

// middleware.ts (project root — same level as app/ or pages/)
import { NextRequest, NextResponse } from 'next/server';

const BLOCKED_BOTS = new RegExp(
  '(GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|webzio-extended|gemini-deep-research)',
  'i'
);

export function middleware(request: NextRequest) {
  const ua = request.headers.get('user-agent') ?? '';

  if (BLOCKED_BOTS.test(ua)) {
    return new NextResponse('Forbidden', {
      status: 403,
      headers: { 'Content-Type': 'text/plain' },
    });
  }

  return NextResponse.next();
}

// Apply middleware to all routes except Next.js static assets.
// Add further exclusions to the matcher (e.g. API routes that must
// stay publicly accessible) as needed.
export const config = {
  matcher: [
    '/((?!_next/static|_next/image|favicon.ico).*)',
  ],
};
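To sanity-check the pattern without deploying, the same regex can be exercised against sample User-Agent strings (the UA strings below are illustrative, not verbatim bot signatures):

```typescript
// Same pattern as middleware.ts, tested against representative UA strings
const BLOCKED_BOTS = new RegExp(
  '(GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|webzio-extended|gemini-deep-research)',
  'i'
);

const samples: Array<[string, boolean]> = [
  ['Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.1)', true],
  ['Mozilla/5.0 (compatible; ClaudeBot/1.0)', true],
  // Regular browsers and standard search crawlers must pass through
  ['Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36', false],
  ['Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', false],
];

for (const [ua, shouldBlock] of samples) {
  console.log(`${BLOCKED_BOTS.test(ua) === shouldBlock ? 'ok' : 'MISMATCH'}: ${ua.slice(0, 40)}`);
}
```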

Why middleware is the strongest method

  • Runs before your application code — bots never reach your pages or API routes
  • Works even if bots ignore robots.txt (Bytespider, some Diffbot configurations)
  • On Vercel: executes at the Edge, globally distributed — no latency for real users
  • On Vercel, the 403 is returned from the Edge network before any serverless function is invoked, so your origin is never touched
  • No performance impact on legitimate traffic

Vercel Deployment Notes

vercel.json headers

Alternative to next.config.js headers — applies at the Vercel routing layer:

{
  "headers": [{
    "source": "/(.*)",
    "headers": [{
      "key": "X-Robots-Tag",
      "value": "noai, noimageai"
    }]
  }]
}

Edge Config (advanced)

For dynamic bot lists without redeployment, read blocked UAs from Vercel Edge Config in your middleware:

import { get } from '@vercel/edge-config';

// Inside an async middleware function, where `ua` is the request's
// User-Agent. 'blockedBots' is assumed to hold a regex-alternation
// string such as 'GPTBot|ClaudeBot|CCBot'.
const pattern = await get<string>('blockedBots');
if (pattern && new RegExp(pattern, 'i').test(ua)) {
  return new NextResponse('Forbidden', { status: 403 });
}
// Update the list in the Edge Config dashboard;
// changes apply instantly, no redeploy.

Full AI Bot Reference

All 25 AI bots in the block list above:

GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, anthropic-ai, Google-Extended, Bytespider, CCBot, PerplexityBot, meta-externalagent, Amazonbot, Applebot-Extended, xAI-Bot, DeepSeekBot, MistralBot, Diffbot, cohere-ai, AI2Bot, Ai2Bot-Dolma, YouBot, DuckAssistBot, omgili, omgilibot, webzio-extended, gemini-deep-research

Frequently Asked Questions

What's the difference between app/robots.ts and public/robots.txt in Next.js?

public/robots.txt is a static file served directly from Next.js's static file handler — simple, no TypeScript, no configuration. app/robots.ts is a Next.js App Router convention (Next.js 13+) that generates robots.txt programmatically from a TypeScript file returning a MetadataRoute.Robots object. Use app/robots.ts if you need dynamic rules (e.g. different rules per environment, pulling blocked bots from a config) or want type safety. Use public/robots.txt if you just need a static block list — it's simpler and works with both App Router and Pages Router.

How do I add a noai meta tag per page in Next.js App Router?

In Next.js 13+ App Router, use the metadata export in your page.tsx with the robots field as a raw string: export const metadata = { robots: 'index, follow, noai, noimageai' }. The string form is required because the typed robots object (index, follow, googleBot, etc.) has no fields for the non-standard noai/noimageai directives. For global application across all pages, set it in your root layout.tsx metadata export instead. Next.js renders this as <meta name="robots" content="index, follow, noai, noimageai"> in the page <head>. For Pages Router, use next/head: import Head from 'next/head'; and add <meta name="robots" content="noai, noimageai"> inside <Head>.

How does Next.js middleware block AI bots?

Create a middleware.ts file in your project root. In it, check the User-Agent header of incoming requests and return a 403 response for matched AI bot patterns. This runs at the Edge (before your application code) on Vercel, or at the Node.js server level on self-hosted deployments. The middleware approach is the most reliable blocking method because it intercepts requests before they reach your application — even if a bot ignores robots.txt, it gets a 403 and cannot read your content.

Will blocking AI bots with Next.js middleware affect Vercel performance?

No — middleware that returns early (before running application code) is extremely fast. On Vercel Edge, middleware rejecting a bot with a 403 adds near-zero latency for legitimate users since bots are filtered before any page rendering occurs. The middleware pattern with a user-agent check and early return is one of the most efficient ways to handle this. There is no performance impact on real user requests.

How do I block AI bots on Next.js deployed to Vercel?

Three options work on Vercel: (1) app/robots.ts or public/robots.txt — served automatically by Vercel, immediate effect; (2) next.config.js headers — sets X-Robots-Tag on all responses, processed server-side; (3) middleware.ts — runs at Vercel Edge, blocks bots before they hit your application. For Vercel specifically, you can also add headers in vercel.json under the 'headers' key. The middleware approach is most powerful as it enforces blocking even if bots ignore robots.txt.

Does blocking AI bots in Next.js affect Google Search or Core Web Vitals?

No. Googlebot and Bingbot are excluded from AI bot block lists — they are standard search crawlers, not AI training bots. Blocking GPTBot, ClaudeBot, CCBot, and other AI training crawlers has no effect on your Google search rankings, Core Web Vitals measurements, or sitemap discovery. Next.js's built-in sitemap support (app/sitemap.ts) continues working normally.
