NGINX Unit · Application Server · JSON Config API · 8 min read

How to Block AI Bots on NGINX Unit: Complete 2026 Guide

NGINX Unit is a modern, polyglot application server from NGINX Inc. Unlike traditional web servers, it is configured entirely through a JSON REST API — no config file syntax, no restarts. It runs Python, Node.js, Go, PHP, Ruby, Java, and Perl applications natively. Bot blocking uses Unit's routing system with header match conditions and return actions.

How Unit routing works
Bot blocking via header match
Applying config via the control API
X-Robots-Tag via response_headers
Serving robots.txt as a static file
Full unit.json example
Docker deployment
FAQ

How Unit routing works

NGINX Unit processes requests through routes — a JSON array of steps evaluated in order. Each step has an optional match object (conditions) and an action object (what to do). The first step whose match passes wins.

{
  "routes": [
    {
      "match": { /* conditions */ },
      "action": { /* pass / return / share */ }
    },
    {
      /* no match = matches everything (default route) */
      "action": { "pass": "applications/myapp" }
    }
  ]
}

For bot blocking, add a step with a match on the User-Agent header and an action of {"return": 403} — placed before the application pass step.

Bot blocking via header match

Wildcard pattern matching (simplest)

{
  "routes": [
    {
      "match": {
        "headers": {
          "User-Agent": [
            "*GPTBot*",
            "*ClaudeBot*",
            "*anthropic-ai*",
            "*CCBot*",
            "*Google-Extended*",
            "*AhrefsBot*",
            "*Bytespider*",
            "*Amazonbot*",
            "*Diffbot*",
            "*FacebookBot*",
            "*cohere-ai*",
            "*PerplexityBot*",
            "*YouBot*"
          ]
        }
      },
      "action": {
        "return": 403
      }
    },
    {
      "action": {
        "pass": "applications/myapp"
      }
    }
  ]
}

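Wildcard matching: In NGINX Unit, * in a match string matches any sequence of characters (including none). "*GPTBot*" matches any User-Agent containing GPTBot anywhere in the string. Array values in a match condition use OR logic: if any pattern matches, the condition is true. One caveat: Google-Extended is a robots.txt control token, not a User-Agent string that Google sends in requests, so that pattern will not match real traffic; handle it in robots.txt instead.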
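To sanity-check a pattern list before loading it, the wildcard and OR semantics described above can be approximated locally with shell case globs, where * also matches any run of characters, including none. This is only a sketch and not Unit's own matcher: the function name ua_is_blocked is made up for illustration, shell globs are case-sensitive, and they give special meaning to ? and [ (which none of these patterns use).

```shell
# ua_is_blocked: hypothetical helper. Exits 0 when the User-Agent string
# contains any bot name from the wildcard list above (OR logic), 1 otherwise.
ua_is_blocked() {
  case "$1" in
    *GPTBot*|*ClaudeBot*|*anthropic-ai*|*CCBot*|*Google-Extended*) return 0 ;;
    *AhrefsBot*|*Bytespider*|*Amazonbot*|*Diffbot*|*FacebookBot*) return 0 ;;
    *cohere-ai*|*PerplexityBot*|*YouBot*) return 0 ;;
  esac
  return 1
}

# Try a crawler-style string, a bare bot name (the * on each side may match
# nothing), and an ordinary browser string.
for ua in "Mozilla/5.0 (compatible; GPTBot/1.0)" "GPTBot" "Mozilla/5.0 (X11; Linux x86_64)"; do
  if ua_is_blocked "$ua"; then
    echo "would block: $ua"
  else
    echo "would pass:  $ua"
  fi
done
```

Keeping the names in one place like this also makes it easier to keep the JSON array and any local checks in sync.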

Regex matching (more precise)

{
  "routes": [
    {
      "match": {
        "headers": {
          "User-Agent": "~(?i)(GPTBot|ClaudeBot|anthropic-ai|CCBot|Google-Extended|AhrefsBot|Bytespider|Amazonbot|Diffbot|FacebookBot|cohere-ai|PerplexityBot|YouBot)"
        }
      },
      "action": {
        "return": 403
      }
    },
    {
      "action": {
        "pass": "applications/myapp"
      }
    }
  ]
}

Regex prefix: In NGINX Unit match conditions, prefix a string with ~ to use it as a PCRE regular expression. The (?i) flag makes the match case-insensitive. Without the ~ prefix, the string is treated as a literal pattern with * wildcards only.
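A long alternation is easy to mistype by hand. As a rough sketch, the value can be assembled from a list of names and tried locally with grep. The helper name build_ua_regex is hypothetical, and grep -E is only a stand-in for PCRE: it has no inline (?i) flag, so the check below strips the prefix and uses -i instead.

```shell
# build_ua_regex: hypothetical helper. Joins its arguments into the
# "~(?i)(a|b|c)" form used in the route above. Names are inserted as-is,
# so any regex metacharacters in a name would need escaping first.
build_ua_regex() {
  joined=$(printf '%s|' "$@")
  printf '~(?i)(%s)\n' "${joined%|}"
}

build_ua_regex GPTBot ClaudeBot CCBot

# Local check of the alternation with grep -E. ERE has no inline (?i) flag,
# so the "~(?i)" prefix is stripped and -i is used instead.
pattern=$(build_ua_regex GPTBot ClaudeBot CCBot)
pattern=${pattern#'~(?i)'}
printf '%s\n' "mozilla/5.0 (compatible; gptbot/1.0)" | grep -Eiq "$pattern" && echo "matches"
```

The printed string is what goes into the "User-Agent" value of the match object, quoted as a JSON string.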

Custom response body for blocked bots

{
  "match": {
    "headers": {
      "User-Agent": ["*GPTBot*", "*ClaudeBot*", "*anthropic-ai*"]
    }
  },
  "action": {
    "return": 403,
    "response_headers": {
      "Content-Type": "text/plain"
    }
  }
}

No response body in return action: NGINX Unit's return action sends the HTTP status code and headers, but does not support a custom response body in the route config. For a custom body, forward blocked requests to a small application that returns the 403 with a body, or use an upstream nginx instance for the body content.

Applying config via the control API

NGINX Unit's configuration is managed through a Unix socket REST API. No restart required — changes take effect immediately.

Full config replace (PUT)

# Replace the entire config
curl -X PUT \
  --data-binary @unit.json \
  --unix-socket /var/run/control.unit.sock \
  http://localhost/config

Update only the routes section (PUT to /config/routes)

# Update just the routes without touching applications
curl -X PUT \
  --data-binary @routes.json \
  --unix-socket /var/run/control.unit.sock \
  http://localhost/config/routes

Read current config

curl --unix-socket /var/run/control.unit.sock http://localhost/config | python3 -m json.tool
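If you run these calls often, they can be wrapped in small shell functions. The sketch below is one possible shape, not an official tool: unit_api and update_routes are hypothetical names, UNIT_SOCK is an assumed variable, and the backup filename is arbitrary. It relies on Unit answering with a JSON object containing either a "success" or an "error" key.

```shell
# Socket path as used in the commands above; override it if your install differs.
UNIT_SOCK="${UNIT_SOCK:-/var/run/control.unit.sock}"

# unit_api METHOD API_PATH [FILE]: hypothetical wrapper around the curl calls above.
unit_api() {
  if [ -n "${3:-}" ]; then
    curl -s -X "$1" --data-binary "@$3" --unix-socket "$UNIT_SOCK" "http://localhost$2"
  else
    curl -s -X "$1" --unix-socket "$UNIT_SOCK" "http://localhost$2"
  fi
}

# update_routes FILE: save the live routes, PUT the new ones, and report the outcome.
# Unit answers with a JSON object that has a "success" or an "error" key.
update_routes() {
  unit_api GET /config/routes > routes.backup.json
  reply=$(unit_api PUT /config/routes "$1")
  case "$reply" in
    *'"success"'*) echo "routes updated" ;;
    *) echo "update rejected: $reply" >&2; return 1 ;;
  esac
}
```

Call it as update_routes routes.json. The saved routes.backup.json gives you something to PUT back if the new rules turn out to block more than intended.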

Insert a new route step at position 0 (prepend)

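# POST appends to an array, so to put the bot-blocking step first:
# read the current routes, prepend the step with jq (requires jq), and PUT the result back
curl -s --unix-socket /var/run/control.unit.sock http://localhost/config/routes \
  | jq '[{
      "match": {
        "headers": {
          "User-Agent": ["*GPTBot*", "*ClaudeBot*", "*anthropic-ai*", "*CCBot*", "*Google-Extended*"]
        }
      },
      "action": { "return": 403 }
    }] + .' \
  | curl -X PUT \
      --data-binary @- \
      --unix-socket /var/run/control.unit.sock \
      http://localhost/config/routes

Array indexing in the API: Use /config/routes/0 to address the first element, /config/routes/1 for the second, etc. PUT to an index replaces that element and DELETE removes it. POST targets the array itself (/config/routes) and appends the new item to the end, which is why prepending needs the read-modify-write shown above.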

X-Robots-Tag via response_headers

Add X-Robots-Tag to all application responses by including response_headers in the pass action:

{
  "routes": [
    {
      "match": {
        "headers": {
          "User-Agent": ["*GPTBot*", "*ClaudeBot*", "*anthropic-ai*", "*CCBot*", "*Google-Extended*", "*AhrefsBot*", "*Bytespider*", "*Amazonbot*", "*Diffbot*", "*FacebookBot*", "*cohere-ai*", "*PerplexityBot*", "*YouBot*"]
        }
      },
      "action": { "return": 403 }
    },
    {
      "action": {
        "pass": "applications/myapp",
        "response_headers": {
          "X-Robots-Tag": "noai, noimageai",
          "X-Content-Type-Options": "nosniff"
        }
      }
    }
  ]
}

response_headers placement: Headers in response_headers are applied to responses from that action step; both the application pass and static file share actions support them. According to the Unit documentation, the option updates header fields, so if your application already sets X-Robots-Tag, the value configured here replaces it, and a null value removes the header. To vary the header per page, leave it out of response_headers and set it in the application instead.
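To confirm what a client actually receives, inspect the response headers directly. The helper below is a small sketch with a made-up name (robots_tag); it uses only standard curl options to dump the headers and discard the body.

```shell
# robots_tag URL: hypothetical helper. Prints the X-Robots-Tag value(s) in the
# response for URL; prints nothing if the header is absent.
robots_tag() {
  curl -s -o /dev/null -D - "$1" \
    | tr -d '\r' \
    | grep -i '^x-robots-tag:' \
    | sed 's/^[^:]*:[[:space:]]*//'
}
```

Running robots_tag http://localhost/ against the config above should print the value set in response_headers. More than one line of output means the header is being sent more than once.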

Serving robots.txt as a static file

Add a route step that intercepts /robots.txt requests and serves a static file directly — without passing the request to your application:

{
  "routes": [
    {
      "match": {
        "headers": {
          "User-Agent": ["*GPTBot*", "*ClaudeBot*", "*anthropic-ai*", "*CCBot*", "*Google-Extended*", "*AhrefsBot*", "*Bytespider*", "*PerplexityBot*"]
        }
      },
      "action": { "return": 403 }
    },
    {
      "match": { "uri": "/robots.txt" },
      "action": {
        "share": "/var/www/static$uri",
        "response_headers": {
          "Cache-Control": "public, max-age=86400",
          "X-Robots-Tag": "noai, noimageai"
        }
      }
    },
    {
      "action": {
        "pass": "applications/myapp",
        "response_headers": {
          "X-Robots-Tag": "noai, noimageai"
        }
      }
    }
  ]
}

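Place your robots.txt at /var/www/static/robots.txt. The $uri variable in the share path resolves to the request URI (/robots.txt), so the full path becomes /var/www/static/robots.txt. Because routes are evaluated in order and the bot-blocking step comes first, the listed crawlers receive 403 for /robots.txt as well; move the robots.txt step above it if they should still be able to read your rules.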
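The route serves whatever file sits at that path; its contents are up to you. As a sketch, a robots.txt that disallows the same crawlers site-wide could be generated like this. The function name make_robots_txt is hypothetical, the bot list mirrors the patterns in the route above, and compliance with robots.txt is voluntary on the crawler's side.

```shell
# make_robots_txt: hypothetical helper. Prints a robots.txt that disallows
# each named crawler site-wide and leaves all other agents unrestricted.
make_robots_txt() {
  for bot in "$@"; do
    printf 'User-agent: %s\nDisallow: /\n\n' "$bot"
  done
  # An empty Disallow line permits everything for the remaining agents.
  printf 'User-agent: *\nDisallow:\n'
}

# Write the file locally; copy it to /var/www/static/robots.txt afterwards.
make_robots_txt GPTBot ClaudeBot anthropic-ai CCBot Google-Extended Bytespider PerplexityBot > robots.txt
```

Generating the file from the same list you block by User-Agent keeps the two layers consistent.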

Full unit.json example

{
  "listeners": {
    "*:80": {
      "pass": "routes"
    },
    "*:443": {
      "pass": "routes",
      "tls": {
        "certificate": "bundle"
      }
    }
  },

  "routes": [
    {
      "match": {
        "headers": {
          "User-Agent": [
            "*GPTBot*",
            "*ClaudeBot*",
            "*anthropic-ai*",
            "*CCBot*",
            "*Google-Extended*",
            "*AhrefsBot*",
            "*Bytespider*",
            "*Amazonbot*",
            "*Diffbot*",
            "*FacebookBot*",
            "*cohere-ai*",
            "*PerplexityBot*",
            "*YouBot*"
          ]
        }
      },
      "action": {
        "return": 403
      }
    },
    {
      "match": {
        "uri": "/robots.txt"
      },
      "action": {
        "share": "/var/www/static$uri",
        "response_headers": {
          "Cache-Control": "public, max-age=86400"
        }
      }
    },
    {
      "match": {
        "uri": ["/static/*", "/assets/*"]
      },
      "action": {
        "share": "/var/www$uri"
      }
    },
    {
      "action": {
        "pass": "applications/myapp",
        "response_headers": {
          "X-Robots-Tag": "noai, noimageai",
          "X-Content-Type-Options": "nosniff",
          "X-Frame-Options": "SAMEORIGIN"
        }
      }
    }
  ],

  "applications": {
    "myapp": {
      "type": "python 3",
      "path": "/var/www/myapp",
      "module": "wsgi",
      "user": "www-data",
      "group": "www-data",
      "environment": {
        "NODE_ENV": "production"
      }
    }
  },

  "settings": {
    "http": {
      "header_read_timeout": 30,
      "body_read_timeout": 30,
      "send_timeout": 30,
      "idle_timeout": 180,
      "max_body_size": 10485760
    }
  }
}

Apply and verify

# Apply config
curl -X PUT \
  --data-binary @unit.json \
  --unix-socket /var/run/control.unit.sock \
  http://localhost/config

# Verify response
curl -s --unix-socket /var/run/control.unit.sock http://localhost/config | python3 -m json.tool | head -20

# Test bot blocking (print only the status code)
curl -s -o /dev/null -w "%{http_code}\n" -A "GPTBot/1.0" http://localhost/
# Expected: 403

# Test legitimate request
curl -s -o /dev/null -w "%{http_code}\n" -A "Mozilla/5.0 (compatible; Googlebot/2.1)" http://localhost/
# Expected: 200
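The two spot checks above can be extended into a repeatable test over several User-Agents. This is a sketch: check_ua and run_checks are hypothetical names, the sample bot list is a subset of the blocked patterns, and the expected codes assume the full unit.json from this section is loaded.

```shell
# check_ua URL UA EXPECTED: hypothetical helper. Requests URL with the given
# User-Agent and compares the HTTP status code with EXPECTED.
check_ua() {
  code=$(curl -s -o /dev/null -w '%{http_code}' -A "$2" "$1")
  if [ "$code" = "$3" ]; then
    echo "ok   $code  $2"
  else
    echo "FAIL $code (expected $3)  $2"
    return 1
  fi
}

# run_checks URL: expect 403 for a few blocked bots and 200 for a browser.
run_checks() {
  failed=0
  for bot in GPTBot ClaudeBot CCBot PerplexityBot; do
    check_ua "$1" "$bot/1.0" 403 || failed=1
  done
  check_ua "$1" "Mozilla/5.0 (X11; Linux x86_64)" 200 || failed=1
  return "$failed"
}
```

Run it as run_checks http://localhost/ after each config change. A non-zero exit status means at least one User-Agent got an unexpected response.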

Docker deployment

docker-compose.yml

services:
  unit:
    image: unit:1.32.1-python3.12
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./unit.json:/docker-entrypoint.d/unit.json:ro
      - ./myapp:/var/www/myapp:ro
      - ./static:/var/www/static:ro
      - unit_state:/var/lib/unit
    restart: unless-stopped

volumes:
  unit_state:

Docker entrypoint: The official NGINX Unit Docker image loads JSON files from /docker-entrypoint.d/ on first boot. If the state volume is empty, it applies unit.json automatically. On subsequent starts (state volume has data), it uses the saved state — PUT to the control socket to update.
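Because later starts reuse the saved state, an edited unit.json is not picked up by a plain restart. One way to push it, sketched here, is to run curl inside the container against the control socket. This assumes the image ships curl and exposes the socket at /var/run/control.unit.sock (the path used elsewhere in this guide); unit is the service name from the compose file above, and the function name reapply_unit_config is made up.

```shell
# reapply_unit_config: hypothetical helper. Re-sends the mounted unit.json to
# the control socket inside the running "unit" service. -T disables TTY
# allocation so the call also works from scripts.
reapply_unit_config() {
  docker compose exec -T unit curl -s -X PUT \
    --data-binary @/docker-entrypoint.d/unit.json \
    --unix-socket /var/run/control.unit.sock \
    http://localhost/config
}
```

Alternatively, removing the unit_state volume forces the entrypoint to apply unit.json again on the next start, at the cost of discarding any state saved there.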

Available Docker image tags

unit:1.32.1-python3.12   # Python 3.12
unit:1.32.1-node21       # Node.js 21
unit:1.32.1-go1.22       # Go 1.22
unit:1.32.1-php8.3       # PHP 8.3
unit:1.32.1-ruby3.3      # Ruby 3.3
unit:1.32.1-jsc11        # Java (JDK 11)

FAQ

How do I block AI bots by User-Agent in NGINX Unit?

Add a route step with match.headers["User-Agent"] as an array of wildcard patterns ("*GPTBot*") or a single regex string with ~ prefix, and action.return = 403. Place it as the first step in the routes array. Apply it via the control API (curl -X PUT --data-binary @unit.json --unix-socket ...).

How does NGINX Unit routing work for bot blocking?

Routes are a JSON array evaluated in order — first match wins. Add a bot-blocking step (match on User-Agent, action return 403) before the application pass step. Array values in a match condition use OR logic.

How do I add X-Robots-Tag in NGINX Unit?

Use response_headers in the pass action: {"response_headers": {"X-Robots-Tag": "noai, noimageai"}}. Applied to all responses from that step. Static share actions also support response_headers.

How do I serve robots.txt in NGINX Unit?

Add a route step with match.uri = "/robots.txt" and action.share = "/var/www/static$uri". Place it before the application pass step — Unit serves the file directly without hitting your app.

How do I apply configuration changes without restarting NGINX Unit?

Use the control API: curl -X PUT --data-binary @config.json --unix-socket /var/run/control.unit.sock http://localhost/config. Changes are live immediately. To update one section without replacing the entire config, PUT to a specific path (e.g. /config/routes). The control API accepts GET, PUT, POST, and DELETE; there is no PATCH method.

What header match syntax does NGINX Unit support?

Array of strings (OR) with * wildcard, or a single string prefixed with ~ for PCRE regex. Example wildcard: "*GPTBot*". Example regex: "~(?i)(GPTBot|ClaudeBot)". Case-insensitive regex requires the (?i) inline flag.
