How to Block AI Bots in SparkJava

SparkJava is a lightweight Sinatra-inspired Java framework built on Jetty. Bot blocking uses the before() filter — a lambda that runs before every route handler. request.headers("User-Agent") is case-insensitive (wraps Jetty's HttpServletRequest.getHeader() which is case-insensitive per the HTTP specification). halt(403, "Forbidden") throws a HaltException — Spark catches it and sends the response immediately; code after halt() is unreachable. A plain return passes the request through.

1. Bot detection

Pure Java, no dependencies. Stream.anyMatch() short-circuits on first match. String.contains() for literal substring matching. Null-safe: returns false for null or empty input.

// AiBotDetector.java — AI bot detection, no external dependencies
import java.util.List;

public class AiBotDetector {

    private static final List<String> AI_BOT_PATTERNS = List.of(
        "gptbot",
        "chatgpt-user",
        "claudebot",
        "anthropic-ai",
        "ccbot",
        "google-extended",
        "cohere-ai",
        "meta-externalagent",
        "bytespider",
        "omgili",
        "diffbot",
        "imagesiftbot",
        "magpie-crawler",
        "amazonbot",
        "dataprovider",
        "netcraft"
    );

    /**
     * Returns true if the User-Agent string matches a known AI crawler.
     * String.contains() — literal substring match, no regex.
     * Null-safe: returns false for null or empty input.
     *
     * @param userAgent the raw User-Agent header value (may be null)
     * @return true if the request is from a known AI bot
     */
    public static boolean isAiBot(String userAgent) {
        if (userAgent == null || userAgent.isEmpty()) return false;
        final String lower = userAgent.toLowerCase(java.util.Locale.ROOT); // locale-safe (avoids Turkish-I surprises)
        return AI_BOT_PATTERNS.stream().anyMatch(lower::contains);
    }
}
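The detector can be exercised directly. The snippet below repeats a compact copy of the class (shortened pattern list) so it compiles standalone; the User-Agent strings are illustrative examples, not captured traffic.

```java
import java.util.List;
import java.util.Locale;

// Compact mirror of the AiBotDetector above, repeated so this file compiles on its own.
class AiBotDetector {
    static final List<String> AI_BOT_PATTERNS =
        List.of("gptbot", "chatgpt-user", "claudebot", "ccbot", "google-extended");

    static boolean isAiBot(String userAgent) {
        if (userAgent == null || userAgent.isEmpty()) return false;
        String lower = userAgent.toLowerCase(Locale.ROOT);
        return AI_BOT_PATTERNS.stream().anyMatch(lower::contains);
    }
}

public class DetectorDemo {
    public static void main(String[] args) {
        // Known AI crawler: matched case-insensitively by substring.
        System.out.println(AiBotDetector.isAiBot(
            "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)")); // true
        System.out.println(AiBotDetector.isAiBot(
            "CCBot/2.0 (https://commoncrawl.org/faq/)")); // true
        // Ordinary browser: passes through.
        System.out.println(AiBotDetector.isAiBot(
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/125.0")); // false
        // Absent header: null-safe.
        System.out.println(AiBotDetector.isAiBot(null)); // false
    }
}
```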

2. Global before() filter

Register the filter before any route definitions. request.headers("User-Agent") returns null when the header is absent — the isAiBot() helper handles this. Set response headers before calling halt() because the response is committed when HaltException is thrown.

// App.java — SparkJava with global before() bot-blocking filter
import static spark.Spark.*;

public class App {

    public static void main(String[] args) {

        port(8080);

        // before(filter) — runs for EVERY request before any route handler.
        // Registered before route definitions so it fires first.
        before((request, response) -> {
            // Allow robots.txt so bots can discover Disallow rules.
            if ("/robots.txt".equals(request.pathInfo())) {
                return; // plain return = pass through, do not block
            }

            // request.headers("User-Agent") — case-insensitive (wraps Jetty
            // HttpServletRequest.getHeader). Returns null when absent.
            String ua = request.headers("User-Agent");

            if (AiBotDetector.isAiBot(ua)) {
                // Set headers BEFORE halt() — response is committed on throw.
                response.header("X-Robots-Tag", "noai, noimageai");
                response.type("text/plain");

                // halt(statusCode, body) throws HaltException immediately.
                // Spark catches it and sends the response.
                // Code after halt() is UNREACHABLE — no return needed.
                halt(403, "Forbidden");
            }

            // Pass: inject X-Robots-Tag on the way through, then continue.
            response.header("X-Robots-Tag", "noai, noimageai");
            // Plain return — Spark continues to the route handler.
        });

        // robots.txt — reachable by all crawlers.
        get("/robots.txt", (request, response) -> {
            response.type("text/plain");
            return """
                User-agent: *
                Allow: /

                User-agent: GPTBot
                Disallow: /

                User-agent: ClaudeBot
                Disallow: /

                User-agent: CCBot
                Disallow: /

                User-agent: Google-Extended
                Disallow: /
                """;
        });

        get("/", (request, response) -> {
            response.type("application/json");
            return "{\"message\": \"Hello\"}";
        });

        get("/api/data", (request, response) -> {
            response.type("application/json");
            return "{\"data\": \"value\"}";
        });
    }
}

3. How halt() works

halt() throws a HaltException — it does not return. Spark's filter runner catches this exception, writes the status and body, and skips all remaining filters and the route handler. This means any statement after halt() in the same lambda is dead code.

// halt() internals — what happens under the hood.

// halt(int status, String body) is equivalent to:
//   throw new HaltException(status, body);
//
// Spark's filter execution loop catches HaltException and:
//   1. Sets the HTTP status code on the response.
//   2. Writes the body string to the response.
//   3. Commits the response (no further writes possible).
//   4. Skips all remaining filters and the route handler.
//
// Because halt() throws, the JVM unwinds the stack immediately.
// Any statement after halt() in the same lambda is unreachable:

before((request, response) -> {
    if (AiBotDetector.isAiBot(request.headers("User-Agent"))) {
        response.header("X-Robots-Tag", "noai, noimageai");
        halt(403, "Forbidden");
        // ← Everything below is dead code. javac cannot flag it, because
        // halt() is an ordinary method call, so the statements are formally reachable.
        response.type("text/plain"); // NEVER executes
        System.out.println("blocked"); // NEVER executes
    }
});

// Contrast with plain return — pass through:
before((request, response) -> {
    if ("/public".equals(request.pathInfo())) {
        return; // exits the lambda, Spark continues to next filter/route
    }
    // ... bot check
});
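The exception-driven control flow can be reproduced in plain Java without Spark. MiniHalt and runFilter below are hypothetical stand-ins for HaltException and Spark's filter loop, not the real internals; the sketch only shows why code after a throwing call never runs.

```java
// MiniHaltDemo.java — a stripped-down model of halt()'s exception-based abort.
public class MiniHaltDemo {

    // Stand-in for Spark's HaltException.
    static class MiniHalt extends RuntimeException {
        final int status;
        final String body;
        MiniHalt(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    static void halt(int status, String body) {
        throw new MiniHalt(status, body); // unwinds the stack immediately
    }

    // Plays the role of Spark's filter runner: catch MiniHalt and short-circuit.
    static String runFilter(Runnable filter) {
        try {
            filter.run();
            return "200 handler ran";        // filter passed through
        } catch (MiniHalt h) {
            return h.status + " " + h.body;  // filter aborted the request
        }
    }

    public static void main(String[] args) {
        System.out.println(runFilter(() -> {
            halt(403, "Forbidden");
            System.out.println("never printed"); // dead code, exactly as in Spark
        })); // prints "403 Forbidden"
        System.out.println(runFilter(() -> { /* no halt: pass */ })); // prints "200 handler ran"
    }
}
```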

4. Path-scoped before(path, filter)

before(path, filter) restricts the filter to requests whose path matches the pattern. SparkJava supports * wildcard globs — "/api/*" matches /api/data, /api/users, and any other /api/ subpath.

// Path-scoped filter — protect /api/* only.
// before(path, filter) — path supports * wildcard globs.
// The global before() guard (above) remains for full-site protection;
// this shows the scoped variant independently.

before("/api/*", (request, response) -> {
    String ua = request.headers("User-Agent");
    if (AiBotDetector.isAiBot(ua)) {
        response.header("X-Robots-Tag", "noai, noimageai");
        halt(403, "Forbidden");
    }
    response.header("X-Robots-Tag", "noai, noimageai");
});

// Public routes — not covered by the /api/* filter.
get("/", (request, response) -> "public");
get("/blog", (request, response) -> "public blog");

// Protected routes — before("/api/*", ...) fires for these.
get("/api/data", (request, response) -> "protected");
get("/api/users", (request, response) -> "protected");
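The trailing-* semantics can be illustrated with a small matcher. matchesGlob is a hypothetical helper written for this article, not Spark's actual path matching; it covers only the "/prefix/*" form used above.

```java
// GlobDemo.java — illustrates the "/api/*" matching behavior from the filter above.
public class GlobDemo {

    // Hypothetical matcher: handles exact paths and the trailing-* wildcard form.
    static boolean matchesGlob(String pattern, String path) {
        if (pattern.endsWith("/*")) {
            // Keep the trailing '/' so "/api/*" matches "/api/data" but not "/apiv2".
            String prefix = pattern.substring(0, pattern.length() - 1);
            return path.startsWith(prefix);
        }
        return pattern.equals(path);
    }

    public static void main(String[] args) {
        System.out.println(matchesGlob("/api/*", "/api/data"));  // true
        System.out.println(matchesGlob("/api/*", "/api/users")); // true
        System.out.println(matchesGlob("/api/*", "/blog"));      // false
        System.out.println(matchesGlob("/", "/"));               // true
    }
}
```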

5. Framework comparison — Java web frameworks

Framework   | Filter registration                                      | Block request                                          | Header access
SparkJava   | before((req, res) -> …)                                  | halt(403, "Forbidden") (throws)                        | req.headers("User-Agent"), case-insensitive
Javalin     | app.before(ctx -> …)                                     | ctx.status(403).result("Forbidden") + skip             | ctx.header("User-Agent"), case-insensitive
Spring Boot | OncePerRequestFilter bean                                | response.sendError(403) or response.setStatus(403)     | request.getHeader("User-Agent"), case-insensitive
Quarkus     | @ServerRequestFilter or Vert.x router.route().handler()  | requestContext.abortWith(Response.status(403).build()) | headers.getHeaderString("User-Agent"), case-insensitive

SparkJava's halt() stands out among these frameworks: it uses a thrown exception to abort the filter chain rather than a return value or context flag. This makes the block absolute, since no downstream code in the same lambda can run after a halt() call. Javalin (SparkJava's spiritual successor) instead sets the status and body on a context object (ctx.status(403).result(...)), which can be easier to test. Spring Boot and Quarkus follow the Servlet and JAX-RS filter models respectively, both request-scoped with explicit chain.doFilter() / abortWith() semantics.