How to Block AI Bots in Kotlin Http4k

Http4k is a Kotlin HTTP toolkit built on a single idea: everything is a function. An HttpHandler is (Request) -> Response. A Filter is (HttpHandler) -> HttpHandler — a function that wraps a handler. There is no reflection, no DI container, no annotations. Bot blocking is a Filter: the outer lambda receives next (the downstream handler), the inner lambda receives req — return Response(FORBIDDEN) to block or call next(req) to pass through. req.header("user-agent") is case-insensitive and returns String?. All Response methods (.header(), .body()) return a new immutable instance.

1. Bot detection

Pure Kotlin, no dependencies. String.contains() for literal substring matching — no regex. List.any() short-circuits on first match.

// BotUtils.kt — AI bot detection, no external dependencies
package com.example.botblocker

private val AI_BOT_PATTERNS = listOf(
    "gptbot",
    "chatgpt-user",
    "claudebot",
    "anthropic-ai",
    "ccbot",
    "google-extended",
    "cohere-ai",
    "meta-externalagent",
    "bytespider",
    "omgili",
    "diffbot",
    "imagesiftbot",
    "magpie-crawler",
    "amazonbot",
    "dataprovider",
    "netcraft",
)

/**
 * Returns true if the User-Agent string matches a known AI crawler.
 * String.contains() — literal substring match, no regex.
 */
fun isAiBot(ua: String): Boolean {
    if (ua.isBlank()) return false
    val lower = ua.lowercase()
    // any() short-circuits on the first match
    return AI_BOT_PATTERNS.any { lower.contains(it) }
}
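
As a quick sanity check, the matcher can be run against a few sample User-Agent strings. The sketch below repeats a shortened copy of the pattern list so that it runs on its own; the name isAiBotSample and the sample strings are illustrative assumptions, not captured traffic.

```kotlin
// Shortened copy of the pattern list above, so this snippet runs standalone.
private val SAMPLE_PATTERNS = listOf("gptbot", "claudebot", "ccbot")

// Same logic as isAiBot(), renamed to avoid clashing with the real function.
fun isAiBotSample(ua: String): Boolean {
    if (ua.isBlank()) return false
    val lower = ua.lowercase()
    return SAMPLE_PATTERNS.any { lower.contains(it) }
}

fun main() {
    val samples = listOf(
        "Mozilla/5.0 (compatible; GPTBot/1.0)",      // known crawler token
        "CCBot/2.0",                                 // matched case-insensitively
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)", // ordinary browser
        "",                                          // missing header defaults to ""
    )
    for (ua in samples) println("'$ua' -> ${isAiBotSample(ua)}")
}
```

Because this is plain substring matching, a pattern also matches when it appears inside a longer token: a hypothetical agent named "MyAccBot" would match "ccbot". Keep patterns specific enough to avoid such false positives.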

2. Filter — (HttpHandler) -> HttpHandler

The Filter lambda has two levels: the outer lambda receives next (called once when the filter is applied), the inner lambda is the actual per-request handler. Return a Response directly to block; call next(req) to delegate downstream.

// AiBotFilter.kt — Http4k Filter for blocking AI crawlers
package com.example.botblocker

import org.http4k.core.Filter
import org.http4k.core.Response
import org.http4k.core.Status.Companion.FORBIDDEN

/**
 * Http4k Filter: a fun interface with the shape (HttpHandler) -> HttpHandler
 * (HttpHandler itself is the type alias for (Request) -> Response)
 *
 * A Filter is a function that wraps an HttpHandler:
 *   - Outer lambda receives `next` — the downstream handler
 *   - Inner lambda receives `req` — the incoming Request
 *
 * To BLOCK: return a Response without calling next(req)
 * To PASS:  call next(req) and optionally modify the Response
 */
val aiBotFilter = Filter { next ->
    { req ->
        // Path guard: let /robots.txt through regardless of User-Agent.
        // Bots read robots.txt to discover Disallow rules.
        if (req.uri.path == "/robots.txt") {
            next(req)
        } else {
            // req.header() is case-insensitive — "user-agent" == "User-Agent"
            // Returns String? — use Elvis to provide a default empty string
            val ua = req.header("user-agent") ?: ""

            if (isAiBot(ua)) {
                // Block: return Response directly — do NOT call next(req)
                // Response is immutable — each .header() / .body() returns a new instance
                Response(FORBIDDEN)
                    .header("Content-Type", "text/plain")
                    .header("X-Robots-Tag", "noai, noimageai")
                    .body("Forbidden")
            } else {
                // Pass: call next, then add X-Robots-Tag to the downstream response
                // .header() on Response returns a new Response with the header appended
                next(req).header("X-Robots-Tag", "noai, noimageai")
            }
        }
    }
}
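
Because a Filter is an ordinary value, the blocking rule and the response status can be passed in as parameters rather than hard-coded. The following is a minimal sketch of that idea, not part of http4k: blockUserAgents and its parameters are hypothetical names introduced here for illustration.

```kotlin
import org.http4k.core.Filter
import org.http4k.core.HttpHandler
import org.http4k.core.Method.GET
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status
import org.http4k.core.Status.Companion.FORBIDDEN
import org.http4k.core.Status.Companion.OK
import org.http4k.core.then

// Hypothetical helper (not an http4k API): builds a blocking filter from a
// predicate over the User-Agent string and the status to return on a match.
fun blockUserAgents(
    status: Status = FORBIDDEN,
    isBlocked: (String) -> Boolean,
): Filter = Filter { next ->
    { req ->
        val ua = req.header("user-agent") ?: ""
        // Short-circuit on a match; otherwise delegate downstream.
        if (isBlocked(ua)) Response(status).body("Forbidden") else next(req)
    }
}

fun main() {
    val downstream: HttpHandler = { _ -> Response(OK).body("Hello") }
    val handler = blockUserAgents { ua -> ua.contains("gptbot", ignoreCase = true) }
        .then(downstream)

    println(handler(Request(GET, "/").header("User-Agent", "GPTBot/1.0")).status)
    println(handler(Request(GET, "/").header("User-Agent", "Mozilla/5.0")).status)
}
```

Passing isAiBot as the predicate (blockUserAgents(isBlocked = ::isAiBot)) would give roughly the behaviour of aiBotFilter, minus the robots.txt guard and the X-Robots-Tag header.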

3. App.kt — Filter.then(routes(...))

Filter.then(HttpHandler) applies the filter in front of the router, producing a new HttpHandler. SunHttp uses the JDK's built-in HTTP server — no extra runtime dependencies. Swap to Netty or Undertow by changing the import and adding the artifact.

// App.kt — Http4k application
package com.example.botblocker

import org.http4k.core.Method.GET
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK
import org.http4k.core.then
import org.http4k.routing.bind
import org.http4k.routing.routes
import org.http4k.server.SunHttp
import org.http4k.server.asServer

fun main() {
    val router = routes(
        "/robots.txt" bind GET to { _ ->
            Response(OK)
                .header("Content-Type", "text/plain")
                .body(
                    """
                    User-agent: *
                    Allow: /

                    User-agent: GPTBot
                    Disallow: /

                    User-agent: ClaudeBot
                    Disallow: /

                    User-agent: CCBot
                    Disallow: /

                    User-agent: Google-Extended
                    Disallow: /
                    """.trimIndent()
                )
        },
        "/" bind GET to { _ ->
            Response(OK)
                .header("Content-Type", "application/json")
                .body("""{"message":"Hello"}""")
        },
        "/api/data" bind GET to { _ ->
            Response(OK)
                .header("Content-Type", "application/json")
                .body("""{"data":"value"}""")
        },
    )

    // Filter.then(HttpHandler) applies the filter in front of the router.
    // aiBotFilter wraps router — all requests pass through the filter first.
    val app = aiBotFilter.then(router)

    // SunHttp uses the JDK built-in HTTP server — zero extra dependencies.
    // Other backends: Netty, Undertow, Jetty, Apache (add the relevant artifact).
    app.asServer(SunHttp(8080)).start().block()
}

4. Chaining multiple filters

Filter.then(Filter) composes two filters into one. Filters are applied left-to-right — the first filter in the chain is the outermost wrapper and sees every request first.

// Chaining multiple filters with Filter.then(Filter)
// Filters are applied left-to-right: first.then(second).then(router)
// means first wraps second wraps router.

import org.http4k.core.Filter
import org.http4k.core.then

val loggingFilter = Filter { next ->
    { req ->
        println(">> ${req.method} ${req.uri}")
        val res = next(req)
        println("<< ${res.status}")
        res
    }
}

val corsFilter = Filter { next ->
    { req ->
        next(req).header("Access-Control-Allow-Origin", "*")
    }
}

// Execution order: loggingFilter → aiBotFilter → corsFilter → router
val app = loggingFilter
    .then(aiBotFilter)
    .then(corsFilter)
    .then(router)
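
To make the ordering concrete, the sketch below chains two tracing filters and records when each one is entered and left. The tracing helper, the trace list, and tracedHandler are hypothetical names used only for this demonstration.

```kotlin
import org.http4k.core.Filter
import org.http4k.core.HttpHandler
import org.http4k.core.Method.GET
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK
import org.http4k.core.then

// Records the order in which the filters and the handler run.
val trace = mutableListOf<String>()

// Hypothetical helper: a filter that logs when it is entered and when it returns.
fun tracing(name: String) = Filter { next ->
    { req ->
        trace += "$name in"
        val res = next(req)
        trace += "$name out"
        res
    }
}

val tracedHandler: HttpHandler = { _ ->
    trace += "handler"
    Response(OK)
}

fun main() {
    val app = tracing("first").then(tracing("second")).then(tracedHandler)
    app(Request(GET, "/"))
    println(trace) // [first in, second in, handler, second out, first out]
}
```

One consequence for the chain above: a filter placed after aiBotFilter, such as corsFilter, never runs for a blocked request, so the 403 response will not carry its header. Move a filter ahead of aiBotFilter if its headers must appear on blocked responses too.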

5. Scoped filter — protect /api only

Apply the filter to a sub-router rather than the top-level app. Routes outside the protected sub-router bypass the bot check entirely. Health checks and robots.txt remain unfiltered.

// Scoped filter — apply bot blocking only to /api routes
// Routes outside the apiRoutes block are unprotected.

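import org.http4k.core.Method.GET
import org.http4k.core.Response
import org.http4k.core.Status.Companion.OK
import org.http4k.core.then
import org.http4k.routing.bind
import org.http4k.routing.routes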

val apiRoutes = aiBotFilter.then(
    routes(
        "/api/data" bind GET to { _ ->
            Response(OK)
                .header("Content-Type", "application/json")
                .body("""{"data":"value"}""")
        },
        "/api/status" bind GET to { _ ->
            Response(OK).body("ok")
        },
    )
)

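// Top-level router: /health bypasses the bot filter entirely
val app = routes(
    "/health" bind GET to { _ -> Response(OK).body("ok") },
    // robotsHandler is assumed to be the /robots.txt handler from App.kt, extracted into a val
    "/robots.txt" bind GET to robotsHandler,
    // Include the protected sub-router; its routes already carry the /api prefix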
    apiRoutes,
)
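
The scoping can be checked in-process by sending the same bot User-Agent to a route inside and a route outside the filtered sub-router. This is a minimal sketch: demoBotFilter is a stand-in for aiBotFilter with a single hard-coded pattern, and scopedApp is a cut-down version of the router above; both names are assumptions made so the snippet runs on its own.

```kotlin
import org.http4k.core.Filter
import org.http4k.core.Method.GET
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status.Companion.FORBIDDEN
import org.http4k.core.Status.Companion.OK
import org.http4k.core.then
import org.http4k.routing.bind
import org.http4k.routing.routes

// Stand-in for aiBotFilter (assumption: one pattern instead of the full list).
val demoBotFilter = Filter { next ->
    { req ->
        val ua = req.header("user-agent") ?: ""
        if (ua.contains("gptbot", ignoreCase = true)) Response(FORBIDDEN) else next(req)
    }
}

val scopedApp = routes(
    // Outside the filtered sub-router: never sees the bot check.
    "/health" bind GET to { _ -> Response(OK).body("ok") },
    // Inside the filtered sub-router: bot check runs before the handler.
    demoBotFilter.then(
        routes("/api/data" bind GET to { _ -> Response(OK).body("data") })
    ),
)

fun main() {
    val bot = "Mozilla/5.0 (compatible; GPTBot/1.0)"
    println(scopedApp(Request(GET, "/health").header("User-Agent", bot)).status)
    println(scopedApp(Request(GET, "/api/data").header("User-Agent", bot)).status)
}
```

With this layout the first request should be served normally and the second rejected, which is the behaviour the scoped setup is meant to provide.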

6. Unit testing — no server required

Http4k filters and handlers are plain functions, so you can call them directly in tests. No mocking, no test server, no HTTP client. aiBotFilter.then(downstream) returns a regular function that you invoke with a Request; you then inspect the returned Response.

// AiBotFilterTest.kt — unit-testing the filter with no server required
// Http4k filters are plain functions — test them directly without HTTP.

package com.example.botblocker

import org.http4k.core.Method.GET
import org.http4k.core.Request
import org.http4k.core.Response
import org.http4k.core.Status.Companion.FORBIDDEN
import org.http4k.core.Status.Companion.OK
import org.http4k.core.then
import org.junit.jupiter.api.Assertions.assertEquals
import org.junit.jupiter.api.Test

class AiBotFilterTest {

    // A minimal downstream handler — returns 200 OK
    private val downstream = { _: Request -> Response(OK).body("Hello") }

    // Apply the filter to the downstream handler — produces a testable HttpHandler
    private val handler = aiBotFilter.then(downstream)

    @Test
    fun `blocks GPTBot with 403`() {
        val req = Request(GET, "/").header("User-Agent", "Mozilla/5.0 (compatible; GPTBot/1.0)")
        val res = handler(req)
        assertEquals(FORBIDDEN, res.status)
        assertEquals("noai, noimageai", res.header("X-Robots-Tag"))
    }

    @Test
    fun `passes normal browser`() {
        val req = Request(GET, "/").header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64)")
        val res = handler(req)
        assertEquals(OK, res.status)
        assertEquals("noai, noimageai", res.header("X-Robots-Tag"))
    }

    @Test
    fun `passes robots txt regardless of User-Agent`() {
        val req = Request(GET, "/robots.txt").header("User-Agent", "GPTBot/1.0")
        val res = handler(req)
        // robots.txt guard fires before bot check — downstream returns 200
        assertEquals(OK, res.status)
    }
}

7. build.gradle.kts

// build.gradle.kts — Http4k dependencies
plugins {
    kotlin("jvm") version "2.0.0"
    application
}

application {
    mainClass.set("com.example.botblocker.AppKt")
}

repositories {
    mavenCentral()
}

val http4kVersion = "5.32.1.0"

dependencies {
    // Core — HttpHandler, Filter, Request, Response types
    implementation("org.http4k:http4k-core:${http4kVersion}")

    // Server backend: SunHttp (the JDK built-in server) ships with http4k-core,
    // so it needs no extra artifact. To use another backend, add one of:
    // implementation("org.http4k:http4k-server-netty:${http4kVersion}")
    // implementation("org.http4k:http4k-server-undertow:${http4kVersion}")
    // implementation("org.http4k:http4k-server-jetty:${http4kVersion}")

    testImplementation(kotlin("test"))
    testImplementation("org.junit.jupiter:junit-jupiter:5.10.2")
}

tasks.test {
    useJUnitPlatform()
}

Key points

Framework comparison — Kotlin / JVM HTTP frameworks

Framework | Middleware / Filter | Block | UA header
--- | --- | --- | ---
Http4k | Filter { next -> { req -> ... } } | Response(FORBIDDEN).header(...) | req.header("user-agent")
Ktor | install(plugin) { intercept(...) } | call.respond(HttpStatusCode.Forbidden); finish() | call.request.headers["User-Agent"]
Spring Boot | OncePerRequestFilter / HandlerInterceptor | response.sendError(403) | request.getHeader("User-Agent")
Javalin | app.before { ctx -> ... } | ctx.status(403).result("Forbidden"); ctx.skipRemainingHandlers() | ctx.userAgent()

Http4k's Filter is the most minimal approach — no framework registration, no annotation processing, no coroutine context. The filter is a plain function value that can be composed, tested, and reused without any framework infrastructure. Ktor requires coroutines and plugin installation; Spring Boot requires a bean container; Javalin uses a mutable context object.