How to Block AI Bots on Erlang Cowboy: Complete 2026 Guide
Cowboy is the HTTP server underlying many BEAM web stacks, including Plug and Phoenix applications that run on the Cowboy adapter. To block AI bots in a direct Cowboy deployment, implement the cowboy_middleware behaviour, whose single callback is execute/2. To reject a request, call cowboy_req:reply(403, ...) and return {stop, Req}; the handler module is never called. To let a request through, return {ok, Req, Env}. Middleware order: [cowboy_router, ai_bot_blocker, cowboy_handler].
execute/2 return values
- {ok, Req, Env}: continue to the next middleware
- {stop, Req}: halt the chain; the handler is never called
Call cowboy_req:reply/4 first and pass the Req it returns to {stop, Req}. cowboy_req:reply/4 sends the response, and {stop, Req} then ends the chain, so the cowboy_handler middleware (which calls your route handler) never executes.
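The two return values can be illustrated with a toy model of a middleware chain. This is a sketch for intuition only, not Cowboy's actual implementation: the module name mw_chain_demo, the run/3 function, and the use of plain maps as stand-ins for Req and Env are all assumptions made for the example.

```erlang
%% mw_chain_demo.erl: toy model of a middleware chain (NOT Cowboy's code).
%% Req and Env are plain maps here; in Cowboy, Req is a cowboy_req map.
-module(mw_chain_demo).
-export([run/3, demo/0]).

%% run/3: call each middleware fun in list order.
%% {ok, Req, Env} moves on to the next fun; {stop, Req} ends the chain.
run([], Req, Env) ->
    {ok, Req, Env};
run([Mw | Rest], Req, Env) ->
    case Mw(Req, Env) of
        {ok, Req2, Env2} -> run(Rest, Req2, Env2);
        {stop, Req2} -> {stop, Req2}
    end.

demo() ->
    %% Stand-in for cowboy_router: records a handler in Env.
    Router = fun(Req, Env) -> {ok, Req, Env#{handler => home_handler}} end,
    %% Stand-in for the bot blocker: stops on one hard-coded UA.
    Blocker = fun(#{ua := <<"gptbot">>} = Req, _Env) ->
                      {stop, Req#{status => 403}};
                 (Req, Env) ->
                      {ok, Req, Env}
              end,
    %% Stand-in for cowboy_handler: only reached if nothing stopped earlier.
    Handler = fun(Req, Env) -> {ok, Req#{status => 200}, Env} end,
    Chain = [Router, Blocker, Handler],
    {stop, #{status := 403}} = run(Chain, #{ua => <<"gptbot">>}, #{}),
    {ok, #{status := 200}, #{handler := home_handler}} =
        run(Chain, #{ua => <<"firefox">>}, #{}),
    ok.
```

Calling mw_chain_demo:demo() returns ok when both pattern matches succeed: the blocked request never reaches the last fun, while the allowed one passes through all three.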
Protection layers
Step 1 — Bot detection (src/ai_bots.erl)
Patterns are binaries matched with binary:match/2, a built-in function (BIF) that needs no library. It returns {Start, Length} on a match or nomatch otherwise. The caller lowercases the UA with string:lowercase/1 (OTP 20+) before matching.
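The return shapes of binary:match/2, and the reason the UA has to be lowercased first, can be checked in isolation. The following is a minimal sketch; the module name ua_match_demo and the helper contains/2 are hypothetical names used only for this illustration.

```erlang
%% ua_match_demo.erl: hypothetical module, for illustration only.
-module(ua_match_demo).
-export([contains/2, demo/0]).

%% contains/2: true if Pattern occurs anywhere in the lowercased UA.
%% Assumes Pattern is already a lowercase, non-empty binary.
contains(UA, Pattern) when is_binary(UA), is_binary(Pattern) ->
    binary:match(string:lowercase(UA), Pattern) =/= nomatch.

demo() ->
    UA = <<"Mozilla/5.0 GPTBot/1.0">>,
    %% binary:match/2 is case-sensitive: the raw UA has "GPTBot", not "gptbot".
    nomatch = binary:match(UA, <<"gptbot">>),
    %% After lowercasing, the result is a {Start, Length} tuple (0-based start).
    {12, 6} = binary:match(string:lowercase(UA), <<"gptbot">>),
    true = contains(UA, <<"gptbot">>),
    false = contains(<<"Mozilla/5.0 Firefox/126.0">>, <<"gptbot">>),
    %% An absent header (empty binary) never matches.
    false = contains(<<>>, <<"gptbot">>),
    ok.
```

Comparing against nomatch, rather than destructuring the tuple, keeps the check independent of where in the UA string the pattern appears.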
%% src/ai_bots.erl — bot detection module
-module(ai_bots).
-export([is_ai_bot/1]).
%% Known AI bot UA substrings as lowercase binaries.
%% binary:match/2 requires binary patterns.
-define(PATTERNS, [
%% OpenAI
<<"gptbot">>, <<"chatgpt-user">>, <<"oai-searchbot">>,
%% Anthropic
<<"claudebot">>, <<"claude-web">>,
%% Common Crawl
<<"ccbot">>,
%% Bytedance
<<"bytespider">>,
%% Meta
<<"meta-externalagent">>,
%% Perplexity
<<"perplexitybot">>,
%% Google AI
<<"google-extended">>, <<"googleother">>,
%% Cohere
<<"cohere-ai">>,
%% Amazon
<<"amazonbot">>,
%% Diffbot
<<"diffbot">>,
%% AI2
<<"ai2bot">>,
%% DeepSeek
<<"deepseekbot">>,
%% Mistral
<<"mistralai-user">>,
%% xAI
<<"xai-bot">>,
%% You.com
<<"youbot">>,
%% DuckDuckGo AI
<<"duckassistbot">>
]).
%% is_ai_bot/1: returns true if UA binary matches any known AI bot pattern.
%% UA must already be lowercased (string:lowercase/1) before calling.
%% binary:match/2: returns {Start, Length} on match, 'nomatch' on no match.
is_ai_bot(<<>>) -> false;
is_ai_bot(UA) ->
lists:any(
fun(Pattern) -> binary:match(UA, Pattern) =/= nomatch end,
?PATTERNS
).
Step 2 — Middleware (src/ai_bot_blocker.erl)
The path check comes first: /robots.txt and /health return {ok, Req, Env} unconditionally. For all other paths the UA is checked. On pass-through, cowboy_req:set_resp_header/3 (not cowboy_req:reply/4) sets the header without sending a response, and Cowboy merges it into the handler's eventual reply.
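The order of checks described above can be sketched as a pure function that is testable without a running Cowboy listener. This is one possible way to structure it, not the article's middleware: the module name bot_decision_demo, the function decide/3, and the allow/block atoms are assumptions for the example, and the detector is passed in as a fun so the block stays self-contained.

```erlang
%% bot_decision_demo.erl: hypothetical helper, for illustration only.
-module(bot_decision_demo).
-export([decide/3, demo/0]).

%% decide/3: Path and UA are binaries; IsBot is a fun(LowercaseUA) -> boolean().
%% Exempt paths are matched first, so the UA is never inspected for them.
decide(<<"/robots.txt">>, _UA, _IsBot) -> allow;
decide(<<"/health">>, _UA, _IsBot) -> allow;
decide(_Path, UA, IsBot) ->
    case IsBot(string:lowercase(UA)) of
        true -> block;
        false -> allow
    end.

demo() ->
    %% Stand-in detector with a single pattern.
    IsBot = fun(UA) -> binary:match(UA, <<"gptbot">>) =/= nomatch end,
    allow = decide(<<"/robots.txt">>, <<"GPTBot/1.0">>, IsBot),
    allow = decide(<<"/health">>, <<"GPTBot/1.0">>, IsBot),
    block = decide(<<"/">>, <<"GPTBot/1.0">>, IsBot),
    allow = decide(<<"/">>, <<"Mozilla/5.0">>, IsBot),
    %% Missing User-Agent header (empty binary default) passes through.
    allow = decide(<<"/">>, <<>>, IsBot),
    ok.
```

In a real middleware, a fun such as fun ai_bots:is_ai_bot/1 could be passed as IsBot, with execute/2 mapping block to the 403 reply plus {stop, Req} and allow to {ok, Req, Env}.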
%% src/ai_bot_blocker.erl — cowboy_middleware behaviour
-module(ai_bot_blocker).
-behaviour(cowboy_middleware).
-export([execute/2]).
%% execute/2 is the single required callback for cowboy_middleware.
%% Return {ok, Req, Env} → continue to next middleware
%% Return {stop, Req} → halt chain, send Req's reply, handler never called
execute(Req, Env) ->
Path = cowboy_req:path(Req),
%% Allow robots.txt unconditionally — all crawlers must be able to read it.
%% cowboy_router has already resolved the handler; we bypass the bot check.
case Path of
<<"/robots.txt">> ->
{ok, Req, Env};
<<"/health">> ->
{ok, Req, Env};
_ ->
%% cowboy_req:header/3: header name must be a lowercase binary.
%% Third arg <<>> is the default if the header is absent.
%% Cowboy stores header names in lowercase (HTTP header names are
%% case-insensitive), so the lookup key must be lowercase too.
UA = cowboy_req:header(<<"user-agent">>, Req, <<>>),
%% string:lowercase/1 (OTP 20+): works on binary, returns binary.
UALower = string:lowercase(UA),
case ai_bots:is_ai_bot(UALower) of
true ->
%% Reply with 403 and attach to Req.
Req2 = cowboy_req:reply(
403,
#{
<<"content-type">> => <<"text/plain; charset=utf-8">>,
<<"x-robots-tag">> => <<"noai, noimageai">>
},
<<"Forbidden">>,
Req
),
%% {stop, Req2}: Cowboy sends the reply, halts middleware chain.
%% cowboy_handler (and therefore the route handler) never runs.
{stop, Req2};
false ->
%% Pass through: add X-Robots-Tag to all legitimate responses.
%% No reply is sent here, so use cowboy_req:set_resp_header/3;
%% the header is merged into the handler's eventual reply.
Req2 = cowboy_req:set_resp_header(
<<"x-robots-tag">>, <<"noai, noimageai">>, Req
),
{ok, Req2, Env}
end
end.
Step 3 — Application startup (src/my_app.erl)
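Cowboy executes the middlewares list in order. cowboy_router must come before cowboy_handler because it puts the resolved handler and its options into Env, and cowboy_handler, which calls that handler, goes last. This guide places ai_bot_blocker between them. The blocker only reads the path and the User-Agent header from Req, so it does not depend on anything the router adds.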
%% src/my_app.erl — OTP application + Cowboy server startup
-module(my_app).
-behaviour(application).
-export([start/2, stop/1]).
start(_Type, _Args) ->
Dispatch = cowboy_router:compile([
{'_', [
%% robots.txt — cowboy_static serves the file directly.
%% Resolved by cowboy_router; ai_bot_blocker checks path and allows it.
{"/robots.txt", cowboy_static,
{file, "priv/static/robots.txt"}},
%% Application routes
{"/", home_handler, []},
{"/health", health_handler, []},
{"/api/[...]", api_handler, []}
]}
]),
%% Middleware order matters:
%% 1. cowboy_router: resolves path to handler module, populates Env
%% 2. ai_bot_blocker: path and UA check (both are read from Req)
%% 3. cowboy_handler: calls the resolved handler module
{ok, _} = cowboy:start_clear(
my_http_listener,
[{port, 8080}],
#{
env => #{dispatch => Dispatch},
middlewares => [cowboy_router, ai_bot_blocker, cowboy_handler]
}
),
my_sup:start_link().
stop(_State) ->
cowboy:stop_listener(my_http_listener).
Step 4 — Handler module (src/home_handler.erl)
Handlers only run for requests that passed the middleware check. No bot-detection code needed in individual handlers — the middleware already blocked AI crawlers before init/2 is called.
%% src/home_handler.erl — example cowboy_handler
%% Bot check is already done by ai_bot_blocker middleware —
%% this handler only runs for legitimate traffic.
-module(home_handler).
-export([init/2]).
init(Req, State) ->
Body = <<"<!DOCTYPE html>
<html>
<head>
<meta name=\"robots\" content=\"noai, noimageai\">
<title>My Site</title>
</head>
<body><h1>Welcome</h1></body>
</html>">>,
Req2 = cowboy_req:reply(
200,
#{<<"content-type">> => <<"text/html; charset=utf-8">>},
Body,
Req
),
{ok, Req2, State}.
%% src/api_handler.erl
%% -module(api_handler).
%% -export([init/2]).
%% init(Req, State) ->
%% Req2 = cowboy_req:reply(200,
%% #{<<"content-type">> => <<"application/json">>},
%% <<"{\"data\":\"protected\"}">>, Req),
%% {ok, Req2, State}.
rebar.config and app spec
%% rebar.config — dependencies
{erl_opts, [debug_info]}.
{deps, [
{cowboy, "2.12.0"}
]}.
{relx, [
{release, {my_app, "0.1.0"}, [my_app, cowboy]},
{mode, dev}
]}.
%% src/my_app.app.src
%% {application, my_app,
%% [{description, "My Cowboy app"},
%% {vsn, "0.1.0"},
%% {modules, []},
%% {registered, []},
%% {applications, [kernel, stdlib, cowboy]},
%% {mod, {my_app, []}}
%% ]}.
%% Build and run:
%% rebar3 compile
%% rebar3 shell
Cowboy middleware vs per-handler vs Plug vs Phoenix
| Feature | Cowboy middleware | Cowboy handler | Elixir Plug | Phoenix |
|---|---|---|---|---|
| Middleware contract | cowboy_middleware behaviour — execute/2 returns {ok, Req, Env} or {stop, Req} | cowboy_handler — init/2 in each handler module, manual bot check per handler | Plug behaviour — init/1 compile-time + call/2 runtime returning Plug.Conn + halt() | Plug pipeline in endpoint.ex — plug AiBotBlocker before router |
| Short-circuit | {stop, Req} after cowboy_req:reply/4 — cowboy_handler never called | cowboy_req:reply(403, ...) + {ok, Req2, State} in init/2 — handler returns early | send_resp(conn, 403, "Forbidden") |> halt() — halt() required to stop pipeline | Same as Plug — plug in endpoint stops all downstream plugs including router |
| UA header access | cowboy_req:header(<<"user-agent">>, Req, <<>>) — binary, lowercase key, binary default | Same cowboy_req:header/3 call inside init/2 | get_req_header(conn, "user-agent") → [String.t()] — list, use List.first/2 | Same as Plug — get_req_header in Plug, or conn.req_headers in Phoenix controller |
| String matching | binary:match(UALower, Pattern) =/= nomatch — binary built-in, no library needed | Same binary:match/2 in handler init/2 | String.contains?(ua, pattern) — Elixir stdlib, no external deps | Same as Plug |
| robots.txt | cowboy_static handler for /robots.txt in dispatch; path check in middleware to bypass | Explicit route to static handler; handler-level approach bypasses naturally | Plug.Static before AiBotBlocker plug — static handler auto-halts after serving file | plug Plug.Static in endpoint.ex before bot-blocker plug |
| Middleware order | middlewares: [cowboy_router, ai_bot_blocker, cowboy_handler] — router first, handler last | N/A — per-handler pattern, no global middleware chain | plug macro order in Plug.Builder = execution order | plug macro order in Phoenix.Endpoint = execution order |
Summary
- {stop, Req} after reply: call cowboy_req:reply(403, ...) first, then return {stop, Req2}. The handler module never runs.
- Middleware order: [cowboy_router, your_middleware, cowboy_handler]. The router resolves the path, the handler calls the module, and your middleware goes between them.
- binary:match/2: a BIF, no library needed. Returns nomatch or {Start, Length}; compare with =/= nomatch to test for a match.
- cowboy_req:set_resp_header/3: sets a header on the response without sending it. Use this in the pass-through branch for X-Robots-Tag; Cowboy merges it into the handler's eventual reply.
- If you use Elixir/Phoenix: write a Plug, not a cowboy_middleware. This guide is for direct Erlang + Cowboy deployments.