
How to Block AI Bots on Rocket (Rust): Complete 2026 Guide

Rocket is a Rust web framework known for compile-time route validation and type-safe request handling. Its bot-blocking story differs from most frameworks: Fairings (lifecycle callbacks) cannot abort requests, so Request Guards (the FromRequest trait) are the idiomatic way to reject a request with a 403.

Fairings are NOT middleware

In most frameworks, “middleware” can short-circuit a request and return a response early. Rocket fairings cannot: on_request returns (), so there is no mechanism to abort. on_response can modify the response, but by then the route handler has already run. This is by design; Rocket separates side effects (fairings) from access control (guards).
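
For reference, this is the shape of the on_request hook as Rocket 0.5's Fairing trait defines it (reproduced here for illustration); the unit return type leaves a fairing nothing to short-circuit with:

// Rocket 0.5's Fairing trait: on_request returns (), so a fairing has
// no way to hand back a response or abort the request.
async fn on_request(&self, _req: &mut Request<'_>, _data: &mut Data<'_>) {}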

Protection layers

1. robots.txt: FileServer::from("./static") or a #[get("/robots.txt")] route; served by its own route, so no guard is involved
2. noai meta tag: in HTML templates, via Tera, Handlebars, Askama, or a custom Responder
3. X-Robots-Tag header (fairing): on_response sets the header globally on all responses, including 403s
4. Hard 403 via Request Guard (per-route): FromRequest returns Outcome::Error((Status::Forbidden, ())); the handler never runs. Recommended.
5. Hard 403 via Fairing override (global): on_response rewrites the response to 403; the handler still runs but its response is overwritten. Less efficient.

Step 1 — Shared bot list (src/bots.rs)

A &[&str] slice is zero-cost at runtime — the strings are embedded in the binary. The is_ai_bot() function lowercases and checks for substring matches.

// src/bots.rs — shared AI bot list
pub const AI_BOTS: &[&str] = &[
    // OpenAI
    "gptbot", "chatgpt-user", "oai-searchbot",
    // Anthropic
    "claudebot", "claude-web",
    // Common Crawl
    "ccbot",
    // Bytedance
    "bytespider",
    // Meta
    "meta-externalagent",
    // Perplexity
    "perplexitybot",
    // Google AI
    "google-extended", "googleother",
    // Cohere
    "cohere-ai",
    // Amazon
    "amazonbot",
    // Diffbot
    "diffbot",
    // AI2
    "ai2bot",
    // DeepSeek
    "deepseekbot",
    // Mistral
    "mistralai-user",
    // xAI
    "xai-bot",
    // You.com
    "youbot",
    // DuckDuckGo AI
    "duckassistbot",
];

pub fn is_ai_bot(user_agent: &str) -> bool {
    let ua = user_agent.to_lowercase();
    AI_BOTS.iter().any(|bot| ua.contains(bot))
}
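
A minimal unit-test sketch for is_ai_bot; the User-Agent strings below are illustrative stand-ins, not canonical bot UAs:

// src/bots.rs — tests (sketch)
#[cfg(test)]
mod tests {
    use super::is_ai_bot;

    #[test]
    fn matches_known_bots_case_insensitively() {
        // Illustrative UA strings; real crawlers send longer variants.
        assert!(is_ai_bot("Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"));
        assert!(is_ai_bot("CCBot/2.0"));
    }

    #[test]
    fn passes_regular_browsers() {
        assert!(!is_ai_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/128.0"));
    }
}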

Step 2 — Request Guard (recommended — per-route blocking)

Implement FromRequest for a guard struct. Return Outcome::Error((Status::Forbidden, ())) for AI bots — the route handler never executes. Zero wasted computation. Add the guard as a function parameter to any route.

// src/guards.rs — Request Guard for AI bot blocking
use rocket::request::{self, FromRequest, Request};
use rocket::http::Status;

use crate::bots::is_ai_bot;

/// Request Guard that blocks AI bots with a 403.
/// Add this as a parameter to any route to protect it.
pub struct AiBotGuard;

#[rocket::async_trait]
impl<'r> FromRequest<'r> for AiBotGuard {
    type Error = ();

    async fn from_request(req: &'r Request<'_>) -> request::Outcome<Self, Self::Error> {
        let ua = req.headers().get_one("User-Agent").unwrap_or("");

        if is_ai_bot(ua) {
            // Short-circuit — route handler never runs
            request::Outcome::Error((Status::Forbidden, ()))
        } else {
            request::Outcome::Success(AiBotGuard)
        }
    }
}

Step 3 — Routes, mounting, and catcher

Add _guard: AiBotGuard to route parameters. The underscore prefix tells Rust the value is intentionally unused; the guard's only effect is that it ran successfully before the handler. Routes without the guard remain accessible to AI bots.

// src/main.rs — routes with AiBotGuard
#[macro_use]
extern crate rocket; // brings #[get], routes!, catchers!, #[launch], #[catch] into scope

use rocket::fs::FileServer;

mod bots;
mod guards;
mod fairings;

use guards::AiBotGuard;

// Protected route — guard runs before handler.
// If UA is an AI bot, handler never executes → 403.
#[get("/")]
fn index(_guard: AiBotGuard) -> &'static str {
    "Welcome to my site"
}

// Protected API route — same guard, same behavior.
// (rocket::serde::json::Json requires Rocket's "json" feature flag.)
#[get("/api/data")]
fn api_data(_guard: AiBotGuard) -> rocket::serde::json::Json<&'static str> {
    rocket::serde::json::Json("sensitive data")
}

// Unprotected route — no guard, AI bots can access.
#[get("/health")]
fn health() -> &'static str {
    "ok"
}

#[launch]
fn rocket() -> _ {
    rocket::build()
        // Attach the X-Robots-Tag fairing (global, all responses)
        .attach(fairings::XRobotsTagFairing)
        // Mount routes
        .mount("/", routes![index, api_data, health])
        // Serve static files (including robots.txt)
        .mount("/", FileServer::from("./static"))
        // Register 403 catcher for a clean error page
        .register("/", catchers![forbidden])
}

#[catch(403)]
fn forbidden() -> &'static str {
    "Forbidden"
}
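
A pair of integration-test sketches using Rocket's built-in local client, assuming the rocket() builder above; they confirm that a bot UA receives a 403 on a guarded route while the unguarded /health stays open:

// src/main.rs — tests (sketch; the UA string is illustrative)
#[cfg(test)]
mod tests {
    use rocket::http::{Header, Status};
    use rocket::local::blocking::Client;

    #[test]
    fn bot_gets_403_on_guarded_route() {
        let client = Client::tracked(super::rocket()).expect("valid rocket instance");
        let res = client
            .get("/")
            .header(Header::new("User-Agent", "GPTBot/1.0"))
            .dispatch();
        assert_eq!(res.status(), Status::Forbidden);
    }

    #[test]
    fn bot_still_reaches_unguarded_route() {
        let client = Client::tracked(super::rocket()).expect("valid rocket instance");
        let res = client
            .get("/health")
            .header(Header::new("User-Agent", "GPTBot/1.0"))
            .dispatch();
        assert_eq!(res.status(), Status::Ok);
    }
}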

Step 4 — X-Robots-Tag via Fairing (global header)

This is what fairings are designed for: adding headers to all responses. on_response fires after every route handler, so it can set X-Robots-Tag on every response, including 403 error pages.

// src/fairings.rs — X-Robots-Tag on all responses
use rocket::fairing::{Fairing, Info, Kind};
use rocket::{Request, Response};

/// Fairing that adds X-Robots-Tag to every response.
/// Fairings CAN modify responses — they just cannot abort requests.
pub struct XRobotsTagFairing;

#[rocket::async_trait]
impl Fairing for XRobotsTagFairing {
    fn info(&self) -> Info {
        Info {
            name: "X-Robots-Tag Header",
            kind: Kind::Response,
        }
    }

    // on_response: fires AFTER the route handler returns.
    // Can modify headers, status, body — anything on the Response.
    async fn on_response<'r>(
        &self,
        _req: &'r Request<'_>,
        res: &mut Response<'r>,
    ) {
        res.set_raw_header("X-Robots-Tag", "noai, noimageai");
    }
}
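
To verify the fairing, a test sketch that reuses the rocket() builder from Step 3 (it assumes this module is compiled into the same binary crate as that builder):

// src/fairings.rs — test (sketch)
#[cfg(test)]
mod tests {
    use rocket::local::blocking::Client;

    #[test]
    fn header_is_set_on_every_response() {
        let client = Client::tracked(crate::rocket()).expect("valid rocket instance");
        let res = client.get("/health").dispatch();
        assert_eq!(res.headers().get_one("X-Robots-Tag"), Some("noai, noimageai"));
    }
}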

Step 5 — Global blocking via Fairing override (alternative)

Trade-off: The route handler still runs — database queries, computation, and side-effects all execute before the response is overwritten. Use this only when you cannot add Request Guards to every route. For zero-overhead global blocking, put Rocket behind nginx or Caddy.

// src/fairings.rs — global blocking via on_response override.
// (Shown standalone; if this shares fairings.rs with Step 4, drop the
// duplicate `use` lines below.)
use rocket::fairing::{Fairing, Info, Kind};
use rocket::http::Status;
use rocket::{Request, Response};
use std::io::Cursor;

use crate::bots::is_ai_bot;

/// Global bot blocker via Fairing.
///
/// ⚠️ TRADE-OFF: The route handler STILL RUNS — the response is
/// overwritten afterward. This wastes computation (DB queries, etc.)
/// but provides truly global coverage without adding guards to every route.
///
/// For zero-overhead global blocking, use nginx/Caddy in front of Rocket.
pub struct GlobalBotBlockerFairing;

#[rocket::async_trait]
impl Fairing for GlobalBotBlockerFairing {
    fn info(&self) -> Info {
        Info {
            name: "Global AI Bot Blocker",
            kind: Kind::Response,
        }
    }

    async fn on_response<'r>(
        &self,
        req: &'r Request<'_>,
        res: &mut Response<'r>,
    ) {
        let ua = req.headers().get_one("User-Agent").unwrap_or("");

        if is_ai_bot(ua) {
            // Override the entire response to 403
            res.set_status(Status::Forbidden);
            res.set_sized_body(
                Some("Forbidden".len()),
                Cursor::new("Forbidden"),
            );
            // Fix the Content-Type to match the replacement body
            res.remove_header("Content-Type");
            res.set_raw_header("Content-Type", "text/plain; charset=utf-8");
        }

        // X-Robots-Tag on ALL responses (blocked or not)
        res.set_raw_header("X-Robots-Tag", "noai, noimageai");
    }
}
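
To wire this variant up, attach it in place of the Step 4 fairing (both set X-Robots-Tag, so attaching both would be redundant); a sketch against the Step 3 builder:

// In src/main.rs, inside the #[launch] builder from Step 3:
rocket::build()
    .attach(fairings::GlobalBotBlockerFairing) // replaces XRobotsTagFairing
    .mount("/", routes![index, api_data, health])
    .mount("/", FileServer::from("./static"))
    .register("/", catchers![forbidden])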

Step 6 — robots.txt

Three options: static file via FileServer, a #[get] route handler, or compile-time embedding with include_str!(). The include_str! approach bakes the file into the binary at compile time — zero filesystem reads at runtime.

// Option A: Static file — place in ./static/robots.txt
// FileServer::from("./static") serves it automatically at /robots.txt.
// No code needed — just the file.

// Option B: Route handler — dynamic or compile-time embedded
#[get("/robots.txt")]
fn robots() -> (rocket::http::ContentType, &'static str) {
    (rocket::http::ContentType::Plain, ROBOTS_CONTENT)
}

// Option C: Compile-time embedded via include_str!()
// The file is baked into the binary — no filesystem read at runtime.
#[get("/robots.txt")]
fn robots_embedded() -> (rocket::http::ContentType, &'static str) {
    (
        rocket::http::ContentType::Plain,
        include_str!("../static/robots.txt"),
    )
}

const ROBOTS_CONTENT: &str = "User-agent: *
Allow: /

# AI training bots — blocked
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: AmazonBot
Disallow: /

User-agent: Diffbot
Disallow: /";
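
Whichever option you choose, only one source should answer /robots.txt. A sketch of mounting Option B alongside the Step 3 routes follows; with Rocket's default ranks an explicit route should outrank FileServer, but serving the file from a single place is less surprising:

// In the #[launch] builder: mount the explicit route and, if you keep
// FileServer::from("./static"), delete ./static/robots.txt so the file
// is served from exactly one place.
rocket::build()
    .mount("/", routes![robots, index, api_data, health])
    .mount("/", FileServer::from("./static"))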

Step 7 — noai meta tag via custom Responder

Rocket's Responder trait lets you create types that control every aspect of the HTTP response. Combine the noai meta tag in HTML with the X-Robots-Tag header in a single responder type.

// In a Tera/Handlebars/Askama template:
// <meta name="robots" content="noai, noimageai">

// With Rocket's responder system, inject via a custom responder:
use rocket::Request;
use rocket::response::{self, Responder, Response};
use rocket::http::ContentType;
use std::io::Cursor;

pub struct HtmlWithNoAi(pub String);

impl<'r> Responder<'r, 'static> for HtmlWithNoAi {
    fn respond_to(self, _req: &'r Request<'_>) -> response::Result<'static> {
        Response::build()
            .header(ContentType::HTML)
            .raw_header("X-Robots-Tag", "noai, noimageai")
            .sized_body(Some(self.0.len()), Cursor::new(self.0))
            .ok()
    }
}

// Usage in a route:
#[get("/")]
fn index(_guard: AiBotGuard) -> HtmlWithNoAi {
    HtmlWithNoAi(
        "<html><head><meta name=\"robots\" content=\"noai, noimageai\"></head>\
         <body><h1>Protected</h1></body></html>"
            .to_string(),
    )
}
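
And a test sketch confirming both signals arrive (Rocket's local client sends no User-Agent header, so AiBotGuard passes):

#[cfg(test)]
mod tests {
    use rocket::http::Status;
    use rocket::local::blocking::Client;

    #[test]
    fn html_carries_both_noai_signals() {
        let rocket = rocket::build().mount("/", rocket::routes![super::index]);
        let client = Client::tracked(rocket).expect("valid rocket instance");
        let res = client.get("/").dispatch();
        assert_eq!(res.status(), Status::Ok);
        assert_eq!(res.headers().get_one("X-Robots-Tag"), Some("noai, noimageai"));
        assert!(res.into_string().unwrap().contains("noai, noimageai"));
    }
}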

Rocket vs Actix-web vs Axum vs Warp

Middleware model
  Rocket: Request Guards (per-route) + Fairings (lifecycle)
  Actix-web: wrap() / wrap_fn(), wrapping the handler chain
  Axum: Tower layers (Service<Request> → Service<Request>)
  Warp: Filter combinators (composable extractors)

Can abort in middleware?
  Rocket: guards yes (Outcome::Error); fairings no
  Actix-web: yes, return an HttpResponse early
  Axum: yes, return a Response from the layer
  Warp: yes, return a Rejection

Global blocking
  Rocket: Fairing on_response override (handler still runs)
  Actix-web: app.wrap(middleware), runs on all routes
  Axum: Router::layer(middleware), runs on all routes
  Warp: .and(filter) at route composition level

Per-route blocking
  Rocket: fn index(_g: Guard), a function parameter
  Actix-web: .wrap() on a resource/scope
  Axum: .route_layer() on specific routes
  Warp: .and(filter) per route

UA header access
  Rocket: req.headers().get_one("User-Agent")
  Actix-web: req.headers().get("user-agent")
  Axum: req.headers().get("user-agent")
  Warp: warp::header::optional("user-agent")

Hard 403
  Rocket: Outcome::Error((Status::Forbidden, ()))
  Actix-web: HttpResponse::Forbidden().finish()
  Axum: (StatusCode::FORBIDDEN, "Forbidden").into_response()
  Warp: warp::reject::custom(Forbidden)

robots.txt
  Rocket: FileServer::from("./static") or a #[get] route
  Actix-web: actix_files::Files or a route
  Axum: tower_http::services::ServeDir or a route
  Warp: warp::fs::dir("./static") or a route

Compile-time checks
  Rocket: yes, route types checked at compile time
  Actix-web: partial, extractors checked at runtime
  Axum: partial, extractors checked at runtime
  Warp: yes, filter types checked at compile time

Quick reference

UA header: req.headers().get_one("User-Agent")
Guard 403: Outcome::Error((Status::Forbidden, ()))
Guard success: Outcome::Success(AiBotGuard)
X-Robots-Tag: res.set_raw_header("X-Robots-Tag", "noai, noimageai")
Fairing trait: impl Fairing; on_request (no abort), on_response (can modify)
Guard trait: impl FromRequest; Outcome::Error to block
robots.txt: FileServer::from("./static") or a #[get("/robots.txt")] route
Compile-time embed: include_str!("../static/robots.txt")
Catcher: #[catch(403)] fn forbidden() -> &'static str

FAQ

Can Rocket fairings block or abort incoming requests?

No. on_request returns () — there is no mechanism to return a response or abort the request. This is Rocket's deliberate design: fairings are side-effects (logging, metrics, header injection), not access control. For blocking AI bots, use a Request Guard that returns Outcome::Error((Status::Forbidden, ())).

How do I block AI bots globally without adding a guard to every route?

Three options: (1) Fairing on_response override — check the request UA and rewrite the response to 403. The route handler still runs (wasted CPU/DB) but the response is replaced. (2) Reverse proxy — put nginx or Caddy in front of Rocket and block at the proxy level. Zero wasted computation. This is the recommended production approach. (3) Guard on every route — explicit and Rocket-idiomatic, but verbose for large applications.

What is the difference between Request Guards and Fairings?

Request Guards (FromRequest): per-route, type-checked at compile time, CAN abort with Outcome::Error. Only run on routes that declare them as parameters. Fairings (Fairing trait): global lifecycle callbacks, CANNOT abort requests. on_request modifies the request, on_response modifies the response. Run on all requests regardless of route.

How do I serve robots.txt in Rocket?

(1) FileServer::from("./static") — place robots.txt in ./static/. Simplest approach. (2) Dedicated #[get("/robots.txt")] route — for dynamic content. (3) include_str!("../static/robots.txt") — embeds the file at compile time, zero filesystem reads at runtime. Use FileServer for simple cases, the route for dynamic content, include_str! for maximum performance.

How is this different from the general Rust guide on Open Shadow?

The general Rust guide covers Actix-web wrap_fn and Axum from_fn Tower middleware — both use the standard middleware pattern where you CAN short-circuit. Rocket's architecture is fundamentally different: fairings cannot abort, guards are the blocking mechanism. Different traits, different patterns, different trade-offs. If you're using Rocket, this guide applies. If you're using Actix-web or Axum, see the general Rust guide.

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.