How to Block AI Bots on Sanic (Python): Complete 2026 Guide
Sanic is Python's async-native web framework: no WSGI layer, async handlers by default, built to run fast on its own event loop. Unlike Falcon (exception-based blocking via raise HTTPForbidden()) and Starlette/FastAPI (ASGI middleware class), Sanic uses a decorator-based, return-to-block pattern: return an HTTPResponse from an @app.on_request handler to stop the chain.
Return-based blocking — not exception-based
Sanic request middleware blocks by returning an HTTPResponse. Return None (implicitly or explicitly) to continue to the route handler. This is the same pattern as Flask's before_request(), but async. Unlike Falcon, you do not raise an exception to block.
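A minimal sketch of the return-to-block contract, written as a plain asyncio model and not as Sanic's actual internals. The names run_request_chain, block_bots, and route_handler are hypothetical, and requests and responses are modelled as a dict and a string:

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

# Hypothetical stand-ins: a request is a dict, a response is a string.
Request = dict
Middleware = Callable[[Request], Awaitable[Optional[str]]]


async def block_bots(request: Request) -> Optional[str]:
    """Return a response to block, or None to let the request continue."""
    ua = request.get("user-agent", "").lower()
    if "gptbot" in ua:
        return "403 Forbidden"
    return None


async def route_handler(request: Request) -> str:
    """Stand-in for the route handler that runs when nothing blocks."""
    return "200 OK"


async def run_request_chain(request: Request, middleware: List[Middleware]) -> str:
    """Model of the contract: the first non-None return short-circuits."""
    for handler in middleware:
        response = await handler(request)
        if response is not None:
            return response  # the route handler is never reached
    return await route_handler(request)


blocked = asyncio.run(run_request_chain({"user-agent": "GPTBot/1.0"}, [block_bots]))
allowed = asyncio.run(run_request_chain({"user-agent": "Mozilla/5.0"}, [block_bots]))
```

Here blocked holds the 403 string because block_bots returned a value, while allowed falls through to route_handler because every middleware returned None.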
Protection layers
Layer 1: robots.txt
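Use app.static() to serve /robots.txt as a static route. A static route is still a route, so on_request middleware runs for it as for any other request, and the blocker has to exempt the path explicitly (the EXEMPT_PATHS check in the middleware module does this). The file itself lists the crawlers to turn away: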
# static/robots.txt
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: cohere-ai
User-agent: Bytespider
User-agent: Amazonbot
User-agent: PerplexityBot
User-agent: YouBot
User-agent: Diffbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /
# server.py
from sanic import Sanic
app = Sanic("myapp")
# Register the static routes
app.static('/robots.txt', './static/robots.txt')
app.static('/sitemap.xml', './static/sitemap.xml')
# Now add your middleware (registration order relative to static routes
# does not affect routing, but keep it explicit for clarity)
@app.on_request
async def block_ai_bots(request):
    ...

Sanic matches the route before it calls request middleware, and the middleware then runs for every request, static paths included. Exempting /robots.txt inside the handler is therefore what keeps the file reachable for crawlers.
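One way to sanity-check the rules in static/robots.txt is Python's standard urllib.robotparser. The sketch below uses an abbreviated copy of the file (an assumption made for brevity) and only shows how a compliant parser reads the rules. It says nothing about crawlers that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the rules; the real file lists more agents.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A listed AI crawler matches the Disallow group; any other agent
# falls back to the "*" group, which allows everything.
gptbot_allowed = parser.can_fetch("GPTBot", "/articles/test")
browser_allowed = parser.can_fetch("Mozilla/5.0", "/articles/test")
```

Keeping the listed agents in one group with a single Disallow line, as the file does, means the parser applies the same rule to each of them.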
Layers 2, 3 & 4: middleware module
Create middleware/ai_bot_blocker.py with two handlers — one for request phase (block + set context), one for response phase (inject header):
# middleware/ai_bot_blocker.py
from sanic.request import Request
from sanic.response import HTTPResponse, text
AI_BOTS = [
'gptbot', 'chatgpt-user', 'claudebot', 'anthropic-ai',
'ccbot', 'cohere-ai', 'bytespider', 'amazonbot',
'applebot-extended', 'perplexitybot', 'youbot', 'diffbot',
'google-extended', 'deepseekbot', 'mistralbot', 'xai-bot',
'ai2bot', 'oai-searchbot', 'duckassistbot',
]
EXEMPT_PATHS = {'/robots.txt', '/sitemap.xml', '/favicon.ico'}
async def block_ai_bots(request: Request) -> HTTPResponse | None:
    """on_request handler: return a response to block, None to pass through."""
    # Always set noai meta context for templates
    request.ctx.robots = 'noai, noimageai'
    # Exempt paths bypass bot blocking (robots.txt must always be accessible)
    if request.path in EXEMPT_PATHS:
        return None
    ua = request.headers.get('user-agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return text('Forbidden: AI crawlers are not permitted.', status=403)
    # Return None to continue to the route handler
    return None

async def inject_robots_tag(request: Request, response: HTTPResponse) -> HTTPResponse | None:
    """on_response handler: add X-Robots-Tag to every outgoing response."""
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    return None  # Return None to keep the existing response

Register both handlers in your app:
# server.py
from sanic import Sanic
from middleware.ai_bot_blocker import block_ai_bots, inject_robots_tag
app = Sanic("myapp")
app.static('/robots.txt', './static/robots.txt')
# Register middleware
app.on_request(block_ai_bots)
app.on_response(inject_robots_tag)
# Alternatively, use the decorator form:
# @app.on_request
# async def block_ai_bots(request): ...

Layer 2: noai meta tag
request.ctx is Sanic's per-request context object — a SimpleNamespace scoped to the current request. It is the correct way to pass data from middleware to route handlers and templates (equivalent to Flask's g, but request-scoped and async-safe):
# In block_ai_bots handler (already set above):
request.ctx.robots = 'noai, noimageai'
# In a Jinja2 template (via sanic-ext or sanic-jinja2):
# base.html
# <meta name="robots" content="{{ request.ctx.robots | default('noai, noimageai') }}">
# Route handler: override per-page if needed.
# render() stands for your template helper (sanic-ext or sanic-jinja2);
# its exact signature depends on the library you use.
@app.route('/public-page')
async def public_page(request):
    request.ctx.robots = 'index, follow'  # Override for public content
    return await render('page.html', request=request)

request.ctx is per-request: it is scoped to one HTTP request and created fresh each time. app.ctx is application-level and lives for the app lifetime (like a global store). Always use request.ctx for per-request data.

Blueprint-scoped middleware
Use Blueprint middleware to restrict blocking to specific route groups (e.g., only your API routes). Blueprint on_request runs in addition to any app-level middleware:
from sanic import Sanic, Blueprint
from sanic.response import text
app = Sanic("myapp")
# --- API Blueprint with bot blocking ---
api = Blueprint('api', url_prefix='/api')
AI_BOTS = ['gptbot', 'claudebot', 'ccbot', 'anthropic-ai']  # ...plus the rest of your list
@api.on_request
async def block_bots_on_api(request):
    ua = request.headers.get('user-agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return text('Forbidden', status=403)

@api.route('/data')
async def api_data(request):
    return text('{"results": [...]}')

# --- Public Blueprint: no bot blocking ---
web = Blueprint('web', url_prefix='/')
@web.route('/')
async def index(request):
    return text('Hello!')  # Bots can access this

app.blueprint(api)
app.blueprint(web)

Blueprint middleware runs after app-level middleware. Registration order within a Blueprint matters: request handlers are called in the order they are registered.
Middleware execution order
Sanic request middleware executes in FIFO order: first registered runs first. This differs from Starlette's LIFO add_middleware():
app.on_request(block_ai_bots)       # runs FIRST for requests
app.on_request(log_requests)        # runs SECOND for requests
# For responses, on_response runs in REVERSE order (LIFO)
app.on_response(inject_robots_tag)  # runs SECOND for responses
app.on_response(log_responses)      # runs FIRST for responses
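The ordering in the listing above can be modelled in a few lines of plain Python. This is a hypothetical model for illustration, not Sanic's implementation: the make helper and the calls list are invented here, and each handler only records its own name.

```python
from typing import Callable, List

# Records the order in which the modelled handlers run.
calls: List[str] = []


def make(name: str) -> Callable[[], None]:
    """Build a stand-in handler that appends its name when called."""
    def handler() -> None:
        calls.append(name)
    return handler


# Lists are kept in registration order, as in the listing above.
request_middleware = [make("block_ai_bots"), make("log_requests")]
response_middleware = [make("inject_robots_tag"), make("log_responses")]

# Request phase: registration order (first registered runs first).
for handler in request_middleware:
    handler()

# Response phase: reverse registration order (last registered runs first).
for handler in reversed(response_middleware):
    handler()
```

After the run, calls lists the two request handlers in registration order followed by the two response handlers in reverse registration order.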
Sanic request middleware fires in registration order (first in, first out). Response middleware fires in reverse registration order (last in, first out). Always register your bot blocker as the first on_request handler.

Older Sanic syntax (pre-21.12)
If you're on Sanic older than 21.12, use @app.middleware('request') and @app.middleware('response'). The handler signatures are identical:
# Sanic < 21.12 — @app.middleware decorator
@app.middleware('request')
async def block_ai_bots(request):
    request.ctx.robots = 'noai, noimageai'
    if request.path in EXEMPT_PATHS:
        return
    ua = request.headers.get('user-agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return text('Forbidden', status=403)

@app.middleware('response')
async def inject_robots_tag(request, response):
    response.headers['X-Robots-Tag'] = 'noai, noimageai'

Both syntaxes work in Sanic 21.12+. The on_request / on_response form is preferred for new code.
Sanic vs Flask vs Falcon — blocking comparison
Sanic: async, return response
# sanic on_request middleware (async)
@app.on_request
async def block_bots(request):
    ua = request.headers.get('user-agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return text('Forbidden', status=403)  # return stops chain

Flask: sync, return response from before_request
# flask
@app.before_request
def block_bots():
    ua = request.headers.get('User-Agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return Response('Forbidden', 403)  # return stops chain

Falcon: raise exception
# falcon middleware (method of a middleware class)
def process_request(self, req, resp):
    ua = req.get_header('User-Agent') or ''
    if any(b in ua.lower() for b in AI_BOTS):
        raise falcon.HTTPForbidden()  # raise stops chain

FastAPI / Starlette: ASGI middleware
# starlette BaseHTTPMiddleware (dispatch method of the subclass)
async def dispatch(self, request, call_next):
    ua = request.headers.get('user-agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return Response('Forbidden', status_code=403)
    return await call_next(request)

Sanic and Flask share the return-based pattern. Falcon is exception-based. Starlette/FastAPI uses an ASGI middleware class with call_next().
Testing
Sanic's test client ships in the separate sanic-testing package (pip install sanic-testing). With it installed, app.test_client is a property, not a method call:
import pytest
from sanic import Sanic
from server import create_app
@pytest.fixture
def app():
    return create_app()

def test_blocks_ai_bot(app):
    _, response = app.test_client.get(
        '/articles/test',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    assert response.status == 403

def test_allows_browser(app):
    _, response = app.test_client.get(
        '/articles/test',
        headers={'User-Agent': 'Mozilla/5.0 (compatible)'},
    )
    assert response.status == 200
    assert response.headers.get('X-Robots-Tag') == 'noai, noimageai'

def test_robots_txt_accessible_to_bots(app):
    _, response = app.test_client.get(
        '/robots.txt',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    # /robots.txt is in EXEMPT_PATHS, so the blocker lets the request through
    assert response.status == 200

AI bot User-Agent strings (2026)
Lowercase and check with bot in ua.lower() for case-insensitive matching in Sanic.
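A minimal sketch of that check as a standalone function, assuming an abbreviated subset of the bot list. The helper name is_ai_bot is hypothetical, and the sample User-Agent strings are illustrative, not the crawlers' exact strings.

```python
from typing import Optional

# Abbreviated subset of the lowercase tokens used in the middleware module.
AI_BOTS = ("gptbot", "claudebot", "ccbot", "perplexitybot")


def is_ai_bot(user_agent: Optional[str]) -> bool:
    """Case-insensitive substring match against the known bot tokens."""
    ua = (user_agent or "").lower()  # a missing header is treated as empty
    return any(bot in ua for bot in AI_BOTS)


# Illustrative inputs: mixed case still matches, ordinary browsers do not.
samples = {
    "GPTBot/1.0": is_ai_bot("GPTBot/1.0"),
    "Mozilla/5.0 (compatible; claudebot/1.0)": is_ai_bot("Mozilla/5.0 (compatible; claudebot/1.0)"),
    "Mozilla/5.0 (X11; Linux x86_64)": is_ai_bot("Mozilla/5.0 (X11; Linux x86_64)"),
}
```

Substring matching is deliberately loose: it catches version suffixes and wrapped User-Agent strings, at the cost of also matching any client whose header happens to contain one of the tokens.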