How to Block AI Bots on aiohttp (Python): Complete 2026 Guide
aiohttp is Python's async HTTP client and server framework — both halves share the same event loop. Unlike Sanic (single-arg decorator, return to block) and Starlette/FastAPI (BaseHTTPMiddleware class with call_next), aiohttp uses a two-argument middleware function: (request, handler) — call await handler(request) to continue, or return a web.Response to block.
Handler-based chain — not return-None
aiohttp middleware explicitly receives the next handler as an argument. To continue: response = await handler(request). To block: return a web.Response without calling handler. This is conceptually similar to Koa.js's await next() pattern — unlike Sanic, which uses return-None to continue (no explicit handler argument).
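The contract can be sketched without aiohttp at all. Below is a hypothetical, framework-free version using plain coroutines and dicts (aiohttp's real objects are `web.Request` and `web.Response`; the names here are illustrative only):

```python
import asyncio

# Framework-agnostic sketch of aiohttp's two-argument middleware contract:
# the middleware receives (request, handler) and either awaits the handler
# to continue, or returns its own response to short-circuit the chain.

AI_BOTS = ['gptbot', 'claudebot']

async def route_handler(request):
    return {'status': 200, 'body': 'Hello'}

async def ai_bot_blocker(request, handler):
    ua = request.get('user-agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return {'status': 403, 'body': 'Forbidden'}  # block: handler never runs
    return await handler(request)                    # continue: call next step

async def main():
    blocked = await ai_bot_blocker({'user-agent': 'GPTBot/1.0'}, route_handler)
    allowed = await ai_bot_blocker({'user-agent': 'Mozilla/5.0'}, route_handler)
    return blocked['status'], allowed['status']

print(asyncio.run(main()))  # (403, 200)
```

The same shape maps one-to-one onto the real `@web.middleware` functions shown in the sections below.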
Protection layers
Layer 1: robots.txt
Unlike Sanic and Flask, aiohttp static routes go through the middleware stack. Use a dedicated route handler and add /robots.txt to your exempt paths:
# static/robots.txt
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: cohere-ai
User-agent: Bytespider
User-agent: Amazonbot
User-agent: PerplexityBot
User-agent: YouBot
User-agent: Diffbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /
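You can sanity-check the grouping logic with the standard library's `urllib.robotparser` before deploying. This is a sketch that feeds an abbreviated version of the file above to the parser; the wildcard group should admit everyone, while the AI-bot group denies all paths:

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the robots.txt rules for a quick local check.
ROBOTS = """\
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())

print(parser.can_fetch('GPTBot/1.0', '/'))   # False: listed AI bot is disallowed
print(parser.can_fetch('Mozilla/5.0', '/'))  # True: falls through to the * group
```

Note that `robotparser` checks named groups before the `*` default, which mirrors how well-behaved crawlers interpret the file.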
# app.py — serve robots.txt via route handler
from pathlib import Path
from aiohttp import web
ROBOTS_TXT = Path('static/robots.txt').read_text()
async def robots_handler(request: web.Request) -> web.Response:
    return web.Response(text=ROBOTS_TXT, content_type='text/plain')
app = web.Application()
app.router.add_get('/robots.txt', robots_handler)
aiohttp's add_static() routes are not exempt from middleware — unlike Sanic (app.static() bypasses on_request) and Flask (static files bypass before_request). Always exempt /robots.txt in your blocker.
Layers 2, 3 & 4: middleware module
Create middleware/ai_bot_blocker.py. The @web.middleware decorator marks a function as middleware. It receives request and handler (the next step):
# middleware/ai_bot_blocker.py
from aiohttp import web
AI_BOTS = [
    'gptbot', 'chatgpt-user', 'claudebot', 'anthropic-ai',
    'ccbot', 'cohere-ai', 'bytespider', 'amazonbot',
    'applebot-extended', 'perplexitybot', 'youbot', 'diffbot',
    'google-extended', 'deepseekbot', 'mistralbot', 'xai-bot',
    'ai2bot', 'oai-searchbot', 'duckassistbot',
]
EXEMPT_PATHS = {'/robots.txt', '/sitemap.xml', '/favicon.ico'}
@web.middleware
async def ai_bot_blocker(request: web.Request, handler):
    """Block AI bots, set noai meta context, inject X-Robots-Tag."""
    # Layer 2: Set noai meta for templates (request is a MutableMapping)
    request['robots'] = 'noai, noimageai'
    # Exempt paths — robots.txt must always be accessible
    if request.path in EXEMPT_PATHS:
        response = await handler(request)
        response.headers['X-Robots-Tag'] = 'noai, noimageai'
        return response
    # Layer 4: Hard block for AI bots
    ua = request.headers.get('User-Agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return web.Response(
            status=403,
            text='Forbidden: AI crawlers are not permitted.',
            content_type='text/plain',
        )
    # Continue to route handler
    response = await handler(request)
    # Layer 3: X-Robots-Tag on all legitimate responses
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    return response
Register the middleware when creating the application:
# app.py
from aiohttp import web
from middleware.ai_bot_blocker import ai_bot_blocker
app = web.Application(middlewares=[ai_bot_blocker])
# Or append after creation:
# app.middlewares.append(ai_bot_blocker)
app.router.add_get('/robots.txt', robots_handler)
app.router.add_get('/', index_handler)
if __name__ == '__main__':
    web.run_app(app, host='0.0.0.0', port=8080)
Layer 2: noai meta tag
aiohttp's Request object implements MutableMapping — you can store arbitrary keys directly on it. This is the correct way to pass data from middleware to handlers (unlike Sanic's request.ctx SimpleNamespace):
# In middleware (already set above):
request['robots'] = 'noai, noimageai'
# In aiohttp_jinja2 template (base.html):
# <meta name="robots" content="{{ request['robots'] }}">
# Route handler — override per-page:
async def public_page(request: web.Request) -> web.Response:
    request['robots'] = 'index, follow'  # Override for public content
    context = {'request': request}
    return aiohttp_jinja2.render_template('page.html', request, context)
# Or pass directly in template context:
async def handler(request: web.Request) -> web.Response:
    context = {'robots': request.get('robots', 'noai, noimageai')}
    return aiohttp_jinja2.render_template('page.html', request, context)
aiohttp: request['robots'] — MutableMapping on the Request object itself.
Sanic: request.ctx.robots — SimpleNamespace attribute.
Both are per-request and async-safe. aiohttp's approach is dict-like; Sanic's is attribute-based.
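The two access patterns can be illustrated with plain stdlib stand-ins (these are illustrative only; the real objects are aiohttp's `web.Request` and Sanic's `request.ctx`):

```python
from types import SimpleNamespace

# aiohttp style: the request itself behaves like a dict (MutableMapping),
# so middleware and handlers share data via item access.
aiohttp_style_request = {}
aiohttp_style_request['robots'] = 'noai, noimageai'

# Sanic style: a SimpleNamespace hung off the request as request.ctx,
# so data is shared via attribute access.
sanic_style_ctx = SimpleNamespace()
sanic_style_ctx.robots = 'noai, noimageai'

print(aiohttp_style_request['robots'])  # dict-like lookup
print(sanic_style_ctx.robots)           # attribute lookup
```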
Sub-application scoped middleware
aiohttp doesn't have Blueprints (Sanic) or route groups (Express). Instead, use sub-applications — each sub-app is a full web.Application with its own middleware stack, mounted at a path prefix:
from aiohttp import web
from middleware.ai_bot_blocker import ai_bot_blocker
# Main app — no bot blocking on public pages
app = web.Application()
async def index(request):
    return web.Response(text='Hello!')  # Bots can access this
app.router.add_get('/', index)
# API sub-app — bot blocking enabled
api = web.Application(middlewares=[ai_bot_blocker])
async def api_data(request):
    return web.json_response({'results': [1, 2, 3]})
api.router.add_get('/data', api_data)
# Mount sub-app at /api/
app.add_subapp('/api/', api)
# Request to /api/data → runs ai_bot_blocker middleware
# Request to / → does NOT run ai_bot_blocker
Sub-application middleware runs after the parent app's middleware. If both parent and sub-app have bot blockers, the parent runs first. Sub-app routes are relative to the mount point — /data becomes /api/data.
Middleware execution order
aiohttp middleware executes in list order — the first middleware in the middlewares list is the outermost (runs first on request, last on response). This is the same as Sanic and opposite to Starlette's LIFO add_middleware():
app = web.Application(middlewares=[
    ai_bot_blocker,      # runs FIRST (outermost)
    auth_middleware,     # runs SECOND
    logging_middleware,  # runs THIRD (innermost, closest to handler)
])
# Execution order for a request:
# ai_bot_blocker → auth_middleware → logging_middleware → handler
# Response flows back:
# handler → logging_middleware → auth_middleware → ai_bot_blocker
aiohttp middleware wraps like an onion — each middleware calls await handler(request), which invokes the next middleware inward. The bot blocker should be first in the list to reject bots before auth, logging, or any other processing.
Old-style middleware (pre-3.0)
aiohttp versions before 3.0 used a middleware factory pattern — an outer function that receives app and handler, returning an inner handler. The new-style (used above) is preferred for all new code:
# OLD-style (aiohttp < 3.0) — factory pattern
async def ai_bot_blocker_factory(app, handler):
    async def middleware_handler(request):
        ua = request.headers.get('User-Agent', '').lower()
        if any(bot in ua for bot in AI_BOTS):
            return web.Response(status=403, text='Forbidden')
        response = await handler(request)
        response.headers['X-Robots-Tag'] = 'noai, noimageai'
        return response
    return middleware_handler

# NEW-style (aiohttp 3.0+) — @web.middleware decorator
@web.middleware
async def ai_bot_blocker(request, handler):
    ua = request.headers.get('User-Agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return web.Response(status=403, text='Forbidden')
    response = await handler(request)
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    return response
Both styles are registered the same way: web.Application(middlewares=[...]). The new style is simpler — one function, two args, no nested closure.
aiohttp vs Sanic vs Starlette — blocking comparison
aiohttp — handler-based chain
# aiohttp @web.middleware
@web.middleware
async def block_bots(request, handler):
    ua = request.headers.get('User-Agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return web.Response(status=403, text='Forbidden')
    return await handler(request)  # call handler to continue
Sanic — return-based (no handler arg)
# sanic @app.on_request
@app.on_request
async def block_bots(request):
    ua = request.headers.get('user-agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return text('Forbidden', status=403)
    # return None implicitly → continues to route
Starlette/FastAPI — BaseHTTPMiddleware class
# starlette BaseHTTPMiddleware
class AiBotBlocker(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        ua = request.headers.get('user-agent', '').lower()
        if any(b in ua for b in AI_BOTS):
            return Response('Forbidden', status_code=403)
        return await call_next(request)
Koa.js — conceptually similar to aiohttp
// koa middleware — same two-arg pattern
async function blockBots(ctx, next) {
  const ua = (ctx.get('user-agent') || '').toLowerCase();
  if (AI_BOTS.some(b => ua.includes(b))) {
    ctx.status = 403; ctx.body = 'Forbidden'; return;
  }
  await next(); // call next to continue (like handler)
}
aiohttp and Koa.js share the two-arg pattern (request + next-callable). Sanic has no handler argument — return None to pass. Starlette wraps it in a class method.
Response headers are mutable
aiohttp web.Response headers are a mutable CIMultiDict (request headers are the read-only CIMultiDictProxy) — you can mutate response headers in place after await handler(request) returns. No need to capture a new object (unlike PSR-7's immutable withHeader()):
@web.middleware
async def inject_headers(request, handler):
    response = await handler(request)
    # Mutable — set directly (like Sanic, unlike PSR-7)
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    response.headers['X-Content-Type-Options'] = 'nosniff'
    # CIMultiDict — case-insensitive keys:
    # response.headers['x-robots-tag'] accesses the same header
    return response
aiohttp: response.headers['X-Robots-Tag'] = value (mutates in place).
PSR-7 (Slim, CakePHP): $response = $response->withHeader(...) (must capture).
Both are valid after calling the next handler — aiohttp buffers the response.
Testing
aiohttp ships with aiohttp.test_utils.AioHTTPTestCase and works with pytest-aiohttp:
import pytest
from app import create_app

# pytest-aiohttp style (recommended): the aiohttp_client fixture
# builds and starts a test client around the application.
@pytest.fixture
async def client(aiohttp_client):
    return await aiohttp_client(create_app())

async def test_blocks_ai_bot(client):
    resp = await client.get(
        '/articles/test',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    assert resp.status == 403

async def test_allows_browser(client):
    resp = await client.get(
        '/articles/test',
        headers={'User-Agent': 'Mozilla/5.0 (compatible)'},
    )
    assert resp.status == 200
    assert resp.headers.get('X-Robots-Tag') == 'noai, noimageai'

async def test_robots_txt_accessible_to_bots(client):
    resp = await client.get(
        '/robots.txt',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    assert resp.status == 200  # Exempt path — not blocked
AI bot User-Agent strings (2026)
aiohttp headers are case-insensitive via CIMultiDict — but lowercase the User-Agent value before matching: request.headers.get('User-Agent', '').lower().
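To be concrete: the header lookup is case-insensitive, but the header value arrives in whatever case the bot sent, so normalize it before substring matching. A minimal matching helper (the `is_ai_bot` name and the abbreviated blocklist here are illustrative):

```python
# Abbreviated blocklist; the full list lives in the middleware module.
AI_BOTS = ['gptbot', 'claudebot', 'perplexitybot']

def is_ai_bot(user_agent: str) -> bool:
    """Substring-match the lowercased User-Agent against the blocklist."""
    ua = (user_agent or '').lower()
    return any(bot in ua for bot in AI_BOTS)

print(is_ai_bot('Mozilla/5.0 (compatible; GPTBot/1.2)'))       # True
print(is_ai_bot('Mozilla/5.0 (Windows NT 10.0; Win64; x64)'))  # False
```

Substring matching (rather than equality) is deliberate: real bot UAs embed the bot token inside a longer Mozilla-compatible string.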
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.