
How to Block AI Bots on aiohttp (Python): Complete 2026 Guide

aiohttp is Python's async HTTP client and server framework — both halves share the same event loop. Unlike Sanic (single-arg decorator, return to block) and Starlette/FastAPI (BaseHTTPMiddleware class with call_next), aiohttp uses a two-argument middleware function: (request, handler) — call await handler(request) to continue, or return a web.Response to block.

Handler-based chain — not return-None

aiohttp middleware explicitly receives the next handler as an argument. To continue: response = await handler(request). To block: return a web.Response without calling handler. This is conceptually similar to Koa.js's await next() pattern — unlike Sanic, which uses return-None to continue (no explicit handler argument).

Protection layers

1. robots.txt: Route handler or static file. Add it to EXEMPT_PATHS, because aiohttp static routes go through middleware.
2. noai meta tag: request["robots"] is set in middleware and read by aiohttp_jinja2 templates.
3. X-Robots-Tag header: Set on the response object after await handler(request) returns.
4. Hard 403 block: Return web.Response(status=403) without calling handler, so the route never executes.

Layer 1: robots.txt

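Routes registered with add_static() go through the aiohttp middleware stack like any other route, so a statically served robots.txt would be blocked together with everything else. Serve it from a dedicated route handler and add /robots.txt to your exempt paths: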

# static/robots.txt

User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: cohere-ai
User-agent: Bytespider
User-agent: Amazonbot
User-agent: PerplexityBot
User-agent: YouBot
User-agent: Diffbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /

# app.py: serve robots.txt via a route handler
from pathlib import Path
from aiohttp import web

ROBOTS_TXT = Path('static/robots.txt').read_text()

async def robots_handler(request: web.Request) -> web.Response:
    return web.Response(text=ROBOTS_TXT, content_type='text/plain')

app = web.Application()
app.router.add_get('/robots.txt', robots_handler)

Static routes go through middleware
Routes added with add_static() are not exempt from middleware: they pass through the same chain as every other route. Always exempt /robots.txt in your blocker.

Layers 2, 3 & 4: middleware module

Create middleware/ai_bot_blocker.py. The @web.middleware decorator marks a function as middleware. It receives request and handler (the next step):

# middleware/ai_bot_blocker.py
from aiohttp import web

AI_BOTS = [
    'gptbot', 'chatgpt-user', 'claudebot', 'anthropic-ai',
    'ccbot', 'cohere-ai', 'bytespider', 'amazonbot',
    'applebot-extended', 'perplexitybot', 'youbot', 'diffbot',
    'google-extended', 'deepseekbot', 'mistralbot', 'xai-bot',
    'ai2bot', 'oai-searchbot', 'duckassistbot',
]

EXEMPT_PATHS = {'/robots.txt', '/sitemap.xml', '/favicon.ico'}


@web.middleware
async def ai_bot_blocker(request: web.Request, handler):
    """Block AI bots, set noai meta context, inject X-Robots-Tag."""

    # Layer 2: Set noai meta for templates (request is a MutableMapping)
    request['robots'] = 'noai, noimageai'

    # Exempt paths — robots.txt must always be accessible
    if request.path in EXEMPT_PATHS:
        response = await handler(request)
        response.headers['X-Robots-Tag'] = 'noai, noimageai'
        return response

    # Layer 4: Hard block for AI bots
    ua = request.headers.get('User-Agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return web.Response(
            status=403,
            text='Forbidden: AI crawlers are not permitted.',
            content_type='text/plain',
        )

    # Continue to route handler
    response = await handler(request)

    # Layer 3: X-Robots-Tag on all legitimate responses
    response.headers['X-Robots-Tag'] = 'noai, noimageai'

    return response

Register the middleware when creating the application:

# app.py
from aiohttp import web
from middleware.ai_bot_blocker import ai_bot_blocker

app = web.Application(middlewares=[ai_bot_blocker])

# Or append after creation:
# app.middlewares.append(ai_bot_blocker)

# robots_handler is the Layer 1 handler; index_handler stands in for your own route handler
app.router.add_get('/robots.txt', robots_handler)
app.router.add_get('/', index_handler)

if __name__ == '__main__':
    web.run_app(app, host='0.0.0.0', port=8080)

Layer 2: noai meta tag

aiohttp's Request object implements MutableMapping — you can store arbitrary keys directly on it. This is the correct way to pass data from middleware to handlers (unlike Sanic's request.ctx SimpleNamespace):

# In middleware (already set above):
request['robots'] = 'noai, noimageai'

# In aiohttp_jinja2 template (base.html), with 'request' passed in the context:
# <meta name="robots" content="{{ request['robots'] }}">

import aiohttp_jinja2
from aiohttp import web

# Route handler: override per page
async def public_page(request: web.Request) -> web.Response:
    request['robots'] = 'index, follow'  # Override for public content
    context = {'request': request}
    return aiohttp_jinja2.render_template('page.html', request, context)

# Or pass the value directly in the template context
# (the template then reads {{ robots }} instead of {{ request['robots'] }}):
async def handler(request: web.Request) -> web.Response:
    context = {'robots': request.get('robots', 'noai, noimageai')}
    return aiohttp_jinja2.render_template('page.html', request, context)

request[key] vs request.ctx
aiohttp: request['robots'] (MutableMapping on the Request object itself).
Sanic: request.ctx.robots (SimpleNamespace attribute).
Both are per-request and async-safe. aiohttp's approach is dict-like; Sanic's is attribute-based.

Sub-application scoped middleware

aiohttp doesn't have Blueprints (Sanic) or route groups (Express). Instead, use sub-applications — each sub-app is a full web.Application with its own middleware stack, mounted at a path prefix:

from aiohttp import web
from middleware.ai_bot_blocker import ai_bot_blocker

# Main app — no bot blocking on public pages
app = web.Application()

async def index(request):
    return web.Response(text='Hello!')  # Bots can access this

app.router.add_get('/', index)

# API sub-app — bot blocking enabled
api = web.Application(middlewares=[ai_bot_blocker])

async def api_data(request):
    return web.json_response({'results': [1, 2, 3]})

api.router.add_get('/data', api_data)

# Mount sub-app at /api/
app.add_subapp('/api/', api)

# Request to /api/data → runs ai_bot_blocker middleware
# Request to / → does NOT run ai_bot_blocker

Sub-application middleware runs after the parent app's middleware. If both parent and sub-app have bot blockers, the parent runs first. Sub-app routes are relative to the mount point — /data becomes /api/data.

Middleware execution order

aiohttp middleware executes in list order — the first middleware in the middlewares list is the outermost (runs first on request, last on response). This is the same as Sanic and opposite to Starlette's LIFO add_middleware():

app = web.Application(middlewares=[
    ai_bot_blocker,    # runs FIRST (outermost)
    auth_middleware,   # runs SECOND
    logging_middleware,  # runs THIRD (innermost, closest to handler)
])

# Execution order for a request:
# ai_bot_blocker → auth_middleware → logging_middleware → handler
# Response flows back:
# handler → logging_middleware → auth_middleware → ai_bot_blocker

Onion model
aiohttp middleware wraps like an onion: each middleware calls await handler(request), which invokes the next middleware inward. The bot blocker should be first in the list so that bots are rejected before auth, logging, or any other processing.
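The onion ordering described above can be reproduced without aiohttp. The following is a simplified model, not aiohttp's actual implementation: make_middleware, build_chain, and the string standing in for a request are hypothetical names used only for illustration. It wraps the handler from the inside out, so the first middleware in the list ends up outermost.

```python
import asyncio
from functools import partial

calls = []  # records the order in which each layer runs


def make_middleware(name):
    """Build a hypothetical two-argument middleware that logs entry and exit."""
    async def middleware(request, handler):
        calls.append(f"{name}: request")
        response = await handler(request)  # go one layer inward
        calls.append(f"{name}: response")
        return response
    return middleware


async def route_handler(request):
    calls.append("handler")
    return "ok"


def build_chain(middlewares, handler):
    """Wrap the handler in reversed list order so the first entry is outermost."""
    for mw in reversed(middlewares):
        handler = partial(mw, handler=handler)
    return handler


chain = build_chain(
    [make_middleware("ai_bot_blocker"), make_middleware("auth"), make_middleware("logging")],
    route_handler,
)
result = asyncio.run(chain("fake-request"))
```

After the run, calls starts with the ai_bot_blocker request entry and ends with the ai_bot_blocker response entry, with the handler in the middle.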

Old-style middleware (pre-3.0)

aiohttp versions before 3.0 used a middleware factory pattern — an outer function that receives app and handler, returning an inner handler. The new-style (used above) is preferred for all new code:

# OLD-style (aiohttp < 3.0) — factory pattern
async def ai_bot_blocker_factory(app, handler):
    async def middleware_handler(request):
        ua = request.headers.get('User-Agent', '').lower()
        if any(bot in ua for bot in AI_BOTS):
            return web.Response(status=403, text='Forbidden')
        response = await handler(request)
        response.headers['X-Robots-Tag'] = 'noai, noimageai'
        return response
    return middleware_handler

# NEW-style (aiohttp 3.0+) — @web.middleware decorator
@web.middleware
async def ai_bot_blocker(request, handler):
    ua = request.headers.get('User-Agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return web.Response(status=403, text='Forbidden')
    response = await handler(request)
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    return response

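Both styles are registered the same way: web.Application(middlewares=[...]). Old-style factories still run on aiohttp 3.x but emit a DeprecationWarning, so migrate them when you touch the code. The new style is simpler: one function, two arguments, no nested closure.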

aiohttp vs Sanic vs Starlette — blocking comparison

aiohttp — handler-based chain

# aiohttp @web.middleware
@web.middleware
async def block_bots(request, handler):
    ua = request.headers.get('User-Agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return web.Response(status=403, text='Forbidden')
    return await handler(request)  # call handler to continue

Sanic — return-based (no handler arg)

# sanic @app.on_request
@app.on_request
async def block_bots(request):
    ua = request.headers.get('user-agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return text('Forbidden', status=403)
    # return None implicitly → continues to route

Starlette/FastAPI — BaseHTTPMiddleware class

# starlette BaseHTTPMiddleware
class AiBotBlocker(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        ua = request.headers.get('user-agent', '').lower()
        if any(b in ua for b in AI_BOTS):
            return Response('Forbidden', status_code=403)
        return await call_next(request)

Koa.js — conceptually similar to aiohttp

// koa middleware — same two-arg pattern
async function blockBots(ctx, next) {
  const ua = (ctx.get('user-agent') || '').toLowerCase();
  if (AI_BOTS.some(b => ua.includes(b))) {
    ctx.status = 403; ctx.body = 'Forbidden'; return;
  }
  await next();  // call next to continue (like handler)
}

aiohttp and Koa.js share the two-arg pattern (request + next-callable). Sanic has no handler argument — return None to pass. Starlette wraps it in a class method.

Response headers are mutable

aiohttp web.Response headers are a CIMultiDict (the read-only CIMultiDictProxy is what request.headers returns), so you can mutate them in place after await handler(request) returns. There is no need to capture a new object, unlike PSR-7's immutable withHeader():

@web.middleware
async def inject_headers(request, handler):
    response = await handler(request)

    # Mutable — set directly (like Sanic, unlike PSR-7)
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    response.headers['X-Content-Type-Options'] = 'nosniff'

    # CIMultiDict — case-insensitive keys
    # response.headers['x-robots-tag'] accesses the same header

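    return response

Mutable like Sanic/Express, unlike PSR-7
aiohttp: response.headers['X-Robots-Tag'] = value (mutates in place).
PSR-7 (Slim, CakePHP): $response = $response->withHeader(...) (must capture).
Both are valid after calling the next handler, because a web.Response is not written to the client until the middleware chain returns. The exception is a StreamResponse that the handler has already prepared: its headers have been sent, so later changes have no effect.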
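One case the header injection above does not cover is a handler that raises instead of returning, for example raise web.HTTPNotFound(). The exception propagates through await handler(request) and the header assignment is skipped. A minimal sketch of one way to handle this follows; the name inject_headers_safe and the ROBOTS_HEADER constant are hypothetical, and the sketch assumes aiohttp 3.x, where web.HTTPException instances expose a mutable headers mapping.

```python
from aiohttp import web

ROBOTS_HEADER = 'noai, noimageai'  # hypothetical constant for this sketch


@web.middleware
async def inject_headers_safe(request, handler):
    try:
        response = await handler(request)
    except web.HTTPException as exc:
        # Raised responses (404s, redirects, ...) carry their own headers,
        # so tag them before letting the exception continue upward.
        exc.headers['X-Robots-Tag'] = ROBOTS_HEADER
        raise

    # A prepared StreamResponse has already sent its headers; skip it.
    if not response.prepared:
        response.headers['X-Robots-Tag'] = ROBOTS_HEADER
    return response
```

Re-raising keeps aiohttp's normal error handling intact while still ensuring the header reaches error pages.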

Testing

aiohttp ships with aiohttp.test_utils.AioHTTPTestCase and works with pytest-aiohttp:

import pytest
from app import create_app  # application factory, assumed to live in app.py


# pytest-aiohttp style (recommended); assumes asyncio_mode = auto
@pytest.fixture
async def client(aiohttp_client):
    # aiohttp_client is a coroutine factory, so await it once here
    return await aiohttp_client(create_app())


async def test_blocks_ai_bot(client):
    resp = await client.get(
        '/articles/test',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    assert resp.status == 403


async def test_allows_browser(client):
    resp = await client.get(
        '/articles/test',
        headers={'User-Agent': 'Mozilla/5.0 (compatible)'},
    )
    assert resp.status == 200
    assert resp.headers.get('X-Robots-Tag') == 'noai, noimageai'


async def test_robots_txt_accessible_to_bots(client):
    resp = await client.get(
        '/robots.txt',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    assert resp.status == 200  # Exempt path, not blocked
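
The tests above import create_app from app, which this guide does not define. A minimal sketch of such a factory might look like the following. The article_handler route, the shortened AI_BOTS tuple, and the inline robots.txt body are assumptions made so the example is self-contained; in a real project you would import the middleware from middleware/ai_bot_blocker.py instead of redefining it.

```python
from aiohttp import web

AI_BOTS = ('gptbot', 'claudebot', 'ccbot')  # shortened list for this sketch
EXEMPT_PATHS = {'/robots.txt'}


@web.middleware
async def ai_bot_blocker(request, handler):
    ua = request.headers.get('User-Agent', '').lower()
    if request.path not in EXEMPT_PATHS and any(bot in ua for bot in AI_BOTS):
        return web.Response(status=403, text='Forbidden')
    response = await handler(request)
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    return response


async def robots_handler(request):
    body = 'User-agent: GPTBot\nDisallow: /\n'
    return web.Response(text=body, content_type='text/plain')


async def article_handler(request):
    # Hypothetical route that matches the '/articles/test' path used in the tests.
    return web.Response(text=f"Article: {request.match_info['slug']}")


def create_app() -> web.Application:
    """Build a fresh application so each test gets isolated state."""
    app = web.Application(middlewares=[ai_bot_blocker])
    app.router.add_get('/robots.txt', robots_handler)
    app.router.add_get('/articles/{slug}', article_handler)
    return app
```

Using a factory rather than a module-level app lets each test build its own application, which avoids state leaking between tests.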

AI bot User-Agent strings (2026)

GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, CCBot, cohere-ai, Bytespider, Amazonbot, Applebot-Extended, PerplexityBot, YouBot, Diffbot, Google-Extended, FacebookBot, omgili, omgilibot, DeepSeekBot, MistralBot, xAI-Bot, AI2Bot

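aiohttp header names are case-insensitive (request.headers is a CIMultiDictProxy), but header values are not, so lowercase the User-Agent value before matching: request.headers.get('User-Agent', '').lower(). Note that Google-Extended and Applebot-Extended are robots.txt control tokens rather than crawler User-Agent strings, so they take effect through Layer 1 only.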
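
The matching rule above can be isolated into a small pure function, which makes it easy to unit test without starting a server. This is a sketch under assumptions: is_ai_bot is a hypothetical helper name, the token tuple is a shortened subset of the full list, and the sample User-Agent strings are illustrative rather than authoritative.

```python
from typing import Optional

# Lowercase tokens; a shortened subset of the full list for illustration.
AI_BOTS = ('gptbot', 'chatgpt-user', 'claudebot', 'ccbot', 'perplexitybot')


def is_ai_bot(user_agent: Optional[str]) -> bool:
    """Return True when any known bot token appears in the User-Agent value."""
    ua = (user_agent or '').lower()  # a missing header counts as "not a bot"
    return any(bot in ua for bot in AI_BOTS)


# Substring matching catches tokens embedded in longer browser-like strings.
sample = 'Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2)'
blocked = is_ai_bot(sample)
allowed = not is_ai_bot('Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0')
```

Inside the middleware, the check then becomes is_ai_bot(request.headers.get('User-Agent')).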
