How to Block AI Bots on Pyramid (Python): Complete 2026 Guide
Pyramid is a mature, flexible Python WSGI framework, used by Mozilla, Reddit (legacy, via its Pylons lineage), and many enterprise deployments. Its middleware system is called tweens ("between"), a factory pattern distinct from Flask's WSGI wrapping, Django's middleware classes, and Bottle's hooks. Each tween is a factory function that receives (handler, registry) and returns an inner callable taking a request.
Tween = factory function, not a class
A Pyramid tween is a function that returns a function. The outer function receives handler and registry at app startup. The inner function receives request at runtime. Call handler(request) to continue, or return a Response to block — same chain model as aiohttp.
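Stripped of Pyramid specifics, the closure shape looks like this. This is a toy model only: plain dicts stand in for real Request/Response objects, and `uppercase_tween_factory` and `router` are invented names for illustration.

```python
# Toy model of the tween factory pattern (no Pyramid required):
# the outer function runs once at startup, the inner one per request.

def uppercase_tween_factory(handler, registry):
    # Runs once; `registry` could carry app settings.
    def uppercase_tween(request):
        # Runs per request: either short-circuit with a response...
        if request.get('path') == '/blocked':
            return {'status': 403, 'body': 'Forbidden'}
        # ...or delegate down the chain and post-process the result.
        response = handler(request)
        response['body'] = response['body'].upper()
        return response
    return uppercase_tween

def router(request):
    # Stand-in for the innermost handler (the Pyramid router).
    return {'status': 200, 'body': 'hello'}

chain = uppercase_tween_factory(router, registry=None)
print(chain({'path': '/'}))         # {'status': 200, 'body': 'HELLO'}
print(chain({'path': '/blocked'}))  # {'status': 403, 'body': 'Forbidden'}
```

The real tweens below follow exactly this shape, with WebOb request/response objects in place of the dicts.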
Protection layers
Layer 1: robots.txt
Add a dedicated view for /robots.txt and exempt it in your tween. Pyramid static views go through the tween stack:
# static/robots.txt
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: cohere-ai
User-agent: Bytespider
User-agent: Amazonbot
User-agent: PerplexityBot
User-agent: YouBot
User-agent: Diffbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /
# myapp/views.py
from pathlib import Path

from pyramid.response import Response
from pyramid.view import view_config

# Resolve relative to this module so the path works regardless of CWD
ROBOTS_TXT = (Path(__file__).parent / 'static' / 'robots.txt').read_text()

@view_config(route_name='robots')
def robots_view(request):
    return Response(ROBOTS_TXT, content_type='text/plain')
# myapp/__init__.py, inside main():
# config.add_route('robots', '/robots.txt')

Layers 2, 3 & 4: tween module
Create myapp/tweens.py. The outer function runs once at startup; the inner function runs per request:
# myapp/tweens.py
from pyramid.response import Response

AI_BOTS = [
    'gptbot', 'chatgpt-user', 'claudebot', 'anthropic-ai',
    'ccbot', 'cohere-ai', 'bytespider', 'amazonbot',
    'applebot-extended', 'perplexitybot', 'youbot', 'diffbot',
    'google-extended', 'deepseekbot', 'mistralbot', 'xai-bot',
    'ai2bot', 'oai-searchbot', 'duckassistbot',
]

EXEMPT_PATHS = {'/robots.txt', '/sitemap.xml', '/favicon.ico'}

def ai_bot_blocker_tween_factory(handler, registry):
    """
    Tween factory, called once at startup.
    handler: the next tween or the Pyramid router
    registry: app-level registry (access settings via registry.settings)
    """
    # Could read the bot list from settings instead:
    # bot_list = registry.settings.get('ai_bots', ','.join(AI_BOTS)).split(',')

    def ai_bot_blocker(request):
        """Called per request."""
        # Layer 2: set noai meta for templates
        request.robots = 'noai, noimageai'

        # Exempt paths: robots.txt must always be accessible
        if request.path in EXEMPT_PATHS:
            response = handler(request)
            response.headers['X-Robots-Tag'] = 'noai, noimageai'
            return response

        # Layer 4: block AI bots by User-Agent
        ua = request.headers.get('User-Agent', '').lower()
        if any(bot in ua for bot in AI_BOTS):
            return Response(
                'Forbidden: AI crawlers are not permitted.',
                status=403,
                content_type='text/plain',
            )

        # Continue to the next tween / the router
        response = handler(request)

        # Layer 3: X-Robots-Tag on all legitimate responses
        response.headers['X-Robots-Tag'] = 'noai, noimageai'
        return response

    return ai_bot_blocker

Register it in your app's main() function:
# myapp/__init__.py
import pyramid.tweens
from pyramid.config import Configurator

def main(global_config, **settings):
    config = Configurator(settings=settings)

    # Register the tween. under=INGRESS places it directly beneath
    # the chain entry point, i.e. in the outermost position.
    # (over=INGRESS is invalid: nothing can run before ingress.)
    config.add_tween(
        'myapp.tweens.ai_bot_blocker_tween_factory',
        under=pyramid.tweens.INGRESS,
    )

    config.add_route('robots', '/robots.txt')
    config.add_route('home', '/')
    config.scan('myapp.views')
    return config.make_wsgi_app()

OVER/UNDER tween ordering
Pyramid tweens are ordered with over and under constraints instead of numeric priorities. The chain runs from INGRESS (request entry) down to MAIN (the router); "over X" means closer to ingress than X, "under X" means closer to the router. Nothing can be placed over INGRESS or under MAIN, so under=INGRESS is how you make a tween run as early as possible:
import pyramid.tweens

# under=INGRESS: directly beneath the chain entry, runs first (outermost)
config.add_tween(
    'myapp.tweens.ai_bot_blocker_tween_factory',
    under=pyramid.tweens.INGRESS,
)

# over=MAIN: directly above the router, runs last (innermost)
config.add_tween(
    'myapp.tweens.auth_tween_factory',
    over=pyramid.tweens.MAIN,
)

# over another tween: runs before (outside) that tween
config.add_tween(
    'myapp.tweens.rate_limiter_factory',
    over='myapp.tweens.ai_bot_blocker_tween_factory',
)

# Inspect the tween chain during development with the ptweens script:
# $ ptweens development.ini

Pyramid: under=INGRESS (declarative over/under graph). Spring Boot: @Order(-100) (lower = earlier). Quarkus: @Priority(900) (lower = earlier). aiohttp: list order (first = outermost).

Layer 2: noai meta tag
Set attributes directly on the Pyramid Request object. Pyramid requests are extensible — any attribute you set is available in view callables and templates:
# In tween (already set above):
request.robots = 'noai, noimageai'
# Chameleon template (base.pt):
# <meta name="robots" content="${request.robots}" />
# Mako template (base.html):
# <meta name="robots" content="${request.robots}" />
# Jinja2 template (pyramid_jinja2):
# <meta name="robots" content="{{ request.robots }}">
# View override (per-page):
@view_config(route_name='public_page', renderer='templates/page.pt')
def public_page(request):
    request.robots = 'index, follow'  # Override for public content
    return {}

request.robots: per-request attribute, set in the tween and overridable in any view.
request.registry.settings['robots']: app-wide configuration from the .ini file, the same for every request.
Use request.robots for per-request overrides and registry.settings for app-wide defaults.

Route-scoped blocking inside tween
Tweens are always global, and they run before the router, so request.matched_route is not yet populated when the tween executes. For per-route control, match on request.path prefixes instead:
# Path-prefix scoping inside the tween's inner function
PROTECTED_PREFIXES = ('/api/', '/content/', '/articles/')

def ai_bot_blocker(request):
    request.robots = 'noai, noimageai'
    if request.path in EXEMPT_PATHS:
        return handler(request)

    # Only block on protected prefixes
    is_protected = any(
        request.path.startswith(prefix)
        for prefix in PROTECTED_PREFIXES
    )
    if is_protected:
        ua = request.headers.get('User-Agent', '').lower()
        if any(bot in ua for bot in AI_BOTS):
            return Response('Forbidden', status=403)

    response = handler(request)
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    return response

Alternative: NewRequest event subscriber
Pyramid's event system offers a lightweight alternative to tweens for request-phase logic. NewRequest fires before routing, but a subscriber cannot return a Response to short-circuit the request. For blocking, tweens are simpler:
# Event subscriber (limited: cannot short-circuit the request)
from pyramid.events import NewRequest, subscriber

@subscriber(NewRequest)
def set_robots_context(event):
    """Set noai metadata on every request. Cannot block here."""
    event.request.robots = 'noai, noimageai'

# For blocking, stick with tweens: event subscribers cannot
# return a Response to abort the request. Use this only for
# attaching metadata, not for hard blocking.

Events (NewRequest/NewResponse) are great for attaching metadata, but they cannot abort the request or return a Response. For hard 403 blocking, use a tween, which controls the full request/response cycle.
Pyramid vs Django vs Flask — comparison
Pyramid — tween factory
# Pyramid tween factory
def ai_bot_blocker_factory(handler, registry):
    def ai_bot_blocker(request):
        ua = request.headers.get('User-Agent', '').lower()
        if any(b in ua for b in AI_BOTS):
            return Response('Forbidden', status=403)
        return handler(request)  # call handler to continue
    return ai_bot_blocker

Django — middleware class with __call__
# Django middleware class
class AiBotBlocker:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        ua = request.META.get('HTTP_USER_AGENT', '').lower()
        if any(b in ua for b in AI_BOTS):
            return HttpResponseForbidden('Forbidden')
        return self.get_response(request)

Flask — before_request return
# Flask hook
@app.before_request
def block_bots():
    ua = request.headers.get('User-Agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return Response('Forbidden', 403)  # return a response to block

Bottle — before_request abort
# Bottle hook
@app.hook('before_request')
def block_bots():
    ua = request.headers.get('User-Agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        abort(403, 'Forbidden')  # raises to block

Pyramid tweens and Django middleware share the handler/get_response pattern. Flask blocks by returning a response; Bottle blocks by raising via abort(). Pyramid is the only one of the four with a named factory system ordered by over/under constraints.
Testing
Pyramid provides pyramid.testing and works with WebTest's TestApp:
import pytest
from webtest import TestApp

from myapp import main

@pytest.fixture
def app():
    settings = {'sqlalchemy.url': 'sqlite:///:memory:'}
    app = main({}, **settings)
    return TestApp(app)

def test_blocks_ai_bot(app):
    resp = app.get(
        '/api/data',
        headers={'User-Agent': 'GPTBot/1.0'},
        expect_errors=True,
    )
    assert resp.status_int == 403

def test_allows_browser(app):
    resp = app.get(
        '/api/data',
        headers={'User-Agent': 'Mozilla/5.0 (compatible)'},
    )
    assert resp.status_int == 200
    assert resp.headers.get('X-Robots-Tag') == 'noai, noimageai'

def test_robots_txt_accessible_to_bots(app):
    resp = app.get(
        '/robots.txt',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    assert resp.status_int == 200  # Exempt path

def test_tween_sets_robots_attribute():
    # Unit-test the tween directly with pyramid.testing.DummyRequest
    from pyramid import testing
    from pyramid.response import Response
    from myapp.tweens import ai_bot_blocker_tween_factory

    def dummy_handler(request):
        return Response('OK')

    config = testing.setUp()
    try:
        tween = ai_bot_blocker_tween_factory(dummy_handler, config.registry)
        request = testing.DummyRequest(headers={'User-Agent': 'Mozilla/5.0'})
        request.path = '/api/data'
        response = tween(request)
        assert response.status_int == 200
        assert request.robots == 'noai, noimageai'
    finally:
        testing.tearDown()

AI bot User-Agent strings (2026)
Pyramid exposes headers through WebOb, whose headers mapping is case-insensitive, so request.headers.get('User-Agent', '') works regardless of how the client cased the header name; calling .lower() on the value then makes the substring match case-insensitive too.
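The matching logic itself can be checked in isolation, with no framework involved. A quick sketch using a subset of the blocklist above (the sample User-Agent strings are illustrative, not verbatim crawler signatures):

```python
# Case-insensitive substring matching against the blocklist --
# the same check the tween performs on each request.
AI_BOTS = ['gptbot', 'claudebot', 'ccbot', 'perplexitybot', 'bytespider']

def is_ai_bot(user_agent: str) -> bool:
    """True if any known AI-bot token appears in the User-Agent."""
    ua = (user_agent or '').lower()
    return any(bot in ua for bot in AI_BOTS)

print(is_ai_bot('Mozilla/5.0 (compatible; GPTBot/1.2)'))         # True
print(is_ai_bot('Mozilla/5.0 (Windows NT 10.0; Win64; x64)'))    # False
print(is_ai_bot(''))  # False -- a missing header is not treated as a bot
```

Substring matching deliberately errs on the side of simplicity; tokens like 'gptbot' are specific enough that false positives against real browser UAs are unlikely.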
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.