How to Block AI Bots in PHP Hyperf
Hyperf is a high-performance PHP microservices framework built on Swoole (or Swow). It runs as a persistent, long-lived process rather than re-bootstrapping the application on every request as traditional PHP-FPM setups do. Middleware follows the PSR-15 standard: MiddlewareInterface::process() either returns a ResponseInterface directly to block, or calls $handler->handle($request) to pass the request through. getHeaderLine() returns an empty string when the header is absent, so no null check is needed. The critical Hyperf-specific gotcha: middleware classes are instantiated once and reused across concurrent coroutines, so never store request state in class properties.
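The shared-instance gotcha can be reproduced without Hyperf or Swoole. The sketch below is a simulation, not Hyperf code: it uses PHP 8.1 Fibers as a stand-in for coroutines, and LeakyMiddleware, StatelessMiddleware, and runInterleaved() are hypothetical names introduced only for illustration. Fiber::suspend() plays the role of a coroutine yielding during I/O.

```php
<?php
declare(strict_types=1);

// Hypothetical anti-pattern: request state kept on a shared instance.
final class LeakyMiddleware
{
    private string $ua = '';

    public function process(string $ua): string
    {
        $this->ua = $ua;   // request state stored on the shared object
        Fiber::suspend();  // stands in for a coroutine yield (e.g. I/O wait)
        return $this->ua;  // may now hold another request's value
    }
}

// Hypothetical safe variant: request state lives in a local variable.
final class StatelessMiddleware
{
    public function process(string $ua): string
    {
        $local = $ua;
        Fiber::suspend();
        return $local;
    }
}

/**
 * Starts one Fiber per User-Agent against the SAME middleware object,
 * lets each run up to its suspend point, then resumes them in order.
 */
function runInterleaved(object $middleware, array $userAgents): array
{
    $fibers = [];
    foreach ($userAgents as $ua) {
        $fiber = new Fiber(fn (): string => $middleware->process($ua));
        $fiber->start();   // runs until Fiber::suspend()
        $fibers[] = $fiber;
    }

    $seen = [];
    foreach ($fibers as $fiber) {
        $fiber->resume();  // lets the "request" finish
        $seen[] = $fiber->getReturn();
    }
    return $seen;
}

$agents = ['GPTBot/1.0', 'Mozilla/5.0'];
var_dump(runInterleaved(new LeakyMiddleware(), $agents));
var_dump(runInterleaved(new StatelessMiddleware(), $agents));
```

With the leaky class both simulated requests report 'Mozilla/5.0', because the second request overwrote the property before the first one resumed. The stateless class returns each request's own value.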
1. Bot detection
Pure PHP 8.0+, no dependencies. str_contains() for literal substring matching — no regex. strtolower() normalises before comparison.
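As a quick illustration of why the normalisation step matters, the following sketch compares a raw and a lowercased match. The User-Agent string is an illustrative value, not copied from any vendor's documentation.

```php
<?php
declare(strict_types=1);

// Illustrative User-Agent string (assumed for this example).
$ua = 'Mozilla/5.0 (compatible; GPTBot/1.0)';

// str_contains() is case-sensitive, so the raw header misses a lowercase pattern.
$rawMatch = str_contains($ua, 'gptbot');

// Lowercasing first lets the same literal pattern match.
$normalisedMatch = str_contains(strtolower($ua), 'gptbot');

var_dump($rawMatch, $normalisedMatch);
```

This prints bool(false) followed by bool(true). Because the match is a plain substring test, a pattern also matches when it appears inside a longer token, which is usually acceptable for crawler names but worth keeping in mind when adding short patterns.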
<?php
// BotUtils.php: AI bot detection, no external dependencies
declare(strict_types=1);

namespace App\Middleware;

class BotUtils
{
    private const AI_BOT_PATTERNS = [
        'gptbot',
        'chatgpt-user',
        'claudebot',
        'anthropic-ai',
        'ccbot',
        'google-extended',
        'cohere-ai',
        'meta-externalagent',
        'bytespider',
        'omgili',
        'diffbot',
        'imagesiftbot',
        'magpie-crawler',
        'amazonbot',
        'dataprovider',
        'netcraft',
    ];

    /**
     * Returns true if $ua matches a known AI crawler pattern.
     * str_contains(): literal substring match, no regex (PHP 8.0+).
     * strtolower() normalises before comparison.
     */
    public static function isAiBot(string $ua): bool
    {
        if ($ua === '') {
            return false;
        }
        $lower = strtolower($ua);
        foreach (self::AI_BOT_PATTERNS as $pattern) {
            if (str_contains($lower, $pattern)) {
                return true;
            }
        }
        return false;
    }
}
2. PSR-15 middleware: MiddlewareInterface
process() either returns a ResponseInterface (block) or calls $handler->handle($request) (pass). Use SwooleStream for response bodies — required under Swoole's coroutine context. The class must be stateless — it lives for the lifetime of the process, not the request.
<?php
// AiBotMiddleware.php: PSR-15 middleware for Hyperf
declare(strict_types=1);

namespace App\Middleware;

use Hyperf\HttpMessage\Server\Response;
use Hyperf\HttpMessage\Stream\SwooleStream;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;

class AiBotMiddleware implements MiddlewareInterface
{
    /**
     * CRITICAL: this class is instantiated ONCE by Hyperf's DI container
     * and reused across many concurrent coroutines.
     * Never store request-scoped data in class properties ($this->...).
     * All state must live in local variables inside process().
     */
    public function process(
        ServerRequestInterface $request,
        RequestHandlerInterface $handler
    ): ResponseInterface {
        // Path guard: robots.txt must stay reachable, because bots read it for Disallow rules.
        $path = $request->getUri()->getPath();
        if ($path === '/robots.txt') {
            return $handler->handle($request); // pass through
        }

        // PSR-7 getHeaderLine() returns '' when the header is absent, so no null check is needed.
        // Header name lookup is case-insensitive per PSR-7 / HTTP spec.
        // getHeaderLine() returns a single string; getHeader() returns an array.
        $ua = $request->getHeaderLine('User-Agent');

        if (BotUtils::isAiBot($ua)) {
            // Block: return a response directly and do NOT call $handler->handle().
            // Calling handle() on this path would still run the route handler
            // for a request that is being rejected.
            // SwooleStream wraps the string body as a PSR-7 stream.
            return (new Response())
                ->withStatus(403)
                ->withHeader('X-Robots-Tag', 'noai, noimageai')
                ->withHeader('Content-Type', 'text/plain')
                ->withBody(new SwooleStream('Forbidden'));
        }

        // Pass: delegate to the next middleware or route handler,
        // then add X-Robots-Tag to the outgoing response.
        $response = $handler->handle($request);
        return $response->withHeader('X-Robots-Tag', 'noai, noimageai');
    }
}
3. Global registration: config/autoload/middlewares.php
Add the middleware class to the 'http' array. Middleware runs in declaration order — first entry is outermost (runs first on request, last on response). Register the bot blocker first to short-circuit before auth or business logic.
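The ordering rule can be sketched without the framework. The example below is a simplified model, not Hyperf's dispatcher: plain closures stand in for PSR-15 middleware, buildPipeline() is a hypothetical helper, and AuthMiddleware is an assumed second entry used only to show the nesting.

```php
<?php
declare(strict_types=1);

/**
 * Builds an "onion" from a list of middleware names, first entry outermost.
 * Each layer logs when the request enters and when the response leaves.
 */
function buildPipeline(array $names, array &$log): callable
{
    // Innermost layer: stands in for the route handler.
    $core = function () use (&$log): string {
        $log[] = 'handler';
        return 'response';
    };

    // Wrap from the last entry inward so the first entry ends up outermost.
    foreach (array_reverse($names) as $name) {
        $next = $core;
        $core = function () use ($name, $next, &$log): string {
            $log[] = "$name:in";
            $result = $next();
            $log[] = "$name:out";
            return $result;
        };
    }
    return $core;
}

$log = [];
$pipeline = buildPipeline(['AiBotMiddleware', 'AuthMiddleware'], $log);
$pipeline();
echo implode(' > ', $log), "\n";
```

This prints AiBotMiddleware:in > AuthMiddleware:in > handler > AuthMiddleware:out > AiBotMiddleware:out. In this model, a blocker listed first that returns early never lets the inner layers run at all.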
<?php
// config/autoload/middlewares.php: global middleware registration
// Middleware runs in array order: first entry runs outermost (first in, last out).
declare(strict_types=1);

use App\Middleware\AiBotMiddleware;

return [
    'http' => [
        // AiBotMiddleware runs for every HTTP request.
        // Add before any auth, rate-limit, or route middleware.
        AiBotMiddleware::class,
    ],
];
4. Per-controller scoping: #[Middleware] annotation
Apply #[Middleware(AiBotMiddleware::class)] on a controller class (all routes) or a single method (one route). Annotation middleware stacks with global middleware — both run.
<?php
// Per-controller annotation: scopes middleware to a specific controller.
declare(strict_types=1);

namespace App\Controller;

use App\Middleware\AiBotMiddleware;
use Hyperf\HttpServer\Annotation\Controller;
use Hyperf\HttpServer\Annotation\GetMapping;
use Hyperf\HttpServer\Annotation\Middleware;

// Apply to all routes in this controller
#[Controller(prefix: '/api')]
#[Middleware(AiBotMiddleware::class)]
class ApiController extends AbstractController
{
    #[GetMapping(path: '/data')]
    public function data(): array
    {
        return ['data' => 'value'];
    }
}

// Or per-method, which scopes the middleware to a single route only:
#[Controller(prefix: '/')]
class IndexController extends AbstractController
{
    #[GetMapping(path: '/')]
    #[Middleware(AiBotMiddleware::class)] // only this route is protected
    public function index(): array
    {
        return ['message' => 'Hello'];
    }
}
5. robots.txt route
<?php
// app/Controller/RobotsController.php
// robots.txt handler: always accessible (the guard in the middleware passes it through)
declare(strict_types=1);

namespace App\Controller;

use Hyperf\HttpMessage\Stream\SwooleStream;
use Hyperf\HttpServer\Annotation\Controller;
use Hyperf\HttpServer\Annotation\GetMapping;
use Psr\Http\Message\ResponseInterface;

#[Controller(prefix: '/')]
class RobotsController extends AbstractController
{
    #[GetMapping(path: '/robots.txt')]
    public function robots(): ResponseInterface
    {
        $body = <<<TXT
        User-agent: *
        Allow: /
        User-agent: GPTBot
        Disallow: /
        User-agent: ClaudeBot
        Disallow: /
        User-agent: CCBot
        Disallow: /
        User-agent: Google-Extended
        Disallow: /
        TXT;

        return $this->response
            ->withHeader('Content-Type', 'text/plain')
            ->withBody(new SwooleStream($body));
    }
}
6. Install and run
# Install the Swoole extension first (skip if already installed)
pecl install swoole
# Install Hyperf (requires PHP 8.1+ and the Swoole extension)
composer create-project hyperf/hyperf-skeleton my-app
cd my-app
# Start the server (persistent process, no PHP-FPM or Apache needed)
php bin/hyperf.php start
# Server starts on http://0.0.0.0:9501 by default
# Configure port in config/autoload/server.php
Key points
- Middleware classes are singletons, so keep them stateless: Hyperf's DI container instantiates each middleware once and reuses the same object across thousands of concurrent coroutines. Anything stored in $this properties will leak between requests. All request data must live in local variables inside process().
- getHeaderLine() returns '', not null: PSR-7 getHeaderLine() returns an empty string when the header is absent, so the result is safe to pass directly to BotUtils::isAiBot(). The lookup is case-insensitive per PSR-7: 'User-Agent', 'user-agent', and 'USER-AGENT' all return the same value.
- Do not call $handler->handle() when blocking: In PSR-15, a middleware either returns its own response or delegates to the handler, not both. Calling $handler->handle() on the blocking path would still execute the route handler, including its side effects, for a request you intend to reject.
- Use SwooleStream for response bodies: PSR-7 requires a StreamInterface for response bodies. Standard PHP stream wrappers behave differently under Swoole's coroutine scheduler. Hyperf\HttpMessage\Stream\SwooleStream is the correct implementation; wrap any string body with new SwooleStream('body').
- Persistent process, no per-request bootstrap: Under traditional PHP-FPM every request starts from a clean state and re-bootstraps the framework. Hyperf + Swoole boots once. This means faster requests (no framework init overhead) but requires coroutine-safe code: avoid global variables, static mutable properties, and non-coroutine-safe extensions.
- Global + annotation middleware both run: If AiBotMiddleware is in middlewares.php and also applied via #[Middleware], it will run twice. Use one approach for a given scope: global config for blanket protection, annotations for selective scoping.
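The block-or-delegate rule can be checked in isolation with a framework-free sketch. Everything here is a simplification for illustration: filterRequest() is a hypothetical function, the $next callable stands in for $handler->handle($request), an array stands in for a PSR-7 response, and the pattern list is shortened.

```php
<?php
declare(strict_types=1);

/**
 * Returns a response array either by blocking or by delegating to $next,
 * never both on the same path.
 */
function filterRequest(string $path, string $userAgent, callable $next): array
{
    // Shortened, illustrative pattern list (the full list lives in BotUtils).
    $patterns = ['gptbot', 'claudebot', 'ccbot'];

    // robots.txt stays reachable for every client.
    if ($path === '/robots.txt') {
        return $next();
    }

    $lower = strtolower($userAgent);
    foreach ($patterns as $pattern) {
        if (str_contains($lower, $pattern)) {
            // Block path: build the response here and never call $next.
            return [
                'status'  => 403,
                'headers' => ['X-Robots-Tag' => 'noai, noimageai'],
                'body'    => 'Forbidden',
            ];
        }
    }

    // Pass path: delegate exactly once, then decorate the result.
    $response = $next();
    $response['headers']['X-Robots-Tag'] = 'noai, noimageai';
    return $response;
}

// Counts how often the stand-in route handler actually runs.
$calls = 0;
$next = function () use (&$calls): array {
    $calls++;
    return ['status' => 200, 'headers' => [], 'body' => 'ok'];
};

$blocked = filterRequest('/api/data', 'Mozilla/5.0 (compatible; GPTBot/1.0)', $next);
$allowed = filterRequest('/api/data', 'Mozilla/5.0 (X11; Linux x86_64)', $next);
$robots  = filterRequest('/robots.txt', 'GPTBot/1.0', $next);

printf("%d %d %d, handler calls: %d\n", $blocked['status'], $allowed['status'], $robots['status'], $calls);
```

This prints 403 200 200, handler calls: 2. The handler runs for the ordinary browser and for the robots.txt request, but not for the blocked crawler.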
Framework comparison — PHP middleware patterns
| Framework | Middleware standard | Block | Process model |
|---|---|---|---|
| Hyperf | PSR-15 MiddlewareInterface | return (new Response())->withStatus(403) | Swoole persistent (singleton middleware) |
| Slim 4 | PSR-15 MiddlewareInterface | return $response->withStatus(403) | PHP-FPM (new instance per request) |
| Laravel | Laravel middleware (handle()) | return response('Forbidden', 403) | PHP-FPM or Octane (persistent) |
| Laminas | PSR-15 MiddlewareInterface | return new HtmlResponse('Forbidden', 403) | PHP-FPM (new instance per request) |
Hyperf, Slim, and Laminas all implement PSR-15, so the process() signature is identical. The key difference is the process model: Hyperf runs in a persistent Swoole process (singleton middleware, coroutine context), while Slim and Laminas typically run under PHP-FPM, where each request gets a fresh instance. Laravel with Octane is the closest analogue to Hyperf's persistent model, and the same stateless middleware rules apply there.