How to Block AI Bots in PHP with ReactPHP
ReactPHP is a non-blocking, event-loop HTTP server for PHP: one PHP process handles many concurrent requests without threads or forks. Middleware is a callable taking (ServerRequestInterface $request, callable $next) that returns either a ResponseInterface or a promise resolving to one. To block: return resolve(new Response(403, …)), using the React\Promise\resolve() function, without calling $next. To pass: return resolve($next($request))->then(fn) to inject headers on the downstream response; the resolve() wrap is needed because $next may return a plain response rather than a promise. getHeaderLine('User-Agent') returns '' (empty string, not null) when the header is absent, as specified by PSR-7's MessageInterface. Never use blocking code (sleep, file_get_contents, PDO) inside middleware: it stalls the entire event loop.
1. Bot detection
Pure PHP, no dependencies. str_contains() (PHP 8.0+) does literal, case-sensitive substring matching, so the User-Agent is lowercased first. getHeaderLine() always returns a string, so strtolower() is safe without a null-check.
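Why the case-fold matters: str_contains() is case-sensitive, and crawler User-Agents typically use mixed case. The sketch below illustrates this with a made-up User-Agent string; uaContains() is a hypothetical helper for this example only, not part of the detector class.

```php
<?php
// Sketch only: the sample User-Agent and uaContains() are illustrative assumptions.

/** Case-insensitive literal match: lowercase the haystack, keep needles lowercase. */
function uaContains(string $userAgent, string $lowercaseNeedle): bool
{
    return str_contains(strtolower($userAgent), $lowercaseNeedle);
}

$sampleUa = 'Mozilla/5.0 (compatible; CCBot/2.0)';

$rawMatch    = str_contains($sampleUa, 'ccbot'); // case-sensitive: "CCBot" is not "ccbot"
$foldedMatch = uaContains($sampleUa, 'ccbot');   // lowercased first, so it matches
$emptyMatch  = uaContains('', 'ccbot');          // an absent header arrives as ''

var_dump($rawMatch, $foldedMatch, $emptyMatch);
```

The three values are false, true, and false: the raw check misses the crawler, the folded check catches it, and an empty User-Agent never matches.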
<?php
// AiBotDetector.php: AI bot detection, no external dependencies (PHP 8.0+ for str_contains())
class AiBotDetector
{
    // Lowercase substrings; detect() lowercases the User-Agent before matching.
    private const AI_BOT_PATTERNS = [
        'gptbot',
        'chatgpt-user',
        'claudebot',
        'anthropic-ai',
        'ccbot',
        'google-extended',
        'cohere-ai',
        'meta-externalagent',
        'bytespider',
        'omgili',
        'diffbot',
        'imagesiftbot',
        'magpie-crawler',
        'amazonbot',
        'dataprovider',
        'netcraft',
    ];

    /**
     * Returns true if the User-Agent string matches a known AI crawler.
     *
     * getHeaderLine() always returns a string ('' when absent, per PSR-7's
     * MessageInterface), so str_contains() is safe to call without a null-check.
     * Case-folded to lowercase before comparison.
     *
     * @param string $userAgent The raw User-Agent header value (may be '')
     * @return bool
     */
    public static function detect(string $userAgent): bool
    {
        if ($userAgent === '') {
            return false;
        }

        $lower = strtolower($userAgent);
        foreach (self::AI_BOT_PATTERNS as $pattern) {
            if (str_contains($lower, $pattern)) {
                return true;
            }
        }

        return false;
    }
}
2. Middleware and server setup
ReactPHP runs its own HTTP server, so no Apache or Nginx is required in front of it. The HttpServer constructor accepts middleware followed by a final handler, in left-to-right order (outermost first). Install with composer require react/http react/socket.
<?php
// server.php: ReactPHP HTTP server with AI bot blocking middleware
// Install: composer require react/http react/socket
require __DIR__ . '/vendor/autoload.php';
require __DIR__ . '/AiBotDetector.php';

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use React\Http\HttpServer;
use React\Http\Message\Response;
use React\Socket\SocketServer;
use function React\Promise\resolve;

// ── Middleware ─────────────────────────────────────────────────────────────
//
// ReactPHP middleware is any callable with the signature:
//   (ServerRequestInterface $request, callable $next)
// returning a ResponseInterface or a promise that resolves to one.
//
// - To BLOCK: return resolve(new Response(403, ...)) and do NOT call $next
// - To PASS:  return resolve($next($request))->then(fn) to inject headers
//
// $next($request) may return a plain response OR a promise, so it is wrapped
// in resolve() before ->then() is called on it.
//
// getHeaderLine() returns '' when the header is absent (PSR-7 MessageInterface),
// so strtolower() is always safe and no null-check is needed.
$botBlockerMiddleware = function (ServerRequestInterface $request, callable $next) {
    // Always allow robots.txt so crawlers can discover Disallow rules.
    if ($request->getUri()->getPath() === '/robots.txt') {
        return $next($request);
    }

    // Header names are case-insensitive (RFC 7230); getHeaderLine() returns '' if absent.
    $ua = $request->getHeaderLine('User-Agent');
    if (AiBotDetector::detect($ua)) {
        // Block: return a resolved promise wrapping a 403 Response.
        // Do NOT call $next($request); that would invoke the app handler.
        return resolve(
            new Response(
                403,
                [
                    'Content-Type' => 'text/plain',
                    'X-Robots-Tag' => 'noai, noimageai',
                ],
                'Forbidden'
            )
        );
    }

    // Pass: call $next, then inject X-Robots-Tag on the downstream response.
    // withHeader() returns a NEW PSR-7 object; the original is unchanged (immutability).
    return resolve($next($request))->then(function (ResponseInterface $response) {
        return $response->withHeader('X-Robots-Tag', 'noai, noimageai');
    });
};
// ── Application handler ───────────────────────────────────────────────────
$robotsTxt = <<<'TXT'
User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
TXT;

$handler = function (ServerRequestInterface $request) use ($robotsTxt) {
    $path = $request->getUri()->getPath();
    if ($path === '/robots.txt') {
        return new Response(200, ['Content-Type' => 'text/plain'], $robotsTxt);
    }

    return new Response(200, ['Content-Type' => 'application/json'], '{"message":"ok"}');
};
// ── Server bootstrap ──────────────────────────────────────────────────────
//
// HttpServer receives middleware + handler in constructor order:
//   leftmost  = outermost (first to see the request, last to see the response)
//   rightmost = innermost (closest to the application handler)
//
// Middleware are applied in left-to-right order for requests
// and right-to-left order for responses (standard onion model).
$server = new HttpServer(
    $botBlockerMiddleware, // outermost: runs first
    $handler               // innermost: only reached if not blocked
);

$socket = new SocketServer('0.0.0.0:8080');
$server->listen($socket);

echo "Server running on http://127.0.0.1:8080\n";
React\EventLoop\Loop::run();
3. Class-based invokable middleware
An invokable class is testable and injectable. ReactPHP accepts any callable — closures and __invoke classes are interchangeable.
<?php
// AiBotMiddleware.php: invokable class middleware
// Classes are preferred over closures for testability and dependency injection.
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use React\Http\Message\Response;
use React\Promise\PromiseInterface;
use function React\Promise\resolve;

class AiBotMiddleware
{
    /**
     * ReactPHP invokable middleware.
     *
     * @param ServerRequestInterface $request
     * @param callable $next
     * @return PromiseInterface<ResponseInterface>
     */
    public function __invoke(
        ServerRequestInterface $request,
        callable $next
    ): PromiseInterface {
        // $next may return a plain response or a promise; resolve() normalizes
        // both so the declared PromiseInterface return type always holds.
        if ($request->getUri()->getPath() === '/robots.txt') {
            return resolve($next($request));
        }

        // getHeaderLine always returns a string ('' when absent).
        if (AiBotDetector::detect($request->getHeaderLine('User-Agent'))) {
            return resolve(new Response(
                403,
                ['Content-Type' => 'text/plain', 'X-Robots-Tag' => 'noai, noimageai'],
                'Forbidden'
            ));
        }

        // PSR-7: withHeader() returns a NEW response; return it from then().
        return resolve($next($request))->then(
            fn(ResponseInterface $res) => $res->withHeader('X-Robots-Tag', 'noai, noimageai')
        );
    }
}

// Usage in server.php:
// $server = new HttpServer(new AiBotMiddleware(), $handler);
4. Route-scoped middleware
ReactPHP's HttpServer has no built-in router. Scope middleware by checking $request->getUri()->getPath() inside the middleware and returning $next($request) for paths you want to skip.
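One detail of prefix scoping is easy to miss: str_starts_with($path, '/api/') does not cover the bare path /api, and the comparison is case-sensitive. The sketch below compares the plain prefix check with a hypothetical isApiPath() helper (an assumption for illustration, not part of ReactPHP) that also treats /api itself as in scope. Whether /api should be protected is a design choice for your application.

```php
<?php
// Hypothetical helper, not part of ReactPHP: matches "/api" and anything below it.
function isApiPath(string $path): bool
{
    return $path === '/api' || str_starts_with($path, '/api/');
}

// Sample paths chosen to show the edge cases of a prefix check.
$samples = ['/api/users', '/api/', '/api', '/apiary', '/', '/API/users'];

foreach ($samples as $path) {
    printf(
        "%-12s prefix-only: %-5s helper: %s\n",
        $path,
        var_export(str_starts_with($path, '/api/'), true),
        var_export(isApiPath($path), true)
    );
}
```

Only /api differs between the two checks. /apiary and /API/users are out of scope for both, so lowercase the path first if your routes should match case-insensitively.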
<?php
// Route-scoped middleware: protect only /api/* routes.
//
// ReactPHP HttpServer does not have a built-in router.
// Scope middleware by inspecting the path before deciding to block.
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use React\Http\HttpServer;
use React\Http\Message\Response;
use function React\Promise\resolve;

$apiOnlyBotBlocker = function (ServerRequestInterface $request, callable $next) {
    $path = $request->getUri()->getPath();

    // Only apply bot blocking to /api/* paths.
    if (!str_starts_with($path, '/api/')) {
        return $next($request); // non-API paths pass through unconditionally
    }

    if (AiBotDetector::detect($request->getHeaderLine('User-Agent'))) {
        return resolve(new Response(
            403,
            ['Content-Type' => 'text/plain', 'X-Robots-Tag' => 'noai, noimageai'],
            'Forbidden'
        ));
    }

    // $next may return a plain response or a promise; resolve() handles both.
    return resolve($next($request))->then(
        fn(ResponseInterface $res) => $res->withHeader('X-Robots-Tag', 'noai, noimageai')
    );
};

// Stack: route-scoped blocker on top, then rate-limiter, then handler.
// $rateLimiterMiddleware and $handler are placeholders defined elsewhere.
$server = new HttpServer(
    $apiOnlyBotBlocker,
    $rateLimiterMiddleware,
    $handler
);
Key points
- getHeaderLine() returns '', not null: PSR-7's MessageInterface guarantees that getHeaderLine() returns a string, which is the empty string when the header is absent. No null-check is needed; call strtolower() directly.
- PSR-7 immutability: $response->withHeader() returns a new object and leaves the original unchanged. Always use the returned value: $response = $response->withHeader('X-Robots-Tag', 'noai, noimageai').
- No blocking code: sleep(), file_get_contents(), PDO, curl_exec(), and any other synchronous I/O freeze the entire event loop for every concurrent connection. Use only CPU-bound logic (like string matching) or ReactPHP async adapters (react/filesystem, the Browser class in react/http, friends-of-reactphp/mysql).
- resolve() for synchronous results: React\Promise\resolve() wraps a value in a resolved promise. Use it when your middleware produces a result synchronously (bot detection from the User-Agent string is CPU-only and needs no async calls), and to normalize the return value of $next($request), which may be a plain response or a promise.
- Do not call $next when blocking: calling $next($request) after deciding to block still invokes the app handler, so its work and side effects run for a request that receives a 403 anyway. Only call $next on the pass branch.
- Middleware order = left-to-right for requests: in new HttpServer($a, $b, $handler), middleware $a sees the request first and the response last (outermost). This matches the Express / Koa convention, unlike Slim, which runs middleware in LIFO order (last added runs first).
- ReactPHP IS the HTTP server: unlike Slim, Laravel, or Symfony (which typically sit behind Apache/Nginx/PHP-FPM), ReactPHP binds directly to a port via react/socket. Serve robots.txt as an explicit route, since there is no web-root static file serving by default.
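The ordering rule in the key points can be seen with a toy dispatcher. This is a simplified sketch, not ReactPHP's internal runner: runStack() and the string-based request and response are assumptions made for illustration, and promises are ignored entirely.

```php
<?php
// Toy dispatcher for illustration only; it mimics the onion model with plain strings.

/** Calls the first callable with a $next that continues into the rest of the stack. */
function runStack(array $stack, string $request): string
{
    $current = array_shift($stack);
    if ($stack === []) {
        return $current($request); // innermost: the application handler
    }
    $next = fn (string $req): string => runStack($stack, $req);
    return $current($request, $next);
}

$log = [];

// Builds a middleware that records when it sees the request and the response.
$tag = function (string $name) use (&$log): callable {
    return function (string $request, callable $next) use ($name, &$log): string {
        $log[] = "$name sees request";
        $response = $next($request);
        $log[] = "$name sees response";
        return $response;
    };
};

$handler = function (string $request) use (&$log): string {
    $log[] = 'handler runs';
    return "response to $request";
};

$result = runStack([$tag('a'), $tag('b'), $handler], 'GET /');
echo implode("\n", $log), "\n";
```

The log reads a, b, handler on the way in and b, a on the way out, which is why a blocker placed leftmost can reject a request before any inner middleware or the handler runs.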
Framework comparison — PHP HTTP server models
| Framework | Process model | Block request | Header access |
|---|---|---|---|
| ReactPHP | Single-process event loop | resolve(new Response(403)) | getHeaderLine() → '' if absent |
| Workerman | Multi-process + event loop | $connection->send(new Response(403)) | $request->header('user-agent') |
| Hyperf (Swoole) | Multi-process + coroutines | return (new Response())->withStatus(403) | getHeaderLine() case-insensitive, PSR-7 |
| Laravel / Symfony (FPM) | PHP-FPM: worker pool, one request per worker at a time | return response('Forbidden', 403) (Laravel helper) | $request->header('user-agent') (Laravel) |
ReactPHP's event-loop model means a single stalled middleware (a blocking DB call) freezes all in-flight requests simultaneously — unlike FPM where blocking only affects that one worker. This is why the no-blocking constraint is non-negotiable. Bot detection from a User-Agent string is purely CPU-bound so it is always safe.
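The effect can be reproduced in miniature with a toy single-threaded queue. This sketch is not ReactPHP's event loop: runQueue() and the task names are hypothetical, and usleep() stands in for any blocking call. It records when each task starts, which shows the cheap task waiting for the blocking one to finish.

```php
<?php
// Toy single-threaded queue for illustration; this is not ReactPHP's event loop.

/** Runs callbacks one after another and returns each task's start offset in seconds. */
function runQueue(array $tasks): array
{
    $start = microtime(true);
    $startedAt = [];
    foreach ($tasks as $name => $task) {
        $startedAt[$name] = microtime(true) - $start;
        $task(); // nothing else can run until this returns
    }
    return $startedAt;
}

$startedAt = runQueue([
    // Stand-in for sleep(), PDO, or file_get_contents(): blocks for 200 ms.
    'blocking request' => fn () => usleep(200_000),
    // CPU-only work such as User-Agent matching returns almost immediately.
    'cheap request'    => fn () => strtolower('GPTBot/1.0'),
]);

printf("cheap request started after %.0f ms\n", $startedAt['cheap request'] * 1000);
```

The cheap task cannot start until roughly 200 ms have passed, even though its own work is negligible. In a real event loop every in-flight connection would experience the same delay.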