How to Block AI Bots on OpenLiteSpeed: Complete 2026 Guide
OpenLiteSpeed (OLS) is a high-performance open-source web server popular in shared hosting environments, cPanel/WHM setups, and as a faster drop-in replacement for Apache. It supports Apache-compatible .htaccess files — most Apache bot blocking rules work unchanged. This guide covers .htaccess rules, the WebAdmin console, mod_security, and LSCache considerations.
Bot blocking via .htaccess
OpenLiteSpeed reads Apache-compatible .htaccess files, but support must be enabled first if it is not already active: in the WebAdmin console, go to Virtual Hosts → [your vhost] → General → Enable .htaccess → Yes.
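If you manage vhost configuration files by hand rather than through the GUI, the equivalent setting lives in the vhost's rewrite block. This is a sketch; the exact file location depends on your install (typically under /usr/local/lsws/conf/vhosts/):

```
rewrite  {
  enable              1
  autoLoadHtaccess    1
}
```

autoLoadHtaccess tells OLS to pick up .htaccess files from document directories automatically; with it off, rewrite rules must be placed in the vhost config itself.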
.htaccess — RewriteRule approach (most compatible)
# .htaccess — place in your document root
RewriteEngine On
# Block AI training and scraping bots by User-Agent
RewriteCond %{HTTP_USER_AGENT} (GPTBot|ClaudeBot|anthropic-ai|CCBot|Google-Extended|AhrefsBot|Bytespider|Amazonbot|Diffbot|FacebookBot|cohere-ai|PerplexityBot|YouBot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (SemrushBot|MJ12bot|DataForSeoBot|magpie-crawler) [NC]
RewriteRule .* - [F,L]

Flag reference:
- [NC]: No Case (case-insensitive matching)
- [OR]: combine multiple RewriteCond with OR (default is AND)
- [F]: Forbidden (returns 403)
- [L]: Last rule (stop processing further rules)
- .* -: match any URL, no substitution (the dash means "no rewrite")
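Before deploying, the User-Agent alternation can be sanity-checked offline; grep -E approximates the case-insensitive matching mod_rewrite performs with [NC]. The sample user-agent strings below are illustrative:

```shell
#!/bin/sh
# Same alternation used in the RewriteCond above
pattern='GPTBot|ClaudeBot|anthropic-ai|CCBot|Google-Extended|AhrefsBot|Bytespider|Amazonbot|Diffbot|FacebookBot|cohere-ai|PerplexityBot|YouBot'

for ua in \
  'Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.0; +https://openai.com/gptbot)' \
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/130.0'
do
  if printf '%s' "$ua" | grep -qiE "$pattern"; then
    echo "BLOCKED: $ua"   # first UA matches GPTBot
  else
    echo "ALLOWED: $ua"   # second UA matches nothing in the list
  fi
done
```

This catches typos in the alternation (a stray space or unescaped character) before they silently stop the rule from matching in production.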
Alternative — mod_setenvif approach (cleaner for many bots)
# .htaccess
SetEnvIfNoCase User-Agent "GPTBot" bad_bot
SetEnvIfNoCase User-Agent "ClaudeBot" bad_bot
SetEnvIfNoCase User-Agent "anthropic-ai" bad_bot
SetEnvIfNoCase User-Agent "CCBot" bad_bot
SetEnvIfNoCase User-Agent "Google-Extended" bad_bot
SetEnvIfNoCase User-Agent "AhrefsBot" bad_bot
SetEnvIfNoCase User-Agent "Bytespider" bad_bot
SetEnvIfNoCase User-Agent "Amazonbot" bad_bot
SetEnvIfNoCase User-Agent "Diffbot" bad_bot
SetEnvIfNoCase User-Agent "FacebookBot" bad_bot
SetEnvIfNoCase User-Agent "cohere-ai" bad_bot
SetEnvIfNoCase User-Agent "PerplexityBot" bad_bot
SetEnvIfNoCase User-Agent "YouBot" bad_bot
<RequireAll>
Require all granted
Require not env bad_bot
</RequireAll>

Note: the SetEnvIfNoCase + <RequireAll> approach requires mod_authz_core and mod_setenvif. Both are typically available in OpenLiteSpeed, but verify in WebAdmin → Server → Modules. If unavailable, use the RewriteRule approach above.

Server-level rules via WebAdmin
For rules that apply across all virtual hosts and fire before LSCache, add them at the server level in WebAdmin:
- Log in to the WebAdmin Console (typically https://your-server:7080)
- Navigate to Server Configuration → Rewrite
- Set Enable Rewrite to Yes
- In the Rewrite Rules field, add:
RewriteCond %{HTTP_USER_AGENT} (GPTBot|ClaudeBot|anthropic-ai|CCBot|Google-Extended|AhrefsBot|Bytespider|Amazonbot|Diffbot|FacebookBot|cohere-ai|PerplexityBot|YouBot) [NC]
RewriteRule .* - [F,L]

- Click Save
- Apply changes: Actions → Graceful Restart
Server-level rewrite rules fire before the cache layer; rules at the virtual-host level (.htaccess or the vhost config) fire after. For blocking AI bots efficiently across all sites, server-level rules are more reliable with LSCache.

X-Robots-Tag via WebAdmin or .htaccess
Option 1: WebAdmin Custom Response Headers (recommended)
- WebAdmin Console → Virtual Hosts → [your vhost] → General
- Scroll to Custom Response Headers
- Add:
X-Robots-Tag: noai, noimageai

- Save → Graceful Restart
Or at the server level (applies to all vhosts): Server Configuration → General → Custom Response Headers.
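For hand-edited vhost conf files, the GUI field corresponds to the extraHeaders setting on a context. This is a sketch; verify the field name against the context settings of your OLS version:

```
context / {
  location        $DOC_ROOT/
  allowBrowse     1
  extraHeaders    X-Robots-Tag: noai, noimageai
}
```

As with all config-file changes, a graceful restart is needed before the header appears in responses.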
Option 2: .htaccess with mod_headers
# .htaccess
Header always set X-Robots-Tag "noai, noimageai"

Note: the Header directive requires mod_headers. Verify it's loaded in WebAdmin → Server → Modules. If not available, use the WebAdmin Custom Response Headers GUI instead.

robots.txt as a static file
Place robots.txt in your document root. OpenLiteSpeed serves static files automatically — no additional configuration needed.
User-agent: *
Allow: /
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: AhrefsBot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: YouBot
Disallow: /
Sitemap: https://example.com/sitemap.xml

mod_security rules
If ModSecurity is installed on your OpenLiteSpeed server (available as a module), you can add WAF rules to block AI bots. ModSecurity rules fire at the server level before .htaccess processing:
# /etc/modsecurity/modsecurity.conf or a custom rules file
# Block AI training bots by User-Agent
SecRule REQUEST_HEADERS:User-Agent "@rx (?i)(GPTBot|ClaudeBot|anthropic-ai|CCBot|Google-Extended|AhrefsBot|Bytespider|Amazonbot|Diffbot|FacebookBot|cohere-ai|PerplexityBot|YouBot)" "id:10001, phase:1, deny, status:403, log, msg:'AI bot blocked', logdata:'Matched UA: %{REQUEST_HEADERS.User-Agent}'"
Enable in WebAdmin: Server → Security → ModSecurity → Enable ModSecurity → Yes. Add your rules file to the ModSecurity Rules path.
LSCache and bot blocking
LSCache (LiteSpeed Cache) serves cached responses before most request processing — including .htaccess rules. This means blocked bots may still receive cached pages. Three strategies to handle this:
Strategy 1: Server-level rewrite rules (recommended)
Place bot-blocking rules at the server level in WebAdmin (not in .htaccess). Server-level rules fire before the cache layer.
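One way to tell whether a response was served by LSCache is the X-LiteSpeed-Cache response header, which the LiteSpeed Cache plugin can emit (availability depends on your setup and debug settings). The helper below simply inspects response headers piped into it:

```shell
#!/bin/sh
# came_from_cache: reads HTTP response headers on stdin,
# exits 0 if the X-LiteSpeed-Cache header reports a cache hit
came_from_cache() {
  grep -qi '^x-litespeed-cache: *hit'
}

# Usage against a live site (yoursite.com is a placeholder):
#   curl -sI -A "GPTBot/1.0" https://yoursite.com | came_from_cache \
#     && echo "cached response: bot-blocking rules were bypassed"
```

If a request with a blocked User-Agent comes back as a cache hit, the block is firing too late in the request pipeline and should be moved to the server level.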
Strategy 2: LSCache bot exclusion config
Configure LSCache to not serve cached responses to known bots. In your .htaccess (WordPress/LiteSpeed Cache plugin) or WebAdmin LSCache config:
# wp-config.php or .htaccess (LSCache WordPress plugin)
# The LSCache plugin has built-in bot detection settings:
# WP Admin → LiteSpeed Cache → Cache → Do Not Cache → Bot IPs/UAs

Strategy 3: ModSecurity at the server level
ModSecurity fires in phase:1 (request headers), before cache lookup. Use ModSecurity rules for the most reliable blocking regardless of cache state.
To verify, request a page with a blocked User-Agent:

curl -A "GPTBot/1.0" https://yoursite.com

This should return 403. If it returns 200, the block is being bypassed by the cache; move the rules to the server level.

Full .htaccess example
# .htaccess — OpenLiteSpeed / LiteSpeed Enterprise
# Place in your document root
# ── Enable rewriting ─────────────────────────────────────────────────────────
RewriteEngine On
# ── Block AI bots (RewriteRule approach) ─────────────────────────────────────
RewriteCond %{HTTP_USER_AGENT} (GPTBot|ClaudeBot|anthropic-ai|CCBot|Google-Extended) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (AhrefsBot|Bytespider|Amazonbot|Diffbot|FacebookBot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (cohere-ai|PerplexityBot|YouBot) [NC]
RewriteRule .* - [F,L]
# ── X-Robots-Tag (requires mod_headers) ──────────────────────────────────────
Header always set X-Robots-Tag "noai, noimageai"
# ── Security headers ──────────────────────────────────────────────────────────
Header always set X-Content-Type-Options "nosniff"
Header always set X-Frame-Options "SAMEORIGIN"
Header always set Referrer-Policy "strict-origin-when-cross-origin"
# ── HTTPS redirect ────────────────────────────────────────────────────────────
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
# ── WWW redirect ──────────────────────────────────────────────────────────────
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ https://www.%{HTTP_HOST}/$1 [R=301,L]
# ── Static file caching ───────────────────────────────────────────────────────
<FilesMatch "\.(css|js|png|jpg|jpeg|gif|ico|woff2|svg)$">
Header set Cache-Control "public, max-age=2592000"
</FilesMatch>

Verify and reload
# Test .htaccess is being read (check error log)
tail -f /usr/local/lsws/logs/error.log
# Graceful restart via CLI
/usr/local/lsws/bin/lswsctrl restart
# Or via WebAdmin: Actions → Graceful Restart
# Test bot blocking
curl -A "GPTBot/1.0" https://yoursite.com
# Expected: HTTP/1.1 403 Forbidden

FAQ
How do I block AI bots by User-Agent on OpenLiteSpeed?
Use RewriteCond %{HTTP_USER_AGENT} with [NC] flag and RewriteRule .* - [F,L] in .htaccess. Enable .htaccess support first in WebAdmin. For pre-cache blocking, add the same rules at the server level in WebAdmin → Server Configuration → Rewrite.
Does OpenLiteSpeed support .htaccess files?
Yes — enable in WebAdmin: Virtual Hosts → [vhost] → General → Enable .htaccess → Yes. Rewrite rules, access control, and header directives are Apache-compatible. Not all Apache modules are supported — check OLS documentation for the full list.
How do I add X-Robots-Tag on OpenLiteSpeed?
Via WebAdmin: Virtual Host → General → Custom Response Headers → add X-Robots-Tag: noai, noimageai. Or via .htaccess: Header always set X-Robots-Tag "noai, noimageai" (requires mod_headers).
What is the difference between OpenLiteSpeed and LiteSpeed Enterprise?
OLS is free and open-source. LiteSpeed Enterprise is the commercial version with cPanel/WHM integration, full Apache .htaccess/mod_rewrite compatibility, and enterprise support; both support HTTP/3 and QUIC. Bot blocking configuration is identical: the .htaccess rules in this guide work the same in both.
Does LSCache affect bot blocking?
Yes — LSCache may serve cached responses before .htaccess rules fire. Fix: add bot-blocking rules at the server level in WebAdmin (fires before cache), or use ModSecurity (phase:1, fires before cache), or configure LSCache to not cache bot requests.