How to Block AI Bots on Ruby on Rails
Every new Rails app ships with a public/robots.txt — most developers never edit it. Replace its contents in 30 seconds for the quickest fix, then layer in before_action blocking and Rack middleware for defence-in-depth.
Rails ships public/robots.txt by default
Run ls public/ in any Rails project — you'll find robots.txt already there. The default content is minimal. Just edit it — no new files, no routes, no controllers needed.
Replace public/robots.txt
Rails serves public/ as static files before routing runs, via the ActionDispatch::Static middleware or your front-end web server in production, so this request never reaches a controller.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
All Methods
public/robots.txt (Recommended)
Easy · All deployments
public/robots.txt
Rails ships with a default public/robots.txt in every new app. Edit it — Rails serves public/ as static assets before the stack runs. No controller, no route, no config needed.
The default file has generic content. Replace it entirely. Plain text only — no ERB syntax.
RobotsController — dynamic robots.txt
Easy · All deployments
app/controllers/robots_controller.rb
A dedicated controller that renders plain text. Useful for environment-based rules (block all in staging) or generating rules from config. Remove public/robots.txt first — it takes precedence.
Use render plain: or a .text.erb template. Register route with format: false to avoid .txt format matching issues.
noai meta tag in application layout
Easy · All deployments
app/views/layouts/application.html.erb
Add <meta name="robots" content="noai, noimageai"> to the application layout's <head>. Applies to all pages using the default layout. Use content_for for per-page override.
Works for HTML responses only. Bots that ignore meta tags still need robots.txt or middleware.
before_action in ApplicationController
Easy · All deployments
app/controllers/application_controller.rb
A before_action that checks request.user_agent and calls head :forbidden for matched bots. Runs before any controller action — no page is rendered for blocked bots.
Use skip_before_action :block_ai_bots in specific controllers to exclude them.
Rack middleware — pre-Rails blocking
Intermediate · All deployments
config/application.rb → config.middleware.insert_before
A Rack middleware class inserted before the Rails stack. Blocks AI bots before routing, session loading, or any Rails processing — the most efficient Ruby-layer method.
Slightly more complex than before_action but runs earlier in the request lifecycle.
nginx reverse proxy
Intermediate · nginx deployments
nginx server block config
Block AI bots in nginx before the request reaches Puma/Unicorn and Rails. Zero Ruby overhead for blocked bots. Standard for VPS deployments via Capistrano or Kamal.
Not available on Heroku without custom buildpacks. Use middleware approach on PaaS.
Method 1: public/robots.txt
Rails serves everything in public/ as static assets: the ActionDispatch::Static middleware (or your front-end web server) answers these requests before Rails routing runs. Open public/robots.txt and replace its contents:
User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /
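If you want a quick sanity check that the file blocks a given crawler, a few lines of plain Ruby can scan the stanzas. This is a deliberately simplified scanner for illustration only; real robots.txt parsing (wildcards, grouped user agents, rule precedence) is more involved.

```ruby
# Tiny, simplified robots.txt scanner: returns true if the given user agent
# has its own stanza containing "Disallow: /". Illustrative only.
def blocked?(robots_txt, agent)
  blocked = false
  current = nil
  robots_txt.each_line do |line|
    case line.strip
    when /\AUser-agent:\s*(.+)\z/i then current = Regexp.last_match(1)
    when %r{\ADisallow:\s*/\z}i    then blocked = true if current == agent
    end
  end
  blocked
end

robots = <<~TXT
  User-agent: *
  Allow: /

  User-agent: GPTBot
  Disallow: /
TXT

puts blocked?(robots, "GPTBot")    # true
puts blocked?(robots, "Googlebot") # false
```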
Method 2: RobotsController
For dynamic robots.txt — different rules per environment, reading from config — create a dedicated controller. First, delete public/robots.txt (it takes precedence over the route), then:
# Generate the controller
rails generate controller Robots index
# config/routes.rb
Rails.application.routes.draw do
  get '/robots.txt', to: 'robots#index', format: false
  # ... rest of routes
end
# app/controllers/robots_controller.rb
class RobotsController < ApplicationController
skip_before_action :verify_authenticity_token
skip_before_action :block_ai_bots, raise: false # don't block robots.txt itself
AI_BOTS = %w[
GPTBot ChatGPT-User OAI-SearchBot
ClaudeBot anthropic-ai Google-Extended
Bytespider CCBot PerplexityBot
meta-externalagent Amazonbot Applebot-Extended
xAI-Bot DeepSeekBot MistralBot Diffbot
cohere-ai AI2Bot Ai2Bot-Dolma YouBot
DuckAssistBot omgili omgilibot
webzio-extended gemini-deep-research
].freeze
def index
lines = ["User-agent: *", "Allow: /", ""]
if Rails.env.production?
AI_BOTS.each do |bot|
lines << "User-agent: #{bot}" << "Disallow: /" << ""
end
else
# Block all crawlers outside production
lines = ["User-agent: *", "Disallow: /"]
end
lines << "Sitemap: #{request.base_url}/sitemap.xml"
render plain: lines.join("\n"),
content_type: "text/plain",
layout: false
end
end

format: false in routes
Without format: false, Rails parses the trailing .txt as a format segment, so the route actually matches /robots with format :txt. That usually still works for render plain:, but it can confuse template lookup and respond_to handling, and in some configurations leads to unexpected 404s. Always add format: false so the route matches the literal /robots.txt path.
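The body-building logic in the controller's index action can be exercised outside Rails. This plain-Ruby sketch uses a shortened bot list and a hard-coded base URL (both assumptions for illustration) to show the text the production branch produces:

```ruby
# Sketch of the robots.txt text the production branch builds.
# Shortened bot list and a hand-passed base URL stand in for the
# full AI_BOTS constant and request.base_url.
AI_BOTS = %w[GPTBot ClaudeBot CCBot].freeze

def robots_body(base_url)
  lines = ["User-agent: *", "Allow: /", ""]
  AI_BOTS.each do |bot|
    lines << "User-agent: #{bot}" << "Disallow: /" << ""
  end
  lines << "Sitemap: #{base_url}/sitemap.xml"
  lines.join("\n")
end

puts robots_body("https://example.com")
```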
Method 3: noai Meta Tag in Application Layout
Add the noai meta tag to app/views/layouts/application.html.erb. All pages using this layout (the default for all controllers) will include it:
<%# app/views/layouts/application.html.erb (excerpt) %>
<!DOCTYPE html>
<html>
<head>
<title><%= content_for?(:title) ? yield(:title) : "My App" %></title>
<meta name="viewport" content="width=device-width,initial-scale=1">
<%# Block AI training crawlers on every page %>
    <% if content_for?(:robots_meta_override) %>
      <%= yield :robots_meta_override %>
    <% else %>
      <meta name="robots" content="noai, noimageai">
    <% end %>
<%= csrf_meta_tags %>
<%= csp_meta_tag %>
<%= stylesheet_link_tag "application" %>
</head>
<body>
<%= yield %>
</body>
</html>

To allow AI indexing on a specific page, override in that view:
<%# app/views/blog/show.html.erb %>
<% content_for :robots_meta_override do %>
  <meta name="robots" content="index, follow">
<% end %>
<%# ... rest of view %>
Method 4: before_action in ApplicationController
Add a before_action to app/controllers/application_controller.rb. Since every controller inherits from ApplicationController, this intercepts all requests before any action runs:
# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
before_action :block_ai_bots
private
BLOCKED_UA_PATTERN = /
GPTBot|ChatGPT-User|OAI-SearchBot|
ClaudeBot|anthropic-ai|Google-Extended|
Bytespider|CCBot|PerplexityBot|
meta-externalagent|Amazonbot|Applebot-Extended|
xAI-Bot|DeepSeekBot|MistralBot|Diffbot|
cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|
DuckAssistBot|omgili|omgilibot|
webzio-extended|gemini-deep-research
/xi
def block_ai_bots
ua = request.user_agent.to_s
head :forbidden if BLOCKED_UA_PATTERN.match?(ua)
end
end

To allow AI bots to reach a specific controller (e.g. your robots.txt controller or a public API), use skip_before_action:
# app/controllers/robots_controller.rb
class RobotsController < ApplicationController
  skip_before_action :block_ai_bots, raise: false
  # ...
end

# app/controllers/api/v1/base_controller.rb
class Api::V1::BaseController < ApplicationController
  skip_before_action :block_ai_bots, raise: false
  # Public API — let AI bots access if desired
end
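The user-agent regex can be sanity-checked in plain Ruby before wiring it into the controller. The pattern here is a shortened version of BLOCKED_UA_PATTERN, and the sample User-Agent strings are illustrative:

```ruby
# Shortened version of the blocklist regex, checked against sample UA strings.
PATTERN = /GPTBot|ClaudeBot|CCBot|PerplexityBot|Bytespider/i

bot_ua   = "Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"
human_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 Safari/537.36"

puts PATTERN.match?(bot_ua)    # true  -> head :forbidden would fire
puts PATTERN.match?(human_ua)  # false -> request proceeds normally
```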
Method 5: Rack Middleware
A Rack middleware class runs before the Rails stack — before routing, before session loading, before ActionController. Create the file and insert it in config/application.rb:
# lib/middleware/block_ai_bots.rb
module Middleware
class BlockAiBots
BLOCKED_UAS = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|webzio-extended|gemini-deep-research/i
def initialize(app)
@app = app
end
def call(env)
ua = env['HTTP_USER_AGENT'].to_s
if BLOCKED_UAS.match?(ua)
[403, { 'Content-Type' => 'text/plain' }, ['Forbidden']]
else
@app.call(env)
end
end
end
end

# config/application.rb
require_relative '../../lib/middleware/block_ai_bots'
module YourApp
class Application < Rails::Application
# Insert before the Rails stack
config.middleware.insert_before 0, Middleware::BlockAiBots
end
end

Method 6: nginx Reverse Proxy
Production Rails apps typically run Puma behind nginx (deployed via Capistrano, Kamal, or manually). Add a user agent check to nginx — matched bots never reach Puma or Ruby:
# /etc/nginx/sites-available/yourapp
upstream puma {
server unix:///var/www/yourapp/shared/tmp/sockets/puma.sock;
}
server {
listen 80;
server_name yourdomain.com;
root /var/www/yourapp/current/public;
# Block AI training crawlers — before Puma/Ruby
if ($http_user_agent ~* "(GPTBot|ClaudeBot|anthropic-ai|CCBot|Bytespider|Google-Extended|PerplexityBot|Diffbot|DeepSeekBot|MistralBot|cohere-ai|meta-externalagent|Amazonbot|xAI-Bot|AI2Bot|omgili|webzio-extended|gemini-deep-research|OAI-SearchBot|ChatGPT-User)") {
return 403;
}
# Serve public/ directly (including robots.txt) — bypass Rails
try_files $uri/index.html $uri @puma;
location @puma {
proxy_pass http://puma;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}

AI Bots to Block
25 user agents covering AI training crawlers and AI search bots. The robots.txt, before_action, and middleware patterns above include all of them.
Frequently Asked Questions
Does Rails have a robots.txt file by default?
Yes. Every new Rails app generated with rails new includes a public/robots.txt file. Rails serves the public/ directory as static assets before the Rails stack processes requests — no route or controller needed. The default file contains only a generic comment. Simply replace its content with your AI bot blocking rules. This is the fastest method and works on all Rails deployment targets.
How do I create a dynamic robots.txt controller in Rails?
Generate a controller with rails generate controller Robots index, then add get '/robots.txt', to: 'robots#index', format: false to config/routes.rb. In the action, call render plain: content to return a text/plain response. Alternatively, create an ERB template at app/views/robots/index.text.erb; Rails renders it with a text/plain Content-Type when the request format resolves to :text. Remove public/robots.txt first, because the static file takes precedence over the controller if both exist.
How do I use before_action to block AI bots in Rails?
Add a private method to ApplicationController that checks request.user_agent against a regex of AI bot names and calls head :forbidden (HTTP 403) if matched. Register it with before_action :block_ai_bots to apply it to all controller actions. Since ApplicationController is the base class for all controllers in a Rails app, this blocks bots before any action runs. You can exclude specific controllers using skip_before_action :block_ai_bots.
What is the difference between before_action and Rack middleware for bot blocking in Rails?
before_action runs inside the Rails stack — after routing, after session handling, but before your controller action. It has access to the full Rails request object. Rack middleware runs before Rails itself — before routing, before session loading, before any Rails processing. Rack middleware is more efficient (slightly faster, uses less memory) because it short-circuits earlier. For most Rails apps, before_action is simpler to implement and maintain. Rack middleware is preferable for high-traffic apps or when you want to block bots before any Rails overhead.
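The short-circuit behaviour is easy to see in plain Ruby, because a Rack middleware is just an object with a call(env) method. This sketch restates the middleware with a shortened pattern and uses a stub lambda standing in for the Rails app; the blocked request returns 403 without the inner app ever running:

```ruby
# Minimal restatement of the middleware (shortened pattern), exercised
# directly with hand-built Rack env hashes and a stub downstream app.
class BlockAiBots
  BLOCKED_UAS = /GPTBot|ClaudeBot|CCBot/i

  def initialize(app)
    @app = app
  end

  def call(env)
    if BLOCKED_UAS.match?(env['HTTP_USER_AGENT'].to_s)
      [403, { 'Content-Type' => 'text/plain' }, ['Forbidden']]
    else
      @app.call(env)
    end
  end
end

# Stub "Rails" app: only runs when the middleware lets the request through.
inner = ->(env) { [200, { 'Content-Type' => 'text/plain' }, ['OK']] }
stack = BlockAiBots.new(inner)

puts stack.call('HTTP_USER_AGENT' => 'GPTBot/1.0')[0]  # 403, inner app skipped
puts stack.call('HTTP_USER_AGENT' => 'Mozilla/5.0')[0] # 200, inner app ran
```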
How do I add noai meta tags to every Rails page?
Add the meta tag to your application layout template at app/views/layouts/application.html.erb. Place <meta name="robots" content="noai, noimageai"> inside the <head> section, typically after the existing meta charset and viewport tags. This applies to every page that uses the application layout (all pages by default). For per-page control, use a content_for block: define <%= yield :head %> in the layout, then use <% content_for :head do %><meta name="robots" content="index, follow"><% end %> in specific views.
Does blocking AI bots affect Rails caching or ActionView?
No. Rails page caching, fragment caching, and ActionView are completely unaffected by robots.txt directives or noai meta tags. If you use before_action to return 403, the response bypasses ActionView rendering entirely — no views are rendered, no cache is read or written for blocked requests. If you use public/robots.txt (static file), the Rails stack never runs for that request at all.