AI Training + Search · Amazon

How to Block Amazonbot

Amazon runs three distinct crawlers — each with different purposes and different privacy implications. Most sites should only block one of them. Here's how to tell them apart and what to do.

🤖 Amazonbot
AI model training crawler — collects data for Amazon's ML models. Block this one.
🔍 Amzn-SearchBot
Rufus AI + Alexa search experiences. NOT AI training. Think before blocking.
👤 Amzn-User
Live Alexa queries — real users asking real questions. Usually allow.

Amazon's Three Crawlers

Amazon operates three separate bots, each with a distinct purpose, rather than a single crawler. Getting these wrong, especially blocking Amzn-SearchBot when you're an e-commerce brand, can hurt visibility in Amazon's AI experiences.

Amazonbot
AI model training
Block to opt out of training

Amazon's primary web research and AI training crawler. Collects web content to improve Amazon's machine learning models — Alexa's language understanding, question answering, and general Amazon AI capabilities. This is the bot you're opting out of when you want to stop Amazon from training on your content.

User-Agent
Mozilla/5.0 (compatible; Amazonbot/0.1; +https://developer.amazon.com/amazonbot)

Amzn-SearchBot
Rufus AI + Alexa search (NOT AI training)
Think before blocking

Specifically powers Amazon's search experiences: Rufus (Amazon's AI shopping assistant, used by hundreds of millions of Amazon customers) and Alexa knowledge and shopping results. Amazon explicitly states Amzn-SearchBot does not crawl for generative AI model training.

Blocking Amzn-SearchBot means your products, content, or brand won't appear in Rufus AI answers when customers ask shopping questions on Amazon.com. For most e-commerce brands and publishers that want Amazon visibility, this is the wrong bot to block.

User-Agent token
Amzn-SearchBot

Amzn-User
Live Alexa queries (NOT AI training)
Usually allow

When a customer asks Alexa a question that needs real-time information, Amzn-User fetches live content from the web on that user's behalf — like a browser acting for a real person. Amazon states it does not crawl for AI training. This represents actual user intent: someone actively asking Alexa a question.

User-Agent
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amzn-User/0.1) Chrome/119.0.6045.214 Safari/537.36
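
If you want to tell the three crawlers apart in your own code (for example in logging or analytics), a small classifier over the User-Agent header is usually enough. The following is a minimal sketch, not an Amazon API: the type `AmazonCrawler` and the function `classifyAmazonCrawler` are hypothetical names, and the match is a plain case-insensitive substring check on the tokens listed above.

```typescript
// Hypothetical helper: names are illustrative, not part of any Amazon API.
type AmazonCrawler = {
  token: "Amazonbot" | "Amzn-SearchBot" | "Amzn-User";
  purpose: string;
  usedForTraining: boolean;
};

// Purposes as described in this guide.
const AMAZON_CRAWLERS: AmazonCrawler[] = [
  { token: "Amazonbot", purpose: "AI model training", usedForTraining: true },
  { token: "Amzn-SearchBot", purpose: "Rufus and Alexa search", usedForTraining: false },
  { token: "Amzn-User", purpose: "live Alexa queries", usedForTraining: false },
];

// Case-insensitive substring match on the crawler token.
// User-Agent headers can be spoofed, so treat the result as a hint only.
function classifyAmazonCrawler(userAgent: string): AmazonCrawler | null {
  const ua = userAgent.toLowerCase();
  return AMAZON_CRAWLERS.find((c) => ua.includes(c.token.toLowerCase())) ?? null;
}
```

A request from an ordinary browser returns `null`, so the same function can sit in front of any per-bot policy you choose.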

Which Amazon Bot Should You Block?

Block Amazonbot if:
You don't want Amazon training AI models on your content. This is the standard AI training opt-out. Safe to block for most sites — no search or visibility impact.
Block Amzn-SearchBot only if:
You specifically don't want your content surfacing in Amazon's Rufus AI answers or Alexa results. Publishers focused on competitor ecosystems or those with exclusive content arrangements might block this. E-commerce brands and most content publishers should allow it.
Block Amzn-User only if:
You don't want Amazon acting as a proxy to fetch live content from your site for Alexa users. Unusual case — most sites benefit from Alexa users accessing their current content. Consider this if you have rate-limit concerns or strict access controls.

Option 1: Block via robots.txt

Block Amazonbot only (AI training opt-out)
Recommended for most sites
robots.txt
# Block Amazon AI training crawler only
# Allows Amzn-SearchBot (Rufus + Alexa search) and Amzn-User (live queries)
User-agent: Amazonbot
Disallow: /

Stops AI model training without affecting your visibility in Rufus AI answers or Alexa experiences.

Block all three Amazon crawlers
robots.txt
# Block all Amazon crawlers
User-agent: Amazonbot
Disallow: /

User-agent: Amzn-SearchBot
Disallow: /

User-agent: Amzn-User
Disallow: /

Full Amazon block. Prevents training, Rufus/Alexa search indexing, and live query access. Appropriate if you want no Amazon AI touchpoints.

Block Amazonbot + all major AI training crawlers
robots.txt
# Block Amazon AI training
User-agent: Amazonbot
Disallow: /

# Block other major AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

# Search engines — unaffected
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Amzn-SearchBot
Allow: /
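
Before deploying any of the files above, it can help to check which bots a given robots.txt actually blocks. The sketch below is a simplified reading of RFC 9309 matching: rules are grouped by user-agent, the longest matching path wins, Allow wins ties, and a bot without its own group falls back to the `*` group. It ignores path wildcards and compares user-agent tokens by exact (case-insensitive) equality, so it is not a substitute for how Amazon's crawlers parse the file. The names `parseRobots` and `isAllowed` are hypothetical.

```typescript
type Rule = { allow: boolean; path: string };

// Parse robots.txt into a map from lowercased user-agent token to its rules.
// Simplified sketch: no wildcard (* or $) handling inside paths.
function parseRobots(robotsTxt: string): Map<string, Rule[]> {
  const groups = new Map<string, Rule[]>();
  let agents: string[] = [];
  let inRules = false; // true once the current group has seen a rule line
  for (const raw of robotsTxt.split(/\r?\n/)) {
    const hash = raw.indexOf("#");
    const line = (hash >= 0 ? raw.slice(0, hash) : raw).trim();
    const sep = line.indexOf(":");
    if (sep < 0) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      if (inRules) {
        // A user-agent line after rules starts a new group.
        agents = [];
        inRules = false;
      }
      const agent = value.toLowerCase();
      agents.push(agent);
      if (!groups.has(agent)) groups.set(agent, []);
    } else if (field === "allow" || field === "disallow") {
      inRules = true;
      if (value === "") continue; // an empty rule restricts nothing
      for (const agent of agents) {
        groups.get(agent)!.push({ allow: field === "allow", path: value });
      }
    }
  }
  return groups;
}

// True if `token` may fetch `path` under the given robots.txt.
function isAllowed(robotsTxt: string, token: string, path: string): boolean {
  const groups = parseRobots(robotsTxt);
  const rules = groups.get(token.toLowerCase()) ?? groups.get("*") ?? [];
  let best: Rule | null = null;
  for (const rule of rules) {
    if (!path.startsWith(rule.path)) continue;
    const longer = best === null || rule.path.length > best.path.length;
    const tieAllow = best !== null && rule.path.length === best.path.length && rule.allow;
    if (longer || tieAllow) best = rule;
  }
  return best === null ? true : best.allow;
}
```

With the "Block Amazonbot only" file, `isAllowed(file, "Amazonbot", "/pricing")` evaluates to `false`, while the same call for `Amzn-SearchBot` or `Amzn-User` evaluates to `true`, because neither has a group of its own and there is no `*` group.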

Option 2: Next.js App Router

app/robots.ts
import type { MetadataRoute } from 'next';

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // Block Amazon AI training only
      { userAgent: 'Amazonbot', disallow: ['/'] },

      // Allow Amazon search/Alexa experiences (Rufus, etc.)
      { userAgent: 'Amzn-SearchBot', allow: ['/'] },
      { userAgent: 'Amzn-User', allow: ['/'] },

      // Block other AI training crawlers
      { userAgent: 'GPTBot', disallow: ['/'] },
      { userAgent: 'ClaudeBot', disallow: ['/'] },
      { userAgent: 'anthropic-ai', disallow: ['/'] },
      { userAgent: 'Google-Extended', disallow: ['/'] },
      { userAgent: 'PerplexityBot', disallow: ['/'] },
      { userAgent: 'CCBot', disallow: ['/'] },

      // Allow search engines
      { userAgent: 'Googlebot', allow: ['/'] },
      { userAgent: '*', allow: ['/'] },
    ],
    sitemap: 'https://yoursite.com/sitemap.xml',
  };
}
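
As the list of blocked crawlers grows, it may be easier to keep the user-agent names in two arrays and derive the rules from them. This is a sketch of one way to do that, not a Next.js feature: `RobotsRule` is a local stand-in for the rule shape used above (in a real project the type comes from `MetadataRoute.Robots`), and `buildRules`, `BLOCKED_AGENTS`, and `ALLOWED_AGENTS` are hypothetical names.

```typescript
// Local stand-in for the rule objects returned from app/robots.ts.
type RobotsRule = { userAgent: string; allow?: string[]; disallow?: string[] };

// Crawlers to opt out of (taken from the example above).
const BLOCKED_AGENTS = [
  "Amazonbot", "GPTBot", "ClaudeBot", "anthropic-ai",
  "Google-Extended", "PerplexityBot", "CCBot",
];

// Crawlers that stay allowed, including the catch-all "*".
const ALLOWED_AGENTS = ["Amzn-SearchBot", "Amzn-User", "Googlebot", "*"];

// One site-wide disallow rule per blocked agent, one allow rule per allowed agent.
function buildRules(blocked: string[], allowed: string[]): RobotsRule[] {
  return [
    ...blocked.map((userAgent) => ({ userAgent, disallow: ["/"] })),
    ...allowed.map((userAgent) => ({ userAgent, allow: ["/"] })),
  ];
}
```

In `app/robots.ts`, the `rules` property would then be `buildRules(BLOCKED_AGENTS, ALLOWED_AGENTS)`, with the `sitemap` entry unchanged.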

Option 3: nginx — Hard Block

For server-level enforcement that doesn't depend on Amazon honoring robots.txt:

nginx.conf
# Block Amazonbot (AI training) with a hard 403
# Place this inside the relevant server { } block
if ($http_user_agent ~* "Amazonbot") {
    return 403;
}

# Optionally block Amzn-SearchBot too
# if ($http_user_agent ~* "Amzn-SearchBot") {
#     return 403;
# }

Returns HTTP 403 before the request reaches your application. Because the match applies to every path, Amazonbot also receives a 403 for /robots.txt itself. Combine with robots.txt for defense in depth.

Option 4: Cloudflare WAF Rule

Cloudflare WAF → Custom Rules → Expression
(http.user_agent contains "Amazonbot")

Set the action to Block. To also block Amzn-SearchBot:

Block both training + search crawlers
(http.user_agent contains "Amazonbot") or (http.user_agent contains "Amzn-SearchBot")

Cloudflare Dashboard → Security → WAF → Custom Rules → Create rule
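
If you manage the blocked list in code, the expression can be generated rather than typed by hand. A minimal sketch follows; `buildUserAgentExpression` is a hypothetical helper, and it assumes the tokens contain no double quotes, since nothing is escaped.

```typescript
// Build a Cloudflare custom-rule expression that matches any of the given
// User-Agent tokens, in the same form as the expressions shown above.
// Assumption: tokens contain no double quotes (no escaping is done).
function buildUserAgentExpression(tokens: string[]): string {
  return tokens
    .map((token) => `(http.user_agent contains "${token}")`)
    .join(" or ");
}
```

Calling it with `["Amazonbot", "Amzn-SearchBot"]` produces the two-crawler expression shown above; paste the result into the rule's expression editor.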

What Is Rufus — And Why It Matters for Publishers

Rufus is Amazon's AI shopping assistant, launched in 2024 and now available to hundreds of millions of Amazon customers. When a customer asks Rufus a question — "What's the best laptop for video editing?" or "Compare these two coffee makers" — Rufus draws on content indexed by Amzn-SearchBot, including product reviews, buying guides, and editorial content from the open web.

The Amzn-SearchBot tradeoff for publishers

If you publish product reviews, buying guides, comparison content, or any content that Amazon customers might ask about — blocking Amzn-SearchBot means Rufus won't surface your site as a source. That's potential referral traffic and brand visibility you're opting out of. Blocking Amazonbot (the training crawler) carries no such cost.

For most content publishers, the right call is: block Amazonbot, allow Amzn-SearchBot. You stop contributing to Amazon's AI training datasets while keeping your content visible to Rufus users and Alexa queries.

Verify Your Block

bash
# Check nginx access logs for Amazon bots
grep "Amazonbot" /var/log/nginx/access.log | tail -20
grep "Amzn-SearchBot" /var/log/nginx/access.log | tail -10

# Confirm Amazonbot fetched robots.txt (then stopped)
grep "Amazonbot" /var/log/nginx/access.log | grep "robots.txt"

# If server-level blocked — confirm 403s
grep "Amazonbot" /var/log/nginx/access.log | grep " 403 "

# Check published IPs against your access logs
# Amazon publishes Amazonbot IPs at:
# https://developer.amazon.com/amazonbot/live-ip-addresses/

Seeing Amazonbot fetch /robots.txt followed by no content requests confirms the block is working. Changes are not instant: allow up to 24 hours for an updated robots.txt to take effect, and note that Amazon may fall back to a cached copy (up to 30 days old) if it cannot fetch the file.
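
The same checks can be done in one pass over the log. The sketch below assumes nginx's default "combined" log format and takes the log as an array of lines; `summarizeBot` and `BotSummary` are hypothetical names. Like the grep commands, it matches the token anywhere in the line.

```typescript
type BotSummary = {
  total: number;            // all requests whose log line mentions the token
  robotsTxtFetches: number; // requests for /robots.txt
  contentRequests: number;  // requests for any other path
  forbidden: number;        // requests answered with 403
};

// Tally requests from one crawler in nginx "combined" format log lines.
function summarizeBot(logLines: string[], token: string): BotSummary {
  const summary: BotSummary = { total: 0, robotsTxtFetches: 0, contentRequests: 0, forbidden: 0 };
  // Captures the request path and the status code: "GET /path HTTP/1.1" 200
  const requestPattern = /"[A-Z]+ (\S+) [^"]*" (\d{3}) /;
  const needle = token.toLowerCase();
  for (const line of logLines) {
    if (!line.toLowerCase().includes(needle)) continue;
    const match = requestPattern.exec(line);
    if (match === null) continue; // skip lines that are not in the expected format
    const path = match[1];
    const status = match[2];
    summary.total += 1;
    if (status === "403") summary.forbidden += 1;
    if (path === "/robots.txt") summary.robotsTxtFetches += 1;
    else summary.contentRequests += 1;
  }
  return summary;
}
```

A working robots.txt block tends to look like `robotsTxtFetches > 0` with `contentRequests === 0`; a working server-level block looks like `forbidden === total`.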

Frequently Asked Questions

Does Amazonbot respect robots.txt?
Yes. Amazon officially states all three crawlers honor the Robots Exclusion Protocol (RFC 9309), including user-agent and allow/disallow directives. Amazon also respects rel=nofollow and page-level meta tags (noarchive, noindex, none). Note: Amazon does not support the crawl-delay directive.
What user agents does Amazon use?
Amazon uses three: (1) Amazonbot — AI training. Token: Amazonbot. (2) Amzn-SearchBot — Rufus AI + Alexa search, NOT training. Token: Amzn-SearchBot. (3) Amzn-User — live Alexa queries, NOT training. Full UA: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amzn-User/0.1) Chrome/119.0.6045.214 Safari/537.36.
What is the difference between Amazonbot and Amzn-SearchBot?
Amazonbot is the AI training bot — it feeds Amazon's ML models. Amzn-SearchBot is the search and discovery bot — it feeds Rufus (Amazon's AI shopping assistant) and Alexa knowledge results. Amazon explicitly states Amzn-SearchBot does NOT crawl for AI model training.
Should I block Amzn-SearchBot if I have an e-commerce or content site?
Probably not. Amzn-SearchBot feeds Rufus — Amazon's AI shopping assistant used by hundreds of millions of customers. Blocking it means your products or content won't appear in Rufus answers. Block Amazonbot (the training bot) instead. Only block Amzn-SearchBot if you specifically don't want Amazon's AI ecosystem surfacing your content.
Does blocking Amazonbot affect my Amazon product listings?
No. Blocking Amazonbot has no effect on your Amazon product listings or Amazon.com search. Amazonbot collects data for AI model training, not the Amazon marketplace index. However, blocking Amzn-SearchBot will prevent your content from appearing in Rufus AI answers.
Will blocking Amazonbot affect my Google or Bing rankings?
No. Amazon bots are completely separate from Googlebot and Bingbot. Blocking any Amazon crawler has zero effect on your search engine rankings.
What is Amzn-User and should I block it?
Amzn-User fetches live web content on behalf of Alexa users asking real-time questions. Amazon states it does not crawl for AI training. Most publishers allow it — it represents actual user intent. Block it only if you don't want Amazon acting as a proxy to your site for Alexa users.
