
How do I stop Amazon from crawling a website?

I would like to prevent Amazon from scraping product data on my website. I found this document: https://developer.amazon.com/amazonbot

And this example:

User-agent: Amazonbot       # Amazon's user agent
Disallow: /do-not-crawl/    # disallow this directory

So, would it work if I add:

User-agent: Amazonbot       # Amazon's user agent
Disallow: /                 # disallow access to all the website

or maybe

User-agent: Amazonbot       # Amazon's user agent
Disallow: /Technology/      # disallow access to Technology category page

In particular, would the /Technology/ rule prevent Amazonbot from accessing all products listed on the website's Technology page?
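One way to explore this locally is to feed the rules to a robots.txt parser and ask it about specific URLs. The sketch below uses Python's standard-library `urllib.robotparser`. The helper `is_allowed`, the host `example.com`, and both product paths are made up for illustration. It shows only how that parser applies the rule (as a path prefix); it is not a statement about how Amazonbot itself behaves.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url.

    Hypothetical helper: it parses the robots.txt text in memory
    instead of downloading it from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The second rule set from the question, one directive per line.
BLOCK_CATEGORY = """\
User-agent: Amazonbot
Disallow: /Technology/
"""

# Assumed URLs: the same product reachable under two different paths.
under_category = "https://example.com/Technology/laptop-123"
outside_category = "https://example.com/products/laptop-123"

print(is_allowed(BLOCK_CATEGORY, "Amazonbot", under_category))
print(is_allowed(BLOCK_CATEGORY, "Amazonbot", outside_category))
```

With this parser, only URLs whose path starts with /Technology/ are refused. A product that is linked from the Technology page but served under a different path would still be allowed by this rule.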

I am also concerned about the mention of crawl delay on their 'Help' page.

I currently have:

User-agent: *
Disallow: /admin/
Disallow: /api/
Crawl-delay: 1

User-agent: Amazonbot
Disallow: /

This obviously includes a crawl delay, yet their 'Help' page has this comment:

Today, AmazonBot does not support the crawl-delay directive in robots.txt and robots meta tags on HTML pages such as “nofollow” and "noindex".
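To see which group the crawl delay belongs to in the file I currently have, the rules can be loaded into Python's standard-library `urllib.robotparser`. This is a rough sketch: `SomeOtherBot` is a made-up user agent, and the output reflects how this particular parser groups directives, not how Amazonbot reads the file.

```python
from urllib.robotparser import RobotFileParser

# The current robots.txt from the question, one directive per line.
CURRENT_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
Crawl-delay: 1

User-agent: Amazonbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(CURRENT_ROBOTS_TXT.splitlines())

# Crawl-delay is read per user-agent group.
amazon_delay = parser.crawl_delay("Amazonbot")
other_delay = parser.crawl_delay("SomeOtherBot")  # hypothetical bot name

# Access checks for the site root under each group.
amazon_root = parser.can_fetch("Amazonbot", "/")
other_root = parser.can_fetch("SomeOtherBot", "/")

print("Amazonbot:    delay =", amazon_delay, "| may fetch / =", amazon_root)
print("SomeOtherBot: delay =", other_delay, "| may fetch / =", other_root)
```

In this parser's reading, the Crawl-delay line sits in the `*` group and the Amazonbot group carries only `Disallow: /`, so the delay and the Amazonbot block are independent of each other.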
