I added the following lines to my robots.txt file a few weeks ago:
User-agent: msnbot
Crawl-delay: 5
But msnbot keeps crawling the site at a much faster rate than one request every 5 seconds. It generates twice as much traffic as Googlebot or Yahoo. What's wrong?
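One thing worth ruling out is a formatting problem in the file itself: each directive has to sit on its own line. As a rough check, here is a minimal sketch using Python's standard-library robots.txt parser. It only shows how that parser reads the rules, and it is an assumption that msnbot interprets them the same way; the helper name crawl_delay_for is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question, one directive per line.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 5
"""

# The same rules accidentally fused onto a single line.
FUSED_ROBOTS_TXT = "User-agent: msnbotCrawl-delay: 5\n"


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay Python's parser sees for user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print("separate lines:", crawl_delay_for(ROBOTS_TXT, "msnbot"))
    print("fused line:    ", crawl_delay_for(FUSED_ROBOTS_TXT, "msnbot"))
```

If the first call reports a delay and the crawler still ignores it, the file is probably fine and the problem is on the crawler's side.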
Hi,
Please contact me at bwmc@microsoft.com with "MSNBot overcrawling" in the subject line, and our team will work with you to find the cause and a solution.
Thanks and my apologies.
~B
*I no longer work for Bing.
Here is another way, mentioned by Brett Yount [Bing Expert]:
"Another way to help reduce load on your servers is by implementing HTTP compression and Conditional Get."
You can read about this method here:
http://blogs.msdn.com/webmaster/archive/2008/02/12/announcing-crawling-improvements-for-live-search.aspx
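For anyone unfamiliar with Conditional GET: the crawler sends an If-Modified-Since header, and the server answers 304 Not Modified with no body when the page has not changed since then. Below is a minimal sketch of that decision only, not tied to any particular web server; the function name conditional_get_status is hypothetical. Note that this reduces bytes transferred per request, not the number of requests.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    if_modified_since: raw header value sent by the client, or None.
    last_modified: timezone-aware datetime of the resource's last change.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable header: ignore it and send the full response.
        return 200
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


if __name__ == "__main__":
    changed = datetime(2008, 2, 12, 10, 0, 0, tzinfo=timezone.utc)
    header = format_datetime(changed, usegmt=True)
    print(header, conditional_get_status(header, changed))
```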
Also, I suggest you read this question:
"Q: How do I decrease MSNBot’s crawl rate?"
Here:
http://www.bing.com/community/forums/t/651373.aspx
We already use HTTP compression. The problem is the number of requests: it is way too high. As for the suggested read, it says to use Crawl-delay, which brings me back to my question.
1) Any idea why it is not working?
2) Does anyone have similar experience?
Thanks
We are having the same problem at http://events.berkeley.edu/
Our robots.txt file is here:
http://events.berkeley.edu/robots.txt
but we see Microsoft IPs hitting the site multiple times per second, and MSNBot is the top robot visitor by a factor of 10. We are about to block these IPs at the server level as a last resort.
How can we stop this level of traffic?
-Sara
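When reporting overcrawling, it may help to have numbers from the access log. The following is a rough sketch that counts how often consecutive requests from one crawler arrive closer together than the configured delay. It assumes Apache-style combined log lines with a bracketed timestamp and an English month abbreviation; the sample lines, the IP addresses, and the function name short_gaps are made up for illustration.

```python
from datetime import datetime

# Timestamp layout inside the [...] of an Apache-style log line (assumed).
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Hypothetical sample lines, not taken from any real log.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2009:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 "-" "msnbot/2.0b"',
    '192.0.2.1 - - [10/Oct/2009:13:55:36 +0000] "GET /b HTTP/1.1" 200 512 "-" "msnbot/2.0b"',
    '192.0.2.9 - - [10/Oct/2009:13:55:37 +0000] "GET /a HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '192.0.2.1 - - [10/Oct/2009:13:55:37 +0000] "GET /c HTTP/1.1" 200 512 "-" "msnbot/2.0b"',
    '192.0.2.1 - - [10/Oct/2009:13:55:45 +0000] "GET /d HTTP/1.1" 200 512 "-" "msnbot/2.0b"',
]


def short_gaps(log_lines, agent_substring, min_gap_seconds):
    """Return (gaps shorter than min_gap_seconds, total gaps) for one agent."""
    times = []
    for line in log_lines:
        if agent_substring.lower() not in line.lower():
            continue
        start = line.find("[")
        end = line.find("]", start)
        if start == -1 or end == -1:
            continue  # no timestamp found, skip the line
        times.append(datetime.strptime(line[start + 1:end], LOG_TIME_FORMAT))
    times.sort()
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return sum(1 for gap in gaps if gap < min_gap_seconds), len(gaps)


if __name__ == "__main__":
    too_fast, total = short_gaps(SAMPLE_LOG, "msnbot", 5)
    print(f"{too_fast} of {total} gaps were shorter than 5 seconds")
```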