MSNbot not obeying robots.txt

I've had this robots.txt file on my server (http://www.franklin.library.upenn.edu/robots.txt) since at least January 2009.  

   User-agent: *

   Disallow: /

MSNbot appears to be ignoring this rule when following deep links into my site.  I'm getting hundreds of log entries like this:
   65.55.207.101 - - [05/Aug/2009:17:11:45 -0400] "GET /cgi-bin/Pwebrecon.cgi DB=local&Search_Arg=tkey%20%22What%27s%20cooking%20Madam%20chairman%3f%3f%22&Search_Code=CMD&CNT=30&HIST=1 HTTP/1.0" 200 12937
How can I get MSNbot to stop indexing my site, short of blocking it at the firewall?
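
As a sanity check, the rule quoted above can be tested against the logged path with Python's standard `urllib.robotparser`. This is only a minimal sketch: the user-agent token `msnbot` and the helper name `is_allowed` are assumptions for illustration, and the parser's behaviour is not necessarily what the crawler itself does.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post, with the two directives on adjacent lines.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Path taken from the log entry in the post (query string omitted).
URL = "http://www.franklin.library.upenn.edu/cgi-bin/Pwebrecon.cgi"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this parser would let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "msnbot" is an assumed user-agent token, used here for illustration.
    print(is_allowed(ROBOTS_TXT, "msnbot", URL))
```

One thing worth ruling out: this parser treats a blank line between `User-agent` and `Disallow` as a record separator and then ignores the orphaned `Disallow`, so the two directives need to sit on adjacent lines in the file actually served.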

Verified Answer
  • Hi McKenzie,

    We are tracking this situation and vigorously working to fix these errors. Could you please send an email to bwmc@microsoft.com with your domain name and the title of this post in the subject line? Could you also send any documentation, such as clips from your log file, that might help us positively identify which bot is causing this issue?

    Your help is greatly appreciated.

    ~B

    *I no longer work for Bing.  
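
For anyone gathering the log clips requested above, a short script can pull out the matching entries and count requests per client. This is a rough sketch that assumes the Common Log Format seen in the original post; the function names and the `65.55.` prefix filter are illustrative assumptions, not an official list of crawler addresses.

```python
import re
from collections import Counter

# Common Log Format: host ident user [timestamp] "request" status size
CLF = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

# Log entry quoted in the original post.
SAMPLE = (
    '65.55.207.101 - - [05/Aug/2009:17:11:45 -0400] '
    '"GET /cgi-bin/Pwebrecon.cgi DB=local&Search_Arg=tkey%20%22What%27s'
    '%20cooking%20Madam%20chairman%3f%3f%22&Search_Code=CMD&CNT=30&HIST=1 '
    'HTTP/1.0" 200 12937'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF.match(line.strip())
    return match.groupdict() if match else None


def requests_per_ip(lines, ip_prefix=""):
    """Count requests per client IP, optionally limited to one address prefix."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["ip"].startswith(ip_prefix):
            counts[entry["ip"]] += 1
    return counts


if __name__ == "__main__":
    # The prefix is an assumption taken from the single address in the post.
    print(requests_per_ip([SAMPLE], ip_prefix="65.55."))
```

In practice the list passed to `requests_per_ip` would be the lines of the server's access log rather than the single sample entry.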

All Replies
  • We are having the same issue on our site.

    Our site contains forums, and the MSNBot is crawling the forum pages sometimes 4 or 5 times in the same day, with multiple connections every second. We have updated the robots.txt file per Bing specs, but nothing has changed.

    We are not having bandwidth issues, but the forums are database driven and at times the database is bogged down with requests.

    Other than blocking the bot's IP address, is there any way to get the bot to obey the robots.txt limitations?
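
One thing that may be worth checking for the crawl-rate problem is whether the robots.txt declares a `Crawl-delay` for the bot. `Crawl-delay` is a non-standard directive, so whether a given crawler honours it is an assumption here. The sketch below only confirms that the directive parses as intended; the 10-second value, the `/forums/search` path, and the `msnbot` token are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking one crawler to pause between requests.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 10
Disallow: /forums/search
"""


def declared_crawl_delay(robots_txt: str, user_agent: str):
    """Return the Crawl-delay declared for user_agent, or None if there is none."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(declared_crawl_delay(ROBOTS_TXT, "msnbot"))
```

A result of `None` for the bot's user-agent token would mean the file, as parsed, is not requesting any delay from that crawler.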
