Hello,
I've noticed that Bing is crawling each page of my website twice: first it makes an HTTP/1.1 request and gets a compressed response, then within seconds it issues an HTTP/1.0 request to receive the same page without gzip compression.
The following lines from my log show the issue (there are thousands more similar occurrences):
65.55.207.74 - - [13/Dec/2009:14:58:42 +0000] "GET /specimen/235698/ HTTP/1.1" 200 1742 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.207.74 - - [13/Dec/2009:14:59:06 +0000] "GET /specimen/235698/ HTTP/1.0" 200 4259 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.209 - - [13/Dec/2009:15:03:08 +0000] "GET /specimen/250262/ HTTP/1.1" 200 1733 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.209 - - [13/Dec/2009:15:03:14 +0000] "GET /specimen/250262/ HTTP/1.0" 200 4164 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
There's also a discussion about what appears to be the same issue at WebmasterWorld.
This seems a waste of bandwidth and completely defeats the point of supporting HTTP compression.
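For anyone who wants to check their own logs for the same pattern, here is a minimal sketch in Python. It assumes the Apache "combined" log format shown in the lines above; the function name `find_double_fetches`, the 60-second window and the `msnbot` user-agent filter are illustrative choices, not anything Bing documents.

```python
import re
from datetime import datetime, timedelta

# Assumption: Apache "combined" log format, as in the lines quoted above.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) HTTP/(?P<ver>[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<ua>[^"]*)"'
)


def find_double_fetches(lines, window=timedelta(seconds=60), ua_substring="msnbot"):
    """Return (ip, path, size_http11, size_http10) for every HTTP/1.1 request
    that is followed within `window` by an HTTP/1.0 request for the same path
    from the same IP. The window length is an arbitrary assumption."""
    last_http11 = {}  # (ip, path) -> (timestamp, response size)
    pairs = []
    for line in lines:
        m = LOG_RE.match(line)
        if not m or ua_substring not in m["ua"].lower():
            continue  # skip unparseable lines and other user agents
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        size = 0 if m["size"] == "-" else int(m["size"])
        key = (m["ip"], m["path"])
        if m["ver"] == "1.1":
            last_http11[key] = (ts, size)
        elif m["ver"] == "1.0" and key in last_http11:
            first_ts, first_size = last_http11.pop(key)
            if ts - first_ts <= window:
                pairs.append((m["ip"], m["path"], first_size, size))
    return pairs
```

Calling `find_double_fetches(open("access.log"))` (the file name is a placeholder) returns one tuple per suspected double fetch, with both response sizes, so the extra uncompressed bytes can be added up.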
Tom
Hi,
Could you please email this information to bwmc@microsoft.com? I will get our crawling team to check it out.
Thanks,
~B
*I no longer work for Bing.
Look at these posts for help:
http://www.bing.com/community/blogs/webmaster/archive/2009/08/10/crawl-delay-and-the-bing-crawler-msnbot.aspx
http://www.bing.com/community/blogs/webmaster/archive/2008/04/18/ramping-up-msnbot.aspx
http://www.bing.com/community/forums/t/648631.aspx
Thank you for your prompt response, however the links you have posted don't address the issue.
The problem is not the rate at which Bing is requesting pages, but rather that it is unnecessarily requesting each page twice.
Hi Tom,
I offer a gzip version on my site, which is the one Google downloads. But on my site msnbot is only picking up the HTTP 1.0 non-gzip version. Maybe there is a transitional process going on and they are switching over? Hopefully someone from the Bing team can jump in and fill us in.
Hi Tom
Brett (Bing) is the person to contact about this. I've seen a couple of other posts a while ago with a similar problem. Brett will hopefully be able to find out why it's happening and put a stop to it:
http://www.bing.com/community/members/Brett-Yount/default.aspx
Regards
Archie
I noticed the same thing and also posted a question about it here: http://www.bing.com/community/forums/t/652257.aspx
cheers
Reuben
Hi Brett,
Thanks for your reply, I've sent you an email.
regards,
Hello Brett/Tom...
This thread is getting very interesting with the daily input, especially with Brett's involvement.
Seeing the same thing on my sites: msnbot requests every page twice.
I have seen this here on my server's sites all autumn and you're only getting around to the collection of live examples NOW...!? Possibly everyone would do better to forget the gzip angle. The crux of the issue is that MSN is deliberately using both the HTTP 1.0 and 1.1 protocols concurrently, and so deliberately misusing the bandwidth and resources of those of us on the internet who choose to cooperate with the myriad faults (seemingly permanently) programmed into MSNBOT. I wrote some code to separate out responses to calls made by MSNBOT in both protocols. It made no difference to your duplicate requests. You're testing the use of protocols on the internet and the relevant responses. Other major engines use either protocol, but only MSNBOT uses BOTH concurrently... well, a few seconds apart, from the same IP and with the same UA signature. You're using our bandwidth to test but don't even have the basic honesty to sign it off with a different or beta UA signature.
Bing didn't crawl my website www.ronicmile.us even once. I am still waiting to be indexed.