msnbot and non-existent files

Webmaster
This group is devoted to Bing Webmaster Tools discussions.

for many many years, msnbot has been crawling my sites looking for files that have never existed... i'm trying to figure out why...

the filenames have changed slightly in recent times but they have been similar in structure since the beginning... they are something like 000092601_00002.temp0001.htm... in other words, 9 digits, underscore, 5 digits, dot temp, 4 digits, dot htm... the requests for these are all over my server's directory tree...

i'll emphasize once more that these files have never existed on my site and i have no clue how msnbot may have picked them up...

now, how can i get msnbot to stop polluting my logs looking for them???

All Replies
  • Peter freeman
    Can you share your URL with us? We are a group of volunteers starting a new initiative in a community. Your post provided us valuable information to work on. You have done a marvellous job!

    as written in a previous message, is my profile broken? ;) ;) ;)

  • kenb
    Here are a couple of server log entries I have for the bad msnbot requests:

    '65.55.207.126'|Tue, 15 Dec 2009 20:39:49 -0500|'msnbot/2.0b (+http://search.msn.com/msnbot.htm)'|'*/*'|'/ADBF3C7AB534E8356F30D8AC05291640_00000.temp019f.html'|''
    '65.55.207.28'|Wed, 16 Dec 2009 05:46:22 -0500|'msnbot/2.0b (+http://search.msn.com/msnbot.htm)'|'*/*'|'/000166709_00001.temp00be.html'|''

    exactly, ken... the first one is a newer format that has appeared recently... i can provide a few thousand of these, too...

    kenb
    Note that the requested file is obviously a totally manufactured file name, which is designed to cause a 404 error. Seeing maybe one or two of these per week in the server logs wouldn't be too much of an annoyance, but when you see multiple requests per day it is really annoying.

    i understand what you are saying but i don't agree with the forced 404 concept... i'm thinking deeper... like maybe something related to attempting to edit pages with FrontPage or similar... the question is how msnbot is finding out the name of the temp files... especially when FrontPage has never been used on the site in question... my Apache servers have no idea what FrontPage is... they never have and never will ;)
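For anyone who wants to measure how often these requests show up, here is a minimal Python sketch that parses log lines in the pipe-delimited layout kenb quoted above and counts the manufactured names per IP address. The pattern is inferred only from the samples in this thread (9 digits or 32 hex characters, underscore, 5 digits, .temp plus 4 hex characters, .htm or .html), so treat it as an assumption and not as a documented msnbot scheme. The names PROBE_RE, parse_line, is_probe and count_probes are hypothetical, made up for the illustration.

```python
import re
from collections import Counter

# ASSUMPTION: pattern guessed from the few samples in this thread.
# The real naming scheme used by msnbot is not documented here.
PROBE_RE = re.compile(
    r"(?:[0-9]{9}|[0-9A-Fa-f]{32})_[0-9]{5}\.temp[0-9A-Fa-f]{4}\.html?"
)


def parse_line(line):
    """Split one pipe-delimited log line into a dict, or return None."""
    fields = [f.strip().strip("'") for f in line.strip().split("|")]
    if len(fields) != 6:
        return None  # not in the six-field layout shown in the thread
    ip, when, agent, _accept, path, _last = fields
    return {"ip": ip, "when": when, "agent": agent, "path": path}


def is_probe(entry):
    """True if msnbot asked for a file name that fits the guessed pattern."""
    basename = entry["path"].rsplit("/", 1)[-1]
    return (
        "msnbot" in entry["agent"].lower()
        and PROBE_RE.fullmatch(basename) is not None
    )


def count_probes(lines):
    """Count probe-style requests per client IP address."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None and is_probe(entry):
            counts[entry["ip"]] += 1
    return counts
```

Feeding it an open log file, for example count_probes(open("access.log")) with whatever file name your server uses, gives a per-IP tally that can be compared week to week. Only the last path component is matched because the requests reportedly appear all over the directory tree.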

  • Have you contacted Bing Customer Support yet using the instructions I posted earlier?

  • Archie
    Have you contacted Bing Customer Support yet using the instructions I posted earlier?

    IIRC, archie, yes... and they simply lead me back into this maze of twisted tunnels and passages :?

    however, i'm also unable to click on the numbers at the bottom of this thread and get back to the original posts so i'm unable to confirm that i have followed your instructions :? :(

    i fear that someone's broken something in the javascript stuff or my AdBlockPlus or NoScript are silently blocking something that is now necessary that wasn't a few days ago... i dunno and i grow extremely weary of even trying to get someone at microsoft to follow along and correct this 10+ years long problem :? :( :( :( :( :( :( :(

  • Archie
    In that case do it this way:

    1 - Go to http://www.bing.com/

    2 - Click "Help" - bottom right hand corner

    3 - Click "Get More Help" - again bottom right hand corner

    4 - And then click "get Support" - this will give you a contact form

    just to follow up for completion on this...

    the last "get support" link opens another browser window and there is no form to fill out... instead there is this very unhelpful text :?

    "Welcome to Bing Help

    This is your gateway to help with Bing. The topic list in the left pane displays answers to frequently asked questions or Help topics related to the link that you clicked to open Help. You can also use the Search for box in the left pane to find answers to your questions.

    If more than one topic matches your search, links to the topics are displayed in the left pane.

    To search more effectively:

    • Limit the number of words in your query.
    • Make sure that your search terms are spelled correctly.
    Note

    To tell us what you think about a particular Help topic, click one of the emoticons next to Was this information helpful? at the bottom of the topic. Type your feedback, and then click Submit."

    which only leads me back into this maze :? :? :( :( :(

  • wkitty42
    i understand what you are saying but i don't agree with the forced 404 concept... i'm thinking deeper... like maybe something related to attempting to edit pages with FrontPage or similar... the question is how msnbot is finding out the name of the temp files... especially when FrontPage has never been used on the site in question... my Apache servers have no idea what FrontPage is... they never have and never will ;)

    No, it has nothing to do with FrontPage or anything similar to that. I'm certain that the vast majority of webmasters would find these entries in their log files if they looked. I know that msnbot does lots of stupid stuff like this and that MSFT is very evasive about why they do it. My own belief is that they are trying different methods to detect cloaking and other black hat SEO techniques designed to game search engines. Here are some other recent forum discussions on bad msnbot behaviors (links to blog entries referencing both Webmaster World and Bing forum threads):

    • http://www.seroundtable.com/archives/021362.html
    • http://www.seroundtable.com/archives/021147.html
    • http://www.seroundtable.com/archives/020777.html
    • http://www.seroundtable.com/archives/020734.html
    • http://www.seroundtable.com/archives/020728.html

    Honestly, if Bing/MSN Search/Live wasn't a major search engine I would have blocked msnbot years ago because of its constant bad behavior. The reason Microsoft can come up with a decent search engine with a real potential to compete against Google is because msnbot wastes way too much time and resources behaving badly rather than crawling what it is supposed to in order to help MSFT create a really good search index.

  • Man I wish this forum had a way to edit posts AND obeyed its own formatting code created by the rich text editor. It killed my last post.

    One key mistake was that I meant to say MSFT CAN'T create a decent search engine.

  • Very strange - the contact form can take a while to load, but eventually it should give you this:

    I'm afraid I don't know any other way to contact Bing if that's not working; give this URL a try:

    https://support.discoverbing.com/eform.aspx?productKey=bingcontentremoval&ct=eformts

    If that doesn't work, then I'm sorry, but as far as I know there is no other way to contact support.

  • Just wanted to confirm that I have the exact same problem. I get a log report every day from my server. The MSNbot has searched for these non-existent files every day on my server for the past month. I'm so tired of seeing the list that I've decided to add an automatic forward for any request for a *temp* file. Thanks MSN for wasting my time on files that don't exist. I just may add msnbot to my block list.
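
As an illustration of the "handle every *temp* request automatically" idea in the last reply, here is a small Python sketch written as WSGI middleware. It is not the Apache setup anyone in this thread is running, and it answers with a short 410 Gone instead of a forward, which is a swapped-in choice for the sake of a compact example. The pattern is again guessed from the samples in this thread, the names PROBE_RE and block_probes are hypothetical, and nothing here is known to make msnbot stop asking.

```python
import re

# ASSUMPTION: same guessed pattern as the samples in this thread
# (9 digits or 32 hex chars, "_", 5 digits, ".temp" + 4 hex chars, .htm/.html).
PROBE_RE = re.compile(
    r"(?:[0-9]{9}|[0-9A-Fa-f]{32})_[0-9]{5}\.temp[0-9A-Fa-f]{4}\.html?"
)


def block_probes(app):
    """Wrap a WSGI app so probe-style paths never reach it."""

    def wrapper(environ, start_response):
        # Look only at the last path component; the requests are said
        # to appear anywhere in the directory tree.
        basename = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
        if PROBE_RE.fullmatch(basename):
            body = b"Gone\n"
            start_response(
                "410 Gone",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Anything else is passed through to the wrapped application.
        return app(environ, start_response)

    return wrapper
```

Matching the full guessed pattern, and not just any name containing "temp", keeps legitimate files such as template.html out of the net. The requests would still be logged by the front-end server unless its logging is filtered separately.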