We are taking a quick time-out from our series on large site optimization to answer a question we are frequently asked in the Webmaster Center forum. Site owners want to control which content gets indexed. We have worked hard to ensure that we follow and respect the Robots Exclusion Protocol as directed in your robots.txt file. But occasionally pages get indexed because they were not blocked, or for some other technical reason, and you want the URLs or the whole site removed from the Live Search index. If there is content that you want removed, the best thing to do is to specify this exclusion in either your robots.txt file, your HTTP header, or add a meta tag exclusion...
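As a rough illustration of the robots.txt route mentioned above, here is a minimal sketch that checks whether a given URL would be blocked by a set of exclusion rules. It uses Python's standard `urllib.robotparser` module. The rules, the `msnbot` user agent string, the `example.com` URLs, and the `is_crawlable` helper are all assumptions made up for this example, not taken from the post. It shows how a well-behaved crawler might read the rules, not how Live Search itself implements them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block everything under /private/ for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/private/report.html",
        "http://example.com/products/index.html",
    ):
        print(url, is_crawlable(ROBOTS_TXT, "msnbot", url))
```

Running a check like this against your own robots.txt before publishing it is a cheap way to confirm that the pages you want excluded are actually covered by a `Disallow` rule.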
For a large website, there are many critically important issues in optimizing for search. In Part 1 of this series of posts, we discussed the importance of reducing the number of URLs you expose through canonicalization. But there are other ways to reduce the surface area your site presents to search engines and to focus on the pages that matter. Even after you have reduced the number of URLs you expose to Live Search, a large site can still have a large surface area to crawl. In crawling your site, search engines may not reach all of your best content, or may consume bandwidth that you pay for unnecessarily. This is where HTTP compression and conditional GET can help.

Enabling HTTP compression

Whether or not you are...
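To make the two techniques named above a little more concrete, here is a minimal sketch of the server-side decision, assuming a page whose last modification time is known. If the crawler sends an `If-Modified-Since` header and the page has not changed since then, the server answers `304 Not Modified` with no body (conditional GET). Otherwise, if the crawler advertises `Accept-Encoding: gzip`, the body is compressed before it is sent. The `respond` function and its return shape are hypothetical names chosen for this example; a real web server normally handles this through its own configuration rather than hand-written code.

```python
import gzip
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Tuple


def respond(
    body: bytes, last_modified: datetime, request_headers: Dict[str, str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, payload) for a simple GET request.

    last_modified must be a timezone-aware datetime in UTC.
    """
    # HTTP dates have one-second resolution, so drop any microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    # Conditional GET: skip the body if the client's copy is still current.
    since_header = request_headers.get("If-Modified-Since")
    if since_header:
        try:
            since = parsedate_to_datetime(since_header)
        except (TypeError, ValueError):
            since = None  # Unparseable date: ignore it and send the full page.
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers, b""

    # HTTP compression: only compress when the client says it accepts gzip.
    if "gzip" in request_headers.get("Accept-Encoding", "").lower():
        headers["Content-Encoding"] = "gzip"
        headers["Vary"] = "Accept-Encoding"
        return 200, headers, gzip.compress(body)

    return 200, headers, body


if __name__ == "__main__":
    page = b"<html>" + b"lots of repetitive markup " * 200 + b"</html>"
    modified = datetime(2009, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

    status, headers, payload = respond(page, modified, {"Accept-Encoding": "gzip"})
    print(status, len(page), "bytes uncompressed,", len(payload), "bytes sent")

    # A repeat visit that echoes back the Last-Modified value gets a 304.
    status, _, payload = respond(
        page, modified, {"If-Modified-Since": headers["Last-Modified"]}
    )
    print(status, len(payload), "bytes sent")
```

In both cases the saving comes from sending fewer bytes per crawl: the 304 path sends no body at all, and the gzip path shrinks text-heavy pages considerably.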
At Live Search, one of the most common questions we receive from our peers at microsoft.com and msn.com is how to optimize their sites for search. But microsoft.com is unlike most other sites on the Internet. It is huge, containing millions of URLs, and is growing all the time. However, large content sites like microsoft.com and msn.com are not the only sites that can have an infinite number of URLs. Large ecommerce sites and government agency sites also produce very large numbers of URLs. As with any site, our original recommendations on how to rank in Live Search are still important. But we have given it some thought and want to provide some recommendations oriented toward very...