Bing XML Sitemap Plugin

 

The Bing XML Sitemap Plugin is an open source server-side technology that takes care of generating XML Sitemaps compliant with sitemaps.org for websites running on Internet Information Services (IIS) for Windows® Server as well as Apache HTTP Server. Easy to install and highly configurable, the Bing XML Sitemap Plugin provides an ideal solution for webmasters and web hosting companies looking to enhance search engine discoverability for their own or their customers' web sites.

 

Note
The XML Sitemaps generated by the Bing XML Sitemap Plugin adhere to the XML Sitemap protocol and can be used by all search engines that support the protocol, not just by Bing.

 

Download the Bing XML Sitemap Plugin

You can download the Bing XML Sitemap Plugin, along with a copy of the source code, directly from the Microsoft Download Center.

 

 

System Requirements

Bing XML Sitemap Plugin for IIS:

  • IIS 6 or higher
  • ISAPI Filter installed as an IIS module

Bing XML Sitemap Plugin for Apache on Linux:

  • Apache developer package, including the Apache extension tools (apxs) and APR libraries
  • zlib developer package
  • libtool package
  • C and C++ compiler packages (gcc and g++)

 

How the Bing XML Sitemap Plugin Works

The Bing XML Sitemap Plugin generates two types of Sitemaps:

  1. A comprehensive Sitemap of URLs seen in server traffic
  2. A Sitemap dedicated to URLs that have changed recently (delta)

Having both comprehensive and delta Sitemaps gives you two benefits. The comprehensive Sitemap is a full, up-to-date list of the URLs on your website that search engines can use for a deep crawl. The delta Sitemap is a concise list of recently modified URLs that search engine crawlers can prioritize, which can help keep bot traffic bandwidth down. In addition, the Sitemap Plugin automatically adds <lastmod> values to your Sitemap and generates <priority> values based on how popular your URLs are.
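To make the output format concrete, here is a minimal Python sketch that builds a sitemaps.org <urlset> document with <lastmod> and <priority> values. It is not the plugin's own code (the plugin is a native server module); the function name build_urlset and the sample URL are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build a <urlset> document from (url, last_modified, priority) tuples.

    Hypothetical helper for illustration; not part of the plugin.
    """
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, last_modified, priority in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # The protocol accepts W3C date format; YYYY-MM-DD is the simplest form.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = last_modified.isoformat()
        ET.SubElement(url, f"{{{SITEMAP_NS}}}priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_urlset([("http://www.example.com/", date(2024, 1, 15), 0.8)]))
```

A delta Sitemap would use the same structure, only with a shorter list of entries.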

 

You Control What Gets Added

The Bing Sitemap Plugin allows you to control exactly what gets added to the Sitemap. The plugin detects any Disallow and Allow directives inside your site's robots.txt and skips any URL patterns that shouldn't be added. It also provides additional control through configuration files with rules that augment your existing robots.txt Disallow directives. Finally, the webmaster is in full control of which query parameters to honor and include in the added URLs.

 

Supported Configurations

The Bing XML Sitemap Plugin can be configured to operate in the following site and server scenarios:

  1. Single site, single server
  2. Single site, multiple servers
  3. Multiple sites on a single server
  4. Multiple sites on multiple servers

When operating across multiple servers, the Bing XML Sitemap Plugin has a merge service to generate a unified Sitemap, which in turn is distributed back to all front-end servers. 

 

Open Source License

To allow you to understand exactly what the plugin does to keep your Sitemaps humming, we have released the Bing Sitemap Plugin (Beta) as open source under the Apache License, Version 2.0. The source for the Beta is available for download on the Microsoft Download Center.

 

 

Features Overview

This is an overview of the key features and options available in the Bing XML Sitemap Plugin:

Sitemaps.org Compliant XML Sitemaps

The Bing XML Sitemap Plugin generates XML Sitemaps compliant with http://sitemaps.org. In doing so, it creates two types of Sitemaps: a comprehensive Sitemap based on recent traffic activity and a "delta" Sitemap that contains all pages that changed within a configurable time window. The latter is generated based on a signature computation that establishes whether or not a page changed since it was last seen by the plugin.
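The signature-based change detection described above can be sketched as follows. The source does not say which signature function the plugin uses, so SHA-256 is a stand-in here, and the names page_signature and detect_changes are hypothetical. Treating a URL that has never been seen before as "changed" is also an assumption.

```python
import hashlib


def page_signature(body: bytes) -> str:
    """Return a content signature for a page body (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(body).hexdigest()


def detect_changes(previous_signatures, current_pages):
    """Compare current page bodies against stored signatures.

    previous_signatures: dict mapping URL -> signature from the last visit.
    current_pages: dict mapping URL -> page body (bytes) seen now.
    Returns (changed_urls, updated_signatures).
    """
    changed = []
    updated = dict(previous_signatures)
    for url, body in current_pages.items():
        signature = page_signature(body)
        # A missing or different stored signature means the page is new or changed.
        if previous_signatures.get(url) != signature:
            changed.append(url)
        updated[url] = signature
    return changed, updated
```

The URLs returned in changed_urls are the candidates for the delta Sitemap; unchanged pages stay only in the comprehensive one.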

Configurable URL Parameter Handling

By default, query string parameters that are seen by the plugin are not added to the Sitemap URLs automatically. However, if your site uses query parameters to uniquely identify content, you can easily add each significant parameter to the configuration file called normalization.txt. This file lives in the SitemapData folder of each host for which you are running the Sitemap plugin.
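As an illustration of this behavior, the sketch below drops every query parameter except those on an allow-list. The file format of normalization.txt is not described here, so the sketch simply takes the significant parameter names as a Python set; the function name normalize_url and the parameter names in the usage note are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url, significant_params):
    """Keep only query parameters whose names are in significant_params.

    significant_params stands in for the entries of normalization.txt
    (assumption: one parameter name per entry).
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in significant_params
    ]
    # Fragments are dropped here as well; that is a choice of this sketch.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With an empty set, normalize_url("http://example.com/item?id=42&sessionid=abc", set()) yields the bare path URL, which mirrors the default of adding no parameters; passing {"id"} keeps only the id parameter.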

Link rel=Canonical Handling

If your pages use link rel=canonical, the plugin uses the URL from <link rel="canonical">, when found in the page's HTML source, as the URL to add to the XML Sitemap. Note: parameters that are part of the canonical URL are not dropped, irrespective of whether they are listed in normalization.txt.
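A rough sketch of canonical handling, using Python's standard html.parser module, might look like this. How the plugin actually parses HTML is not documented here; the class CanonicalFinder and the function sitemap_url_for are hypothetical names.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonical = attributes["href"]


def sitemap_url_for(requested_url, html):
    """Return the canonical URL if the page declares one, else the requested URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical or requested_url
```

Because the canonical URL is taken as-is, any query parameters it contains survive, which matches the note above about normalization.txt not applying to canonical URLs.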

Redirect Handling

The plugin only adds pages that return an HTTP status code of 200 to the Sitemap. However, when you redirect a page that previously returned 200 OK, it still shows up in the Sitemap for some time, that is, until the configured time decay has passed (configurable through VisitTimeoutSec in config.ini).

404 Handling

In addition to the comprehensive and delta Sitemaps, the plugin generates a third type: a Sitemap file that contains only URLs that returned a 404 on the site. The location of this file is currently commented out in the Sitemap index file because it should not be seen or processed as a regular XML Sitemap by search engines, but search engines can use it to inform their crawlers that these URLs are no longer valid.

Robots.txt Rules Handling

The XML Sitemap Plugin honors the rules you define in your site’s robots.txt in the sense that it will not add blocked URLs to the XML Sitemap.

More Flexibility in What Gets Added: disallow.txt

For added flexibility, you can also add specific disallow rules for the Bing XML Sitemap Plugin, using the same syntax as robots.txt, in the file disallow.txt, which lives in the SitemapData folder for the site. The plugin observes any disallow rules there: disallowed URL patterns are not added to the XML Sitemaps.
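The combined effect of robots.txt and disallow.txt can be approximated with Python's standard urllib.robotparser module, as in the sketch below. This is only an analogy: the plugin has its own rule matcher, and the user agent it evaluates rules for is not stated here, so "*" is an assumption. The function names are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def make_parser(rules_text):
    """Parse robots.txt-style rules from a string."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


def may_add_to_sitemap(url, robots_txt, disallow_txt, user_agent="*"):
    """A URL is added only if neither rule file blocks it.

    Note: urllib.robotparser does simple prefix matching and ignores rules
    that are not preceded by a User-agent line.
    """
    return make_parser(robots_txt).can_fetch(user_agent, url) and make_parser(
        disallow_txt
    ).can_fetch(user_agent, url)
```

For example, with "Disallow: /private/" in robots.txt and "Disallow: /drafts/" in disallow.txt (each under "User-agent: *"), URLs under either path are skipped while all other URLs remain eligible.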

Automatic <priority> Field Calculation

Because the plugin works off traffic-based signals, it also uses these signals to establish a priority value for each URL in the XML Sitemap. The premise is that 0.5 represents the average priority, that more important URLs should be awarded a priority higher than 0.5, and that less important ones should get a lower priority. The formula currently used to calculate priority is as follows:

[Figure: Priority Algorithm in Bing XML Sitemaps Plugin]

In other words: the priority of a page is the visit count for its URL divided by twice the average visit count across the entire host site, so a URL with exactly average traffic gets a priority of 0.5.
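The calculation described above can be sketched in a few lines of Python. The division follows the description in the text; capping the result at 1.0 (the maximum the Sitemap protocol allows for <priority>) and returning 0.0 when the host has no traffic at all are assumptions of this sketch, as is the function name compute_priorities.

```python
def compute_priorities(visit_counts):
    """Map each URL to visit_count / (2 * average visit count for the host).

    visit_counts: dict mapping URL -> number of visits seen in traffic.
    """
    if not visit_counts:
        return {}
    average = sum(visit_counts.values()) / len(visit_counts)
    if average == 0:
        # Assumption: with no traffic at all, every URL gets the lowest priority.
        return {url: 0.0 for url in visit_counts}
    return {
        # Assumption: clamp to 1.0, the upper bound of the <priority> field.
        url: min(1.0, count / (2 * average))
        for url, count in visit_counts.items()
    }
```

For instance, two URLs with 10 visits each both land on 0.5, while a single very popular URL on an otherwise quiet site is capped at 1.0.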

Configuration Options in config.ini


Setting: Description

  • wwwroot: Points to the wwwroot path of the site. This value should generally not be changed.
  • wwwhost: Points to the path of the content folder for the host. This value should generally not be changed.
  • PublishSitemap: Determines if the XML Sitemap is published to wwwroot (1) or not (0).
  • WriteToRobots: Determines if a sitemap: directive should be written to the host site’s robots.txt file.
  • gzip: Determines if the plugin should decompress compressed pages (1) or not (0). Default is 1. Because the plugin needs to decompress pages to calculate a signature, do not change this to 0 in production environments.
  • Decompress: Determines if the plugin can decompress compressed pages. Default is 1.
  • PingSearchEngines: Determines if the search engines should be informed (1) or not (0) using a ping to their respective services. Currently the plugin can ping Google and Bing.
  • HelpSSP: Determines if the plugin can write additional debugging information to the Sitemap that helps Bing improve the plugin (1) or not (0).
  • MaxSnapshotFileMB: Maximum file size for the snapshot file used to generate the XML Sitemaps. Default is 4096MB.
  • MaxMemoryCacheMB: Maximum amount of memory the plugin is allowed to use. Default is 32MB.
  • SitemapGenerationPeriodHours: Number of hours between Sitemap generations. Default is the recommended value of 24 (once per day).
  • DiskMinFreeMB: Minimum amount of free disk space to keep. When free disk space drops below the configured value, the Sitemap Plugin cleans up its temporary data.
  • MergeCacheMemoryMB: Maximum amount of memory used by the merge service to merge the Sitemaps in a multi-server scenario. Default is 512MB.
  • VisitTimeoutSec: Longest period of time, in seconds, that a URL remains in the Sitemap without seeing any traffic. Default is 2592000, which equals 30 days.

Feedback and Suggestions

We welcome your feedback, suggestions and feature requests in the Bing Webmaster forums.