Live Search Webmaster Center now supports the following four items, which are a great help in identifying problems with your site and how Live Search is spidering your content:
- File not found (404) errors, a straightforward, date-stamped account of the HTTP “404 File Not Found” errors that Live Search encountered when crawling the site. Conveniently, this covers broken links within your own site as well as broken links to sites that you are linking to.
- Pages Blocked by Robots Exclusion Protocol (REP), reported when Live Search has been prevented from indexing or displaying a cached copy of the page because of a policy in your robots exclusion protocol (REP).
- Long Dynamic URLs, reported when Live Search encounters a URL with an exceptionally long query string. These URLs have the potential to create an infinite loop for search engines due to the number of combinations of their parameters, and are often not crawled. I haven’t come across one of these yet, and so far I haven’t seen any documentation of what ‘exceptionally long’ means, so clarification on that point would be handy.
- Unsupported Content-Types, reported when a page either specifies a content-type that is not supported by Live Search, or simply doesn’t specify any content type. Examples of supported content-types are: text/html, text/xml, and application/PowerPoint.
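Two of these reports can be roughly approximated before Live Search ever visits the site. The sketch below is only an illustration using Python's standard library: the 100 character cut-off is an arbitrary assumption (as noted above, Live Search doesn't document what ‘exceptionally long’ means), the function names are made up for this example, and `msnbot` is assumed to be the crawler's user agent. It only looks at robots.txt rules, not robots meta tags.

```python
from urllib.parse import urlsplit, parse_qsl
from urllib.robotparser import RobotFileParser

# Assumption: Live Search does not publish its limit, so 100 is an arbitrary cut-off.
MAX_QUERY_LENGTH = 100


def has_long_query(url, max_length=MAX_QUERY_LENGTH):
    """Return True when the URL's query string is longer than max_length characters."""
    return len(urlsplit(url).query) > max_length


def count_parameters(url):
    """Count the query-string parameters, including ones with blank values."""
    return len(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def is_blocked(robots_txt, url, user_agent="msnbot"):
    """Return True when the supplied robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse local text, no network request
    return not parser.can_fetch(user_agent, url)
```

Passing the contents of your robots.txt file and a list of your own URLs through `is_blocked` and `has_long_query` gives a quick local list to compare against what Webmaster Center reports.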
In 2007, Microsoft removed the ability for users to drill into backlink data within Live Search. It took a long time, but that functionality has now been reinstated within Live Search Webmaster Center and is looking quite promising.
Both the crawl information and the backlink data can be downloaded in CSV format. Possibly the best feature for a large, complex site, though, is that each of the above options can be filtered further (search style) by entering a subdomain and/or directory to restrict the results to. The backlink interface additionally supports a top-level domain in the search box, so you can isolate backlinks originating from Australian sites by entering .au.
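The same kind of filtering can be applied offline to a downloaded CSV file. This is a minimal sketch, assuming the export has a header row with a column named `URL`; I haven't confirmed the actual column names in the Live Search export, so treat `URL` and the `filter_backlinks` helper as hypothetical.

```python
import csv
import io
from urllib.parse import urlsplit


def filter_backlinks(csv_text, tld=None, path_prefix=None, url_column="URL"):
    """Return rows whose URL host ends with tld and whose path starts with path_prefix."""
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        parts = urlsplit(row[url_column])  # url_column is an assumed header name
        host = (parts.hostname or "").lower()
        if tld and not host.endswith(tld.lower()):
            continue
        if path_prefix and not parts.path.startswith(path_prefix):
            continue
        matches.append(row)
    return matches
```

For example, `filter_backlinks(text, tld=".au")` keeps only the rows whose linking host ends in .au, and adding `path_prefix="/blog"` narrows that further to a single directory.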
The interface doesn’t support paging of results, which would be useful for stepping through a few pages without exporting to CSV. If you do want to download more information, there isn’t an option to export everything in one hit; you can only retrieve 1000 lines of data. I can appreciate that they don’t want to provide an ‘all’ option, or that they want to limit how many lines can be fetched at once, but there is also no way to download 1000 items, move to the next page and download the next 1000. The other issue is that there is no indication of how the 1000 lines are selected. The backlink section, for example, simply says ‘Download up to 1000 results’.
While there is still room for improvement (and really, when isn’t there?), I’m personally encouraged by the changes that Microsoft are making to Live Search Webmaster Center. The sooner Microsoft’s services catch up to those offered by the leaders, the sooner more businesses and webmasters will invest time in the Live Search product.