Stopping Search Engines Indexing Website Maintenance Pages

Search engines don’t follow a schedule for when they turn up on your website’s doorstep and start indexing it, so what happens when they arrive unannounced during scheduled maintenance?

Under normal conditions, search engine spiders notice what has changed since their last crawl and include those changes in their index. Of course, you really don’t want your ‘we are currently performing scheduled maintenance and expect to be back in 1 hour’ message showing up in search engine results when your users enter a relevant query.

To stop search engines indexing your site while it is in maintenance mode, there are two simple solutions available:

  • HTTP 404 response code
    Return an HTTP 404 (Not Found) response code for each URL on your site affected by the maintenance outage. Search engines don’t immediately remove a page from their index just because they could not access it on a single request; instead they try again later. Only after repeatedly failing to retrieve the document will they mark that particular page as non-existent and remove it from their index.
  • HTTP 503 response code
    Return an HTTP 503 (Service Unavailable) response code for each URL on your site affected by the downtime. A 503 response tells search engine spiders that the problem is temporary and that they should come back later. As with the 404 response code, search engines won’t immediately remove the URLs from their index just because your site is experiencing an outage.
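
As a rough illustration of the 503 option, here is a minimal sketch in Python using only the standard library. The names maintenance_app, run and RETRY_AFTER_SECONDS are made up for this example, and the Retry-After header is an addition not discussed above: it is a standard HTTP header that a server may send with a 503 to suggest when to try again, and how much weight any given crawler gives it is not something this sketch assumes.

```python
from wsgiref.simple_server import make_server

# Assumption for this sketch: the outage is expected to last about an hour,
# matching the "back in 1 hour" message used as the example in this article.
RETRY_AFTER_SECONDS = 3600

MAINTENANCE_BODY = b"We are currently performing scheduled maintenance."


def maintenance_app(environ, start_response):
    """WSGI app that answers every URL with a 503 while maintenance is on."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(MAINTENANCE_BODY))),
        # Hint to clients and crawlers about when to come back.
        ("Retry-After", str(RETRY_AFTER_SECONDS)),
    ]
    # The status line is the important part: 503, not 200.
    start_response("503 Service Unavailable", headers)
    return [MAINTENANCE_BODY]


def run(port=8000):
    """Serve the maintenance response on the given port until interrupted."""
    with make_server("", port, maintenance_app) as server:
        server.serve_forever()
```

Calling run() starts a small test server; in practice the same idea would usually be configured in whatever web server or load balancer sits in front of the site, so that every affected URL returns the 503 status rather than a normal page.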

What you should avoid doing during the scheduled maintenance or outage:

  • HTTP 200 (OK)
    Returning an HTTP 200 response code, which is the normal response code when everything is working as expected. It is quite common to see error handlers for a 404 or 503 mistakenly return a 200. When that happens, Google might index the content of your error page against each of the unavailable URLs, which will hurt your rankings and visits.
  • HTTP 301 (Permanent Redirect)
    Returning an HTTP 301 on each of the unavailable URLs to send visitors to some other URL. For example, maybe a particular directory on the site is offline and you redirect those URLs to the home page. Like the 200 example above, this might cause Google to drop every URL that returned the 301 and replace it with the home page, effectively deleting those URLs from your site as far as the index is concerned.

It is simple to confirm today that each of these response codes is configured properly, and to fix the configuration if it isn’t, so that scheduled maintenance and unplanned outages don’t cause unnecessary damage.
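
One way to run that check is a small script that requests a handful of URLs while maintenance mode is switched on and reports what comes back. The sketch below uses only the Python standard library; the function names and the "ok" / "avoid" / "review" labels are invented for this example, and the verdicts simply mirror the advice in this article (404 and 503 are acceptable, 200 and 301 are not).

```python
import http.client
from urllib.parse import urlsplit

# Verdicts follow the article: 404 and 503 are acceptable during an outage,
# 200 and 301 are the codes to avoid. Anything else is left for a human.
VERDICTS = {
    404: "ok",
    503: "ok",
    200: "avoid",
    301: "avoid",
}


def classify_status(code):
    """Map an HTTP status code to 'ok', 'avoid' or 'review'."""
    return VERDICTS.get(code, "review")


def fetch_status(url, timeout=10):
    """Return the status code for url without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client does not follow redirects, so a 301 stays visible.
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def check_urls(urls):
    """Fetch each URL and return a dict of url -> verdict."""
    return {url: classify_status(fetch_status(url)) for url in urls}
```

For example, check_urls(["https://www.example.com/", "https://www.example.com/products/"]) returns a verdict per URL; the example.com addresses are placeholders for pages on your own site. Any URL reported as "avoid" while the site is in maintenance mode points to an error handler or redirect rule that needs fixing.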