<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>#if debug &#187; spider</title>
	<atom:link href="http://ifdebug.com/articles/tag/spider/feed/" rel="self" type="application/rss+xml" />
	<link>http://ifdebug.com</link>
	<description>Technical thoughts of a coffee addicted developer</description>
	<lastBuildDate>Sun, 11 Apr 2010 12:53:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Stopping Search Engines Indexing Website Maintenance Pages</title>
		<link>http://ifdebug.com/articles/stopping-search-engines-indexing-website-maintenance-pages/</link>
		<comments>http://ifdebug.com/articles/stopping-search-engines-indexing-website-maintenance-pages/#comments</comments>
		<pubDate>Sun, 02 Dec 2007 14:06:33 +0000</pubDate>
		<dc:creator>Alistair</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[crawler]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[search engine]]></category>
		<category><![CDATA[spider]]></category>

		<guid isPermaLink="false">http://ifdebug.com/articles/stopping-search-engines-indexing-website-maintenance-pages/</guid>
		<description><![CDATA[There is no schedule for when a search engine will or won&#8217;t turn up on your web site&#8217;s doorstep and start indexing it; what happens when they turn up unannounced during scheduled maintenance? Under normal conditions, the search engine &#8230; <a href="http://ifdebug.com/articles/stopping-search-engines-indexing-website-maintenance-pages/">Continue reading <span class="meta-nav">&#8594;</span></a>


Related posts:<ol><li><a href='http://ifdebug.com/articles/google-local-business-centre-receives-upgrade-with-humorous-outage-message/' rel='bookmark' title='Permanent Link: Google Local Business Centre Receives Upgrade With Humorous Outage Message'>Google Local Business Centre Receives Upgrade With Humorous Outage Message</a> <small>The Google Local Business Centre is currently undergoing maintenance for...</small></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p>There is no schedule for when a search engine will or won&#8217;t turn up on your web site&#8217;s doorstep and start indexing it; what happens when they turn up unannounced during scheduled maintenance? Under normal conditions, the search engine spiders will notice the difference between what they crawled last time and what they find now, and will include your changes in their index. Of course, you really don&#8217;t want your &#8216;we are currently performing scheduled maintenance and expect to be back in 1 hour&#8217; message showing up in search engine results when your users enter an appropriate query.</p>
<p>To stop search engines indexing your site while it is in maintenance mode, there are two simple solutions available:</p>
<dl>
<dt>HTTP 404 response code</dt>
<dd>Search engines don&#8217;t immediately remove your web pages from their index because they cannot access them on a given request, just as they won&#8217;t remove them if your site is returning an internal server error. Instead, they note that they attempted to crawl a given web page at a particular time and try again later. Only after repeatedly failing to retrieve the document will they mark that page as non-existent and remove it from their index. For a planned outage, a 503 Service Unavailable response, optionally with a Retry-After header, states the temporary nature of the outage more explicitly than a 404.</dd>
<dt>META noindex tag</dt>
<dd>When a search engine spider encounters a robots meta tag with the value <code>noindex</code>, it should leave that particular page out of its index. After the scheduled maintenance is over and the spiders return, the noindex flag is no longer present, so the spiders will proceed with the crawl as normal.</dd>
</dl>
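<p>As a rough illustration of the two options above, here is a minimal Python sketch. The original post shows no code, so the language, the WSGI wrapper and the names <code>MAINTENANCE_HTML</code>, <code>maintenance_response</code> and <code>app</code> are all assumptions made for this example. It builds a maintenance page that carries a robots <code>noindex</code> meta tag and serves it with an error status rather than a 200.</p>

```python
from http import HTTPStatus

# Hypothetical maintenance page; the robots meta tag asks spiders not to index it.
MAINTENANCE_HTML = (
    "<!DOCTYPE html>\n"
    "<html><head>\n"
    '<meta name="robots" content="noindex">\n'
    "<title>Scheduled maintenance</title>\n"
    "</head><body>\n"
    "<p>We are currently performing scheduled maintenance.</p>\n"
    "</body></html>\n"
)


def maintenance_response(status=HTTPStatus.NOT_FOUND, retry_after_seconds=None):
    """Return (status_line, headers, body) for a maintenance page.

    The default status is 404, as described in the post; pass 503 and a
    retry_after_seconds value to signal a temporary outage instead.
    """
    status = HTTPStatus(status)
    if status.value < 400:
        # A 2xx/3xx status would let the maintenance page be treated as real content.
        raise ValueError("maintenance pages should be served with an error status")
    body = MAINTENANCE_HTML.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    if retry_after_seconds is not None:
        headers.append(("Retry-After", str(int(retry_after_seconds))))
    return f"{status.value} {status.phrase}", headers, body


def app(environ, start_response):
    """Minimal WSGI application that answers every request with the maintenance page."""
    status_line, headers, body = maintenance_response()
    start_response(status_line, headers)
    return [body]
```

<p>Under these assumptions, <code>maintenance_response(503, 3600)</code> would produce a 503 response asking clients to retry in an hour, while the default call keeps the 404 behaviour described above. How a real site switches this handler on and off during maintenance depends on its own deployment setup.</p>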
<p>Next time your site is under maintenance, make sure you&#8217;ve implemented one of these options, or you could be very surprised by what shows up in the search engine results the following day!</p>


]]></content:encoded>
			<wfw:commentRss>http://ifdebug.com/articles/stopping-search-engines-indexing-website-maintenance-pages/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Internet Scale</title>
		<link>http://ifdebug.com/articles/internet-scale/</link>
		<comments>http://ifdebug.com/articles/internet-scale/#comments</comments>
		<pubDate>Wed, 07 Nov 2007 14:52:31 +0000</pubDate>
		<dc:creator>Alistair</dc:creator>
				<category><![CDATA[Search]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[search engine]]></category>
		<category><![CDATA[spider]]></category>
		<category><![CDATA[yahoo!]]></category>

		<guid isPermaLink="false">http://ifdebug.com/articles/internet-scale/</guid>
		<description><![CDATA[After launching ifdebug on the 3rd November, it&#8217;s only taken Googlebot and Yahoo! Slurp an amazing four days to crawl and index the site. Many moons ago, people would report having their sites online for literally months before being crawled &#8230; <a href="http://ifdebug.com/articles/internet-scale/">Continue reading <span class="meta-nav">&#8594;</span></a>


]]></description>
			<content:encoded><![CDATA[<p>After launching ifdebug on the 3rd November, it&#8217;s only taken Googlebot and Yahoo! Slurp an amazing four days to crawl and index the site. Many moons ago, people would report having their sites online for literally months before being crawled by search engines, let alone have the content showing up in their index.</p>
<p>In August, Matt Cutts pointed out that the Google index is becoming <a href="http://www.mattcutts.com/blog/minty-fresh-indexing/">minty fresh</a>. What used to take months back in the year 2000 is now happening in days, and what was taking days in 2005 is now regularly happening in hours or minutes. While the majority of the world doesn&#8217;t care about this sort of stuff and it never even enters their consciousness, I find this nothing short of a technical marvel.</p>
<p>All major search engines currently report that they index literally billions of objects, reaching into the farthest corners of the internet. This is the amazing part: <a href="http://ifdebug.com">ifdebug</a> is but one of hundreds of millions of sites online, and somehow the major search engines manage to find the time to crawl and index it only a matter of days after it was <em>created</em>!</p>
<p>The fast crawl rate is most likely due to the link from my personal blog pointing here, as that blog is already well indexed and receives attention from the major search engines on a daily basis. As for managing the ongoing freshness of the site, sitemaps and online services such as <a href="http://pingomatic.com">pingomatic</a> must play a reasonably substantial role in helping the search engines keep their indexes fresh.</p>
<p>I&#8217;ll be keeping an eye on the major search engines over the coming weeks and months to see how they are performing; I&#8217;ll report back with any findings worth mentioning.</p>


]]></content:encoded>
			<wfw:commentRss>http://ifdebug.com/articles/internet-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
