Archive for December, 2007

You Know You’re Popular When

Saturday, December 29th, 2007

Today my personal site was pinged by Live Business Radio. As a matter of course, I checked out the Live Business Radio web site and was disappointed to find that it’s nothing more than your average run-of-the-mill site, riddled with advertising, spam and ‘buy this crap product now’ pitches.

I regularly get pinged by web sites that don’t have anything to do with me, and when I saw the site I was about to abandon it immediately. Just before I did though, I scanned over the article and noticed that I’d been featured in a list of sites with a high Google PageRank which offer links that are ‘followed’. It wouldn’t be a good filthy spammer’s site if they didn’t offer you software (for a fee) which you could use to spam, or rather ‘take advantage of’, the followed links.

If you’re not quite sure what I’m referring to regarding the ‘followed’ remark, you can read about it on my personal site:

I should feel so honoured.

Google Alerts Getting Smarter

Monday, December 24th, 2007

Google Alerts lets the user define lists of keywords and phrases. When Google finds one of them while crawling and indexing web sites, it sends the user a notification about that particular occurrence.

Historically, it always appeared as though the technology behind the alerting system was quite simple: literally matching the keywords and phrases that the user had nominated. Recently, alerts have been generated for pages that don’t strictly meet the keyword and phrase requirements. It seems as though Google are using all of the additional metadata about a web site and its content to infer certain pieces of information.

As an example, I was recently notified about my name being used within the post about extending the Nintendo Wii. If you view that particular item, you will not find the phrase “alistair lattimore” anywhere within it. Just to be sure, I have also ruled out my name appearing within the RSS feeds generated by the site.

Putting the tinfoil hat on for a second, there is a raft of information that Google know about me already:

  • I have a Gmail account
  • The same Gmail account is associated with Google Analytics, Google Reader, Google Webmasters, Google Adwords and Google Adsense.
  • Within Google Webmasters, I monitor my personal blog and this site.
  • Within Google Analytics, I monitor my personal site and this site.
  • Within Google Reader, I subscribe to the feed of both sites.
  • Google are a domain registrar, which means they could theoretically see that I purchased both domains.
  • I have linked in both directions between the two sites in the past.

When you start to see how all of that information is inter-linked, it becomes quite easy to see how Google can provide insightful results through their various services. Of course, if you take the tinfoil hat off and look at the more standard items such as web site content, my name is listed in the title on the front page and also on the about page. Those two bits of information might have been all it took; who knows.

If the technology behind that flexibility has a high level of accuracy in determining or inferring that information, it really is an excellent service. In the above example, if Google hadn’t inferred my name as being associated with that document, I would never have found out about it via the alerting system. Granted, in this particular example it makes no difference as I know I wrote it; however, for all other content on the internet it really lifts the product’s capability.

Extending Nintendo Wii

Sunday, December 23rd, 2007

Johnny Lee, a human-computer interaction student, has released a series of videos demonstrating alternative uses for the Nintendo Wii gaming console. At this stage, Johnny has released videos showcasing:

  • Minority Report style hand gestures to control your computer
  • low cost interactive white boards
  • head tracking for desktop virtual reality displays

The cool factor of the videos has certainly caught the eyes of a lot of people: Johnny reports that the 30 days following his initial video saw over half a million unique visitors to the site and coverage from half a dozen of the top technology sites on the net.

Also, for those that care, the videos above are accompanied by their respective .NET code samples.

Django Northwind Coming Soon

Saturday, December 15th, 2007

Whenever someone in the Microsoft development world has needed to demonstrate virtually anything utilising a database, the Microsoft Northwind database has been used.

The beauty of the Northwind database being so widely recognised and frequently reused is that developers around the world don’t need to concern themselves with learning or understanding a new database schema every time another developer wishes to provide an example. Instead, the developer writing the example can simply state that it is based on the Northwind database and, by virtue of its popularity, the majority of readers will immediately understand.

Since the Microsoft Northwind database is so popular and so widely used, it seemed like a good idea to make it available for anyone wanting to do some simple examples using Django. I’ll be releasing the initial version of it in the coming days, so keep your eyes peeled if you’re interested.

Google Analytics & URL Rewriting Caveats

Thursday, December 13th, 2007

As the internet has matured and web sites have aged and expanded over the years, it has become commonplace for web site owners to restructure their web sites to increase the site’s accessibility and search engine effectiveness.

During the restructuring process, less savvy web masters reorganise their web sites without any concern for the impact it might have on their search engine rankings, referrals and user experience, while more savvy web masters understand that cool URLs don’t change. That isn’t to say that the content originally published at a URL must remain there, just that the URL continues to exist so that anyone following a link to it doesn’t receive a missing document or HTTP 404 error.

When restructuring web sites, the savvy web master mentioned earlier requires a way to make an existing URL redirect to its new URL after the restructure. The two common methods to handle the redirection are:

  • It is perfectly acceptable to use a standard HTML web page with the tracking code installed and a meta refresh to redirect a user from the old URL to the new one. This method does have the downside that the redirections for the web site end up scattered throughout it.
  • Another solution is offloading the redirection into a utility such as the Apache mod_rewrite module or the equivalent ISAPI_Rewrite for IIS. This method allows the web master to place all of the URL redirection in one place for easy management.

Under normal conditions such as option one above, where Google Analytics is installed on every web page within a site, it’s possible for the service to collect a complete click stream for the site. Google Analytics is also capable of following a visitor across a redirect, so long as the tracking code is installed on both the referring and destination pages.

While it is convenient to use URL rewriting, there is a caveat which reduces the amount of information that Google Analytics can collect. The redirection will happen before any content is returned to the user, which means there is no opportunity for Google Analytics tracking code to fire. This results in Google Analytics reporting zero activity against the redirecting URL.
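The caveat is easier to see in a minimal sketch of a server-side redirect. The Python below is an illustration only, not how mod_rewrite or ISAPI_Rewrite is implemented; the REDIRECTS map, its paths and the build_redirect_response function are hypothetical names invented for the example. It shows that a redirect response is just a status code and a Location header, with no HTML body in which a tracking script could run.

```python
# Hypothetical old-to-new URL map, kept in one place like a rewrite rule file.
REDIRECTS = {
    "/old-about.html": "/about/",
    "/2007/12/old-post.html": "/blog/old-post/",
}


def build_redirect_response(path):
    """Return (status, headers, body) for a request path.

    A matching path yields a 301 with a Location header and an empty body,
    so the browser never receives any HTML or script for the old URL.
    """
    target = REDIRECTS.get(path)
    if target is None:
        return 404, {"Content-Type": "text/plain"}, b"Not Found"
    return 301, {"Location": target, "Content-Length": "0"}, b""


status, headers, body = build_redirect_response("/old-about.html")
print(status, headers["Location"], len(body))
```

Because the body is empty, the only page view an analytics script can record is the one on the destination page.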

Changing The HTML Source Order Can Damage Search Engine Referrals

Saturday, December 8th, 2007

What follows is a quick digest of the impacts a web site owner might expect from reordering the HTML source on a web site, in particular what effects it can have on the search engine performance of a web site.

During the month of June, I decided to freshen up the layout of my personal site. When it was complete, and without a whole lot of consideration, I published the new design onto my site. I sat back and admired my work for a little while until, a few days later, I started to notice a decline in the number of natural search engine referrals. At the time, I didn’t bother to look into it and put it down to random fluctuation on the internet. A few days later I checked the statistics again to confirm that it had recovered, only to find that not only had it not recovered but that it had dropped further.

After investigating the problem, I quickly realised that I had changed the HTML source order within my WordPress template. After the change, the primary content of the site was placed at the bottom of the HTML document with a large amount of less important content above it. Worse yet, the information listed first within the HTML was identical for the entire site, as it came from the sidebar, which is largely static.

The table below shows the number of Google search engine referrals per month to the site. As you can see, the monthly referrals grew from the start of the year and held at around 13,000 from March to May before they started to drop in June. Realising what had happened, I took the hit on the search engine referrals to see just how far they would drop if the change were left in place for a complete month. The ordering of the HTML was not restored until the beginning of August, so July represents a complete month with the suboptimal ordering of the HTML.

           Jan   Feb    Mar    Apr    May    Jun    Jul   Aug    Sep    Oct    Nov
Referrals  9899  10395  13206  13281  13200  10942  8426  10580  12648  12084  12361
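
To put a number on the decline, the month-to-month change can be worked out from the figures in the table. The short Python sketch below is an illustration only; the referrals dictionary simply restates the table, and percent_change is a helper name invented for the example.

```python
# Monthly Google referrals, copied from the table above.
referrals = {
    "Jan": 9899, "Feb": 10395, "Mar": 13206, "Apr": 13281, "May": 13200,
    "Jun": 10942, "Jul": 8426, "Aug": 10580, "Sep": 12648, "Oct": 12084,
    "Nov": 12361,
}


def percent_change(before, after):
    """Percentage change from before to after (negative means a drop)."""
    return (after - before) / before * 100.0


months = list(referrals)
for previous, current in zip(months, months[1:]):
    change = percent_change(referrals[previous], referrals[current])
    print(f"{previous} -> {current}: {change:+.1f}%")

# Drop from the last full month before the redesign (May) to the full month
# spent with the suboptimal source order (July).
may_to_july = percent_change(referrals["May"], referrals["Jul"])
print(f"May -> Jul: {may_to_july:+.1f}%")
```

From May to July the referrals fell by roughly 36%.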

The change in the number of natural search engine referrals was caused by what was being listed within Google as the snippet for each page within the site. Ordinarily, the primary content is listed toward the top of the HTML document and as such, it is featured heavily within the snippet. After reorganising the order of the HTML, the snippet within the search engine results was displaying information about the current list of months in the sidebar of the site. As a by-product of the snippet not being contextually relevant to the title of the page, the click-through rate plummeted.

In an ideal world, a webmaster should be able to change their site layout as frequently as they choose without it impacting their search engine ranking and associated click-through rate. In this particular case, changing the layout, and unknowingly the HTML source order, had a significant knock-on effect because I wasn’t controlling what was displayed within the search engines via a meta description tag. By not specifying it directly, I was relying on the search engines to automatically generate or choose a snippet on my behalf, and after changing the HTML ordering, the choice was suboptimal.
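
As a rough illustration of specifying the description directly, the sketch below builds a meta description tag from a post’s own content. It is written in Python purely for illustration (a WordPress template would do this in PHP), and the meta_description function and its max_length cut-off are assumptions made for the example rather than anything a search engine mandates.

```python
import html
import re


def meta_description(content_html, max_length=155):
    """Build a meta description tag from a post's HTML content.

    max_length is an assumed cut-off for illustration; search engines do
    not publish a fixed limit.
    """
    # Strip tags crudely, decode entities and collapse whitespace. A real
    # template would more likely use a hand-written excerpt.
    text = re.sub(r"<[^>]+>", " ", content_html)
    text = html.unescape(text)
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) > max_length:
        # Cut at the last word boundary before the limit.
        text = text[:max_length].rsplit(" ", 1)[0].rstrip(",.;:") + "..."
    escaped = html.escape(text, quote=True)
    return '<meta name="description" content="%s" />' % escaped


print(meta_description("<p>Reordering the <em>HTML source</em> changed my snippets.</p>"))
```

With a tag like this in the page head, the search engine has a description that is tied to the page content rather than to whatever markup happens to come first.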

Measuring The Impact Of Page Titles

Monday, December 3rd, 2007

Following on from my simple experiment regarding the performance of different Adsense themes, it seemed like as good a time as any to start another experiment surrounding the impact of web page titles on search engine performance.

Out of the box, WordPress doesn’t ship with particularly search engine friendly page titles:

  • {blog title} » {blog archive} » {page title}

The biggest problem with the default title format from a search engine optimisation point of view is that it places relatively useless information in the highest visibility location. The useless information I’m referring to is, of course, the blog title and the fact that a given page is within your blog archive. Some people might argue that your blog title is very important, and in some cases it is; however, in my opinion that isn’t the norm. As for the fact that a web page is within your archive, that has little significance, as publishing content online essentially places it into an archive immediately: it’s called the internet.

To move the highest importance keywords and phrases into the highest visibility location, I have opted to place the page title at the start of the title tag. Since I don’t think users care about a web page being archived, I’ve also opted to drop that from the page title too. These changes have resulted in the page titles that you’re seeing currently, which take the form:

  • {page title} | {blog title}
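
As a small illustration of the new format, the sketch below assembles a title in that order. It is Python rather than the PHP a WordPress theme would use, and the page_title function and its fallback for pages without their own title are assumptions made for the example.

```python
def page_title(post_title, blog_title, separator=" | "):
    """Format a title tag with the page title first, then the blog title."""
    post_title = post_title.strip()
    if not post_title:
        # Front page or other views without their own title.
        return blog_title
    return f"{post_title}{separator}{blog_title}"


print(page_title("Measuring The Impact Of Page Titles", "#if debug"))
# prints: Measuring The Impact Of Page Titles | #if debug
```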

Given that #if debug is receiving a very small amount of traffic currently, any changes to the site at the moment are highly visible within the web statistics. I’m hoping that with the changes to the page title format, the search engine referrals will increase by more than 10%. It is a modest figure; however, given that I have so little content at the moment, it is something that I hope is attainable.

Stopping Search Engines Indexing Website Maintenance Pages

Sunday, December 2nd, 2007

There is no schedule for when a search engine will or won’t turn up on your web site’s doorstep and start indexing it, so what happens when one turns up unannounced during scheduled maintenance? Under normal conditions, the search engine spiders will notice the difference from what they crawled last time and will include your changes in their index. Of course, you really don’t want your ‘we are currently performing scheduled maintenance and expect to be back in 1 hour’ message showing up in search engine results when your users enter an appropriate query.

To stop search engines indexing your site while it is in maintenance mode, there are two simple solutions available:

HTTP 404 response code
Search engines don’t immediately remove your web pages from their index because they cannot access them on a given request, just as they won’t remove them if your site is returning an internal server error. Instead, they note that they attempted to crawl a given web page at a particular time and try again later. Only after repeatedly failing to retrieve the document will they mark that particular page as non-existent and remove it from their index. HTTP also defines the 503 Service Unavailable status for temporary outages, which describes scheduled maintenance more precisely than a 404.

META no-index tag
When a search engine spider encounters a no-index meta tag (a robots meta tag with the value noindex), it should immediately abort indexing that particular page. After the scheduled maintenance is over and the spiders return, the no-index flag is no longer present, so the spiders will proceed with the crawl as normal.
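
The two options can be sketched together in a few lines. The Python below is a minimal illustration, not a drop-in implementation; the maintenance_response function, its parameters and the page wording are hypothetical. It defaults to the 404 status described above, and as an assumption of this sketch it also accepts 503 Service Unavailable, the status HTTP defines for temporary unavailability, in which case it adds a Retry-After header.

```python
def maintenance_response(status=404, retry_after_seconds=3600):
    """Return (status, headers, body) for a site in maintenance mode.

    The default mirrors the 404 option above. Passing status=503 is an
    assumption of this sketch and adds a Retry-After header.
    """
    body = (
        "<html><head>"
        # Robots meta tag asking spiders not to index this page.
        '<meta name="robots" content="noindex" />'
        "<title>Scheduled maintenance</title>"
        "</head><body>"
        "<p>We are currently performing scheduled maintenance.</p>"
        "</body></html>"
    ).encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
    }
    if status == 503:
        # Retry-After hints when clients should try again, in seconds.
        headers["Retry-After"] = str(retry_after_seconds)
    return status, headers, body


status, headers, body = maintenance_response()
print(status, headers["Content-Type"], len(body))
```

Either signal on its own matches one of the options above; sending both simply covers spiders that honour only one of them.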

Next time your site is under maintenance, make sure you’ve implemented one of these options or you could be very surprised by what shows up in the search engine results the following day!