Google Alerts lets a user define keywords and phrases; when Google finds them while crawling and indexing web sites, it sends the user a notification about that occurrence.
Historically, the technology behind the alerting system appeared to be quite simple: it literally matched the keywords and phrases that the user had nominated. Recently, however, alerts have been generated for pages that don’t strictly meet the keyword and phrase requirements. It seems as though Google are using the additional metadata about a web site and its content to infer certain pieces of information.
As an example, I was recently notified about my name being used within the post about extending the Nintendo Wii. If you view that particular item, you will not find the phrase “alistair lattimore” anywhere within it. Just to be sure, I have also confirmed that my name does not appear in the RSS feeds generated by the site.
Putting the tinfoil hat on for a second, there is a raft of information that Google know about me already:
- I have a Gmail account.
- The same Gmail account is associated with Google Analytics, Google Reader, Google Webmasters, Google AdWords and Google AdSense.
- Within Google Webmasters, I monitor my personal blog and this site.
- Within Google Analytics, I monitor my personal site and this site.
- Within Google Reader, I subscribe to the feed of both sites.
- Google are a domain registrar, which means they could theoretically see that I purchased both domains.
- I have linked in both directions between the two sites in the past.
When you start to see how all of that information is inter-linked, it becomes quite easy to see how Google can provide insightful results through their various services. Of course, if you take the tinfoil hat off and look at the more standard items such as web site content, my name is listed in the title on the front page and also on the about page. Those two bits of information might have been all it took; who knows.
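To make the difference concrete, here is a minimal Python sketch contrasting a purely literal match against the post itself with a match that also considers text found elsewhere on the same site. It is only an illustration of the idea, not how Google Alerts is known to work: the `Page` class, the `site_context` field, both function names and the sample strings are all made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    """A crawled page plus text gathered from elsewhere on the same site.

    Hypothetical model for illustration only.
    """
    body: str
    site_context: List[str] = field(default_factory=list)  # e.g. front page title, about page


def literal_match(phrase: str, page: Page) -> bool:
    """Alert only if the phrase appears in the page body itself."""
    return phrase.lower() in page.body.lower()


def inferred_match(phrase: str, page: Page) -> bool:
    """Alert if the phrase appears in the body or in any site-level text."""
    texts = [page.body, *page.site_context]
    return any(phrase.lower() in text.lower() for text in texts)


# Placeholder content: the post never mentions the author,
# but the front page title and about page do.
post = Page(
    body="A post about extending the Nintendo Wii.",
    site_context=[
        "Front page title: Alistair Lattimore",
        "About page: written by Alistair Lattimore",
    ],
)

print(literal_match("alistair lattimore", post))   # False
print(inferred_match("alistair lattimore", post))  # True
```

Under these assumptions, the literal matcher stays silent for the Wii post while the site-aware matcher fires, which is the behaviour I observed.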
If the technology behind that flexibility has a high level of accuracy in determining or inferring that information, it really is an excellent service. In the above example, if Google hadn’t inferred my name as being associated with that document, I would never have found out about it via the alerting system. Granted, in this particular example it makes no difference, as I know I wrote it; however, for all other content on the internet it really lifts the product’s capability.