Search engine optimisation consultants worldwide have been pushing the URL canonicalisation wagon for quite some time. URL canonicalisation ensures that a given internet resource can only be reached by a single URL.
That might seem like a relatively straightforward task; however, poorly configured web servers and the widely varying quality of modern content management systems have meant that it isn’t as simple as first thought.
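To make the idea concrete, here is a minimal Python sketch of what canonicalising a URL could look like. The host names (example.com and www.example.com) and the function names are assumptions made up for illustration; a real site would apply the same rules in its web server or CMS configuration and answer non-canonical requests with a permanent redirect.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical site configuration: every alias should end up on one host.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com", "www.example.com"}

# Default ports that add nothing to a URL and can be dropped.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalise(url: str) -> str:
    """Return the single URL under which the resource should be reachable."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Fold the bare domain and the www host into the canonical host.
    if host in ALIAS_HOSTS:
        host = CANONICAL_HOST

    # Keep an explicit port only when it is not the scheme's default.
    if parts.port in (None, DEFAULT_PORTS.get(scheme)):
        netloc = host
    else:
        netloc = f"{host}:{parts.port}"

    # An empty path and "/" name the same resource; the fragment is
    # dropped because browsers never send it to the server.
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def redirect_target(url: str):
    """Return the URL to redirect to, or None if the URL is already canonical."""
    canonical = canonicalise(url)
    return None if canonical == url else canonical
```

With these assumed settings, `redirect_target("https://example.com/accounts/")` would point at the www version of the page, while a request that already uses the canonical form gets `None` and is served directly.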
If there was ever going to be a company that you’d think would nail URL canonicalisation right across the board, it’d be Google. However, while searching for the signup URL for a new Google Account [google account], I found something rather interesting: the first search result was https://google.com/accounts/.
There are two things wrong with that result:
- as a general rule of thumb, all Google products live under the www sub-domain and not the root domain
- more importantly, the SSL certificate is valid for the host name www.google.com and not for google.com
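The second point comes down to host name matching: a browser compares the host in the URL against the names listed in the certificate. The sketch below is a deliberately simplified version of that comparison, not the full rules a real TLS library applies, and the function name and the lists of certificate names are assumptions for illustration only.

```python
def hostname_matches(hostname: str, cert_names: list[str]) -> bool:
    """Simplified check of a host name against the names in a certificate.

    Supports exact matches and a single wildcard as the left-most label
    (e.g. "*.example.com"). Real TLS libraries apply stricter rules.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    for name in cert_names:
        name_labels = name.lower().rstrip(".").split(".")

        # A wildcard stands in for exactly one label, so the label
        # counts must agree for any match to be possible.
        if len(name_labels) != len(host_labels):
            continue

        # Everything to the right of the first label must match exactly.
        if name_labels[1:] != host_labels[1:]:
            continue

        if name_labels[0] in ("*", host_labels[0]):
            return True
    return False


# Illustrative only: a certificate that lists just the www host name.
names_on_certificate = ["www.google.com"]
www_ok = hostname_matches("www.google.com", names_on_certificate)
bare_ok = hostname_matches("google.com", names_on_certificate)
```

Under that assumption the www host passes and the bare domain does not, which is the kind of mismatch that makes a browser show a certificate warning.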
A couple of other slightly interesting bits about that result:
- a Google cache check for the www and non-www versions of that URL shows the exact same crawl time
- a Google link check for the www and non-www versions shows the same number of links into both URLs
- based on the first point, it would appear that Googlebot is happy to crawl a secure page with a broken SSL certificate