Tag Archives: google

Google Analytics Ecommerce Outage

Six weeks ago, colleagues from my workplace and I implemented Google Analytics Ecommerce functionality within a handful of sites.*

The statistics had been pouring into Google Analytics, and then around April 25, the same time that Australia has a long weekend to celebrate ANZAC Day, the transactions going through the site started to drop. At first I didn’t think much of it; in the tourism industry it is commonplace to see lower activity over a long weekend.

I continued to keep an eye on the transactions being reported and expected them to resume the next work day; however, that didn’t happen. At this stage I investigated the issue further to see what the actual figures were, and my suspicion was confirmed – the transactions going through the site had dropped, but nowhere near the levels that the ecommerce functionality within Google Analytics was suggesting.

A fortnight passed and I hadn’t seen any noise about it online. Then today, when I logged into Google Analytics, the dashboard included a notice stating that Analytics was delayed in processing data from 30 April to 5 May and that ecommerce data across that period could not be recovered. I’m pleased that the Google Analytics team have posted a notice about it; at least it confirms that it wasn’t something we had done which inadvertently stopped us reporting transactions into Google.

Two things:

  1. The image above suggests that the outage began on 27 April, not 30 April as Google suggested. Either the sudden drop was the lull of the long weekend, or Google have reported the wrong date.
  2. Why did it take a fortnight to post a notice about the unplanned outage? While I appreciate it wasn’t going to change anything, if I had known that there was an outage in place, I wouldn’t have spent any time investigating the lull and would have just moved on.

* For those that have an ecommerce site and aren’t utilising the ecommerce functionality within Google Analytics, I cannot impress on you enough how valuable this feature is; the insight it provides into the revenue that your site(s) generate is amazing.

Gaming Google Reader For Higher Click Through Rates

Everyone looking to promote their web sites is always looking for ways to get more traffic, higher click through rates and better conversion rates (whatever a conversion might represent).

For a long time, publishers around the world were looking at ways of exploiting small oversights in how the search engines crawled, indexed and subsequently displayed a result within the search engine listings. One of the most popular methods was adding non-standard characters into the <title> element of a page, in an attempt to make it stand out within the search engine results.

It didn’t take long for the search engines to cotton on to this tactic and shut it down – however, I’ve recently noticed that a handful have slipped through into Google Reader.

Based on the image above, does the additional star at the start of the title catch your attention? It immediately grabbed mine, as it’s similar to the star used by Google Reader to remember a feed item for later.

For comparison’s sake, you can see that Google search is filtering that same non-standard character out of the search results; it’s only a matter of time before the Google Reader team plug that hole.
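As a rough illustration (this is only a guess at the approach, not how Google actually does it), a title cleaner could simply drop Unicode symbol characters before displaying the result:

```python
import unicodedata

# Hypothetical sketch of the kind of filtering described above: drop Unicode
# "symbol" category characters (stars, dingbats and so on) from a title,
# keeping ordinary letters, digits, punctuation and spaces.
def strip_decorative(title: str) -> str:
    cleaned = "".join(ch for ch in title if not unicodedata.category(ch).startswith("S"))
    return cleaned.strip()

print(strip_decorative("\u2605 My attention-grabbing headline"))
# prints: My attention-grabbing headline
```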

Detecting Duplicates Within XML Feeds

The same web page, shown within Google Reader multiple times

At the end of January, I commented on and offered a suggestion to the Google Reader team about how to improve their product by removing duplicate feed items.

At the time, I didn’t think to post a screenshot to aid in my explanation but remembered to grab one recently and felt it would help explain just how annoying this can be within Google Reader.

From the screenshot, you can see that I have highlighted eight different references to an article by Simon Willison about jQuery style chaining with the Django ORM. When a human looks at that image, it is abundantly clear that each of the eight highlighted references ultimately links through to the same page.

The Google Reader team could use such a feature to their advantage by collapsing the duplicates and offering a visual clue that an item is hot or popular, based on the number of references found to the same article. Google search already knows the date/time when content is published, so combining that with the number of inbound references it discovers – and the number of duplicates collapsed within your RSS streams – could be quite useful.
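A minimal sketch of how that collapsing could work, assuming a simplified item shape of just a title and a link rather than Google Reader’s actual data model:

```python
from collections import Counter

# Hypothetical sketch of the collapsing idea above: count how many feed items
# point at the same article URL, keep one entry per URL, and expose the count
# as a "hot/popular" hint. The item shape is assumed for illustration.
def collapse_duplicates(items):
    counts = Counter(item["link"] for item in items)
    seen, collapsed = set(), []
    for item in items:
        if item["link"] in seen:
            continue
        seen.add(item["link"])
        collapsed.append({**item, "references": counts[item["link"]]})
    return collapsed
```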

I know I would really love better facilities within Google Reader for detecting duplicates within RSS; it’d remove so much noise from the information stream when you’re trying to keep an eye on what is happening within the community.

Non-English Languages & Whacky Domain Names

Asian language glyphs from the web site of James Holderness

While doing a little research for an upcoming article tonight, I revisited the web site of James Holderness. If the name looks familiar, it’s because I linked to him in January regarding detecting duplicate items within RSS feeds.

When I stumbled onto his site, I couldn’t believe the domain that he was using:

  • http://www.xn--8ws00zhy3a.com

as it seemed completely unmanageable for a normal person; so much so that at the time I thought James must have been participating in some obscure SEO challenge. Today I realise that isn’t the case at all.

The image shown above is displayed on James’ site beside his name. It turns out that those three glyphs somehow translate into the obscure domain listed earlier, as can be seen in the following screenshot from Google Search:

Non-English written characters or glyphs displayed within a Google Search result as the domain name

For those that are interested, Yahoo!, MSN and Live Search all showed the ASCII form of the domain name and not the glyph-based version – though they were more than happy to display the glyphs within the title of the web site.

Does anyone know how a glyph is translated into the standard English alphabet and, more to the point, what within the domain name delineates one glyph from the next?
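From what I can tell, the “xn--” prefix marks a Punycode-encoded (IDNA) label, and each dot-separated label is encoded independently – which is what delineates one run of glyphs from the next. A minimal sketch using Python’s built-in idna codec (which implements the older IDNA 2003 rules, and assumes the label decodes cleanly):

```python
# The "xn--" prefix marks a Punycode-encoded label; each dot-separated label
# is encoded on its own, which is how one run of glyphs is kept separate
# from the next.
ace_form = "xn--8ws00zhy3a.com"
print(ace_form.encode("ascii").decode("idna"))   # prints the original Unicode glyphs

# Going the other way: encode a Unicode domain into its ASCII-compatible form
# (an example label only, unrelated to James' domain).
print("\u4f8b\u3048.com".encode("idna").decode("ascii"))   # prints the xn-- form
```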

Google Analytics Benchmarking

Google have announced a new feature for Google Analytics named Benchmarking. The service is still in its beta phase, but aims to allow Analytics users to compare, or benchmark, their web sites against other web sites.

The benchmarking service from Google is opt-in, not enabled by default. If users would like to view benchmarking data for their sites, they must first opt in to allow Google to use their own web statistics. Of interest, opting in is on a per-account basis, not per web profile. As such, if you have 50 web profiles set up within your account, opting in will share all of your web profiles’ data with Google.

After opting into the benchmarking service, Google proceed to anonymise the user’s web statistics. What this means is that any identifiable information within the web statistics is removed and only aggregate information is held; as such it isn’t possible to spy on your competitors directly, or vice versa.

At this early stage, the benchmarking data is fairly high level, but provides you with comparative metrics on:

  • Visits
  • Pageviews
  • Pages/visit
  • Average Time on Site
  • Bounce Rate
  • Percentage New Visits

The usefulness, and ultimately the success, of the benchmarking service is reliant on how many Google Analytics users opt in to sharing their web statistics with Google. If the greater user base doesn’t feel inclined to share their web statistics with Google in this manner, then the comparative nature of what they are offering is hamstrung to some degree.

Rapid Development Using Django

In February, Jesper Nøhr wrote about taking an idea from conception to profitable web site in 24 hours. The project involved building an advertising product for the indie crowd, so they could advertise their products throughout other web sites in a similar fashion to how Google Adwords & Google Adsense work.

The final product, named The Indiego Connection, allows advertisers to sign up; each account is manually verified to make sure that it meets the indie requirements. Once an account has been approved, the advertiser can go about configuring their advertising, which is then displayed throughout other web sites.

The really interesting thing for me about this project was the technical side of it.

Jesper wrote the front end of the site using Django, as he uses it for his day job. Given the demanding time frame that the product was built in, I expect that as many of the existing applications as possible were utilised – such as auth. The prototype for the advertising server was built using CherryPy and, once Jesper was satisfied with how it was constructed, he moved it into Erlang for the lightweight threading and performance.

About 24 hours after starting the project, Indiego Connection was pushed into the wild. Word got out quickly about a free advertising product for the indie crowd, and within hours they had over 100 users.

In any sort of normal environment, working an idea from start to finish in 24 hours would seem nearly impossible, especially if technology is involved. Through clever use of the tools, Jesper was able to rapidly develop a complete product in a short space of time.

Google Account Signin With CAPTCHA

Google Account login featuring CAPTCHA for additional security

Tonight I was presented with a Google login page which was different in a few ways:

  • size and shape of the control were different
  • instead of using an in page control, it took me to a completely new page
  • required additional CAPTCHA validation

I suspect this may have been triggered by logging in and out of various Google products tonight: I closed tabs without closing the browser, opened new tabs, and logged in and out again, which may have left conflicting session information.

Does anyone know what causes this type of login prompt to be thrown up by Google?

Google Reader Duplicate Item Improvement

One of the features that I love about RSS is that it allows users to keep their finger on the pulse of certain topics very easily. As some people may know, I quite like the Python web framework Django and I use Google Reader to help me keep up to date with what is happening within the greater Django community. I do this by subscribing to the RSS feeds of people who I know regularly write about it, but also by utilising online social bookmarking sites such as del.icio.us and ma.gnolia.

I recently read an article by James Holderness about detecting duplicate content within an RSS feed (via). For those not bothered with the jump, James outlines the different techniques that a range of popular feed reading products use to detect duplicate RSS content, from using the id field within the RSS down to comparing title and description information.
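To give a flavour of the simplest of those techniques, a feed reader might key each item on its guid where one exists and fall back to hashing the title and description; the item shape below is an assumption for illustration, not any particular reader’s internals:

```python
import hashlib

# Sketch of the simplest duplicate checks mentioned above: prefer the feed's
# own guid/id when present, otherwise fall back to a hash of title + description.
def item_key(item: dict) -> str:
    if item.get("guid"):
        return item["guid"]
    text = item.get("title", "") + "\n" + item.get("description", "")
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def dedupe(items):
    seen, unique = set(), []
    for item in items:
        key = item_key(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```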

Back to the improvement, which is related to the information that James provided. When I subscribe to the social bookmarking sites, they end up providing back a huge range of content matching certain criteria. The ones I’m subscribing to at the moment for Django are:

As you can imagine, each of these services has a different but overlapping user base, each of which will find common items throughout the internet and bookmark them each day. When that stream of information is received by Google Reader, it will display half a dozen copies of the same unique resource, masked by different user accounts within their bookmarking tool of choice.

What would be a great optional feature to add into Google Reader would be the ability to detect duplicate items even when they are sourced via the same domain or different domains.

The trick to something like this would be identifying the pattern, so as to allow Google to use an algorithm to flag it. For the sake of this concept, I think it’d be reasonable to consider items posted into social bookmarking sites and an aside or link drop in a blog to be reasonably similar.

My initial concept would involve comparing the amount of content within an item. If there are fewer than a predefined number of words and only a small number of links, then that item might be considered a link drop. You could apply that logic not only to social bookmarking sites but also to a standard blog, where an author might find something cool they want to link to.
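A rough sketch of that heuristic might look like the following; the word and link thresholds are made up purely for illustration:

```python
import re

# Hypothetical "link drop" detector based on the idea above: an item with
# fewer than WORD_LIMIT words and only a handful of links is treated as a
# link drop rather than a full article. Thresholds are illustrative only.
WORD_LIMIT = 50
LINK_LIMIT = 3

def is_link_drop(item_html: str) -> bool:
    link_count = len(re.findall(r"<a\s", item_html, re.IGNORECASE))
    text = re.sub(r"<[^>]+>", " ", item_html)  # crude tag strip
    word_count = len(text.split())
    return word_count < WORD_LIMIT and link_count <= LINK_LIMIT
```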

The next thing up for consideration might be which items to remove as duplicates and which to keep. News of any kind on the internet tends to reverberate through it quite quickly, so it’s common to find the same information posted many times. As the vibrations are felt, people will tend to link back to where they found that information (as I did above). Google Reader could leverage the minty fresh search engine index to help with this, by using the number of attribution links passed around. As a quick and simple example, imagine the following scenario:

  • Site A has the unique Django content that I’m interested in
  • Sites B through Z all link to site A directly
  • Some sites from C through Z also link back to B, which is where they found that information

I don’t subscribe to site A directly, however some of the sites B through Z have been picked up by the social networks. Using the link graph across those sites, it’d be possible to work out which one(s) among that list are considered authoritative (based on attribution or back links) and start filtering on that. It might then be possible to use other features of the Google Search index to do with theme, quality and trust to filter it further.
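A tiny sketch of that filtering step, where the inbound_links mapping stands in for whatever back-link data Google’s index could supply and the item shape is assumed for illustration:

```python
# Given a group of duplicate items that all point at the same article, keep
# the one whose source site has the most inbound attribution links.
def most_authoritative(duplicates, inbound_links):
    return max(duplicates, key=lambda item: inbound_links.get(item["source"], 0))

# The scenario from the list above: sites B and C both link to site A's article,
# but B has attracted more attribution links, so B's copy is the one kept.
items = [
    {"source": "site-b.example", "link": "http://site-a.example/article"},
    {"source": "site-c.example", "link": "http://site-a.example/article"},
]
inbound_links = {"site-b.example": 12, "site-c.example": 3}
print(most_authoritative(items, inbound_links))
```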

I think a feature like that within Google Reader would be fantastic, especially if I could apply those sorts of options on a per feed or folder basis. That way, I could group all of the common information together (Django) and have Google Reader automatically filter out the duplicates that matched the above criteria.

I’m sure the development team from Google Reader will hear my call; who knows, in a few months maybe a feature like this could work its way into the product.

You Know You’re Popular When

Today my personal site was pinged by Live Business Radio. As I do as a matter of course, I checked out the Live Business Radio web site and was disappointed to find that it’s nothing more than your average run-of-the-mill site, ridden with advertising, spam and ‘buy this crap product now’ pitches.

I regularly get pinged by web sites that don’t have anything to do with me, and when I saw the site I was about to abandon it immediately. Just before I did though, I scanned over the article and noticed that I’d been featured in a list of sites with a high Google PageRank which offer ‘followed’ links. It wouldn’t be a good filthy spammer’s site if they didn’t offer you software (for a fee) which you could use to take advantage of the followed links.

If you’re not quite sure what I’m referring to with the ‘followed’ remark, you can read about it on my personal site.

I should feel so honoured.

Google Alerts Getting Smarter

Google Alerts lets the user define keyword lists and phrases; when Google finds them while crawling and indexing web sites, it sends the user a notification about that particular occurrence.

Historically, it always appeared as though the technology behind the alerting system was quite simple – literally matching the keywords and phrases that the user had nominated. Recently, alerts have been generated that don’t strictly meet the keyword and phrase requirements for a given page. It seems as though Google are using all of the additional metadata about a web site and its content to infer certain pieces of information.

As an example, I was recently notified about my name being used within the post about extending the Nintendo Wii. If you view that particular item, you will not find the phrase “alistair lattimore” anywhere within it. Just to be sure, I have also ruled out my name appearing within the RSS feeds generated by the site.

Putting the tinfoil hat on for a second, there is a raft of information that Google know about me already:

  • I have a Gmail account
  • The same Gmail account is associated with Google Analytics, Google Reader, Google Webmasters, Google Adwords and Google Adsense.
  • Within Google Webmasters, I monitor my personal blog and this site.
  • Within Google Analytics, I monitor my personal site and this site.
  • Within Google Reader, I subscribe to the feed of both sites.
  • Google are a domain registrar, which means they could theoretically see that I purchased both domains.
  • I have linked in both directions between the two sites in the past.

When you start to see how all of that information is interlinked, it becomes quite easy to see how Google can provide insightful results through their various services. Of course, if you take the tinfoil hat off and look at the more standard signals such as web site content, my name is listed in the title on the front page and also on the about page. Those two bits of information might have been all it took, who knows.

If the technology behind that flexibility has a high level of accuracy in determining or inferring that information, it really is an excellent service. In the above example, if Google hadn’t inferred my name as being associated with that document, I would never have found out about it via the alerting system. Granted, in this particular example it makes no difference as I know I wrote it – however, for all other content on the internet it really lifts the product’s capability.