Source Control Commit Visualisation

Software development relies on source control management software such as CVS, Subversion, SourceSafe, BitKeeper, Mercurial, Git and the like to track and manage changes to source code over time. As a project progresses, developers and contractors come and go, and the activity on a given project ebbs and flows as required.

Visualising who is changing a project, what is changing and by how much is quite complex as there are so many variables. However, Michael Ogawa has built a project named code_swarm which does just that. Instead of providing tables or static images to help visualise a project's changes, he has managed to animate them into something quite spectacular.

Following are five different code swarm visualisations of popular open source projects:

The amazing thing about a visualisation such as code_swarm is that it shows just how many people actively participate in a given open source project, how much each of them participates and what sort of tasks they normally perform on that project. As an example, comparing the number of different people working on SQLAlchemy and Django isn’t a competition: Django is ahead by a mile, though next to Apache the others seem insignificant.

Posted in General, Programming

Automattic Account Management

Automattic, the fine folk behind the WordPress blogging engine, wordpress.com and Akismet, have started merging accounts between wordpress.com and Gravatar.

Toward the start of 2006, I signed up for an account with wordpress.com and for obvious reasons, I’ve never needed to use it. Not that long afterward, I used the API key that was provided with my wordpress.com account to fight spam using Akismet.

Today I signed into Gravatar, a web service acquired by Automattic late in 2007, to check some settings and was presented with some information about upcoming changes to my existing account. Not having used my wordpress.com account actively, I had to go sifting through signup emails from two years ago; unsurprisingly, my account still had the randomly generated password!

Within two minutes of finding my wordpress.com account information, I’d followed the prompts and merged/associated it with my existing Gravatar account. The way in which this is being handled is great: it’s a passive change that happens when you next sign in, and if you do have an existing wordpress.com account you can associate the two.

The best thing is now I have one less login to worry about and I can see all of my information for all Automattic assets in one place!

Posted in Services

Changing Temporary (302) To Permanent (301) Redirects

It’s commonplace to register multiple variations of a domain to protect the brand or product that the domain is related to. At some point, a webmaster must choose what he or she is going to do with the variations; the normal choices are:

  • Do nothing, simply owning them is sufficient
  • Set them up, alias them so the site content is accessible via any of the variations
  • Set them up and redirect the variations to the primary domain

This post is going to discuss the third option, as I have recently seen what I’d consider strange results in that space.

Setting The Scene

Imagine you sell Product A and you have a web site at http://producta.com. For three years http://producta.com has been used as the main web site; however, in an exercise for brand consistency, you opt to move the web site to http://brandproducta.com.

The change of domain is handled using a temporary redirect and is successful. Soon after the move, http://producta.com is no longer visible in the search engines and has been replaced with http://brandproducta.com.

Weirdness

As a clean up exercise, I recently went through and updated the redirects on the domain variations (including http://producta.com) to use permanent (301) redirects. At the time, I didn’t think I’d see any changes in the search engine result pages, as http://producta.com hasn’t been in use for quite some time and all that was changing was a temporary (302) redirect into a permanent (301) redirect.
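To make the difference concrete, here is a minimal sketch of the kind of redirect described above, written as a tiny Python WSGI application. It is not how the redirects on my sites are actually configured; the host names are the made-up ones from this post, and `redirect_app` and `VARIATION_HOSTS` are names invented for the illustration.

```python
# Hypothetical host names, taken from the example in this post.
PRIMARY_HOST = "brandproducta.com"
VARIATION_HOSTS = {"producta.com", "www.producta.com"}


def redirect_app(environ, start_response):
    """Minimal WSGI app: send variation domains to the primary domain."""
    # Strip any port number and normalise the case of the requested host.
    host = environ.get("HTTP_HOST", "").lower().split(":")[0]
    path = environ.get("PATH_INFO", "/") or "/"

    if host in VARIATION_HOSTS:
        location = "http://" + PRIMARY_HOST + path
        # "301 Moved Permanently" says the move is permanent.
        # Using "302 Found" here instead would mark it as temporary.
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    # Requests for the primary domain are served normally.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"primary site"]
```

The only thing that differs between the temporary and permanent versions is the status line, which is why I didn't expect the search results to change at all.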

What has happened is that a brand+producta search term which would have returned http://brandproducta.com as the first listing, is now sharing that space with http://producta.com. Since that domain hasn’t been in use for such a long time, Google are using the results from DMOZ for the title and snippet.

Explanation

I’ve read through the information that Matt Cutts provided when he discussed 302 redirects back in January 2006. There is a lot of good information on that page and also the previously linked article about URL canonicalisation – however nothing that I felt described what I have outlined above.

What I think has happened is that the temporariness of the 302 redirect has kicked in. Google have been seeing the 302 redirect from http://producta.com into http://brandproducta.com for quite some time and, since it was temporary, have been checking it periodically. When something finally changed, which is exactly what a temporary redirect implies will happen, Google kicked back into gear and displayed the results from http://producta.com.

Since it is now showing a 301 permanently moved redirect, I suspect that within a short amount of time Google will remove the listing for http://producta.com and it’ll be replaced by http://brandproducta.com.

I’d love to hear from someone if they have a more comprehensive answer on the results I’ve seen.

Posted in Search

Google Analytics Benchmarking Verticals

In March, Google announced a new feature for Google Analytics named Benchmarking. One of the most compelling reasons to opt-in to the benchmarking component of Google Analytics is to compare how your sites perform against other sites.

Once the data from your sites has been analysed by Google Analytics, it is then possible to compare the following metrics against other sites:

  • Visits
  • Pageviews
  • Pages/visit
  • Average Time on Site
  • Bounce Rate
  • Percentage New Visits
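As a rough illustration of what these six metrics measure, the sketch below derives them from a list of visits. The `Visit` record and `summarise` function are invented for this example and the definitions are the commonly used ones (for instance, a bounce is a visit with a single pageview); Google have not published exactly how the benchmarking figures are calculated.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Hypothetical record of one visit; not a Google Analytics data structure."""
    pageviews: int
    seconds_on_site: float
    is_new_visitor: bool


def summarise(visits):
    """Compute the six benchmarked metrics for a list of Visit records."""
    total = len(visits)
    if total == 0:
        raise ValueError("need at least one visit")
    pageviews = sum(v.pageviews for v in visits)
    return {
        "visits": total,
        "pageviews": pageviews,
        "pages_per_visit": pageviews / total,
        "avg_time_on_site": sum(v.seconds_on_site for v in visits) / total,
        # Assumption: a bounce is a visit that viewed exactly one page.
        "bounce_rate": sum(1 for v in visits if v.pageviews == 1) / total,
        "pct_new_visits": sum(1 for v in visits if v.is_new_visitor) / total,
    }
```

Note that the last four are ratios, which makes them comparable between sites of any size, whereas visits and pageviews are raw counts.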

Google Analytics allows the user to choose which one of a number of industry verticals to place their site into for comparison; telecommunications, travel, business and news are but a few. This industry specific targeting allows for comparison against sites which are similar in theme, which is vitally important, as you wouldn’t want to compare the statistics of a heavily ecommerce driven site against those of a social networking site.

To make sure that the first two metrics above make sense for each site, Google Analytics automatically places a site into one of three categories based on the number of visits: small, medium or large. When viewing benchmarking data about a site, only the data from other sites within your size category is visible. As such, if you have a small but up and coming site, it isn’t possible to see what the market leader may potentially be doing.

So far, we can compare six simple but very useful metrics against similarly sized web sites within the same industry vertical, though specifying a vertical for comparison is completely optional. While very useful, having a better understanding of exactly what you’re comparing against would be handy. I’d personally like a little clarification on the following points:

  • What is the boundary in visits per time period for small, medium & large?
  • How long does a site need to sustain the number of visits per time period to officially be moved between size categories?
  • If a site does move between categories, as a user – am I notified that it has happened?
  • If I use a country specific domain, am I comparing only against sites of a similar size within the country specific domain name space or is it a global comparison? I find this point quite important, as users from different countries have different usage patterns.
  • Does placing your site within a country via Google Webmasters have an impact on the previous point – in case you use a top level domain such as a .com/.net?
  • How are sites placed into an industry vertical and is it possible to see what vertical a given site has been placed in? The latter part of that question is important: if your site has been placed into the wrong sub-category and you are nominating a different category (the one you feel is correct) for comparison, the results you see could be skewed.

The benchmarking service from Google Analytics has only just been launched and is still marked beta. I expect as more people start sharing their information with Google, more and more questions will get raised, more will be answered and the product will continue to evolve as do most Google products.

Posted in Services

Google Analytics Ecommerce Outage

Six weeks ago, colleagues from my workplace and I implemented Google Analytics Ecommerce functionality within a handful of sites.*

The statistics had been pouring into Google Analytics and then around April 25, the same time that Australia has a long weekend to celebrate ANZAC Day, the transactions going through the site started to drop. At first I didn’t think much of it; in the tourism industry it is commonplace to see lower periods of activity over a long weekend.

I continued to keep an eye on the transactions being reported and expected them to resume the next work day, however that didn’t happen. At this stage, I investigated the issue further to see what the actual figures were and my suspicion was confirmed: the transactions going through the site had dropped, but nowhere near the levels that the ecommerce functionality within Google Analytics was suggesting.

A fortnight passed without my seeing any noise about it online and then today when I logged into Google Analytics, the dashboard included a notice stating that analytics was delayed in processing data from 30th April to 5th May and that ecommerce data across that period was unable to be recovered. I’m pleased that the Google Analytics team have posted a notice about it; at least that confirms that it wasn’t something we had done which inadvertently stopped us reporting the transactions into Google.

Two things:

  1. The image above suggests that the outage began on 27th April, not 30th April as Google suggested. Either the sudden drop was the lull of the long weekend or Google have reported the wrong date.
  2. Why did it take a fortnight to post a notice about the unplanned outage? While I appreciate it wasn’t going to change anything, if I had known that there was an outage in place I wouldn’t have spent any time investigating the lull and would have just moved on.

* For those that have an ecommerce site and aren’t utilising the ecommerce functionality within Google Analytics, I cannot impress on you enough how good this feature is; the insight it provides into the revenue that your site(s) generate is amazing.

Posted in Services

Gaming Google Reader For Higher Click Through Rates

Everyone looking to promote their web sites is always looking for ways to get more traffic, higher click through rates and better conversion rates (whatever a conversion might represent).

For a long time, publishers around the world were looking at ways of exploiting small omissions in how the search engines crawled, indexed and subsequently displayed a result within the search engine listings. One of the most popular methods was adding non-standard characters into the <title> element for a page, in an attempt to make it stand out within the search engine results.

It didn’t take long for the search engines to cotton on to this tactic and it was shut down. However, I’ve recently noticed that a handful have slipped through into Google Reader.

Based on the image above, does the additional star at the start of the title catch your attention? For me it immediately grabbed it, as it’s similar to the star used by Google Reader to remember a feed item for later.

For comparison’s sake, you can see that Google search is filtering that same non-standard character out of the search results; it’s a matter of time before the Google Reader team plug that hole.
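I don't know how Google search actually filters these characters, but one plausible approach is to strip decorative symbols from the front of a title. The sketch below does that using Unicode character categories; `strip_leading_symbols` is a name made up for this example, and limiting it to the "So" (Symbol, other) category is my own assumption so that titles starting with something like a currency sign are left alone.

```python
import unicodedata


def strip_leading_symbols(title):
    """Remove decorative symbols (and the spaces around them) from the start of a title."""
    index = 0
    while index < len(title):
        char = title[index]
        # "So" is the Unicode category "Symbol, other", which covers
        # stars, arrows, dingbats and similar decorative characters.
        if unicodedata.category(char) == "So" or char.isspace():
            index += 1
        else:
            break
    return title[index:]
```

Anything after the first ordinary character is left untouched, so a symbol in the middle of a title survives.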

Posted in Services

Detecting Duplicates Within XML Feeds

The same web page, shown within Google Reader multiple times

At the end of January, I commented on and offered a suggestion to the Google Reader team about how to improve their product by removing duplicate feed items.

At the time, I didn’t think to post a screenshot to aid in my explanation but remembered to grab one recently and felt it would help explain just how annoying this can be within Google Reader.

From the screenshot, you can see that I have highlighted eight different references to an article by Simon Willison about jQuery style chaining with the Django ORM. When a human looks at that image, it is abundantly clear that each of the eight highlighted references is ultimately going to link through to the same page.

The Google Reader team could use duplicate detection to their advantage by collapsing the duplicates and offering a visual clue that the item is hot/popular based on the number of references found to the same article. Google search already has the notion of the date/time when content is published, so using that information along with the number of inbound references they discover, the number of duplicates collapsed within your RSS streams could be quite useful.

I know I would really love better facilities within Google Reader for detecting duplicates within RSS; it’d remove so much noise from the information stream when you’re trying to keep an eye on what is happening within the community.
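A minimal sketch of the collapsing idea, assuming each feed item boils down to a feed name and a link, might look like the following. The `normalise` and `collapse_duplicates` functions are invented for this example and the normalisation is deliberately simple; real feeds often point at the same article through redirects or tracking URLs, which this does not attempt to resolve.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Reduce trivially different URLs for the same page to one form."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    # Keep the query string (it can identify the page) but drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def collapse_duplicates(items):
    """Group (feed_name, link) pairs by normalised link, most referenced first."""
    groups = defaultdict(list)
    for feed_name, link in items:
        groups[normalise(link)].append(feed_name)
    # The length of each group is the "hotness" of that article.
    return sorted(groups.items(), key=lambda pair: len(pair[1]), reverse=True)
```

The number of feeds in each group is the count that could be shown beside the collapsed item as the popularity clue.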

Posted in Services

Non-English Languages & Whacky Domain Names

Asian language glyphs from the web sites of James Holderness

While doing a little research for an upcoming article tonight, I revisited the web site of James Holderness. If the name looks familiar, it’s because I linked to him in January regarding detecting duplicate items within RSS feeds.

When I stumbled onto his site, I couldn’t believe the domain that he was using:

  • http://www.xn--8ws00zhy3a.com

as it seemed completely unmanageable for a normal person. At the time, it seemed so unwieldy that I thought James must have been participating in some obscure SEO challenge; today I realise that isn’t the case at all.

The image shown above is displayed on James’ site beside his name. It turns out that those three glyphs somehow translate into the obscure domain listed earlier, as can be seen in the following screenshot from Google Search:

Non-English written characters or glyphs displayed within a Google Search result as the domain name

For those that are interested, Yahoo!, MSN and Live search all showed the plain ASCII form of the domain name and not the glyph based version, though they were more than happy to display the glyphs within the title of the web site.

Does anyone know how a glyph is translated into the standard English alphabet and more so, what within the domain name delineates one glyph from the next?
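As far as I can tell, the answer is Punycode (RFC 3492), the encoding used for internationalised domain names. The `xn--` prefix marks a label as encoded, and nothing in the remaining text delineates one glyph from the next: the letters and digits are a run of variable-length numbers, each telling the decoder which character to insert and where. Python ships a `punycode` codec, so a small sketch can decode the label; the helper names below are made up for this example.

```python
ACE_PREFIX = "xn--"  # marks a domain label as Punycode encoded


def decode_label(label):
    """Decode one dot-separated label; plain ASCII labels pass through unchanged."""
    if not label.lower().startswith(ACE_PREFIX):
        return label
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


def encode_label(label):
    """Encode one label; plain ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def decode_hostname(hostname):
    """Decode every label of a host name such as www.xn--8ws00zhy3a.com."""
    return ".".join(decode_label(part) for part in hostname.split("."))
```

Calling `decode_hostname("www.xn--8ws00zhy3a.com")` returns the glyph version of the domain, and `encode_label` on the middle label turns it back into the `xn--` form. This skips the normalisation steps that full internationalised domain name handling performs, so treat it as an illustration only.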

Posted in Search

.htaccess Inheritance Gotcha

I recently set up a subdomain within a CPanel hosting account and ran into a strange problem.

After the subdomain was built, DNS had propagated, and FTP access and a database were set up, I uploaded the latest version of the ever popular WordPress blogging software. As the installation instructions suggested, about two minutes later WordPress was up and running. Unfortunately the image overlays for the TinyMCE rich text editor were not displaying correctly; the buttons were present but the overlay that depicts a capital B for bold or an I for italics was not showing.

Knocking off the simplest things first:

  • F5
  • CTRL+R and CTRL+F5 to force the browser to reload all content
  • Cleared browser cache
  • Uploaded the wp-includes folder again, in case any of the images failed to upload successfully
  • Checked that I could view the images in question manually in the browser by typing in the URL, which worked

At this point I was beginning to draw a bit of a blank as to what it might have been.

As is often the case, I left the problem alone for a little while and the solution popped into my head. To enable the friendly URLs within WordPress, I needed to either make the .htaccess file writable or upload one with the appropriate configuration in it. As a matter of simplicity, I copied the .htaccess file from the main WordPress installation on the site and dropped it into the subdomain installation. Copy and paste, the root of all evil.

Back in May 2006, people from MySpace were hotlinking images from my site. The cost of popularity from my anonymous MySpace friends was pushing my web hosting account well over its monthly data limits and I was forced to block their access using some simple rules in my .htaccess file.

Since I had copied and pasted my .htaccess file into the subdomain (which resides in a folder under the primary account), the settings from the parent .htaccess file carried over into the subdomain account. The result was that any requests from the subdomain that didn’t meet the hotlinking requirements were being blocked by Apache and mod_rewrite.

The solution of course was straightforward: remove the restrictions in the .htaccess file that I had uploaded into the subdomain, and suddenly the images from the WordPress admin and TinyMCE started showing up as you’d expect.

A work colleague of mine would term this a junior error.
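The logic of a typical hotlink rule can be sketched in a few lines. This is a Python model of the check rather than the actual mod_rewrite rules from my .htaccess file, and the host names are placeholders; `is_hotlink_blocked` is a name invented for the illustration.

```python
from urllib.parse import urlsplit

# Placeholder host names; the real rules listed the main site's own hosts.
ALLOWED_REFERER_HOSTS = {"example.com", "www.example.com"}


def is_hotlink_blocked(referer, allowed_hosts=ALLOWED_REFERER_HOSTS):
    """Return True if an image request with this Referer would be refused."""
    if not referer:
        # Assumption: requests without a Referer are let through,
        # as hotlink rules commonly do.
        return False
    host = (urlsplit(referer).hostname or "").lower()
    return host not in allowed_hosts
```

With rules like these, a page on the subdomain (say `http://blog.example.com/wp-admin/`) sends its own address as the Referer, which isn't in the allowed list, so its images are refused. Typing an image URL straight into the browser sends no Referer at all, which would explain why my manual check worked.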

Posted in Programming, Web Development

Google Analytics Benchmarking

Google have announced a new feature for Google Analytics named Benchmarking. The Google Analytics Benchmarking service is still in its beta phase, but it aims to allow analytics users to compare or benchmark their web sites against other web sites.

The benchmarking service from Google is opt-in, not default-in. If a user would like to view benchmarking data for their sites, they must first opt in to allow Google to use their own web statistics. Of interest, opting in is on a per account basis, not per web profile. As such, if you have 50 web profiles set up within your account, opting in will share the data from all of them with Google.

After a user opts into the benchmarking service, Google proceed to anonymise the user’s web statistics. What this means is that any identifiable information within the web statistics is removed and only aggregate information is held; as such it isn’t possible to spy on your competitor directly or vice versa.

At this early stage, the benchmarking data is fairly high level but provides you comparative metrics on:

  • Visits
  • Pageviews
  • Pages/visit
  • Average Time on Site
  • Bounce Rate
  • Percentage New Visits

The usefulness and ultimately the success of the benchmarking service is reliant on how many Google Analytics users opt-in to sharing their web statistics with Google. If the greater user base don’t feel inclined to share their web statistics with Google in this manner, then the comparative nature of what they are offering is hamstrung to some degree.

Posted in Services