Some time ago I stumbled onto a math-based problem-solving site for programmers named Project Euler. The Project Euler site describes itself as:
Project Euler is a series of challenging mathematical/computer programming problems that will require more than just mathematical insights to solve. Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required to solve most problems.
The motivation for starting Project Euler, and its continuation, is to provide a platform for the inquiring mind to delve into unfamiliar areas and learn new concepts in a fun and recreational context.
Possible solutions to the problems are verified through the Project Euler web site, where your successes are recorded if you sign up for an account. You don’t need to work through the problems in order, though doing so may be beneficial because work completed in previous questions is often reused.
I thought Project Euler would be a great exercise to explore more of the Python programming language. While I’ve read quite a bit about Python, I’ve not written anything substantial in it, and as a by-product I often find myself falling back on lowest-common-denominator approaches to solving problems.
My intention while working through the Project Euler problems is to find a neater, cleaner and hopefully more Pythonic method of solving each one. I’ll present my solutions to the problems and, with a little luck, I’ll receive some helpful improvements from the great programming community as well.
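As a rough illustration of the difference between a lowest-common-denominator approach and a more Pythonic one, here is a minimal sketch. The task (summing the multiples of 3 or 5 below a limit) and both function names are chosen purely for illustration; they are not solutions taken from this site.

```python
def sum_multiples_loop(limit):
    """Lowest-common-denominator style: manual counter and accumulator."""
    total = 0
    n = 0
    while n < limit:
        if n % 3 == 0 or n % 5 == 0:
            total = total + n
        n = n + 1
    return total


def sum_multiples_pythonic(limit):
    """Same result using a generator expression and the built-in sum()."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


print(sum_multiples_loop(10))      # 3 + 5 + 6 + 9
print(sum_multiples_pythonic(10))
```

Both functions return the same value; the second simply states the intent in one line rather than spelling out the loop mechanics.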
The Python web framework Django supports internationalisation (i18n) for nearly 30 different languages already.
While reviewing the changesets flowing through the Django source repository, I often notice amendments to the internationalisation code and it got me thinking about how ‘complete’ the i18n status is for the languages that Django is attempting to support.
Enter a visually simple but very informative web site built using Google App Engine which polls the Django subversion repository periodically and compiles a table showing the percentage completion for each of the different languages.
I’m impressed that, with nearly 30 different languages under their belt, the majority of them are reporting very solid percentage completion numbers. No wonder so many non-English speaking developers are using Django.
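How the site actually calculates its percentages isn’t described here. One plausible approach, assuming it reads the gettext .po catalogues that Django ships for each language, is to count how many msgid entries have a non-empty msgstr. The sketch below is a deliberately simplified parser: the function names are hypothetical, and plural forms and fuzzy flags are ignored.

```python
import re


def _unquote(token):
    """Strip the surrounding double quotes from a PO string token."""
    token = token.strip()
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    return ""


def po_completion(po_text):
    """Return the percentage of msgid entries that have a non-empty msgstr."""
    total = translated = 0
    # Entries in a .po file are separated by blank lines.
    for entry in re.split(r"\n\s*\n", po_text.strip()):
        msgid, msgstr, current = "", "", None
        for line in entry.splitlines():
            if line.startswith("#"):
                continue  # comments, flags and obsolete entries
            if line.startswith("msgid "):
                current, msgid = "msgid", _unquote(line[len("msgid "):])
            elif line.startswith("msgstr "):
                current, msgstr = "msgstr", _unquote(line[len("msgstr "):])
            elif line.startswith('"'):
                # Continuation line: append to whichever field came last.
                if current == "msgid":
                    msgid += _unquote(line)
                elif current == "msgstr":
                    msgstr += _unquote(line)
        if not msgid:
            continue  # the header entry has an empty msgid
        total += 1
        if msgstr:
            translated += 1
    return 100.0 * translated / total if total else 0.0


# Made-up catalogue: one header entry plus three messages, two translated.
sample = '''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8"

msgid "Yes"
msgstr "Ja"

msgid "No"
msgstr ""

#: a source-reference comment
msgid "Maybe"
msgstr ""
"Viel"
"leicht"
'''

print(round(po_completion(sample), 1))
```

Running the same function over each language’s catalogue would give one number per language, which is all a completion table needs.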
While doing a little research this morning, I stumbled onto a paid advertisement within Google for Ask.com, informing me that I could book BreakFree hotels & resorts from within the Australian localised Ask.com portal.
Being the curious kind of person, I followed their advertisement and was quite shocked by how deceptive they were with their ad and also the page it took me to.
Instead of Ask.com providing some sort of useful service inside their portal, they provided 10 Google advertising results front and center which were displayed as though they were organic results, followed by actual organic results (click the image for an expanded screenshot of their handiwork).
I don’t necessarily have a problem with them doing paid advertising within Google for services that they offer (though in this case, they don’t have a service relating to my search, which made the ad very deceptive). However, I do have a beef with the way they frame, or rather fail to frame, the paid results from Google within Ask.com search results. If they had placed the same 10 results in the right-hand gutter or boxed them with a different background colour, then at least the user would have a chance of knowing the difference.
I wonder whether or not that sort of behaviour falls within the Google terms of service? It actually reminds me of when Microsoft were advertising on Google for MSN Messenger and taking the user into more search results within Live Search.
Virtually every webmaster has heard of Google Webmaster Tools and uses it regularly to check on the health of their site. Unfortunately, very few know of Live Search Webmaster Center, which complements Microsoft Live Search.
Recently I wrote about the significant improvements that Live Search Webmaster Center has gone through, which have really boosted the product. To put the new enhancements through their paces, it seemed like a good idea to compare what it was displaying versus what Google Webmaster Tools was showing.
Live Search Webmaster Center showed that I didn’t have anything wrong with my robots.txt file, nor was I suffering from long and complex dynamic URLs – however I did have a handful of 404 errors through the site. Live Webmaster Center had picked up that I had linked to another site without the http:// in the href attribute, like:
- <a href="www.domain.com/important/article/">important article</a>
which when clicked, was delivering a 404 error on my site with a URL like:
- http://ifdebug.com/article/my-article/www.domain.com/important/article/
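The resolution behaviour behind that 404 can be reproduced with Python’s standard library. This is a minimal sketch, assuming the page containing the link lived at http://ifdebug.com/article/my-article/ (inferred from the 404 URL above): an href without a scheme is treated as a path relative to the current page.

```python
from urllib.parse import urljoin

# Assumed address of the page containing the link, inferred from the 404 URL.
page_url = "http://ifdebug.com/article/my-article/"

# Without a scheme, the href is resolved relative to the current page.
broken = urljoin(page_url, "www.domain.com/important/article/")

# With the scheme included, the href is absolute and the page URL is ignored.
fixed = urljoin(page_url, "http://www.domain.com/important/article/")

print(broken)
print(fixed)
```

The first line printed matches the 404 URL reported above; the second is the address the link was meant to reach.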
To my surprise, when I explored that same information within Google Webmaster Tools, it had not picked up that I had linked to that article incorrectly.
Moral of the story: don’t put all your eggs in one basket. While Google either hadn’t picked it up or had just compensated for my mistake, simple mistakes like that may have an adverse effect on less capable search engines.
This week Google Analytics received a small upgrade – specifically related to the login process.
Until now, no matter how often you used Google Analytics, you were forced to log in every time you returned to the site. It frustrated users so much that, if you used Google Analytics quite a lot, it became a habit to leave a window open with Google Analytics logged in just for the simplicity.
With the latest update, the Google Analytics team are saying that you no longer need to log in and that the process has been streamlined. I’d argue that only part of that statement is true: you do not need to authenticate, however it isn’t streamlined.
With the majority of other Google services, once you’ve authenticated and subsequently return, the service reads your Google Account information and you immediately have access. For some reason, the Google Analytics team have chosen against the consistent authentication process that is common amongst many other Google services, and the user is forced to click a button to enter.
The process won’t be streamlined until it functions like Google Mail, Google Reader and so on. I welcome the improvement – at least I no longer need to type in my account information all the time – however since they already know that I’m authenticated, I shouldn’t need to click again to re-enter the application.
In the last few days, the Live Search Webmaster blog has posted about two significant improvements to the webmaster center: information on how Live Search crawls your site, and more detailed backlink information.
Live Search Webmaster Center now supports the following four items, which are a great help in identifying problems with your site and how Live Search is spidering your content:
- File not found (404) errors, a straightforward, date-stamped account of the HTTP “404 File Not Found” errors that Live Search encountered when crawling the site. Conveniently, this includes broken links within your own site and to sites that you are linking to.
- Pages Blocked by Robots Exclusion Protocol (REP), reported when Live Search has been prevented from indexing or displaying a cached copy of the page because of a policy in your robots exclusion protocol (REP).
- Long Dynamic URLs, reported when Live Search encounters a URL with an exceptionally long query string. These URLs have the potential to create an infinite loop for search engines due to the number of combinations of their parameters, and are often not crawled. I haven’t come across one of these yet and so far I haven’t seen any documentation of what ‘exceptionally long’ means, so clarification on that point would be handy.
- Unsupported Content-Types, reported when a page either specifies a content-type that is not supported by Live Search, or simply doesn’t specify any content type. Examples of supported content-types are: text/html, text/xml, and application/PowerPoint.
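Two of the checks in the list above can be approximated locally before a crawler ever visits. The sketch below is only an illustration, not how Live Search does it: the audit_url function is hypothetical, and because ‘exceptionally long’ is undocumented, the MAX_QUERY_LENGTH threshold is an arbitrary assumption.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical threshold; the real cut-off used by Live Search is not documented.
MAX_QUERY_LENGTH = 100


def audit_url(url, robots_lines, user_agent="*"):
    """Return a list of likely crawl issues for one URL."""
    issues = []

    # Check the URL against the supplied robots.txt rules.
    parser = RobotFileParser()
    parser.parse(robots_lines)
    if not parser.can_fetch(user_agent, url):
        issues.append("blocked by robots.txt")

    # Flag query strings longer than the assumed threshold.
    if len(urlsplit(url).query) > MAX_QUERY_LENGTH:
        issues.append("long dynamic URL")

    return issues


# Made-up rules and URLs for demonstration.
robots = ["User-agent: *", "Disallow: /private/"]
print(audit_url("http://example.com/private/page", robots))
print(audit_url("http://example.com/list?" + "a=1&" * 50, robots))
print(audit_url("http://example.com/", robots))
```

Detecting 404s and unsupported content types would need real HTTP requests, so they are left out of this offline sketch.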
In 2007, Microsoft removed the ability for users to drill into backlink data within Live Search. It took a long time, however that functionality has now been reinstated within Live Search Webmaster Center and is looking quite promising.
A feature shared by the crawl information and the backlink data is that Live Search Webmaster Center allows you to download the information in CSV format. Possibly the best feature for a large, complex site though, is that each of the above options can be filtered further (search style) by entering a subdomain and/or directory to restrict the results to. The backlink interface additionally supports a top level domain in the search box, allowing you to isolate only backlinks originating from Australian sites by entering .au.
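The same kind of filtering is easy to repeat on exported data. Here is a minimal sketch, assuming the backlinks have already been read out of the CSV into a list of URLs; the layout of the actual export isn’t documented here, and filter_backlinks and the sample URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def filter_backlinks(backlinks, host_suffix="", path_prefix="/"):
    """Keep URLs whose host ends with host_suffix and whose path starts with path_prefix."""
    kept = []
    for url in backlinks:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        path = parts.path or "/"
        if host.endswith(host_suffix.lower()) and path.startswith(path_prefix):
            kept.append(url)
    return kept


# Made-up sample data standing in for an exported backlink list.
backlinks = [
    "http://example.com.au/blog/post",
    "http://news.example.org/story",
    "http://shop.example.net.au/",
]

# Equivalent of typing .au into the backlink search box.
australian = filter_backlinks(backlinks, host_suffix=".au")
print(australian)
```

Passing a path_prefix such as "/blog/" narrows the result to a single directory in the same way.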
Future Improvements
The interface doesn’t support paging of results, in case you want to step through a few pages without exporting the information in CSV format. If you do want to download more information, there isn’t an option to export everything in one hit: you can only retrieve 1000 lines of data. I can appreciate that they don’t want to provide an ‘all’ option or that they want to limit how many can be fetched at once, however there isn’t a way to set 1000 items per page, download them, then go to the next page and download those. The other issue with the 1000 lines of data is that there is no information on how they are selected. As an example, the backlink section uses the language ‘Download up to 1000 results’ without any indication of which 1000 you will get.
Promising
While there is still room for improvement (and really, when isn’t there?), I’m personally encouraged by the changes that Microsoft are making to Live Search Webmaster Center. The sooner services from Microsoft start to catch up to those offered by the leaders, the sooner more businesses and webmasters will invest time in the Live Search product.
Approximately six months ago, I mentioned that I was going to conduct a small test regarding the impact that optimising the HTML <title> element has from a search engine optimisation standpoint.
In December when I wrote that, #if debug was a very new site; in fact it had been online for exactly one month. It’ll come as no surprise that no one knew about the site, and even to this day a limited number of people know about it. Fortunately though, I do have evidence to suggest that more know about it now than they did in December!
In the announcement, I had said that a 10% increase in traffic would be considered a success. Given that the site was receiving approximately 60 visits per week at that time, optimising the <title> element would need to increase that to around 66 visits to be considered a success.
In the image above, you can see the effect of optimising the <title> element in the HTML. The change was made at the marker point directly above the 25 in “Nov 25, 2007 – Dec 1, 2007”. I’m not sure what caused the dip in traffic immediately after the change, however once it recovered, the increased traffic has been maintained or increased. The marker point second from the right delivered a whopping 67 visits for the week and as such I’m going to claim this as a victory (even if it is very, very small!).
In the following few months, this tiny site has grown from zero visits and has steadily been increasing month on month to a lofty figure a little over 400 visits per month! I realise that isn’t a lot of traffic by anyone’s measurement, however for a site that has had very little effort put in and next to no attention directed its way, it isn’t half bad.
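For anyone wanting to check the arithmetic, here is a small sketch using the figures quoted above (roughly 60 weekly visits before the change, a 10% target and 67 visits at the marker point). The variable names are just for illustration.

```python
baseline_visits = 60   # approximate weekly visits before the change
target_uplift = 0.10   # the 10% increase set as the bar for success
observed_visits = 67   # weekly visits at the marker point mentioned above

# Visits needed to meet the target, and the uplift actually observed.
threshold = baseline_visits * (1 + target_uplift)
actual_uplift = (observed_visits - baseline_visits) / baseline_visits

print(f"threshold: {threshold:.0f} visits")
print(f"observed uplift: {actual_uplift:.1%}")
print("success" if observed_visits >= threshold else "not yet")
```

A 10% lift on 60 visits works out to 66, so 67 visits clears the bar by a single visit.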
Bitbucket is the latest project by Jesper Nøhr. If the name looks familiar, it’s because I wrote about Jesper in March when he used Django and Python as a rapid development environment for an indie advertising product named Indiego Connection.
This time around, Jesper has shifted gears to provide hosting for a popular distributed version control system named Mercurial. I haven’t started drinking the distributed version control kool-aid just yet, however the approach has been gaining a lot of attention lately via another open source product named Git, developed by Linus Torvalds, the creator of the Linux kernel.
The Mercurial hosting provided by Bitbucket comes in a few different flavours, one of which is free and allows up to 150MB of storage. I really like the fact that they are not attempting to offer a completely free service; if they were, I suspect that it’d be under enormous pressure. The cost of using Bitbucket to host your Mercurial repositories is very reasonable, starting from $5/month and stepping up to $100/month, which includes 25GB of storage.
Bitbucket provides a very convenient interface for interacting with the Mercurial repositories. As with most web interfaces to source control management packages, you can browse through different repositories, see all of the changes flowing through them and compare them if you like. A few features I like that simpler products don’t support: you can ‘follow’ a repository, create queues for patches related to a repository, and download the repository at time x in zip, gz or bz2 formats. It also provides an easy-to-understand visual linking between changesets.
If you are looking for Mercurial hosting, I would definitely investigate whether Bitbucket is a suitable candidate to store whatever you need versioned. The service certainly looks the goods and from what I’m reading online, it is getting really solid reviews already.
Google have simplified the account management interface for Google Analytics. Previously when adding a user into the system, you needed to provide:
- an email address of a valid Google Account
- first name
- surname
- access level (administrator/reporting)
It appears that you no longer need to provide the first name and surname information. Interestingly though, they have not been marked optional fields, they have been completely removed from the interface.
To my knowledge, the first name and surname information isn’t visible anywhere within Google Analytics (please correct me if I’ve just missed it). If it isn’t displayed or is in limited use, it’s possible Google realised that they were increasing the barrier of entry for no tangible benefit or that they were duplicating information already available within a Google Account.
Sun Java & Bundling Google Toolbar
During this particular update, I happened to notice (I’m not sure if it was there before) that Sun are now bundling Google Toolbar, optionally of course, with the Java installer. I’m all for providing the automatic update, however I don’t believe they should be bundling additional software, optional or otherwise, with an automatic update.
I have no issue if you just installed Java for the first time and you have chosen to install the additional software, however adding it into an update and having it enabled by default is just a little too slimy for my liking.