Between the Devil and the Deep Blue Sea

[This piece was written for Public Sector Forums and is cross-posted here to allow comments from those who don't have access to that site.]

Or SOCITM and SiteMorse vs. the ODPM and Site Confidence...

There's a pithy saying, much loved by researchers and statisticians, that goes something like this:

Be sure to measure what you value, because you will surely come to value what you measure.

Sage advice, which you would be wise to follow whatever business you're in. In reality, a greater danger comes from the likelihood that others will come to value what you measure, or worse still, what others measure about you.

We're all familiar with the arguments for and against automated web testing. It should form an integral part of any web team's quality assurance policy, and can save enormous amounts of time pinpointing problems buried deep in your site. By itself an automated testing tool can be a valuable aid in improving the quality of your website. But when automated tests are used to compare websites, the problems start to come thick and fast. The recent disparity between the 'performance' tests from SiteMorse and Site Confidence is a case in point.

Who can you trust? SiteMorse will tell you that their tests are a valid measure of a site's performance. Site Confidence will tell you the same. Yet, as previously reported on PSF, the results from each vary wildly. SOCITM have offered this explanation for the variation:

"The reality is that both the SiteMorse and Site Confidence products test download speed in different ways and to a different depth. Neither is right or wrong, just different."

And therein lies the real problem. If both are valid tests of site performance then neither is of any value without knowing precisely what is being tested, and how those tests are being conducted. The difficulty is that no-one is in a position to make a judgement about the validity of the tests, because no-one outside of the two companies knows the detail.

It's worryingly easy to pick holes in automated tests. Site Confidence publishes a 'UK 100' benchmark table on its website, and at the time of writing it has Next On-Line Shopping sitting proudly at number 1, with an average download speed of 3.30 sec for a page weighing 15.33kb. The problem is that the Next homepage is actually over 56kb. At number 5 is Thomas Cook, with a reported page size of 24.92kb, when in fact it's a whopping 172kb. Where does the problem lie in this case? Are the sites serving something different to the Site Confidence tool? Is the tool missing some elements, perhaps those referenced within style sheets, or those from different domains? The real problem is that we can't tell from the information provided, and the same holds true for the SiteMorse league tables.
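To see how easily two tools can arrive at different figures, here is a minimal sketch in Python of the kind of page-weight measure such a tool might use. It counts the HTML plus only those assets referenced directly from img, script and link tags; anything pulled in from within a style sheet, or added by script, is invisible to it. Everything in the sketch - the tags inspected, the URL, the handling of failures - is my own assumption; neither company has published its method.

    # A naive page-weight measure: the HTML itself plus any assets
    # referenced directly from img, script and link tags. Assets pulled
    # in from within style sheets (background images, @import) or added
    # by script are invisible to it - one plausible source of the
    # discrepancies described above. All names here are illustrative.

    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class AssetCollector(HTMLParser):
        """Collects the asset URLs a simple tool might notice."""
        def __init__(self):
            super().__init__()
            self.assets = set()

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag in ("img", "script") and attrs.get("src"):
                self.assets.add(attrs["src"])
            elif tag == "link" and attrs.get("href"):
                self.assets.add(attrs["href"])

    def naive_page_weight(url):
        html = urlopen(url).read()
        collector = AssetCollector()
        collector.feed(html.decode("utf-8", errors="replace"))
        total = len(html)
        for asset in collector.assets:
            try:
                total += len(urlopen(urljoin(url, asset)).read())
            except OSError:
                pass  # a real tool must also decide how failures count
        return total

    if __name__ == "__main__":
        print("%.2fkb" % (naive_page_weight("http://www.example.com/") / 1024.0))

A tool built along these lines would happily report 15kb for a page whose style sheets pull in another 40kb of images, which is one plausible route to the kind of discrepancy shown in the benchmark table.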

A few associates and I have been in correspondence with SOCITM for some months now about the use of automated tests for Better Connected. To date, the responses from SOCITM have not completely alleviated our concerns. While some issues have been addressed by SiteMorse, many remain unanswered, and perhaps the greater concern is the attitude of SOCITM. For example, when pressed on why SOCITM hadn't sought a third-party view of SiteMorse's testing methods, the response was:

You wonder why we have not done an independent audit of the SM tests. To date when detailed points have been raised, SM has found the reason and a satisfactory explanation, almost always some misunderstanding of the standard, or some problem caused by the CMS or by the ISP. In other words, there has been little point in mounting what would be an expensive exercise. You may, of course, not be satisfied with the explanations in the attached document to this set of detailed points.

I'll leave you to draw your own conclusions from that response, other than to say that I wasn't the slightest bit comforted by it.

Our concerns extend beyond Better Connected to the publication of web league tables in general. The fact is that we know very little about how SiteMorse conduct their tests, or what they are actually measuring. In some cases SiteMorse, or any testing company, will have to impose their own interpretation of guidelines and recommendations in order to test against them, and will have to make assumptions about what effect a particular problem might have on a user. For example, SiteMorse will report an error against WCAG guideline 1.1 if the alt attribute of an image contains a filename, despite there being legitimate circumstances where such an alt attribute might be required (a page cataloguing the image files themselves, say). The fact is there are only two WCAG guidelines which can be wholly tested by automated tools.
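A hypothetical version of that alt attribute check, again in Python, shows how blunt the heuristic is. Both the rule and the regular expression are my own illustration, not SiteMorse's actual test, which has never been published:

    import re

    # Flag any alt text that looks like an image filename. Both the rule
    # and the regular expression are illustrative, not SiteMorse's
    # actual (unpublished) test.
    FILENAME_ALT = re.compile(r"^[\w-]+\.(gif|jpe?g|png)$", re.IGNORECASE)

    def looks_like_filename(alt_text):
        return bool(FILENAME_ALT.match(alt_text.strip()))

    # The check fires on lazy markup...
    assert looks_like_filename("header.gif")
    # ...but just as readily on a legitimate case, such as a page that
    # catalogues the image files themselves.
    assert looks_like_filename("logo-300dpi.png")

The check fires just as readily on the legitimate case as on the lazy one, and only a human reviewer can tell the two apart.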

While SOCITM make no use of the accessibility tests from SiteMorse, there are similar concerns about performance tests that are based on no recognised standard, or that have no impact on users. For example, SiteMorse raises a warning for title elements with a length of more than 128 characters, citing the 1992 W3C Style Guide for Online Hypertext as the source of the guidance. The guide is at best a good read for those with an interest in the history of the web, but for SiteMorse to use it as the basis for testing sites over a decade later is highly questionable. To quote from the first paragraph of the guide:

It has not been updated to discuss recent developments in HTML, and is out of date in many places, except for the addition of a few new pages, with given dates.

SiteMorse justifies the use of this test in league tables by saying that many browsers truncate the title in the title bar. But this ignores the fact that the title element is used for more than just title bar presentation (search engine indexing, for example), and that the point of truncation depends on the size of the browser window (at 800x600 on my PC, using Firefox, the title is truncated at 101 characters). While the test may be useful as a warning to a web developer, who can then review the title and make its language as clear and concise as possible, it certainly should not be used as an indicator in the compilation of league tables.
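For completeness, here is what a fixed-threshold check of this kind amounts to - a sketch of my own, with the 128-character figure taken from the 1992 guide, not from SiteMorse's code:

    TITLE_LIMIT = 128  # the figure taken from the 1992 style guide

    def check_title(title):
        """Warn on titles longer than a fixed threshold, as the test
        described above appears to do. Any single number here is
        arbitrary: actual truncation depends on the browser, the font
        and the width of the window."""
        if len(title) > TITLE_LIMIT:
            return "warning: title is %d characters (limit %d)" % (
                len(title), TITLE_LIMIT)
        return None

    print(check_title("A county council homepage title, padded well past the limit " * 3))

Whatever number replaces 128, the threshold remains arbitrary: it tells you nothing about what any given user will actually see.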

From our correspondence it became clear very quickly that SOCITM don't know much about how SiteMorse tests either - as the response above shows, the company's explanations have been accepted at face value, and no independent expert view has been sought.

In most other arenas league tables are based on clear and transparent criteria. Football, exam results, Olympic medals - all rely on known, verifiable facts. Unfortunately the same cannot be said of the current local authority website league tables.

Our main assertion is that SOCITM should be working with local authorities and UK e-standards bodies (if there are any left) to produce a specification for the testing of websites using meaningful, independently assessed measures which are based on consensus, rather than blindly accepting the existing, opaque tests offered by SiteMorse, Site Confidence or any other private concern. There needs to be a public discussion about precisely what we should be measuring, how those measurements are conducted and what conclusions it would be valid to draw from the results.

In the end it all comes down to a question of credibility - for Better Connected, SOCITM, the testing companies, and most importantly those of us who are responsible for local authority websites. It's likely that league tables are here to stay, but unless we are prepared to question the numbers behind the tables, and the way those numbers are produced, we're probably getting what we deserve.

Comments

Dan,

Great piece. I think the most worrying thing about league tables in general is their ability to confer apparent simplicity on an issue which is in fact extremely complex and multidimensional. The attempted simplification takes the debate away from what really matters - that as many people as possible are able to use these websites.

This article actually reminds me of that scene in Donnie Darko where Donnie is asked to grade a real-life moral conundrum on a continuum between 'love' and 'fear'. After trying to argue that life is not that simple, he chooses instead to tell his teacher to stick the continuum up her ass. Which is in effect what you're trying to do here. Good work!

Posted by: James Newbery at February 28, 2006 11:16 AM

Dan, as you're no doubt already aware, I'm very much prepared to give of my free time to contribute to any debate on the matter. However, while I work for an LA and understand some of the difficulties they face, I can't presume to speak for them and can only give my own opinion...

Posted by: JackP at March 1, 2006 9:18 AM

Dan,

This is an excellent and well-timed piece. I raised questions about Site Morse's methods on PSF in the early days, but they avoided answering them directly.

I am particularly annoyed by Site Morse's tactics. When I refused to buy their reports and questioned their methods they started emailing our Chief Exec, pointing out how 'bad' our site was.

As we redevelop our Council's site, which we recognise has failings - that's why we are reworking it - we are concentrating on accessibility and accuracy of information. We're getting The Shaw Trust, who I know you've used too, to test our site and I'm happy to work with them to ensure that we make the site available to all users.

I'd be happy to collaborate in any move to get a testing service set up by Local Authorities that we can all trust and use as a benchmark. Site Morse isn't that in my personal view.


Keep up the good work

Ian

Posted by: Ian Watt at March 1, 2006 11:00 AM

I had a similar experience to Ian's, and it's galling that a company which acts like this can also claim credibility as 'the' benchmark because it is used by SOCITM. What have SOCITM to say about these tactics, and, given their blind faith, about the implied recommendation of the service?
I'm a huge fan of transparency - I positively request feedback and insight into where my work requires attention - but such feedback must be credible. I place far more faith in personal feedback from our website than in any automated test result.

Posted by: Doug Finnie at March 1, 2006 11:23 AM

@Jack: It's an open debate, you need not represent your employer to make a valuable contribution (and I know you have a valuable contribution to make).

@Ian: SiteMorse's business model is founded on their league tables, and the FUD they spread among senior staff who can't be expected to know any better. My primary intention isn't to bash SiteMorse, though; it's to try to move towards a set of indicators which are meaningful, understood and transparent.

Glad to hear you're working with the Shaw Trust, my experience with them was entirely positive and our site is much improved from the assistance they provided.

@Doug: I do have a problem with SOCITM's implicit endorsement of SiteMorse's league tables. On page 207 of Better Connected 2006 we're told that SOCITM don't use the league tables, but do use some of the results from the tests that go into the league table. Am I the only one who sees that as a paper-thin distinction?

Posted by: Dan at March 1, 2006 2:51 PM

Dan, on the 14th February, Ian Dunmore wrote an article on Public Sector Forums titled "Website Rankings Nonsense: Better Connected v ODPM" referring to the SiteMorse / Site Confidence conflict, which contains an interesting sentence ( http://www.publicsectorforums.co.uk/page.cfm?LANGUAGE=eng&pageID=993 ):

"Just as well Socitm Better Connected's Martin Greenwood confirms the Site Morse Local Authority Website League Tables WON'T be used to assess council sites for this year's report given the fact they bear no resemblance to a similar but different league produced at the end of last year for the ODPM."

Do you know if there is any truth in this statement that SOCITM have ditched SiteMorse?

Posted by: Isofarro at May 6, 2006 8:00 PM

Isofarro, I think that sentence in the PSF article refers to this statement on page 207 of the Better Connected report:

"The SiteMorse results are perhaps best known in the form of the SiteMorse league tables, which bring the results together using weightings assigned by the company to all the different tests. Socitm Insight interpretation does not use the league tables at all, but does use the results of some of the tests that go into the league table."

So no, I don't think SOCITM have ditched SiteMorse, preferring instead to take the company's word on trust and not seek an independent view of the validity of the test results or the conclusions drawn from them.

Posted by: Dan at May 9, 2006 9:26 AM
