On Testing the DOM and Karl Groves’ Accessibility Challenge

by Tina Holmboe, 8th of October 2012

A fellow accessibility enthusiast sent us a link to Karl Groves’ blog yesterday — as the topic was accessibility testing tool vendors we were quite naturally interested. After all, we are one!

For those who don’t know him, he has a resume up on his site.

Us here over at GTHQ are, perhaps not surprisingly, in agreement with some, in fact most, tho not all, of what the article says. Sadly, we are utterly in disagreement with the WordPress software; it appears we can’t get an account to comment ’pon the criteria and issues raised.

So, here is our response, as a tool vendor, to Karl’s challenge. Let’s be clear about one thing, however: we will likely never be recommended by him — for one simple reason. Let’s look at his five criteria, in reverse order.

Responsive Customer Support

In cases where a customer needs help, is experiencing problems using or understanding the product, or has a bug, how do you handle it? Given the above — that no software is perfect — how you support customers with problems says a lot. If I recommend a tool to someone and they’re given poor customer support, that would make me look bad.

It’s impossible to disagree with this. As a vendor, even one with staff actually interested in the accessibility biz, we don’t live off the goodness of our hearts, but rather off our ability to respond, and respond timely, to the needs of our clients.

Pure selfishness: a dissatisfied client is lost income, and worse. A client who is not happy with us will not recommend us to others. Any vendor, be it of testing tools or anything else, must keep this firmly in mind.

Heck, Karl, if we do poor customer support it’ll reflect badly on both of us. We do our best to avoid that situation. Are we perfect? No — but I, the author, firmly believe we do our very best.

This part of the challenge we meet.

Bug–free Features that Work as Stated

No software is ever perfect. Anyone who expects perfection is unrealistic. In fact, that’s what these tools are meant to find out, right? So it is understandable that at times tools will have bugs. But if specific features do not work as intended — or don’t work at all — that calls into question your entire approach to quality. This is especially true in cases where blatant bugs to core features go ignored. Not caring about quality is akin to not caring about your customers, in my opinion.

Bugs and us are old friends. There’s not a whole lot we can do to ensure our product, siteSifter, is perfectly bug free at all times, but what we can do, we do.

Ignoring the various theories on whether or not non–trivial software can be proven error free, our job is to feed siteSifter non–structured, non–consistent data and try to make sense of it.

We operate, in effect, on chaos. No matter what markup, for example, is given to us, we should, in one way or another, interpret it in a useful manner, if not a correct one.

No, the idea here is not to be perfect, but rather to ensure that when a bug is discovered, and reported, it is also handled.

To do so properly, we have a number of different ways to report issues, a system for giving users updated feedback on problems, and detailed bug tracking software available to clients.

In addition we run automated systems which inform us when things do not behave according to expectation. When stuff goes bad, we often begin our attack on the problem before it is reported.

Why would anyone do otherwise? Here we live according to the Perl Virtues: laziness (we’d rather not have to work tomorrow!), impatience (why wait to fix a problem? That way we don’t have to work tomorrow!) and hubris (we’re pretty sure we can fix problems before they are reported. That way we don’t have … you get the idea).

We meet this part of the challenge as well.

Make the Tool User–Friendly

This should go without saying: Tools which are difficult to use won’t get used. In sales, the cost & time of gaining a new customer is always higher than getting repeat business from an existing customer. So if your business model is based on license sales, it would seem smart to make a tool that the customer wants to renew their license for. Surprisingly it seems most tool vendors don’t quite get this. Here’s a hint: If your users are complaining that the tool is hard to use, then you need to fix it. The customers aren’t stupid. They don’t need more training, they need a tool that is easy to use. If I don’t have confidence that my customers will be able to use the tool successfully, I won’t recommend it.

And boy is this a tricky one: what, exactly, is a “user–friendly tool”? We can’t say. siteSifter is a user support tool; it won’t ever tell you that something is or isn’t accessible — it requires that the tester has some background or knowledge.

It doesn’t require extensive knowledge, on the other hand.

It’s also a complex tool — you can do quite a number of tricks here. So we’ve picked a middle way: the public facing, web based, interface to siteSifter is designed to be:

Easy to use. Don’t make the simple complicated or vice versa.
Accessible. Yes, we test ourselves against established standards.
Light. A “darn–that’s–simple” tool which takes ages to do something is user–UNfriendly.
Flexible. You can adjust the interface to a large degree. What is friendly to one isn’t necessarily friendly to another.

Do we meet the challenge? We can’t say — only our users can answer this, and some of them claim we have the best user interface they’ve seen.

Provide Fast, Accurate, and Reliable Results

This is why people buy your tool: They want to do testing and get fast, accurate, and reliable results. They don’t want to sift through false positives and they don’t want to have to “scrub” the reports to filter out repetitive, vague, inaccurate, irrelevant, erroneous, or misleading results. Unfortunately, this is what happens with most tools. Over time this has gotten better, but it still sometimes feels like tool vendors want to offload the work to the user. This is inconvenient, at best, and in my experience is the most often cited reason why tools become abandonware.

Understand this: Absent any real expertise on the part of the user of the tool, they will explicitly trust the results they get from a tool. If I can’t trust that the tool will give my client accurate and reliable results, I won’t put my name on a recommendation.

This is another criterion with which no–one can disagree, at least not in principle.

In reality, however, there is A Problem here: accessibility testing is not like dusting crops. Sure, a tool can return to you the number of images that have no ALT attributes. That’s easy.

siteSifter can tell you which form controls lack labels. Bit more complicated, but still easy — and easily verifiable and testable. We have thousands of real life test pages to work with, but we can’t predict the human mind.
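To make the two “easy” checks concrete, here is a minimal sketch in Python, using only the standard library’s html.parser. It is an illustration and not how siteSifter works: the class name SourceChecker, the list of exempt input types and the sample page are all assumptions made up for this example, and it ignores other labelling techniques such as aria-label or title.

```python
from html.parser import HTMLParser


class SourceChecker(HTMLParser):
    """Hypothetical checker: finds images without an alt attribute and
    form controls with no associated <label>, reading the raw markup."""

    LABELLABLE = {"input", "select", "textarea"}
    # Assumption: these input types do not need a visible label.
    EXEMPT_TYPES = {"hidden", "submit", "reset", "button", "image"}

    def __init__(self):
        super().__init__()
        self.images_without_alt = []  # (line, src)
        self.controls = []            # (line, tag, id, wrapped_in_label)
        self.label_targets = set()    # values seen in <label for="...">
        self.label_depth = 0          # > 0 while inside a <label> element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        line = self.getpos()[0]
        if tag == "img" and "alt" not in attrs:
            # alt="" is a legitimate choice for decorative images,
            # so only a missing attribute is reported.
            self.images_without_alt.append((line, attrs.get("src")))
        elif tag == "label":
            self.label_depth += 1
            if attrs.get("for"):
                self.label_targets.add(attrs["for"])
        elif tag in self.LABELLABLE:
            input_type = (attrs.get("type") or "text").lower()
            if tag == "input" and input_type in self.EXEMPT_TYPES:
                return
            self.controls.append(
                (line, tag, attrs.get("id"), self.label_depth > 0)
            )

    def handle_endtag(self, tag):
        if tag == "label" and self.label_depth > 0:
            self.label_depth -= 1

    def unlabelled_controls(self):
        # Resolved after parsing, since a <label for> may follow its control.
        return [
            (line, tag)
            for line, tag, ident, wrapped in self.controls
            if not wrapped and ident not in self.label_targets
        ]


page = """<form>
<img src="logo.png">
<img src="spacer.gif" alt="">
<label for="name">Name</label><input type="text" id="name">
<label>Email <input type="text" name="email"></label>
<input type="text" name="phone">
<input type="hidden" name="token">
</form>"""

checker = SourceChecker()
checker.feed(page)
checker.close()
print(checker.images_without_alt)     # [(2, 'logo.png')]
print(checker.unlabelled_controls())  # [(6, 'input')]
```

Counting is the easy part. Whether “logo.png” should have had a text alternative, and what it ought to say, is still for the tester to judge.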

So I’d claim we pass the challenge, with a fair caveat: we cannot produce accurate or reliable test results of practices not documented and not before seen by us.

We’ll give you results that are verified by our internal tests, minus the bugs, and a host of different report formats so that you can tailor the output based on your needs. But we won’t remove data which one client finds useful because another client finds it useless: we’ll give you built–in filters instead.

And yet.

We’re good, but we’re not prescient.

Test the Browser DOM

I can’t say this often enough or loudly enough: If your tool doesn’t test web documents as rendered in a browser, it isn’t testing the right thing. No, using jDOM or PHP’s various DOM extensions or anything that “makes” a DOM isn’t the same. The product must reside in the browser or use a headless browser and must use the information from the browser DOM to do its testing. As a tool vendor you should know why these two are not the same thing and why the browser DOM is vital. If you don’t know why this is important or refuse to acknowledge it, get out of the tool business. I absolutely won’t recommend an automated testing tool which doesn’t test the DOM.

And, finally, we come to the bits we disagree with, rather loudly. First of all:

Disagreeing with an opinion, provided that the disagreement is based on facts and analysis, is not a reason for claiming that the other party is not competent.

So, no, we won’t get out of the tool business. Personally I find that part a little low, actually. That said, it is a legitimate concern.

We do not test the browser DOM. Any browser DOM, to be specific – and there are several good reasons for this choice.

siteSifter is a tool designed to test more than one page at a time, occasionally across multiple sites, and cannot practically “reside in the browser”.

It’s project–oriented and meant to be used by teams of one or more people not necessarily in the same geographical location, and as a consequence requires both interface and finished test to be remotely accessible to more than a single person. It isn’t a toolbar, and can never be one.

Can we do DOM testing? Theoretically, yes. Projects such as Selenium give us the ability to retrieve the content of a browser–generated node tree.

Practically speaking, no. If a client sets up a project to test a few thousand (not uncommon) or tens of thousands (has happened) of pages, driving a full browser for every single page becomes a question of resources.

But this is not the entire story. Resources can be extended if need be. A more fundamental problem exists:

When a browser builds a node tree, the result is dependent on the quality of the code going in, and the construction of the parser algorithms involved. It is not a stretch to say that different browsers have vastly different strategies for handling poor code.

In order to get around the problem of UA–specific bugs, we would need to test the DOM from a number of different browsers and versions — say a good spread of Opera, Internet Explorer, Firefox, Chrome, Safari and Lynx.

What do we gain? Can we be certain, for example, that the finished node tree is not sterilised? When handling poorly constructed code, it would be logical for a parser to do as much error recovery as possible, and not turn out an equally poor DOM.

It is this finished node tree which is exposed to assistive technologies through, for example, the MSAA or AT–SPI interfaces. If the parsing has “cleaned up” the original code in one browser, problems which will show up in others might be camouflaged.

Focusing too much on the DOM means focusing on the browser handling, which can change between minor versions and not even be representative of your users, and not on the site issues.

In short: we test the code that goes into the parser, not the tree that comes out; the technique has both advantages and disadvantages.
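As an illustration of what testing the code that goes in can catch, here is a small Python sketch which reports mis-nested and unclosed tags straight from the source, before any browser has had a chance to repair them. The names NestingChecker and check_nesting are made up for this example, and the check is deliberately strict: HTML allows some end tags (such as those of p and li) to be omitted, which a real tool would have to take into account.

```python
from html.parser import HTMLParser

# Void elements never take an end tag, so they are never "left open".
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Hypothetical strict checker: tracks open tags on a stack and
    records every place where the source markup does not nest cleanly."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, line where it was opened)
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        line = self.getpos()[0]
        if tag not in [name for name, _ in self.open_tags]:
            self.problems.append(f"line {line}: </{tag}> closes nothing")
            return
        # Pop until the matching start tag; anything popped on the way
        # was still open, which a browser would silently repair.
        while self.open_tags:
            open_tag, open_line = self.open_tags.pop()
            if open_tag == tag:
                break
            self.problems.append(
                f"line {open_line}: <{open_tag}> still open "
                f"at </{tag}> on line {line}"
            )


def check_nesting(markup):
    """Return a list of nesting problems found in the raw markup."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    for open_tag, open_line in checker.open_tags:
        checker.problems.append(f"line {open_line}: <{open_tag}> never closed")
    return checker.problems


sample = ("<p><b>bold <i>both</b> italic</i></p>\n"
          "<div><span>unclosed</div>")
problems = check_nesting(sample)
for problem in problems:
    print(problem)
```

A browser given the same sample will happily build a tidy tree out of it; how it does so is up to its parser. Here the three problems stay visible, no matter which browser the visitor ends up using.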

On the flip side, testing the DOM could mean letting a specific browser create a node tree, then executing associated JavaScript, including, for example, searching the tree for elements with onClick event handlers and running the code.

But what if that click event did something modal like alert()ing? Or what if it did something an automated tool shouldn’t do (like make a POST request)?
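One cautious alternative to running the handlers is to list them and flag the ones that look risky, without executing anything. The Python sketch below does that. The name HandlerScanner and the two patterns are assumptions invented for this example; they are crude heuristics, easily fooled, and they only see inline onClick attributes, not handlers attached from script.

```python
import re
from html.parser import HTMLParser

# Hypothetical heuristics: text patterns that hint at what a handler
# might do if an automated tool were to trigger it.
RISKY = {
    "modal dialog": re.compile(r"\b(alert|confirm|prompt)\s*\("),
    "possible request": re.compile(r"\.submit\s*\(|XMLHttpRequest|\bfetch\s*\("),
}


class HandlerScanner(HTMLParser):
    """Lists inline onclick handlers and flags risky-looking ones.
    Nothing is executed; only the attribute text is inspected."""

    def __init__(self):
        super().__init__()
        self.handlers = []  # (line, tag, code, [flags])

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases attribute names, so onClick and
        # onclick both arrive here as "onclick".
        for name, value in attrs:
            if name == "onclick" and value:
                flags = [label for label, pattern in RISKY.items()
                         if pattern.search(value)]
                self.handlers.append((self.getpos()[0], tag, value, flags))


page = """<a href="#" onClick="alert('Saved!')">Save</a>
<button onclick="document.forms[0].submit()">Send</button>
<span onclick="toggleMenu()">Menu</span>"""

scanner = HandlerScanner()
scanner.feed(page)
scanner.close()
for line, tag, code, flags in scanner.handlers:
    print(line, tag, code, flags or "no flags")
```

This tells the tester where to look. It does not tell anyone whether the alert() was intended, which is rather the point.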

Finally, what does this gain our clients? We could, perhaps, summarise: “This JavaScript breaks checkpoint SO and SO in Opera for Windows Vista version Y, but not in Firefox Z for Android version *.*.*, and works perhaps partially in Chrome for Linux version X”. But is it useful data in an accessibility analysis?

We have the capability to analyse the node tree, but we have elected not to use it. We can, and do, run JavaScript, yet cannot determine whether it does what the author designed it to do. (Was that alert() meant to happen, or is it a leftover debug statement for one specific UA on one specific platform?)

Will this change? Possibly, but right now I fear we fail this part of the challenge. We accept, with regret, that this will put us on Karl Groves’ “blacklist”. Our philosophy of testing is just that different.

The optimal test methodology can be debated — and is.

Finally

I’ve used all the big expensive tools, all the free tools and toolbars.

Surely not all the big expensive tools? Then again, perhaps siteSifter does not rate “big” :)

But in general we agree with Karl Groves’ points, even going so far (and this is important) as to admit that siteSifter isn’t perfect either.

The situation, however, is more complex than the Challenge article makes it out to be. Testing tools come in all shapes and sources, and not all decisions are made out of ignorance.

We all need to improve. If you’d like to test siteSifter for a week, Karl, and tell us how WE can get better — drop me a note. I’m always around.

References

A Challenge to Accessibility Testing Tool Vendors
http://www.karlgroves.com/2012/10/03/a-challenge-to-accessibility-testing-tool-vendors/
Karl Groves, 3rd of October 2012

Document Object Model (DOM)
http://www.w3.org/DOM/
W3C, 2005

Selenium
http://seleniumhq.org/

Microsoft Active Accessibility
http://en.wikipedia.org/wiki/Microsoft_Active_Accessibility
Wikipedia

AT–SPI
http://directory.fsf.org/wiki/At-spi
FSF, 12th of April 2011