Forms, accessibility and automated testing

by Tina Holmboe, 28th of May 2012

Abstract

In this article we’ll look at one very specific example of how automated testing can improve the accessibility of webpages, and illustrate why solid specifications are an absolute requirement for any testing to be possible.

Disclosure

The author is the owner and operator of Greytower Technologies, a company which specialises in automated quality testing of websites.

Introduction

The debate on whether or not automated testing of accessibility is worthwhile, or even possible, has been ongoing since the mid 1990s, and it is not likely that it’ll reach a conclusion any time soon.

Ignoring, for a moment, the fact that no well–formed and objective definition of “accessibility” has been agreed upon, there are several problems inherent in testing computer interfaces meant for human use.

Regardless of methodology, be it user group, expert opinion, or automation, there are a number of difficulties to sort through. Some issues can be removed by applying statistical methods; some are more ephemeral in nature.

An example often used when discussing automated testing is the inability of software to determine whether alternative content – typically used in non–visual contexts in place of graphical elements – communicates the same information as the original.

This criticism is entirely correct: no algorithm has been invented to determine the meaning of unstructured content, much less compare several meanings and point out discrepancies. The problem, however, is not limited to automated testing.

Unless a human tester is told, beforehand, what the author intended with an image, it is a matter of guesswork to determine whether or not the specified alternative is indeed equal in meaning. Even so, the result will be far better than if left to a computer.

What automated testing can do, and more efficiently than a human, is determine whether or not an alternative has been provided, in particular across sites with a large amount of content.
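As a minimal sketch of such a presence check, the Python below uses the standard library's html.parser to list IMG elements that carry no alt attribute. The names MissingAltFinder and images_without_alt are hypothetical, chosen for this illustration. Note that the check only establishes that an alternative exists; it says nothing about whether the alternative is meaningful.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the position of every IMG element without an alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []  # (line, column) of each offending IMG

    def handle_starttag(self, tag, attrs):
        # An empty alt="" is accepted here: it is commonly used to mark
        # purely decorative images, which is a judgement for a human.
        if tag == "img" and "alt" not in dict(attrs):
            self.missing.append(self.getpos())


def images_without_alt(html):
    """Return the positions of IMG elements lacking an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```

Run across every page of a large site, a check of this kind yields a list of locations for a human to review, which is where the efficiency gain lies.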

It is my contention that no single test procedure can determine whether or not a website is accessible to every possible visitor.

For a truly comprehensive accessibility audit, you will need to employ several methods. In the following we’ll focus on a specific use–case in which automation works better than manual review.

The Problem

When visiting a webpage containing forms, a user will more often than not be able to identify fields and their function from visual clues.

It is, for example, quite likely that a control preceded by the text “first name” is meant for your first name.

For someone with low or no vision, possibly even accessing the page in a linear fashion, the association between a form control and the textual label assigned to it is not nearly as simple to make.

Recognising this problem, the HTML 4 specification introduced the LABEL element, intended for use with form controls that had no implicit labels. One consequence was that assistive technology such as speech browsers could explicitly associate a label with a control.

This, in turn, meant that a visually impaired user could access form controls directly via their labels in a deterministic manner. No more guesswork required to figure out if this piece of text belonged to that field.

To make this work, authors would have to either wrap their form controls inside a label element:

<label>
 Area code: <input type="text" name="areaCode">
</label>

or associate the label with the control using the ID and FOR attributes:

 <label for="areaCode">Area Code</label>: <input type="text" name="areaCode" id="areaCode">

Both methods are reasonably simple to detect programmatically. In the first case, an algorithm could determine that

Each and every form control was a descendant of a LABEL element
Each LABEL element had a form control child element
Each label text was unique
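The three checks above can be sketched as follows, using Python's standard html.parser. This is an illustration under stated assumptions, not an officially sanctioned algorithm: the class and function names are hypothetical, only INPUT, SELECT and TEXTAREA are treated as form controls, and input types such as hidden or submit are assumed to need no label.

```python
from html.parser import HTMLParser

CONTROLS = {"input", "select", "textarea"}
# Input types assumed (for this sketch) to need no separate label.
UNLABELLED_TYPES = {"hidden", "submit", "reset", "button", "image"}


class ImplicitLabelChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.issues = []         # human-readable findings
        self.seen_texts = set()  # label texts encountered so far
        self.label = None        # state of the LABEL being parsed, if any
        self.inside_control = 0  # depth inside SELECT/TEXTAREA content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        line, _ = self.getpos()
        if tag == "label":
            self.label = {"text": [], "controls": 0, "line": line}
        elif tag in CONTROLS:
            if tag != "input":
                self.inside_control += 1
            elif (attrs.get("type") or "text").lower() in UNLABELLED_TYPES:
                return
            if self.label is None:
                self.issues.append(f"line {line}: <{tag}> is not inside a LABEL")
            else:
                self.label["controls"] += 1

    def handle_endtag(self, tag):
        if tag in ("select", "textarea"):
            self.inside_control = max(0, self.inside_control - 1)
        elif tag == "label" and self.label is not None:
            self._close_label()

    def handle_data(self, data):
        # Option text and textarea content are not part of the label text.
        if self.label is not None and not self.inside_control:
            self.label["text"].append(data)

    def _close_label(self):
        line = self.label["line"]
        text = " ".join("".join(self.label["text"]).split())
        if self.label["controls"] == 0:
            self.issues.append(f"line {line}: LABEL contains no form control")
        if text in self.seen_texts:
            self.issues.append(f"line {line}: label text '{text}' is not unique")
        self.seen_texts.add(text)
        self.label = None


def check_implicit_labels(html):
    """Return a list of issues for a human operator to study."""
    checker = ImplicitLabelChecker()
    checker.feed(html)
    checker.close()
    return checker.issues
```

For the wrapped “Area code” example shown earlier, this sketch reports no issues; a control placed outside any LABEL, an empty LABEL, or a repeated label text each produce one entry.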

In the second case, the implementation is more complex:

Each LABEL element has a FOR–attribute with a non–null value
Each LABEL has non–graphical, non–null content
Each form control has an ID–attribute with a non–null value
Each ID has a corresponding FOR
Each ID/FOR pair is unique to the document

yet still straightforward. In both cases failure to comply with the rules set up should be flagged as, at the very least, an issue to be studied in more detail by a human operator.
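The rules for the second case might be sketched like this, again with Python's standard html.parser. The names are hypothetical, and the sketch makes two simplifying assumptions: only INPUT, SELECT and TEXTAREA count as form controls, and ID uniqueness is checked among controls only rather than across every element in the document.

```python
from collections import Counter
from html.parser import HTMLParser

CONTROLS = {"input", "select", "textarea"}
# Input types assumed (for this sketch) to need no separate label.
UNLABELLED_TYPES = {"hidden", "submit", "reset", "button", "image"}


class ExplicitLabelCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.labels = []       # [for_value, text] per LABEL element
        self.control_ids = []  # id value ("" if absent) per form control
        self.open_label = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self.open_label = [attrs.get("for") or "", ""]
            self.labels.append(self.open_label)
        elif tag in CONTROLS:
            if tag == "input" and (attrs.get("type") or "text").lower() in UNLABELLED_TYPES:
                return
            self.control_ids.append(attrs.get("id") or "")

    def handle_endtag(self, tag):
        if tag == "label":
            self.open_label = None

    def handle_data(self, data):
        # Only text counts; an image inside a LABEL contributes nothing.
        if self.open_label is not None:
            self.open_label[1] += data


def check_explicit_labels(html):
    """Return a list of issues for a human operator to study."""
    collector = ExplicitLabelCollector()
    collector.feed(html)
    collector.close()
    issues = []
    for_counts = Counter(f for f, _ in collector.labels if f)
    id_counts = Counter(i for i in collector.control_ids if i)

    for for_value, text in collector.labels:
        if not for_value:
            issues.append("LABEL without a FOR value")
        if not text.strip():
            issues.append(f"LABEL for '{for_value}' has no text content")
    for control_id in collector.control_ids:
        if not control_id:
            issues.append("form control without an ID value")
    for control_id, count in id_counts.items():
        if control_id not in for_counts:
            issues.append(f"ID '{control_id}' has no corresponding FOR")
        if count > 1:
            issues.append(f"ID '{control_id}' is used by {count} controls")
    for for_value, count in for_counts.items():
        if for_value not in id_counts:
            issues.append(f"FOR '{for_value}' matches no control ID")
        if count > 1:
            issues.append(f"FOR '{for_value}' is used by {count} labels")
    return issues
```

Collecting first and comparing afterwards means the order of LABEL and control in the markup does not matter, which matches how the FOR and ID attributes are allowed to be used.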

If a site to be tested contains a large number of pages with forms, automated testing in this fashion can save resources.

W3C’s WCAG 2 recommendation on creating accessible content covers labels and form controls under the level ‘AA’ SC 2.4.6: “Headings and Labels”. Combined with SC 1.3.1 “Info and Relationships”, it becomes clear that a control and its label must be associated with each other and that the association must be programmatically determinable.

If it is not, the page will not meet WCAG 2 (AA) — but worse, a browser or AT must guess, a highly undesirable solution.

Note that the algorithms described are not officially sanctioned by the W3C.

A Conclusion

We conclude that in order to make forms accessible, a browser or an AT must know that a label and a form control are associated; WCAG 2 requires it, and an automated testing tool can easily and objectively verify whether this has been done.

The HTML 5 Question

WCAG 2 references an HTML–specific technique (H44), which describes the same method for associating controls and labels as we’ve mentioned above.

This method applies not only to HTML 4.*, but to HTML 5 and the “living HTML” specification; the syntax is, and may possibly remain, identical.

That said, both the W3C and WHATWG versions of the HTML 5/living HTML specifications contain the following description of the TITLE attribute; a description which raises a number of questions:

The title attribute represents advisory information for the element, such as would be appropriate for a tooltip. On a link, this could be the title or a description of the target resource; on an image, it could be the image credit or a description of the image; on a paragraph, it could be a footnote or commentary on the text; on a citation, it could be further information about the source; on interactive content, it could be a label for, or instructions for, use of the element; and so forth. The value is text.

If this attribute is omitted from an element, then it implies that the title attribute of the nearest ancestor HTML element with a title attribute set is also relevant to this element.

Some elements, such as link, abbr, and input, define additional semantics for the title attribute beyond the semantics described above.

The text, quoted from the specification as of the 8th of May 2012, appears to imply that the following would be one correct way of associating a label with a form control:

 <input type="text" name="areaCode" title="Area Code:">

and, alternatively, that this would also be correct:

 <p title="Area Code:">
  <input type="text" name="areaCode">
 </p>

These two methods may even fulfil the “sufficient technique” criteria of WCAG 2, and thereby pass SC 2.4.6. At this time, however, we have yet to find an algorithm which can programmatically determine the relationship between label and control in the code above, and are forced to conclude that it is quite possible to write HTML 5 code which cannot meet WCAG 2 level ‘A’.

Since the TITLE attribute can also be used for non–label purposes, such as advisory text, it is not at this time possible to automate a test which can tell the two purposes apart.
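What a tool can do is narrow the search. The sketch below, with hypothetical names and using Python's standard html.parser, lists form controls that have no LABEL association and whose only candidate label is a title attribute, either their own or one inherited from the nearest ancestor. It deliberately makes no attempt to decide whether the title is a label or merely advisory text; that judgement is left to a human.

```python
from html.parser import HTMLParser

CONTROLS = {"input", "select", "textarea"}
# Elements without an end tag; they are never pushed on the open-element stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TitleOnlyControlFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []             # (tag, title or None) per open element
        self.labelled_ids = set()   # values of all FOR attributes seen
        self.candidates = []        # (id, title, "own" | "inherited")

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            self.labelled_ids.add(attrs["for"])
        wrapped = any(open_tag == "label" for open_tag, _ in self.stack)
        if tag in CONTROLS and not wrapped:
            own = attrs.get("title")
            inherited = next((t for _, t in reversed(self.stack) if t), None)
            if own:
                self.candidates.append((attrs.get("id"), own, "own"))
            elif inherited:
                self.candidates.append((attrs.get("id"), inherited, "inherited"))
        if tag not in VOID:
            self.stack.append((tag, attrs.get("title")))

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating omitted end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def title_only_controls(html):
    """Return controls for manual review: no LABEL, but a title is in scope."""
    finder = TitleOnlyControlFinder()
    finder.feed(html)
    finder.close()
    # Filter after parsing, since a LABEL may follow the control it names.
    return [c for c in finder.candidates if c[0] not in finder.labelled_ids]
```

Both of the title–based examples above end up on the review list, one marked "own" and one marked "inherited", while a control with a proper FOR/ID association does not, even if it also carries a title.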

It is unclear how browsers and ATs are supposed to handle this association.

In the end the expert user may have more luck determining whether or not the title should be viewed as a label — but for a user arriving on such a page it is an open question as to how the above code should be interpreted.

Conclusion Number Two

The HTML 4 specification of labels and controls made it easy to understand how the two types of information should be associated.

HTML 5 — or the “living HTML” specification — is not at all as clear, since it extends the possible interpretation of the title attribute to also cover “interactive content”.

In the latter case, despite the specification being the more modern one, both users and testers are left with an open–ended interpretation of content, which makes it much harder to present the information consistently. This, in particular, will impact people with abilities outside the norm.

References

Please note that these references are not necessarily endorsed by Greytower. Links are valid as of the 24th of May 2012.

1.3.1 Info and Relationships
http://www.w3.org/TR/2008/REC-WCAG20-20081211/#content-structure-separation-programmatic
W3C, 11th of December 2008

2.4.6 Headings and Labels
http://www.w3.org/TR/2008/REC-WCAG20-20081211/#navigation-mechanisms-descriptive
W3C, 11th of December 2008

3.2.3.2 The title attribute
http://www.whatwg.org/specs/web-apps/current-work/multipage/
WHATWG, 8th of May 2012

3.2.3.2 The title attribute
http://dev.w3.org/html5/spec/single-page.html
W3C, 8th of May 2012

Providing descriptive labels
http://www.w3.org/TR/WCAG20-TECHS/G131.html
W3C, 11th of December 2008

Using label elements to associate text labels with form controls
http://www.w3.org/TR/WCAG20-TECHS/H44.html
W3C, 11th of December 2008

Is this HTML5?
http://www.whatwg.org/specs/web-apps/current-work/multipage/introduction.html#is-this-html5?
WHATWG, 15th of May 2012

Sufficient and Advisory Techniques
http://www.w3.org/TR/UNDERSTANDING-WCAG20/intro.html#introduction-layers-techs-head
W3C, 11th of December 2008