DO-IT Video Search Meets WCAG 2.0, Part 2

I began writing this blog post yesterday morning as the sun rose in Boulder, Colorado, and Boulder’s landmark Flatirons slowly came into view. Twenty-four hours later, the Flatirons are obscured by falling snow, and Boulder is waking beneath a thin sheet of white. I’m here for Accessing Higher Ground, the annual gathering of higher education access technology professionals. Yesterday I gave a presentation titled DO-IT Video Search: A Case Study in Accessible Multimedia. The link points to an accessible PDF version of my PowerPoint slides.

In my presentation, I discussed and demonstrated DO-IT Video Search, and concluded with a discussion of our efforts to comply with the World Wide Web Consortium (W3C) Web Content Accessibility Guidelines (WCAG) 2.0. This blog post is the second in a series related to these efforts.

Similar to version 1.0, WCAG 2.0 has various levels of conformance, designated as AAA (maximum accessibility), AA (not bad), and A (critical accessibility needs are met). If a web site fails to comply with Level A success criteria, the site is sure to exclude certain groups of people.

I’ve been an advocate for web accessibility since the early 1990s, so many of the WCAG 2.0 success criteria are no-brainers. All of our images have meaningful alternate text for people who can’t see them. All of our videos are captioned for people who can’t hear them (in fact, without captions our video search application wouldn’t even exist, since captions are what make the full text of our videos searchable). All of our pages have good HTML structure, using appropriate markup to designate headings, subheadings, lists, etc.

However, even after working in the accessibility field for two decades, I still had problems meeting a few of the WCAG 2.0 success criteria. Some of these were minor problems, but some were major problems that ultimately prevented us from attaining Level AAA conformance. The following is a summary of those success criteria that we had to think twice about. The major issues are identified as such, and will be described in more detail in future installments in this series.

Level A Conformance


1.2.2 Captions (Prerecorded): Captions are provided for all prerecorded audio content in synchronized media, except when the media is a media alternative for text and is clearly labeled as such.

As noted above, captions are a no-brainer. We wouldn’t think of making video available without them. However, familiarity sometimes leads to complacency, and we discovered (thanks to the WCAG Working Group) that some of our captions were out of sync. How out of sync is too out of sync? This is obviously a judgment call, but in our case captions were off by a few seconds in some places, enough to be a problem both for accessibility and searchability. Since we have nearly forty videos, averaging about fifteen minutes each, correcting the problems required that each video be viewed in its entirety, paused at problem points, and timestamps adjusted accordingly. We created a simple caption editor to facilitate this process, and handed the task off to a couple of patient, detail-oriented students who did a fantastic job identifying and fixing all the caption problems. The entire process took less than a week.
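To make the timestamp adjustment concrete, here is a minimal sketch of the kind of operation a caption editor might perform: shifting every caption from a given point onward by a fixed offset. The record shape (`start`, `end`, `text`, with times in seconds), the function name `shiftCaptions`, and the sample captions are all assumptions made for illustration; they do not describe the actual DO-IT caption editor or its data.

```javascript
// Hypothetical caption record: { start: seconds, end: seconds, text: string }.
// Returns a new array in which every caption starting at or after `fromTime`
// is moved by `offset` seconds (negative = earlier). Times never go below 0.
function shiftCaptions(captions, fromTime, offset) {
  return captions.map((caption) => {
    if (caption.start < fromTime) {
      return { ...caption }; // captions before the problem point are left alone
    }
    return {
      ...caption,
      start: Math.max(0, caption.start + offset),
      end: Math.max(0, caption.end + offset),
    };
  });
}

// Made-up sample data, not taken from any real video.
const captions = [
  { start: 0.0, end: 2.5, text: "First sample caption." },
  { start: 10.0, end: 13.0, text: "Second sample caption." },
  { start: 14.0, end: 17.5, text: "Third sample caption." },
];

// Suppose captions from the 10-second mark onward appear 2 seconds late.
const corrected = shiftCaptions(captions, 10.0, -2.0);
console.log(corrected);
```

Returning a new array rather than editing in place makes it easy to compare the corrected timings against the originals before saving them.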

Keyboard Accessibility

2.1.1 Keyboard: All functionality of the content is operable through a keyboard interface without requiring specific timings for individual keystrokes, except where the underlying function requires input that depends on the path of the user’s movement and not just the endpoints.

2.1.2 No Keyboard Trap: If keyboard focus can be moved to a component of the page using a keyboard interface, then focus can be moved away from that component using only a keyboard interface, and, if it requires more than unmodified arrow or tab keys or other standard exit methods, the user is advised of the method for moving focus away.

Both of these success criteria are problematic because we’re using a Flash media player. There are now several Flash media players available that were designed with accessibility in mind (e.g., controller buttons are labeled and accessible to screen reader users). However, in most browsers the Flash object can’t receive focus unless the user first clicks on it with a mouse. As far as I know, the lone exception is Internet Explorer, which supports tabbing seamlessly into and out of Flash objects. In other browsers, you have to click to get in, and once you’re in, you’re trapped unless you click again. Screen readers are able to bypass all this clicking nonsense, but it’s a serious problem for sighted users who have physical disabilities, or anyone else who is unable to use a mouse or simply doesn’t have one.

One could argue that the success criterion is met because Flash is accessible in IE, but for me that conjures up unpleasant memories from the "Works best in Netscape" days. Our solution was to provide a set of HTML buttons that allows users to control the Flash media player from outside of Flash. The need for this functionality limits our choice of media players, but JW FLV Media Player has worked out wonderfully. It provides all the accessibility features we need (support for captions and audio description, plus screen reader accessibility in some versions), and it has a powerful JavaScript API that allows us to control it externally.
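The general shape of this approach can be sketched as follows. This is an illustration only, not the DO-IT Video Search code: the `sendEvent` method and the `"PLAY"`, `"STOP"`, and `"MUTE"` command names are assumptions standing in for whatever the player's JavaScript API actually exposes, and `createControls`, `wireButtons`, and the `data-action` attribute are hypothetical names.

```javascript
// `player` is any object with a sendEvent(name) method. Both the method name
// and the command names below are assumptions for this sketch; consult the
// media player's own API documentation for the real calls.
function createControls(player) {
  const commands = new Map([
    ["play", "PLAY"],
    ["stop", "STOP"],
    ["mute", "MUTE"],
  ]);
  // Returns true if the action was recognized and forwarded to the player.
  return function handle(action) {
    if (!commands.has(action)) return false;
    player.sendEvent(commands.get(action));
    return true;
  };
}

// In a page, attach the handler to real <button data-action="..."> elements.
// Native buttons are reachable with Tab and activated with Enter or Space,
// so no mouse is needed and focus never enters the Flash object.
function wireButtons(doc, handle) {
  doc.querySelectorAll("button[data-action]").forEach((button) => {
    button.addEventListener("click", () => handle(button.dataset.action));
  });
}

// Demonstration with a stub player that simply records what it was sent.
const sent = [];
const stubPlayer = { sendEvent: (name) => sent.push(name) };
const handle = createControls(stubPlayer);
handle("play");
handle("stop");
const unknownHandled = handle("rewind"); // not in the command table
console.log(sent, unknownHandled);
```

Keeping the command table separate from the DOM wiring means the same buttons could drive a different player by swapping in another adapter object.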

Page Titles

2.4.2 Page Titled: Web pages have titles that describe topic or purpose.

This is another no-brainer, but should not be overlooked.

HTML Validation

4.1.1 Parsing: In content implemented using markup languages, elements have complete start and end tags, elements are nested according to their specifications, elements do not contain duplicate attributes, and any IDs are unique, except where the specifications allow these features.

As a standards advocate, I confess to feeling a bit embarrassed when the WCAG Working Group pointed out that our site failed to validate. It had validated, but I broke that during an upgrade and neglected to double-check. This just goes to show you how important it is to keep the W3C Validator bookmarked and handy, and use it liberally.

Does validation really matter, if the web page seems to work anyway despite having a validation error? My answer to this is unquestionably yes. Standards are the language in which all players in the web community (authors, browsers, assistive technologies, etc.) communicate. The only way to ensure that our message is correctly delivered is to use the language correctly. If we don’t, maybe it’s true that giant desktop browsers such as Internet Explorer and Firefox will understand us, but as browsers get more diverse (e.g., small footprint browsers on pocket devices) validation is sure to play an increasingly critical role.
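One of the conditions in 4.1.1, unique IDs, is simple enough to spot-check in a few lines. The sketch below is a rough illustration under stated assumptions: it scans an HTML string with a regular expression rather than a real parser, so it only handles quoted `id` attributes and is no substitute for the W3C Validator. The function name `findDuplicateIds` and the sample markup are hypothetical.

```javascript
// Crude check for duplicate id attribute values in an HTML string.
// Only quoted attributes preceded by whitespace are recognized.
function findDuplicateIds(html) {
  const idPattern = /\sid\s*=\s*(["'])(.*?)\1/gi;
  const counts = new Map();
  for (const match of html.matchAll(idPattern)) {
    const id = match[2];
    counts.set(id, (counts.get(id) || 0) + 1);
  }
  // Keep only the ids that occur more than once.
  return [...counts].filter(([, count]) => count > 1).map(([id]) => id);
}

// Made-up markup in which "nav" is used twice.
const sampleHtml =
  '<div id="nav"></div><p id="intro" data-id="nav"></p><ul id="nav"></ul>';
const duplicates = findDuplicateIds(sampleHtml);
console.log(duplicates);
```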

Level AA Conformance


1.4.3 Contrast (Minimum): The visual presentation of text and images of text has a contrast ratio of at least 4.5:1 (with some exceptions).

WCAG 1.0 had a similar checkpoint, and I’ve advocated often for high contrast. However, in WCAG 1.0 the checkpoint called for "sufficient contrast" and it was left to the web designer to make that judgment. With WCAG 2.0, there’s a specific ratio to aim for, and there are tools to measure it such as The Paciello Group’s Contrast Analyser.
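For readers curious about what such tools compute, the following sketch implements the contrast ratio as WCAG 2.0 defines it: each color is converted to a relative luminance, and the ratio is (lighter + 0.05) / (darker + 0.05). The function names are my own, the input is assumed to be a six-digit hex color, and the required minimum is passed in as a parameter rather than hard-coded. Treat it as an illustration, not a replacement for a tested tool.

```javascript
// Convert one 8-bit sRGB channel (0-255) to its linear value, following the
// WCAG 2.0 definition of relative luminance.
function linearize(channel) {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Parse a six-digit hex color such as "#336699" into [r, g, b].
function parseHex(hex) {
  const match = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (!match) throw new Error("Expected a six-digit hex color: " + hex);
  const value = parseInt(match[1], 16);
  return [(value >> 16) & 255, (value >> 8) & 255, value & 255];
}

function relativeLuminance(hex) {
  const [r, g, b] = parseHex(hex).map(linearize);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// The ratio is the same whichever color is passed first.
function contrastRatio(foreground, background) {
  const l1 = relativeLuminance(foreground);
  const l2 = relativeLuminance(background);
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}

// `minimumRatio` is whatever the success criterion being tested requires.
function meetsContrast(foreground, background, minimumRatio) {
  return contrastRatio(foreground, background) >= minimumRatio;
}

// Black on white gives the highest possible ratio, about 21:1.
const blackOnWhite = contrastRatio("#000000", "#ffffff");
console.log(blackOnWhite.toFixed(2));
```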

Now that contrast is measurable, I find that my past judgment as to what’s acceptable was a bit liberal. I generally opt for black text on a white background, adding only a dash of color here and there for a richer aesthetic. However, where I did choose to add color on the DO-IT Video Search site (e.g., the HTML buttons that control the media player), the original contrast was insufficient.

When faced with making this change, I found in myself a resistance that I think many web developers experience. I liked my design, and really didn’t want to change it. But letting go of such attachments is a good practice. I did let go, and now that some time has passed I actually like the new, higher contrast buttons.

Level AAA Conformance


1.2.8 Media Alternative: An alternative for time-based media is provided for all prerecorded synchronized media and for all prerecorded video-only media.

In plain English, this success criterion calls for a video transcript. I have always been a believer in transcripts. They provide a needed accessibility solution for deaf-blind individuals (captions only help deaf users who can see them). They also provide an accessibility solution for people who can’t access video due to slow Internet connections, and provide quick access to content for users who don’t have 15 minutes to watch the full video.

In fact, I would include transcripts in my "no-brainer" category. So why do I include them in this discussion? Our transcripts, like most transcripts, included only the audio content. An important missing component was the content of the audio description. People who are reading the transcript are in the same situation as people who can’t see the video: If content is presented visually, they don’t have access to that information unless the visual content is described.

When the WCAG Working Group brought this to our attention, it sort of threw a monkey wrench into our application. In DO-IT Video Search, the transcript is generated automatically from the closed captions. The application reads the captions, intelligently adds some formatting code, and writes them to the screen as a transcript. In order to add audio description, we had to modify our production process so that audio description content was added to the caption database, complete with timestamps. Importing captions into the database was simple since captions exist in structured, easily parsed text files. Our audio descriptions don’t exist in a similar format: Our vendors generate a script before recording, but it’s just written out in a Word file, with no structural markup that would support its being automatically parsed. Perhaps this is something we can discuss with our vendors, but for now adding audio description to the database is a manual process. Again, thankfully we have access to students!

In terms of delivery, our transcripts now include audio description. It’s styled distinctly so it’s clearly apparent to visual users that it’s different from captioned text. For non-visual users, there’s a hidden <span> element (accessible only to screen reader users) that identifies this content as an audio description.
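A transcript generator along these lines might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the entry shape (`time`, `type`, `text`), the `audio-description` and `visually-hidden` class names, the label wording, and the one-paragraph-per-entry layout are hypothetical and are not taken from the DO-IT Video Search code.

```javascript
// Escape text so it can be placed safely inside HTML element content.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical entry: { time: seconds, type: "caption" | "description", text }.
// Entries are merged in time order. Descriptions get a distinct class for
// visual styling plus a hidden label for screen reader users.
function buildTranscript(entries) {
  return [...entries]
    .sort((a, b) => a.time - b.time)
    .map((entry) => {
      const text = escapeHtml(entry.text);
      if (entry.type === "description") {
        return (
          '<p class="audio-description">' +
          '<span class="visually-hidden">Audio description: </span>' +
          text +
          "</p>"
        );
      }
      return "<p>" + text + "</p>";
    })
    .join("\n");
}

// Made-up sample entries, deliberately out of order.
const transcriptHtml = buildTranscript([
  { time: 5, type: "caption", text: "Sample caption two." },
  { time: 0, type: "caption", text: "Sample caption one." },
  { time: 3, type: "description", text: "A sample <visual> detail." },
]);
console.log(transcriptHtml);
```

For the hidden label to reach screen readers, the `visually-hidden` class would need to move the text off-screen or clip it rather than use `display: none`, which hides content from assistive technology as well.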

Speaking of audio description…

Audio description

1.2.7 Audio Description (Extended): Where pauses in foreground audio are insufficient to allow audio descriptions to convey the sense of the video, extended audio description is provided for all prerecorded video content in synchronized media.

1.4.7 Low or No Background Audio: For prerecorded audio-only content that (1) contains primarily speech in the foreground, (2) is not an audio CAPTCHA or audio logo, and (3) is not vocalization intended to be primarily musical expression such as singing or rapping, at least one of the following is true: [the specification lists three criteria: No background, background sounds can be turned off, or there’s a 20 decibel difference between foreground and background.]
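The third option in 1.4.7 is easier to reason about with the underlying arithmetic in view. For two signals measured as RMS amplitudes, the level difference in decibels is 20 × log10(foreground / background), so a 20 decibel difference corresponds to the foreground having ten times the amplitude of the background. The sketch below assumes both amplitudes have already been measured with the same tool and in the same units; the function names are hypothetical.

```javascript
// Level difference in decibels between two RMS amplitudes (same units).
function levelDifferenceDb(foregroundRms, backgroundRms) {
  return 20 * Math.log10(foregroundRms / backgroundRms);
}

// True when the foreground exceeds the background by at least `requiredDb`.
function foregroundClearsBackground(foregroundRms, backgroundRms, requiredDb) {
  return levelDifferenceDb(foregroundRms, backgroundRms) >= requiredDb;
}

// A tenfold amplitude ratio works out to roughly 20 dB.
const differenceDb = levelDifferenceDb(0.5, 0.05);
console.log(differenceDb.toFixed(1));
```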

Our videos have been professionally audio described by three different vendors, all leaders in the industry. Therefore, our failure to meet these success criteria suggests that the WCAG 2.0 guidelines are not reflective of standard practices in the industry. I’ll be talking much more about this issue in a future installment. Please stay tuned…

Sign Language

1.2.6 Sign Language: Sign language interpretation is provided for all prerecorded audio content in synchronized media.

This is a new requirement in WCAG 2.0, and is not something we had considered previously. We had always felt that closed captions were sufficient for providing access to the deaf and hard of hearing. The reason this success criterion exists is that for many deaf individuals, written language is a second language. Reading English captions is more difficult for these individuals than following along with a sign language interpreter. For some individuals English captions aren’t a solution at all.

This is a complex issue with complex solutions, and warrants a blog post of its own. Please stay tuned…

Visual Presentation

1.4.8 Visual Presentation: For the visual presentation of blocks of text, a mechanism is available to achieve [an easily readable format, as defined by five stylistic requirements].

This success criterion requires the implementation of specific visual styles designed to make the page easy to read for certain individuals with cognitive, language, and learning disabilities, as well as low vision. One of these items I found to be confusing as written:

line spacing (leading) is at least space-and-a-half within paragraphs, and paragraph spacing is at least 1.5 times larger than the line spacing

My confusion stemmed from my lack of certainty about "line spacing". Since there is no line-spacing property in CSS, I wasn’t sure how to interpret this requirement. If the font size is 1em, and line spacing is the space between lines, then attaining a line spacing of 1.5em within paragraphs would require a CSS line-height of 2.5em, which I think has the opposite of the intended effect: The page becomes more difficult to read for most people. After some discussion with the WCAG Working Group, I now think I understand that line-spacing and line-height are synonymous, and this requirement would therefore be met with the following style definition:

p {
  font-size: 1em;
  line-height: 1.5em;
  margin-bottom: 1.5em;
}

3.1.4 Abbreviations: A mechanism for identifying the expanded form or meaning of abbreviations is available.

On our site, specifically on the FAQ page, I had used abbreviations such as VHS and DVD, figuring they’re common enough that people are more likely to recognize and understand VHS than "Video Home System". However, the WCAG Working Group cautioned that I shouldn’t make assumptions about users’ vocabularies, and I respect that. Still, I struggled with the best strategy for delivering this information in a way that would be accessible to folks who need it, but not obtrusive to folks who don’t. Consider this HTML:

<abbr title="Video Home System">VHS</abbr>

In some browsers (e.g., Firefox 3, Opera 9.5), this content appears visually with a dotted underline. In others there is no visual distinction, so to be safe I added the following CSS definition:

abbr { border-bottom: thin dotted #000; }

In all browsers the title of the abbreviation appears in a tooltip when users hover with a mouse. However, without additional markup an <abbr> element does not receive focus when a keyboard user is tabbing through the document, so this markup is currently inaccessible to non-mousers. Also, screen readers may or may not have access. In JAWS 9, there are configuration options to expand abbreviations and acronyms, but these are off by default. If the user turns them on, then they get the expanded text, but not the original text. Therefore a JAWS user would hear "Video Home System" rather than "VHS", which may arguably be more confusing. My personal workaround is to repeat the abbreviation in the title of the <abbr> tag, like this:

<abbr title="VHS (Video Home System)">VHS</abbr>

That may not be pure semantic markup, but it seems to make abbreviations more accessible to screen reader users.
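If a site has many abbreviations, the repeated-title pattern is easy to generate consistently. The helper below is a hypothetical sketch: `abbrMarkup` and its `focusable` option are names invented here, and the optional `tabindex="0"` is one possible way to let keyboard users reach the element, offered as an assumption rather than something tested across the browsers and screen readers discussed above.

```javascript
// Escape text for use inside a double-quoted attribute or element content.
function escapeAttribute(text) {
  return text.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Build <abbr> markup whose title repeats the abbreviation before its
// expansion, e.g. title="VHS (Video Home System)".
function abbrMarkup(abbreviation, expansion, options = {}) {
  const title = escapeAttribute(abbreviation + " (" + expansion + ")");
  // Assumption: tabindex="0" places the element in the keyboard tab order.
  const tabindex = options.focusable ? ' tabindex="0"' : "";
  return (
    '<abbr title="' + title + '"' + tabindex + ">" +
    escapeAttribute(abbreviation) +
    "</abbr>"
  );
}

const plainAbbr = abbrMarkup("VHS", "Video Home System");
const focusableAbbr = abbrMarkup("VHS", "Video Home System", { focusable: true });
console.log(plainAbbr);
console.log(focusableAbbr);
```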

Reading Level

3.1.5 Reading Level: When text requires reading ability more advanced than the lower secondary education level after removal of proper names and titles, supplemental content, or a version that does not require reading ability more advanced than the lower secondary education level, is available.

Measuring reading level is challenging, and worthy of a thorough discussion in a future post. Please stay tuned…