Accessible HTML5 Video with JW Player as Fallback

The DO-IT Video site recently got a face-lift. Among the changes was a new player, which uses the HTML5 <video> element for browsers that support it, and falls back to the JW Player for those that don't.

This approach, and much of the code, is the same as what I used in developing an accessible HTML5 audio player, documented in previous blog posts:

At its core, it's very simple to add video to a web page using the HTML5 <video> element:

<div id="playerDiv">
  <div id="videoDiv">
    <video 
      poster="screenshot.jpg" 
      controls
      tabindex="0"
      preload="auto" 
      height="240" 
      width="320">
        <source src="somevideo.mp4" type="video/mp4"/>
        <source src="somevideo.ogv" type="video/ogg"/>
        <track kind="captions" src="somevideo.srt" srclang="en"/>
        <p>Final fallback content</p>
    </video>
    <div id="caption"></div>
  </div>
  <div id="controls"></div>
</div>

Screen shot of the DO-IT HTML5 Video player, featuring closed captions and a full set of accessible controller buttons

How It Works

The video is provided in two formats, one per <source> element. The MP4 (H.264) version is supported by Safari and will be supported by Internet Explorer starting with version 9, while the OGV (Ogg/Theora) version is supported by Firefox and Opera. Google Chrome supports both. Ultimately we'll probably replace our OGV videos with WebM (VP8), Google's open video format, which is supported by Firefox 4.0, Opera 10.6, and Chrome 6. However, that leaves out too many browser versions that are still in widespread use, so we're not ready to give up OGV yet. Unfortunately, there's no universal support in sight for WebM or any other format, so we're stuck delivering video in at least two formats if we want it to work in all browsers (which we do).
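The browser walks the <source> list on its own, but the same decision can be sketched in JavaScript with canPlayType(), which returns an empty string, "maybe", or "probably". This is only an illustration of the selection logic; the pickSource name and the shape of the sources array are assumptions, not part of the DO-IT player.

```javascript
// Return the first source the browser says it might be able to play,
// or null if none qualifies. canPlayType() answers "", "maybe", or "probably".
// pickSource and the {src, type} objects are hypothetical names for illustration.
function pickSource(video, sources) {
  for (var i = 0; i < sources.length; i++) {
    if (video.canPlayType(sources[i].type) !== '') {
      return sources[i];
    }
  }
  return null;
}

// In a browser:
// pickSource(document.getElementsByTagName('video')[0], [
//   { src: 'somevideo.mp4', type: 'video/mp4' },
//   { src: 'somevideo.ogv', type: 'video/ogg' }
// ]);
```

A null result means neither format is playable, which is one of the cases where the fallback player is needed.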

The <video> element includes the controls attribute, which tells supporting browsers to display their built-in player controller, and tabindex="0" ensures that users can tab to the player and give it focus. However, the default player is not fully accessible in all browsers. The current state for <video> is the same as for <audio>, which I documented in my Creating Your Own blog post. Also, no browser's built-in player currently supports closed captions, so if we want a CC button (which we do) we have to create our own.

If users have JavaScript enabled (which most do, according to WebAIM's recent screen reader survey results) the controls attribute will be changed to false so users don't get the default player. Our JavaScript code then adds a set of <input> elements to the #controls div. Each <input> element has a title attribute, which is read by screen readers, and is styled with a CSS background image that makes it look like a standard player button. The CSS background images contain a white icon on a transparent background, which allows the background of the #controls div to show through. This is a very flexible approach, allowing the author (or user) to change the background color with a single line of CSS. However, it depends on CSS and therefore doesn't degrade gracefully, which leaves me a bit dissatisfied with it. I'm still contemplating alternatives.

Accesskeys are added to each button so the player can be controlled from anywhere on the web page.
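As a rough sketch of the steps just described (not the actual DO-IT code), the following turns off the default controller and adds labeled buttons with accesskeys to the #controls div. The button list, class names, and accesskey letters are made up for illustration.

```javascript
// Hypothetical button definitions: the titles, class names, and accesskey
// letters are illustrative, not the ones used on the DO-IT site.
var BUTTONS = [
  { id: 'play', title: 'Play',            accessKey: 'p' },
  { id: 'mute', title: 'Mute',            accessKey: 'm' },
  { id: 'cc',   title: 'Toggle captions', accessKey: 'c' }
];

// Hide the browser's built-in controller and build our own buttons.
// doc is normally window.document; it is a parameter so the function
// can be exercised without a browser.
function buildControls(doc, video, controlsDiv) {
  video.controls = false;
  var inputs = [];
  for (var i = 0; i < BUTTONS.length; i++) {
    var input = doc.createElement('input');
    input.type = 'button';
    input.id = BUTTONS[i].id;
    input.title = BUTTONS[i].title;          // read by screen readers
    input.accessKey = BUTTONS[i].accessKey;  // reachable from anywhere on the page
    input.className = 'playerButton ' + BUTTONS[i].id; // CSS supplies the icon
    controlsDiv.appendChild(input);
    inputs.push(input);
  }
  return inputs;
}

// In a browser:
// buildControls(document, document.getElementsByTagName('video')[0],
//               document.getElementById('controls'));
```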

Using JavaScript, we also check to be sure the browser supports HTML5 <video>. If it doesn't, we add functionality that loads the JW Player:

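  // get a reference to the <video> element on the page
  var video = document.getElementsByTagName('video')[0];
  if (video && video.canPlayType) { 
    //browser supports HTML5 <video>
    //add relevant code
  }
  else { 
    //add code to embed JW Player
  }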

JW Player, developed by Jeroen Wijering and now distributed by Jeroen's company LongTail Video, is a Flash player with lots of accessibility features and a robust API that allows it to be controlled externally. This allows us to use the same buttons to control either the HTML5 player or the JW Player, thus we have a reasonably consistent interface for all users.
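One way to drive both players from the same buttons is a thin wrapper that hides the difference between them. The sketch below is an assumption about how that could look, not the DO-IT code. The fallback object is assumed to offer play(), pause(), and seek(), which is how the JW Player JavaScript API is understood to behave here; verify the names and arguments against the version you're using.

```javascript
// Wrap either an HTML5 <video> element or a fallback player object behind
// the same three methods, so one set of buttons can drive both.
// makeController is a hypothetical name for illustration.
function makeController(video, fallbackPlayer) {
  if (video && typeof video.canPlayType === 'function') {
    return {
      kind: 'html5',
      play: function () { video.play(); },
      pause: function () { video.pause(); },
      seek: function (seconds) { video.currentTime = seconds; }
    };
  }
  // Assumption: the fallback exposes play(state), pause(state), and
  // seek(seconds). Passing true asks for that state explicitly instead
  // of relying on toggle behavior.
  return {
    kind: 'fallback',
    play: function () { fallbackPlayer.play(true); },
    pause: function () { fallbackPlayer.pause(true); },
    seek: function (seconds) { fallbackPlayer.seek(seconds); }
  };
}
```

The button handlers then call controller.play() or controller.seek(30) without needing to know which player is on the page.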

Why not just use the JW Player?

Historically, the JW Player was our preferred and default player. It has a long history of including labeled buttons for screen reader users and support for closed captions, and as far as I know is still the only Flash player to support closed audio description (more on that below). Starting with version 5.3, the JW Player began to support HTML5 <video>. However, it doesn't really support HTML5 <video> - it just uses it as a shorttag for embedding itself. The JW Player inspects the <video> element, loads some of its attributes as configuration options, then replaces the <video> element with its own Flash player. This isn't necessarily a bad thing I suppose, as long as it's accessible and the video plays, but there are a couple of significant problems with the way JW implements this:

  1. Captioning and audio description are supported in the JW Player via accessibility addons, and these only work in the Flash Player, not in the HTML5 player. Therefore, if we use JW Player to run the show, browsers that don't support Flash (e.g., Safari on iPhones and iPads) won't have captions or closed audio descriptions.
  2. These accessibility addons aren't exposed through the JW Player API, which means we can't control them with a CC and AD button on our custom controller. As I write this, there's an active ticket related to this, stemming from a relevant discussion on the JW Forum. I'm not sure whether this will be fixed in a minor upgrade, but work is underway on JW Player version 5.5 alpha, and it sounds like this new version will provide a more seamless method for communicating with addons and plugins via the player's JavaScript API.

Beyond this pair of problems, our primary reason for wanting to use HTML5 <video> as it was intended to be used, rather than as a shorttag for the JW Player, is that we believe in the potential HTML5 has for making accessible video simple and viable in the future, and we want to create a real world application that demonstrates the need for HTML5 accessibility features. So, that's what we're doing, and we'll continue to stick with the JW Player as an accessible fallback player.

Searching and Streaming

The DO-IT Video site includes features that allow users to search the full text of all our videos, and the search results include links that target specific start times within videos. There is also a similar feature built into the transcript for each video (for example, see the transcript for Invisible Disabilities in Postsecondary Education). Users can click on any text within the transcript to launch the video at that point.

In order for this sort of functionality to work, we need a player that supports a start time parameter, and we need it to be able to seek ahead to points in the video that may not have been downloaded yet.

The HTML5 spec doesn't specifically address video streaming protocols, and we're not currently streaming either our OGV or MP4 videos. This means users must download the video at least to the point where they're wanting to play, or searching to that point won't work. For DO-IT, this isn't a huge problem because our videos are relatively short and they download pretty quickly even on DSL. However, we're still looking for ways to optimize performance.
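For the HTML5 player, the start time behavior can be sketched as follows: read a start time from the link, then seek only once the browser has downloaded that part of the video. This is a minimal sketch, not the DO-IT implementation; the "t" parameter name and the function names are hypothetical, while buffered, currentTime, and the progress event are standard media element features.

```javascript
// Read a start time in seconds from a query string such as "?v=12&t=90".
// The parameter name "t" is hypothetical.
function parseStartTime(queryString) {
  var match = /[?&]t=(\d+(?:\.\d+)?)(?:&|$)/.exec(queryString);
  return match ? parseFloat(match[1]) : 0;
}

// True if the requested time falls inside a range the browser has
// already downloaded (video.buffered is a TimeRanges object).
function isBuffered(video, seconds) {
  var ranges = video.buffered;
  for (var i = 0; i < ranges.length; i++) {
    if (seconds >= ranges.start(i) && seconds <= ranges.end(i)) {
      return true;
    }
  }
  return false;
}

// Seek now if the target is available; otherwise retry as more data arrives.
function seekWhenReady(video, seconds) {
  function trySeek() {
    if (isBuffered(video, seconds)) {
      video.removeEventListener('progress', trySeek, false);
      video.currentTime = seconds;
    }
  }
  video.addEventListener('progress', trySeek, false);
  trySeek();
}

// In a browser:
// seekWhenReady(video, parseStartTime(window.location.search));
```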

For Flash, we're using xmoov-php to HTTP pseudo-stream our FLV file. The file needs to be prepped before this will work. First, it needs to be encoded with keyframes added as frequently as you think you'll want landing points (our videos have approximately one keyframe per second). Second, the FLV file must be injected with metadata identifying where the keyframes are. xmoov-php quickly loads the metadata and uses it to find and fetch the specified segments of the FLV. Injecting metadata into an FLV is accomplished using metadata injector software. If you happen to have Captionate ($60 US), it provides this functionality plus additional features for captioning Flash videos. But if all you need is metadata injection, there are free tools for that.
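To illustrate why the keyframe metadata matters, here is a sketch of the lookup a pseudo-streaming setup has to perform: map a requested time to the last keyframe at or before it, then request the file from that byte position. The metadata shape (parallel times and filepositions arrays), the script path, and the "file" and "start" parameter names are all assumptions for illustration, not a description of xmoov-php's actual interface.

```javascript
// Find the last keyframe at or before the requested time using a binary
// search. Assumes keyframes.times is sorted ascending and lines up
// index-for-index with keyframes.filepositions (an assumed metadata shape).
function nearestKeyframe(keyframes, seconds) {
  var lo = 0;
  var hi = keyframes.times.length - 1;
  var best = 0;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    if (keyframes.times[mid] <= seconds) {
      best = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return { time: keyframes.times[best], position: keyframes.filepositions[best] };
}

// Build a request URL for the streaming script. The script path and the
// "file" and "start" parameter names are placeholders.
function pseudoStreamUrl(script, file, keyframes, seconds) {
  var frame = nearestKeyframe(keyframes, seconds);
  return script + '?file=' + encodeURIComponent(file) + '&start=' + frame.position;
}
```

With roughly one keyframe per second, the landing point is never more than about a second before the time the user asked for.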

It would be nice if we didn't have to provide three different file types. Two is already one too many. As it turns out, the JW Player supports MP4 in addition to FLV (Adobe added MP4 support to Flash in version 9.0.115). However, xmoov-php doesn't support it (it only supports FLV), so even if we could play MP4 in the JW Player, we couldn't seek ahead.

There actually is a new version of xmoov called xmoovStream that supports MP4 and FLV, but according to the documentation "random access video is not yet supported in Adobe Flash video players when streaming MP4 files".

What About Captions?

In the current draft of the HTML5 spec, there's a mechanism for providing both captions and audio descriptions. It's the <track> element, which you may have noticed in the above code sample. A <track>, to paraphrase the HTML5 spec, is an explicit external timed text track for a media element. It includes a kind attribute, which tells the browser what kind of track it is. For captions, use kind="captions", and for audio description, use kind="descriptions". The src attribute then points to the file that contains the alternative content.

The details of this, including the format of the caption file, are still being sorted out, so it's no surprise that no browsers currently support it. Adding caption support to the player isn't difficult though. I created a caption test page to demonstrate how CSS can be used to display captions as an overlay at the bottom of a player div. In that demo I used a JavaScript array of fake caption text, so all I needed to do to make this functional was to populate the array with real text, gathered from a caption file. So I did this, and added an event listener that listens for the timeupdate event and stands by ready to display the caption that matches the currentTime. We have caption files in a variety of formats, but I chose to build in support for SRT files since HTML5 seems to be sort of headed in that direction.
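The general shape of that approach can be sketched like this: parse the SRT text into an array of timed captions, then look up the caption that matches the current time whenever timeupdate fires. This is a simplified illustration under the assumption of well-formed SRT input, not the code running on the DO-IT site, and the function names are hypothetical.

```javascript
// Convert an SRT timestamp such as "00:01:02,500" to seconds.
function srtTimeToSeconds(stamp) {
  var parts = /(\d+):(\d+):(\d+)[,.](\d+)/.exec(stamp);
  return parseInt(parts[1], 10) * 3600 + parseInt(parts[2], 10) * 60 +
         parseInt(parts[3], 10) + parseInt(parts[4], 10) / 1000;
}

// Parse the text of an SRT file into an array of {start, end, text}.
function parseSrt(srtText) {
  var blocks = srtText.replace(/\r/g, '').split(/\n\s*\n/);
  var captions = [];
  for (var i = 0; i < blocks.length; i++) {
    var lines = blocks[i].replace(/^\n+|\n+$/g, '').split('\n');
    // lines[0] is the counter, lines[1] the timing line, the rest is text.
    if (lines.length < 3 || lines[1].indexOf('-->') === -1) { continue; }
    var times = lines[1].split('-->');
    captions.push({
      start: srtTimeToSeconds(times[0]),
      end: srtTimeToSeconds(times[1]),
      text: lines.slice(2).join(' ')
    });
  }
  return captions;
}

// Return the caption text that should be visible at the given time,
// or an empty string if none applies.
function captionAt(captions, seconds) {
  for (var i = 0; i < captions.length; i++) {
    if (seconds >= captions[i].start && seconds < captions[i].end) {
      return captions[i].text;
    }
  }
  return '';
}

// In a browser, wire it to the player and the #caption div:
// video.addEventListener('timeupdate', function () {
//   document.getElementById('caption').textContent =
//     captionAt(captions, video.currentTime);
// }, false);
```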

This is one guy's caption solution. I've tested it in all major browsers and haven't broken it yet with our own SRT files, but it's conceivable that bugs may surface if it's tested or used more widely. Other folks have taken on similar projects to support captions in HTML5 video, including jCaps, a jQuery plugin.

I should also mention that this doesn't work on the iPhone. The captions are there on the web page, but when we tap to play the video, the video is automatically played full screen in the device's internal video player, sans captions and custom controls. There are other methods for adding captions to iPhone videos, but not a standard method that we can use in our universal HTML5 video player. I'm frankly not sure what to do about this.

What About Audio Description?

As noted in the preceding section, audio description is supported in the current spec via the <track> element with kind="descriptions". However, the spec defines "descriptions" as follows:

Textual descriptions of the video component of the media resource, intended for audio synthesis when the visual component is unavailable (e.g. because the user is interacting with the application without a screen while driving, or because the user is blind). Synthesized as separate audio track.

This is a significant departure from how audio description has historically been delivered, using a human narrator's recorded voice. Research conducted collaboratively by IBM Japan and the WGBH National Center for Accessible Media found that users prefer human descriptions, but find listening to synthesized speech to be acceptable, particularly for relatively short instructional and documentary videos (less so for dramatic videos). This study was presented at ASSETS 2010. Full (and very interesting) details are available in the paper Are Synthesized Video Descriptions Acceptable?.

From my perspective as a person working to promote accessible technology in higher education, I see so few people audio describing their videos that anything we can do to simplify the process and reduce the cost is a good thing. However, I also recognize the need for flexibility in the HTML5 spec. Human recorded descriptions should be supported in addition to synthesized text descriptions, but this is complicated since providing recorded audio as descriptions runs into the same licensing problems that plague audio in general: There is no universally supported audio format, so descriptions would need to be provided in multiple formats in order to work in all browsers. Since it's likely to be a long while before this all gets sorted out in HTML5, we are not currently providing closed audio description in our HTML5 player (we do have a button on the control bar though, disabled for now, as a sign of our hope for the future).

Instead, all of our videos are available with open audio description, that is, the description track is mixed into the video. On the Preferences page of the DO-IT Video site users can select whether they want audio description, and if they do, the videos they see will be the audio described versions.

In the JW Player, we actually are able to deliver closed audio description. We consider that experimental and it's off by default, but users can select it on the Preferences page (they can also select a couple of related behaviors). JW Player has supported closed audio description for many years/versions now, and they're continuing to improve it. The Audio Description plugin supports recorded audio in MP3 format. One issue that we've encountered is that when audio description is provided in a separate file, not mixed into the video, the contrast between the program and description audio volume is often inconsistent and sometimes insufficient. If there is insufficient contrast, it can be difficult or impossible to hear the description. Version 2.0 of the Audio Description plugin introduces an experimental feature called ducking to try to address this problem. When ducking is enabled, the plugin analyzes the waveform of the audio description file and automatically lowers the volume of the program audio temporarily so the description can be heard. This currently is a little buggy. The biggest problem is that the video begins to play before the plugin has completed its analysis of the waveform. Therefore, ducking doesn't occur if there is audio description early in the video, which there often is (e.g., descriptions of opening scenes, credits, and title screens). You can check this out on several of my JW Player Tests.

The following screen shot shows the JW Player with closed audio description and ducking enabled, plus an optional waveform of the description audio displayed on the screen.

Screen shot of the JW Player with ducking and visible waveform enabled

Despite the current shortcomings, this is an exciting feature that will probably play an important role in how closed audio description is ultimately delivered. I think it's feasible to do something similar in HTML using an audio API like Mozilla's proposed audio data API or the W3C Web Audio API. Fun work lies ahead!
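To make the ducking idea concrete, here is a simplified sketch of the logic, assuming the description audio has already been reduced to one peak amplitude per fixed-length window (how those peaks are obtained from an audio API is left out). It is not the plugin's algorithm; the function names, window size, and threshold are hypothetical.

```javascript
// Given one peak amplitude (0..1) per fixed-length window of the description
// audio, return the time intervals in which description is audible.
function findDescriptionIntervals(peaks, windowSeconds, threshold) {
  var intervals = [];
  var start = null;
  for (var i = 0; i < peaks.length; i++) {
    var speaking = peaks[i] >= threshold;
    if (speaking && start === null) {
      start = i * windowSeconds;
    } else if (!speaking && start !== null) {
      intervals.push({ start: start, end: i * windowSeconds });
      start = null;
    }
  }
  if (start !== null) {
    intervals.push({ start: start, end: peaks.length * windowSeconds });
  }
  return intervals;
}

// Volume the program audio should have at a given time: ducked while a
// description interval is active, normal otherwise.
function programVolumeAt(intervals, seconds, normalVolume, duckedVolume) {
  for (var i = 0; i < intervals.length; i++) {
    if (seconds >= intervals[i].start && seconds < intervals[i].end) {
      return duckedVolume;
    }
  }
  return normalVolume;
}

// In a browser, on each timeupdate:
// video.volume = programVolumeAt(intervals, video.currentTime, 1.0, 0.3);
```

If the intervals were computed ahead of time and delivered with the page, playback would not need to wait for any waveform analysis to finish.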

Is This Video Player Open Source?

The content of the DO-IT Video site, including its player, is provided under a Creative Commons License. Please feel free to copy, distribute, transmit, or adapt the work under the same license for noncommercial purposes, provided the source is acknowledged. If you do so, please let us know. Most of the player's functionality is made possible via a single JavaScript file and a single CSS file. Unfortunately I have neither the time nor the funding to provide widespread support for it, but feel free to use this blog as a forum if you're playing around with it and have questions or comments.

Currently the PHP that controls the overall DO-IT Video site, including its search functionality, is not open source. It contains a lot of code that is very specific to our unique needs, and would require a lot of cleanup before we could open source it. Gazing into my crystal ball, I'm not sure if or when I'll have time to do that. In the overall scheme of things though, that was the easy part. We're just using PHP to parse caption files, storing each caption as a record in a MySQL database, then executing a MySQL query to search the database. We're also parsing timestamped audio description scripts and storing those in the same database, so the audio description and captions can be reassembled (in timestamp order) as a transcript.
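The site does this in PHP and MySQL, but the idea is simple enough to sketch in JavaScript for consistency with the player code. The record shape ({start, text}) and function names below are assumptions for illustration, and an in-memory array stands in for the database.

```javascript
// Combine caption and description records into one transcript ordered by
// start time. Each input record is {start, text}; a "kind" field is added
// so the two sources can be told apart when the transcript is displayed.
function buildTranscript(captions, descriptions) {
  var all = [];
  var i;
  for (i = 0; i < captions.length; i++) {
    all.push({ start: captions[i].start, kind: 'caption', text: captions[i].text });
  }
  for (i = 0; i < descriptions.length; i++) {
    all.push({ start: descriptions[i].start, kind: 'description', text: descriptions[i].text });
  }
  all.sort(function (a, b) { return a.start - b.start; });
  return all;
}

// Case-insensitive search; each hit carries the start time a link can target.
function searchTranscript(transcript, term) {
  var needle = term.toLowerCase();
  var hits = [];
  for (var i = 0; i < transcript.length; i++) {
    if (transcript[i].text.toLowerCase().indexOf(needle) !== -1) {
      hits.push(transcript[i]);
    }
  }
  return hits;
}
```

Each search hit's start value is what a results link would pass to the player as its start time.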

That all sounds like a lot of work. Isn't HTML5 <video> supposed to be easy?

You're right. This has been a lot of work, mostly because of the issues surrounding browser and player differences. With this blog post I'm hoping others can learn from my example and build similar solutions with less trial and error. Ultimately though you're right: HTML5 <video> is easy. A novice web developer can pop a video onto their web page in less than a minute with some very simple HTML markup. Unfortunately if they do that today it won't be accessible without a little additional sweat. Someday, hopefully, browsers will do all of this work for us, and every video will be accessible. That's what we're working toward.

2 comments on “Accessible HTML5 Video with JW Player as Fallback”

  1. HTML5 spec doesn’t yet cover any meta data outside of duration, width and height (the video spec is one of the more volatile aspects of HTML5). You may have to do it the old fashioned way - with a data object (json or XML) of timestamps for cuepoints...

    Flv Player

  2. @john, Maybe I'm missing something but how does the FLV Player you're recommending (HD WebPlayer) meet the goals of accessible video described in this post? All of the buttons are unlabeled so it's inaccessible to screen reader users, they don't seem to be operable by keyboard so it's inaccessible to people without mice, it doesn't seem to support either closed captions or audio description, and it doesn't work on iOS.

    Seems like a great example of the problem, not the solution.