Accessible Language Pickers: a11y meets i18n/l10n

Helena Zubkow and Mike Herchel published a great article recently comparing the U.S. Presidential candidates’ websites on accessibility. That article points out several features (both good and bad) that affect accessibility for some visitors to their sites. There’s one feature in particular that I want to expand upon since it’s been on my mind lately. Hillary Clinton’s site (hillaryclinton.com) is available in both English and Spanish, and there’s a link in the main menu that takes users from one site to the other. (It’s no surprise that Donald Trump doesn’t have a Spanish version since he’s the guy who wants to build un muro.)

Anyway, I’ve developed an interest in how best to code a menu of languages, as in a language picker for translated versions of a website. Hillary’s site provides an interesting example of why this matters. On the main menu bar at the top of her English website, she has a link to “ES” which targets the Spanish version.

Screen shot of menu bar from Hillary's English website, which includes an ES link

Why “ES”, rather than “Español”? I’d be curious to know the thinking behind that decision. In contrast, the link from the Spanish site back to the English site says “English”, not just “EN”.

Screen shot of menu bar from Hillary's Spanish website, which includes an English link

For screen reader users, it’s important for web developers to identify the language of the page so multilingual screen readers know how to pronounce the words they’re reading. Hillary’s sites as a whole are coded properly: The English site has lang="en" on the <html> element, and the Spanish site has lang="es".
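As a minimal sketch, a page-level language declaration looks like the following. The title and body text are placeholders for illustration, not the actual markup of either site:

```html
<!DOCTYPE html>
<!-- The lang attribute on the root element sets the default language
     for everything on the page unless a descendant overrides it. -->
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Placeholder page title</title>
  </head>
  <body>
    <!-- A screen reader that supports English reads this with its English voice. -->
    <p>Placeholder paragraph in English.</p>
  </body>
</html>
```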

It’s also important though to identify the language of foreign language text within the body of a web page. Otherwise screen readers will pronounce the foreign words using the rules of the main language of the page, which at best makes the screen reader sound silly, and at worst is indecipherable. In Hillary’s case, if she had used Español as the link text for switching to the Spanish site, she would need to specify in the code that Español is a Spanish word, like so:

<a href="/es/" lang="es">Español</a>

Of course, she doesn’t really need to do this since “ES” isn’t a word at all. However, she should do it on the Spanish version, where the word “English” is not correctly tagged and is therefore pronounced by screen readers in a thick Spanish accent. Here’s how it should be tagged:

<a href="/" lang="en">English</a>

What about unpronounceable languages?

This is the issue that has brought me to ponder this topic. Able Player, the accessible media player I’ve developed with help from the open source community, supports subtitles in any language, and users can browse a list of available languages and choose the one they need. On the DO-IT Video website, which is running Able Player, some of our videos have been translated into Chinese (simplified and traditional), French, Greek, Indonesian, Japanese, Korean, Portuguese, Spanish, and Vietnamese as part of the DO-IT Translation Project. When subtitles are available for any of these languages, they appear in a menu that pops up when users click the CC button.

To me, it seems reasonable to list the languages in their native language rather than in English since the reader might be unable to read English. (Would you English readers recognize “English” if it appeared as “イングレス” on a Japanese website?)

As for Hillary’s using “ES”, that’s the two-character ISO 639-1 code for Español—all major languages have one. But how many web users know their ISO 639-1 code? Maybe all Spanish-speaking people understand that an “ES” link is intended for them, but in case there are a few who don’t, I think spelling it out makes more sense.

But what if a language picker includes multiple languages, including some not supported by users’ screen readers? If we list the languages in their native language rather than English (e.g., 日本語) how will screen readers handle this?

To find out, I created a Language Picker Test Page and tested it using various screen readers:

JAWS 17, tested in both IE11 and Firefox 48 with its default synthesizer (Eloquence)
NVDA 2016.2.1, tested in Firefox 48 with its default synthesizer (eSpeak NG)
VoiceOver in Safari on Mac OS 10.11.2 (El Capitan)
VoiceOver on iOS 8.4.1
TalkBack on Android 4.4 (on a Nexus 4 phone)
Microsoft Narrator on Windows 10
Window-Eyes 9.5 with eSpeak in IE11

The Test Page includes a variety of multilingual examples (a paragraph, several lists of links, and several lists of radio buttons with labels) and features a variety of languages (German, French, Spanish, Japanese, Nepali, and Inuktitut).

The multilingual paragraph was created in order to conduct a simple baseline test to determine whether a screen reader supports automatic language switching at all. This test revealed that VoiceOver on Mac OS X, though it supports a variety of languages, can’t switch between them on-the-fly (the same is true of Narrator and TalkBack). If the voice is set to English, these screen readers read everything in English, including words like 日本語. Stay tuned for details on how they handle that.

VoiceOver on iOS does support automatic switching between languages, so hopefully Apple can extend this same functionality to the desktop soon.
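A baseline test like this can be sketched roughly as follows. This is not the actual markup from the test page: apart from the Japanese phrase quoted later in this post, the sentences are illustrative, and Nepali and Inuktitut are left out. Each phrase sits in its own span with a lang attribute, so a screen reader that supports automatic switching should change voices at each span:

```html
<!-- Assumes the page itself declares <html lang="en">. -->
<p>
  I speak English.
  <span lang="de">Ich spreche Deutsch.</span>
  <span lang="fr">Je parle français.</span>
  <span lang="es">Hablo español.</span>
  <span lang="ja">私は日本語を話します。</span>
</p>
```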

Supported, recognized, and unrecognized languages

Screen readers behave differently depending on whether they:

support the language.
recognize the language.
do not recognize the language.

For all screen readers, if they support the language they simply speak the content using the voice for that language. Which languages are supported depends on both the screen reader and the synthesizer. When they don’t support a language, they might at least recognize it, in which case they can identify the language to the user, if not speak it. Or they might not recognize it at all. What happens in the latter case varies widely across screen readers.

In the paragraph example of my test page, JAWS (using its default synthesizer Eloquence) supports English, German, Spanish, and French, and reads them all fluently. It recognizes Japanese and announces it as “Japanese”, but does not attempt to read it. It does not recognize Nepali or Inuktitut, so it announces them as “ne” and “iu” respectively (the value of their lang attributes), and does not attempt to read their content. “Ne” is pronounced just like in Monty Python and the Holy Grail, and “iu” is pronounced “you”.

The experience with NVDA is much worse for unrecognized and non-supported languages. NVDA with its default synthesizer (eSpeak) supports the same languages JAWS does by default, but for the languages it doesn’t support it spells out the words. For the Japanese text this is a surreal experience, as it recognizes some characters as Chinese and others as Japanese. The entire phrase 私は日本語を話します (“I speak Japanese”) as spoken by NVDA is “Chinese letter, Japanese letter, Chinese letter letter letter, Japanese letter, Chinese letter, Japanese letter letter letter” (rhythmic, but not particularly informative). If an NVDA user inspects this content one character at a time, NVDA adds the Unicode numerical value for each character (e.g., “Japanese letter 307E”).

I included Window-Eyes in my test because it’s rarely included in reports like these and I personally know a few Window-Eyes users. One major annoyance though is that Window-Eyes has 26 synthesizers to choose from in Settings (impressive!) but every time I select a new synthesizer I’m invited to purchase an activation for that synthesizer from AI Squared. I have no tolerance for in-app purchases when I’ve already paid for and licensed a product, so I decided not to test Window-Eyes after all. I did do a cursory test and found that Window-Eyes seems to behave the same as NVDA (both using the eSpeak synthesizer).

As noted above, VoiceOver on Mac OS X doesn’t support automatic switching between languages (as of El Capitan). So all content is read using the default language/voice. When it encounters characters it doesn’t understand, it behaves essentially the same as NVDA and spells out the words, but does so with a bit more linguistic suavity than NVDA does. For example, where NVDA announces “Chinese letter” and “Japanese letter”, VoiceOver announces the Japanese lettering system such as “Hiragana”, “Katakana” and “Kanji”. Similarly, Nepali letters in my example were identified as “Devanagari”, and Inuktitut letters were identified as “Canadian Syllabics”.

VoiceOver on iOS, which does switch automatically between languages, is the only screen reader I tested that supports Japanese out of the box. Predictably though, it does not support Nepali or Inuktitut, and when it encounters content in an unsupported language it simply says “unpronounceable”. It doesn’t make any effort to identify the language. This results in less noise, but also less information.

I included Microsoft Narrator in my tests because there’s a lot of buzz about how much it’s improved in Windows 10, maybe even to the point of being a serious screen reader. I think that’s true in a lot of ways, but it doesn’t support automatic switching between languages, and when it encounters languages it doesn’t recognize, it says nothing, like the kid in the back of the class who’s too shy and frightened to say “I don’t know” when called upon.

The experience with TalkBack is the same as with Narrator: No automatic language switching, and silence when confronted with the unknown.

Since there are only 185 language codes defined by the ISO 639-1 standard, it’s reasonable to expect screen readers to recognize all language codes and announce the name of the language. Acknowledging when content is “unpronounceable” is ok, but users should at least know what language the unpronounceable content is in.

But since screen readers don’t currently do that, using native language names in a language picker doesn’t work so well across screen readers. Granted, your website may not be translated into Nepali or Inuktitut, and if you’re only supporting more common Western languages like English, Spanish, French, and German, then your users’ screen readers may indeed support all of these languages. Unfortunately there’s no way of knowing which screen readers and synthesizers your users are using, and which languages they support. Plus, some of your users are sure to be using screen readers such as VoiceOver on Mac OS X or Microsoft Narrator (maybe?) that don’t support language switching at all.

Native Language Name with English Fallback

Since screen readers don’t reliably identify languages they don’t support, I think it’s a good idea to also provide the name of the language in English.

One could provide that as a title attribute, but then it’s only available to sighted users if they mouse over the native name, plus screen readers don’t read titles on links if link text is present.
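For reference, the title-based option being set aside here would look something like this sketch. The /ja/ URL is an assumed path, following the pattern of the /es/ link earlier:

```html
<!-- The English name lives only in the title attribute. Note that lang
     applies to an element's text-bearing attributes as well as its
     content, so this title is formally declared to be Japanese too. -->
<a href="/ja/" lang="ja" title="Japanese">日本語</a>
```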

A better option may be to make the English language visible, adjacent to the native language, so it’s clear to everyone what the language is (see Example 2 on my test page). If you do this it’s important for the native name and English name to be wrapped in separate elements, each with its own lang attribute (actually, the lang attribute is only necessary on the element containing the native language, since the English name element inherits the lang value of the document).

Here’s what the HTML might look like for a link to the Japanese version within a list of languages:

<li>
  <a href="/ja/">
    <span lang="ja">日本語</span>
    <span lang="en">Japanese</span>
  </a>
</li>

This actually works reasonably well with screen readers. JAWS and NVDA both read or spell the native language as they normally would, but if it’s an unsupported language and the outcome is noise that’s tolerable since it’s accompanied by an English translation.

Unfortunately VoiceOver in iOS is a party pooper and doesn’t support lang attributes on a <span> inside of an <a> element. The only way to get VoiceOver to read the native names in their proper voices is to place the lang attribute on the <a> element. However, if we do that VoiceOver also pronounces the English name in the native language with a thick accent, even if the span for the English name has lang="en".
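The variant described here, with lang moved onto the link itself, would look roughly like the sketch below. It is shown only to illustrate the problem, not as a recommendation:

```html
<li>
  <!-- lang moved up to the link so VoiceOver on iOS picks the Japanese voice -->
  <a href="/ja/" lang="ja">
    日本語
    <!-- Marked as English, but reportedly still read in the Japanese voice -->
    <span lang="en">Japanese</span>
  </a>
</li>
```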

Another approach would be to move the English name so it’s outside the link in a separate container with lang="en", then reference it with an aria-describedby attribute, like so:

<li>
  <a lang="ja" href="/ja/" aria-describedby="eng-ja">日本語</a>
  <span lang="en" id="eng-ja">Japanese</span>
</li>

So the link is in Japanese, and its supplemental description is in English, and the lang attributes of both elements reflect their languages. See a live example of this in Example 3 on my test page. The expected behavior would be for screen readers to announce the link text in its native language if that’s a supported language, otherwise to announce it however it typically does (by spelling it out or declaring it “unpronounceable”). Then, after a brief pause, it would read the English description, in effect translating whatever it just said.
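Extended to a full list, the same pattern might look like the sketch below. The URLs and id values are assumptions made for illustration; the only requirement is that each aria-describedby value matches the id of its English name:

```html
<!-- Assumes the page itself declares <html lang="en">. -->
<ul>
  <!-- English matches the page language, so no lang or description is needed -->
  <li><a href="/">English</a></li>
  <li>
    <a lang="es" href="/es/" aria-describedby="eng-es">Español</a>
    <span lang="en" id="eng-es">Spanish</span>
  </li>
  <li>
    <a lang="fr" href="/fr/" aria-describedby="eng-fr">Français</a>
    <span lang="en" id="eng-fr">French</span>
  </li>
  <li>
    <a lang="ja" href="/ja/" aria-describedby="eng-ja">日本語</a>
    <span lang="en" id="eng-ja">Japanese</span>
  </li>
</ul>
```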

This works as expected in NVDA.

It also works as expected in JAWS, but only in Firefox. In IE JAWS does not support aria-describedby on links when navigating with the tab key. However, if reading the page using the virtual cursor JAWS in IE does read all content in the proper language. JAWS in Firefox may actually be the best at handling this code: For languages it doesn’t support it announces the language, and for those it doesn’t recognize it announces the language code (“ne” and “iu”). That’s not particularly helpful but at least it’s succinct; arguably better for most users than spelling out each letter in Unicode. And since JAWS supports aria-describedby in Firefox, its brief effort to identify the language, whether or not it’s informative, is followed after a brief pause by an English translation. Not bad! If only it worked this well in IE too!

The only problem with JAWS in Firefox is that it reads its own verbiage (e.g., “radio button not checked”) in the native voice, even though it’s English verbiage. But at this point I’m willing to accept that as cute.

In VoiceOver on iOS, each native name is now read in the native voice. However, the English name is also read in the native voice. This sucks, but I really can’t fault VoiceOver for doing this. The description, when referenced via aria-describedby from another element, essentially becomes an attribute of the host element (i.e., the one being described). If the value of the description’s lang attribute conflicts with the value of the host element’s lang attribute, which lang should be applied to the description? That’s a tough call, and I don’t think it’s spelled out explicitly in the ARIA spec.

So what have we learned? Nothing is perfect. Every screen reader/browser combination has one or more quirks in the way lang attributes and foreign characters are rendered. Nevertheless, I think using aria-describedby to reference a separate container that contains the English name with lang="en" makes sense semantically. So let’s stick with that model as we build a slightly more complex widget…

Applying This to Radio Buttons

As explained earlier, my primary interest in a well-coded, accessible language picker stems from Able Player’s support for subtitles. In that use case, a menu of available subtitle languages pops up when users click the CC button on the player. The menu is coded as an unordered list of radio buttons. Based on lessons learned from the previous example, here’s how this might be coded:

<li>
  <input lang="ja" type="radio" name="lang" id="lang-ja" 
    value="ja" aria-describedby="eng-ja"/>
  <label lang="ja" for="lang-ja">日本語</label>
  <span lang="en" id="eng-ja">Japanese</span>
</li>

Note that both the <input> and <label> elements have lang="ja". That’s because it isn’t obvious which one to choose if we had to pick just one. I tested both on my test page (Examples 4 and 5) and here’s what I found:

VoiceOver on iOS honors the language of the input, not the label.
JAWS in both Firefox and IE honors the language of the input, not the label.
NVDA doesn’t honor either one. It announces all form fields using the language of the page.

This would seem to suggest that there’s no reason to add lang to the label, since both VoiceOver and JAWS ignore that in favor of the input’s language, but I figure there’s no reason not to add a lang attribute, just in case some screen reader supports it someday.
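The two single-attribute variants can be sketched as follows. This is not necessarily identical to Examples 4 and 5 on the test page; the id and name values are made up so that both variants can coexist on one page:

```html
<!-- Variant A: lang on the input only. Per the findings above, VoiceOver
     on iOS and JAWS would be expected to use the Japanese voice here. -->
<li>
  <input lang="ja" type="radio" name="lang-a" id="a-ja"
    value="ja" aria-describedby="a-eng-ja">
  <label for="a-ja">日本語</label>
  <span lang="en" id="a-eng-ja">Japanese</span>
</li>

<!-- Variant B: lang on the label only. Per the findings above, none of
     the tested screen readers honored the label's language. -->
<li>
  <input type="radio" name="lang-b" id="b-ja"
    value="ja" aria-describedby="b-eng-ja">
  <label lang="ja" for="b-ja">日本語</label>
  <span lang="en" id="b-eng-ja">Japanese</span>
</li>
```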

So how do screen readers support the English text with this markup?

JAWS in Firefox honors the language of the description, and reads the text in English (perfect!)
JAWS in IE does not read the English text at all when navigating in Forms mode. However, in virtual mode it correctly honors the value of all lang attributes.
VoiceOver in iOS reads the English name in the language of the input (the native language).
Both NVDA and VoiceOver on Mac OS X read everything in the language of the page.

Conclusion

If you’re a multilingual screen reader user accessing properly tagged multilingual websites, I recommend using JAWS with Firefox!

As a developer, the biggest takeaway for me from these experiments is that unsupported and unrecognized languages are not rendered well by screen readers, which reinforces the need to supplement the noise with a translation in English (or the primary language of the site, if not English).
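On a site whose primary language is not English, the same pattern applies with the fallback name written in the site’s language. A rough sketch for a Spanish-language site, where the id value and URL are hypothetical:

```html
<!-- Assumes the page declares <html lang="es">, so the fallback name
     inherits Spanish without needing its own lang attribute. -->
<li>
  <a lang="ja" href="/ja/" aria-describedby="es-ja">日本語</a>
  <span id="es-ja">japonés</span>
</li>
```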

A language picker is admittedly a relatively minor feature rarely encountered in the wild, especially the extremely rare language picker that’s been coded with accessibility in mind. So it’s not surprising that screen readers are all over the world map in how they handle my sample markup. But we should code according to standards anyway, using lang attributes in logical places, and hopefully screen reader makers will eventually fix their anomalies.

But of course, none of this matters if we in the United States elect a president who initiates Armageddon. So be sure to vote wisely in November!

¡Regístrate para votar aquí! La elección es el martes, 8 de noviembre.
