Proposed HTML element language subtag matches language
Description
This rule checks that the primary language subtag of an element matches its default language.
Applicability
This rule applies to any HTML element with a lang attribute for which all the following are true:
- in body: the element is an inclusive descendant in the flat tree of a body element; and
- HTML: the element is in a document with a content type of text/html; and
- Valid language: the element’s lang attribute value has a known primary language tag; and
- Not empty: there is some text inheriting its programmatic language from the element which is neither empty nor only whitespace.
Expectation
For each test target, the primary language of its lang attribute value is a most common language of the test target.
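The expectation amounts to a set-membership check. The following is a minimal sketch, assuming the most common languages of the test target have already been determined (by a reviewer or by some detector outside this sketch) and are given as primary language subtags. The function name `meets_expectation` is illustrative and not part of the rule.

```python
from typing import Set


def meets_expectation(lang_value: str, most_common_languages: Set[str]) -> bool:
    """Return True if the primary subtag of lang_value is a most common language.

    Subtags are compared case-insensitively, as the rule's glossary requires.
    """
    # The primary language subtag is the part before the first hyphen.
    primary = lang_value.split("-")[0].lower()
    return primary in {language.lower() for language in most_common_languages}
```

With this sketch, a lang value of "nl" passes when Dutch is the most common language, and a value of "fr" passes when the text is equally English and French.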
Assumptions
- This rule assumes that user agents and assistive technologies can programmatically determine known primary language tags even on language tags that do not conform to the RFC 5646 syntax.
- This rule assumes that only language tags with a known primary language tag are enough to satisfy Success Criterion 3.1.2 Language of Parts; this notably excludes grandfathered tags and ISO 639-2 three-letter codes, both having poor support in assistive technologies.
- This rule assumes that the text nodes contain text that expresses something in human language and therefore needs a correct programmatic language.
Accessibility Support
There are no major accessibility support issues known for this rule.
Background
This rule checks that, if a lang attribute is used, its value is correct with respect to the content. This rule does not check whether a lang attribute should have been used or not. In particular, this rule does not check when lang attributes are missing. This must be tested separately and it is therefore possible to pass this rule without satisfying Success Criterion 3.1.2 Language of Parts.
Related rules
Bibliography
- Understanding Success Criterion 3.1.2: Language of Parts
- H58: Using language attributes to identify changes in the human language
- RFC 5646: Tags for Identifying Languages
- The lang and xml:lang attributes
In all examples, the html element itself has a lang attribute in order to make sure that the examples satisfy Success Criterion 3.1.1 Language of Page. These html elements are, however, never applicable because they are not descendants of a body element, and the example descriptions do not mention them further.
Accessibility Requirements Mapping
3.1.2 Language of Parts (Level AA)
- Learn more about 3.1.2 Language of Parts
- Required for conformance to WCAG 2.0 and later on level AA and higher.
- Outcome mapping:
  - Any failed outcomes: success criterion is not satisfied
  - All passed outcomes: success criterion is satisfied
  - An inapplicable outcome: success criterion needs further testing
H58: Using language attributes to identify changes in the human language
- Learn more about technique H58
- Not required for conformance to any W3C accessibility recommendation.
- Outcome mapping:
  - Any failed outcomes: technique is not satisfied
  - All passed outcomes: technique is satisfied
  - An inapplicable outcome: technique is satisfied
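The two outcome mappings above differ only in how an inapplicable outcome is read. A minimal sketch of that logic follows; the function name and the `kind` labels ("success criterion", "technique") are chosen for this illustration and are not defined by the rule.

```python
from typing import List


def requirement_status(outcomes: List[str], kind: str) -> str:
    """Map the outcomes of this rule to the status of a requirement.

    kind is "success criterion" or "technique" (labels assumed for this sketch).
    """
    if "failed" in outcomes:
        return "not satisfied"
    if "inapplicable" in outcomes:
        # An inapplicable outcome leaves the success criterion undecided,
        # whereas the technique is considered satisfied.
        if kind == "success criterion":
            return "needs further testing"
        return "satisfied"
    return "satisfied"
```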
Input Aspects
The following aspects are required in using this rule.
Test Cases
Passed
Passed Example 1
This span element has a lang attribute value of nl (Dutch), which matches its most common language. The most common language is Dutch because all words are Dutch.
<html lang="en">
<head>
<title>Dutch idioms</title>
</head>
<body>
<p>
The Dutch phrase <span lang="nl">"Hij ging met de kippen op stok"</span> literally translates into "He went to
roost with the chickens", but it means that he went to bed early.
</p>
</body>
</html>
Passed Example 2
The second p element has a lang attribute value of nl (Dutch), which matches its most common language. The most common language is Dutch because all English words are in span elements with a lang attribute value of en. Both span elements also have a lang attribute matching their most common language.
<html lang="en">
<head>
<title>Dutch idioms</title>
</head>
<body>
<p>Dutch idioms and their English meaning.</p>
<p lang="nl">
<span lang="en">The Dutch phrase</span> "Hij ging met de kippen op stok"
<span lang="en"
>literally translates into "He went to roost with the chickens", but it means that he went to bed early.</span
>
</p>
</body>
</html>
Passed Example 3
This div element has a lang attribute value of en (English), which matches its most common language. The most common language is English because the accessible texts are English, and all other text is in a p element with a (correct) lang attribute value of fr.
<html lang="FR">
<head>
<title>Feu d'artifice du nouvel an</title>
</head>
<body>
<div lang="EN">
<img src="/test-assets/shared/fireworks.jpg" alt="Fireworks over Paris" />
<p lang="FR">
Bonne année !
</p>
</div>
</body>
</html>
Passed Example 4
This span element has a lang attribute value of fr (French), which matches one of its most common languages. The most common languages are both English and French because all the words belong to both languages.
<html lang="en">
<p>
Even though all its words are English and it has meaning in English, the sentence
<span lang="fr">Paul put dire comment on tape</span> is also a French sentence.
</p>
</html>
Passed Example 5
This span element has a lang attribute value of en (English), which matches one of its most common languages. The most common languages are both English and French because all the words belong to both languages.
<html lang="fr">
<p>
Bien que tous ses mots soient français et qu'elle ait un sens en français, la phrase
<span lang="en">Paul put dire comment on tape</span> est aussi une phrase anglaise.
</p>
</html>
Failed
Failed Example 1
This span element has a lang attribute value of fr (French), which does not match its most common language. The most common language is Dutch because all words are Dutch.
<html lang="en">
<head>
<title>Dutch idioms</title>
</head>
<body>
<p>
The Dutch phrase <span lang="fr">"Hij ging met de kippen op stok"</span> literally translates into "He went to
roost with the chickens", but it means that he went to bed early.
</p>
</body>
</html>
Failed Example 2
The second p element has a lang attribute value of en (English), which does not match its most common language. The most common language is Dutch because all English words are in span elements with a lang attribute value of fr. Both span elements also have an incorrect lang attribute in order to make sure that all targets in this example fail the rule.
<html lang="nl">
<head>
<title>Met de kippen op stok</title>
</head>
<body>
<blockquote>
<p>"Hij ging met de kippen op stok"</p>
</blockquote>
<p lang="en">
<span lang="fr">The Dutch phrase</span> "Hij ging met de kippen op stok"
<span lang="fr"
>literally translates into "He went to roost with the chickens", but it means that he went to bed early.</span
>
</p>
</body>
</html>
Failed Example 3
This div element has a lang attribute value of fr (French), which does not match its most common language. The most common language is English because the accessible texts are English, and all other text is in a p element with a lang attribute value of nl, which also doesn’t match its most common language.
<html lang="fr">
<head>
<title>Feu d'artifice du nouvel an</title>
</head>
<body>
<div lang="fr">
<img src="/test-assets/shared/fireworks.jpg" alt="Fireworks over Paris" />
<p lang="nl">
Bonne année !
</p>
</div>
</body>
</html>
Failed Example 4
This div element has a lang attribute value of fr (French), which does not match its most common language. The most common language is English because the accessible name of the img element is English. The lang attribute on the p element is effectively ignored. The p element is not applicable because there is no text inheriting its programmatic language from it since its content is neither visible nor included in the accessibility tree.
<html lang="fr">
<head>
<title>Feu d'artifice du nouvel an</title>
</head>
<body>
<div lang="fr">
<img src="/test-assets/shared/fireworks.jpg" aria-labelledby="caption" />
<p lang="en" id="caption" hidden>
Fireworks over Paris
</p>
</div>
</body>
</html>
Inapplicable
Inapplicable Example 1
This document is not HTML.
<svg xmlns="http://www.w3.org/2000/svg" lang="en">
<text x="0" y="0">I love ACT rules!</text>
</svg>
Inapplicable Example 2
There is no descendant of a body element with a lang attribute.
<html lang="en">
<body>
<p>I love ACT rules!</p>
</body>
</html>
Inapplicable Example 3
The first p element is empty because the only element inheriting its programmatic language is itself, and it has no text node child.
<html lang="en">
<body>
<p lang="fr"></p>
<p>I love ACT rules!</p>
</body>
</html>
Inapplicable Example 4
This p element is empty because it has no content that is either visible or included in the accessibility tree.
<html lang="en">
<body>
<p lang="fr" hidden>I love ACT rules!</p>
</body>
</html>
Inapplicable Example 5
The text inheriting its programmatic language from this div element is only whitespace.
<html lang="en">
<body>
<div lang="invalid"> </div>
</body>
</html>
Glossary
Accessible Name
The accessible name is the programmatically determined name of a user interface element that is included in the accessibility tree.
The accessible name is calculated using the accessible name and description computation.
For native markup languages, such as HTML and SVG, additional information on how to calculate the accessible name can be found in HTML Accessibility API Mappings 1.0, Accessible Name and Description Computation (working draft) and SVG Accessibility API Mappings, Name and Description (working draft).
For more details, see examples of accessible name.
Note: As per the accessible name and description computation, each element always has an accessible name. When no accessible name is provided, the element will nonetheless be assigned an empty ("") one.
Note: As per the accessible name and description computation, accessible names are flat strings trimmed of leading and trailing whitespace. Notably, it is not possible for a non-empty accessible name to be composed only of whitespace since these must be trimmed.
Attribute value
The attribute value of a content attribute set on an HTML element is the value that the attribute gets after being parsed and computed according to specifications. It may differ from the value that is actually written in the HTML code due to trimming whitespace or non-digit characters, default values, or case-insensitivity.
Some notable cases of attribute values, among others:
- For enumerated attributes, the attribute value is either the state of the attribute, or the keyword that maps to it; even for the default states. Thus <input type="image" /> has an attribute value of either Image Button (the state) or image (the keyword mapping to it), both formulations having the same meaning; similarly, “an input element with a type attribute value of Text” can be either <input type="text" />, <input /> (missing value default), or <input type="invalid" /> (invalid value default).
- For boolean attributes, the attribute value is true when the attribute is present and false otherwise. Thus <button disabled>, <button disabled="disabled"> and <button disabled=""> all have a disabled attribute value of true.
- For attributes whose value is used in a case-insensitive context, the attribute value is the lowercase version of the value written in the HTML code.
- For attributes that accept numbers, the attribute value is the result of parsing the value written in the HTML code according to the rules for parsing this kind of number.
- For attributes that accept sets of tokens, whether space separated or comma separated, the attribute value is the set of tokens obtained after parsing the set and, depending on the case, converting its items to lowercase (if the set is used in a case-insensitive context).
- For aria-* attributes, the attribute value is computed as indicated in the WAI-ARIA specification and the HTML Accessibility API Mappings.
This list is not exhaustive, and only serves as an illustration for some of the most common cases.
The attribute value of an IDL attribute is the value returned on getting it. Note that when an IDL attribute reflects a content attribute, they have the same attribute value.
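Two of the cases above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, `None` stands for an absent attribute, and the keyword table covers just three states of the input element's type attribute rather than the full list in the HTML specification.

```python
from typing import Optional

# Assumed, deliberately incomplete keyword-to-state table for <input type>.
INPUT_TYPE_STATES = {
    "text": "Text",
    "image": "Image Button",
    "checkbox": "Checkbox",
}


def boolean_attribute_value(raw: Optional[str]) -> bool:
    """A boolean attribute is true whenever it is present, whatever its text."""
    return raw is not None


def input_type_state(raw: Optional[str]) -> str:
    """Resolve the state of the type attribute of an input element."""
    if raw is None:
        return "Text"  # missing value default
    # Keywords are matched case-insensitively; unknown keywords fall back
    # to the invalid value default, which is also the Text state.
    return INPUT_TYPE_STATES.get(raw.lower(), "Text")
```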
Focusable
An element is focusable if one or both of the following are true:
- the element is part of sequential focus navigation; or
- the element has a tabindex value that is not null.
Exception: Elements that lose focus during a period of up to 1 second after gaining focus, without the user interacting with the page the element is on, are not considered focusable.
Notes:
- The 1 second time span is an arbitrary limit which is not included in WCAG. Given that scripts can manage the focus state of elements, testing the focusability of an element consistently would be impractical without a time limit.
- The tabindex value of an element is the value of the tabindex attribute parsed using the rules for parsing integers. For the tabindex value to be different from null, it needs to be parsed without errors.
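The tabindex value described in the second note can be approximated as follows. This is a sketch of the HTML rules for parsing integers as understood here (leading ASCII whitespace skipped, optional sign, ASCII digits, trailing characters ignored), not a reference implementation; the function name is hypothetical and `None` stands for both a missing attribute and a parse error.

```python
import re
from typing import Optional

# Optional ASCII whitespace, an optional sign, then one or more ASCII digits.
# Anything after the digits is ignored.
_INTEGER_PREFIX = re.compile(r"[\t\n\f\r ]*([+-]?[0-9]+)")


def tabindex_value(raw: Optional[str]) -> Optional[int]:
    """Return the parsed tabindex, or None if it is absent or fails to parse."""
    if raw is None:
        return None
    match = _INTEGER_PREFIX.match(raw)
    return int(match.group(1)) if match else None
```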
Included in the accessibility tree
Elements included in the accessibility tree of platform specific accessibility APIs are exposed to assistive technologies. This allows users of assistive technology to access the elements in a way that meets the requirements of the individual user.
The general rules for when elements are included in the accessibility tree are defined in the core accessibility API mappings. For native markup languages, such as HTML and SVG, additional rules for when elements are included in the accessibility tree can be found in the HTML accessibility API mappings (working draft) and the SVG accessibility API mappings (working draft).
For more details, see examples of included in the accessibility tree.
Programmatically hidden elements are removed from the accessibility tree. However, some browsers will leave focusable elements with an aria-hidden attribute set to true in the accessibility tree. Because they are hidden, these elements are considered not included in the accessibility tree. This may cause confusion for users of assistive technologies because they may still be able to interact with these focusable elements using sequential keyboard navigation, even though the element should not be included in the accessibility tree.
Known Primary Language Tag
A language tag has a known primary language tag if its primary language subtag exists in the language subtag registry with a Type field whose field-body value is language.
A “language tag” is here to be understood as in the first paragraph of the RFC 5646 language tag syntax, i.e. a sequence of subtags separated by hyphens, where a subtag is any sequence of alphanumerical characters. Language tags that are not valid according to the stricter RFC 5646 syntax (and ABNF grammar) definition can still have a known primary language tag. User agents and assistive technologies are more lenient in what they accept. This definition is consistent with the behavior of the :lang() pseudo-selector as defined by Selectors Level 3.
As an example, de-hello would be an accepted way to indicate German in current user agents and assistive technologies, despite not being valid according to RFC 5646 grammar. It has a known primary language tag (namely, de).
As a consequence of this definition, however, grandfathered tags do not have a known primary language tag.
Subtags, notably the primary language subtag, are case insensitive. Comparison with the language subtag registry must be done in a case insensitive way.
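The definition can be sketched as a lookup of the first subtag. In the sketch below, `SAMPLE_LANGUAGE_SUBTAGS` is a hypothetical four-entry excerpt standing in for the registry entries of Type language; a real check would load the full language subtag registry. The function names are illustrative.

```python
# Hypothetical excerpt of the registry: primary language subtags whose
# Type field is "language". The real registry is far larger.
SAMPLE_LANGUAGE_SUBTAGS = frozenset({"de", "en", "fr", "nl"})


def primary_subtag(tag: str) -> str:
    """Return the first hyphen-separated subtag, lowercased for comparison."""
    return tag.split("-")[0].lower()


def has_known_primary_language_tag(tag: str, registry=SAMPLE_LANGUAGE_SUBTAGS) -> bool:
    """Lenient check: only the primary subtag is looked up in the registry."""
    return primary_subtag(tag) in registry
```

Under these assumptions, de-hello is accepted because its primary subtag is de, while a grandfathered tag such as i-klingon is rejected because its first subtag, i, is not a language subtag.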
Most Common Language of an Element
The most common language of an element is determined by counting the number of words in the text inheriting its programmatic language from this element that are part of any of the languages in the language subtag registry. The same word can be part of multiple languages. In case of ties, the element has several most common languages. If there are no words in the text inheriting its programmatic language from the element, then it has no most common language.
For more details, see examples of most common language.
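The counting described in this definition, including ties, can be illustrated with a toy model. Everything in the sketch is an assumption made for illustration: `TOY_LEXICON` is a tiny hand-made word list rather than a real dictionary, words are split on whitespace with basic punctuation stripped, and a word unknown to the lexicon is simply not counted.

```python
from collections import Counter
from typing import Dict, Set

# Hypothetical toy word lists; a real implementation would need proper
# dictionaries or a language detection model.
TOY_LEXICON: Dict[str, Set[str]] = {
    "en": {"paul", "put", "dire", "comment", "on", "tape"},
    "fr": {"paul", "put", "dire", "comment", "on", "tape", "bonne", "année"},
    "nl": {"hij", "ging", "met", "de", "kippen", "op", "stok"},
}


def most_common_languages(text: str, lexicon=TOY_LEXICON) -> Set[str]:
    """Return every language tied for the highest word count (may be empty)."""
    counts: Counter = Counter()
    for raw_word in text.split():
        word = raw_word.strip('.,!?"').lower()
        if not word:
            continue
        # The same word may count for several languages.
        for language, vocabulary in lexicon.items():
            if word in vocabulary:
                counts[language] += 1
    if not counts:
        return set()  # no countable words: no most common language
    best = max(counts.values())
    return {language for language, n in counts.items() if n == best}
```

With this lexicon, the Dutch idiom from the test cases yields only nl, and "Paul put dire comment on tape" yields both en and fr.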
Namespaced Element
An element with a specific namespaceURI value from HTML namespaces. For example an “SVG element” is any element with the “SVG namespace”, which is http://www.w3.org/2000/svg.
Namespaced elements are not limited to elements described in a specification. They also include custom elements. Elements such as a and title have a different namespace depending on where they are used. For example a title in an HTML page usually has the HTML namespace. When used in an svg element, a title element has the SVG namespace instead.
Outcome
An outcome is a conclusion that comes from evaluating an ACT Rule on a test subject or one of its constituent test targets. An outcome can be one of the three following types:
- Inapplicable: No part of the test subject matches the applicability
- Passed: A test target meets all expectations
- Failed: A test target does not meet all expectations
Note: A rule has one passed or failed outcome for every test target. When there are no test targets the rule has one inapplicable outcome. This means that each test subject will have one or more outcomes.
Note: Implementations using the EARL10-Schema can express the outcome with the outcome property. In addition to passed, failed and inapplicable, EARL 1.0 also defined an incomplete outcome. While this cannot be the outcome of an ACT Rule when applied in its entirety, it often happens that rules are only partially evaluated. For example, when applicability was automated, but the expectations have to be evaluated manually. Such “interim” results can be expressed with the incomplete outcome.
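The first note in this definition (one passed or failed outcome per test target, a single inapplicable outcome when there are none) can be written down directly. A minimal sketch, assuming each test target has already been evaluated to a boolean; the function name is hypothetical.

```python
from typing import List


def outcomes_for_subject(target_meets_expectations: List[bool]) -> List[str]:
    """Return the outcomes of one rule for one test subject.

    Each entry of the input says whether one test target met all expectations.
    """
    if not target_meets_expectations:
        # No test targets: the rule has exactly one inapplicable outcome.
        return ["inapplicable"]
    return ["passed" if ok else "failed" for ok in target_meets_expectations]
```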
Programmatically Hidden
An HTML element is programmatically hidden if either it has a computed CSS property visibility whose value is not visible; or at least one of the following is true for any of its inclusive ancestors in the flat tree:
- has a computed CSS property display of none; or
- has an aria-hidden attribute set to true
Note: Contrary to the other conditions, the visibility CSS property may be reverted by descendants.
Note: The HTML standard suggests rendering elements with the hidden attribute with a CSS rule that applies the value none to the CSS property display of the element. Although the suggestion is not normative, known user agents render it according to the suggestion (unless the content specifies another CSS rule that sets the value of the display property). If a user agent does not follow the suggestion, this definition may produce incorrect results for this user agent.
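The definition of programmatically hidden can be sketched over a simplified tree. The `Node` class below is a hypothetical stand-in for a flat-tree element: it stores already-computed values for visibility and display, plus the aria-hidden attribute and a parent link. It is not a DOM API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical flat-tree element carrying computed style values."""
    visibility: str = "visible"      # computed value, inheritance resolved
    display: str = "block"           # computed value
    aria_hidden: Optional[str] = None
    parent: Optional["Node"] = None


def is_programmatically_hidden(node: Node) -> bool:
    # visibility is checked on the element itself only: because the value is
    # computed, a descendant that reverts it is correctly treated as visible.
    if node.visibility != "visible":
        return True
    # display and aria-hidden are checked on every inclusive ancestor.
    current: Optional[Node] = node
    while current is not None:
        if current.display == "none" or current.aria_hidden == "true":
            return True
        current = current.parent
    return False
```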
Text Inheriting its Programmatic Language from an Element
The text inheriting its programmatic language from an element E is composed of all the following texts:
- text nodes: the value of any text nodes that are visible or included in the accessibility tree and children of an element inheriting its programmatic language from E;
- accessible text: the accessible name and accessible description of any element inheriting its programmatic language from E, and included in the accessibility tree;
- page title: the value of the document title, only if E is a document in a top-level browsing context.
An element F is an element inheriting its programmatic language from an element E if at least one of the following conditions is true (recursively):
- F is E itself (an element always inherits its programmatic language from itself); or
- F does not have a non-empty lang attribute, and is the child in the flat tree of an element inheriting its programmatic language from E; or
- F is a fully active document element, has no non-empty lang attribute, and its browsing context container is an element inheriting its programmatic language from E.
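The recursive definition can be sketched as a tree walk that stops at any descendant carrying its own non-empty lang attribute. This sketch is deliberately partial: the `Element` class is a hypothetical model, only text nodes are collected, and visibility, the accessibility tree, accessible names and descriptions, page titles, and nested documents are all ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Element:
    """Hypothetical element: children are nested elements or text strings."""
    tag: str
    lang: Optional[str] = None
    children: List[Union["Element", str]] = field(default_factory=list)


def text_inheriting_language(element: Element) -> List[str]:
    """Collect text nodes inheriting their programmatic language from element."""
    texts: List[str] = []
    for child in element.children:
        if isinstance(child, str):
            texts.append(child)
        elif not child.lang:
            # Missing or empty lang: the child keeps inheriting from element.
            texts.extend(text_inheriting_language(child))
        # A child with a non-empty lang starts its own language scope.
    return texts
```

Applied to a structure like Passed Example 2, only the Dutch idiom is returned for the p element, because both span elements declare their own language.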
Visible
Content perceivable through sight.
Content is considered visible if making it fully transparent would result in a difference in the pixels rendered for any part of the document that is currently within the viewport or can be brought into the viewport via scrolling.
For more details, see examples of visible.
Whitespace
Whitespace characters are the characters that have the Unicode “White_Space” property in the Unicode properties list.
This includes:
- all characters in the Unicode Separator categories, and
- the following characters in the Other, Control category:
- Character tabulation (U+0009)
- Line Feed (LF) (U+000A)
- Line Tabulation (U+000B)
- Form Feed (FF) (U+000C)
- Carriage Return (CR) (U+000D)
- Next Line (NEL) (U+0085)
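A check for "empty or only whitespace" under this definition can be sketched with an explicit set of code points. The set below lists the characters with the White_Space property as understood at the time of writing; it is spelled out by hand because Python's built-in `str.isspace()` does not follow exactly the same list. The names are illustrative.

```python
# Code points carrying the Unicode White_Space property.
WHITE_SPACE = frozenset(
    [chr(c) for c in range(0x0009, 0x000E)]       # U+0009..U+000D (controls)
    + ["\u0020", "\u0085", "\u00A0", "\u1680"]
    + [chr(c) for c in range(0x2000, 0x200B)]     # U+2000..U+200A
    + ["\u2028", "\u2029", "\u202F", "\u205F", "\u3000"]
)


def is_empty_or_whitespace(text: str) -> bool:
    """True for the empty string or a string made only of whitespace."""
    return all(character in WHITE_SPACE for character in text)
```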
Implementations
There are currently no known implementations for this rule. If you would like to contribute an implementation, please read the ACT Implementations page for details.
Changelog
This is the first version of this ACT rule.