AWS for M&E Blog
Back to basics: Accessibility signaling with AWS Elemental Media Services
In the first part of this series, Back to basics: Accessibility services for Media, we reviewed the accessibility services available for broadcast and streaming today. In this second part, we look at how these accessibility requirements can be met using Amazon Web Services (AWS) Elemental Media Services.
In many countries, regulators require broadcasters to increase the accessibility of their content. TV companies are now electing to apply these regulations to their Over The Top (OTT) streaming offerings as well. They want to preserve the broadcast end-user experience as more and more viewers consume media content delivered over the internet.
Streaming over the internet is not trivial, as the range of devices that can receive media content is much broader than in broadcast. Signaling of the accessibility components therefore becomes a key part of content processing. To help with this, OTT streaming formats describe how accessibility services should be signaled in their specifications, such as Apple’s HTTP Live Streaming (HLS) specification and the DVB organization’s DVB-DASH (Dynamic Adaptive Streaming over HTTP) specification. DASH and HLS are the two most commonly used streaming formats today.
In HLS, accessibility information is typically signaled through:
- EXT-X-MEDIA tags in the variant playlist, with:
  - CHARACTERISTICS attribute: Indicates accessibility features like “public.accessibility.describes-video” or “public.accessibility.transcribes-spoken-dialog”
  - LANGUAGE attribute: Specifies the language of the audio or subtitle track
  - TYPE attribute: Can be “CLOSED-CAPTIONS” or “SUBTITLES”
- CEA-608/708 closed captions, signaled using:
  - CLOSED-CAPTIONS attribute in the EXT-X-STREAM-INF tag
  - “CLOSED-CAPTIONS=NONE”, which indicates no embedded closed captions
This helps players and streaming services properly handle accessibility features like closed captions, subtitles, and audio descriptions for viewers who need them.
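As an illustration, these attributes might appear in a playlist along the following lines (group IDs, names, and URIs here are hypothetical):

```
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",LANGUAGE="en",AUTOSELECT=YES,CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog",URI="subs_en.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="English AD",LANGUAGE="en",CHARACTERISTICS="public.accessibility.describes-video",URI="audio_ad.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=5000000,AUDIO="aud",SUBTITLES="subs",CLOSED-CAPTIONS=NONE
video_1080p.m3u8
```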
In DASH, accessibility is typically signaled as accessibility metadata in the manifest file:
- Closed Captions/Subtitles signaled like this:
- Audio Description like this:
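For example, the corresponding AdaptationSet entries could look like the following (Representation elements omitted; exact values depend on the packager and scheme used):

```xml
<!-- Closed captions / subtitles -->
<AdaptationSet contentType="text" lang="en">
  <Accessibility schemeIdUri="urn:mpeg:dash:role:2011" value="caption"/>
  <!-- Representations omitted -->
</AdaptationSet>

<!-- Audio description -->
<AdaptationSet contentType="audio" lang="en">
  <Accessibility schemeIdUri="urn:tva:metadata:cs:AudioPurposeCS:2007" value="1"/>
  <Role schemeIdUri="urn:mpeg:dash:role:2011" value="alternate"/>
  <!-- Representations omitted -->
</AdaptationSet>
```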
Interestingly, the DASH specification defines both Accessibility and Role descriptors. These can be confusing because of the overlap, as they can signal effectively the same information. For example, both can be set to values that indicate enhanced audio intelligibility. In this situation, it is important to understand which of these your client player supports and use the supported option.
Role descriptors tell players which content to select by default and indicate the general bucket the content falls into:
- Main – The primary language of the region or (more rarely) the language of the source content
- Dub – Languages other than the primary, or languages the audio has also been translated to
- Alternate – Visually impaired, hard of hearing, enhanced audio intelligibility
- Commentary – Visually impaired
- Caption – May be used with burned-in captions where the media type is “video”
- Sign – Video representing sign-language interpretation
- Description – Textual or audio media containing a textual description
- Enhanced-audio-intelligibility – Experience containing an element for improved intelligibility of the dialogue; multiple AdaptationSets can be marked similarly but differ by codec or language.
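A Role descriptor is a single element inside the AdaptationSet, for example:

```xml
<Role schemeIdUri="urn:mpeg:dash:role:2011" value="main"/>
```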
If Roles are not supported in the player, the alternative is to use an Accessibility descriptor, which informs the player, depending on the scheme used, of specific details about what type of alternate or commentary data is present.
For example, urn:tva:metadata:cs:AudioPurposeCS:2007:
- @value = “1” for the visually impaired
- @value = “2” for the hard of hearing
- @value = “8” for enhanced audio intelligibility or dialogue enhancement
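For instance, an audio AdaptationSet intended for visually impaired viewers could carry this descriptor:

```xml
<Accessibility schemeIdUri="urn:tva:metadata:cs:AudioPurposeCS:2007" value="1"/>
```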
AWS, as a solution provider for modern OTT platforms, enables accessibility features in purpose-built services for video processing: AWS Elemental MediaLive, AWS Elemental MediaConvert, AWS Elemental MediaPackage, and other AWS Elemental Media Services.
Implementing accessibility features opens your content to a broader viewership while delivering superior user experiences. Beyond meeting regulatory requirements, these services provide a competitive advantage that distinguishes your content in the marketplace.
AWS Elemental MediaLive
AWS Elemental MediaLive (MediaLive) is a live video processing service that lets you encode high-quality live video streams for broadcast television and multiscreen devices.
With MediaLive, you can configure DASH audio description signaling for an audio track in a Microsoft Smooth Streaming (MSS) or Common Media Application Format (CMAF) output group. This signaling can be used by downstream packagers, such as AWS Elemental MediaPackage (MediaPackage), to create the correct accessibility signaling for DASH output.
For captions, MediaLive lets you configure DASH accessibility signaling for an output with captions when an MSS or CMAF output group is used. Alternatively, if you are using an HTTP Live Streaming (HLS) output group, the caption track can declare accessibility features, such as written descriptions of spoken dialog, music, and sounds.
AWS Elemental MediaConvert
AWS Elemental MediaConvert (MediaConvert) is a file-based transcoder with packaging capability. MediaConvert enables captions signaling for both HLS and DASH output types.

Figure 3: Example of the accessibility signaling configuration for a caption output on MediaConvert.
For captions in HLS variant playlists, MediaConvert adds the following accessibility attributes under EXT-X-MEDIA for a subtitles track:
CHARACTERISTICS="public.accessibility.describes-spoken-dialog,public.accessibility.describes-music-and-sound" and AUTOSELECT="YES"
For captions with DASH playlists, MediaConvert adds the following in the adaptation set for the track:
<Accessibility schemeIdUri="urn:mpeg:dash:role:2011" value="caption"/>
For accessible audio, it is possible to use an audio track with pre-mixed audio descriptions, also known as broadcast-mix audio description. To enable this, set the Descriptive Video Service (DVS) flag for Audio Description (AD) in your HLS variant playlist when using a CMAF output group.
When you set the Flag option, MediaConvert includes the parameter CHARACTERISTICS="public.accessibility.describes-video" in the EXT-X-MEDIA entry for this track.
In addition, with MediaConvert you can set the audio descriptor to signal audio description and broadcaster mix to the player device consuming the stream.
Here, under Audio Track Type, select BROADCASTER_MIXED_AD when the input contains pre-mixed main audio as well as a second audio description (AD) track as a stereo pair. The value for Audio Type will be set to signal to downstream systems that this stream contains “broadcaster mixed AD”.
AWS Elemental MediaPackage
AWS Elemental MediaPackage is a just-in-time packager that prepares, protects, and distributes your video content to a broad range of connected devices. MediaPackage can process Video on Demand (VoD) as well as live content.
VoD
For VoD, MediaPackage propagates the accessibility subtitle signaling from the source HLS manifest to both HLS and DASH outputs, with both caption and audio accessibility signaling present. MediaPackage automatically creates the metadata on the output if the metadata is properly present in the input.
For HLS and CMAF endpoints/outputs, it will pass through the #EXT-X-MEDIA CHARACTERISTICS attribute value for all subtitles tracks.
Here is an example of the resulting signaling for captions:
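A passed-through subtitles entry might look like the following (group ID, name, and URI are hypothetical):

```
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",LANGUAGE="en",DEFAULT=YES,AUTOSELECT=YES,CHARACTERISTICS="public.accessibility.describes-spoken-dialog,public.accessibility.describes-music-and-sound",URI="subtitles/index.m3u8"
```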
For audio on HLS and CMAF endpoints and outputs, the source HLS variant playlist will include a CHARACTERISTICS="public.accessibility.describes-video" attribute for audio tracks; MediaPackage will pass through the #EXT-X-MEDIA CHARACTERISTICS attribute value for all audio tracks, as well as the #EXT-X-MEDIA TYPE, AUTOSELECT, and DEFAULT attributes.
Concerning DASH endpoints/outputs, MediaPackage will add two elements at the AdaptationSet level when subtitles or audio tracks in the input present a CHARACTERISTICS="public.accessibility.describes-music-and-sound" attribute value or a public.accessibility.transcribes-spoken-dialog attribute value:
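For a subtitles track, for example, the two elements added to the AdaptationSet could look like this:

```xml
<Accessibility schemeIdUri="urn:mpeg:dash:role:2011" value="caption"/>
<Role schemeIdUri="urn:mpeg:dash:role:2011" value="subtitle"/>
```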
Live
For live streams (MediaPackage Live v2), the expectation is that the CMAF ingest protocol is used. Accessibility attributes corresponding to a given stream are signaled in MP4 KIND boxes carrying DASH accessibility information. Each KIND box consists of a (scheme, value) pair, which will look like this:
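Conceptually (this is a sketch of the pair, not actual box syntax), a KIND box for a captions track would carry:

```
kind box:
  schemeIdUri = "urn:mpeg:dash:role:2011"
  value       = "caption"
```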
As a result, you should expect the following metadata in the resulting DASH Manifest.
For captions:
For audio:
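As an illustration, the resulting descriptors might resemble the following (the exact elements depend on the ingested KIND boxes):

```xml
<!-- For captions -->
<Accessibility schemeIdUri="urn:mpeg:dash:role:2011" value="caption"/>

<!-- For audio (audio description for the visually impaired) -->
<Accessibility schemeIdUri="urn:tva:metadata:cs:AudioPurposeCS:2007" value="1"/>
```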
For the HLS manifest, you should expect accessibility information to be associated with the CHARACTERISTICS attribute of the EXT-X-MEDIA tag. Here is an example of the resulting signaling for captions:
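A captions rendition in the output could look like this (group ID, name, and URI are hypothetical):

```
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="captions",LANGUAGE="en",CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog",URI="cc/index.m3u8"
```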
Correspondingly, for audio renditions, the only valid value is public.accessibility.describes-video.
Conclusion
Our accessibility series concludes with practical guidance for implementing accessibility features with AWS Elemental Media Services. These insights into packaging and player preferences provide you with the essential tools to create inclusive streaming experiences for all viewers.
Implementing accessibility features in AWS Elemental Media Services enables your content to reach viewers who might otherwise be excluded, while improving the overall viewing experience for everyone. By prioritizing accessibility, you’re contributing to a more inclusive digital media environment while ensuring your content serves the needs of all viewers.
Contact an AWS Representative to learn how we can help accelerate your business.
Further reading
- AWS Skill Builder Lesson: Media Services Learning Plan
- AWS Media Services Resources
- Content Localization and captions on AWS