In this series of blog posts I will focus on dash.js. I will explain how certain features are implemented and how they can be used within applications. This time we are looking at a new media file format called Common Media Application Format (CMAF), the mechanism it provides to do low latency live streaming and how this works in combination with dash.js.
CMAF – Common Media Application Format
So first of all, what is CMAF and why is it a huge step in the right direction? Basically, CMAF is just another media container based on ISOBMFF and fragmented MP4. The great thing about CMAF is that it can be referenced within both HLS and DASH manifest files. So in theory we only need to encode and package our content once (within the CMAF container) and are able to target all major platforms. We are no longer forced to create and store separate media files with a TS container for HLS and an ISOBMFF container for DASH. Hence, CMAF immediately cuts our storage and packaging costs in half and doubles the CDN efficiency. CMAF even becomes mandatory when delivering 4K or HDR content with HLS because Apple does not support HEVC with TS containers.
However, some problems still remain. Legacy devices will most certainly not receive an update adding CMAF support. Moreover, encrypting the content in a uniform way still remains an issue. Although CMAF supports Common Encryption (CENC), Apple devices using FairPlay DRM require a different encryption scheme than devices using PlayReady or Widevine DRM (cbcs vs. cenc encryption).
How CMAF enables low latency streaming
In addition to the aforementioned benefits, CMAF provides us with the necessary tools for low latency live streaming. There is a great talk on that topic by Will Law from this year's Demuxed. I highly recommend watching it to get detailed insights into the related technologies. For now, I will just pick up the main facts in order to explain how low latency streaming with CMAF works.
CMAF introduces the concept of “chunks”. Think of a chunk as a small segment inside the classic 2-6 second segments. On a high level, a classic segment consists of one “moof” box and one “mdat” box. With CMAF chunks, the segment now contains multiple such “moof”/“mdat” pairs, allowing the client to access the media data before the segment is completely finished. The benefits of the chunked mode become more obvious when we look at a concrete example:
So let’s assume we have 8 second segments and we are currently 3 seconds into segment number four. For classic media segments this leaves us with two options:
- Option 1: Since segment four is not completed we start with segment three. That way we end up 11 seconds behind the live edge, 8 seconds coming from segment three and 3 seconds coming from segment four.
- Option 2: We wait for segment four to be finished and immediately start downloading and playing it. We end up with 8 seconds latency and a waiting time of 5 seconds.
Now with CMAF chunks, on the other hand, we are able to play segment four before it is completely available. In the example above we have CMAF chunks with a duration of 1 second, leading to eight chunks per segment. Let’s assume only the first chunk contains an IDR frame and therefore we always need to start the playback from the beginning of a segment. Being three seconds into segment four leaves us with 3 seconds latency. That’s much better than what we achieved with classic segments. We could also fast decode the first chunks and play even closer to the live edge. However, it is important to keep in mind that low latency streaming has a negative influence on our media buffer. Getting closer to the live edge will result in smaller media buffers and less robust playback with regard to network fluctuations.
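The arithmetic behind the three start-up strategies can be written down in a few lines. This is just a sketch of the example above; the function name startupLatency and its return shape are my own and not part of dash.js, and the chunked case assumes playback always starts at the beginning of the current segment.

```javascript
// Latency and waiting time (both in seconds) for the three start-up strategies.
// segmentDuration: length of one segment
// elapsed: how far the live edge has progressed into the current segment
function startupLatency(segmentDuration, elapsed) {
  return {
    // Option 1: play the previous, already complete segment from its start.
    previousSegment: { latency: segmentDuration + elapsed, wait: 0 },
    // Option 2: wait until the current segment is complete, then play it from its start.
    waitForCurrent: { latency: segmentDuration, wait: segmentDuration - elapsed },
    // Chunked: start playing the current segment right away from its first chunk.
    chunked: { latency: elapsed, wait: 0 }
  };
}

const result = startupLatency(8, 3);
console.log(result.previousSegment.latency, result.waitForCurrent.latency, result.chunked.latency);
```

For 8 second segments and 3 seconds elapsed, this reproduces the 11, 8 and 3 seconds of latency discussed above.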
CMAF low latency with dash.js
Let’s take a closer look at how CMAF low latency works within dash.js. Since version 2.6.8, dash.js has had a low latency mode. Moreover, the project offers two sample streams with low latency support generated by the DASH-IF live simulator.
Signal low latency in the manifest
First of all we need a way to signal the client that our segments are chunked and available prior to being complete. In the sample manifest files we can identify two new attributes:
- @availabilityTimeComplete: Specifies whether all Segments of all associated Representations are complete at the adjusted availability start time. If the value is set to false, the client may infer that a segment is available at its announced location prior to being complete.
- @availabilityTimeOffset (ATO): Specifies how much earlier segments are available compared to their computed availability start time (AST).
By setting @availabilityTimeComplete to “false” we tell the client that the segments are available prior to being complete. Using the @availabilityTimeOffset (ATO) we can specify how much earlier they are available. In our example the segments have a duration of 8 seconds and the ATO is set to 7 seconds. This means that we have a chunk duration of 1 second and each segment becomes available 7 seconds before its usual completion time, i.e. as soon as its first chunk has been produced.
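To make the effect of the ATO a bit more tangible, here is a minimal sketch that computes the wall-clock time at which a segment may first be requested. It assumes fixed-duration segments addressed by a 0-based index and ignores details such as presentation time offsets. The function and the property names of the object are hypothetical, chosen for illustration only.

```javascript
// Wall-clock time (ms since epoch) at which segment `index` (0-based) may first be requested.
// availabilityStartTime is given in ms since epoch, all other values in seconds.
function segmentAvailabilityTime(
  { availabilityStartTime, periodStart, segmentDuration, availabilityTimeOffset = 0 },
  index
) {
  // Without an offset, a segment becomes available once it is complete.
  const segmentEnd = periodStart + (index + 1) * segmentDuration;
  // The offset moves that point in time earlier.
  return availabilityStartTime + (segmentEnd - availabilityTimeOffset) * 1000;
}

const stream = {
  availabilityStartTime: Date.UTC(2019, 0, 1),
  periodStart: 0,
  segmentDuration: 8,
  availabilityTimeOffset: 7
};
// Seconds after the availability start time at which segment four (index 3) can be requested.
console.log((segmentAvailabilityTime(stream, 3) - stream.availabilityStartTime) / 1000);
```

With the values from our example, segment four can be requested 25 seconds after the availability start time instead of 32 seconds, which is exactly one chunk duration after the segment starts.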
Calculating and updating the segment availability
In dash.js a lot of different processes are running in parallel. One of these processes is responsible for constantly updating the availability window of the media segments. The player uses this information to determine which segment to load at which time. The basic flow of the update process is depicted below:
The PlaybackController triggers an event every 50ms telling the RepresentationController to update its internal segment availability range. The RepresentationController uses the TimelineConverter to get the current segment range values and updates its internal range attribute:
voRepresentation.segmentAvailabilityRange = range;
The TimelineConverter is also the class which uses the new @availabilityTimeOffset from the MPD. Without going into too much detail, the offset that is subtracted from the current wall-clock time shrinks as @availabilityTimeOffset grows, which moves the end of the availability range closer to the live edge:
const endOffset = segmentDuration - availabilityTimeOffset;
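The following sketch shows how such an end offset could feed into an availability range. It is a simplified model and not the actual TimelineConverter code: the function name, the parameter names and the fallback for a missing or too large offset are my assumptions. All values are in seconds, with `now` measured relative to the start of the period.

```javascript
// Simplified availability range, in seconds relative to the period start.
function calcAvailabilityRange({ now, segmentDuration, availabilityTimeOffset, timeShiftBufferDepth }) {
  // Treat a missing offset as 0 (assumption of this sketch).
  const ato = Number.isFinite(availabilityTimeOffset) ? availabilityTimeOffset : 0;
  // Only an offset smaller than the segment duration shortens the end offset.
  const endOffset = ato < segmentDuration ? segmentDuration - ato : segmentDuration;
  return {
    // The oldest position that is still inside the time shift buffer.
    start: Math.max(now - timeShiftBufferDepth, 0),
    // The newest segment start time that may already be requested.
    end: now - endOffset
  };
}

// 27 seconds into the stream, i.e. 3 seconds into segment four.
console.log(calcAvailabilityRange({ now: 27, segmentDuration: 8, availabilityTimeOffset: 7, timeShiftBufferDepth: 30 }));
console.log(calcAvailabilityRange({ now: 27, segmentDuration: 8, timeShiftBufferDepth: 30 }));
```

With an ATO of 7 seconds the range ends at 26 seconds and therefore already covers the start of segment four (24 seconds). Without the offset it ends at 19 seconds, so segment three is the newest segment that can be requested.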
Requesting the right segments
Now that we know how the player updates the internal segment range state, we can take a closer look at how the segment requests are generated:
For the sake of simplicity, the flow illustrated above only shows the communication up to the point where we actually use the segment availability range described in the previous section. Every 100ms the ScheduleController checks for new fragments to download. It issues a request to the NextFragmentRequestRule, which calls the DashAdapter for a fragment request. The DashAdapter contacts the DashHandler, which calls an internal update function leading to a getSegment request on the TemplateSegmentsGetter. Depending on the type of the manifest, the getSegment request can also be issued on a SegmentTimeline-specific getter class. Finally, the TemplateSegmentsGetter uses the segmentAvailabilityRange we calculated previously:
const availabilityWindow = representation.segmentAvailabilityRange;
At this point the program flow leads us back to the ScheduleController which executes the fragment request using the FragmentModel.
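How an availability window translates into concrete segment numbers can be sketched as follows. This is my own simplification for fixed-duration segments and not the TemplateSegmentsGetter implementation. The function name is hypothetical, and I assume that a segment may be requested once its start time lies inside the window.

```javascript
// Indices (0-based) of the segments whose start time lies inside the availability window.
// Returns null if nothing can be requested yet.
function requestableIndexRange(availabilityWindow, segmentDuration) {
  if (availabilityWindow.end < 0) {
    return null;
  }
  return {
    // First segment that starts at or after the beginning of the window.
    first: Math.ceil(availabilityWindow.start / segmentDuration),
    // Last segment that starts at or before the end of the window.
    last: Math.floor(availabilityWindow.end / segmentDuration)
  };
}

// Window with an ATO of 7 seconds, 3 seconds into segment four.
console.log(requestableIndexRange({ start: 0, end: 26 }, 8));
// The same point in time without the offset.
console.log(requestableIndexRange({ start: 0, end: 19 }, 8));
```

In the low latency case the newest index is 3 (segment four), whereas without the offset the player would be limited to index 2 (segment three).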
This pretty much concludes our dive into CMAF, low latency streaming and the corresponding implementation in dash.js. Some final remarks: In order to use the low latency feature, the browser running dash.js needs to support the Fetch API, and the segments need to be delivered using HTTP/1.1 chunked transfer encoding. The combination of both allows us to access the media data prior to the media segment being completely available. Moreover, dash.js requires us to manually enable the low latency feature for a stream. This can be done directly in the options tab of the reference player or by calling the setLowLatencyEnabled function on the mediaPlayer object.
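To illustrate why the Fetch API matters here, the sketch below reads a response body piece by piece and reports how many complete “mdat” boxes (i.e. CMAF chunks) have arrived so far. It is not how dash.js is implemented internally. The function names are hypothetical, the URLs in the comments are placeholders, and the box parser ignores the special ISOBMFF box sizes 0 and 1 for brevity.

```javascript
// Return the complete top-level ISOBMFF boxes found at the start of `bytes`.
function completeBoxes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    // Each box starts with a 32-bit big-endian size followed by a 4-character type.
    const size = view.getUint32(offset);
    // Stop at malformed sizes and at boxes that have not fully arrived yet.
    if (size < 8 || offset + size > bytes.byteLength) break;
    const type = String.fromCharCode(...bytes.subarray(offset + 4, offset + 8));
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}

// Read a segment incrementally. `reader` is anything that behaves like the
// reader returned by response.body.getReader().
async function consumeChunkedSegment(reader, onProgress) {
  let buffered = new Uint8Array(0);
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Append the newly received bytes to what we already have.
    const merged = new Uint8Array(buffered.length + value.length);
    merged.set(buffered);
    merged.set(value, buffered.length);
    buffered = merged;
    // Every complete mdat box marks media data that could already be played.
    const chunkCount = completeBoxes(buffered).filter((b) => b.type === 'mdat').length;
    onProgress(chunkCount, buffered.length);
  }
  return buffered;
}

// Usage in a browser (placeholder URL):
// const response = await fetch('https://example.com/live/segment.m4s');
// await consumeChunkedSegment(response.body.getReader(), (chunks) => console.log(chunks));

// Enabling the low latency mode as described above (placeholder element and URL):
// const player = dashjs.MediaPlayer().create();
// player.setLowLatencyEnabled(true);
// player.initialize(document.querySelector('video'), 'https://example.com/live.mpd', true);
```

The important point is that the callback fires while the download is still in progress, so media data can be handed to the decoder long before the last byte of the segment has arrived.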