What happens when a video call works perfectly in one environment, but degrades rapidly in another? What about an audio call that sounds good most of the time, but occasionally cuts out?
Sometimes these issues are the result of a bug. Sometimes, however, they are the result of “bad” network conditions, which is to say that one or more measurable characteristics fall outside what we would call the “good” range. Common examples of network issues include video lag and freezing, audio latency, buffering, and delays.
These issues have implications for developers who are building applications where the success of the platform depends on user experience. In short, most applications. For developers scaling their platforms to reach thousands of concurrent users, even the slightest network problem can grow exponentially if the video stack and use case are misaligned.
If you are experiencing issues where things work well on one network but not another, then a few of the key metrics to look at are:
- Bandwidth
- Latency (Lag)
- Packet Loss
- Jitter
Bandwidth
Bandwidth refers to the rate at which data can be transferred between two endpoints.
You can think of it as the minimum size of the Internet pipeline between you and the remote party. In other words, it is the size of the most constrained network leg between the two. You could have an extremely fast connection, but that won’t matter much if the other side is connected to a WiFi hotspot or corporate VPN with only a few KB/s available.
An out-of-control sender that exceeds the available bandwidth of a receiver will quickly overwhelm the network and cause severe degradation in the stream quality.
Symptoms of Overloaded Connections
- Permanent video freezing
- Permanent video frame-rate drops
- Choppy audio
- Dropped connections
How To Test WebRTC Bandwidth
If you experience symptoms like these, it’s easy to test bandwidth using websites like fast.com or speedtest.net. Other network troubleshooting can be accomplished with scalable testing suites such as testRTC.
In your application, you can run quick bandwidth tests by simply hosting a large file on a server and seeing how much of that file can be downloaded in a second. It’s a simple approach, but highly effective in troubleshooting a problematic network.
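As a rough sketch of that approach, the snippet below times the download of a known file and converts the result into kilobits per second. The URL is a placeholder; point it at a large, uncached file hosted on your own server.

```typescript
// Rough downstream bandwidth probe: time how long a known file takes to download.
// The URL passed in is a placeholder; host a large, uncached test file yourself.
async function estimateDownloadKbps(url: string): Promise<number> {
  const start = performance.now();
  const response = await fetch(url, { cache: "no-store" }); // bypass the HTTP cache
  const data = await response.arrayBuffer();                // force the full download
  const seconds = (performance.now() - start) / 1000;
  return (data.byteLength * 8) / 1000 / seconds;            // kilobits per second
}

// Usage (hypothetical URL):
// estimateDownloadKbps("https://example.com/bandwidth-test.bin").then(kbps => console.log(kbps));
```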
While the connection is active, the best solution is for the sender to estimate, as closely as possible, the bandwidth available to the receiver using RTCP feedback and heuristics. LiveSwitch exposes the RTCP traffic as part of its API, allowing applications to fine-tune their media streams in real time to adapt to their needs (e.g. prioritizing audio over video, or preferring to scale image size rather than encoder quality). Both LiveSwitch Server and Cloud clients benefit from automatic bandwidth adaptation, ensuring streams are optimized for every connection - with confidence.
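If you are working directly with the browser’s WebRTC stack rather than the LiveSwitch SDK, a similar in-call estimate can be read from the standard getStats() API. The sketch below pulls the send-side bandwidth estimate from the active candidate pair; field availability varies by browser, so treat a missing value as “unknown”.

```typescript
// Read the browser's send-side bandwidth estimate (bits per second) from WebRTC stats.
// Uses the standard RTCPeerConnection.getStats() API; availableOutgoingBitrate is not
// reported by every browser.
async function getAvailableOutgoingBitrate(pc: RTCPeerConnection): Promise<number | undefined> {
  const report = await pc.getStats();
  let estimate: number | undefined;
  report.forEach((stat) => {
    if (stat.type === "candidate-pair" && stat.nominated && stat.availableOutgoingBitrate) {
      estimate = stat.availableOutgoingBitrate;
    }
  });
  return estimate;
}
```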
Latency
Latency is the amount of time it takes (usually measured in milliseconds) for a packet to travel from one network interface to another.
Round-trip time (RTT) is closely related, as it is the time it takes to travel from one network interface to another and then back again. If x is the latency from A to B and y is the latency from B to A, then the RTT is simply x + y.
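If you want to observe RTT on a live WebRTC connection, the standard statistics API exposes it on the active ICE candidate pair. A minimal sketch, assuming a browser RTCPeerConnection, follows.

```typescript
// Read the most recently measured round-trip time from the nominated ICE candidate pair.
// currentRoundTripTime is reported in seconds by the WebRTC stats spec; convert to ms.
async function getRoundTripTimeMs(pc: RTCPeerConnection): Promise<number | undefined> {
  const report = await pc.getStats();
  let rttMs: number | undefined;
  report.forEach((stat) => {
    if (stat.type === "candidate-pair" && stat.nominated && stat.currentRoundTripTime !== undefined) {
      rttMs = stat.currentRoundTripTime * 1000;
    }
  });
  return rttMs;
}
```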
By itself, the primary effect latency has on a call is simply a delay in the time it takes to hear and see the sender’s audio and video. A small latency (less than 100ms) may not even be noticeable in a two-way call. If the stream is a one-way audio broadcast, even a significant latency may not be a problem, depending on the use case.
Video, however, is an entirely different story.
What is an efficient video stream?
An efficient video stream relies heavily on the use of negative acknowledgements (NACKs) sent by the receiver to the sender whenever a packet is lost or dropped. The sender has the opportunity to resend the missing packet, avoiding the need for a full frame refresh (keyframe) to be sent.
What if you have a high latency network?
A high latency network drastically reduces the effectiveness of this approach, since the round-trip time may very well exceed the length of time the receiver can wait. In order for the video to stay synchronized with audio, the receiver can only wait up to the length of time the audio is buffered for playback, at which point it has to give up and move on. Audio is not typically affected by this, since it does not rely on keyframes.
The symptoms of a high-latency connection are:
- Lag in audio/video playback
- Occasional video freezing
- Occasional video frame-rate drops
- Smooth audio
If you are looking at your LiveSwitch logs, you will also typically see a high number of picture loss indications (PLIs), which are used by the receiver to request a full frame refresh.
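Outside of the LiveSwitch logs, the same signals are visible in the standard WebRTC statistics. The sketch below counts the NACKs and PLIs a receiver has sent for incoming video, again assuming a browser RTCPeerConnection.

```typescript
// Count retransmission requests (NACKs) and keyframe requests (PLIs) sent by this
// receiver for incoming video streams, via the standard getStats() API.
async function getVideoRecoveryCounters(pc: RTCPeerConnection): Promise<{ nacks: number; plis: number }> {
  const report = await pc.getStats();
  let nacks = 0;
  let plis = 0;
  report.forEach((stat) => {
    if (stat.type === "inbound-rtp" && stat.kind === "video") {
      nacks += stat.nackCount ?? 0; // lost packets we asked to have resent
      plis += stat.pliCount ?? 0;   // full frame refreshes we had to request
    }
  });
  return { nacks, plis };
}
```

A steadily climbing PLI count alongside a high round-trip time is a good hint that retransmissions are arriving too late to help.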
WebRTC Packet Loss
Packet loss is the number, or percentage, of packets that are dropped or lost in a stream over a given period of time.
Most media streams will drop a few packets here and there, especially on WiFi networks. In a real-time media stream, packet loss is generally preferable to guaranteed delivery, as it allows the connected devices to drop data or frames rather than introduce lag into the playback.
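To put a number on it for a live connection, a loss percentage can be derived from the packetsReceived and packetsLost counters in the standard WebRTC statistics. A minimal sketch follows; since these counters are cumulative, sampling periodically and comparing deltas gives a clearer picture than the running totals.

```typescript
// Approximate the inbound packet loss percentage for a given media kind
// ("audio" or "video") using the standard getStats() counters.
async function getInboundLossPercent(pc: RTCPeerConnection, kind: "audio" | "video"): Promise<number> {
  const report = await pc.getStats();
  let received = 0;
  let lost = 0;
  report.forEach((stat) => {
    if (stat.type === "inbound-rtp" && stat.kind === kind) {
      received += stat.packetsReceived ?? 0;
      lost += stat.packetsLost ?? 0;
    }
  });
  const total = received + lost;
  return total > 0 ? (lost / total) * 100 : 0;
}
```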
How is Packet Loss Handled?
How loss is handled depends on the nature of the data being sent. If forward error correction (FEC) is enabled for a media stream, a lost packet can sometimes be recovered automatically based on the existing data already received.
What if FEC is not available?
If FEC is not available or fails, an intelligent audio decoder (like Opus) can look at the playout waveform thus far and generate data that closely approximates what the missing audio packet may have contained - a technique known as packet loss concealment (PLC).
In cases where the audio decoder doesn’t support this or the packet loss is too great, zero-byte packets can be generated to fill the gap. This causes the audio to cut out, but keeps the playback buffer ready to handle whatever comes next.
While audio packets can also be recovered through the use of NACKs and retransmission, this is not generally recommended for real-time media for a few reasons:
- The recovered packets often arrive too late to be useful (audio waits for no one).
- Since audio is already low bitrate, the NACK requests can increase the audio bandwidth requirements by a significant percent.
- Unless the packet loss is extreme (in which case NACKs probably wouldn’t help anyway), the cutouts are often not significant enough to impact the conversation.
- Unlike video, audio recovery from packet loss is immediate, with no need for a keyframe.
Video loss, on the other hand, is typically handled through NACKs as described in the previous section, since the network cost of a full frame refresh is so high. If the NACKs fail, however, we typically have no choice and must send a PLI to request one.
Modest Packet Loss
A modest amount of packet loss is often unnoticeable on an otherwise “good” network. A high amount of packet loss will result in the following symptoms:
- Frequent video freezing
- Frequent video frame-rate drops
- Choppy audio
If you are looking at your LiveSwitch logs, you will typically see a lot of NACKs being sent, probably a few picture loss indications (PLIs), and a large amount of PLC being generated.
Excessive Packet Loss
If the loss is excessive and recovery attempts are unable to compensate, the video will start to freeze, the frame rate will drop, and the audio will become difficult to understand with frequent cuts in and out. Packet loss has the most severe impact when combined with a high latency network, which cripples the effectiveness of NACK-based retransmissions.
If the root problem is an overloaded network device, then the solution is to lower the bitrate, which will, in turn, reduce network demands. This is a frequent problem with WiFi networks that are overloaded or suffering interference from neighbouring networks.
Jitter
Jitter is a measure of the consistency of timing within a network stream. In other words, how much packet delivery deviates from the expected arrival time.
Since UDP (User Datagram Protocol) does not guarantee delivery order, jitter occurs on every connection. Each hop on the network path is an opportunity for jitter to occur, so higher latency networks or networks with higher hop counts are more likely to experience high levels of jitter.
Since a media decoder/playback component must process packets in order, problems arise if each packet received is processed immediately.
Once a packet has been processed by the decoder, any “older” packets that arrive afterwards must be discarded. This could very easily result in throwing out packets that could have improved audio quality or eliminated the need to retransmit a video packet.
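If you want to quantify jitter on a live connection, the receiver’s running estimate is exposed through the standard WebRTC statistics. The sketch below reads it for incoming audio; the spec reports the value in seconds.

```typescript
// Read the receiver's RTP jitter estimate for incoming audio, converted to milliseconds.
async function getInboundAudioJitterMs(pc: RTCPeerConnection): Promise<number | undefined> {
  const report = await pc.getStats();
  let jitterMs: number | undefined;
  report.forEach((stat) => {
    if (stat.type === "inbound-rtp" && stat.kind === "audio" && stat.jitter !== undefined) {
      jitterMs = stat.jitter * 1000; // spec value is in seconds
    }
  });
  return jitterMs;
}
```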
Eliminate Jitter with a Jitter Buffer
Eliminating the effects of jitter requires the media receiver to run received packets through a “jitter buffer”.
A jitter buffer is responsible for delaying the processing of media packets just enough to smooth out delivery times and ensure the correct packet order for the next stage in the processing pipeline. A greater delay will do a better job at eliminating the effects of network jitter but at the cost of introducing additional latency to the pipeline. Since network conditions can vary widely over the course of a call, the best jitter buffers are variable: increasing or decreasing their internal delay as needed.
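To make the idea concrete, here is a deliberately simplified, fixed-delay sketch of the core behaviour: hold packets briefly, then release them to the decoder in sequence order. It is an illustration only, with hypothetical types; a production buffer (including LiveSwitch’s) adapts its delay, handles sequence-number wrap-around, and drops packets that arrive too late.

```typescript
// A toy fixed-delay jitter buffer: packets are held for delayMs, then handed to the
// next pipeline stage strictly in sequence order. Not a production implementation.
interface RtpPacket {
  sequenceNumber: number;
  payload: Uint8Array;
}

class SimpleJitterBuffer {
  private pending = new Map<number, RtpPacket>();
  private nextSequence: number | undefined;

  constructor(
    private readonly delayMs: number,
    private readonly onPacket: (packet: RtpPacket) => void
  ) {}

  push(packet: RtpPacket): void {
    this.pending.set(packet.sequenceNumber, packet);
    if (this.nextSequence === undefined) {
      this.nextSequence = packet.sequenceNumber; // the first packet anchors the sequence
    }
    // Delay the drain so slightly late or out-of-order packets can still slot in.
    setTimeout(() => this.drain(), this.delayMs);
  }

  private drain(): void {
    while (this.nextSequence !== undefined) {
      const packet = this.pending.get(this.nextSequence);
      if (packet === undefined) {
        break; // the next in-order packet has not arrived yet
      }
      this.pending.delete(this.nextSequence);
      this.onPacket(packet); // deliver to the decoder/playback stage in order
      this.nextSequence += 1;
    }
  }
}
```

The trade-off described above is visible in the delayMs parameter: a larger value smooths out more jitter but adds that much latency to every packet.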
What Does High Jitter Look Like?
Assuming the jitter buffer can adapt quickly to the changing network conditions, the symptoms of high jitter will be:
- Bursts of video freezing
- Bursts of video frame-rate drops
- Bursts of choppy audio
The symptoms are essentially the same as those for packet loss but go away as the jitter buffer adapts its internal delay. Looking at your LiveSwitch logs, you should see the size of the jitter buffer increasing/decreasing to meet changing demands.
How To Solve Other Network Issues
See symptoms that don’t match up with anything described so far?
If a network measures poorly in more than one area, such as packet loss combined with high latency, the symptoms can be a bit more complicated. In cases like these, looking at how the symptoms change over time is often the key to a better understanding. Better yet, select a LiveSwitch product - Server or Cloud - an API and platform architected for robust, scalable growth.