Loudness: The Necessity for Monitoring and Compliance

Written by Erik Otto, CEO Mediaproxy


Television is a visual medium but without sound it makes little or no sense. Surveys have shown that viewers are more forgiving of lower-quality pictures, something increasingly common in the YouTube era, but will not put up with poor, loud or distorted audio. Because of this, comprehensive monitoring and analysis of broadcast sound, to confirm that it complies with technical and regulatory standards, is essential.


Despite this imperative, audio is often treated as a secondary concern when considering compliance and logging technology. Today's monitoring systems, including Mediaproxy's LogServer, are generally promoted on the strength of being able to cover multiple video platforms and formats, including DVB, ATSC, 4K, HD, SD, SDI, TSoIP, OTT, MPEG2, MPEG4, HEVC and, most recently, SMPTE ST 2022-6. This is understandable in some respects because modern television broadcasting and streaming involve an ever-increasing number of channels and streams for both digital terrestrial broadcasting and video-on-demand.


These involve the visuals plus metadata and supporting services such as closed captions. While this is already a lot to monitor and analyze, the number of audio sources that also require logging and checking is greater still and could grow further in the future. Audio also has to comply with loudness regulations, which have been implemented over the last 14 years to deal with a problem that was, at one time, among the most complained-about aspects of TV viewing.


The loudness problem is the discrepancy in perceived sound level between one audio signal and another. The most recognizable instance is the transition from a drama program to a commercial or interstitial. Because of 'perceived volume', we hear sounds at different levels despite what a peak audio meter tells us. This is due to the difference in dynamic range between, for example, a drama with very loud and very quiet sequences, and a heavily compressed commercial or promo, which will sound much louder and denser by comparison even when both peak at the same level. Viewers were forced to keep adjusting the volume of their TV sets, which was annoying and inconvenient.
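The gap between peak level and perceived level can be illustrated with a minimal Python sketch. It is not a loudness meter: it uses a plain RMS level as a crude stand-in for perceived loudness, and the two test signals, the sample rate and the function names are all assumptions made for the illustration.

```python
import math

RATE = 48000  # samples per second (assumed for this illustration)


def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale."""
    return 20 * math.log10(max(abs(s) for s in samples))


def rms_dbfs(samples):
    """Root-mean-square level in dB; a crude stand-in for perceived loudness."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


# One second of a 1 kHz tone at 0.9 of full scale.
tone = [0.9 * math.sin(2 * math.pi * 1000 * n / RATE) for n in range(RATE)]

# "Drama": loud for the first tenth of a second, then very quiet.
drama = [s if n < RATE // 10 else 0.05 * s for n, s in enumerate(tone)]

# "Commercial": heavily compressed, so it stays at full level throughout.
commercial = tone

for name, signal in (("drama", drama), ("commercial", commercial)):
    print(f"{name}: peak {peak_dbfs(signal):.2f} dBFS, RMS {rms_dbfs(signal):.2f} dBFS")
```

Both signals reach the same peak, so a peak meter treats them as equal, yet the averaged level of the "commercial" is substantially higher than that of the "drama". That difference is what the viewer hears at the program junction.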


The problem has existed almost as long as TV itself. Over the years broadcast R&D engineers and product manufacturers worked on various possible solutions, producing algorithms and special audio meters that measure and display more than just peak levels. But it was not until 2006 that an international standard appeared, defining how program loudness should be measured so that programs and interstitial material could be mixed to a consistent audio level. This was ITU-R BS.1770 (Algorithms to measure audio program loudness and true-peak audio level), which was accompanied the same year by ITU-R BS.1771 (Requirements for loudness and true-peak indicating meters).


A new range of hardware meters based on BS.1770/1771 came on to the market, with the result that complaints about loudness dropped dramatically. Work to improve the situation further continued, with regional and national bodies such as the EBU and ATSC developing their own standards that expanded on the ITU recommendations. ATSC A/85 was first published in 2009 and was enshrined in US law by the CALM (Commercial Advertisement Loudness Mitigation) Act in 2010, the same year that the EBU's PLOUD working group published R 128. Other national loudness standards include ARIB TR-B32 in Japan and Australia's OP-59.


Since then software-based loudness systems have emerged so that checks can be made at any point in the distribution chain, rather than just at the mixing stage. Standards have also been updated on a regular basis to accommodate changes and developments in broadcasting. For example, a fourth version of R 128 was published in 2020 and included a new supplement (R 128 s2) for loudness in streaming. LogServer is compliant with R 128, as well as ATSC A/85 and OP-59. It is able to instantly carry out spot-checks on loudness complaints and then generate a full report.
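As an illustration of what a spot-check against these documents involves, the following Python sketch compares a measured integrated loudness and true-peak reading with a target and tolerance. It is not LogServer's implementation or API. The class and function names are hypothetical, and the numeric limits are the commonly cited figures for R 128 and A/85, which should be verified against the current edition of each document before any real use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoudnessSpec:
    name: str
    target: float         # target program loudness, in LUFS/LKFS
    tolerance: float      # permitted deviation from target, in LU
    max_true_peak: float  # maximum permitted true peak, in dBTP


# Commonly cited limits (assumptions for this sketch; check the standards).
SPECS = {
    "R128": LoudnessSpec("EBU R 128", -23.0, 0.5, -1.0),
    "A85": LoudnessSpec("ATSC A/85", -24.0, 2.0, -2.0),
}


def check(spec, integrated_lufs, true_peak_dbtp):
    """Return a list of human-readable problems; an empty list means a pass."""
    problems = []
    deviation = integrated_lufs - spec.target
    if abs(deviation) > spec.tolerance:
        problems.append(
            f"{spec.name}: loudness {integrated_lufs:.1f} is "
            f"{deviation:+.1f} LU from target {spec.target:.1f}"
        )
    if true_peak_dbtp > spec.max_true_peak:
        problems.append(
            f"{spec.name}: true peak {true_peak_dbtp:.1f} dBTP exceeds "
            f"{spec.max_true_peak:.1f} dBTP"
        )
    return problems


# A hypothetical complaint: a promo measured at -18.0 LUFS, peaking at -0.2 dBTP.
for line in check(SPECS["R128"], -18.0, -0.2):
    print(line)
```

Keeping the limits in a table separate from the checking logic means the same measurement can be tested against whichever regional standard applies to a given channel.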


Mediaproxy’s software also complies with regulations for secondary audio programs, multiple languages and descriptive video. This capability in LogServer makes it able to deal with the multiple channels and streams of audio that are now being transmitted by broadcasters. Broadcast sound has developed from mono to stereo and is now moving into the realms of immersive systems, particularly on streaming services such as Netflix and Amazon, which see them as complementing 4K video transmissions.


Dolby Atmos is the best known immersive audio format. It was originally developed for cinema but now also appears on Blu-ray Disc releases and selected films for streaming. Atmos is an object-based audio (OBA) system, as opposed to Dolby Digital, which has a traditional channel configuration (5.1 or 7.1). OBA has a foundation of channels but adds up to 128 'objects' that are linked to metadata, which encodes where a sound appears and what it does. This allows the creation of a fuller 3D audio 'picture', with height as well as width and depth.
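The idea of an object carrying positional metadata, as opposed to a sound being baked into a fixed channel, can be sketched in a few lines of Python. This is a conceptual illustration only and does not reflect Dolby's actual metadata format or renderer: the `AudioObject` fields, the coordinate ranges and the `stereo_gains` helper are hypothetical, and the renderer shown is a simple constant-power pan onto two speakers.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """A sound plus the metadata saying where it should appear (hypothetical)."""
    name: str
    x: float  # left (-1.0) to right (+1.0)
    y: float  # back (-1.0) to front (+1.0)
    z: float  # floor (0.0) to ceiling (1.0): the height dimension
    gain_db: float = 0.0


def stereo_gains(obj):
    """Render one object to a two-speaker layout with a constant-power pan.

    Only x is used here; a renderer for a larger speaker layout would also
    use y and z from the same metadata.
    """
    angle = (obj.x + 1.0) * math.pi / 4  # maps x in [-1, 1] to [0, pi/2]
    level = 10 ** (obj.gain_db / 20)     # convert dB gain to a linear factor
    return level * math.cos(angle), level * math.sin(angle)


helicopter = AudioObject("helicopter", x=0.5, y=0.2, z=0.9)
left, right = stereo_gains(helicopter)
print(f"{helicopter.name}: left gain {left:.3f}, right gain {right:.3f}")
```

The point of the design is that the position travels with the sound, so the same object can be rendered to stereo, 5.1 or a layout with height speakers at playback time rather than being fixed at the mix.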


Atmos is a standalone system but also forms part of Dolby AC-4, which is able to offer alternative languages, different commentaries for sporting events and audio description. AC-4 is classified as a Next Generation Audio (NGA) format, as is MPEG-H 3D Audio, which offers 3D immersive audio and has similar OBA capabilities for languages and personalization. Both AC-4 and MPEG-H 3D Audio are included in the DVB specification for Ultra HD TV. LogServer is able to monitor immersive content and what languages or commentaries are available, as well as checking for any problems in the audio streams.


Another important audio feature is network connectivity. Audio over IP (AoIP) is becoming the primary platform for this and many manufacturers are now producing equipment that incorporates it. There are several AoIP formats available, including Dante and RAVENNA. These are incompatible with each other but interoperability is provided through the AES67 standard, which allows devices using different AoIP protocols to work together. Mediaproxy supports AES67 and is able to monitor audio streams encoded in it.


New video formats, higher resolutions and multiple streaming services may have been the main focus for discussion and development in recent years, but audio is crucial to the modern broadcast experience. It also has to be monitored and logged to the same high standards as visuals to ensure the best possible viewing experience.


This article first appeared online with Fast&Wide - November 2020