The talents, MPEG and the master

Introduction

In the parable of the Talents, the Gospel tells the story of a master who entrusts 5 talents (a large amount of money at that time) to one servant and 2 talents to another before leaving on a long journey. The first servant works hard and doubles his talents, while the second plays safe and buries his. When the master returns, he rewards the first servant and punishes the second.

Thirty-one years ago, MPEG was given the field of standards for coding of moving pictures and audio to exploit. Now the master comes. To help him make the right judgement about the use of the talents that he gave, I will briefly review the milestones reached in these years. Of course, I am not going to revisit all the MPEG standardisation areas developed in the last 31 years: those are covered by several posts in this blog (see the list at the bottom of the page) and by the book A vision made real – Past, present and future of MPEG. I will just take some snapshots of the major achievements.

Making some media digital

Before MPEG-1 there had been attempts at making media digital, but MPEG-1 was the first standard that made media really digital in consumer products: Video CD brought movies to CD, Digital Audio Broadcasting (DAB) made the first digital radio and MP3, well, that simply created the new music experience, triggering a development that continues to this day. This was possible thanks to the vision that a global audio-video-systems standard would take over the world. It did.

Making television digital

MPEG-1 did not make all media digital; television was the major exception. This was an intricate world where politics, commercial interests, protection of culture and more had defied all attempts made by established standards organisations. MPEG applied its recipe and produced an effective MPEG-2 specification that added DSM-CC to support TV distribution on cable. Sharp vision, excellent technology and unstinting promotion efforts delivered the result.

Making media ICT friendly

MP3 encoding and decoding on a PC was achieved in the early days of the standard, but it was an announcement by Intel that MPEG-2 video could be decoded in real time on its x86 chips that made headlines. The real marriage between media and ICT (defined as IT + mobile) was the planned result of MPEG-4. It delivered two video standards in sequence (Visual and AVC), the ultimate audio format (AAC in all its variations), the File Format (ISO Base Media File Format, ISOBMFF), Fonts (Open Font Format) and a lot of other standard technologies that are still largely in use today in spite of the fast-evolving technology scenario.

Media not just for humans

MPEG-7, conceived in the mid-1990s, was a project ahead of its time. It was triggered by the vision that 500 TV channels would become available thanks to the bandwidth that MPEG-2 saved on cable with the technology of the time. The idea was to enable the description of content (audio, video and multimedia) in the same bit-thrifty way as MPEG had done for MPEG-1/-2 and was doing for MPEG-4. The descriptions would then be distributed to machines to enable them to respond to human queries. Audio-Visual Description Profile (AVDP) is an example of how MPEG-7 is used in the content production world, but more is expected in the upcoming Video Coding for Machines work.

E-commerce of media

Around the turn of the millennium, there was an intense debate on how media could be handled in the new context enabled by MPEG standards. This had been triggered by the advent of Peer-to-Peer protocols that allowed new forms of distribution somehow at odds with practices and laws. With MPEG-21, MPEG developed a comprehensive framework and a suite of standards to enable e-commerce of media that respected the rights and interests of the parties involved. Some of these are the specification of: Digital Item (DI), identification of DIs and their components, protection of DIs, machine-readable languages to express rights and contracts, adaptation of DIs and more. Industry has taken pieces of MPEG-21, but not the entire framework yet.

Standards for media combinations

At the beginning of the new millennium MPEG had collected enough standards that the following question was asked: how can we combine, in a standard way, a set of content items each represented by MPEG standards or, when MPEG standards are not available, by other standards? This was the start of MPEG-A, a suite of standards called Multimedia Application Formats (MAF). Examples are Surveillance AF, Interactive Music AF (IMAF), Augmented Reality AF (ARAF), Common Media AF (CMAF) and Multi-Image AF (MIAF). CMAF is actually affecting millions of streaming devices today.

Systems-Video-Audio à la carte

With the main elements of the MPEG-4 standard in place, MPEG still needed systems, video and audio standards, but could not define them as a unified standard. This was the birth of 3 standard suites: MPEG-B (Systems), MPEG-C (Video) and MPEG-D (Audio). Among the most relevant standards we mention Common encryption format (CENC) for ISOBMFF and MPEG-2 TS, Reconfigurable Video Coding (RVC) and Unified speech and audio coding (USAC). The last is the only standard that is capable of encoding audio and speech with a quality superior to the best audio codec and the best speech codec.

Interacting with media

Media can be defined as virtual representations of audio and video information that match, hopefully in a faithful way, something that exists in the real world, or as representations of synthetically-generated audio and video information, or as a mix of the two. MPEG started to tackle this issue in the middle of the first decade of the 2000s, at the time Second Life offered an attractive paradigm for interaction with synthetically-generated audio and video information. MPEG developed MPEG-V, a framework and a suite of standards for the information flowing from sensors and to actuators and for the characteristics of virtual world objects.

Getting media in any way

Broadcasting was the first system for mass distribution of media, both audio and video. Originally it was strictly one way; cable added return information, then the telecommunication networks provided the technical means to achieve full two-way distribution. With its MPEG-2 standard, MPEG provided the full stack from transport up. This was universally adopted by broadcasting, but the Internet Protocol (IP) was the transport selected for telecom distribution. With MPEG-H, MPEG provided a unified solution where content meant for one-way distribution can be seamlessly distributed in a two-way fashion. With this Systems-Video-Audio based suite of standards MPEG has achieved unification of media distribution.

Facing an unpredictable internet

Probably most readers have never heard of the Asynchronous Transfer Mode (ATM), designed to transport fixed-size packets over a fixed route that is set up between two points before any data is transferred. ATM’s AAL1 could have guaranteed bandwidth, but had to give way to the leaner and cheaper IP. The successful digitisation we live in is paid for with unpredictability. You start with a good bandwidth between you and the source, but a moment later the bandwidth available is cut by half. A disaster for those who want to provide reliable services. MPEG-DASH is the standard that allows a consumer device to request information (video, mostly) at the bitrate that matches the bitrate made available by the network at a given instant.
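To make the idea concrete, here is a minimal sketch of how a client might choose among the bitrates a service advertises. It is only an illustration: DASH standardises the manifest and the segment formats and leaves the choice of adaptation logic to the client, so the function name pick_bitrate, the 0.8 safety factor and the bitrate ladder below are all assumptions made for this example.

```python
from typing import Sequence


def pick_bitrate(
    available_bps: Sequence[int],
    measured_throughput_bps: float,
    safety_factor: float = 0.8,  # assumed headroom, not a value from any standard
) -> int:
    """Return the highest advertised bitrate that fits the throughput budget.

    Falls back to the lowest advertised bitrate when nothing fits,
    so that playback can continue at reduced quality.
    """
    if not available_bps:
        raise ValueError("no bitrates advertised")
    budget = measured_throughput_bps * safety_factor
    candidates = [b for b in available_bps if b <= budget]
    return max(candidates) if candidates else min(available_bps)


# Hypothetical bitrate ladder, in bits per second.
ladder = [500_000, 1_500_000, 3_000_000, 6_000_000]

print(pick_bitrate(ladder, 8_000_000))  # good bandwidth: 6000000
print(pick_bitrate(ladder, 4_000_000))  # bandwidth cut by half: 3000000
```

A real player would repeat this decision for every segment it requests, typically smoothing the throughput estimate and taking the buffer level into account as well.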

The immersive media dream

MPEG has dreamt of immersion in media for a quarter of a century 😊. In the second half of the 1990s MPEG developed the MPEG-2 Multiview Profile, the first attempt at providing the two eyes of a viewer with the kind of different information the eyes receive when they are hit by the light reflected by an object. The latest attempts were the Multiview and 3D extensions of HEVC. Technology is maturing, but the context is far from stable as companies providing solutions come and go. MPEG is developing standards in this slippery space based on 6 key points:

  1. Architecture for immersive services;
  2. Omnidirectional MediA Format (OMAF) for omnidirectional media applications (e.g. 360° video) and a basis for integration of other technologies;
  3. Immersive video starting from 3DoF+;
  4. Immersive audio (6DoF);
  5. Point Clouds providing an easy way to manipulate 3D visual objects;
  6. Network based Media Processing (NBMP) to allow a user to get the network to do some processing of their media.

Media devices are Things

The Internet of Things (IoT) paradigm is well known, but how can we apply the general IoT paradigm to media? MPEG-IoMT (Internet of Media Things) is an MPEG standard suite providing interfaces, protocols and associated media-related information representations that enable advanced services and applications based on human-to-device and device-to-device interaction. IoMT will be the platform on which new standards such as Video Coding for Machines will be hosted.

More supple compression

MPEG video coding standards have been hugely successful. However, in certain domains, such as internet streaming, adoption encounters non-technical difficulties. Essential Video Coding (EVC) is the standard that will yield excellent performance with the prospect of easier licensing.

Compression for all

MPEG has developed an impressive number of technologies whose focus is on compression and transport of data. Some are strictly media-related. Others, however, have a more general applicability. That this is true and can be implemented is demonstrated by MPEG-G, a standard that allows efficient transport of DNA reads obtained by high-speed sequencing machines. MPEG-G compression is lossless and will allow savings on storage and transmission costs and in access to DNA information for clinical analyses.

The master returns

The master had a really long journey of 31 years, but has finally returned. Will he say to MPEG: “Well done, good and trustworthy servant; you have been trustworthy in a few things, I will put you in charge of many things; enter into the joy of your master” or will he say: “throw this lazy servant into the outer darkness, where there will be weeping and gnashing of teeth”?

Posts in this thread