The MPEG Metamorphoses


In past publications, I have often talked about how many times MPEG has changed its skin during its 3-decade long life. In this article I would like to add substance to this claim by giving a rather complete, albeit succinct, account. You can find a more detailed story at Riding the Media Bits.

The early years


MPEG started with the idea of creating a video coding standard for interactive video on compact disc (CD). The idea of opening another route to video coding standards had become an obsession to me because I had been working for many years in video coding research without seeing any trace of consumer-level devices for what was touted as the killer application at that time: video telephony. I thought that if the manufacturing prowess of the Consumer Electronics (CE) industry could be exploited, that industry could supply telco customers with those devices, so that telcos would be pushed into upgrading their networks to digital in order to withstand the expected high videophone traffic.

The net bitstream from CD – 1.4 Mbit/s – is close to the 1.544 Mbit/s of the primary digital multiplex in the USA and Japan. Therefore it was natural to set a target bitrate of 1.5 Mbit/s as a token of the CE and telco convergence (at the video terminal level).
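The two figures come from simple arithmetic; a quick check (assuming the CD figure refers to the net CD-DA audio rate):

```python
# Net CD-DA audio bitrate: 44.1 kHz sampling, 16 bits per sample, 2 channels
cd_bitrate = 44_100 * 16 * 2      # 1_411_200 bit/s, i.e. ~1.4 Mbit/s
t1_bitrate = 1_544_000            # US/Japan primary digital multiplex (T1)
print(cd_bitrate, t1_bitrate)
```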

At MPEG1 (1988/05) 29 experts attended. The work plan was agreed to be MPEG-1 at 1-1.5 Mbit/s, MPEG-2 at 1.5-10 Mbit/s and MPEG-3 at 10-60 Mbit/s (the numbering of standards came later).

For six months all activities happened in single sessions. However, 3 areas were singled out for specific activities: quality assessment (Test), complexity issues in implementing video codecs in silicon (VLSI) and characteristics of digital storage media (DSM). The last activity was needed because CD was a type of medium quite dissimilar from the telecom networks and broadcast channels with which video coding experts were familiar.

In the following months I dedicated my efforts to quell another obsession of mine: humans do not generally value video without audio. The experience of the ISDN videophone, where, for organisational reasons, video was compressed by 3 orders of magnitude into 64 kbit/s while audio was kept uncompressed in another 64 kbit/s stream, pushed me into creating an MPEG subgroup dedicated to Audio coding. Audio, however, was not the speech used in videotelephony (for which there were plenty of experts in ITU-T), but the audio (music) typically recorded on CDs. Therefore action was required lest MPEG end up like videoconferencing, with a state-of-the-art video compression standard but no audio (music), or with a quality unsatisfactory for the target “entertainment-level” service.

The Audio subgroup was established at MPEG4 (1988/10) under the chairmanship of Hans Mussmann, just 7 months after MPEG1, while the Video subgroup was established at MPEG7 (1989/07), under the chairmanship of Didier Le Gall, about a year after MPEG1.

The other concern of mine was that integrating the audio component into a system that had not been designed for it could lead to technical oversights that could only be belatedly corrected with some abominable hacks. Hence the idea of a “Systems” activity, initially similar to the H.221 function of the ISDN videophone (a traditional frame and multiframe-based multiplexer), but with better performance because I expected it to be more technically forward looking.

At MPEG8 (1989/11) all informal activities were formalised into subgroups: Test (Tsuneyoshi Hidaka), DSM (Takuyo Kogure), Systems (Al Simon) and VLSI (Colin Smith).


Discussions on what would eventually become the MPEG-2 standard started at MPEG11 (1990/07). The scope of the still ongoing MPEG-1 project was nothing compared to the ambitions of the MPEG-2 project. The goal of MPEG-2 was to provide a standard that would enable the cable, terrestrial TV, satellite television, telco and package media industries – worth in total hundreds of billions of USD – to go digital in compressed form.

Therefore, at MPEG12 (1990/09) the Requirements group was established under the chairmanship of Sakae Okubo, the rapporteur of the ITU-T Specialists Group on Coding for Visual Telephony. This signalled the fact that MPEG-2 Video (and Systems) were joint projects. The mandate of the Requirements Group was to distil the requirements coming from the different industries into one coordinated set of requirements.

The Audio and Video subgroups had their minds split in two, with one half engaged in finishing their MPEG-1 standards and the other half in initiating the work on the next MPEG-2 standard. This was just the first time MPEG subgroups had to split their minds.

In those early years subgroup chairs changed rather frequently. At MPEG9 (1990/02) Colin (VLSI) was replaced by Geoff Morrison and the name of the group was changed to Implementation Study Group (ISG) to signal the fact that not only hardware implementation was considered, but software implementation as well. At MPEG12 (1990/09) Al (Systems) was replaced by Sandy MacInnis and Hans (Audio) was replaced by Peter Noll.

MPEG29 (1994/11) approved the Systems, Video and Audio parts of the MPEG-2 standard and some of the subgroup chairs saw their mission as accomplished. The first move had come at MPEG28 (1994/07), when Sandy (Systems) was replaced by Jan van der Meer to finalise the issues left over from MPEG-2.

The MPEG subgroups did a great job in finishing several pending MPEG-2 activities such as MPEG-2 Video Multiview and 4:2:2 profiles, MPEG-2 AAC, DSM-CC and more.

A new skin of coding

In the early 1990s, MPEG-1 was not finished and MPEG-2 had barely started, but talks about a new video coding standard for very low bitrates (e.g. 10 kbit/s) were already under way. The name eventually assigned to the project was MPEG-4, because the MPEG-3 standard envisaged at MPEG1 had been merged with MPEG-2 by bringing the upper bound of the bitrate range to 10 Mbit/s.

MPEG-4, whose title eventually settled to Coding of Audio-Visual Objects, was a completely different standard from the preceding two in that it aimed at integrating the world of audio and video, so far under the purview of broadcasting, CE and telecommunication, with the world of 3D Graphics, definitely within the purview of the Information Technology (IT) industry.

At MPEG20 (1992/11) a new subgroup called Applications and Operational Environments (AOE) was established under the chairmanship of Cliff Reader. This group took charge of developing the requirements for the new MPEG-4 project and spawned three groups inside it: “MPEG-4 requirements”, “Synthetic and Natural Hybrid Coding (SNHC)” and “MPEG-4 Systems”.

The transition from the “old MPEG” (MPEG-1 and MPEG-2) to the “new MPEG” (MPEG-4) was quite laborious, with many organisational and personnel changes. At MPEG30 Didier (Video) was replaced by Thomas Sikora and Peter (Audio) was replaced by Peter Schreiner. At MPEG32 Geoff (ISG) was replaced by Paul Fellows and Tsuneyoshi (Test) was replaced by Laura Contin.

MPEG-4 Visual was successfully concluded thanks to the great efforts of Thomas (Video) and Laura (Test) and the very wide participation by experts. The foundations of the extremely successful AAC standards were laid down by Peter (Audio) and the Audio subgroup experts.

At MPEG34 (1996/03) Cliff left MPEG and at MPEG35 (1996/07) a major reorganisation took place:

  1. The “AOE requirements” activity was mapped to the Requirements subgroup under the chairmanship of Rob Koenen, after a hiatus of 3 meetings after Sakae (Requirements) had left.
  2. The “AOE systems” activity was mapped to the Systems subgroup under the chairmanship of Olivier Avaro.
  3. The “AOE SNHC” activity became a new SNHC subgroup under the chairmanship of Peter Doenges. Peter was replaced by Euee Jang at MPEG49 (1999/10).

At MPEG40 (1997/07) a DSM activity became a new subgroup with the name Delivery Multimedia Integration Framework (DMIF) under the chairmanship of Vahe Balabanian. DMIF addressed the problem of virtualising the distribution medium (broadcast, network and storage) from the Systems level by defining appropriate interfaces (API). At MPEG47 (1999/03) Guido Franceschini took over, with a 2-meeting tenure after which the DMIF subgroup was closed (1999/07).

At MPEG41 Peter (Audio) was replaced by Schuyler Quackenbush, who has since been running the Audio group for 23 years and is the longest-serving MPEG chair.

At MPEG46 (1998/12) Paul (ISG) was replaced by Marco Mattavelli. Under Marco’s tenure, such standards as MPEG-4 Reference hardware description, an extension to VHDL of the notion of Reference Software, and Reconfigurable Media Coding were developed.

The MPEG-4 standard is unique in MPEG history. MPEG-1 and -2 were great standards because they brought together established large industries with completely different agendas, but MPEG-4 is the standard that bonded the initial MPEG industries with the IT industry. The standard posed big challenges – video objects, audio objects, synthetic audio and video, VRML extensions, file format and more – and Chairs and experts dedicated enormous resources to the project to face them. MPEG-4 is a lively standard even today, almost 30 years after we first started working on it, and has the largest number of parts.


At MPEG33 (1996/01) the Liaison subgroup was created under the chairmanship of Barry Haskell to handle the growing network of organisations MPEG was liaising with (~50). At MPEG56 Barry, a veteran of the video coding old guard, left MPEG and at MPEG57 (2001/07) Jan Bormans took over and continued until MPEG71 (2005/01) when Kate Grant took over. The Liaison subgroup was closed at MPEG84 (2008/04). Today liaisons are coordinated at the Chairs meeting, drafted by the relevant subgroup and reviewed by the plenary.

An early skin change

In 1996 MPEG started addressing MPEG-7, a media-related standard but of a completely different nature from the preceding three: it was about media descriptions and their efficient compression. At MPEG48 (1999/07) it became clear that we needed a new subgroup, called Multimedia Description Schemes (MDS), to carry out part of the work.

Philippe Salembier was put in charge of the MDS subgroup, which was initially in charge of all MPEG-7 matters that did not involve Systems, Video and Audio. At MPEG56 (2001/03) John Smith took over the position, which he held until MPEG70 (2004/10) when Ian Burnett took over until the MDS group was closed at MPEG87 (2009/02).

The media description skin has had several revivals since then. One is the Part 13 – Compact Descriptors for Visual Search (CDVS) standard of the first half of the 2010s. Another is the Part 15 – Compact Descriptors for Video Analysis (CDVA) standard developed in the middle-to-second half of the 2010s. Finally, Part 17 – Compression of neural networks for multimedia content description and analysis is preparing a basic compression technology for neural network-based media description.

Another video coding

At MPEG46 (1998/12) Laura (Test) was replaced by Vittorio Baroncini. At MPEG54 (2000/10) Thomas (Video) left MPEG and at MPEG56 (2001/03) Jens-Rainer Ohm was appointed as Video chair.

Vittorio brought the expertise to carry out the subjective tests required by the collaboration with ITU-T SG 16, restarted to develop the Advanced Video Coding (AVC) standard. At MPEG58 (2001/12) Jens was appointed as co-chair of a joint subgroup with ITU-T called the Joint Video Team (JVT). The other co-chair was Gary Sullivan, rapporteur of the ITU-T SG 16 Video Coding Experts Group (VCEG). The JVT continued its work until well after the AVC standard was released at MPEG64 (2003/03). Since then Gary has attended the chairs meetings as a token of the collaboration between the two groups.

Still media-related, but a different “coding”

At MPEG49 (1999/10) the many inputs received from the market prompted me to propose that MPEG develop a new standard with the following vision: “Every human is potentially an element of a network involving billions of content providers, value adders, packagers, service providers, resellers, consumers …”.

The standard was eventually called MPEG-21 Multimedia Framework. MPEG-21 can be described as the “suite of standards that enable media ecommerce”.

The MDS subgroup was largely in charge of this project which continued during the first decade of the 2000s with occasional revivals afterwards. Today MPEG-21 standards are handled by the Systems subgroup.

Under the same heading of “different coding” it is important to mention the Open Font Format (OFF), a standard built on the request made by Adobe, Apple and Microsoft to maintain the OpenType specification. The word “maintenance” in MPEG has a different meaning, because OFF has had many extensions, developed “outside” MPEG in an open ad hoc group with strong industry participation and ratified by MPEG.

A standard of standards

In the early 2000s MPEG could look back at its first decade and a half of operation with satisfaction: its standards covered video, audio and 3D Graphics coding, systems aspects, transport (MPEG-2 TS and MPEG-4 File Format) and more. While refinements of its already impressive assets were under way, MPEG wondered whether there were other areas it could cover. The answer was: the coding of “combinations of MPEG coded media”. That was the beginning of a long series of 20 standards originally developed by the groups in charge of the individual media, e.g. Part 2 – MPEG music player application format was developed by the Audio subgroup and Part 3 – MPEG photo player application format was developed by the Video subgroup. Today all MPEG-A standards, e.g. the very successful Part 19 – Common Media Application Format, are developed by the Systems subgroup.

The mid 2000s

Around the mid 2000s MPEG felt that there was still a need for more Systems, Video and Audio standards, but did not have the usual Systems, Video and Audio “triad” umbrella it had had until then with MPEG-1, -2, -4 and -7. So it decided to create containers for those standards and called them MPEG-B (Systems), MPEG-C (Video) and MPEG-D (Audio).

MPEG also ventured in new areas:

  1. Specification of a media device software stack (MPEG-E)
  2. Communication with and between virtual worlds (MPEG-V)
  3. Multimedia service platform technologies (MPEG-M)
  4. Rich media user interfaces (MPEG-U)

Rob (Requirements) continued until MPEG58 (2001/12). He was replaced by Fernando Pereira until MPEG64 (2003/03) when Rob returned, holding his position until MPEG71 (2005/01) when Fernando took over again until MPEG82 (2007/10) when he left MPEG.

The Requirements subgroup is the “control board” of MPEG, in the sense that Requirements gives proposals for standards the shape that will be implemented by the operational groups after the Call for Proposals. Therefore the Rob-Fernando duo has been in the control room of MPEG for some 40% of MPEG's life.

Vittorio (Test) continued until MPEG68 (2004/03) when he was replaced by T. Oelbaum, who held the position until MPEG81 (2007/07).

Olivier (Systems) kept his position until MPEG86 (2008/10), when he left MPEG to pursue his entrepreneurial ambitions. Olivier was in charge of the infrastructure that keeps MPEG standards together for 13 years and is the third longest-serving MPEG chair.

Euee (SNHC) kept his position until MPEG59 (2002/03). He was replaced by Mikaël Bourges-Sévenier, who continued until MPEG70 (2004/10). Mikaël was then replaced by Mahnjin Han, who continued until MPEG78 (2006/10). The SNHC subgroup has produced valuable standards. However, these have had a hard time penetrating an industry that is content with less performing but freely available standards.

The return of the triad

The end of the years 2000s signaled a major change in MPEG. When Fernando (Requirements) left MPEG at MPEG82 (2007/10), the task of developing requirements was first assigned to the individual groups. The experiment lasted 4 meetings but it demonstrated that it was not the right solution. Therefore, Jörn Ostermann was appointed as Requirements chair at MPEG87 (2009/02). That was just in time for the handling of the requirements of the new Audio-Video-Systems triad-based MPEG-H standard.

MPEG-H included the MPEG Media Transport (MMT) part, the video coding standard that eventually became High Efficiency Video Coding (HEVC), and 3D Audio. MPEG-H was adopted by the ATSC as a tool to implement new forms of broadcasting services where traditional broadcasting and the internet not only coexist but cooperate.

The Requirements, and then the Systems, subgroups were also quickly overloaded by another project called DASH, which aimed at “taming” the internet, turning it from an unreliable transport into one the end-user device could adapt to.

The two Systems projects – MMT and DASH – were managed by Youngkwon Lim who took over from Olivier at MPEG86 (2008/10).

At MPEG87 (2009/02) the MDS subgroup was closed. At the same meeting, Vittorio resumed his role as chair of the Test subgroup, just in time for the new round of subjective tests for the HEVC Call for Evidence and Call for Proposals.

The Joint Collaborative Team on Video Coding between ITU-T and MPEG (JCT-VC) was established at MPEG92 (2010/04), co-chaired by Gary and Jens as in the AVC project. At its peak, the VC group was very large and processed in excess of 1,000 documents per meeting. When the group was still busy developing the main (2D video coding) part of HEVC, 3D video coding became important and a new subgroup called JCT-3V (joint with ITU-T) was established at MPEG100. The 3V subgroup closed its activities at MPEG115 (2016/05), while the VC subgroup is still active, mostly in maintenance mode.

The recent years

In the first half of the 2010s MPEG developed the Augmented Reality Application Format and the Mixed and Augmented Reality (MAR) Reference Model in a joint ad hoc group with SC 24/WG 9.

In 2016 MPEG kicked off the work on MPEG-I – Coded representation of immersive media. Part 3 of MPEG-I is Versatile Video Coding (VVC), the latest video coding standard, developed by the new Joint Video Experts Team (JVET) between ITU-T and MPEG established at MPEG114 (2016/02). It is expected to become FDIS at MPEG131 (2020/06).

The JVET co-chairs are again Jens and Gary. In the (regularly materialised) anticipation that JVET would again be overloaded with contributions, Jens was replaced as Video chair by Lu Yu at MPEG121 (2018/01).

The Video subgroup is currently engaged in two 2D video coding standards of rather different nature – Essential Video Coding (EVC) and Low Complexity Enhancement Video Coding (LCEVC) – and is working on the MPEG Immersive Video (MIV) project, due to become FDIS at MPEG134 (2021/03).

MIV is connected with another exciting area that in this article we left with the name of SNHC under the chairmanship of Mahnjin. At MPEG79 (2007/01) Marius Preda took over SNHC from Mahnjin to continue the traditional SNHC activities. At MPEG89 (2009/06) SNHC was renamed 3D Graphics (3DG).

In the mid 2010s the 3DG subgroup started several explorations, in particular Point Cloud Compression (PCC) and Internet of Media Things (IoMT). The former has split into two standards: Video-based (V-PCC) and Graphics-based (G-PCC). The latter has reached FDIS recently.

Another promising activity started at MPEG109 (2014/03) and has now become the Genomic Information Representation (MPEG-G) standard. This standard signals the intention to bring the benefits of compression to industries other than media that process other data types.


This article was a long overview of 32 years of MPEG life. The intention was not to talk about MPEG standards, but about how the MPEG organisation morphed to suit the needs of standardisation.

Of course, structure without people is nothing. It was not obviously possible to mention the thousands of experts who made MPEG standards, but I thought that it was my duty to record the names of subgroup chairs who drove their development. You can see a complete table of all meetings and MPEG Chairs here.

In recent years the MPEG structure has remained stable, but there is always room for improvement. However, this must be driven by needs, not by ideology.

One possible improvement is to make the Genomic data coding activity a formal subgroup as a first step in anticipation of more standards to code other non-media data. The other is to inject more market awareness into the phase that defines the existence first and then the characteristics of MPEG standards.

But this is definitely another story.

National interests, international standards and MPEG

Having spent a considerable amount of my time in standardisation, I have developed my own definition of a standard: “the documented agreement reached by a group of individuals who recognise the advantage of all doing certain things in an agreed way”. Indeed, I believe that, if we exclude some areas such as safety, in matters of standards the authority principle should not hold. Forcing free people to do things against their interest is an impossible endeavour. If doing certain things in a certain way is not convenient, people will shun a standard even if it bears the most august credentials.

Medieval Europe was a place where my definition of standard reached an atomic level. However, with the birth of national centralised states and, later, the industrial revolution, national standards came to the fore. Oddly enough, national standards institutions such as the British Standards Institute (BSI), originally called Engineering Standards Committee and probably the first of its kind, were established just before World War I, when the first instance of modern globalisation took shape.

Over the years, national standards became a powerful instrument to further a country’s industrial and commercial interests. As late as 1989 MPEG had trouble displaying 625/50 video coding simulation results at a USA venue because import of 625/50 TV sets in the country was forbidden at that time (and no one had an interest in making such sets). This “protection of national interests” is the cause of the 33 pages of the ITU-R Report 624 – Characteristics of television systems of 1990 available here containing tables and descriptions of the different analogue television systems used at the time by the 193 countries of the United Nations.

The same spirit of “protecting national interests” informed the CCITT SGXV WG4 Specialists Group on Coding for Visual Telephony (that everybody at that time called the Okubo group) when it defined the Common Intermediate Format (CIF) in Recommendation H.261, to make it possible for a 525/60 camera to communicate with a 625/50 monitor (and for a 625/50 camera with a 525/60 monitor).

That solution was a “compromise” video format (actually not a real video format because it was used only inside the video codec) with one quarter of the 625/50 spatial resolution and one half the 525/60 temporal resolution. This was a typical political solution of the time (and one that 525/60 people later regretted because the spatial interpolation required by CIF was more onerous than the temporal interpolation in 625/50). Everybody (but me, who opposed the solution) felt happy because everybody had to “share the burden” when communicating across regions with different video formats.

International standardisation is split in 3 – IEC, ISO and ITU – but IEC and ISO share the principle that standards for a technical area are developed by a Technical Committee (or a Subcommittee) managed by an international secretariat funded and manned by a national standards organisation (a so-called National Body). Things in ITU are slightly different because ITU itself provides the secretariat, whose personnel is provided by national administrations.

In the traditional context of standards being established by a national standards committee to protect the national interest, an international standards committee was seen as the place where national interests, as represented by their national standards bodies, had to be protected. Therefore, holding the secretariat of a committee was seen as a major achievement for the country that ran the secretariat. As an emblem of the achievement, the country had the right to nominate (in practice, appoint) the chairperson of the committee (in some committees this is rigorously enforced. In some others, things are taken more lightly).

That was then, but actually it is still so even now in many standardisation contexts. The case of CIF mentioned above shows that, in the area of video coding standards, then the prerogative of the ITU-T “for Visual Telephony”, the influence of national interests was still strong. MPEG, however, changed the sides of the equation. One of the first things it did when it developed MPEG-1 Video was to define test sequences in both 525/60 and 625/50 and then issue a Call for Proposals where respondents could submit coded sequences in either format at their choice. MPEG did not use CIF but SIF, where the format was either a quarter of the spatial resolution and one half of the temporal resolution of 525/60 (i.e. 240 lines x 352 pixels) or a quarter of the spatial resolution and one half of the temporal resolution of 625/50 (i.e. 288 lines x 352 pixels).
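The SIF arithmetic can be sketched as follows (a minimal illustration; the helper name is mine, and the active-resolution figures assumed are the usual 704 active pixels and 480 or 576 active lines):

```python
def sif_from(active_lines, pixels_per_line, field_rate):
    """Quarter spatial resolution (half in each dimension), half temporal."""
    return active_lines // 2, pixels_per_line // 2, field_rate // 2

# 525/60: ~480 active lines, 704 active pixels, 60 fields/s
print(sif_from(480, 704, 60))   # (240, 352, 30)
# 625/50: 576 active lines, 704 active pixels, 50 fields/s
print(sif_from(576, 704, 50))   # (288, 352, 25)
```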

By systematically defusing political issues and converting them into technical issues, MPEG succeeded in the impossible task of defining compressed media formats with an international scope. However, by kicking political issues out of the meeting rooms, MPEG changed the nature and role of the chairmen and secretariat of the parent subcommittee SC 29. The first yearly SC 29 plenary meetings lasted 3 days, but later the duration was reduced to 1 day, and in some cases all matters were handled in half a day.

One of the most contentious areas of standardisation (remember the epic battles on the HDTV production format of 1986 and before) was completely tamed and reduced to technical battles where experts assess the quality of the solution proposed and not how it is dressed in political clothing. This does not mean that the battles are not epic, but for sure they are rational.

I do not remember having heard complaints on the part of the industry regarding the de-politicised state of affairs in media coding standardisation. Therefore it is time to ask whether we should not be dispensed from the pompous ritual of countries expressing national interests through the national bodies holding secretariats and chairs of international standards committees, when in fact there are global industrial interests poorly mapped onto a network of countries actually deprived of national interests.

Media, linked media and applications


In a technology space moving at an accelerated pace like the one MPEG has the task to develop standards for, it is difficult to have a clear plan for the future (MPEG has a 5-year plan, though).

Still, when MPEG was developing the Media Linking Application Format (MLAF), it “discovered” that it had developed or was developing several related standards – MPEG-7, Compact descriptors for visual search (CDVS), Compact descriptors for video analysis (CDVA) and Media Orchestration.

The collection of these standards (and of others in the early phases of conception or development, e.g. Neural Network Compression and Video Coding for Machines) helps create the Multimedia Linking Environment, i.e. an environment where it is possible to create a link between a given spatio-temporal region of a media object and spatio-temporal regions in other media objects.

This article explains the benefits brought by the MLAF “multimedia linking” standard, also for very concrete applications.

Multimedia Linking Environment

Until a quarter of a century ago, virtually the only device that could establish relationships between different media items was the brain. A very poor substitute was a note in a book, recording a possible relationship between the place in the book where the note was written and content in the same or different books.

The possibility to link a place in a web page to another place in another web page, or to a media object, was the great innovation brought by the web. However, a quarter of a century later, with a billion web sites and quadrillions of linked web pages, we must recognise that the notion of linking is a pervasive one and not necessarily connected with the web.

MPEG has dedicated significant resources to the problem described by the sentence “I have a media object and I want to know which other related media objects exist in a multimedia database”, represented in the MPEG-7 model depicted in the figure below.

However, MPEG-7 is an instance of the more general problem of linking a given spatio-temporal region of a media object to spatio-temporal regions in other media objects.

These are some examples:

  1. A synthetic object is created out of a number of pictures of an object. There is a relationship between the pictures and the synthetic object;
  2. There is a virtual replica of a physical place. There is a relationship between the physical place and the virtual replica;
  3. A user is experiencing a virtual place in a virtual reality application. There is a relationship between the two virtual places;
  4. A user creates a media object by mashing up a set of media items coming from different sources. There is a relationship between the media items and the mashed up media object.

MPEG has produced MPEG-A part 16 (Media Linking Application Format – MLAF), which specifies a data format called bridget that can be used to link any kind of media. MPEG has also developed a number of standards that play an auxiliary role in the “media linking” context outlined by the examples above.

  1. MPEG-7 parts 1 (Systems), 3 (Visual), 4 (Audio) and 5 (Multimedia) provide the systems elements, and the visual (image and video), audio and multimedia descriptions.
  2. MPEG-7 parts 13 (Compact descriptors for visual search) and 15 (Compact descriptors for video analysis) provide new-generation image and video descriptors.
  3. MPEG-B part 13 (Media Orchestration) provides the means to mash up media items and other data to create personal user experiences.

The MLAF standard

A bridget is a link between a “source” content and a “destination” content. It contains information on

  1. The source and the destination content
  2. The link between the two
  3. The information to be presented to the users who consume the source content.

The last information is the most relevant to the users because it is the one that enables them to decide whether the destination content is of interest to them.

The structure of the MLAF representation (points 1 and 2) is based on the MPEG-21 Digital Item Container, implemented as a specialised MPEG-21 Annotation. The spatio-temporal scope is represented using the expressive power of two MPEG-7 tools and the general descriptive capability of the MPEG-21 Digital Item. They allow a bridget author to specify a wide range of possible associations and to be as precise and granular as needed.
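As a toy illustration of points 1-3 above, a bridget can be thought of as a record like the following (the field names are mine and purely illustrative; the actual MLAF representation is an MPEG-21 Digital Item expressed in XML, not a Python object):

```python
from dataclasses import dataclass

@dataclass
class Bridget:
    source_id: str         # the "source" content the bridget is attached to
    source_span: tuple     # spatio-temporal scope in the source, e.g. (start_s, end_s)
    destination_id: str    # the "destination" content the bridget links to
    presentation: str      # what is shown to users consuming the source content

b = Bridget(source_id="tv-programme-42",
            source_span=(120.0, 135.0),
            destination_id="related-clip-7",
            presentation="More about this scene")
print(b.presentation)
```

The point of the sketch is only that a bridget couples a scoped region of the source with a destination and with the presentation information the viewer uses to decide whether to follow the link.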

The native format to present bridget information is based on the MPEG-4 Scene description and application engine. Nevertheless, a bridget can be directly linked to any external presentation resource (e.g., an HTML page, an SVG graphic or others).

Bridgets for companion screen content

An interesting application of the MLAF standard is depicted in the figure below, which describes the entire bridget workflow.

    1. A TV program, scheduled to be broadcast at a future time, is uploaded to the broadcast server [1] and to the bridget Authoring Tool (BAT) [2].
    2. BAT computes and stores the program’s audio fingerprints to the Audio Fingerprint Server (AFS) [3].
    3. The bridget editor uses BAT to create bridgets [4].
    4. When the editor is done, all bridgets of the program and the referenced media objects are uploaded to the Publishing Server [5].
    5. At the scheduled time, the TV program is broadcast [6].
    6. The end user’s app computes the audio fingerprint and sends it to the Audio Fingerprint Server [7].
    7. AFS sends the user’s app the ID and time of the program the user is watching [8].
    8. When the app alerts the user that a bridget is available, the viewer may decide to
      1. Turn their eyes from the TV set to their handset
      2. Play the content in the bridget [9]
      3. Share the bridget on a social network [10].
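Steps 2 and 6-7 above can be sketched in a few lines. Real systems use robust audio fingerprint algorithms (not hashes) and a networked AFS; every name in this sketch is a hypothetical stand-in.

```python
import hashlib

def fingerprint(audio_chunk: bytes) -> str:
    # Stand-in for a real, noise-robust audio fingerprint algorithm.
    return hashlib.sha256(audio_chunk).hexdigest()[:16]

class AudioFingerprintServer:
    """Toy AFS mapping fingerprints to (program ID, time) pairs."""

    def __init__(self):
        self.index = {}  # fingerprint -> (program_id, time_s)

    def ingest(self, program_id, chunks):
        # Step 2: BAT stores the program's fingerprints before broadcast.
        for t, chunk in enumerate(chunks):
            self.index[fingerprint(chunk)] = (program_id, t)

    def lookup(self, chunk):
        # Steps 6-7: the app sends a fingerprint, AFS returns ID and time.
        return self.index.get(fingerprint(chunk))

afs = AudioFingerprintServer()
afs.ingest("prog-42", [b"intro", b"scene1", b"scene2"])
print(afs.lookup(b"scene1"))  # → ('prog-42', 1)
```

Once the app knows the program ID and the playback time, selecting which bridget to display reduces to checking which bridget's time scope contains the current time.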

This is the workflow of a recorded TV program. A similar scenario can be implemented for live programs. In this case, bridgets must be prepared in advance so that the publisher can select and broadcast a specific bridget when needed.

Standards are powerful tools that facilitate the introduction of new services, such as companion screen content. In this example, the bridget standard can stimulate the creation of independent authoring tools and end-user applications.

Creating bridgets

The bridget creation workflow depends on the types of media object the bridget represents.

Let’s assume that the bridget contains different media types such as an image, a textual description, an independently selectable sound track (e.g. an ad) and a video. Let’s also assume that the layout of the bridget has been produced beforehand.

This is the sequence of steps performed by the bridget editor:

    1. Select a time segment on the TV program timeline and a suitable layout
    2. Enter the appropriate text
    3. Provide a reference image (possibly taken from the video itself)
    4. Find a suitable image by using an automatic image search tool (e.g. based on the CDVS standard)
    5. Provide a reference video clip (possibly taken from the video itself)
    6. Find a suitable video clip by using an automatic video search tool (e.g. based on the CDVA standard)
    7. Add an audio file.

The resulting bridget will appear to the end user like this.

When all bridgets are created, the editor saves the bridgets and the media to the publishing server.

It is clear that the “success” of a bridget (in terms of number of users who open it) depends to a large extent on how the bridget is presented.

Why bridgets

Bridget was the title of a research project funded by the 7th Framework Research Program of the European Commission. The MLAF standard (ISO/IEC 23000-16) was developed at the instigation and with the participation of members of the Bridget project.

On this page you will find more information on how the TVBridge application can be used to create, publish and consume bridgets for recorded and live TV programs.




Standards and quality


Quality pervades our life: we talk of quality of life and we choose things on the basis of declared or perceived quality.

A standard is a product and, as such, may also be judged, although not exclusively, in terms of its quality. MPEG standards are no exception, and quality has been considered of paramount importance since MPEG’s early days.

Cosmesis is related to quality, but is a different beast. You can apply cosmesis at the end of a process, but that will not give quality to a product issued from that process. Quality must be an integral part of the process, or it will not be there at all.

In this article I will describe how MPEG has embedded quality in all phases of its standard development process and how it has measured quality in some illustrative cases.

Quality in the MPEG process

The business of MPEG is to produce standards that process information in such a way that users do not notice, or notice as little as possible, the effects of that processing when the standard is implemented in a product or service.

When MPEG considers the development of a new standard, it defines the objective of the standard (say, compression of video over a particular range of resolutions and bitrates) and its functionality. Typically, MPEG makes sure that it can deliver the standard with the agreed functionality by issuing a Call for Evidence (CfE). Industry members are requested to provide evidence that their technology is capable of achieving part or all of the identified requirements.

Quality is an important, if not essential, parameter for making a go/no-go decision. When MPEG assesses the CfE submissions, it may happen that established quality assessment procedures are found inadequate. That was the case for the 2009 Call for Evidence on High-Performance Video Coding (HVC). The high number of submissions received required the design of a new test procedure: the Expert Viewing Protocol (EVP). Later on, the EVP test method became ITU-R Recommendation BT.2095. While the execution of any other ITU recommendation of that time would have required more than three weeks, the EVP allowed the complete testing of all the submissions in three days.

Once MPEG has become confident of the feasibility of the new standard from the results of the CfE, a Call for Proposals (CfP) is issued with attached requirements. These can be considered the terms of the contract that MPEG stipulates with its client industries.

Testing of CfP submissions allows MPEG to develop a Test Model and initiate Core Experiments (CE). Each CE aims to optimise one part of the entire scheme.

In most cases the result of CEs involves quality evaluation. In the case of CfP responses, subjective testing is necessary because there are typically large differences between the coding technologies proposed. However, in the assessment of CE results, where smaller effects are involved, objective metrics are typically, but not exclusively, used because formal subjective testing is not feasible for logistic or cost reasons.
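PSNR (Peak Signal-to-Noise Ratio) is one of the objective metrics commonly used in video coding work of this kind. A minimal sketch, for 8-bit samples:

```python
import math

def psnr(reference, test, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equal-length sample lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)

# Illustrative samples: a reference and a lightly distorted reconstruction.
ref = [100, 120, 140, 160]
rec = [101, 119, 142, 158]
print(round(psnr(ref, rec), 2))  # → 44.15
```

A metric like this makes CE comparisons cheap and repeatable, at the cost of correlating only approximately with what viewers actually perceive, which is why subjective testing remains the reference for larger differences.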

When the development of the standard is completed, MPEG engages in the process called Verification Tests, which produces a publicly available report. This can be considered the proof, on the part of the supplier (MPEG), that the terms of the contract with its customers have been satisfied.

Samples of MPEG quality assessment

MPEG-1 Video CfP

The first MPEG CfP quality tests were carried out at the JVC Research Center in Kurihama (JP) in November 1989. 15 proposals of video coding algorithms operating at a maximum bitrate of 1.5 Mbit/s were tested and used to create the first Test Model at the following Eindhoven meeting in February 1990 (see the Press Release).

MPEG-2 Advanced Audio Coding (AAC)

In February 1998 the Verification Test allowed MPEG to conclude that “when auditioning using loudspeakers, AAC coding according to the ISO/IEC 13818-7 standard gives a level of stereo performance superior to that given by MPEG-1 Layer II and Layer III coders” (see the Verification Test Report). This showed that the goal of high audio quality at 64 kbps per channel for MPEG-2 AAC had been achieved.

Of course that was “just” MPEG-2 AAC with no substantial encoder optimisation. More than 20 years of MPEG-4 AAC progress have since brought down the bitrate per channel.

MPEG-4 Advanced Video Coding (AVC) 3D Video Coding CfP

The CfP for new 3D (stereo & auto-stereo) technologies was issued in 2012 and received a total of 24 complete submissions. Each submission produced 24 files representing the different viewing angles for each test case. Two sets of two and three viewing angles were blindly selected and used to synthesise the stereo and auto-stereo test files.

The test was carried out on standard 3D displays with glasses and on auto-stereoscopic displays. A total of 13 test laboratories took part in the test, running a total of 224 test sessions and hiring around 5000 non-expert viewers. Each test case was run by two laboratories, making it a fully redundant test.

MPEG-High Efficiency Video Coding (HEVC) CfP

The HEVC CfP covered 5 different classes of content, spanning resolutions from WQVGA (416×240) up to 2560×1600. For the first time, MPEG introduced two sets of constraints (low delay and random access) for different classes of target applications.

The HEVC CfP was a milestone because it required the biggest testing effort ever performed by any laboratory or group of laboratories until then. The CfP generated a total of 29 submissions and 4205 coded video files, plus the set of anchor coded files. Three testing laboratories took part in the tests, which lasted four months and involved around 1000 naïve (non-expert) subjects allocated to a total of 134 test sessions.

A common test set of about 10% of the total testing effort was included to monitor the consistency of results from the different laboratories. With this procedure it was possible to detect a set of low quality test results from one laboratory.
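One way such a cross-lab consistency check can be done is to correlate each laboratory's scores on the common clips against the mean of the other laboratories. The sketch below uses Pearson correlation; all scores are illustrative, not actual test data.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Mean opinion scores for the same 5 common clips, as measured by three labs
# (illustrative numbers; lab_C is deliberately inconsistent).
scores = {
    "lab_A": [4.1, 3.2, 2.5, 1.8, 4.6],
    "lab_B": [4.0, 3.4, 2.4, 1.9, 4.5],
    "lab_C": [2.0, 4.1, 4.0, 3.9, 2.1],
}

for lab, s in scores.items():
    others = [v for k, v in scores.items() if k != lab]
    mean_others = [sum(col) / len(col) for col in zip(*others)]
    print(lab, round(pearson(s, mean_others), 2))
```

A lab whose common-set scores correlate poorly (or negatively) with the consensus stands out immediately, which is how a set of low-quality results can be detected and excluded.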

Point Cloud Compression (PCC) CfP

The CfP asked proponents to provide 2D representations of content synthesised with their PCC technology, resulting in videos suitable for evaluation by means of established subjective assessment protocols.

Video clips for each of the received submissions were produced after a careful selection of the rendering conditions. A video rendering tool was used to generate, under the same conditions, two different video clips for each submission: a rotating view of a fixed synthesised image and a rotating view of a moving synthesised video clip. The rotations were selected in a blind way and the resulting video clips were subjectively assessed to rank the submissions.


Quality is what end users of media standards value as the most important feature. To respond to this requirement, MPEG has designed a standards development process that is permeated by quality considerations.

MPEG has no resources of its own. Therefore, sometimes it has to rely on the voluntary participation of many competent laboratories to carry out subjective tests.

The domain of media is very dynamic and, very often, MPEG cannot rely on established methods – both subjective and objective – to assess the quality of new compressed media types. Therefore, MPEG is constantly innovating the methodologies it uses to assess media quality.



How to make standards adopted by industry


There are many definitions of standard. In the Webster’s you find a definition of standard as “Something that is established by authority, custom or general consent as a model or example to be followed”, an oldish definition built on the idea that people must be directed towards their own good. In the Encyclopaedia Britannica you find “(A technical specification) that permits large production runs of component parts that are readily fitted to other parts without adjustment”, a definition driven by the idea that manufacturing is helped by the availability of different but compatible suppliers. Closer to my view of a standard is another Webster’s definition: “a conspicuous object (as a banner) formerly carried at the top of a pole and used to mark a rallying point especially in battle or to serve as an emblem”, driven by the idea that anybody can develop a standard, but its adoption depends on how satisfactory the proposed standard is to its intended users.

In many cases a standard is the result of the effort spent by a group of people who believe their interests are best served by agreeing to do certain things in a certain way. Agreeing on a standard may require a big effort (in MPEG, developing a standard may cost participating companies tens of millions of USD), but that is nothing compared to the effort required to convince “other people” that the standard is what they need.

In this article I will present some of the efforts that MPEG has made over its 30+ years to convince “other people” that MPEG standards are what they need.

Convincing other people to adopt a standard is a process

If you think that, once an MPEG standard is ready, all you need is a good marketing effort to get it adopted, you are missing the point. Anybody can put together a decent technical standard. Convincing other industries is a process that accompanies the development of the standard, starting from the moment the idea of a new standard takes shape.

In the early 1990’s all instances of the broadcasting industry – terrestrial, satellite and cable – were technically convinced that digital television was superior to analogue television. There were two problems, however. The first was that in some countries the industry espoused digital as an ally, while in other countries the industry rejected it as a threat. The second was that there were solutions here and there and some attempts at developing standards, but the solutions were proprietary and the attempts at standards often fraught with rivalries. MPEG had achieved some notoriety with its first (MPEG-1) standard but had to acquire new credibility vis-à-vis an industry that, already at that time, was worth more than 100 B$ p.a. and was understandably cautious with its decisions.

MPEG succeeded in convincing the broadcasting industry, even its reluctant segments, namely the European terrestrial broadcasting industry. The deal was to offer its Requirements group as the place where the individual industry segments could express their needs and see them influence the technical developments. Unlike the approach of other bodies, where often a coalition of interests blocks the requests of other groups based on the mantra “I cannot support this because my business is negatively affected”, MPEG took the opposite approach. All requests were discussed to understand whether they were new or could be folded into previous requests. The space of technical solutions was partitioned into profiles and levels to accommodate requests without negatively affecting others. Finally, when the MPEG-2 standard was completed, MPEG carried out Verification Tests and showed that 6 Mbit/s yielded “composite quality” of standard definition TV and 8 Mbit/s yielded “component quality”.

Credibility is not granted for ever

In the mid-1990’s MPEG had achieved the impossible. It had brought together all segments of the television industry, the package media industry included, and was addressing the studio needs that it satisfied with its 4:2:2 profile. MPEG, however, did not intend to be just the technical arm of the television industry (which, by the way, included audio as well). MPEG intended to fully execute the mission implied by its title “Coding of moving pictures and audio” which meant the “information representation layer” for whatever application domain.

In the mid-1990’s the role that the internet would play in media distribution was not clear at all, and neither was the role that mobile networks would play. It was clear, however, that other delivery mechanisms would play a role. These mechanisms were characterised by “low bitrate”, “best effort” etc.

In hindsight, trying to extend the hard-won role in the broadcasting industry to this unknown land was a very bold move. That field was antithetical to what MPEG had done so far, namely high quality and (to some extent) guaranteed delivery. Emblematic was the hot discussion around MPEG-2 transport packetisation, which was opposed by old-style experts accustomed to relying on frame structures. More important was the fact that new industries, represented by the ICT (Information and Communication Technologies) acronym, would play a major role.

MPEG made a big effort to adapt to the new environment. For instance it developed the software copyright disclaimer. The disclaimer eventually became a modified BSD – Berkeley Software Distribution licence, where the modification is contained in an explicit disclaimer that software copyright release does not imply release of patents. Another effort was to develop the file format which became the cornerstone on which the MPEG role in the ICT world was built.

A track record of collaborations

In 30+ years of standards development MPEG has established cooperation with many standards bodies and industry fora. In this chapter I will review some of the most outstanding and fruitful collaborations.


Broadcasting

MPEG has developed standards for broadcasting since its early days (DAB – Digital Audio Broadcasting was one application driving MPEG-1 Audio), and broadcasting continues to be a major customer to this day. An indicative list of standards groups and industry fora MPEG interacts with is: ABU – Asia-Pacific Broadcasting Union, ATSC – Advanced Television Systems Committee, Inc., DVB – Digital Video Broadcasting, EBU – European Broadcasting Union, ITU-R SG 6 – Broadcasting Service (terrestrial and satellite), ITU-T SG 9 – Television and sound transmission, SCTE – Society of Cable Telecommunications Engineers and DTG – Digital TV Group.

ATSC has adopted MPEG-2 Video, AVC and HEVC. In addition to these standards, DVB has also adopted MPEG-1 and MPEG-2 Audio and AAC. MPEG has referenced a DVB specification for its Media Orchestration standard (MPEG-B part 13). ITU-R and SCTE have adopted several MPEG standards.


Telecommunications

MPEG-1 was driven by the idea of interactive audio-visual services at the bitrate that telcos used to call primary rate (1.5/2 Mbit/s), expected to be offered by ADSL – Asymmetric Digital Subscriber Line. Intense interaction with that industry began with MPEG-2, which is common text, meaning that MPEG-2 Video is verbatim the same as H.262 and MPEG-2 Systems the same as H.222. The tight collaboration of MPEG with ITU-T SG 16 – Multimedia services and systems continued with AVC and HEVC, and continues with VVC. These 3 standards are “aligned text”, which means that the standards are technically equivalent but not editorially the same. Other related standards, such as MPEG-C Part 7 – Supplemental enhancement information messages for coded video bitstreams, and MPEG-CICP Part 2 – Video and Part 4 – Usage of video signal type code points, are also aligned text. MPEG is also liaising with ITU-T SG 12 – Performance, QoS and QoE.

MPEG has an ongoing intense collaboration with 3GPP – the Third Generation Partnership Project, an international organisation issuing standards for the mobile industry. 3GPP has adopted many MPEG standards such as AVC, HEVC, AAC, MP4 File Format and DASH. MPEG is also liaising with ETSI – European Telecommunication Standards Institute.

Other media-related areas

The world of media is quite diverse, and MPEG takes care of establishing contacts, developing standards for, or using standards from, different environments.

In the area of audio, MPEG has a long-standing liaison with AES – Audio Engineering Society and with SMPTE. MPEG is referencing several SMPTE standards, e.g. those related to HDR – High Dynamic Range.

AVS – Audio and Video Coding Standard Workgroup of China is a group developing audio-visual compression standards for the Chinese market. MPEG has a liaison with AVS.

Immersive media is the future, but it is unclear exactly what that future will be. MPEG has developed ARAF – Augmented Reality Application Format and has developed ISO/IEC 21858 – Information model for mixed and augmented reality (MAR) contents jointly with SC 24 – Computer graphics, image processing and environmental data representation.


Fonts

Since the early 2000s, MPEG has carried the baton of the Open Type specification, an open specification originally developed by Adobe, Apple and Microsoft. MPEG-OFF – Open Font Format is a standard that is universally used wherever there are displays that are expected to present fonts.

MPEG is liaising with SC 34 – Document Description and Processing Languages on the matter of fonts.

Information transport

When it developed MPEG-2 Video, MPEG already had the experience of the transport standard it had developed for MPEG-1. MPEG-2 broadcasting applications could not rely on the assumption that the communication channel was error-free, and MPEG had to develop a new standard that it called MPEG-2 Transport Stream (MPEG-2 Systems also defines another transport, called MPEG-2 Program Stream, akin to MPEG-1 Systems). MPEG-2 Systems is one of the most successful MPEG standards, as it is used by broadcasting in all forms (ATSC, DVB, BDA and, before that, DVD – Digital Versatile Disc, etc.) and is used as a package in IPTV.

ATSC has adopted the full Audio-Video-Systems package offered by MPEG-H. The MPEG-H Systems layer is called MPEG Media Transport (MMT).

The Common Media Application Format (CMAF) is another successful transport standard.

MPEG has also developed MPEG-2 transport for sequences of JPEG 2000 and JPEG XS images.

Another successful media transport standard is MPEG-DASH. This has been adopted by 3GPP, ATSC, DVB and others.

Manufacturing industry

In addition to DAB, MPEG-1 was driven by the idea of a standard for audio-visual applications on CD – compact disc. This was eventually adopted by the industry under the name of Video CD. It was also driven by the idea of a new digital audio distribution format on CC – compact cassette. The CE – Consumer Electronics industry has tight contacts with MPEG via IEC TC 100 – Audio, Video and Multimedia Systems and Equipment. MPEG is also working with CTA – Consumer Technology Association. MPEG liaises with BDA – Blu-ray Disc Association, and BDA has adopted several MPEG standards such as AVC and HEVC.


Genomics

MPEG has developed the 3 parts of MPEG-G – Genomic Information Representation jointly with WG 5 – Data processing and integration of TC 276 – Biotechnology. It is at the last stages of approval of Parts 4 and 5 and is developing, again jointly with TC 276/WG 5, Part 6 – Genomic Annotation Representation.

In addition to TC 276, MPEG is liaising with TC 215 – Health Informatics and with GA4GH –Global Alliance for Genomics and Health.

Internet of Things

Internet of Things per se is no business for MPEG because SC 41 – Internet of Things is in charge of standardisation in this area. MPEG liaises with SC 41.

MPEG has identified a specific instance of Internet of Things that it calls Internet of Media Things. This considers the specific but important case of a thing that is a camera or a microphone, a display or a loudspeaker, a unit capable of analysing the media content etc. Part 1 of MPEG-IoMT – Architecture is an instance of the general IoT Architecture developed by SC 41.

Artificial Intelligence

Artificial Intelligence per se is no business for MPEG because SC 42 – Artificial Intelligence is in charge of it. MPEG liaises with SC 42.

MPEG has used an AI technology – neural networks – in MPEG-7 Part 15 – Compact Descriptors for Video Analysis (CDVA) and is working on MPEG-NNR, Part 17 of MPEG-7 – Compression of neural networks for multimedia content description and analysis. MPEG also plans to make intense use of neural networks in its future Video Coding for Machines standard.

MPEG is also investigating the connections between its Network-Based Media Processing (NBMP) standard, released as FDIS at the January 2020 meeting, and Big Media. Again, MPEG has no business in Big Media, an area of work for SC 42, but NBMP is likely to become an instance of the general Big Media Reference Model developed by SC 42.


Transportation

Data compression seems to have little to do with transportation, but this area of endeavour is more and more influenced by technologies mastered by MPEG. For instance, 3GPP is considering V2X (vehicle-to-everything) communication, where information moves from a vehicle to any entity that may have a relationship with the vehicle, and vice versa. Specific forms of communication are: V2I (vehicle-to-infrastructure), V2N (vehicle-to-network), V2V (vehicle-to-vehicle), V2P (vehicle-to-pedestrian), V2D (vehicle-to-device) and V2G (vehicle-to-grid).

Audio-visual information is clearly a major user of any such communication forms and MPEG standards are and will be more and more the main sources. Two examples are G-PCC – Geometry-based Point Cloud Compression and VCM – Video Coding for Machines.

MPEG is liaising with ISO TC 22 – Road Vehicles and TC 204 – Intelligent Transport Systems.


Standards lubricate our complex society, allowing it to function and make progress. Developing standards is easy, but making sure that standards are adopted is difficult.

MPEG has been successful with the latter because it takes a holistic, end-to-end approach to standardisation where its partners and customers – standards bodies and industry fora – are part of standard development.



MPEG status report (Jan 2020)


In the week of the 13th of January, the Free University of Brussels hosted the 129th MPEG meeting. Two days (11-12) were dedicated to some 15 ad hoc group meetings and 6 days (7-12) to meetings of JVET, the joint MPEG-SG 16 group tasked with developing the VVC standard.

In this status report I will highlight some of the most relevant topics on which progress was made. The figure below captures the essence of the MPEG work plan as it resulted from the meeting.

Versatile Video Coding (VVC)

VVC (part 3 of MPEG-I) is being balloted and the ballot results are expected to be received at the July meeting so that MPEG can approve VVC as FDIS.

MPEG is now working on two related standards that are important for practical deployment: Carriage in MPEG-2 TS (Amendment 2 of MPEG-2 Systems) and Carriage in ISOBMFF (Amendment 2 of MPEG-4 part 15), both expected to be approved in January 2021.

Another activity around VVC is called Multi-Decoder Video Interface for Immersive Media (part 13 of MPEG-I). This aims to support the flexible use of media decoders, for example decoding only a subset of a single elementary stream. This feature is required for processing immersive media composed of a large number of elementary streams.

Essential Video Coding (EVC)

EVC (part 1 of MPEG-5) addresses the needs that have become apparent in some use cases, such as video streaming, where existing ISO video coding standards have not been as widely adopted as might be expected from their purely technical characteristics. EVC is still under ballot and results are expected to become available at the April 2020 meeting (MPEG 130).

The group in charge of EVC has started considering Carriage of EVC in MPEG Systems.

Low Complexity Enhancement Video Coding (LCEVC)

LCEVC will provide a standardised video coding solution that leverages other video codecs in a manner that improves video compression efficiency while maintaining or lowering the overall encoding and decoding complexity. LCEVC will reach DIS in April 2020.

MPEG Immersive Video (MIV) and Video-based Point Cloud Compression (V-PCC)

Part 12 of MPEG-I Immersive Video shares with Part 5 of MPEG-I Video-based Point Cloud Coding (V-PCC) the notion of projecting a 3D scene to a series of planes, compressing the 2-D visual information on the planes with off-the-shelf video compression standards and providing a means to communicate how a 3D renderer can use the information contained in the atlases (in the case of MIV) and the patches (in the case of PCC). Outstanding convergence of the two approaches has been reached.

V-PCC will reach FDIS in April 2020 and MIV in January 2021. Both will have extensions, the latter to enable the ambitious but needed 6 degrees of freedom (6DoF), where the user can both move along and rotate about the three spatial axes.

The MPEG-4 File Format is being extended to include V-PCC and G-PCC data.

Video Coding for Machines (VCM)

VCM is an exploration of a new type of video coding designed to provide efficient representation of video information where the user is not a human but a machine, with possible support for viewing by humans. Possible use cases for VCM are video surveillance, intelligent transportation, automatic driving and smart cities.

MPEG has produced a Draft Call for Evidence designed to acquire information on the feasibility of a Video Coding for Machines standard. For this purpose MPEG has published a Call for Test Data for Video Coding for Machines. Test data will be used to assess the responses to the Call for Evidence.

Neural Network-based Audio-Visual Compression

VVC and EVC will support the media industry by providing more compression for transmission and storage. They are both the current endpoints of a compression scheme that dates back to the mid-1980’s. Similarly MPEG-H 3D Audio is the current endpoint of the compression scheme initiated in 1997 with MPEG-2 AAC.

Today, as a result of the demonstration provided in recent years that neural networks can outperform other “traditional” algorithms in selected areas, many laboratories are carrying out significant research on the use of neural networks for coding of audio and visual signals as well as point clouds.

MPEG is calling its members to provide information on this new area of endeavour.

MPEG Immersive Audio

MPEG has produced a Draft CfP for Immersive Audio. The actual CfP will be issued in April 2020 and submissions are requested for July 2020. FDIS is planned for January 2022.

Neural Network Compression for Multimedia

Neural networks are used for multimedia applications such as speech understanding and image recognition. Industry, however, is coming to the conclusion that the IT infrastructure may well not be able to cope with the growth of users and that in many cases intelligence is best distributed to the edge. As the size of some of these networks is hundreds of GBytes or even TBytes, compression of neural networks can support the distribution of intelligence to potentially millions of devices. See the figure below.

MPEG is progressing its work on the Compression of neural networks for multimedia content description and analysis standard. This is expected to reach CD status in January 2021.
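One basic ingredient of neural-network compression is uniform quantisation of float32 weights to 8-bit integer codes, which shrinks storage by roughly 4x. The sketch below is illustrative only; the actual NNR standard combines several tools (pruning, quantisation, entropy coding) specified quite differently.

```python
def quantise(weights, bits=8):
    """Map floats to integer levels plus a scale factor (uniform quantisation)."""
    levels = 2 ** (bits - 1) - 1              # 127 levels for 8 bits
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) for w in weights], scale

def dequantise(codes, scale):
    """Reconstruct approximate float weights from integer codes."""
    return [c * scale for c in codes]

# Illustrative weight vector; a real network has millions of such values.
w = [0.127, -0.254, 0.0635, 0.508]
codes, scale = quantise(w)
w_hat = dequantise(codes, scale)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(codes)           # integer codes, storable in 1 byte each vs 4 for float32
print(max_err < scale) # reconstruction error is bounded by the quantisation step
```

Each code fits in one byte instead of four, and the reconstruction error stays below one quantisation step, which for many networks costs little or no accuracy.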

Network-Based Media Processing (NBMP)

NBMP reached FDIS in January 2020. The standard defines a framework for content and service providers to describe, deploy, and control media processing. The framework includes an abstraction layer deployable on top of existing commercial cloud platforms and able to be integrated with 5G core and edge computing. The NBMP workflow manager enables composition of multiple media processing tasks to process incoming media and metadata from a media source and to produce processed media streams and metadata that are ready for distribution to media sinks.

MPEG is exploring how NBMP can become an instance of the Big Media reference model developed by SC 42 Artificial Intelligence.

Compression of Genomic Annotations

At the January 2020 meeting in Brussels, MPEG received 7 submissions in response to the Call for Proposals jointly issued by MPEG and ISO TC 276/WG 5 on the efficient representation of annotations to sequencing data resulting from analysis pipelines. MPEG has started working on a set of core experiments with the goal of integrating the proposed technologies into a single standard specification capable of satisfying all identified requirements and supporting a rich variety of queries.

FDIS is expected to be reached in January 2021.

MPEG and 5G

MPEG compression standards are mostly designed to represent information in an abstract way. However, the great success of MPEG standards is also due to the effort MPEG spent in providing the means to convey compressed information. 5G is being deployed and MPEG is investigating if and how its standards can be affected by 5G.

MPEG-21 Contracts to Smart Contracts

Blockchains offer an attractive way to execute electronic contracts. The problem is that there are many blockchains each with their own way of expressing the terms of a contract. MPEG considers that MPEG-21 can be the intermediate language in which smart contracts for different blockchains can be expressed.

One application is the following use case. There is no way to deduce, from the executable code of a smart contract, the clauses that the contract contains. Publishing the human-readable contract alleviates the concern, but does not ensure that the clauses of the human-readable contract correspond to those of the smart contract.

The figure below describes how the other party of the smart contract can know the clauses of the smart contract in a human-readable form.
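The correspondence between the two forms can be sketched as follows: a single machine-readable clause object, in the spirit of MPEG-21 contract expression, is rendered both as human-readable text and as a blockchain-oriented stub, so both views derive from the same source. This is a minimal hypothetical sketch; the `Clause` fields and the pseudo-Solidity output are illustrative, not MPEG-21 syntax.

```python
# Illustrative sketch (all names hypothetical): one machine-readable clause,
# rendered both as human-readable text and as a target-blockchain stub,
# so the two views are guaranteed to describe the same clause.
from dataclasses import dataclass

@dataclass
class Clause:
    actor: str
    permission: str
    item: str
    fee_eur: float

    def to_human_readable(self) -> str:
        # The text a contracting party reads
        return (f"{self.actor} may {self.permission} '{self.item}' "
                f"for a fee of {self.fee_eur} EUR.")

    def to_contract_stub(self) -> str:
        # Pseudo-Solidity emitted from the very same clause object
        return (f"function {self.permission}() payable {{ "
                f"require(msg.value >= {int(self.fee_eur * 100)}); }}")

clause = Clause("Distributor", "stream", "Movie X", 0.5)
print(clause.to_human_readable())
print(clause.to_contract_stub())
```

Because both renderings are generated from the same clause object, a party inspecting the human-readable form can trust that it matches the executable one, which is the point of the use case above.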


MPEG, 5 years from now

MPEG will soon be 32. It has produced many important standards that have changed the industry and the lives of billions of people. What will MPEG be 5 years from now?

MPEG is strong

The MPEG scope is digital media: analysis, compression, transport and consumption of digital moving pictures and audio for broadcast, broadband, mobile and physical distribution. MPEG standards have an extremely wide range of customers belonging to all industries that need digital audio and video packaged for their needs. Because of this, the MPEG brand is universally recognised, even by the public at large.

It should also be noted that MPEG has started applying its expertise to compression and transport of other data. MPEG started working on compression of DNA reads from high-speed sequencing machines some 5 years ago. The field is very active and there is no doubt that in five years it will continue to be so.

Another strength is the MPEG organisation, a flat structure based on interacting units belonging to technology-centred competence centres coordinated by a Technical Coordination Committee. It is the result of a 30-year long natural process driven by the specific needs of the group. MPEG has a large and competent membership with a full ecosystem of experts contributing to the development of technologies needed by MPEG standards. It also has a consolidated and experienced leadership team.

A third strength is the vast network of client communities, typically represented by their standards organisations and/or industry fora. This network is the result of an effective communication policy that includes liaison letters, press communiqués released at every meeting documenting the progress of work, the MPEG website providing general information, workshops on existing or planned standards, white papers describing the purpose of approved standards, video tutorials and more.

A fourth strength is the MPEG business model. MPEG develops the best standards using the best available technologies provided by industry and academia. Good standards remunerate good IP whose holders may choose to use the revenues to develop more good IP for future standards.

MPEG has weaknesses of its own

At a speech I gave at the 100th MPEG meeting held in Geneva in 2012, I said that the MPEG area of work was so wide, so important for humans and with so many opportunities that, if in 100 years MPEG did not exist any longer, it would not be because the MPEG mission had been exhausted, but because the MPEG leaders of the time had not been able to manage the work of the group properly.

I confirm what I said, but it is a reality that digital media is a maturing business area. The most critical needs, e.g. digital television and media on mobile, have been satisfied. More needs exist, but with many unknowns. MPEG is populated by excellent experts and, in the past, their expertise could cope with the need to convert to digital or mobile the experiences users were already accustomed to. Today, however, we are facing the problem that available and possible technologies may be used to offer new products, services or applications. Unfortunately we do not know what the new products, services or applications using those technologies will be.

A good example is provided by the 27-year-old MP3 standard. It took several years before the excellent MP3 technology found the use we all know. Other MPEG standards have not been so lucky.

The weakness, then, is that MPEG is a technical group with almost no participation of people who can contribute the information that, merged with the existing technical information, could guide the group to develop the right standards.

MPEG has become the reference group for digital media compression and transport. As I have written several times, MPEG standards enable products, services and applications that are worth more than 1.5 T$ p.a. MPEG – as a working group – has achieved such importance even though there are Technical Committees whose economic importance is orders of magnitude less than that of the MPEG working group. So far this did not make much difference because MPEG’s client industries were keen on getting good technologies for their affiliates. In the new competitive environment, however, MPEG’s lack of status matters.

Finally, MPEG suffers from its very success. Its business model has been the engine of that success, but the fact that the business model has never been reviewed matters more and more.

MPEG is threatened

The fact that digital media is maturing has created another weakness: there is increasing competition in digital media standards. MPEG standards remain the best, but there are other groups developing competing standards with features that are at odds with the MPEG business model. Because of this, the share of the client industry adopting MPEG standards is no longer as high as it used to be.

So far MPEG standards have dominated the broadcasting and mobile domains and have had to face competition in the “internet” domain. MPEG’s dominance is threatened by the fact that some industries may abandon MPEG standards not because the standards are not the best, but because MPEG cannot compete on the other features that other groups provide. MPEG standards may become less relevant if MPEG stays exclusively with its traditional business model.

MPEG standards are synonymous with globalisation. The established broadcast business was local by definition, but MPEG standards provided an enabling compression technology ready for global adoption. The mobile industry started as local but was easily conquered by the benefits offered by the global MPEG standards. The threat today is that globalisation may no longer be MPEG’s ally, because in the new environment the global MPEG brand may be replaced by some local brands.


Borrowing from and extending the actions identified by the MPEG Future Manifesto, MPEG has the following opportunities:

  1. Enhance its links with the research and academic community to stimulate more research on technologies that are relevant to MPEG’s standardisation scope
  2. Review its business model, analyse its continuing effectiveness and relevance to industry, identify the need for changes and, if any identified, take action to implement them
  3. Step up communication with its client industries to identify any early needs for standards and obtain requirements
  4. Inject “market-awareness” into MPEG to sharpen the target of MPEG standards
  5. Enhance the value of MPEG standards by strengthening and expanding collaboration with other standards committees, e.g. by developing joint standards
  6. Increase promotion of MPEG standards with its client industries
  7. Dedicate sufficient resources to cover other areas in need of compression by leveraging MPEG technologies and expertise
  8. Preserve and refine the organisation of MPEG.


Barring exogenous forces, in 5 years MPEG will still be around. However, the next 5 years will not be an easy ride. This does not mean that MPEG will not be able to deliver the results that its customers – industry and end users – expect.


The driver of future MPEG standards


It is not by chance that the MPEG Future Manifesto has the following action points at the top of its list (italics are mine)

  • Support and expand the academic and research community which provides the life blood of MPEG standards;
  • Enhance the value of the intellectual property that makes MPEG standards unique while facilitating their use.

Why are these two action points so important? Because the main ingredients of the successful MPEG recipe have been so far

  1. Developing the best performing standards by accessing the best technologies produced by academia and research;
  2. Augmenting the value of the intellectual property provided, thanks to its adoption in standards deployed to billions of users;
  3. Relying on facilitated access to essential patents required to implement MPEG standards.

So as to avoid any misunderstanding, I should explicitly state that MPEG decisions only affect item 1. of the list above because MPEG decisions are made solely on the basis of the technical merits of the technologies proposed and adopted after a scrutiny that follows an established process.

Item 2. is a just consequence of decisions made by MPEG. The “just” hides the fierce discussions that happen when a decision to accept or reject a proposed technology is made.

Item 3. is entirely in the hands of the market. Users wishing to acquire the rights to use the essential patents of the standards may decide to use the services offered by patent pools or proceed in other ways.

In a string of articles A crisis, the causes and a solution, Can MPEG overcome its Video “crisis”?, Business model based ISO/IEC standards, IP counting or revenue counting? and Matching technology supply with demand I have

  1. Analysed the changes that the past quarter-century of application has brought to the ideal model described in the numbered list above;
  2. Highlighted the risks to the adequacy of the model entailed by the progressive change of the environment;
  3. Ventured to propose ideas for improving the situation.

The very fact that my articles have generated significant controversy, and that there is a continuous flow of readers long after they have been published, implies that there is indeed one issue (or more) with the model as it is applied today.

In this article I intend to review the situation and try to identify the issues that MPEG Future could address in implementing its Manifesto.

How MPEG works

The figure below claims to represent with some accuracy the way MPEG acquires the necessary technologies to develop a standard and how implementors of a standard can acquire the necessary rights.

When MPEG intends to develop a standard, it solicits requirements from the industry. Three segments of the industry can typically provide requirements: eventual users of the standard, implementers of the standard and providers of technology to develop the standard. Note that a specific company may belong to one or more than one industry, in the sense that one department of a company may belong to one industry and another department may belong to a different one.

MPEG receives (step 1), assesses, refines and harmonises the requirements received. When these have reached sufficient maturity, a Call for Proposals (CfP) is issued (step 2). Companies belonging to the technology industry submit proposals in response to the CfP (step 3). MPEG assesses the coding tools contained in the proposals and assigns them to specific Competence Centres, say audio, video, 3D graphics or systems (step 4), which adapt and refine the tools (possibly interacting among themselves) and add the selected ones to the MPEG Toolkit (step 5), which contains all the technologies adopted in past standards.

When the development of the standard is completed (step 6), companies claiming rights to the tools included in the standard may decide to join one or more than one patent pool or not to join any patent pool at all. Users of the standard (e.g. implementers and service providers) get use rights from patent holders (step 7) for use in their products, services or applications (step 8).

Although MPEG has no role in the last step, the perceived performance of MPEG as a provider of industry standards is highly dependent on how this step unfolds.

The troubles with the situation

What has been described is the result of a natural evolution over the past quarter of a century. In MPEG-2 and MPEG-4 AVC times there was a single patent pool with a limited number of companies not joining the pool. Today, we have the example of MPEG-H HEVC for which there are ~45 known patent holders, 3 known patent pools and a number of companies who have not joined any patent pool. Patent pool members make up ~ 2/3 of patent holders, while ~ 1/3 does not belong to any patent pool.

An assessment

There is no consensus in the industry that the situation described is problematic. Roughly speaking, technology companies say that the situation is just fine, while the client and implementation companies usually complain that getting a licence is “cumbersome”, “difficult” or “expensive”.

Save for MPEG-1 and MPEG-2 Video, MPEG video standards have always faced competition. Some of those who intended to use MPEG-4 Visual for internet streaming complained that they were discouraged by some licence terms. Many turned to RealNetworks’ RealVideo and Microsoft’s Windows Media Video. MPEG-4 AVC had competition from Google’s VP8 and VP9.

An industry forum called Alliance for Open Media (AOM) has released the specification of a video coding standard called AV1 that I call, though maybe I should not, “royalty free”. AV1 has an improved performance compared to HEVC but is less performing than Versatile Video Coding (VVC). Use of AV1 is gaining significant traction in the industry for applications that used to be the purview of MPEG standards.

MPEG reacts

In the last 10 years, MPEG has tried to provide accessible solutions with its three projects for Option 1 video coding standards (Option 1 could, but should not, be considered as equivalent to “royalty free”).

All three projects have practically failed:

  • WebVC, the “baseline” profile of AVC, was dead at birth because some patent holders confirmed the Option 2 patent declaration (“my patents can be used, but at RAND conditions”) they had made for AVC.
  • Internet Video Coding (IVC) is a published ISO standard with an attractive performance (comparable to AVC). However, it has the sword of Damocles of some Option 2 declarations that do not provide any detail on what the infringing technologies are claimed to be. MPEG has declared its willingness to amend the IVC standard if appropriate information is provided, but so far none has been received.
  • Video Coding for Browsers (VCB) did not even reach publication because a company made an Option 3 patent declaration (“my patents may not be used for the standard”). Today it would no longer be possible to do so without identifying the infringed technologies. However, at the time the standard was developed this was possible, and MPEG could not remove the infringing technologies, which would have had to be found among an initial list of 50 or so patents with hundreds of claims.

MPEG is currently developing MPEG-5 part 1 Essential Video Coding (EVC). The EVC CfP requested two-layer solutions where each layer had a nominal performance target with respect to AVC and HEVC. This helped the formation of a draft standard, to be promoted to FDIS in April 2020, that is expected to have a significantly less complex IP context. The constraint of the CfP, however, is paid for with a lower performance than VVC, a standard to be approved as an FDIS in July 2020. What the performance cost of this design will be is not known today, as verification tests for EVC and VVC have not been done yet.

Is this enough?

I cannot claim that what MPEG has done is the solution to what some perceive as a problem. The MPEG Option 1 video coding standards, for one reason or another, have failed. Without additional changes to the rules, it is hard to see how MPEG could offer an Option 1 video coding standard (or, probably, any other internally developed media-related standard).

There is reason to believe that a standard that performs better than AV1 (even if less well than VVC) but offers simplified access to the necessary IP will be a competitive proposal. The jury on the EVC standard, however, is still out, and AOM may issue a more performing AV2 specification.

The evolutions described above have not reached the endpoint, but MPEG needs a stable and shared approach to standardisation in an increasingly sophisticated IP environment. This is why MPEG Future has put this topic at the top of its priorities.


I think that MPEG Future should

  1. Make an objective analysis of the “difficulties” the current situation creates to users of MPEG standards;
  2. In case the existence of “difficulties” is confirmed, review the current rules and identify how they can be changed to achieve measurable results (rules have been changed once, so they could be changed a second time);
  3. Act to effect the identified changes to the rules.

Caius Julius Caesar said “Caesar’s wife must be above suspicion”. Whoever guides these steps should not deny a priori the existence of a problem, nor be open to the suspicion of being influenced by “the competition”.


The true history of MPEG’s first steps


Purely based on logic, MPEG should not exist.

In the 1980s the Galactic Empire of media standardisation was firmly in the hands of ITU (videocommunication and speech), IEC (audio and television) and ISO (photography and cinematography), not to mention its kingdoms and duchies – the tens of regional and national standards committees.

How could this little Experts Group, not even recognised in the ISO hierarchy, become the reference standards group of the media industry (and more to come)?

Like the Mule in Asimov’s Foundation, MPEG came to the fore with unstoppable force. In this article I will tell the (short) story of how the impossible happened.

Video coding in the 1970’s and 1980’s

I was hired to do research on videophone at CSELT, the research center of Telecom Italia (at that time STET/SIP) and in 1978 I joined the European Action COST 211. The Action’s goal was to specify, implement and trial a 2 Mbit/s videoconference system based on DPCM and Conditional Replenishment. In those years that was as much as sparse electronics and MSI chips could implement.

In 1984 the results of the project were standardised as ITU Recommendation H.120 – Codecs for videoconferencing using primary digital group transmission. A few industrial implementations were made (one from my lab prototype, at that time reduced to two 16U racks), but the service saw limited use.

COST 211 was followed by COST 211 bis, which became a European coordination forum for the next ITU-T project. I attended the first few meetings of what was then called the Okubo group, from the name of its chairman. The group developed the specification that in 1988 became H.261 – Video codec for audiovisual services at p x 64 kbit/s.

I discontinued my participation because I did not agree with some of the decisions made by the group. My bête noire was the Common Intermediate Format (CIF). As the name implies, CIF was a video format defined as a sequence of frames with a resolution of 352×288 pixels (1/4 of a PAL frame) at a rate of ~29.97 frames/s (the NTSC rate). I failed to convince my colleagues that defining video formats belonged to the past: the format is an encoder issue, and a decoder designed for flexibility could handle different spatial and temporal resolutions.

In 1984 the European Commission (EC) launched the trial version of its research program Research & development on Advanced Communication for Europe (RACE). I set up a consortium called Integrated Video Codec (IVICO) which was awarded a 1-year grant. The project, a scaled-up version of the COMIS project, came up with a work plan based on a general coding architecture, DCT plus motion compensation, a solution that at the time was not yet taken for granted.

At the end of the year, funding of the project was discontinued. I believe that was because a project whose aim was to design VLSI chips for a consumer market of video decoders was at odds with the EC policy of the time: analogue HD-MAC as the next television format, and digital for the next century (the one we live in).

In December 1986 I went to Globecom, held in Houston, TX, to present a paper on the IVICO project. At the conference I met Hiroshi Yasuda and other University of Tokyo alumni I had known during my PhD there. Hiroshi invited me to attend the next meeting of the Joint ISO-CCITT Photographic Coding Experts Group (JPEG) in March.

Learning from JPEG

Going to JPEG meetings made sense to me because my group at CSELT was involved in another European project called Photographic Image Compression Algorithm (PICA). This had been launched by British Telecom to coordinate European participation in JPEG. So, in March 1987 I went to Darmstadt where the FTZ research centre of Deutsche Bundespost hosted the JPEG meeting that finalised the procedure to test proposals to be submitted in July 1987.

I was favourably impressed by the heterogeneous nature of the group. Unlike the various European standards groups I had attended since the late 1970s – all composed of telecommunication people – JPEG was populated by representatives of a wide range of international companies such as telecommunication operators (British Telecom, Deutsche Telekom, KDD, NTT), broadcasting companies (CCETT, IBA), computer manufacturers (IBM, Digital Equipment), terminal equipment manufacturers (NEC, Mitsubishi), integrated circuit makers (Zoran) and others.

The reality of a multi-industry standards group resonated with me, because the idea of a group on video coding standards not uniquely connected to telecommunications had been stuck in my head for a long while. I worked for a telecommunication company, and I had realised that the time scale of the telecom infrastructure did not match that of the media market, and that the industry providing equipment to the telecommunication industry followed the same reasoning as its customers.

In those years I was particularly attracted by the project announced by Philips called CD-i (compact disc interactive), which targeted interactive video on compact disc, hence a bitrate of ~1.5 Mbit/s. I thought that consumer electronics (CE) should be the industry developing the video equipment that the telecom industry could use, if only the CE industry could be convinced that it was in its interest to develop products based on international standards instead of proprietary solutions.

At the July meeting, hosted by Kjøbenhavns Telefon Aktieselskab (KTAS) in Copenhagen, one of the PICA proposals, coordinated by Alain Léger of CCETT with the participation of my group, performed best and became the basis on which the well-known JPEG standard was developed. At that meeting I proposed to Hiroshi, who was the convenor of SC 2/WG 8 (of which JPEG was an Experts Group he chaired), the creation of another Experts Group on Moving Pictures in WG 8.

The birth of MPEG

In January 1988, at another meeting hosted by KTAS in Copenhagen, WG 8 established two more Experts Groups: the Joint ISO-CCITT Binary Image Coding Experts Group (JBIG) and the ISO-only Moving Picture Coding Experts Group (MPEG). The latter had the mandate to develop standards for coded representation of moving pictures (audio and systems came later).

The first project concerned video coding at a bitrate of about 1.5 Mbit/s for storage and retrieval applications “on digital storage media”. It was the realisation of my idea of an international standard developed by the researchers of all industries, implemented by the consumer electronics industry and exploited by the telecommunication and broadcasting industries for audio-visual services.

Having achieved this result, I set up a European project called COding of Moving Images for Storage (COMIS), to which I called European telecom companies, CE companies, chip manufacturers, universities, etc. The intention was not only to create a forum for a more effective participation in MPEG, but also to give more visibility to the MPEG group. Hiroshi did the same in Japan with his Digital Audio Pictures Architecture (DAPA).

The first MPEG meeting took place in Ottawa in May 1988 with 29 people in attendance. At the 4th MPEG meeting in December 1988 in Hannover, the Audio group was established. In a few meetings, the Video, Test, Digital Storage Media, Systems, VLSI groups were also established. The responses to the MPEG-1 Call for Proposals were considered in November 1989 in Kurihama, hosted by JVC, where the attendance reached 99.

I am pleased to say that, at the following Eindhoven meeting (February 1990), MPEG introduced the Source Input Format (SIF), obtained by sampling the 625-line and 525-line frames to yield 352×288 pixels and 352×240 pixels, respectively, at different frame rates. Simulation results could be shown with the 625 or 525 version of the format. Keeping decoders independent of the video format used by the encoder has been an MPEG constant ever since.
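The SIF numbers can be checked with a small sketch: take the active area of the 625- and 525-line systems (assuming a 704-sample active width) and halve both dimensions, keeping each system's frame rate. The `sif` function is purely illustrative.

```python
# Sketch: deriving SIF from the active area of 625- and 525-line systems
# (assuming a 704-sample active width; the vertical halving corresponds to
# taking one field, the horizontal halving to 2:1 subsampling).
def sif(active_width: int, active_lines: int, frame_rate: float):
    """Halve both spatial dimensions, keep the frame rate."""
    return active_width // 2, active_lines // 2, frame_rate

print(sif(704, 576, 25.0))        # 625-line systems -> (352, 288, 25.0)
print(sif(704, 480, 30000/1001))  # 525-line systems -> 352, 240, ~29.97
```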

The creation of SC 29

In Kurihama the Multimedia and Hypermedia Experts Group (MHEG), an ISO-only group, was added to WG 8. This meant that, less than two years after adding JBIG and MPEG to JPEG, the membership of WG 8 already exceeded 150. I assessed that WG 8 was too small a container for all the booming activities and pushed Hiroshi to create a Subcommittee out of WG 8.

At the Washington, DC meeting of SC 2 in April 1991, a coalition of National Bodies approved the secession of WG 8 from SC 2 and SC 29 held its first meeting in November 1991, just after the MPEG-2 tests, where the MPEG attendance reached 200.


MPEG “happened” against all odds, like the Mule in Foundation. My idea of combining the manufacturing capability of the consumer electronics industry with the needs of telecom and broadcasting services succeeded. Today the CE industry of the MPEG-1 times no longer exists, as it has largely merged with the IT industry. However, thanks to its MPEG-4 project, MPEG has been able to continue playing the role of reference standards body for the “extended” media industry as well.

The successful MPEG story is a reminder that standardisation has everything to gain from fostering the birth of “Mules”. Within MPEG everybody has the chance to become the next Mule. They just have to bring a plan (and convince the rest of MPEG ;-).


Let’s put MPEG on trial


An Italian proverb says “those who do things make mistakes”, meaning that only the inactive make no mistakes. According to that proverb, then, MPEG, which has done a lot of things, must have made a lot of mistakes.

So, in this article I imagine there is a prosecutor bringing charges against MPEG and a defence attorney speaking on behalf of the defendant. Whoever wants to be part of the jury is welcome to join.

Charge #1: MPEG standards are losing market share

Prosecutor: In several markets the importance and adoption of MPEG standards is decreasing.

Attorney: MPEG standards have never been “catch all” solutions. Originally MPEG-1 Video went nowhere because interactive video went nowhere. MPEG-1 Audio layer 2 had a very slow adoption, but MP3 was a huge success. When some Far East markets needed disc-based video distribution without using DVD, MPEG-2 Video and Audio layer 2 became a huge success (1 billion Video CD players). MPEG-2 Video was universally adopted, but Audio faced strong competition from proprietary solutions. The 4:2:2 profile of MPEG-2 Video faced strong competition from a market solution. Companies vying for video distribution on the nascent internet turned their backs on MPEG-4 Visual and looked at proprietary solutions when they learned that they would have to pay for the amount of video streamed. A more market-friendly licensing of MPEG-4 AVC gave a big boost to the adoption of the standard.

Conclusion? As the then candidate Bill Clinton said: “It’s the economy, stupid”.

That was a conclusion, but maybe too hasty a one. It is true that, with MPEG-2 Video and MPEG-4 AVC, MPEG standards have dominated the video distribution market. Things, however, are changing. The appearance of big players on the internet distribution market has favoured the appearance of proprietary solutions. VP8, VP9 and AV1 had/have millions of users, while the penetration of MPEG-H HEVC is still low. Some, including myself, think that a confused HEVC licensing situation, not the usability and quality of HEVC, is the cause of this state of affairs. MPEG has nothing to say or do about licensing, or about the rules that originate the current state of affairs, but MPEG-5 EVC, a standard due to be approved in 4.5 months, can unlock the situation. EVC is expected to have significantly more compression performance than HEVC (still less than VVC), but a more streamlined licensing landscape.

Obviously MPEG cannot be involved in the MPEG Future initiative, but one of its action points reads:

Enhance the value of the intellectual property that makes MPEG standards unique, while facilitating their use

Conclusion: MPEG has its hands tied but the market has not. The people of good will who are part of MPEG Future need not have their hands tied, either, when they operate within the initiative.

Charge #2: too many standards are less than successful

Prosecutor: in 30 years the defendant has produced some 180 standards, but the really successful ones out of these are maybe 1/6 of the total. This has been a waste of resources.

Attorney: it is a matter of where you set the bar. MPEG could have decided to place the bar of acceptance of new standards work higher, and would thus have produced a smaller number of less successful standards. If it had done so, however, MPEG would most likely also have failed to produce some of its successful standards. Which is better: more attempts or lost opportunities?

One should not forget that, if making a market-ready product costs 1000, then designing the product costs 100, doing the research for it costs 10 and making a standard for it costs 1. Companies should concentrate on the 1000s and the 100s, not on the 1s.

This does not mean that MPEG has not been aware of the need to increase the number of successful standards. In 2014 MPEG did try to address the problem. It dreamed of a new structure where communication with the world was not only at the end of the process (“look, this is the standard we’ve developed for you”) but also at the beginning (“look, we plan to develop this standard”).

Figure 1 – The MPEG organisation designed in 2014

The plan went nowhere. Because it is perceived as a technical body, MPEG was unable to find a leader for the industry liaison subgroup, never mind the members.

The problem of raising the rate of successful standards, however, does not go away. The MPEG Future initiative has this as a key element of its proposal to make MPEG a subcommittee. Unlike the structure of Figure 1, however, the Industry Liaison and Requirements functions are no longer sequential. They operate in co-competition to provide proposals for action that take into account the unique positioning of MPEG as the body producing standards that are anticipatory but aim to address markets of millions or billions of products, services and applications. It does that by blending technology evolution trends (as assessed by Technical Requirements) with the expected market need (as assessed by Market Needs).

Figure 2 – The MPEG organisation proposed for MPEG as an SC

Conclusion: applying the English proverb “you cannot have your cake and eat it”, one can say that if you want to have many successful standards you must make many attempts. Nevertheless, injecting market awareness in the standards process will help.

Charge #3: the MPEG structure is too static

Prosecutor: In the last ten years, the organisation of MPEG has not changed appreciably.

Attorney: I am not sure that an organisation that changes its structure frequently is necessarily a healthy organisation. There are many companies out there who are in constant restructuring, and their performance can hardly be described as excellent. In the last few years, however, MPEG has concentrated on making its organisation more flexible and suitable for the development of its standards.

Since its early days MPEG has had Subgroups and, within them, Units addressing specific areas; these are typically temporary, but can also be permanent. Today MPEG can claim to have a flat organisation whose elementary Units belong to specific Subgroups and carry out work for a particular standard, interacting with other Units under the supervision of the Technical Coordination made up of the Convenor and the Subgroup Chairs. This is represented in Figure 3, where the letter before the Unit indicates the Subgroup the Unit belongs to.

Figure 3 – The flat MPEG organisation

This organisation is well suited to the development of the highly integrated MPEG standards.

Conclusion: an organisational chart is typically viewed as the sexy part of a company. In MPEG it is not necessarily the most important one.

Charge #4: MPEG does not collaborate with JPEG

Prosecutor: A sizeable part of MPEG deals with (moving) pictures and JPEG deals with (still) pictures. There are many technical commonalities, yet the two groups do not collaborate.

Attorney: Collaboration and brotherhood are nice words. Both, however, have to be put in a concrete setting to give them a practical meaning. Here I will only talk about the former.

The word collaboration can be easily applied to a group of researchers belonging to different organisations when they write a paper where each author brings different contributions.

The raison d’être of a standards committee is, well, … making standards. Therefore, “collaboration between two standards committees” can only be invoked when there is a need for a standard serving the constituencies of both committees. MPEG serves the industries engaged in the “distribution of moving pictures and audio on broadcast, broadband, mobile and physical media” while JPEG serves the industries engaged in “digital photography and distribution of still image sequences”. There is little intersection between the two as I am going to explain.

MPEG and JPEG may very well share some technologies, but the needs of their constituencies are different. For instance, JPEG is close to approving the FDIS of a standard for compression of light field images because its industries feel that they are ready to implement such a standard. On the other hand, MPEG is just carrying out an exploration on dynamic light fields, because its industries are nowhere near ready to do the same. If and when MPEG develops a standard for dynamic light fields, it will most likely use a different capture format (JPEG has used plenoptic 1 in its standard, while MPEG is using plenoptic 2 in its exploration). Therefore, little if anything of what has been developed by JPEG for its light field image compression standards will be used by MPEG.

This is not a new story. In all 6 generations of MPEG moving picture coding standards, no need or significant benefit has ever been found that justifies the adoption of a JPEG standard as the still picture coding mode of an MPEG standard. This assessment also holds for point clouds and, as said above, is expected to hold for other digital moving picture representation technologies such as light fields.

On the other hand, MPEG has developed the ISO Base Media File Format (ISOBMFF), a standard also used by several non-ISO bodies such as 3GPP and AOM. Here collaboration has happened because JPEG, too, uses ISOBMFF for its standards.

Collaboration is not a goal in itself. It is an undertaking to reach a common goal – when one exists. Otherwise it is just a way to further knowledge. This is certainly a useful thing, but not when it is done within a standards committee, because there it becomes a distraction.

Collaboration is important and, for it to happen successfully, the following steps are needed:

  • Verify the joint industrial interest in the common project
  • Define the requirements of the potential common standard
  • Draft the Terms of Reference for the joint effort
  • Agree on
    • The work plan and timeline of the joint effort
    • Who leads the work: a single chair, or one from each committee
    • Whether meetings are independent or collocated with either parent committee
    • Which committee provides the secretariat
    • How ballot comments are resolved

Conclusion: collaboration must be done for what it means etymologically – doing work together.


The prosecutor and the defence attorney have made their pleas. Now it is for the jury to reach a verdict.
