Can you “clone” MPEG?

Introduction

The publication of A vision made real – Past, present and future of MPEG has triggered many reactions coming from people in other industries. They are facing the same problem that MPEG started facing 31 years ago when it wanted to create digital media standards that were industry-agnostic and with a global scope.

My answer to these people has been that, indeed, the MPEG model can be exported to other domains where industries with different backgrounds, technologies, stakes and business models want to join forces and develop standards that enable new businesses.

But it won’t be a piece of cake. Try to inject a new philosophy of work into an environment with its own history, people and relationships, and the task is close to impossible. This is not for technical reasons but because it is a matter of philosophy. MPEG had an easy (so to speak) task because it was a new group without history, with new people who had a limited degree of relationships.

In this article I will examine some of the main aspects that one needs to care about to “clone” MPEG to make standards that are industry-agnostic, with a global scope and are intended to enable new businesses.

Going from here to there

The first task is to define what the common project is about. Clarifying the purpose of an undertaking is a good practice that should apply to any human endeavour. This good practice is even more necessary when a group of like-minded people work on a common project – a standard. When the standard is not designed by and for a single industry but by and for many, keeping this rule is vital for the success of the effort. When the standard involves disparate technologies, whose practitioners are not even accustomed to talking to one another, complying with this rule is a prerequisite.

When MPEG was established (late 1980’s) many industries, regions and countries had realised that the state of digital technologies had made enough progress to enable a switch from analogue to digital. Several companies had developed prototypes, regional initiatives were attempting to develop formats for specific countries and industries, some companies were planning products and some standards organisations were actually developing standards for their industries.

MPEG jumped onto the scene at a time when the different trials had not yet had time to solidify. Therefore, it had a unique opportunity to execute its plan of an epochal analogue-to-digital transition of media.

Similarly, industries that wish to create industry-agnostic standards need to understand exactly what world they want to establish and share a common vision of it.

By the way, you need a business model

Of course, I am not talking about how a hypothetical group working for industry-agnostic standards is going to make money. I am talking about how companies participating in this common effort can develop standards that bring actual benefits to each of them in spite of their differences, market positioning etc.

In the case of MPEG the definition of the business model had to take into account the fact that industry and academia had worked on video compression technologies for some 3 decades, filing many patents (at that time they could already be counted by the thousands) which covered a wide range of basic video coding aspects. Video coding standards that are loosely called “royalty free” (in ISO language, standards for which only Option 1 patent declarations are made) were certainly possible but would probably have been unattractive because of their low performance compared with state-of-the-art codecs.

Therefore, MPEG decided that it would develop standards with the best performance, without consideration of the IPR involved. Patent holders would get royalties from the use of MPEG standards widely adopted by the market. If a patent holder did not want to allow that to happen, they could make an Option 3 declaration and MPEG would remove the infringing technologies.

The MPEG business model is certainly not a prerequisite for developing industry agnostic standards, but it has worked well for the industry. More than that, most patent holders have been and keep on re-investing the royalties they get from existing standards in more technologies for future standards. The MPEG “business model” has created a standard-producing machine (MPEG) that feeds itself with new technologies.

Standards must be industry-friendly

A primary – and obvious – goal of the effort is that the standards produced by the collaborating industries should serve the needs of the participating industries.

The following points describe the three main MPEG targets:

  1. Display formats: Since the appearance of television cameras and displays in the 1920’s, industry and governments have created tens of television formats, mostly around the basic NTSC, PAL and SECAM families. Even in the late 1960’s, when the Picturephone service was deployed, the tradition died hard: AT&T invented a new 267-line format, with no obvious connection with any of the existing video formats. As MPEG wanted to serve all markets, it decided that it would just support any display format, leaving display formats outside MPEG standards.
  2. Serving one without encumbering others. An industry may like the idea of sharing the cost of an enabling technology but not at the cost of compromising its individual needs. MPEG standards share some basic technologies but provide the necessary flexibility to their many different users with the notion of Profiles (subsets of general interoperability) and Levels (grades of performance within a Profile).
  3. Standards apply only to decoders; encoders are only implicitly defined, and their implementation leaves ample margins of freedom. By restricting standardisation to the decoding functionality, MPEG extends the life of its standards and, at the same time, allows industry players to compete on the basis of their constantly improved encoders.
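
The notion of Profiles and Levels in point 2 can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the tool names, the two profiles and the level limits below are hypothetical and are not taken from any MPEG standard.

```python
from dataclasses import dataclass

# Hypothetical Profiles: each one is a subset of the available coding tools.
PROFILES = {
    "simple": frozenset({"intra", "inter_p"}),
    "main": frozenset({"intra", "inter_p", "inter_b", "interlace"}),
}

# Hypothetical Levels: upper bounds on picture size and bitrate.
LEVELS = {
    "low": {"max_pixels": 352 * 288, "max_bitrate": 4_000_000},
    "high": {"max_pixels": 1920 * 1080, "max_bitrate": 80_000_000},
}


@dataclass(frozen=True)
class Stream:
    tools: frozenset   # coding tools the bitstream actually uses
    pixels: int        # pixels per picture
    bitrate: int       # bits per second


def can_decode(profile: str, level: str, stream: Stream) -> bool:
    """Return True if a decoder claiming profile@level can handle the stream."""
    limits = LEVELS[level]
    return (
        stream.tools <= PROFILES[profile]            # Profile: subset of tools
        and stream.pixels <= limits["max_pixels"]    # Level: grade of performance
        and stream.bitrate <= limits["max_bitrate"]
    )


sd_stream = Stream(frozenset({"intra", "inter_p"}), 352 * 288, 1_500_000)
hd_stream = Stream(frozenset({"intra", "inter_p", "inter_b"}), 1920 * 1080, 20_000_000)
```

In this toy model an industry that only needs the "simple" tools at the "low" level is not encumbered by what the others need, while a "main"/"high" decoder still shares the same basic technologies.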

The conclusion is that standards are great because they enable interoperability but should leave meaningful room to individual participants to exercise their business. Even better, as MPEG did with standards defining only the decoder, this opens the way to research-enabled competition.

Standards for the market, not the other way around

Before MPEG, a company with a successful product would try to get a “standard” stamp on it, share the technology with its competitors and enjoy the economic benefits of its “standard” technology.

This process may still be in place for some industries but is not an option when different industries team up to define common industry-agnostic standards with a global scope.

Under the MPEG regime, companies do not wait for the market to decide which technology wins, an outcome that very often has little to do with the value of the technology or the product. Instead, they wait for the “best standard” to be developed from a set of technologies, each of which is collectively selected on the basis of criteria defined a priori. Then the technology package – the standard – developed by MPEG is taken over by the industry.

In a multi-industry environment, standards must anticipate the future. The alternative is to stop making standards because if the body waits until market needs are clear, the market is already full of incompatible solutions and there is no room left for standards, certainly not industry-agnostic and with a global scope.

Anticipating market needs is in the DNA of MPEG standards. With each of its standards MPEG is betting that a certain standard technology will be adopted. This explains why some MPEG standards are extremely successful and others less so.

Integrated standards as toolkits

Today’s systems comprise many functions. Some users of the standards are keen to have the complete package of functions, while others want to keep the freedom to cherry-pick other solutions that hopefully fit in the package as shown in the figure.

If interfaces are kept, say the one between System B and System C, the complete system continues to work. Depending on the specific case, however, the level of performance (not the functionality) of the entire system may change and a degree of interoperability may be lost.

Most MPEG standards are composed of the 3 key elements – audio, video and systems – that make an audio-visual system and some, such as MPEG-4 and MPEG-I, even include 3D Graphic information and the way to combine all the media. However, the standards allow maximum usage flexibility:

  1. A standard can be used directly as a complete solution, e.g. in VCD, where Systems, Video and Audio are all used
  2. The components of the standard can be used individually, e.g. in ATSC A/53, where Systems and Video are from MPEG, and Audio is from an external source
  3. The standard does not specify a technology but only an interface to different implementations of the technology, e.g. like in the case of MPEG-I, for which MPEG will likely not standardise a Scene Description technology but just indicate how externally defined technologies can be plugged into the system
  4. A standard does not specify the solution but only the components of a solution, e.g. like in the case of Reconfigurable Video Coding (RVC) where a non-standard video codec can be assembled using an MPEG standard.
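
The second usage mode, where one component comes from an external source but the interface is kept, can be sketched as follows. This is a minimal illustration, not a description of any real receiver: the class names, the `decode` method and the returned strings are all hypothetical.

```python
from typing import Protocol


class AudioDecoder(Protocol):
    """The interface that must be kept: bytes in, a description of audio out."""

    def decode(self, payload: bytes) -> str: ...


class StandardAudio:
    """Hypothetical audio component taken from the same standard."""

    def decode(self, payload: bytes) -> str:
        return f"standard audio, {len(payload)} bytes"


class ExternalAudio:
    """Hypothetical audio component taken from an external specification."""

    def decode(self, payload: bytes) -> str:
        return f"external audio, {len(payload)} bytes"


class Receiver:
    """Systems and video stay the same; only the audio component is pluggable."""

    def __init__(self, audio: AudioDecoder) -> None:
        self.audio = audio

    def play(self, video: bytes, audio: bytes) -> str:
        return f"video, {len(video)} bytes + {self.audio.decode(audio)}"
```

Because `Receiver` depends only on the interface, swapping `StandardAudio` for `ExternalAudio` leaves the rest of the system untouched, which is the property the list above relies on.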

A multi-industry effort must satisfy the needs of all customers, even those who do not want to use its standards in their entirety but want to combine them with other specifications.

Compete and collaborate

Competition is the engine of progress, but standards are the result of a collaboration. How to combine competition and collaboration?

MPEG favours competition to the maximum extent possible. This is achieved by calling for solutions that respondents must comprehensively describe, i.e. without black boxes, in order to qualify for consideration. MPEG experts, including other proponents, assess the merit of proposed technologies.

Extending competition beyond a certain point, however, is counterproductive and prevents the group from reaching the goal with the best results.

MPEG develops and uses software platforms that assemble the candidate components selected by its experts – called Test Models – as the platforms where participants can work on improving the different areas of the Test Models.

Core Experiments are the tool that allows experts to improve the Test Model by adding, step by step, the software that implements the accepted technologies. A Core Experiment is “a technical experiment where the alternatives considered are fully documented as part of the test model, ensuring that the results of independent experimenters are consistent”.
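
The step-by-step growth of a Test Model can be pictured with a toy Python sketch. It is only an illustration: in reality acceptance is decided by the experts, and the gain metric, the two thresholds and the tool names below are hypothetical assumptions made for the example.

```python
from typing import List, Sequence


def is_accepted(gains: Sequence[float], min_gain: float = 1.0,
                max_spread: float = 0.5) -> bool:
    """Accept a tool only if independent results agree and show enough gain.

    gains: improvement over the current Test Model, in percent, as reported
    by each independent experimenter (hypothetical metric and thresholds).
    """
    if len(gains) < 2:           # a single result cannot be cross-checked
        return False
    consistent = max(gains) - min(gains) <= max_spread
    return consistent and min(gains) >= min_gain


def run_core_experiment(test_model: List[str], tool: str,
                        gains: Sequence[float]) -> List[str]:
    """Return the next Test Model: unchanged, or with the tool added."""
    return test_model + [tool] if is_accepted(gains) else list(test_model)


tm = ["baseline"]
tm = run_core_experiment(tm, "tool_a", [2.0, 2.3])   # consistent gain
tm = run_core_experiment(tm, "tool_b", [2.0, 4.0])   # results disagree
tm = run_core_experiment(tm, "tool_c", [0.2, 0.3])   # gain too small
```

After the three experiments only `tool_a` has joined the baseline, since it is the only one whose independently obtained results are both consistent and above the assumed threshold.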

MPEG’s mission is to provide the best standards to industry via competition, but MPEG standards themselves should not be shielded from competition. Probably the earliest example of application of this principle is provided by MPEG-2 part 3 (Audio). When backward compatibility requirements did not allow the standard to yield a performance comparable to that of algorithms not constrained by compatibility, MPEG issued a Call for Proposals and developed MPEG-2 part 7 (Advanced Audio Coding). Later the algorithms evolved and became the now ubiquitous MPEG-4 AAC. Had MPEG not made this decision, we would probably still have MP3 everywhere, but no other MPEG Audio standards. The latest example is Essential Video Coding (EVC), a standard designed to offer not the best performance, but a good performance with good licensability prospects.

Working on generic standards means that reasonable requests – say, the best unconstrained multichannel audio quality – cannot be dismissed. MPEG tried to achieve that with the technology it was working on – backward-compatible multichannel audio coding – and failed. The only way to respond to the request was to work on a new – competing – technology.

One step at a time

An obvious principle, but one worth keeping in mind, is that one must fine-tune the engine before engaging in a car race. If in 1988 the newly born MPEG had proposed itself as the developer of an ambitious generic digital media technology standard applicable to all industries on a global scale, the proposal would have been seen as far-fetched and most likely the initiative would have gone nowhere.

Instead, MPEG started with a moderately ambitious project: a video coding standard for interactive applications on digital storage media (CD-ROM) at a rather low bitrate (1.5 Mbit/s) targeting the market covered by the video cassette (VHS/Beta) with the addition of interactivity.

Moving one step at a time has been MPEG policy for MPEG-1 and all its subsequent standards, and any effort comparable to MPEG’s should do the same.

Separate wheat from chaff

In human societies parliaments make laws and tribunals decide if a specific human action conforms to the law. In certain regulated environments (e.g. terrestrial broadcasting in many countries) there are standards and entities (authorised test laboratories) who decide whether a specific implementation conforms to the standard. MPEG has neither but, in keeping with its “industry-neutral” mission, it provides the technical means – namely, tools for conformance assessment, e.g. bitstreams and reference software – for industries to use in case they want to establish authorised test laboratories for their own purposes.

Providing the tools for testing the standard is vital in a multi-industry environment. The ecosystem is owned by all and should not be polluted by non-conforming implementations.

Technology is always on the move

The Greek philosopher Heraclitus is reported to have said: τὰ πάντα ῥεῖ καὶ οὐδὲν μένει (everything flows and nothing stays). The fate of any technology field today is that technologies not only do not stay but move fast and actually accelerate.

MPEG is well aware that the technology landscape is constantly changing, and this awareness informs its standards. Until HEVC – one can even say, including the upcoming Versatile Video Coding (VVC) standard – video meant a rectangular area (in MPEG-4, a flat area of any shape, in HEVC it can be a video projected on a sphere). The birth of immersive visual experiences is not without pain, but they are happening, and MPEG must be ready with solutions that take this basic assumption into account. This means that, in the technology scenario that is taking shape, the MPEG role of “anticipatory standards” is ever more important and challenging to achieve.

This has happened for most of its video and audio compression standards. A paradigmatic case of a standard addressing a change of context is MPEG Media Transport (MMT), which MPEG designed having in mind a broadcasting system for which the layer below it is IP, unlike MPEG-2 Transport Stream, originally designed for a digitised analogue channel (but also used for transport over IP, as in IPTV).

Research for standards

The wild pace of technology progress requires an engine capable of constantly feeding new technologies.

MPEG is not in the research business. However, without a world of researchers working with MPEG in mind, there would be no MPEG. The MPEG work plan promotes corporate/academic research because it pushes companies to improve their technologies so that they can make successful responses to Calls for Proposals.

One of the reasons for MPEG’s success, but also for some of its difficulties, is that MPEG standardisation is a process closer to research than to product design.

Roughly speaking, in the MPEG standardisation process, research happens in two phases:

  1. In companies, in preparation for Calls for Evidence (CfE) or Calls for Proposals (CfP), in what MPEG calls the competitive phase
  2. In MPEG, in what is called the collaborative phase, i.e. during the development of Core Experiments (of course this research phase is still done by the companies, but in the framework of an MPEG standard under development).

The MPEG collaborative phase offers another opportunity to do more research. This apparently has a more limited scope, because it happens in the context of optimising a subset of the entire scope of the standard, but the sum of many small optimisations can provide big gains in performance. The shortcoming of this process is the possible introduction of a large number of IP items for a gain that some may well consider insufficient to justify the added IP onus and complexity. With its MPEG-5 EVC project, MPEG is trying to see if a suitably placed lower limit to performance improvements can help solve the problems identified in the HEVC standard.

Standards as enablers, not disablers

A standard intended for use by many industries cannot be “owned” by a specific industry. Therefore MPEG, keeping faith with its “generic standards” mission, tries to accommodate all legitimate functional requirements when it develops a new standard. MPEG assesses each requirement on its merits (value of functionality, cost of implementation, possibility to aggregate the functionality with others etc.). Profiles and Levels are then used to partition the application space in response to specific industry needs.

The same happens if an industry comes with a legitimate request to add a functionality to an existing standard. The decision to accept or reject a request is only driven by the value brought by the proposal, as substantiated by use cases, not because an industry gets an advantage, or another is penalised.

Conclusions

This article has given some hints, drawn from MPEG’s 30-year-long experience, to those who intend to undertake an effort to develop standards for a multi-industry environment.

It is a significant but doable task if the effort is supported by new people without common history because a completely new philosophy of work must be adopted.

It is a close to impossible task if the effort is supported by people who already had a common history. This is true also in the case that the effort is about cloning MPEG to develop digital media standards.

Who “owns” MPEG?

Introduction

The title of this article contains three elements: the verb own pre- and postfixed with a quotation mark which conveys the notion of having “a control of, a stake in or an influence on”; the acronym MPEG with its multiple meanings; and the pronoun who which represents several entities.

In this article I will try and clarify the nature of these three elements and draw some conclusions that may be useful for some impending decisions.

The different uses of the word “MPEG”

In MPEG: what it did, is doing, will do I have reported evidence that even the public at large widely knows the word “MPEG”. But what do they mean when they use it? Most often they intend some “media technologies” present in a device or used to create, store or transmit media content. Therefore, the word “MPEG” in this context is owned by all.

The domain mpeg.org is obviously connected to the word “MPEG”. As I have written in the same MPEG: what it did, is doing, will do, the domain is owned by an individual who is not using the domain, is not inclined to donate it and does not answer when asked how much will make him inclined. So mpeg.org is privately owned.

A small number of people use the word “MPEG” to indicate one or more of the ISO/IEC standards listed in ISO numbers of MPEG standards. In this context the ownership is clear because ISO and IEC own the copyright of the standard, as stated in the second page of all MPEG standards. But what about other rights?

A small but important number of people use the word MPEG to indicate the working group (WG) whose official name is ISO/IEC JTC 1/SC 29/WG 11 (I have yet to meet someone outside the ISO/IEC circle who knows the acronym). To understand what “ownership” means in this context we have to make a digression that I hope will be perceived as short.

The place of the MPEG WG in ISO

The MPEG WG is the green box in the ISO organisation chart of Figure 1. I know that I should also say something about IEC, but the explanation I am going to make will discourage a sufficient number of well-intentioned readers to read further and I do not want to risk losing them all.

Figure 1 – MPEG sits at the bottom of the ISO organisation

In the hierarchical organisation of Figure 1 there are plenty of opportunities to claim “ownership”.

  1. The General Assembly, made up of all standards organisations members of ISO called National Bodies (NB);
  2. The ISO Council, the core governance body made up of 20 NBs, the ISO Officers and the Chairs of 4 committees;
  3. The Technical Management Board (TMB), in charge of managing the structure and activities of the Technical Committees (TC), made up of a chair and 15 members from NBs;
  4. The Joint ISO/IEC Technical Committee 1 Information technology (JTC 1), the largest Technical Committee in ISO with 15 Advisory Groups, 2 Working Groups, and 22 Subcommittees (SC);
  5. Subcommittee 29, in charge of Coding of audio, picture, multimedia and hypermedia information, which has 2 Working Groups;
  6. WG 11 aka MPEG in charge of Moving Pictures and Audio.

“Ownership” of MPEG standards

ISO and IEC

The MPEG WG is populated by experts who are accredited by the ISO NBs they are members of. They are typically employees of a company or university, but there are also some students, consultants or simply individuals.

Experts attending MPEG produce standards using the process described in … ISO and IEC sell copies of the standards whose copyright they own. These can be purchased online from www.iso.org. Some standards (e.g. those common with ITU) can be freely downloaded.

Patent holders

The buyer of an MPEG standard does not necessarily have the right to practice the standard because of rights to essential patents that may be owned by third parties, in most cases employers of MPEG experts. Therefore, these third parties also “own” the MPEG standard in which they have essential patents. Since the very beginning, MPEG standards have had a large number of patent holders, and patent pools were established to license packages of patents considered essential to practice MPEG standards. Therefore, patent pools, too, have a sort of “ownership”.

A court of law may declare that a patent is essential to practice a standard in a jurisdiction. This kind of declaration is available for a limited number of patents (compared to the number of patents granted). Therefore, the identity of those who really “own” something in an MPEG standard is typically wrapped in mystery.

ISO and IEC (and ITU when relevant) keep patent declarations spontaneously made by those who believe they have rights in the standards. Patent declarations are also made by those who submit technologies for consideration by MPEG, either as part of responses to a Call for Proposal or as independent submissions (e.g. when submitting results of a Core Experiment). ISO and IEC simply record those declarations and take no stance regarding their validity. Therefore, the actual “ownership” claimed by those declarations or agreements to recognise them has to be determined outside of ISO.

Ownership of reference software

MPEG prides itself on its policy of developing two “versions” of most of its standards. One version is expressed in natural language (English, later translated into French and Spanish). The second version is expressed in a computer language and is called the reference software of the standard. Both versions have normative status.

In the second half of the 1990’s, MPEG developed the MPEG-4 “copyright disclaimer” whereby users could get a free license from ISO/IEC to use and modify individual modules of the reference software for products claiming conformance to the specific MPEG standard. Of course, the copyright disclaimer included the usual disclaimer about third parties’ rights.

In the mid 2000s MPEG adopted a slightly modified version of the Berkeley Software Distribution (BSD) licence (available as the MXM Licence), a licence originally used to distribute a Unix-like operating system. The licence simply says that the reference software may be used for anything the user wishes, with some obvious exceptions. MPEG has added the usual disclaimer (the “modification” above) about possible third-party rights.

Ownership of conformance testing suites

The notion of “ownership”, possibly a more complicated one, is also applicable to conformance testing suites. These consist of bitstreams designed and produced by MPEG experts to test particularly critical parts of a decoder. They have a normative value and are declared to conform to a specific MPEG standard. Bitstreams can be downloaded from the ISO web site and used to check whether a decoder implementation conforms to the standard.
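
How a suite of conformance bitstreams is used to check a decoder can be sketched in Python. The sketch makes strong simplifying assumptions: the "codec" is a toy run-length scheme invented for the example, the reference decoder stands in for reference software, and comparing SHA-256 digests of the decoded output is just one possible way to compare results, not a procedure taken from any MPEG standard.

```python
import hashlib
from typing import Callable, List, Tuple

Decoder = Callable[[bytes], bytes]


def reference_decode(bitstream: bytes) -> bytes:
    """Toy stand-in for reference software: expands (count, value) byte pairs."""
    out = bytearray()
    for i in range(0, len(bitstream) - 1, 2):
        out += bytes([bitstream[i + 1]]) * bitstream[i]
    return bytes(out)


def buggy_decode(bitstream: bytes) -> bytes:
    """Hypothetical implementation that mishandles the critical zero-count case."""
    out = bytearray()
    for i in range(0, len(bitstream) - 1, 2):
        out += bytes([bitstream[i + 1]]) * max(1, bitstream[i])
    return bytes(out)


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_suite(bitstreams: List[bytes]) -> List[Tuple[bytes, str]]:
    """Pair each conformance bitstream with the digest of its expected output."""
    return [(bs, digest(reference_decode(bs))) for bs in bitstreams]


def conforms(decoder: Decoder, suite: List[Tuple[bytes, str]]) -> bool:
    """A decoder conforms if it reproduces the expected output for every bitstream."""
    return all(digest(decoder(bs)) == expected for bs, expected in suite)


# The second bitstream exercises a critical case: a run of length zero.
SUITE = build_suite([bytes([3, 65, 2, 66]), bytes([0, 67, 1, 68])])
```

The ordinary bitstream alone would not expose `buggy_decode`; it is the bitstream designed around the critical case that separates the conforming implementation from the non-conforming one.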

Ownership of test sequences

There are other parties who have some “ownership” of MPEG standards, the companies (in some cases individuals) who have contributed content – images, audio, video and point clouds – used to carry out evaluations of submissions or verification tests. The very first test sequences used by MPEG were the so-called “CCIR sequences” donated by CCIR (now ITU-R) and used in MPEG-1 and other Calls for Proposals. For years the entire video coding community widely used “Table tennis”, “Mobile and calendar” etc. More recently, rights holders have begun to license content to individual MPEG experts for the purpose of developing MPEG standards.

ISO’s is a different organisation chart

Many companies have similar organisation charts with similar boxes as those of Figure 1. They are populated in the real world by employees and managers often with the support of secretariats. Unlike companies, however, ISO does not have its own employees populating the boxes located under the Technical Management Board. Who populates them?

Working Groups are populated by experts sent by companies and chaired by convenors, normally coming from industry. The role of convenors is to facilitate consensus and to detect achievement of consensus. As part of their duty to convene experts, convenors may need to organise the work. This gives convenors a limited level of “ownership”.

Subcommittees and Technical Committees are populated by delegations of National Bodies made up of NB officers often complemented by industry representatives. SC and TC secretariats have a level of “ownership” because the ISO/IEC Directives assign to them several important tasks such as the management of document balloting, e.g. proposals of new work items or approval of standards under development. Committee chairs, too, have a level of ownership.

Another level of “ownership” comes from the fact that a Convenor is nominated by the SC secretariat and elected by NB delegates in the SC. The SC Chair is elected by NB delegates in the SC and confirmed by NB delegates in the TC.

National Bodies “own” ISO

As we have seen, National Bodies permeate the ISO structure at all levels. Decisions at TC and SC level are made by NB votes. It is a one country, one vote system but, sometimes, some votes weigh more than others. Votes are cast on the basis of the opinion formed within the NBs, ostensibly to further “national interests”.

National Bodies usually have a structure mirroring that of ISO. Industrial interests of national companies which are members of the National Body, often subsidiaries of multinational companies, intermingle to determine national positions on matters to be decided at the international level.

Conclusions

In this article I have tried to identify and describe the different forms of MPEG-generated “ownership”. Clearly the most important “ownership” is represented by MPEG standards because they generate three revenue streams: from products/services/applications enabled by MPEG standards (companies) to satisfy end user needs, from those who practice MPEG standards (patent or, more generally, intellectual property holders) and from the sale of standards (ISO/IEC and National Bodies).

Unfortunately, this clearcut logic is polluted by the different forms of “ownership” that I have described above, especially the desire of National Bodies to hold committee secretariats and nominate chairs. These may be nice-looking elements in National Body panoplies, but have nothing to do with the intensity of the three streams, the raison d’être of a standards organisation.

The intensity of revenue streams exclusively depends on the productivity of the “machine” (the committee) that creates the standards. “Owners” acting according to logic should leverage – not meddle with – a working group whose standards enable a global turnover of devices worth more than 1 trillion USD and services revenues of ~230 billion USD for pay-tv alone (data of 2018).

It will be interesting to see if the next decisions will follow a revenue stream maximisation logic, a curio collection logic or a survival logic.

Which future for MPEG

Introduction

For three decades MPEG has designed multi-threaded work programs, cooperated with tens of organisations and developed close to 200 specifications and hundreds of amendments. MPEG standards have transformed the global media industry from analogue to digital, enabled participation of new industries in the media business, provided standards serving all industries without favours and ensured unrelenting expansion of the entire media business.

MPEG achievements look reassuring: in 2018 the global turnover of MPEG-enabled devices is worth more than 1 trillion USD per annum and the global Pay-Tv revenues are 228 billion USD per annum, without mentioning other industries.

Should MPEG rest and look forward to a bright future built on the successes achieved? It would be so easy to answer “tout va très bien madame la marquise” but my answer is no.

The MPEG landscape has changed

In the first 10 years of its existence MPEG ruled the field of audio and video coding. In the following 10 years some proprietary solutions popped up, but the field was still largely dominated by MPEG standards. In the last 10 years MPEG has seen proprietary solutions getting significant strength. Today large areas that used to be exclusively covered by MPEG standards are satisfied by other solutions and the path to satisfy the next needs is not at all clear.

We do not yet have the complete picture of the extent to which the market will turn its back on MPEG standards. The market will speak and the market is right by definition. If we do not like it, it is just because we did not try hard enough.

The next few years will be very challenging. It will not be a time of “business as usual”. MPEG needs to rethink itself and take appropriate measures. This article lays down some ideas and presents a proposal.

MPEG is about compression

So far MPEG has produced 5 generations of video compression standards, each generation offering more compression and more features. More of the same is expected from the 6th generation (VVC). MPEG has produced an equivalent number of audio coding standards. Will industry keep on asking for more video compression? I would like to answer with a resolute yes, as I believe that there will always be a need for more compression of visual information, but not always and not necessarily in the “old way”. More importantly, the need will not always be satisfied by MPEG standards because “The Price Is Right” applies to compression standards, too.

The answer to the question “Do we need more audio compression?” is, at least in the current time frame, that the currently available compression engine (MPEG-H 3D Audio) is good enough but we need new standards for other non-compression features, e.g. 6 Degrees of Freedom (6DoF). In the future this trend will also apply to video, as 3DoF+ video moves in the same direction as 6DoF audio (see The MPEG drive to immersive visual experiences).

Point clouds definitely need compression, but the convergence between 3D visual information captured by means of traditional video capturing devices and point clouds is still a matter for investigation.

MPEG is also about systems aspects

Systems aspects have been the enabling factor of the success of many MPEG standards, and the future will make those aspects not less but more important. The trend toward immersive media will require an even deeper integration between compressed media and the systems aspects that permeate them.
This can be seen from the requirements that are being identified in an activity called “Immersive Media Access and Delivery” where four dimensions are identified:

  1. Time (the usual one)
  2. Space (how to retrieve just the media parts of interest)
  3. Quality (how to access portions of media with the quality desired)
  4. Object (how to access specific parts of specific objects of interest).

MPEG is not just about media

In the past 30 years MPEG has shown that it could address new domains of expertise and learn their language. The fact that today all digital media speak the same (technical) language is also due to the efforts made by MPEG to understand the needs of different industries, convert them to requirements, develop the technologies and quantise the standards into profiles and levels. This workflow has been in operation for 27 years, starting from the moment MPEG invented profiles and levels and consistently applied them to talk to different communities using the same language.
This does not mean that there are no challenges when we talk to a new industry. MPEG has spent more than 3 years talking to, and identifying and validating requirements with the genomic community before starting the development of the MPEG-G standard. This significant effort has paid off: three International Standards on Genomic Data Compression have been developed jointly with TC 276 and 3 more are in the pipeline.

Governance is important

MPEG achieved its results as a working group, i.e. as the lowest organisational unit in ISO/IEC that the ISO/IEC directives recommend to be “reasonably limited in size”. Rightly, the ISO/IEC Directives do not define the term “reasonable”, but in 1989 MPEG already had 100 members, in 1999 it had 300 members and in 2019 it has 1500 members, 500 of whom attend its quarterly meetings. For the work it has done, MPEG has been reasonably limited in size.

For 30 years MPEG has played a role much above its status and I do not think there should be complaints about the results. In normal conditions MPEG could continue to operate as a working group for another 30 years but, as I said above, these are not normal conditions.

MPEG should become a Subcommittee (SC). Why? Because an SC has a solid governance administered by delegates appointed by National Bodies under the leadership of a chair. On the other hand, the design of the organisation must be properly done, if we do not want to do more harm than good.

Design of the SC is important

To be successful, the organisation of the SC should be conservative because of the importance of the industries served by MPEG standards. Therefore, the organisation of the SC should leverage the existing successful MPEG organisation by retaining and strengthening:

  1. The existing well-honed and demonstrably effective MPEG organisation. MPEG has fed the global media industry with technology that has allowed its growth for the last 30 years. It would be irresponsible to do anything that jeopardises such a large industry, the millions of jobs that go with it and the billions of consumers.
  2. The MPEG collaborative stance with other bodies. The major reason for the success of MPEG standards is MPEG’s open collaboration stance with its many client industries, as represented by their standards organisations or industry fora. Collaboration is a must for MPEG because compression is always part of a bigger system with many interactions with other components. However, what was good 30, 15 or even 5 years ago is not necessarily sufficient today.
  3. The match of new MPEG standards to market needs. MPEG has produced hugely successful standards. However, other standards are less so. This is inevitable for an organisation that develops anticipatory standards that sometimes target the next 5 years. MPEG’s ability to engage in standards that are better matches of market needs has to be enhanced because conditions have changed.
  4. The strong MPEG brand. The new organisation is an internal matter designed to give MPEG a better chance to face a complex situation and should not create confusion in the MPEG client industries.

Figure 1 represents the proposed organisation.

Figure 1 – Structure of Subcommittee “MPEG Coding of Moving Pictures and Audio”

Meeting design criteria

Criterion #1: the organisational chart of Figure 1 retains the current MPEG organisational structure where existing informal subgroups become (in parentheses the names of the existing MPEG entities):

Advisory Groups (AG) if they do not develop standards (orange blocks):

  1. Technical Requirements (Requirements)
  2. Liaison and Communication (Communication)
  3. Technical Coordination (Chairs meetings)

Working Groups (WG) if they develop standards (light green blocks):

  1. Systems Coding (Systems)
  2. Video Coding (Video)
  3. Audio Coding (Audio)
  4. 3D Graphics Coding (3D Graphics)
  5. Quality Assessment (Test)
  6. Genomic Data Coding (Genomic activity in Requirements)

Criterion #2: The chart preserves and extends MPEG’s collaboration stance: Joint Teams with ITU-T (JCT-VC and JVET). The new organisation will be able to establish JWGs to develop standards on well-identified common areas of standardisation, e.g. JPEG, SC 24, SC 41 (IoT), SC 42 (Artificial Intelligence), TC 276 (Bioinformatics).

Criterion #3: The SC will now be able to carry out activities with high strategic value by assigning to the Market Needs AG and the Technical Requirements AG the task to investigate 1) existing or new areas of work and 2) proposals for new areas of work. Both AGs will produce coordinated reports that will be used by the MPEG SC to make an informed decision on new work.

Criterion #4: The SC should be called “MPEG Coding of Moving Pictures and Audio” prefixing the word MPEG to the current title of ISO/IEC JTC 1/SC 29/WG 11 (MPEG).

Conclusions

In a few weeks industry will decide the future of MPEG.

Will industry decide to give itself a safe and prosperous path by adopting the organisation proposed in this article or will it opt for the Japanese saying “出る釘は打たれる” (The nail that sticks out gets hammered down)? Will industry allow the MPEG nail to keep on sticking out or will it hammer it down?

Stay tuned to this blog for further news.

Why MPEG is part of ISO/IEC

Introduction

In July 1987 the plan to create a group that would develop industry-neutral standards was formed. But the problem to be tackled was that the MPEG “digital baseband” (see The discontinuity of digital technologies) had to be based on international standards because it had to have global validity.

The question then was: where should those standards be developed? The answer is provided by the following sections:

  1. Standards describes the 3 international standards organisations;
  2. ISO and IEC standards describes the ISO structure and the ISO/IEC standardisation process;
  3. A home for MPEG describes how an independent home for MPEG was found.

Standards

Standards have a special place in industry because they represent convergence points where the parties involved, who typically are in competition, find it convenient to agree on a single solution.

Three standards bodies exist at the international level:

  1. International Telecommunication Union (ITU) for matters related to telecommunication and broadcasting
  2. International Electrotechnical Commission (IEC) for electrotechnical matters
  3. International Organization for Standardization (ISO) for everything else.

ITU

The International Telecommunication Union (ITU) is the result of the 1934 merger between the International Telegraph Convention of 1865 and the International Radiotelegraph Convention of 1906, and today is an agency of the United Nations. This is reflected in the two main branches of the ITU: ITU-T and ITU-R. The former deals with standards for global telecommunications excluding radio communication, which is the purview of ITU-R.

IEC

The International Electrotechnical Commission (IEC) is a not-for-profit organisation founded in 1906. It develops International Standards in the fields of electrotechnology, ranging from power generation, transmission and distribution to home appliances and office equipment, semiconductors, fibre optics, batteries, solar energy, nanotechnology and marine energy.

ISO

The International Organization for Standardization (ISO) is an international non-governmental standard-setting organisation founded in 1947 and composed of representatives from various national standards organizations.

ISO is well known for its family of quality management systems standards (ISO 9000), environmental management standards (ISO 14000) and Information Security Management Systems standards (ISO 27000). There are more than 20,000 published ISO standards.

ISO is a huge organisation whose technical branch is structured, as is the IEC’s, in Technical Committees (TC). The first 3 active TCs are: TC 1 Screw threads, TC 2 Fasteners and TC 4 Rolling bearings. The last 3 TCs in order of establishment are TC 322 Sustainable finance, TC 323 Circular economy and TC 324 Sharing economy.

Between these two extremes there is a large number of TCs, e.g., TC 35 Paints and varnishes, TC 186 Cutlery and table and decorative metal hollow-ware, TC 249 Traditional Chinese medicine, TC 282 Water reuse, TC 297 Waste collection and transportation management, etc.

Most TCs are organised in working groups (WG). They are tasked to develop standards while TCs retain key functions such as strategy and management. In quite a few cases the area of responsibility is so broad that a horizontal organisation would not be functional. In this case a TC may decide to establish Subcommittees (SC) which include WGs tasked to develop standards.

Figure 1 is an organigram of ISO.

Figure 1 – ISO governance structure

ISO and IEC standards

The development process

ISO and IEC share the standard development process which can be summarised as follows:

  1. Submission and balloting of a New Work Item Proposal (NWIP) of a new project meant to lead to an International Standard (IS) or Technical Report (TR). The former contains normative clauses, the latter is informative
  2. Development of a Working Draft (WD), possibly in several versions
  3. Balloting of the Committee Draft (CD, when the WD has achieved sufficient maturity)
  4. Balloting of the Draft International Standard (DIS, after resolving comments made by National Bodies)
  5. Balloting of the Final Draft International Standard (FDIS, after resolving comments made by National Bodies)

The last ballot is yes/no. No comments allowed.

Amendments (AMD) extend a standard. The same steps as above are carried out with the names Proposed Draft Amendment (PDAM), Draft Amendment (DAM) and Final Draft Amendment (FDAM).

If an error is discovered, a Corrigendum (COR) is produced. This only goes through two stages: Draft Corrigendum (DCOR) and Corrigendum (COR).

A Technical Report, a document without normative clauses, goes through two stages of approval: Proposed Draft Technical Report (PDTR) and Technical Report (TR).
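The stage names above can be kept straight with a small lookup table. What follows is only a sketch: the abbreviations are the ones listed in the text, while the dictionary layout, the keys and the helper name `next_stage` are assumptions made for illustration. For Amendments, Corrigenda and Technical Reports only the approval stages named above are listed.

```python
# Stage abbreviations are taken from the text; the table layout and the
# helper name are hypothetical and used only for this illustration.
DOCUMENT_STAGES = {
    "IS": ["NWIP", "WD", "CD", "DIS", "FDIS"],  # International Standard
    "AMD": ["PDAM", "DAM", "FDAM"],             # Amendment
    "COR": ["DCOR", "COR"],                     # Corrigendum
    "TR": ["PDTR", "TR"],                       # Technical Report
}


def next_stage(kind, current):
    """Return the stage that follows `current`, or None if it is the last."""
    stages = DOCUMENT_STAGES[kind]
    position = stages.index(current)
    if position + 1 < len(stages):
        return stages[position + 1]
    return None


print(next_stage("IS", "CD"))     # DIS
print(next_stage("AMD", "FDAM"))  # None
```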

Consensus

ISO/IEC mandates that in the development of standards working groups operate based on consensus. This is defined as:

General agreement characterised by the absence of sustained opposition to substantial issues by any important part of the concerned interests and by a process that involves seeking to take into account the views of all parties concerned and to reconcile any conflicting arguments.

NOTE — Consensus need not imply unanimity.

Patent policy

ISO, IEC and ITU share a common policy vis-à-vis patents in their standards. Using a few imprecise but hopefully clear words (as opposed to many precise but unclear words), the policy is:

  1. It is good if a standard has no patents or if the patent holders allow use of their patents for free (with an “Option 1” declaration);
  2. It is accepted if a standard has patents, but the patent holders only allow use of their patents on reasonable and non-discriminatory terms and conditions (with an “Option 2” declaration);
  3. It is not permitted to have a standard with patents whose holders do not allow use of their patents (with an “Option 3” declaration).

A home for MPEG

When the MPEG idea took shape in July 1987, the selection of a home to implement the idea was the primary concern. The idea was spoilt for choice, as shown by the list of international committees in Table 1 that were created for various reasons (regulation or simply the need for an independent technical reference) to cater to the needs of standards by the different industries.

Table 1 – Media-related standards committees (1980’s)

Body    Subject                  Committee
ITU-T   Speech                   SG XV WP 1
ITU-T   Video                    SG XV WP 2
ITU-R   Audio                    SG 10
ITU-R   Video                    SG 11
IEC     Recording of audio       SC 60 A
IEC     Recording of video       SC 60 B
IEC     Audio-visual equipment   TC 84
IEC     Receivers                SC 12A and G
ISO     Photography              TC 42
ISO     Cinematography           TC 36

Since MPEG was conceived to be industry-neutral, committees already developing standards in the “media” area were considered unsuitable because they represented “vested interests”. The choice fell on ISO TC 97 Data Processing, which had SC 2 Character sets and Information Coding, which in turn included WG 8 Coding of Audio and Picture Information.

In 1987 ISO/TC 97 Data Processing merged with IEC/TC 83 Information technology equipment. The resulting (joint) technical committee was called ISO/IEC JTC 1 Information Technology. SC 2 with its WGs, including WG 8, became part of JTC 1. MPEG was established as an Experts Group on Moving Pictures of ISO/IEC JTC 1/SC 2/WG 8 in 1988.

Note that an Experts Group is an organisational entity not officially recognised in the ISO organigram. In 1991 SC 2/WG 8 seceded from SC 2 and became SC 29. WG 8’s Moving Picture Experts Group (MPEG) became WG 11 Coding of audio, picture, multimedia and hypermedia information (but everybody in the industry, and even in the general public, calls it MPEG).

The discontinuity of digital technologies

Introduction

Last week I published as an article of this blog the Executive summary of my book A vision made real – Past, present and future of MPEG. This time I publish as an article the first chapter of the book about the four aspects of the media distribution business and their enabling technologies:

  1. Analogue media distribution describes the vertical businesses of analogue media distribution;
  2. Digitised media describes media digitisation and why it was largely irrelevant to distribution;
  3. Compressed digital media describes how industry tried to use compression for distribution;
  4. Digital technologies for media distribution describes the potential structural impact of compressed digital media for distribution.

Analogue media distribution

In the 1980’s media were analogue, the sole exception being music on compact disc (CD). Different industries were engaged in the business of distributing media: telecommunication companies distributed music, cable operators distributed television via cable, terrestrial and satellite broadcasters did the same via terrestrial and satellite networks and different types of businesses distributed all sorts of recorded media on physical supports (film, laser discs, compact cassette, VHS/Betamax cassette, etc.).

Even if the media content was exactly the same, say a movie, the baseband signals that represented the media content were all different and specific to the delivery medium: film for theatrical viewing, television for the terrestrial or satellite network or for the cable, a different format for video cassette. Added to these technological differences caused by the physical nature of the delivery media, there were often substantial differences that depended on countries or manufacturers.

Figure 1 depicts the vertical businesses of the analogue world when media distribution was a collection of industry-dependent distribution systems each using their own technologies for the baseband signal. The figure is simplified because it does not take into account the country- or region-based differences within each industry.

Figure 1 – Analogue media distribution

Digitised media

Since the 1930’s the telecom industry had investigated digitisation of signals (speech at that time). In the 1960’s technology could support digitisation and ITU created G.711, the standard for digital speech, i.e. analogue speech sampled at 8 kHz with nonlinear 8-bit quantisation. For several decades digital speech only existed in the (fixed) network, but few were aware of it because the speech did not leave the network as bits.

It was necessary to wait until 1982 for Philips and Sony to develop the Compact Disc (CD), which carried digital stereo audio as specified in the “Red Book”: analogue stereo audio sampled at 44.1 kHz with 16-bit linear quantisation. It was a revolution because consumers could have an audio quality that did not deteriorate with time.

In 1980 a digital video standard was issued by ITU-R. The luminance and the two colour-difference signals were sampled at 13.5 and 6.75 MHz, respectively, at 8 bits per sample, yielding an exceedingly high bitrate of 216 Mbit/s. It was a major achievement, but digital television never left the studio except on bulky magnetic tapes.

The network could carry 64 kbit/s of digital speech, but no consumer-level delivery media of that time could carry the 1.41 Mbit/s of digital audio and much less the 216 Mbit/s of digital video. Therefore, in the 1960s studies on compression of digitised media began in earnest.
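The three bitrates quoted in this section follow directly from the sampling parameters given above. Here is a minimal sketch of the arithmetic; the function name `pcm_bitrate` is my own choice for illustration, and the numbers are the ones stated in the text.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Bitrate in bit/s of an uncompressed sampled signal."""
    return sample_rate_hz * bits_per_sample * channels


# Digital speech: 8 kHz sampling, 8 bits per sample, one channel.
speech = pcm_bitrate(8_000, 8)

# CD audio: 44.1 kHz sampling, 16 bits per sample, two channels (stereo).
cd_audio = pcm_bitrate(44_100, 16, channels=2)

# Digital video: luminance at 13.5 MHz plus two colour-difference
# signals at 6.75 MHz each, all at 8 bits per sample.
video = pcm_bitrate(13_500_000, 8) + 2 * pcm_bitrate(6_750_000, 8)

print(f"{speech / 1e3:.0f} kbit/s")    # 64 kbit/s
print(f"{cd_audio / 1e6:.2f} Mbit/s")  # 1.41 Mbit/s
print(f"{video / 1e6:.0f} Mbit/s")     # 216 Mbit/s
```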

Compressed digital media

In the 1980’s compression research yielded its first fruits:

  1. In 1980 ITU approved Recommendation T.4: Standardization of Group 3 facsimile terminals for document transmission. In the following decades hundreds of millions of Group 3 facsimile devices were installed worldwide because, thanks to compression, transmission time of an A4 sheet was cut from 6 min (Group 1 facsimile), or 3 min (Group 2 facsimile) to about 1 min.
  2. In 1982 ITU approved H.100 (11/88) Visual telephone systems for transmission of videoconference at 1.5/2 Mbit/s. Analogue videoconferencing was not unknown at that time because several companies had trials, but many hoped that H.100 would enable widespread business communication.
  3. In 1984 ITU started the standardisation activity that would give rise to Recommendation H.261: Video codec for audio-visual services at p x 64 kbit/s, approved in 1988.
  4. In the mid-1980s several CE laboratories were studying digital video recording for magnetic tapes. One example was the European Digital Video Recorder (DVS) project that people expected would provide a higher-quality alternative to the analogue VHS or Betamax videocassette recorder, much as CDs were supposed to be a higher-quality alternative to LP records.
  5. Still in the area of recording, but for a radically new type of application – interactive video on compact disc – Philips and RCA were independently studying methods to encode video signals at bitrates of 1.41 Mbit/s (the output bitrate of CD).
  6. In the same years CMTT, a special Group of the ITU dealing with transmission of radio and television programs on telecommunication networks, had started working on a standard for transmission of digital television for “primary contribution” (i.e. transmission between studios).
  7. In 1987 the Advisory Committee on Advanced Television Service was formed to devise a plan to introduce HDTV in the USA, and Europe was doing the same with its HD-MAC project.
  8. At the end of the 1980’s RAI and Telettra had developed an HDTV codec for satellite broadcasting that was used for demonstrations during the Soccer World Cup in 1990, and General Instrument had shown its Digicipher II system for terrestrial HDTV broadcasting in the bandwidth of 6 MHz used by American terrestrial television.

Digital technologies for media distribution

The above shows how companies, industries and standards committees were jockeying for a position in the upcoming digital world. These disparate and often uncorrelated initiatives betrayed the mindset that guided them: future distribution of digital media would have an arrangement similar to the one sketched in Figure 1 for analogue media: the “baseband signal” of each delivery medium would be digital, thus using new technology, but different for each industry and possibly for each country/region.

In the analogue world these scattered roles and responsibilities were not particularly harmful because the delivery media and the baseband signals were so different that unification had never been attempted. But in the digital world unification made a lot of sense.

MPEG was conceived as the organisation that would achieve unification and provide generic, i.e. domain-independent digital media compression. In other words, MPEG envisaged the completely different set up depicted in Figure 2.

Figure 2 – Digital media distribution (à la MPEG)

In retrospect that was a daunting task. If its magnitude had been realised, it would probably never have started.

A vision made real – Past, present and future of MPEG

Why this book?

In the span of a generation, the life of the large majority of human beings has become incredibly different from the life of the generation before. The ability to communicate made possible by the ubiquitous internet, and to convey media content to others made possible by MPEG standards, can probably be counted among the most important factors of change. However, unlike the internet, about which a lot has been written, little is known about the MPEG group besides its name.

This book wants to make up for this lack of information.

It will talk about the radical transformation that MPEG standards wrought to the media distribution business by replacing a multitude of technologies owned by different businesses with a single technology shared by all; the environment in which it operates; the radically new philosophy that underpins this transformation; the means devised to put the philosophy into practice; the industrial and economic impact of MPEG standards; what new standards are being developed; and what is the future that the author conceives for MPEG as an organisation that plays such an important industrial and social role.

Bottom line, MPEG is about technology. Therefore, the book offers an overview of all MPEG standards and, in particular, video, audio, media quality, systems and data. This is for those more (but not a lot more) technology-minded.

Important – there are short Conclusions worth reading.

Leonardo Chiariglione

Table of Contents of A vision made real – Past, present and future of MPEG

Introduction of A vision made real – Past, present and future of MPEG

The impact of MPEG standards

I suppose that few visitors of this blog need to be convinced that MPEG is important because they have some personal experience of the importance of MPEG. Equally, I suppose that not all visitors have full visibility of all the application areas where MPEG is important.

This article describes different application domains showing how applications have benefited from MPEG standards. The list is not exhaustive and the order in which applications are presented follows approximately the time in which MPEG enabled the application.

Digital Television for distribution

MPEG-2 was the first integrated digital television standard. It was first deployed in 1994, even before the MPEG-2 standard was approved. While most countries have adopted MPEG-2 Video for their terrestrial broadcasting services, with one notable major exception, countries have made different selections for the audio component.

MPEG-2 Transport Stream is the Systems layer of Digital Television. The Systems layer can carry the “format identifier”. In case the media (audio or video) carried by the Systems layer are different from MPEG, the format identifier indicates which of the registered formats is being actually used.
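To make the “format identifier” more concrete, here is a minimal sketch of how a receiver might read it. It assumes the identifier travels in the MPEG-2 Systems registration descriptor, which to my understanding has tag 0x05 followed by a length byte, a 32-bit format identifier and optional additional bytes. The function name and the sample bytes are made up for illustration.

```python
REGISTRATION_DESCRIPTOR_TAG = 0x05  # assumed tag value for this descriptor


def parse_registration_descriptor(descriptor):
    """Return (format_identifier, additional_info) from raw descriptor bytes."""
    if len(descriptor) < 2:
        raise ValueError("descriptor too short")
    tag, length = descriptor[0], descriptor[1]
    if tag != REGISTRATION_DESCRIPTOR_TAG:
        raise ValueError(f"not a registration descriptor (tag {tag:#04x})")
    if length < 4 or len(descriptor) < 2 + length:
        raise ValueError("truncated registration descriptor")
    body = descriptor[2:2 + length]
    # The first four bytes are the identifier; anything after is extra info.
    format_identifier = body[:4].decode("ascii", errors="replace")
    return format_identifier, body[4:]


# Hypothetical descriptor announcing a non-MPEG audio format.
example = bytes([0x05, 0x04]) + b"AC-3"
print(parse_registration_descriptor(example))  # ('AC-3', b'')
```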

Digital Television exploits Digital Storage Media Command and Control (DSM-CC) to set up a network connection (used by CATV services) and the carousel to send the content of a slowly changing information source that each receiver that happens to “tune-in” can acquire after some time.

MPEG-4 AVC has replaced MPEG-2 Video in many instances because of its superior compression performance. MPEG-H HEVC is also being used in different countries, especially for Ultra High Definition (UHD) distribution. HEVC has the advantage of providing better compression than AVC. Additionally it supports High Dynamic Range (HDR) and Wider Colour Gamut (WCG).

MPEG-B Part 9 provides a specification for Common Encryption of MPEG-2 Transport Streams.

MPEG-H Part 1, MPEG Media Transport (MMT), replaces the original MPEG-2 Transport Stream. MMT is part of the ATSC 3.0 specification.

Digital Audio Broadcasting

In the mid-1990’s different European countries began to launch Digital Audio Broadcasting services based on the specifications of the Eureka 147 (EU 147) research project. EU 147 used MPEG-1 Audio Layer II as compressed audio format, in addition to other EU 147-proper specifications. The specifications were widely adopted in other countries outside of Europe, promoted by the non-governmental organisation WorldDAB.

In 2006 the DAB+ specifications were released. DAB+ includes HE-AAC v2 and MPEG surround (MPEG-D Part 1).

Technologically connected to DAB for the transport layer, but addressing video (AVC), is the Digital Multimedia Broadcasting (DMB) system developed by Korea for video transmission on mobile handsets.

Other audio services, such as XM, use HE-AAC.

Digital Audio

MP3 (MPEG-1 Audio Layer III) brought a revolution in the music world because it triggered new ways to distribute and enjoy music content. MP3 players continued the revolution brought about by the Walkman. Different versions of AAC continued that trend and triggered the birth of music distribution over the internet. Today most music is distributed via the internet using MPEG standards.

Digital Video for package media distribution

Video Compact Disc (VCD)

The original target of MPEG-1 – interactive video on compact disc – did not happen but, especially in Far East markets, VCD was a big success – probably 1 billion devices sold – anticipating the coming of the better-performing but more complex MPEG-2 based DVD. VCD used MPEG-1 Systems, Video and Audio Layer II.

Digital Versatile Disc (DVD)

The widely successful DVD specification used MPEG-2 Video, MPEG-2 Program Stream and a selection of audio codecs for different world regions.

Blu-ray Disc (BD)

The BD specification makes reference to AVC and to Multiview Video Coding. MPEG-2 TS is used instead of MPEG-2 PS. Apparently, no MPEG audio codecs are supported.

Ultra HD Blu-ray

The specification supports 4K UHD video encoded in HEVC with 10-bit High Dynamic Range and Wider Colour Gamut.

Digital video for the studio

MPEG was born to serve the “last mile” of video distribution, but some companies requested a version of MPEG-2 targeting studio use. This is the origin of the MPEG-2 4:2:2 profile which only supports intraframe coding and a higher number of bits per pixel.

All standards following MPEG-2, starting from MPEG-4 Visual, have had a few profiles dedicated to use in the studio.

Not strictly in the video coding area is the Audio-Visual Description Profile (AVDP), defined in MPEG-7 Part 9. AVDP was developed to facilitate the introduction of automatic information extraction tools in media production, through the definition of a common format for the exchange of the metadata they generate, e.g. shot/scene detection, face recognition/tracking, speech recognition, copy detection and summarisation, etc.

Digital video

Repeating the “MP3 use case for video” was the ambition of many. MPEG-4 Visual provided the standard technology for doing it. DivX (a company) took over the spec and triggered the birth of a “DVD-to-video file” industry that attracted significant attention for some time.

Video distribution over the internet

MPEG-4 Visual was the first video coding standard designed to be “IT-friendly”. Some companies started plans to deliver video over the internet, which was then growing (in bitrate). Those plans suffered a deadly blow with the publication of the MPEG-4 Visual licensing terms with the “content fee” clause.

The more relaxed AVC licensing terms favoured the development of MPEG-standard based internet-based video distribution. Unfortunately, the years lost with the MPEG-4 Visual licensing terms gave time to alternative proprietary video codecs to consolidate their position in the market.

A similar story continues with HEVC whose licensing terms are of concern to many not for what they say, but for what some patent holders do not say (because they do not provide licensing terms).

Not strictly in the video coding area, but extremely important for video distribution over the internet, is Dynamic Adaptive Streaming over HTTP (DASH). DASH enables a client to request a server to send a video segment of the quality that can be streamed on the bandwidth available at a particular time, as measured by the client.
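The client-driven choice described above can be sketched in a few lines. This is only an illustration of the idea, not the adaptation logic of any real player: as far as I know the DASH standard leaves that logic to the client, and practical players also smooth their bandwidth estimates and watch buffer levels. The function name and the bitrate ladder are hypothetical.

```python
def select_representation(bitrates_bps, measured_bandwidth_bps):
    """Pick the highest bitrate that fits the measured bandwidth.

    Falls back to the lowest available bitrate when nothing fits.
    """
    affordable = [b for b in bitrates_bps if b <= measured_bandwidth_bps]
    return max(affordable) if affordable else min(bitrates_bps)


# Hypothetical set of qualities offered by the server for one video.
ladder = [500_000, 1_500_000, 3_000_000, 6_000_000]

print(select_representation(ladder, 4_200_000))  # 3000000
print(select_representation(ladder, 300_000))    # 500000
```

The client would repeat this choice for each segment, so the quality follows the bandwidth as it changes over time.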

In the same space MPEG produced the Common Media Application Format (CMAF) standard. Several technologies drawn from different MPEG standards are restricted and integrated to enable efficient delivery of large scale, possibly protected, video applications, e.g. streaming of televised events. CMAF Segments can be delivered once to edge servers in content delivery networks (CDN), then accessed from cache by streaming video players without additional network backbone traffic or transmission delay.

File Format

To be “IT-friendly” MPEG-4 needed a file format and this is exactly what MPEG has provided.

The MP4 File Format, officially called ISO Base Media File Format (ISO BMFF), was the MPEG response to the need. It can be used for editing, HTTP streaming and broadcasting.

MP4 FF contains tracks for each media type (audio, video etc.), with additional information: a four-character code giving the media type ‘name’, together with all parameters needed by the media type decoder. “Track selection data” helps a decoder identify what aspect of a track can be used and to determine which alternatives are available.
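Four-character codes appear throughout the file format. The sketch below relies on my understanding of the ISO BMFF box layout, which the text above does not describe: every box starts with a 32-bit big-endian size followed by a four-character type, a size of 1 means that a 64-bit size follows, and a size of 0 means that the box runs to the end of the data. The function name and the sample bytes are made up for illustration.

```python
import struct


def iter_boxes(data):
    """Yield (box_type, payload) for each top-level box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit size follows the four-character type.
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("malformed box")
        yield box_type.decode("ascii"), data[offset + header:offset + size]
        offset += size


# Hypothetical two-box sample: a 16-byte 'ftyp' box and an empty 'free' box.
sample = struct.pack(">I4s4sI", 16, b"ftyp", b"isom", 0)
sample += struct.pack(">I4s", 8, b"free")

for box_type, payload in iter_boxes(sample):
    print(box_type, len(payload))
```

A real reader would go on to descend into container boxes to reach the tracks; this sketch stops at the top level.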

An important support to the file format is the Common Encryption for files provided by MPEG-B Part 7.

Still more to say about MPEG standards

In Is there a logic in MPEG standards? and There is more to say about MPEG standards I have made an overview of the first 11 MPEG standards (white squares in Figure 1). In this article I would like to continue the overview and briefly present the remaining 11 MPEG standards, including those that are still being developed. Using the same convention as before, those marked yellow are standards on which no work has been done for a few years.

Figure 1 – The 22 MPEG standards. Those in colour are presented in this article

MPEG-MAR

When MPEG began the development of the Augmented Reality Application Format (ARAF) it also started a specification called Augmented Reality Reference Model. Later it became aware that SC 24 Computer graphics, image processing and environmental data representation was doing similar work and joined forces with it to develop a standard called Mixed and Augmented Reality Reference Model.

In the Mixed and Augmented Reality (MAR) paradigm, representations of physical and computer mediated virtual objects are combined in various modalities. The MAR standard has been developed to enable

  1. The design of MAR applications or services. The designer may refer to and select the needed components from those specified in the MAR model architecture, taking into account the given application/service requirements.
  2. The development of a MAR business model. Value chain and actors are identified in the Reference Model and implementors may map them to their business models or invent new ones.
  3. The extension of existing or creation of new MAR standards. MAR is interdisciplinary and creates ample opportunities for extending existing technology solutions and standards.

MAR-RM and ARAF paradigmatically express the differences between MPEG standardisation and “regular” IT standardisation. MPEG defines interfaces and technologies while IT standards typically define architectures and reference models. This explains why the majority of patent declarations that ISO receives relate to MPEG standards. It is also worth noting that in the 6 years it took to develop the standard, MPEG developed 3 editions of its ARAF standard.

The Reference architecture of the MAR standard is depicted in the figure below.

Information from the real world is sensed and enters the MAR engine either directly or after being “understood”. The engine can also access media assets or external services. All information is processed by the engine which outputs the result of its processing and manages the interaction with the user.

Figure 2 – MAR Reference System Architecture

Based on this model, the standard elaborates the Enterprise Viewpoint with classes of actors, roles, business model and success criteria, the Computational Viewpoint with functionalities at the component level and the Informational Viewpoint with data communication between components.

MAR-RM is a one-part standard.

MPEG-M

Multimedia service platform technologies (MPEG-M) specifies two main components of a multimedia device, called peer in MPEG-M.

As shown in Figure 3, the first component is a set of APIs: a High-Level API for applications and a Low-Level API for network, energy and security.

Figure 3 – High Level and Low Level MPEG-M API

The second component is a middleware called MXM that relies specifically on MPEG multimedia technologies (Figure 4).

Figure 4 – The MXM architecture

The Middleware is composed of two types of engine. Technology Engines are used to call functionalities defined by MPEG standards such as creating or interpreting a licence attached to a content item. Protocol Engines are used to communicate with other peers, e.g. in case a peer does not have a particular Technology Engine that another peer has. For instance, a peer can use a Protocol Engine to call a licence server to get a licence to attach to a multimedia content item. The MPEG-M middleware has the ability to create chains of Technology Engines (Orchestration) or Protocol Engines (Aggregation).
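
The interplay of the two engine types can be sketched in a few lines. This is only an illustration of the idea, not the MXM API: the Peer class, the engine names and the dictionary used as a content item are all hypothetical. A local lookup plays the role of a Technology Engine call, and delegation to another peer stands in for a Protocol Engine.

```python
from typing import Callable, Dict, List, Optional

Engine = Callable[[dict], dict]

def package_engine(item: dict) -> dict:
    # Hypothetical Technology Engine: marks the content item as packaged
    return {**item, "packaged": True}

def licence_engine(item: dict) -> dict:
    # Hypothetical Technology Engine: attaches a licence to the content item
    return {**item, "licence": f"licence-for-{item['id']}"}

class Peer:
    def __init__(self, name: str, engines: Dict[str, Engine],
                 remote: Optional["Peer"] = None):
        self.name = name
        self.engines = engines
        self.remote = remote

    def call(self, engine_name: str, item: dict) -> dict:
        engine = self.engines.get(engine_name)
        if engine is not None:
            return engine(item)
        if self.remote is None:
            raise LookupError(f"{self.name} has no engine {engine_name!r}")
        # Stand-in for a Protocol Engine: ask a peer that has the engine
        return self.remote.call(engine_name, item)

    def orchestrate(self, engine_names: List[str], item: dict) -> dict:
        # Chain engines so that each one works on the output of the previous one
        for engine_name in engine_names:
            item = self.call(engine_name, item)
        return item

server = Peer("licence-server", {"licence": licence_engine})
device = Peer("device", {"package": package_engine}, remote=server)
result = device.orchestrate(["package", "licence"], {"id": "song-42"})
print(result)
```

Here the device packages the item locally and obtains the licence from the server peer, which mirrors the licence-server example in the text.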

MPEG-M is a 5-part standard

  • Part 1 – Architecture specifies the architecture, and High and Low level API of Figure 3
  • Part 2 – MPEG extensible middleware (MXM) API specifies the API of Figure 4
  • Part 3 – Conformance and reference software
  • Part 4 – Elementary services specifies the elementary services provided by the Protocol Engines
  • Part 5 – Service aggregation specifies how elementary services can be aggregated.

MPEG-U

The development of the MPEG-U standards was motivated by the evolution of User Interfaces that integrate advanced rich media content such as 2D/3D, animations and video/audio clips and aggregate dedicated small applications called widgets. These are standalone applications embedded in a Web page and rely on Web technologies (HTML, CSS, JS) or equivalent.

With its MPEG-U standard, MPEG sought to have a common UI on different devices, e.g. TV, Phone, Desktop and Web page.

Therefore MPEG-U extends W3C recommendations to

  1. Cover non-Web domains (Home network, Mobile, Broadcast)
  2. Support MPEG media types (BIFS and LASeR) and transports (MP4 FF and MPEG-2 TS)
  3. Enable Widget Communications with restricted profiles (without scripting)

The MPEG-U architecture is depicted in Figure 5.

Figure 5 – MPEG-U Architecture

The normative behaviour of the Widget Manager includes the following elements of a widget

  1. Packaging formats
  2. Representation format (manifest)
  3. Life Cycle handling
  4. Communication handling
  5. Context and Mobility management
  6. Individual rendering (i.e. scene description normative behaviour)

Figure 6 depicts the operation of an MPEG-U widget for TV in a DLNA environment.

Figure 6 – MPEG-U for TV in a DLNA environment

MPEG-U is a 3-part standard

  • Part 1 – Widgets
  • Part 2 – Additional gestures and multimodal interaction
  • Part 3 – Conformance and reference software

MPEG-H

High efficiency coding and media delivery in heterogeneous environments (MPEG-H) is an integrated standard that resumes the original MPEG “one and trine” Systems-Video-Audio standards approach. In the wake of those standards, the 3 parts can be and are actually used independently, e.g. in video streaming applications. On the other hand, ATSC have adopted the full Systems-Video-Audio triad with extensions of their own.

MPEG-H has 15 parts, as follows

  1. Part 1 – MPEG Media Transport (MMT) is the solution for the new world of broadcasting where delivery of content can take place over different channels each with different characteristics, e.g. one-way (traditional broadcasting) and two-way (the ever more pervasive broadband network). MMT assumes that the Internet Protocol is common to all channels.
  2. Part 2 – High Efficiency Video Coding (HEVC) is the latest approved MPEG video coding standard supporting a range of functionalities: scalability, multiview, from 4:2:0 to 4:4:4, up to 16 bits, Wider Colour Gamut and High Dynamic Range and Screen Content Coding
  3. Part 3 – 3D Audio is the latest approved audio coding standard supporting enhanced 3D audio experiences
  4. Parts 4, 5 and 6 Reference software for MMT, HEVC and 3D Audio
  5. Parts 7, 8, 9 Conformance testing for MMT, HEVC and 3D Audio
  6. Part 10 – MPEG Media Transport FEC Codes specifies several Forward Error Correcting Codes for use by MMT.
  7. Part 11 – MPEG Composition Information specifies an extension to HTML 5 for use with MMT
  8. Part 12 – Image File Format specifies a file format for individual images and image sequences
  9. Part 13 – MMT Implementation Guidelines collects useful guidelines for MMT use
  10. Parts 14 – Conversion and coding practices for high-dynamic-range and wide-colour-gamut video and 15 – Signalling, backward compatibility and display adaptation for HDR/WCG video are technical reports to guide users in supporting HDR/WCG.
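
To give a feel for what the Forward Error Correcting codes of Part 10 are for, here is a minimal sketch of the simplest possible scheme: one XOR parity packet protecting a group of equal-length packets. This is a generic textbook technique, not one of the codes specified in the standard, and the function names and packet contents are made up for the example.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(packets):
    """Sender side: one repair packet computed over all source packets."""
    return reduce(xor_bytes, packets)

def recover(received, parity):
    """Receiver side: rebuild the single missing packet (marked as None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet")
    present = [p for p in received if p is not None]
    restored = reduce(xor_bytes, present, parity)
    repaired = list(received)
    repaired[missing[0]] = restored
    return repaired

packets = [b"MMT0", b"MMT1", b"MMT2"]   # equal-length source packets
parity = make_parity(packets)
damaged = [packets[0], None, packets[2]]  # packet 1 lost on a one-way channel
repaired = recover(damaged, parity)
print(repaired)
```

The point is that the receiver repairs the loss without asking for a retransmission, which is what makes such codes attractive on one-way broadcast channels.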

MPEG-DASH

Dynamic adaptive streaming over HTTP (DASH) is a suite of standards for the efficient and easy streaming of multimedia using available HTTP infrastructure (particularly servers and CDNs, but also proxies, caches, etc.). DASH was motivated by the popularity of HTTP streaming and the existence of different protocols used in different streaming platforms, e.g. different manifest and segment formats.

By developing the DASH standard for HTTP streaming of multimedia content, MPEG has enabled a standard-based client to stream content from any standard-based server, thereby enabling interoperability between servers and clients of different vendors.

As depicted in Figure 7, the multimedia content is stored on an HTTP server in two components: 1) Media Presentation Description (MPD) which describes a manifest of the available content, its various alternatives, their URL addresses and other characteristics, and 2) Segments which contain the actual multimedia bitstreams in form of chunks, in single or multiple files.

Figure 7 – DASH model
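
The client side of this model can be sketched as follows: read the MPD, list the available alternatives and pick the one that fits the measured throughput. The MPD below is a hypothetical, heavily reduced example (real MPDs carry many more attributes and usually describe segments rather than a single BaseURL), and the selection rule is just one simple assumption about what a client might do.

```python
import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical, heavily reduced MPD with three alternatives of the same video
MPD_XML = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="low" bandwidth="500000"><BaseURL>video_500k.mp4</BaseURL></Representation>
      <Representation id="mid" bandwidth="1000000"><BaseURL>video_1m.mp4</BaseURL></Representation>
      <Representation id="high" bandwidth="3000000"><BaseURL>video_3m.mp4</BaseURL></Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""

def list_representations(mpd_xml: str):
    """Return (id, bandwidth in bit/s, URL) for every Representation."""
    root = ET.fromstring(mpd_xml)
    path = "./mpd:Period/mpd:AdaptationSet/mpd:Representation"
    return [
        (rep.get("id"), int(rep.get("bandwidth")),
         rep.findtext("mpd:BaseURL", namespaces=NS))
        for rep in root.findall(path, NS)
    ]

def choose_representation(reps, measured_bps: int):
    """Pick the highest bandwidth that fits, else fall back to the lowest."""
    affordable = [r for r in reps if r[1] <= measured_bps]
    if affordable:
        return max(affordable, key=lambda r: r[1])
    return min(reps, key=lambda r: r[1])

reps = list_representations(MPD_XML)
chosen = choose_representation(reps, measured_bps=1_200_000)
print(chosen)
```

A real client repeats this choice as throughput changes and then fetches the chosen segments with ordinary HTTP GET requests.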

Currently DASH is composed of 8 parts

  1. Part 1 – Media presentation description and segment formats specifies 1) the Media Presentation Description (MPD) which provides sufficient information for a DASH client to adaptively stream the content by downloading the media segments from an HTTP server, and 2) the segment formats which specify the formats of the entity body of the request response when issuing an HTTP GET request or a partial HTTP GET.
  2. Part 2 – Conformance and reference software is the regular component of an MPEG standard
  3. Part 3 – Implementation guidelines provides guidance to implementors
  4. Part 4 – Segment encryption and authentication specifies encryption and authentication of DASH segments
  5. Part 5 – Server and Network Assisted DASH specifies asynchronous network-to-client and network-to-network communication of quality-related assisting information
  6. Part 6 – DASH with Server Push and WebSockets specifies the carriage of MPEG-DASH media presentations over full duplex HTTP-compatible protocols, particularly HTTP/2 and WebSockets
  7. Part 7 – Delivery of CMAF content with DASH specifies how the content specified by the Common Media Application Format can be carried by DASH
  8. Part 8 – Session based DASH operation will specify a method for MPD to manage DASH sessions for the server to instruct the client about some operation continuously applied during the session.

MPEG-I

Coded representation of immersive media (MPEG-I) represents the current MPEG effort to develop a suite of standards to support immersive media products, services and applications.

Currently MPEG-I has 11 parts but more parts are likely to be added.

  1. Part 1 – Immersive Media Architectures outlines possible architectures for immersive media services.
  2. Part 2 – Omnidirectional MediA Format specifies an application format that enables consumption of omnidirectional video (aka Video 360). Version 2 is under development
  3. Part 3 – Immersive Video Coding will specify the emerging Versatile Video Coding standard
  4. Part 4 – Immersive Audio Coding will specify metadata to enable enhanced immersive audio experiences compared to what is possible today with MPEG-H 3D Audio
  5. Part 5 – Video-based Point Cloud Compression will specify a standard to compress dense static and dynamic point clouds
  6. Part 6 – Immersive Media Metrics will specify different parameters useful for immersive media services and their measurability
  7. Part 7 – Immersive Media Metadata will specify systems, video and audio metadata for immersive experiences. One example is the current 3DoF+ Video activity
  8. Part 8 – Network-Based Media Processing will specify APIs to access remote media processing services
  9. Part 9 – Geometry-based Point Cloud Compression will specify a standard to compress sparse static and dynamic point clouds
  10. Part 10 – Carriage of Point Cloud Data will specify how to accommodate compressed point clouds in the MP4 File Format
  11. Part 11 – Implementation Guidelines for Network-based Media Processing is the usual collection of guidelines

MPEG-CICP

Coding-Independent Code-Points (MPEG-CICP) is a collection of code points that have been assembled in single media- and technology-specific documents because they are not standard-specific.

Part 1 – Systems, Part 2 – Video and Part 3 – Audio collect the respective code points and Part 4 – Usage of video signal type code points contains guidelines for their use.
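
A code point is simply a number with an agreed meaning that any coding standard can carry. The sketch below decodes a triple of video code points into readable names. It lists only a handful of values as I understand them; the tables in the standard are the authority, and the dictionary and function names are invented for the example.

```python
# A few video code points (illustrative subset, not the full tables)
COLOUR_PRIMARIES = {1: "BT.709", 9: "BT.2020"}
TRANSFER_CHARACTERISTICS = {
    1: "BT.709",
    16: "SMPTE ST 2084 (PQ)",
    18: "ARIB STD-B67 (HLG)",
}
MATRIX_COEFFICIENTS = {
    0: "Identity (e.g. RGB)",
    1: "BT.709",
    9: "BT.2020 non-constant luminance",
}

def describe(primaries: int, transfer: int, matrix: int) -> str:
    """Translate a triple of code points into a human-readable description."""
    unknown = "not covered by this sketch"
    return " / ".join([
        COLOUR_PRIMARIES.get(primaries, unknown),
        TRANSFER_CHARACTERISTICS.get(transfer, unknown),
        MATRIX_COEFFICIENTS.get(matrix, unknown),
    ])

hdr10_like = describe(9, 16, 9)
sdr_hd = describe(1, 1, 1)
print(hdr10_like)
print(sdr_hd)
```

Because the numbers are defined once, outside any single codec, the same triple means the same thing whether it is carried in an AVC, HEVC or other bitstream.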

MPEG-G

Genomic Information Representation (MPEG-G) is a suite of specifications developed jointly with TC 276 Biotechnology that reduces the amount of information required to losslessly store and transmit DNA reads from high speed sequencing machines.

Figure 8 depicts the encoding process

An MPEG-G file can be created with the following sequence of operations:

  1. Put the reads in the input file (aligned or unaligned) in bins corresponding to segments of the reference genome
  2. Classify the reads in each bin in 6 classes: P (perfect match with the reference genome), M (reads with variants), etc.
  3. Convert the reads of each bin to a subset of 18 descriptors specific to the class: e.g., a class P descriptor is the start position of the read etc.
  4. Put the descriptors in the columns of a matrix
  5. Compress each descriptor column (MPEG-G uses the very efficient CABAC compressor already present in several video coding standards)
  6. Put compressed descriptors of a class of a bin in an Access Unit (AU) for a maximum of 6 AUs per bin

Figure 8 – MPEG-G compression
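
The sequence of operations above can be mimicked with a toy pipeline. Everything in it is a simplification made for the example: the reference string, the bin size, the use of only two classes ("P" and a catch-all "other") and three descriptors, and above all zlib, which stands in for the CABAC compressor that MPEG-G actually uses.

```python
import zlib
from collections import defaultdict

REFERENCE = "ACGTACGTTAGCCGATAGCTAGGCTA"  # made-up reference genome
BIN_SIZE = 8                              # made-up bin width

def classify(read: str, pos: int) -> str:
    """Class P if the read matches the reference exactly, else a catch-all."""
    return "P" if REFERENCE[pos:pos + len(read)] == read else "other"

def encode(aligned_reads):
    # Steps 1 and 2: put reads in bins and classify them within each bin
    groups = defaultdict(list)
    for pos, read in aligned_reads:
        groups[(pos // BIN_SIZE, classify(read, pos))].append((pos, read))

    access_units = {}
    for key, reads in groups.items():
        # Steps 3 and 4: turn the reads into descriptor columns
        columns = {
            "pos": [str(p) for p, _ in reads],
            "len": [str(len(r)) for _, r in reads],
        }
        if key[1] != "P":
            # Toy choice: keep the raw bases when a read has variants
            columns["seq"] = [r for _, r in reads]
        # Steps 5 and 6: compress each column, group them in an Access Unit
        access_units[key] = {
            name: zlib.compress(",".join(col).encode())
            for name, col in columns.items()
        }
    return access_units

reads = [(0, "ACGTACGT"), (2, "GTACG"), (9, "AGCCTA"), (16, "AGCTA")]
access_units = encode(reads)
print(sorted(access_units))
```

Note that perfectly matching reads need no sequence column at all: their position and length are enough to rebuild them from the reference, which is where much of the saving comes from.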

MPEG-G currently includes 6 parts

  1. Part 1 – Transport and Storage of Genomic Information specifies the file and streaming formats
  2. Part 2 – Genomic Information Representation specifies the algorithm to compress DNA reads from high speed sequencing machines
  3. Part 3 – Genomic information metadata and application programming interfaces (APIs) specifies metadata and API to access an MPEG-G file
  4. Part 4 – Reference Software and Part 5 – Conformance are the usual components of a standard
  5. Part 6 – Genomic Annotation Representation will specify how to compress annotations.

MPEG-IoMT

Internet of Media Things (MPEG-IoMT) is a suite of specifications:

  1. API to discover Media Things,
  2. Data formats and API to enable communication between Media Things.

A Media Thing (MThing) is the media “version” of IoT’s Things.

The IoMT reference model is represented in Figure 9

Figure 9: IoT in MPEG is for media – IoMT

Currently MPEG-IoMT includes 4 parts

  1. Part 1 – IoMT Architecture will specify the architecture
  2. Part 2 – IoMT Discovery and Communication API specifies Discovery and Communication API
  3. Part 3 – IoMT Media Data Formats and API specifies Media Data Formats and API
  4. Part 4 – Reference Software and Conformance is the usual part of MPEG standards

MPEG-5

General Video Coding (MPEG-5) is expected to contain video coding specifications. Currently two specifications are envisaged

  1. Part 1 – Essential Video Coding is expected to be the specification of a video codec with two layers. The first layer will provide a significant improvement over AVC but significantly less than HEVC and the second layer will provide a significant improvement over HEVC but significantly less than VVC.
  2. Part 2 – Low Complexity Video Coding Enhancements is expected to be the specification of a data stream structure defined by two component streams, a base stream decodable by a hardware decoder, and an enhancement stream suitable for software processing implementation with sustainable power consumption. The enhancement stream will provide new features such as compression capability extension to existing codecs, lower encoding and decoding complexity, for on demand and live streaming applications. The LCEVC decoder is depicted in Figure 18.

Figure 18: Low Complexity Enhancement Video Coding
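
The two-stream idea can be illustrated with a toy reconstruction. The text does not spell out how the streams are combined, so the sketch assumes, purely for illustration, that the base stream decodes to a lower-resolution signal and that the enhancement stream carries residuals to be added after upsampling. The function names and sample values are made up, and a single row of samples replaces a picture.

```python
def upsample_2x(samples):
    """Nearest-neighbour upsampling of one row of samples (assumed method)."""
    return [value for value in samples for _ in (0, 1)]

def reconstruct(base_samples, residuals):
    """Combine the decoded base with enhancement residuals, clipped to 8 bits."""
    upsampled = upsample_2x(base_samples)
    if len(upsampled) != len(residuals):
        raise ValueError("residuals must match the upsampled base length")
    return [max(0, min(255, u + r)) for u, r in zip(upsampled, residuals)]

base = [100, 120, 140]              # output of the (hardware) base decoder
residuals = [0, 2, -1, 0, 3, -2]    # decoded in software from the second stream
enhanced = reconstruct(base, residuals)
print(enhanced)
```

The base decoder is left untouched in this arrangement, which is why the enhancement can be added to existing codecs in software.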

That’s all?

Well, yes, in terms of standards that have been developed, are being developed or being extended, or for which MPEG thinks that a standard should be developed. Well, no, because MPEG is a forge of ideas and new proposals may come at every meeting.

Currently MPEG is investigating the following topics

  1. In advance signalling of MPEG containers content is motivated by scenarios where the full content of a file is not available to a player but the player needs to take a decision to retrieve the file or not. Therefore the player needs to have sufficient information to determine if it can/cannot play the entire content or only a part.
  2. Data Compression continues the exploration in search for non typical media areas that can benefit from MPEG’s compression expertise. Currently MPEG is investigating Data compression for machine tools.
  3. MPEG-21 Based Smart Contracts investigates the benefits of converting MPEG-21 contract technologies, which can be human readable, to smart contracts for execution on blockchains.

The MPEG work plan (March 2019)

Introduction

In Life inside MPEG I introduced the MPEG work plan. The clock in MPEG moves fast and that work plan is now obsolete. Here is a new re-formatted version of the MPEG work plan as of March 2019.

The MPEG work plan at a glance

Figure 1 shows the main standards that MPEG has developed or is developing in the 2017-2023 period. The figure is organised in 3 main sections:

  • Media Coding (e.g. AAC and AVC)
  • Systems and Tools (e.g. MPEG-2 TS and File Format)
  • Beyond Media (currently Genome Compression).

Figure 1 – The MPEG work plan (March 2019)

Disclaimer: dates in the figure and in the following are all planned.

Navigating the areas of the MPEG work plan

The 1st column in Figure 2 gives the currently active MPEG standardisation areas. The first row gives the currently active MPEG standards. The non-empty white cells give the number of “deliverables” (Standards, Amendments and Technical Reports) currently identified in the work plan.

Figure 2 – Standards (S), Amendments (A) and Technical Reports (T)  in the MPEG work plan (as of March 2019)

Video coding

In the Video coding area MPEG is currently developing specifications for 4 standards (MPEG-H, -I, -5 and -CICP) and is conducting explorations in advanced technologies for immersive visual experiences.

MPEG-H

Part 2 – High Efficiency Video Coding 4th edition specifies a profile of HEVC that will have an encoding of a single (i.e. monochrome) colour plane and will be restricted to a maximum of 10 bits per sample, as done in past HEVC range extensions profiles, and additional Supplemental Enhancement Information (SEI) messages, e.g. fisheye video, SEI manifest, and SEI prefix messages.

MPEG-I

Part 3 – Versatile Video Coding (VVC), currently being developed jointly with VCEG, is the new video compression standard after HEVC. VVC is expected to reach FDIS stage in July 2020 for the core compression engine. Other parts, such as high level syntax and SEI messages, will follow later.

MPEG-CICP

Part 4 – Usage of video signal type code points 2nd edition will document additional combinations of commonly used code points and baseband signalling.

MPEG-5

This standard is still awaiting approval, but MPEG has already obtained all technologies necessary to develop standards with the intended functionalities and performance from the Calls for Proposals (CfP).

  1. Part 1 – Essential Video Coding will specify a video codec with two layers: layer 1 significantly improves over AVC but performs significantly less than HEVC and layer 2 significantly improves over HEVC but performs significantly less than VVC.
  2. Part 2 – Low Complexity Video Coding Enhancements will specify a data stream structure defined by two component streams: stream 1 is decodable by a hardware decoder, stream 2 can be decoded in software with sustainable power consumption. Stream 2 provides new features such as compression capability extension to existing codecs, lower encoding and decoding complexity, for on demand and live streaming applications.

Explorations

MPEG experts are collaborating in the development of support tools, acquisition of test sequences and understanding of technologies required for 6DoF and lightfields.

  1. Compression of 6DoF visual will enable a user to move more freely than in 3DoF+, eventually, allowing any translation and rotation in space.
  2. Compression of dense representation of light fields is stimulated by new devices that capture light fields with both spatial and angular light information. As the size of data is large and different from traditional images, effective compression schemes are required.

Audio coding

In the Audio coding area MPEG is working on 2 standards (MPEG-D, and -I).

MPEG-D

In Part 5 – Uncompressed Audio in MP4 File Format, MPEG extends MP4 to enable carriage of uncompressed audio (e.g. PCM). At the moment MP4 only carries compressed audio.

MPEG-I

Part 4 – Immersive Audio. As MPEG-H 3D Audio already supports a 3DoF user experience, MPEG-I builds upon it to provide a 6DoF immersive audio experience. A Call for Proposals will be issued in October 2019. Submissions are expected in October 2021 and FDIS stage is expected to be reached in April 2022. Even though this standard will not be about compression, but about metadata as for 3DoF+ Visual, we have kept this activity under Audio Coding.

3D Graphics Coding

In the 3D Graphics Coding area MPEG is developing two parts of MPEG-I.

  • Part 5 – Video-based Point Cloud Compression (V-PCC) for which FDIS stage is planned to be reached in October 2019.
  • Part 9 – Geometry-based Point Cloud Compression (G-PCC) for which FDIS stage is planned to be reached in January 2020.

The two PCC standards employ different technologies and target different application areas: generally speaking, entertainment and automotive/unmanned aerial vehicles, respectively.

Font Coding

In the Font coding area MPEG is working on a new edition of MPEG-4 part 22.

Part 22 – Open Font Format 4th edition specifies support of complex layouts and additional support for new layout features. FDIS stage will be reached in April 2020.

Genome Coding

In the Genome coding area MPEG has achieved FDIS level for the 3 foundational parts of the MPEG-G standard:

  • Part 1 – Transport and Storage of Genomic Information
  • Part 2 – Genomic Information Representation
  • Part 3 – Genomic information metadata and application programming interfaces (APIs).

In October 2019 MPEG will complete Part 4 – Reference Software and Part 5 – Conformance. In July 2019 MPEG will issue a Call for Proposals for Part 6 – Genomic Annotation Representation.

Neural Network Coding

Compression of this type of data is motivated by the increasing use of neural networks in many applications that require the deployment of a particular trained network instance to a potentially large number of devices, which may have limited processing power and memory.

MPEG has restricted the general field to neural networks trained with media data, e.g. for object identification and content description, and is therefore developing the standard in MPEG-7, which already contains two standards – CDVS and CDVA – that offer similar functionalities achieved with different technologies (and therefore the standard should be classified under Media description).

MPEG-7

Part 17 – Compression of neural networks for multimedia content description and analysis is a standard under development that enables compression of artificial neural networks trained with audio and video data. FDIS is expected in January 2021.

Media Description

Media description is the goal of the MPEG-7 standard which contains technologies for describing media, e.g. for the purpose of searching media.

In the Media Description area MPEG completed Part 15 – Compact descriptors for video analysis (CDVA) in October 2018 and is now working on 3DoF+ visual.

MPEG-I

Part 7 – Immersive Media Metadata will specify a set of metadata that enable a decoder to provide a more realistic user experience in OMAF v2. The FDIS is planned for July 2021.

System support

In the System support area MPEG is working on MPEG-4, -H and -I.

MPEG-4

Part 34 – Registration Authorities aligns the existing MPEG-4 Registration Authorities to current ISO practice.

MPEG-H

In MPEG-H MPEG is working on

Part 10 – MPEG Media Transport FEC Codes. This is being enhanced with the Window-based FEC code. FDAM is expected to be reached in January 2020.

MPEG-I

Part 6 – Immersive Media Metrics specifies the metrics and measurement framework in support of immersive media experiences. FDIS stage is planned to be reached in July 2020.

Transport

In the Transport area MPEG is working on MPEG-2, -4, -B, -H, -DASH, -I and Explorations.

MPEG-2

Part 1 – Systems continues to be a lively area of work 25 years after MPEG-2 Systems reached FDIS. After producing Edition 7, MPEG is working on two amendments to carry two different types of content

  • Carriage of JPEG XS in MPEG-2 TS
  • Carriage of associated CMAF boxes for audio-visual elementary streams in MPEG-2 TS

MPEG-4

Part 12 – ISO Base Media File Format continues to be a lively area of work 20 years after MP4 File Format reached FDIS. MPEG is working on two amendments

  • Corrected audio handling, expected to reach FDAM in July 2019
  • Compact movie fragment is expected to reach FDAM stage in January 2020

MPEG-B

In MPEG-B MPEG is working on two new standards

  • Part 14 – Partial File Format provides a standard mechanism to store HTTP entities and the partial file in broadcast applications for later cache population. The standard is planned to reach FDIS stage in July 2020.
  • Part 15 – Carriage of Web Resources in ISOBMFF will make it possible to enrich audio/video content, as well as audio-only content, with synchronised, animated, interactive web data, including overlays. The standard is planned to reach FDIS stage in January 2020.

MPEG-DASH

In MPEG-DASH MPEG is working on

  • Part 1 – Media presentation description and segment formats will see a new edition in July 2019 and will be enhanced with an Amendment on Client event and timed metadata processing. FDAM is planned to be reached in January 2020.
  • Part 3 – MPEG-DASH Implementation Guidelines 2nd edition will become TR in July 2019
  • Part 5 – Server and network assisted DASH (SAND) will be enriched by an Amendment on Improvements on SAND messages. FDAM to be reached in July 2019.
  • Part 7 – Delivery of CMAF content with DASH is a Technical Report with guidelines on the use of the most popular delivery schemes for CMAF content using DASH. TR is planned to be reached in March 2019
  • Part 8 – Session based DASH operation will reach FDIS in July 2020.

MPEG-I

Part 2 – Omnidirectional Media Format (OMAF) released in October 2017 is the first standard format for delivery of omnidirectional content. With OMAF 2nd Edition Interactivity support for OMAF, planned to reach FDIS in July 2020, MPEG is extending OMAF with 3DoF+ functionalities.

Application Formats

MPEG-A ISO/IEC 23000 Multimedia Application Formats is a suite of standards for combinations of MPEG and other standards (only if there is no suitable MPEG standard for the purpose). MPEG is working on

Part 19 – Common Media Application Format 2nd edition with support of new formats

Application Programming Interfaces

The Application Programming Interfaces area comprises standards that make possible effective use of some MPEG standards.

MPEG-I

Part 8 – Network-based Media Processing (NBMP) is a framework that will allow users to describe media processing operations to be performed by the network. The standard is expected to reach FDIS stage in January 2020.

Media Systems

Media Systems includes standards or Technical Reports targeting architectures and frameworks.

IoMT

Part 1 – IoMT Architecture is expected to reach FDIS stage in October 2019. The architecture used in this standard is compatible with the IoT architecture developed by JTC 1/SC 41.

Reference Implementation

MPEG is working on the development of standards for reference software of MPEG-4, -7, -A, -B, -V, -H, -DASH, -G, -IoMT.

Conformance

MPEG is working on the development of standards for conformance of MPEG-4, -7, -A, -B, -V, -H, -DASH, -G, -IoMT.

The MPEG standards

MPEG uses acronyms for its standards, and industry knows the standards by these acronyms. Here you will find the full list of MPEG standards ordered by the 5-digit ISO numbers.

MPEG-1 ISO/IEC 11172 Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s

MPEG-2 ISO/IEC 13818 Generic coding of moving pictures and associated audio information

MPEG-4 ISO/IEC 14496 Coding of audio-visual objects

MPEG-7 ISO/IEC 15938 Multimedia content description interface

MPEG-21 ISO/IEC 21000 Multimedia Framework

MPEG-A ISO/IEC 23000 Multimedia Application Formats

MPEG-B ISO/IEC 23001 MPEG systems technologies

MPEG-C ISO/IEC 23002 MPEG video technologies

MPEG-D ISO/IEC 23003 MPEG audio technologies

MPEG-E ISO/IEC 23004 Multimedia Middleware

MPEG-V ISO/IEC 23005 Media context and control

MPEG-M ISO/IEC 23006 Multimedia service platform technologies

MPEG-U ISO/IEC 23007 Rich media user interfaces

MPEG-H ISO/IEC 23008 High efficiency coding and media delivery in heterogeneous environments

MPEG-DASH ISO/IEC 23009 Dynamic adaptive streaming over HTTP (DASH)

MPEG-I ISO/IEC 23090 Coded representation of immersive media

MPEG-CICP ISO/IEC 23091 Coding-Independent Code-Points

MPEG-G ISO/IEC 23092 Genomic Information Representation

MPEG-IoMT ISO/IEC 23093 Internet of Media Things

MPEG-5 ISO/IEC 23094 General Video Coding

Looking inside an MPEG meeting

Introduction

In There is more to say about MPEG standards I presented the entire spectrum of MPEG standards. No one should deny that it is an impressive set of disparate technologies integrated to cover fields connected by the common thread of Data Compression: Coding of Video, Audio, 3D Graphics, Fonts, Digital Items, Sensors and Actuators Data, Genome, and Neural Networks; Media Description and Composition; Systems support; Intellectual Property Management and Protection (IPMP); Transport; Application Formats; API; and Media Systems.

How on earth can all these technologies be specified and integrated in MPEG standards to respond to industry needs?

This article will try and answer this question. It will do so by starting, as many novels do, from the end (of an MPEG meeting).

Let’s start from the end (of an MPEG meeting)

When an MPEG meeting closes, the plenary approves the results of the week, marking the end of formal collaborative work within the meeting. Back in 1990 MPEG developed a mechanism – called “ad hoc group” (AhG) – that lets a form of collaboration continue. This mechanism allows MPEG experts to continue working together, albeit with limitations:

  1. In the scope, i.e. an AhG may only work on the areas identified by the mandates (in Latin ad hoc means “for a specific purpose”). Of course experts are free to work individually on anything and in any way that pleases them;
  2. In the purpose, i.e. an AhG may only prepare recommendations – in the scope of its mandates – to be submitted to MPEG. This is done at the beginning of the following meeting, after which the AhG is disbanded;
  3. In the method of work, i.e. an AhG operates under the leadership of one or more Chairs. Clearly, though, the success of an AhG depends very much on the attitude and activity of its members.

On average some 25 AhGs are established at each meeting. There is not one-to-one correspondence between MPEG activities and AhGs. Actually AhGs are great opportunities to explore new and possibly cross-subgroup ideas.

Examples of AhG titles are

  1. Scene Description for MPEG-I
  2. System technologies for Point Cloud Coding (PCC)
  3. Network Based Media Processing (NBMP)
  4. Compression of Neural Networks (NNR).

What happens between MPEG meetings

An AhG uses different means to carry out collaborative work: reflectors, teleconferences and physical meetings. The last can only be held if they were scheduled in the AhG establishment form. Unscheduled physical meetings may only be held if there is unanimous agreement of those who subscribed to the AhG.

Most AhGs hold scheduled meetings on the weekend that precedes the next MPEG meeting. These are very useful to coordinate the results of the work done and to prepare the report that all AhGs must make to the MPEG plenary on the following Monday.

AhG meetings, including those in the weekend preceding the MPEG meeting, are not formally part of an MPEG meeting.

An MPEG meeting at a glance

Chairs meeting

MPEG chairs meet three times during an MPEG week:

  1. On Sunday evening to review the progress of AhG work, coordinate activities impacting more than one Subgroup and plan activities to be carried out during the week including the need for joint meetings;
  2. On Tuesday evening to assess the result of the first two days of work, review the work plan and time lines based on the expected outcomes and identify the need of new joint meetings;
  3. On Thursday evening to wrap up the expected results and review the preliminary results of the week.

Plenary meetings

During an MPEG week MPEG holds 3 plenaries

  1. On Monday morning to make everybody aware of the results of work carried out since the last meeting and to plan work of the week. AhG reports are a main part of it as they are presented and, when necessary, discussed;
  2. On Wednesday morning to make everybody aware of the work done in all subgroups in the first two days and to plan work for the next two days;
  3. On Friday afternoon to approve the results of the work of Subgroups, including liaison letters, to establish new AhGs etc.

Subgroup, Breakout Group and Joint meetings

Subgroups start their meetings on Monday afternoon. They review their own activities and kick off work in their areas. Each Subgroup assigns activities to Breakout Groups (BoGs), which meet on their own schedules to achieve the goals assigned. Each Subgroup may hold other brief meetings to keep everybody in the Subgroup in sync with the general progress of the work.

For instance, the activities of the Systems Subgroup are currently: File format, DASH, OMAF, OMAF and DASH, OMAF and MIAF, MPEG Media Transport, Network Based Media Processing and PCC Systems.

The MPEG structure is designed to facilitate interactions between different Subgroups, and between BoGs from different Subgroups, on matters that affect more than one of them because they sit at the interface of MPEG subsystems. For example, the table below lists the joint meetings that the Systems Subgroup held with other Subgroups at the January 2019 meeting.

Table 1 – Joint meetings of Systems Subgroup with other Subgroups

Systems meeting with | Topics
---------------------+----------------------------------------
Reqs, Video, VCEG    | SEI messages in VVC
Audio, 3DG           | Scene Description
3DG                  | Systems for Point Cloud Compression
3DG                  | API for multiple decoders
Audio                | Uncompressed Audio
Reqs, JVET, VCEG     | Immersive Decoding Interface

NB: VCEG is the Video Coding Experts Group of ITU-T Study Group 16. It is not an MPEG Subgroup.

On Friday morning all Subgroups approve their own results. These are automatically integrated in the general document to be approved by the MPEG Plenary on Friday afternoon.

Advisors meeting

On Monday evening, an informal group of experts from different countries examines issues of general (non-technical) interest. In particular it calls for meeting hosts, reviews proposals of meeting hosts, makes recommendations of meeting hosts to the plenary etc.
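
The fixed events of the week described above can be collected into a single timeline. The following is only a minimal sketch: the tuple layout, the names `EVENTS` and `week_timeline`, and the short purpose strings are assumptions made for illustration, not an official MPEG schedule format.

```python
# Fixed events of an MPEG week, as listed in the text above.
# Names and tuple layout are illustrative assumptions.
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
SLOTS = ["morning", "afternoon", "evening"]

# (day, slot, body, purpose)
EVENTS = [
    ("Sunday", "evening", "Chairs", "review AhG progress, plan the week"),
    ("Tuesday", "evening", "Chairs", "assess first two days, review work plan"),
    ("Thursday", "evening", "Chairs", "wrap up and review preliminary results"),
    ("Monday", "morning", "Plenary", "AhG reports, plan work of the week"),
    ("Wednesday", "morning", "Plenary", "share Subgroup progress, plan next two days"),
    ("Friday", "afternoon", "Plenary", "approve results, establish new AhGs"),
    ("Monday", "afternoon", "Subgroups", "start Subgroup meetings"),
    ("Friday", "morning", "Subgroups", "approve Subgroup results"),
    ("Monday", "evening", "Advisors", "general non-technical issues, meeting hosts"),
]


def week_timeline(events):
    """Return the events sorted chronologically by day, then by slot."""
    return sorted(events, key=lambda e: (DAYS.index(e[0]), SLOTS.index(e[1])))


timeline = week_timeline(EVENTS)
for day, slot, body, purpose in timeline:
    print(f"{day:<10}{slot:<10}{body:<10}{purpose}")
```

Breakout Group and joint meetings are left out because they follow their own schedules during the week.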

A bird’s eye view of an MPEG meeting

Figure 1 depicts the workflow described in the paragraphs above, from the end of the (N-1)-th meeting to the end of the N-th meeting.

Figure 1 – A snapshot of MPEG work from the end of a meeting to the end of the next meeting

What is “done” at an MPEG meeting?

Around 500 of the best experts worldwide attend an MPEG meeting. This is an incredible amount of brain power. In the following I will try to explain how it is directed.

An example – the video area

Let’s take as an example the work done in the Video Coding area at the March 2019 meeting.

The table below has 3 columns:

  1. The standards on which work is done (Video has worked on MPEG-H, MPEG-I, MPEG-CICP, MPEG-5 and Explorations)
  2. The names of the activities and
  3. The types of documents resulting from the activities (see the following legend for an explanation of the acronyms).

Table 2 – Documents produced in the video coding area

Std  | Activity                                           | Document type
-----+----------------------------------------------------+----------------------
H    | High Efficiency (HEVC)                             | TM, CE, CTC
I    | Versatile Video Coding (VVC)                       | WD, TM, CE, CTC
I    | 3 Degrees of Freedom + (3DoF+) coding              | CfP, WD, TM, CE, CTC
CICP | Usage of video signal type code points (Ed. 1)     | TR
CICP | Usage of video signal type code points (Ed. 2)     | WD
5    | Essential Video Coding                             | WD, TM, CE, CTC
5    | Low Complexity Enhancement Video Coding            | CfP, WD, TM, CE, CTC
Expl | 6 Degrees of Freedom (6DoF) coding                 | EE, Tools
Expl | Coding of dense representation of light fields     | EE, CTC

Legend

TM: Test Model, software implementing the standard (encoder & decoder)

WD: Working Draft

CE: Core Experiment, i.e. definition of an experiment that should improve performance

CTC: Common Test Conditions, to be used by all CE participants

CfP: Call for Proposals (this time no new CfP produced, but reports and analyses of submissions in response to CfPs)

TR: Technical Report (ISO document)

EE: Exploration Experiment, an experiment to explore an issue that is not mature enough to be a CE

Tools: other supporting material, e.g. software developed for common use in CEs/EEs

What is produced by an MPEG meeting

Figure 2 gives the number of activities for each type of activity defined in the legend (and others that were not part of the work in the video area). For instance, out of a total of 97 activities:

  1. 29 relate to processing of standards through the canonical stages of Committee Draft (CD), Draft International Standard (DIS) and Final Draft International Standard (FDIS) and the equivalent for Amendments, Technical Reports and Corrigenda. In other words, at every meeting MPEG is working on ~10 “deliverables” (i.e. standards, amendments, technical reports or corrigenda) in the approval stages;
  2. 22 relate to working drafts, i.e. “new” activities that have not entered the approval stages;
  3. 8 relate to Technologies under Consideration, i.e. new technologies that are being considered to enhance existing standards;
  4. 8 relate to requirements, typically for new standards;
  5. 6 relate to Core Experiments;
  6. Etc.

 

Figure 2 – Activities at an MPEG meeting

Figure 2 does not provide a quantitative measure of “how many” documents were produced for each activity or “how big” they were. As an example, Point Cloud Compression has 20 Core Experiments and 8 Exploration Experiments under way, while MPEG-5 EVC has only one large CE.

An average measure of activity at the March 2019 meeting is obtained by dividing the number of output documents (212) by the number of activities (97), i.e. about 2.2 documents per activity.
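
The figures quoted above can be tabulated in a few lines. This is a minimal sketch using only the numbers given in the text; the category labels are shorthand chosen here, and the size of the “Etc.” remainder is derived from the total rather than stated by the source.

```python
# Activity counts quoted in the text for the March 2019 meeting.
TOTAL_ACTIVITIES = 97
OUTPUT_DOCUMENTS = 212

# Category labels are shorthand chosen for this sketch.
counts = {
    "approval stages (CD/DIS/FDIS and equivalents)": 29,
    "working drafts": 22,
    "technologies under consideration": 8,
    "requirements": 8,
    "core experiments": 6,
}
# The text lumps everything else under "Etc."; derive it from the total.
counts["other (Etc.)"] = TOTAL_ACTIVITIES - sum(counts.values())


def shares(counts, total):
    """Return each category's share of all activities, in percent."""
    return {name: 100.0 * n / total for name, n in counts.items()}


activity_shares = shares(counts, TOTAL_ACTIVITIES)
docs_per_activity = OUTPUT_DOCUMENTS / TOTAL_ACTIVITIES

for name, pct in activity_shares.items():
    print(f"{name:<48}{counts[name]:>3} {pct:5.1f}%")
print(f"documents per activity: {docs_per_activity:.1f}")
```

As the text notes, this average hides large differences between activities, so it is best read as a rough indicator only.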

Conclusions

MPEG holds quarterly meetings with an attendance of ~500 experts. If we assume that the average cost of an MPEG expert is 500 $/working day and that every expert stays 6 days (to account for attendance at AhG meetings), the industry investment in attending MPEG meetings is 500 × 500 $ × 6 = 1.5 M$/meeting, or 6 M$/year over four meetings. Of course, the total investment is much more than that, probably in excess of 1 B$ a year.
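
The attendance estimate is simple enough to recompute under other assumptions. The sketch below uses the figures from the paragraph above as defaults; the function name `attendance_cost` and its parameter names are hypothetical, and the four meetings per year follow from the quarterly schedule.

```python
def attendance_cost(experts=500, daily_cost_usd=500, days=6, meetings_per_year=4):
    """Estimate the cost of attending MPEG meetings.

    Defaults are the assumptions stated in the text: ~500 experts,
    500 $ per working day, 6 days per meeting, quarterly meetings.
    Returns (cost per meeting, cost per year) in dollars.
    """
    per_meeting = experts * daily_cost_usd * days
    return per_meeting, per_meeting * meetings_per_year


per_meeting, per_year = attendance_cost()
print(f"per meeting: {per_meeting / 1e6:.1f} M$")  # 1.5 M$
print(f"per year:    {per_year / 1e6:.1f} M$")     # 6.0 M$
```

This covers attendance only. It does not model the larger total investment mentioned in the text.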

With the meeting organisation described above MPEG tries to get the most out of the industry investment in MPEG standards.
