The birth of an MPEG standard idea

From what I have published so far on this blog it should be clear that MPEG is an unusual ISO working group (WG). To mention a few distinguishing traits: duration (31 years), early use of ICT (an online document management system in use since 1995), size (1,500 registered experts, 500 of whom attend meetings), organisation (the way 500 experts work on multiple projects in parallel), number of standards produced (more than any other JTC 1 subcommittee) and impact on industry (1 T$ in devices and 0.5 T$ in services per annum).

In How does MPEG actually work? I have talked about the life cycle of an MPEG standard, depicted in Figure 1.

Figure 1 – The MPEG standard development process

However, that article does not say much about the initial phase of the life cycle, i.e. the moment the new ideas that will turn into standards are generated, which is not even identified in the figure. By looking into that moment, we will see once more how the way new ideas for MPEG standards are generated makes MPEG an unusual WG.

The structure of this article is: a round-up of MPEG standards, a look at the birth of the parts of MPEG standards, and conclusions.

A round up of MPEG standards

The word standard is used to indicate both a series of standards (identified by the 5-digit ISO number) and a part of a standard (identified by 5 digits, a dash and a number). In this article we will use “standard” for the former and “part of a standard” for the latter.

MPEG-1 and 2

The idea of the first two MPEG standards was generated in 1988 when a new “Moving Picture Experts Group” was created in JTC 1/SC 2/WG 8. The original MPEG work items were (the acronym DSM stands for Digital Storage Media):

  1. Coding of moving pictures for DSM’s having a throughput of 1-1.5 Mbit/s (1988-1990)
  2. Coding of moving pictures for DSM’s having a throughput of 1.5-5 Mbit/s (1990-1992)
  3. Coding of moving pictures for DSM’s having a throughput of 5-60 Mbit/s (to be defined).

I was the one who proposed the first work item (video coding for interactive video on compact disc), later to be named MPEG-1, while Hiroshi Yasuda added the second (later to be named MPEG-2). In MPEG-2 the word DSM was kept in order not to upset other standards groups (but the eventual title of MPEG-2 became “Generic coding of moving pictures and associated audio information”). The third work item was little more than a placeholder. The time assigned to develop the standards was definitely optimistic: it took two more years than planned for both MPEG-1 and MPEG-2 to reach FDIS (and even so it was really a rush).

The third work item was an early case of a new standard idea being “miscarried”: it was eventually merged into the second. That is why there is no MPEG-3, while MP3 is just something else (MPEG-1/2 Audio Layer III).

MPEG-4

MPEG-4 was born through a similar process, motivated by the fact that MPEG-1 and MPEG-2 targeted high bitrates. However, low bitrates, too, were important and were not covered by MPEG standards. Eventually MPEG-4 became the container of foundational digital media technologies such as audio for internet distribution, the file format and the Open Font Format.

MPEG-7

MPEG-7 had a different story. It was proposed by Italy to SC 29 in response to the prospects of users having to navigate 500 TV channels to find what they wanted to see. The study in SC 29 went nowhere, so MPEG took it over and developed a complete standard framework for media metadata.

MPEG-21

MPEG-21 was driven by the upheaval brought about by MP3, as exemplified by Napster. The response was to create a complete standard framework for media ecommerce. The framework included the definition of Digital Items, standards for the Rights and Contract Expression Languages, and much more.

MPEG-A

MPEG-A was the result of an investigation carried out by the MPEG plenary. As a result of the investigation MPEG realised that, beyond standards for individual media, it was also necessary to develop standard combinations of media encoded according to MPEG standards.

MPEG-B, MPEG-C and MPEG-D

MPEG-B, -C and -D were proposed by MPEG at a time when the Systems-Video-Audio trinity appeared no longer to match standardisation needs, while individual Systems, Video and Audio standards were still in demand. All three standards include parts that can be classified as Systems, Video and Audio. As a note, the Systems, Video and Audio trinity is alive and kicking; actually, it has become a quaternity with Point Clouds.

MPEG-E

MPEG-E was driven by the idea of providing the industry with a standard specifying the architecture and software components of a digital media consumption device.

MPEG-V

MPEG-V was probably the first standard that was not the result of a decision by MPEG to propose a new standard but the result of individual proposals coming from two different directions: virtual worlds (the much-touted Second Life of the mid-2000s) and the enhanced user experience made possible by existing and new sensors and actuators. MPEG succeeded in developing a comprehensive standard framework for interaction of humans with and between virtual worlds.

MPEG-M

MPEG-M was influenced by work done within the Digital Media Project (DMP) to address a standard middleware for digital media. MPEG-M became an extensible standard framework that specifies the architecture with a High-Level API, the middleware composed of MPEG technologies with a Low-Level API, and the aggregation of services leveraging those technologies.

MPEG-U

MPEG-U was probably the first standard whose birth happened in a subgroup – Systems. Eventually MPEG-U became a standard for exchanging, displaying, controlling and communicating widgets with other entities, and for advanced interaction.

MPEG-H and DASH

MPEG-H and DASH were two standards born out of the strongly felt need to overhaul the then 15-year-old MPEG-2 Transport Stream (TS). The result was that the market 1) strongly reaffirmed its confidence in MPEG-2 TS, 2) enthusiastically embraced MPEG-H (integrated broadcast-broadband distribution over IP) and 3) widely deployed DASH (media distribution accommodating unpredictable variations of available bandwidth), inspired by and developed in collaboration with 3GPP.

MPEG-I

MPEG-I was the result of the drive of the industry toward immersive services and the devices enabling them. This is currently the MPEG flagship project, now with 14 parts and seemingly destined to rival the number of parts (34) of MPEG-4.

MPEG-CICP

MPEG-CICP was a “housekeeping” action with the goal of collecting, in a single place (actually 4 documents), code points for media formats that are not specific to a particular standard.

MPEG-G

MPEG-G resulted from the proposal of a single organisation for a framework standard for storage, compression and access of non-media data (DNA reads). The organisation proposed the activity, but of course the development of the standard was fully open and in line with the MPEG process of Figure 1.

MPEG-IoMT

MPEG-IoMT resulted from a single organisation proposing a framework standard for media-specific Things (as defined in the context of the Internet of Things), e.g. cameras and displays.

MPEG-5

MPEG-5 resulted from the proposal of a group of companies who needed a video compression standard addressing the business needs of some use cases, such as video streaming, where existing ISO video coding standards have not been as widely adopted as might be expected from their purely technical characteristics. This requirement was not met by the state-of-the-art MPEG video coding standards and the proposal was, after much debate, accepted by the MPEG plenary.

Looking at the birth of parts

From the roundup above the reader may have gotten the impression that most MPEG standards are the result of a collective awareness of the need for a standard. As I have described above, this is largely, but not exclusively, true of standards identified by the 5-digit ISO number. But if we look at the parts of MPEG standards, we get a picture that does not contradict the first statement but provides a different view.

In its 31 years of activity MPEG has produced parts of standards in the following areas: Video Coding, Audio Coding, 3D Graphics Coding, Font Coding, Digital Item Coding, Sensors and Actuators Data Coding, Genome Coding, Neural Network Coding, Media Description, Media Composition, Systems support, Intellectual Property Management and Protection (IPMP), Transport, Application Formats, Application Programming Interfaces (API), Media Systems, Reference implementation and Conformance.

In the following we will see how the nature of standard parts influences the birth of MPEG standards.

Video Coding

Video coding standards are mostly driven by the realisation that some existing MPEG standards are no longer aligned with the progress of technology. This was the case of MPEG-2 Video, which promised more compression than MPEG-1 Video; of MPEG-4 Visual, because there was no MPEG video coding standard for very low bitrates (much below 1 Mbit/s); and of MPEG-4 AVC, MPEG-H HEVC and MPEG-I VVC, each developed because it promised more compression than its predecessor.

Forty years of video coding and counting tells the full story of the MPEG video compression standards.

This is not the full story, though. Table 1 is a version of the table in More video with more features, slightly edited to accommodate recent evolutions. The table describes the functionalities, beyond the basic compression functionality, that have been added over the years to MPEG Video Coding standards. The birth of each of these proposals for new functionalities has been the result of much wrangling between those who wanted to add the functionalities because they believed the necessary technology was available and those who saw the technology as immature and not ready for a standard.

Table 1 – Functionalities added to MPEG video coding standards

Audio Coding

The companion Audio Coding standards had a rather different evolution. Unlike Video, whose bitrates were and are high enough, and still growing, to justify new generations of compression standards, Audio Coding was driven to a large extent by applications and functionality, with compression always playing a role. Thirty years of audio coding and counting talks first about MP3, then about MPEG-4 AAC in all its shapes, and then about MPEG Surround, Spatial Audio Object Coding (SAOC), Unified Speech and Audio Coding (USAC), Dynamic Range Control (DRC), MPEG-H 3D Audio and the coming MPEG-I Immersive Audio. MPEG-7 Audio should not be forgotten, even though compression is applied to descriptors, not to audio itself. The birth of each of these Audio Coding standards is a story in itself, ranging from being part of a bigger plan developed at MPEG plenary level, to specific Audio standards agreed by the Audio group, to a specific functionality added to a standard either developed or being developed.

3D Graphics Coding

The 3DG group worked on 2D/3D graphics compression, and then on the Animation Framework eXtension (AFX). In the case of 2D/3D graphics compression, the birth of the standards was the result of an MPEG plenary decision, but the standards kept on evolving by adding new technologies for new functionalities, most often at the instigation of individual experts and companies.

Talking of 3D Graphics Coding, I could quote Rudyard Kipling’s verse

Oh, East is East, and West is West, and never the twain shall meet

and associate the West with Video Coding and the East with 3D Graphics Coding (or vice versa). Indeed, it looked like Video and 3D Graphics Coding would comply with Kipling’s verse until Point Cloud Compression (PCC) came to the fore. Proposed by an individual expert and under exploration for quite some time, it suddenly became one of the sexiest MPEG developments, merging Video and 3D Graphics in an intricate way.

We can indeed say with Kipling

But there is neither East nor West, Border, nor Breed, nor Birth,

When two strong men stand face to face, though they come from the ends of the earth!

Font Coding

Font Coding is again a new story. This time the standard was proposed by the 3 companies – Adobe, Apple and Microsoft – who had developed the OpenType specification. The reason was that it had become burdensome for them to maintain and expand the specification in response to market needs. The MPEG plenary accepted the request, took over the task and developed several parts of several standards in multiple editions. As participants in the Font Coding activity do not typically attend MPEG meetings, new functionalities are mostly added at the request of experts or companies on the email reflector of the Font Coding ad hoc group.

Digital Item Coding

The initiative to start MPEG-21 was taken by the MPEG plenary, but the need for the 22 parts of the standard was largely identified by the subgroups – Requirements, Multimedia Description Schemes (MDS) and Systems.

Sensors and Actuators Data Coding

The birth of MPEG-V was the decision of the MPEG plenary, but the parts of the standard kept on evolving at the instigation of individual experts and companies. Four editions of the standard were produced.

Genome Coding

Development of Part 1 Storage and Transport and Part 2 Compression of a Genome Coding standard was obviously a major decision of the MPEG plenary. The need for other parts of the MPEG-G standard, namely Part 3 Metadata and API, and Part 6 Compression of Annotations, was identified by the Requirements group working on the standard.

Neural Network Coding

Neural Network Coding was proposed at the October 2017 meeting. The MPEG plenary was in doubt whether to call this “compression of another data type” (neural networks) or something in line with its “media” mandate. Eventually it opted to call it “Compression of neural networks for multimedia content description and analysis”, which partly describes what it really is: an extended new version of CDVS and CDVA with embedded compression of descriptors. Neural Network Compression (NNC) is now (October 2019) at Working Draft 2 and is planned to reach FDIS in April 2021. Experts are too busy working on the current scope to have time to think of more features, but we know there will be more because the technologies in the current draft do not support all the requirements.

Media Description

As mentioned above, the MPEG plenary decided to develop MPEG-7 seeing the inability of SC 29 to jump on the opportunity of a standard that would describe media in a standard way to help users access the content of their interest. The need for the early parts of MPEG-7 was largely identified by the groups working on MPEG-7. The need for later standards, such as CDVS and CDVA, was identified by a company (CDVS) and by a consortium (CDVA). The need for the latest standard (Neural Network Compression) was identified by the MPEG plenary.

Media Composition

Until 1995 MPEG was really a group working mostly for Broadcasting and Consumer Electronics (but MP3 was a sign of things to come) and did not have the need for a standard Media Composition technology. MPEG-4 was the standard that extended the MPEG domain to the IT world, and the “Coding of audio-visual objects” title of MPEG-4 meant that a Media Composition technology was needed.

The MPEG plenary took the decision to extend the Virtual Reality Modeling Language (VRML), establishing contacts with that group. The MPEG plenary did the same when a company proposed a new Media Composition technology based on a W3C recommendation. A company proposed to develop the MPEG Orchestration standard. After much investigation and debate, the Requirements and Systems groups have recently come to the conclusion that the MPEG-I Scene Description should be based on an extension to Khronos’ glTF2.

Systems support

Systems support was the first non-video need after audio identified by the MPEG plenary a few months after the establishment of MPEG. Today Systems support standards are mostly, but not exclusively, in MPEG-B. This standard contains parts that were the result of a decision of the MPEG plenary (e.g. Binary MPEG format for XML and Common encryption), the proposal of the Systems group (e.g. Sample Variants and Partial File Format) or the proposal of a company (e.g. Green metadata).

Intellectual Property Management and Protection (IPMP)

IPMP parts appear in MPEG-2, MPEG-4 and MPEG-21. They were triggered by the same context that produced MPEG-21, i.e. how to combine the liquidity of digital content with the need to guarantee a return to rights holders.

The need for MPEG-4 IPMP was identified by the MPEG plenary, but MPEG-2 and MPEG-21 IPMP were proposed by the Systems and MDS groups, respectively.

Transport

Although partly already present in MPEG-1, Transport technology in MPEG flourished in MPEG-2 with the Transport Stream and the Program Stream. The development of MPEG-2 Systems was a major MPEG plenary decision, as was the case for MPEG-4 Systems, which at the time included the Scene Description technology. MPEG-2 TS has dominated the broadcasting market (and is even used by AOM). As said above, MPEG-H MPEG Media Transport (MMT) and DASH are two major transport technologies whose development was identified and decided by the MPEG plenary. All three standards have been published several times (MPEG-2 Systems seven times) as a result of needs identified by the Systems group or individual companies.

Application Formats

The first Application Formats were launched by the MPEG Plenary and the following Application Formats by different MPEG groups. Later individual companies or consortia proposed several Application Formats. The Common Media Application Format (CMAF) proposed by Apple and Microsoft is one of the most successful MPEG standards.

Application Programming Interfaces (API)

MPEG Extensible Middleware (MXM) was the first MPEG API standard. The decision to do this was made by the MPEG plenary but the proposal was made by a consortium. The MPEG-G Metadata and API standard was proposed by the Requirements group. The IoMT API standard was proposed by the 3D Graphics group.

Media Systems

This area collects the parts of standards that describe or specify the architecture underpinning MPEG standards. This is the case of part 1 of MPEG-7, MPEG-E, MPEG-V, MPEG-M, MPEG-I and MPEG-IoMT. These parts are typically kicked off by the MPEG plenary.

Reference implementation and Conformance

MPEG takes seriously its statement that MPEG standards should be published in two languages – one that is understood by humans and another that is understood by a machine – and that the two should be equivalent in terms of specification. The reference software – the version of the specification understood by a machine – is used to test conformance of an implementation to the specification. For these reasons the need for reference software and conformance is identified by the MPEG plenary.

Conclusions

With the understanding that all decisions are formally made by the MPEG plenary, the trigger of an MPEG decision happens at different levels. Very often – and more often as MPEG matures and its organisation becomes more solid – the trigger is in the hands of an individual expert or group of experts, or of an MPEG subgroup.


More MPEG Strengths, Weaknesses, Opportunities and Threats

Introduction

In its MPEG and JPEG as SCs proposal, MPEG Future proposes that MPEG become a subcommittee to improve collaboration with other bodies, establish a clear reference in ISO for the digital media industry, enhance the group’s governance and more. The obvious question to MPEG Future concerns MPEG’s adequacy for the new role. The first answer to this question is that, in its original proposal, the Italian National Body UNI has already carried out a SWOT (Strengths-Weaknesses-Opportunities-Threats) analysis.

In No one is perfect, but some are more accomplished than others I have started rewording, expanding and publishing that SWOT analysis and in this article I will continue the task.

The Italian National Body has identified the following Key Performance Indicators (KPI) of a standards committee like MPEG: Context, Scope of standards, Business model, Membership, Structure, Leadership, Client industries, Collaboration, Standards development, Standards adoption, Innovation capability, Communication and Brand.

In the article mentioned above I dealt with Context, Scope of standards and Business model; in this one I will deal with Membership, Structure and Leadership.

Membership

I would like to identify four levels of members: those who actually attend MPEG meetings, those who are officially registered as members but do not attend, those who actually work in MPEG projects without being officially members and those who, even without being members, have their work significantly influenced by MPEG work plan and standards.

Membership – as defined above – is the most valuable MPEG asset. It is because of this that the MPEG Future Manifesto has identified “Support and expand the academic and research community which provides the life blood of MPEG standards” as the first of its actions.

Strengths

MPEG has a level 1-2 membership competent in all areas of scope, large in number (level 1 is 500 and level 2 is 1500 experts), from many different industries with a growing role of academia and global (>30 countries). Level 3 membership is estimated at a few thousand experts and level 4 is estimated at a few tens of thousands. There is a continuous flow of level 1-2 experts leaving and being replaced by new experts, from level 3 and even from level 4.

Another strength is the fact that many level 1 MPEG members are active members of other organisations. This multiple membership facilitates understanding of other committees’ work, needs and plans. Table 1 identifies customers (those MPEG provides standards to) and partners (those MPEG works with, e.g. to develop standards).

Table 1 – Main customers (C) and partners (P)

Read Standards and collaborations to know more about the way MPEG works with other committees.

Weaknesses

The main weakness comes from the fact that the percentage of level 1 experts coming from companies directly using MPEG standards is shrinking. This is the result of a phenomenon that is entirely outside of MPEG control but alters MPEG’s traditional relationship with its industries.

Related to the same phenomenon is the fact that the percentage of experts working for Non-Practicing Entities (NPE) is growing. Of course, all experts are motivated to develop the best possible standards, but the ultimate goal of these experts differs from the traditional experts’ goal.

Similar to the above phenomenon is the fact that the percentage of academic members, currently at about 25%, is growing. Of course, the injection of valuable academic know-how is good, but again the ultimate goal of these experts differs from the traditional experts’ goal.

Of a completely different nature is the weakness generated by one of the strengths mentioned above: the large number of level 1 members. The ISO/IEC directives say that WGs should be “limited in size”. Limited is not defined but, when one sees ISO Technical Committees of a couple hundred members, ISO Subcommittees of a few tens and MPEG (a working group) of 500, that MPEG has exceeded the “limited size” is more than a suspicion.

A final main weakness is a consequence of the fact that MPEG attracts the best experts but, being a working group, does not attract managers who care about the organisational sustainability of MPEG in a world of standards. No level 1 MPEG member attends JTC 1 meetings where important policy decisions may be made that affect MPEG’s work plan and execution.

Opportunities

One of the most strategically important opportunities is how to make best use of the enormous brain power that populates MPEG meetings and activities, influences research and attracts new members.

This can be achieved by exploiting the opportunities for new standards in the MPEG traditional media field. MPEG is working in several areas such as immersive media, neural network compression and video coding for machines that will require a large number of experts making substantial contributions. Additionally MPEG can offer new perspectives in compression of data other than media, e.g. genomics.

From the organisational viewpoint MPEG can comply with the ISO/IEC directives while keeping the MPEG ecosystem intact, e.g. by achieving subcommittee status.

Threats

The biggest threat is that the MPEG membership is not an asset that is guaranteed forever. Members at all levels can leave without being replaced because the MPEG work plan may lose its attraction, or its standards may no longer be relevant or profitable. A related threat is overshooting in the attraction of new members, beyond the ability to reward all members at all levels.

Another set of threats is caused by the current discussions on the future of MPEG, which are shaking the confidence of industry and experts. A breakup of MPEG into disconnected working groups would dramatically affect MPEG’s ability to deliver its existing work plan. Even if delivery is assured, there will be no guarantee that the quality of standards will remain the same, because the glue provided by the MPEG organisation and modus operandi will be lost.

Structure

The MPEG structure is another major asset of MPEG. It is at the root of the quality, usability and ultimate success of MPEG standards.

Strengths

The biggest strength of the MPEG structure is the fact that it has not been designed by committee but is the result of a 30-year long learning process. Figure 1 depicts the structure with an indication of the flow of activities.

Figure 1 – Today’s MPEG structure and workflow

MPEG can be defined as an ecosystem of interacting subgroups developing integrated standards and, over three decades, MPEG subgroups were created and disbanded (see here for the full story) because the ecosystem shifted in nature. The subgroups in operation today are the best match to the current conditions, but may well change if the programme of work changes.

These are the main components of the MPEG ecosystem:

  1. Ad hoc groups (AhG) have been created since the early days (1990) because experts needed an “official” environment to continue work outside MPEG meetings, with the understanding that “decisions” can only be made when MPEG is in session.
  2. Break-out groups (BoG) have existed since the early days because even a single part of a standard could be too complex and the work had to be split into separate activities, to be merged later by the subgroup in charge of that part of the standard.
  3. Joint meetings are possible because the expertise of the MPEG membership covers all areas needed by MPEG standards. Whenever an MPEG standard needs to interface with another standard or expose an interface, it is possible to get the relevant people together, discuss and take action on the issues.
  4. Chairs meetings are the place where the general progress of work is reviewed and the need for interaction between the elements of the MPEG ecosystem is identified.
  5. Finally, MPEG benefits from powerful ICT tools developed by Christian Tulvan of Institut Mines Télécom to support document management, session allocation, work plan etc.

A more complete analysis of the MPEG ecosystem is found at MPEG: vision, execution, results and a conclusion.

An important strength is given by the fact that the processes described above have taken root over many years and are now deeply ingrained in the collective mindset of MPEG members.

A final important strength is that, while MPEG does not have a formal strategic planning function, this is actually implemented by the diffuse structure described above.

Weaknesses

The main weakness of the current MPEG structure is a reflection of the main strength as described above. MPEG has an enormous brain power with extremely high levels of technical excellence but has weak links with the market.

This does not mean that MPEG is unaware of the market. Its processes include the development of context and objectives for a new project, the development of use cases and the analysis of use cases to develop requirements. However, all this is done by technical experts who, as the case may be, occasionally wear the hat of market people.

Because of its enormous brain power, MPEG has been able to develop many standards, some of which are extremely successful and others less so. Therefore, while there is no compelling need to address this weakness because MPEG standards are so successful, there is room for improvement.

Another significant weakness is the limitation in MPEG’s ability to initiate new collaborations with other committees because of MPEG’s inferior status in ISO.

Opportunities

MPEG Future’s MPEG and JPEG as SCs proposes that MPEG become a subcommittee with a new Market Needs Advisory Group (AG). MPEG is a big thing. Can it be bigger? describes how a technology-driven Technical Requirements AG can compete and collaborate with the proposed Market Needs AG to make more robust and better justified proposals for new standards.

By becoming a subcommittee MPEG can also have more freedom to initiate collaborations with other committees in a timely fashion, and to establish formal collaborations with other ISO and IEC committees using the Joint Working Group (JWG) mechanism. Today, as a working group, MPEG cannot formally work with other committees other than by liaison.

Threats

The main threat is the possibility that, in the face of a large committee of 500 level 1 experts and 1500 level 2 experts, MPEG is simply broken up into its working groups or, worse, new working groups are created by recombining MPEG activities. Of course, given sufficient time and effort, a new, different MPEG-like organisation may be created, but at the cost of delayed or inferior-quality standards and without a guarantee that a committee-designed organisation will work as well as an organisation that is the result of a Darwinian process.

Leadership

By leadership here we mean the many people who hold a leadership position in MPEG: convenor, subgroup chairs, ad hoc group chairs and break-out group rapporteurs.

Strengths

Because of its oft-mentioned enormous brain power, MPEG is in the enviable position of being able to identify excellent leaders. Actually, MPEG does that for subgroup chairs but ad hoc group chairs and break-out group rapporteurs are very much the result of a bottom up process.

The main strength is given by the consolidated and experienced MPEG and subgroup leadership, which is ready to delegate significant levels of autonomy to AhGs and BoGs, with the constraints imposed by the fact that formal adoption of technology is the competence of subgroups and ratification of decisions is the competence of the MPEG plenary.

Weaknesses

The main weakness, going hand-in-hand with its main strength, is that leadership of MPEG and subgroups is rather static and that new leaders identified in AhG-BoG activities are not sufficiently put to good use.

Opportunities

With the implementation of MPEG Future’s MPEG and JPEG as SCs proposal, MPEG has the opportunity to introduce accelerated cycles of leadership regeneration in WGs, JWGs and AGs, and in a better management of the “unit” entity described in MPEG: vision, execution, results and a conclusion.

The MPEG Future proposal also includes suggestions on new processes that could be designed for the Secretariat to nominate candidates as chair of the proposed new subcommittee, preserving the Secretariat’s ultimate authority to decide within the framework of selections made by the MPEG community.

Threats

The MPEG structure and MPEG leadership are not disconnected entities. Today’s MPEG structure with an entirely new leadership would have a hard time working smoothly and guaranteeing the delivery of the standards in the work plan with the quality that MPEG’s client industries expect. This does not mean that the leadership should stay static forever, simply that changes should be implemented progressively.

Conclusions

In this article we have made a SWOT analysis of three of the most critical KPIs: membership, structure and leadership. MPEG is excellent in all three, but it does have weaknesses. Opportunities for improvement are offered by MPEG Future’s MPEG and JPEG as SCs proposal, but threats are lurking.


The MPEG Future Manifesto

Communication makes us humans different. Media make communication between humans effective and enjoyable. Standards make media communication possible.

Thirty-two years ago, the MPEG vision was forming: make global standards available to allow industry to provide devices and services for the then emerging digital media so that humans could communicate seamlessly.

For thirty-two years the MPEG standards group has lived up to the MPEG vision: MPEG standards are behind the relentless growth of many industries – some of them created by MPEG standards. More than half the world population uses devices or accesses services, on a daily or hourly basis, that rely on MPEG standards.

The MPEG Future Manifesto claims that the MPEG mission is far from exhausted:

  • New media compression standards can offer more exciting user experiences to the consumers that the service, distribution and manufacturing industries want to reach, and can also enable new machine-based services;
  • Compression standards can facilitate the business or mission of other non-media industries and the MPEG standards group has already shown that this is possible.

Therefore, the MPEG Future Manifesto proposes a concerted effort to

  • Support and expand the academic and research community which provides the life blood of MPEG standards;
  • Enhance the value of the intellectual property that makes MPEG standards unique while facilitating their use;
  • Identify and promote the development of new compression-related standards benefitting from the MPEG approach to standardisation;
  • Further improve the connection between industry and users, and the MPEG standards group;
  • Preserve and enhance the organisation of MPEG, the standards group that can achieve the next goals because it is the one that brought the industry to this point.

MPEG Future is a group of people, many of whom are MPEG members, who care about the future of MPEG. MPEG Future is open to those who support the MPEG Future Manifesto’s principles and actions.

You may:

  • Participate in the MPEG Future activities, by subscribing to the LinkedIn MPEG Future group https://bit.ly/2m6r19y
  • Join the MPEG Future initiative, by sending an email to info@mpegfuture.org.


What is MPEG doing these days?

It is now a few months since I last talked about the standards being developed by MPEG. As the group’s dynamics are fast, I think it is time to give an update on the main areas of standardisation: Video, Audio, Point Clouds, Fonts, Neural Networks, Genomic data, Scene description, Transport, File Format and API. You will also find a few words on three explorations that MPEG is carrying out:

  1. Video Coding for Machines
  2. MPEG-21 contracts to smart contracts
  3. Machine tool data.

Video

Video continues to be a very active area of work. New SEI messages are being defined for HEVC, while there is high activity on VVC, which is due to reach FDIS in July 2020. Verification Tests for VVC have not been carried out yet, but the expectation is that VVC will bring an overall video compression factor of about 1000, as can be seen from the following table, where the bitrate reduction of a standard is measured with respect to that of the previous standard (MPEG-1 bitrate reduction is measured with respect to uncompressed video; VVC bitrate reduction is estimated).

Standard          Bitrate reduction   Year
MPEG-1 Video      -98%                1992
MPEG-2 Video      -50%                1994
MPEG-4 Visual     -25%                1999
MPEG-4 AVC        -30%                2003
MPEG-H HEVC       -60%                2013
MPEG-I VVC        -50%                2020

The overall compression factor of about 1000 is obtained by computing the inverse of 0.02 × 0.5 × 0.75 × 0.7 × 0.4 × 0.5 ≈ 0.00105, i.e. roughly 950.
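
As a quick check, here is a minimal Python sketch (not an official MPEG tool, just an illustration of the arithmetic) that chains the per-generation bitrate reductions of the table to reproduce the estimate:

# Chain the bitrate reductions of successive MPEG video coding standards
# to estimate the overall compression factor of VVC-coded vs. uncompressed video.
reductions = [
    ("MPEG-1 Video", 0.98),    # vs. uncompressed video
    ("MPEG-2 Video", 0.50),    # each following entry is vs. the previous standard
    ("MPEG-4 Visual", 0.25),
    ("MPEG-4 AVC", 0.30),
    ("MPEG-H HEVC", 0.60),
    ("MPEG-I VVC", 0.50),      # estimated
]

remaining = 1.0                # fraction of the uncompressed bitrate that is kept
for name, r in reductions:
    remaining *= (1.0 - r)
    print(f"{name:14s} cumulative fraction of uncompressed bitrate: {remaining:.5f}")

print(f"Overall compression factor: ~{1.0 / remaining:.0f}")   # prints ~952, i.e. about 1000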

SEI messages for VVC are now being collected in MPEG-C Part 7 “SEI messages for coded video bitstreams”. The specification of SEI messages is generic in the sense that the transport of SEI messages can be effected either in the video bitstream or at the Systems layer. Care is also taken to make transport of the messages possible with previous video coding standards.

MPEG CICP (Coding-Independent Code-Points) Part 4 “Usage of video signal type code points” has been released. This Technical Report provides guidance on combinations of video properties that are widely used in industry production practices by documenting the usage of colour-related code points and description data for video content production.

MPEG is also working on two more “traditional” video coding standards, both included in MPEG-5.

  1. Essential Video Coding (EVC) will be a video coding standard that addresses business needs in some use cases, such as video streaming, where existing ISO video coding standards have not been as widely adopted as might be expected from their purely technical characteristics. EVC is now being balloted as a DIS. Experts working on EVC are actively preparing the Verification Tests to see how much “addressing business needs” will cost in terms of performance.
  2. Low Complexity Enhancement Video Coding (LCEVC) will be a video coding standard that leverages other video codecs to improve video compression efficiency while maintaining or lowering the overall encoding and decoding complexity. LCEVC is now being balloted as a CD.

MPEG-I OMAF already supports (2018) 3 Degrees of Freedom (3DoF), where a user’s head can yaw, pitch and roll, but the position of the body is static. However, rendering flat 360° video, i.e. supporting head rotations only, may generate visual discomfort especially when rendering objects close to the viewer.

6DoF enables translation movements in horizontal, vertical, and depth directions in addition to 3DoF orientations. The translation support enables interactive motion parallax giving viewers natural cues to their visual system and resulting in an enhanced perception of volume around them.

MPEG is currently working on a video compression standard (MPEG-I Part 12 Immersive Video – MIV) that enables head-scale movements within a limited space. In the article On the convergence of Video and 3D Graphics I have provided some details of the technology being used to achieve the goal, comparing it with the technology used for Video-based Point Cloud Compression (V-PCC). MIV is planned to reach FDIS in October 2020.

Audio

Audio experts are working to leverage MPEG-H 3D Audio to provide a full 6DoF Audio experience, viz. one where the user can localise sound objects in the horizontal and vertical planes, and perceive a sound object’s loudness change as the user moves around it, sound reverberation as in a real room, and occlusion when a physical object is interposed between a sound source and the user.

The components of the system to be used to test proposals are

  • Coding of audio sources: using MPEG-H 3D Audio
  • Coding of meta-data: e.g. source directivity or room acoustic properties
  • Audio and visual presentations for immersive VR worlds (correctly perceiving a virtual audio space without any visual cues is very difficult)
  • Virtual Reality basketball court where the Immersive Audio renderer makes all the sounds in response to the user interaction of bouncing the ball and all “bounce sounds” are compressed and transmitted from server to client.

Evaluation of proposals will be done via

  • Full, real-time audio-visual presentation
  • Head-Mounted Display for “Unity” visual presentation
  • Headphones and “Max 8” for audio presentation
  • Proponent technology will run in real-time in Max VST3 plugin.

Currently this is the longest term MPEG-I project as FDIS is planned for January 2022.

MPEG Immersive Video and Audio share a number of features. The most important is the fact that neither is a “compression standard” in the strict sense: both use existing compression technologies on top of which immersive features are provided by metadata that will be defined by Immersive Video (part 12 of MPEG-I) and Immersive Audio (part 4 of MPEG-I). MPEG-I Part 7 Immersive Media Metadata will specify additional metadata coming from the different subgroups.

Point Clouds

Video-based Point Cloud Compression is progressing fast as FDIS is scheduled for January 2020. The maturity of the technology, suitable for dense point clouds (see, e.g. https://mpeg.chiariglione.org/webtv?v=802f4cd8-3ed6-4f9d-887b-76b9d73b3db4) is reflected in related Systems activities that will be reported later.

Geometry-based Point Cloud Compression, suitable for sparse point clouds (see, e.g. https://mpeg.chiariglione.org/webtv?v=eeecd349-61db-497e-8879-813d2147363d) is following with a delay of 6 months, as FDIS is expected for July 2020.

Fonts

MPEG is extending MPEG-4 Part 22 Open Font Format with an amendment titled “Colour font technology and other updates”.

Neural Networks

Neural Networks are a new data type. Strictly speaking, MPEG is addressing the compression of neural networks trained for multimedia content description and analysis.

NNR, as MPEG experts call it, has taken shape very quickly. First aired and discussed at the October 2017 meeting, a Call for Evidence (CfE) was issued in July 2018 and a Call for Proposals (CfP) in October 2018. Nine responses were received at the January 2019 meeting, which enabled the group to produce the first working draft in March 2019. A very active group is working to produce the FDIS in October 2020.

Read more about NNR at Moving intelligence around.

Genomic data

With MPEG-G parts 1-3 MPEG has provided a file and transport format, compression technology, metadata specifications, protection support and standard APIs for the access of sequencing data in the native compressed format. With the companion parts 4 and 5 (reference software and conformance), due to reach FDIS level in April 2020, MPEG will provide a software implementation of a large part of the technologies in parts 1 to 3 and the means to test an implementation for conformity to MPEG-G.

January 2020 is the deadline for responding to the Call for Proposals on Coding of Genomic Annotations. The call responds to the need of most biological studies based on sequencing protocols to attach different types of annotations, all associated with one or more intervals on the reference sequences, resulting from so-called secondary analyses. The purpose of the call is to acquire technologies that allow a compressed representation of such annotations.

Scene description

MPEG’s involvement in scene description technologies dates back to 1996 when it selected VRML as the starting point for its Binary Format for Scenes (BIFS). MPEG’s involvement continued with MPEG-4 LASeR, MPEG-B Media Orchestration and MPEG-H Composition Information.

MPEG-I, too, cannot do without a scene description technology. As in the past, MPEG will start from an existing specification – glTF2 (https://www.khronos.org/gltf/) – selected because it is open, extensible and widely supported, with many loaders and exporters, and because it allows MPEG to extend its capabilities for audio, video and point cloud objects.

The glTF2-based Scene Description will be part 14 of MPEG-I.
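
To give a feel for what such an extension could look like, here is a minimal Python sketch that writes a bare glTF2 document with a hypothetical MPEG extension attached to a node; the extension name and its fields are illustrative assumptions, not the actual MPEG-I Part 14 syntax:

import json

# Minimal glTF2 document with a hypothetical MPEG extension on one node.
# "MPEG_media" and its fields are placeholders, not the normative syntax.
gltf = {
    "asset": {"version": "2.0"},
    "extensionsUsed": ["MPEG_media"],          # hypothetical extension name
    "scene": 0,
    "scenes": [{"nodes": [0]}],
    "nodes": [
        {
            "name": "video_screen",
            "extensions": {
                "MPEG_media": {                # hypothetical reference to a timed media source
                    "uri": "screen_video.mp4",
                    "mimeType": "video/mp4"
                }
            }
        }
    ],
}

with open("scene.gltf", "w") as f:
    json.dump(gltf, f, indent=2)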

Transport

Transport is a fundamental function of real-time media and MPEG continues to develop standards, not just for its own standards, but also for JPEG standards (e.g. JPEG 2000 and JPEG XS). This is what MPEG is currently doing in this vital application area:

  1. MPEG-2 part 1 Systems: a WD of an amendment on Carriage of VVC in MPEG-2 TS. This is urgently needed because broadcasting is expected to be a good user of VVC.
  2. MPEG-H part 10 MMT FEC Codes: an amendment on Window-based Forward Error Correcting (FEC) code
  3. MPEG-H part 13 MMT Implementation Guidelines: an amendment on MMT Implementation Guidelines.

File format

The ISO Base Media File Format is an extremely fertile standards area that extends over many MPEG standards. This is what MPEG is doing in this vital application area:

  1. MPEG-4 part 12 ISO Base Media File Format: two amendments on Compact movie fragments and EventMessage Track Format
  2. MPEG-4 part 15 Carriage of NAL unit structured video in the ISO Base Media File Format: an amendment on HEVC Carriage Improvements and the start of an amendment on Carriage of VVC, a companion of Carriage of VVC in MPEG-2 TS
  3. MPEG-A part 19 Common Media Application Format: the start of an amendment on Additional media profile for CMAF. The expanding use of CMAF prompts the need to support more formats
  4. MPEG-B part 16 Derived Visual Tracks in ISOBMFF: a WD is available as a starting point
  5. MPEG-H part 12 Image File Format: an amendment on Support for predictive image coding, bursts, bracketing, and other improvements to give HEIF the possibility to store predictively encoded video
  6. MPEG-DASH part 1 Media presentation description and segment formats: start of a new edition containing CMAF support, events processing model and other extensions
  7. MPEG-DASH part 5 Server and network assisted DASH (SAND): the FDAM of Improvements on SAND messages has been released
  8. MPEG-DASH part 8 Session based DASH operations: a WD of Session based DASH operations has been initiated
  9. MPEG-I part 2 Omnidirectional Media Format: the second edition of OMAF has started
  10. MPEG-I part 10 Carriage of Video-based Point Cloud Compression Data: currently a CD.

API

This area is increasingly being populated with MPEG standards:

  1. MPEG-I part 8 Network-based Media Processing is on track to become FDIS in January 2020
  2. MPEG-I part 11 Implementation Guidelines for NBMP is due to reach TR stage in April 2020
  3. MPEG-I part 13 Video decoding interface is a new interface standard to allow an external application to provide one or more rectangular video windows from a VVC bitstream.

Explorations

Video Coding for Machines

MPEG is carrying out explorations in areas that may give rise to future standards: 6DoF, Dense Light Fields and Video Coding for Machines (VCM). VCM is motivated by the fact that, while traditional video coding aims to achieve the best video/image quality under given bitrate constraints with humans as the consumption target, the sheer quantity of data produced, or to be produced, by connected vehicles, video surveillance, smart cities etc. makes the traditional human-oriented scenario inefficient and unrealistic in terms of latency and scale.

Twenty years ago the MPEG-7 project started the development of a comprehensive set of audio, video and multimedia descriptors. Later parts of MPEG-7 have added further standard descriptions of visual information for search and analysis applications. VCM may leverage that experience and frame it in the new context of the expanded use of neural networks. Those interested can subscribe to the Ad hoc group on Video Coding for Machines at https://lists.aau.at/mailman/listinfo/mpeg-vcm and participate in the discussions at mpeg-vcm@lists.aau.at.

MPEG-21 Based Smart Contracts

MPEG has developed several standards within the MPEG-21 media ecommerce framework addressing the issue of digital licences and contracts. Blockchains can execute smart contracts, but is it possible to translate an MPEG-21 contract to a smart contract?

Let’s consider the following use case, where Users A and B utilise a Transaction system that interfaces with a Blockchain system and a DRM system. If the transaction on the Blockchain system is successful, the DRM system authorises User B to use the media item.

The workflow is

  1. User A writes a CEL contract and a REL licence and sends both to User B
  2. User B sends the CEL and the REL to a Transaction system
  3. Transaction system translates CEL to smart contract, creates token and sends both to Blockchain system
  4. Blockchain system executes smart contract, records transaction and notifies Transaction system of result
  5. If notification is positive Blockchain system translates REL to native DRM licence and notifies User A
  6. User A sends media item to User B
  7. User B requests DRM system to use media item
  8. DRM system authorises User B

In this use case, Users A and B can communicate using the standard CEL and REL languages, while the Transaction system is tasked to interface with the Blockchain system and the DRM system.

A standard way to translate MPEG-21 contracts to smart contracts will assure users that the smart contract executed by a blockchain corresponds to the human-readable MPEG-21 contract.
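
As a purely illustrative sketch (no standard mapping exists yet), the Python fragment below shows how a Transaction system might map a drastically simplified CEL contract onto the parameters of a hypothetical payment-for-licence smart contract; the clause names and parameters are assumptions made for the example:

from dataclasses import dataclass

# Illustrative only: a simplified CEL contract reduced to the clauses needed
# by the use case above (who licenses what to whom, at what price).
@dataclass
class CelContract:
    licensor: str      # User A
    licensee: str      # User B
    media_item: str    # identifier of the Digital Item being traded
    price: int         # agreed payment, in the blockchain's smallest unit

def cel_to_smart_contract_params(cel: CelContract) -> dict:
    # Deterministic mapping of CEL clauses onto the parameters of a
    # hypothetical "payment for licence" smart contract template (step 3).
    return {
        "template": "payment_for_licence",
        "payer": cel.licensee,
        "payee": cel.licensor,
        "asset": cel.media_item,
        "amount": cel.price,
    }

if __name__ == "__main__":
    cel = CelContract("0xUserA", "0xUserB", "urn:mpeg:item:movie-001", 10**16)
    print(cel_to_smart_contract_params(cel))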

Those interested in exploring this topic can subscribe to the Ad hoc group on MPEG-21 Contracts to Smart Contracts at https://lists.aau.at/mailman/listinfo/smart-contracts and participate in the discussions at smart-contracts@lists.aau.at.

Machine tools data

Mechanical systems are becoming more and more sophisticated in terms of functionalities but also in terms of capability to generate data. Virgin Atlantic says that a Boeing 787 may be able to create half a terabyte of data per flight. The diversity of data generated by an aircraft makes the problem rather challenging, but machine tools are less complex machines that may still generate 1 Terabyte of data per year. The data are not uniform in nature and can be classified into 3 areas: Quality control, Management and Monitoring.

Data are available to test what it means to process machine tool data.

Other data

MPEG is deeply engaged in compressing two strictly non-media data types: genomic data and neural networks, even though the latter is currently considered as a compression add-on to multimedia content description and analysis. It is also exploring compression of machine tool data.

The MPEG work plan

The figure graphically illustrates the current MPEG work plan. Dimmed coloured items are not (yet) firm elements of the work plan.

 


MPEG is a big thing. Can it be bigger?

Introduction

Having become the enabler of a market of devices and services worth 1.5 T$ p.a., MPEG is a big achievement, but is that a climax or the starting point aiming at new highs?

This is a natural question to ask for a group that calls itself “MPEG Future”. The future is still to be written and the success of MPEG will largely depend on the ability of those who attempt to write it.

This article will try and analyse the elements of an answer through the following steps:

  1. The MPEG machine as it has been running for several years
  2. The key success factors
  3. The situation today
  4. What’s next to make MPEG bigger.

The MPEG machine: input, processing and output

To understand if MPEG can be bigger, the first thing to do is to understand how MPEG could reach this point. I will start by considering a simplified model of the MPEG standards ecosystem that, I believe, is what has made MPEG the big thing that we know (Figure 1).

Figure 1: The current MPEG standard ecosystem

The MPEG machine has three iterative phases of operation

  1. MPEG receives inputs from 3 sources:
    1. MPEG members
    2. Partners, i.e. committees who may be interested in developing a joint standard
    3. Customers, typically committees or industry associations who may need a standard in an area for which MPEG has expertise.
  2. MPEG processes inputs and may decide
    1. To start an exploration by studying use cases and requirements, or by exploring technologies
    2. To develop a new MPEG standard, if the exploration is successful
    3. To extend or correct an existing standard.
  3. MPEG generates outputs
    1. To the industry at large, by announcing its work plan, milestones of the work plan such as Calls for Proposal, events such as workshops, results of verification tests etc.
    2. To communicate to partners and customers about how MPEG is handling their inputs or to seek their opinion or to propose new initiatives
    3. To inform partners and customers of the progress of its standard and eventually making standards available.

MPEG’s key success factors

These are the main success factors of MPEG’s handling of the business of standardisation.

  1. Search for customers. MPEG started with the vision of “digital media standards for a global market” but it did not have – and still does not have – a “constituency” whose interest it was expected to further. It assembled the expertise required to implement its vision, but needed to find buyers for its standards. Finding customers is the main element of MPEG’s DNA.
  2. Customer care. Each industry has its own requirements, some shared with others. MPEG needed to find both the common denominator of requirements and the industry-specific requirements, and to design solutions where all industries could operate unencumbered by the requirements of others.
  3. Integrated standards. MPEG has been able to develop complete digital media solutions without leaving its customers struggling with the task of making different pieces from different sources work together. Still, the individual parts can be used on their own.
  4. The role of research. MPEG has been a magnet attracting the best researchers in digital media from both industry and academia and is influencing many research programs.
  5. New customers without losing old ones. With MPEG-2, MPEG had “acquired” the broadcasting industry, but with MPEG-4 it acquired the IT and mobile industry. MPEG succeeded in providing the same standards to both industries even though they were more and more in competition. This has continued in other areas of MPEG standardisation.
  6. Strategic plans. MPEG has developed its program of work through collaboration with its client industries. MPEG does not have a centralised “strategic planning” function, but this function is part of its modus operandi.
  7. Business model. Companies participating in MPEG know that good technologies are rewarded by royalties and that they should invest in technologies for future standards.

The situation today

The MPEG operation has seen many years of successes, but the context has greatly changed.

  1. MPEG has an impressive portfolio of standards actively used by a global number of loyal customers.
  2. The media industry has greatly expanded in scope and its members are becoming more diverse.
  3. MPEG values the requirements of its customers but there are so many technologies fighting for dominance
  4. There is an increasing percentage of MPEG members who come from research/academia or are NPEs
  5. Acquiring new customers in new areas is getting more and more onerous
  6. Many work items in the strategic plan heavily depend on technologies whose development path is unclear
  7. The MPEG business model is still an asset, but may no longer serve the needs of a significant part of MPEG customers.

And now, what?

MPEG Future strives to facilitate the creation of a new environment that will enable the development of standards for media compression and distribution and their adoption for ever more pervasive media-related user experiences.

What should be the principal axes of the new MPEG age advocated by MPEG Future?

Technology? Sure, mastering technology for top-performing standards remains important to MPEG, but commonality and synergies of technologies are not the issue. MPEG has a large and dedicated group of experts who explore the implications of new technologies (more is better, but it is not the issue).

Market? Sure, market is important. Companies may be reluctant to talk too much about market but making standards that are driven by research and academia is not the way to go.

The principal axis of the next phase of MPEG work should focus on how market players want to package technologies – that MPEG Future obviously advocates to be standard – to serve market needs.

MPEG Future envisages that a new group called Market Needs be created in MPEG, in its new status of subcommittee, next to the existing Technical Requirements group. The latter should continue to explore the technology side of new ideas, while the former should assess the relevance to market reality of the new ideas enabled by technology. The new form of the MPEG standards ecosystem is depicted in Figure 2.

Figure 2: The new MPEG standard ecosystem

There are two main challenges for the new MPEG standards ecosystem:

  1. A Market Needs group populated with industry leaders
  2. A modus operandi where inputs from Market Needs to Technical Requirements enrich the technical exploration, and results from Technical Requirements to Market Needs are used to strengthen the market value of a new idea.

MPEG is well placed to create an effective Market Needs group because of its network of partners and customers and is well placed to extend its modus operandi with an effective Market Needs – Technical Requirements interaction. After all MPEG has spent its last 30 years incorporating new communities. The Market Needs community is of a new type, but this makes the challenge all the more enticing…
