Design and build the future of MPEG

Established in 1988, the Moving Picture Experts Group (MPEG) has produced some 180 standards in the area of efficient distribution on broadcast, broadband, mobile and physical media, and consumption of digital moving pictures and audio, both conventional and immersive, using analysis, compression and transport technologies. It has also produced 5 standards for efficient storage, processing and delivery of genomic data.

With its membership of ~1500 registered and ~600 attending experts, MPEG has produced more standards than any other JTC 1 subcommittee. The standards enable a manufacturing and service business worth more than 1.5 T$ p.a.

We expect that in the future industry will require more MPEG standards in the following areas:

  1. Rectangular video coding. As ~80% of world communication traffic is video-based, new, more efficient video coding standards will continue to be of high interest. These will likely incorporate new video compression technologies, driven by extensive research in both the algorithm and hardware domains, e.g. in the area of Machine Learning. The market will also demand standards offering a higher-quality user experience, such as high-fidelity near-live video coding.
  2. Audio coding. As for video, future audio coding standards could follow in the wake of the MPEG Audio coding family, be Machine Learning based, or be a mix of the two. High-fidelity near-live audio coding standards will also be required to provide new forms of user experience.
  3. Immersive media will require coding standards that allow a user to move unrestricted in a 3D space, standards for Point Clouds for human and machine use, for realistic meshes and textures, and for light fields. Immersive audio coding standards providing an even more intensely satisfying user experience than the MPEG-I Audio standard currently being developed will also be required.
  4. 3D scene representation standards will be needed to recreate realistic 3D scenes populated by media objects encoded with other MPEG immersive media coding standards. It is expected that standards for effective human interaction with media will also be required.
  5. Quality of experience will likely differentiate the market of products and services through the use of sophisticated technologies that reduce the amount of information needed to transmit future media types. MPEG must support the manufacturing and service industries by offering standards with features beyond those of the currently available standards, such as machine analytics for quality measurement.
  6. Video Coding for Machines (VCM) is a new area of endeavour primarily designed to serve remotely located machines and, potentially, human users as well. Audio Coding for Machines will likely be a necessary complement to VCM. Machine vision standards, e.g. for object classification, will continue the tradition of CDVS (image search) and CDVA (video analysis) standards and will mostly be based on similar technologies as VCM. Internet of Media Things standards will also serve as a framework for the above standards.
  7. Application formats will be required by industry to facilitate the use of the wide range of available and future MPEG basic media coding standards by configuring MPEG technologies to suit application needs.
  8. Transport of compressed media standards have always enabled the use of MPEG media coding standards, and we expect the need for them will continue. Ways to cope with emerging constraints on video and audio delivery over the internet, such as those witnessed during the Covid-19 emergency, should be considered. New networks, e.g. 5G and other future networks, will likely drive new requests, and MPEG should actively work together with the relevant bodies.
  9. Compression of genomic data, initiated 5 years ago in collaboration with TC 276, has shown that MPEG technologies could be successfully applied to new non-media areas. Standards will continue to be produced to respond to this new client industry.
  10. MPEG has realised that Compression of other data is a broad new area that can benefit from its approach to standardisation in support of new industries such as Medical, Industry 4.0, Automotive, etc. As MPEG obviously does not have domain-specific competence in these fields, it plans to identify the needs and develop standards for these new areas jointly with the relevant committees.

The above is an ambitious program of work in an expanding area of standardisation that heavily depends on sophisticated and evolving technologies. MPEG is confident it will be able to execute this program because it has learnt how to foster, and has actually fostered, innovation with its standards, and because, since its early days, it has innovated the way a standards committee operates, developing a set of procedures and operations that MPEG calls its “modus operandi”.

Below we will examine how MPEG’s “modus operandi” will be exercised and adapted to the new challenges of the work program.

  • No one controls MPEG
  • Connection with industries
  • Connection with academia
  • Technology-driven standards meeting market needs
  • Standards with and for other committees
  • Plans for future common standards
  • Standard (co-)development
  • Effectiveness of MPEG role
  • Role of ICT tools in standardisation

No one controls MPEG. The key differentiating element is that no entity or industry “controls” MPEG. At all levels, proposals follow a bottom-up path and are accepted, not because of their origin but because of their value, by a mostly flat, internally dynamic organisation that fosters the constructive flow and interaction of ideas. The management monitors the activities of the different organisational units, identifies points of convergence, interaction or contention, and creates opportunities to resolve issues. This element of the modus operandi must be preserved.

Connection with industries. The success of MPEG standards is largely based on the establishment of efficient connections with the client industries. MPEG will continue to interact with them, providing solutions to their customers’ needs. However, we expect that formal liaisons may soon become insufficient because MPEG standards heavily depend on rapidly evolving technologies, such as capture and presentation, which have a strong impact on its standards but lie outside the MPEG purview. MPEG will need to identify and activate relationships with the industries that are developing the new media technologies that will drive the development of its new standards. At the same time MPEG should enhance its relationships with the content industry that provides the ultimate user experience using MPEG standards, again over networks with varying capabilities and for devices with graded tiers of computational complexity.

Connection with academia. Another reason for the success of MPEG standards is MPEG’s strong links with academia and research organisations, a constituency that accounts for ~¼ of its total membership and contributes to sustaining the vitality of its standards activities. MPEG has always conducted explorations of promising new technologies before standardisation can begin. However, the number of directions that require exploration is growing fast and MPEG resources may not be sufficient in the future to assess the potential of all candidate technologies. MPEG will need to establish more organic links with academia and create more opportunities to stimulate, get input from and possibly collaborate with academia.

Technology-driven standards meeting market needs. MPEG’s modus operandi has been inspired by applications as supported by technology, and we think that in the future this must continue. However, it is imperative that this be complemented by a form of internal competition from new MPEG members who are market-driven. The goal is to strike the right balance between technology-push and market-pull, i.e. to be able to identify targets that are both technology-driven and attractive to the market. MPEG should create a market needs group that dialectically interacts with the requirements group to fortify proposals for new work, keeping timely delivery of its standards as a high-priority issue. MPEG should better assert its active presence on all aspects of the standardisation workflow: industry, market, research and academia.

Standards with and for other committees. MPEG’s modus operandi has allowed MPEG to manage the development of technology-heavy standards, some of which have involved hundreds of person-years. Most standards have been developed with internal resources. However, other standards have been developed jointly with entities outside MPEG at different levels of integration: e.g. with ITU-T SG 16, with SC 24 for augmented reality, with ISO TC 276 for genomics, etc. Several MPEG standard extensions have been developed in response to the needs of other committees (e.g. JPEG, 3GPP, SCTE, etc.). This fluid interaction must be enhanced by broadening MPEG’s operation, as media technologies interact more and more with other technologies in products and services and the use of compression spreads to more domains.

Plans for future common standards. We envisage the need for MPEG to continue to develop common standards with other committees, in particular with ITU-T SG 16 on video coding and with TC 276 Biotechnologies on genomic data compression. Augmented Reality will provide opportunities to develop common standards with JTC 1/SC 24. We also see more work with other committees as an essential component of MPEG’s future. Examples are JTC 1/SC 41 on a new phase of Internet of Media Things (IoMT), JTC 1/SC 42 on Machine Learning for compression and Big Data, starting from MPEG’s Network-Based Media Processing (NBMP) standard, and TC 215 Health Informatics on data compression. In a not-so-distant future MPEG should seek new standardisation opportunities in the areas of Industry 4.0 and automotive. It is an ambitious program that can be implemented using MPEG’s experience and representation.

Standard development models. MPEG has learned that effective standardisation can be achieved if experts work shoulder to shoulder. How can this be achieved with an expanded scope of work and an increased number of collaborating parties? One answer lies in the experience MPEG has gathered from its long history of online collaborations and the fully online 130th meeting attended by ~600 experts. In normal conditions, a large part of MPEG meetings will continue to be held face to face. However, many standards can and should be developed jointly with other organisations mostly via online meetings. These can be easily integrated with the rest of the MPEG workflow because they deal with specific topics.

Effectiveness of MPEG role. MPEG will need to develop new metrics, beyond the number of standards produced (MPEG is already the largest producer of standards in JTC 1), and apply them on a regular basis to assess the effectiveness of its role. While MPEG standards continue to have loyal customers, their importance in other domains is less significant. The past is often so sticky that MPEG must make sure that its past successes do not make it an “also-ran” in the future. This does not mean that MPEG must renege on its practices, but that it must adapt its business model to the new challenges. MPEG needs to reassert its mission to develop the best standards satisfying agreed requirements, thus continuing to provide opportunities to remunerate good IP for good standards. At the same time, MPEG should develop a strategy to handle the growing competition to its standards. One way is to enhance the role of reference software as an equivalent normative version of MPEG standards. Another is to envisage conditions that may facilitate expeditious development of licensing terms by the market.

Role of ICT tools in standardisation. Many aspects of the MPEG modus operandi rely heavily on ICT support. In 1995 MPEG was probably the first committee of its size and structure to adopt electronic document upload and distribution. In the last 15 years MPEG has had the privilege of benefiting from the support of Institut Mines-Télécom, and especially of Christian Tulvan, who has continuously designed new solutions that allow MPEG to be a fluid and flat organisation where every member can contribute by knowing what happens where. MPEG expects to be able to rely on this unique support to implement more features that will enhance the quality and quantity of its standards through enhanced and qualified participation.

Another view of MPEG’s strengths and weaknesses

Introduction

In No one is perfect, but some are more accomplished than others I have started a 360° analysis of MPEG’s strengths, weaknesses, opportunities and threats, considering the Context in which MPEG operates, the Scope of MPEG standards and the Business model of the industries using MPEG standards. Subsequently, in More MPEG Strengths, Weaknesses, Opportunities and Threats I have considered the Membership, the organisational Structure and the Leadership.

In this article I would like to continue the SWOT analysis and talk about MPEG’s Client industries, Collaboration with other bodies and the Standards development process.

Client industries

Strengths

MPEG can claim as a strength something that others may consider a weakness. When MPEG was created, MPEG did not have a reference industry. It did create one, little by little, starting from Telecom and Consumer Electronics (CE), the first two industries whose members supported the MPEG-1 standard. Broadcasting became another client industry with MPEG-2 and IT another with MPEG-4. Industry participation followed with mobile telecom, VLSI and more.

Why do I call this a strength? Because MPEG did not have a “Master” dictating what it should do, but many masters. With this MPEG was able to develop standards that were abstracted from the specific needs of one industry and was able to concentrate on what was common to all industries served.

MPEG was always very aggressive in looking for customers. When you do something it is always better to talk with those who have clear ideas about what that something can do for them. This effort to reach out has paid off because the standards were “better” (they provided the intended features) and the customers were ready to adopt them. 3GPP, ARIB, ATSC, BDA, CTA, DVB, ETSI, ITU-T/R and TTA are just some of the organisations MPEG has developed standards for.

The result is that today MPEG has an array of industries following its lead.

Weaknesses

Success is a hard master. If you don’t pile up more successes than before you seem to fail.

The MPEG-4 Visual licence signalled to the IT world that the business model was still driven by the old paradigm and that IT companies needed to find something else more familiar to them. The web was the perfect place where this could happen and did happen.

Interestingly, the old categories of Telecom, CE, Broadcasting and IT gradually came to lose their value: Telecom had a lot to share with IT and needed CE, and CE continued to serve Broadcasting with its devices, but added mobile and morphed into something hard to distinguish from a manufacturer of IT devices.

MPEG had its hands largely tied in this global transformation and the result is that a significant proportion of what MPEG can claim to be its client industry is not (or is no longer) following its lead.

Opportunities

Still, MPEG is appreciated by many non-ISO industries, less so by organisationally closer ones, because in the organisation you are considered for your level, but in industry you are considered for what you are.

Therefore MPEG has the opportunity not only to recover old client industries, but also to acquire new ones.

Threats

The most challenging threat is the breaking up of the MPEG ecosystem into a set of more or less independent components. This would be a big loss because today industry knows it can come to the MPEG “emporium” and get the integrated technologies it needs. With MPEG broken up into pieces, there would be strengthened competition from individual environments that can offer the same individual pieces that a broken-up MPEG can offer.

The threat to MPEG can also come from industries abandoning MPEG not because its standards are not the best, but because MPEG standards cannot compete on other features.

Another threat comes from the possibility that industry consortia build their own specifications by aggregating MPEG building blocks rather than proposing a new standard to MPEG.

Collaboration with other bodies

Strengths

MPEG has always had an open attitude to collaboration with other bodies operating in neighbouring fields. 3GPP, AES, IETF, ITU-T SG 16, Khronos, SCTE, SMPTE, VRIF and W3C are just some examples.

We have had many collaborations with:

  • ITU-T SG 16: MPEG-2 Systems and Video, MPEG-4 AVC, MPEG-H HEVC, MPEG-I VVC, and Coding Independent Code Points (CICP)
  • TC 276/WG 5: File Format, Genome Compression, Metadata and API, with Reference Software and Conformance due in a week’s time
  • SC 24: Augmented Reality Reference Model

MPEG has developed many standards for other committees:

  • JPEG: developed the file format for JPEG and JPEG 2000, developed the MPEG-2 transport for JPEG 2000 and JPEG XL
  • 3GPP: DASH was originally suggested by 3GPP and developed in close consultation with this and other committees.

The broad scope of MPEG standards offers more opportunities to collaborate and to provide better and better-focused standards.

More about MPEG collaboration on standard development can be found at Standards and collaboration.

Weaknesses

The lowly WG status reduces MPEG’s ability to deal with other bodies authoritatively.

Opportunities

MPEG is not among those who think they can do everything useful by themselves. The times when devices and services were controlled by one major technology are long gone. MPEG technologies are present in a large number of devices, but to be useful those devices need to integrate other key technologies as well.

Therefore more collaborations are vital to the success of MPEG standards:

  1. MPEG has an Internet of Media Things (IoMT) standard that may have dependencies on what JTC 1/SC 41 IoT is doing
  2. JTC 1/SC 42 is in charge of Artificial Intelligence (AI). Neural Networks are an important AI technology. MPEG is working on a Neural Network Compression standard
  3. MPEG has developed Network Based Media Processing, an implementation of the Big Data Reference Model developed by SC 42
  4. MPEG is extending the glTF™ (GL Transmission Format) 3D scene description specification developed by Khronos

Collaboration is a great tool to achieve a goal based on common interests with other committees.

Threats

Fewer collaborations are a threat because they mean fewer opportunities to develop good standards that extend to other fields!

Standards development process

Strengths

The process to develop standards is a major MPEG strength. It is a thorough implementation of the ISO/IEC process with major extensions: technology is sought with Calls for Evidence (CfE) and Calls for Proposals (CfP), the standards are developed with Core Experiments (CE) and the value of the standards is proven with Verification Tests (VT).

The figure illustrates the major elements of the MPEG standard development process.

Some industry participants complain about the ISO process to develop and approve standards. For MPEG that process is an excellent tool that allows the development of better standards, not a bureaucratic imposition.

Weaknesses

In an otherwise positive context there is one weakness. A major value of MPEG standards lies in the fact that standards are typically multi-part and that the parts have been designed to interact with one another. MPEG has a lean organisation that promotes focus on technical matters and holds chairs’ coordination meetings that monitor the progress of work and identify areas of interaction or contention between different technical areas. The figure represents the interaction between and among different parts of a standard.

In spite of the essential role played by coordination, this function has an ad hoc nature, which is a weakness.

Another weakness is the fact that at meetings there are so many interesting things happening but not enough time to follow them all.

Opportunities

There are great opportunities to improve the MPEG standard development process. Institut Mines-Télécom has lent its ICT support to MPEG and Christian Tulvan has done – and is still doing – a lot to provide more tools. MPEG is facing the current Covid-19 emergency with new ICT tools provided by Christian.

It would be great if the MPEG ICT tools could be integrated in the ISO IT platforms.

Threats

The big threat is the breaking up of the MPEG ecosystem.

An “unexpected” MPEG media type: fonts

Introduction

These days MPEG is adapting its working methods to cope with the constraints that the Covid-19 pandemic places on physical meetings. But this is not the first time that MPEG has done work, and excellent work at that, without the help of physical meetings. This article will talk about this first and indispensable attempt at developing standards outside of regular MPEG meetings.

Compression is a basic need

In the early 2000s, MPEG received an unexpected, but in hindsight obvious, proposal. MPEG-4 defines several compressed media types – audio, video and 3D graphics are obvious choices – but also the technology to compose the different media. Text information, presented in a particular font, is more and more an essential component of a multimedia presentation, and the font data must be accessible with the text objects in the multimedia presentation. However, as thousands of fonts are available today for use by content authors, and only a few fonts (or just one) may be available on the devices used to consume media content, there is a need to compress and stream font information.
So MPEG began to work on the new “font” media type and in 2003 produced the new standard: MPEG-4 Part 18 – Font compression and streaming, which relied on OpenType as the input font data format.

Fonts need more from MPEG

Later MPEG received another proposal from Adobe and Microsoft, the original developers of OpenType, a format for scalable computer fonts. By that time, OpenType was already in wide use in the PC world and was seen as an attractive technology for adoption by consumer electronics and mobile devices. The community using it kept requesting support for new features, and many new usage scenarios had emerged. As MPEG knows only too well, responding to those requests from different constituencies required a significant investment of resources. Adobe and Microsoft asked MPEG if it could do that. I don’t have to tell you what the response was.

Not all technical communities are the same

A new problem arose, though. The overlap of the traditional MPEG membership with the OpenType community was minimal, save for Vladimir Levantovsky of Monotype, who had brought the font compression and streaming proposal to MPEG. So a novel arrangement was devised. MPEG would create an ad hoc group that would do the work via online discussions and, once the group had reached consensus, the result of the discussions, in the form of a “report of the ad hoc group”, would be brought by Vladimir to MPEG. The report would be reviewed by the Systems group and the result integrated in the rest of the MPEG work, as is done with all other subgroups. The report would often be accompanied by additional material to produce a draft standard that would be submitted by MPEG to National Body voting. Dispositions of comments and new versions of the standard would be developed within the ad hoc group and submitted to MPEG for action.
This working method has been able to draw on the vast and dispersed community of language and font experts, who would never have been willing or able to attend physical meetings, to develop, maintain and extend the standard for 16 years, with outstanding results. MPEG has produced four editions of the standard, called Open Font Format (OFF). As always, each edition of the standard has been updated by several amendments (extensions to the standard) before a new edition was created.

What does OFF offer to the industry? 

High quality glyph scaling
A glyph is a unit of text display that provides the visual representation of a character, such as the Latin letter a, the Greek letter α, the Japanese hiragana letter あ or the Chinese character 亞. In an output device, a scaled outline of a glyph is represented by dots (or pixels) in a process called rasterization. The approximation of a complex glyph outline with dots can lead to significant distortions, especially when the number of dots per unit length is low.

OFF fonts introduced a programmatic approach: a technology called “hinting” deals with these inconsistencies through additional instructions encoded in the font. The OFF standard helped overcome the obstacles by making proprietary font hinting and other patented technologies easily accessible to implementers, and enabled multiple software and device vendors to obtain Fair, Reasonable and Non-Discriminatory (FRAND) licences.
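
To make the scaling problem concrete, here is a minimal sketch (in Python, not taken from the OFF specification) of how glyph control points expressed in font design units are scaled to device pixels and then snapped to the pixel grid; the units-per-em value, point size, dpi and stem coordinates are all invented for illustration.

  # Illustrative sketch, not part of the OFF standard: scale glyph control
  # points from font design units to device pixels and snap to the pixel grid.
  # All numeric values below are invented for illustration.

  def scale_outline(points, units_per_em=2048, point_size=10, dpi=96):
      """Scale (x, y) control points from font units to (sub-pixel) pixel units."""
      ppem = point_size * dpi / 72.0      # pixels per em at this size
      scale = ppem / units_per_em         # font units -> pixels
      return [(x * scale, y * scale) for (x, y) in points]

  def rasterize_naively(points):
      """Snap scaled points to whole pixels (the step that hinting corrects)."""
      return [(round(x), round(y)) for (x, y) in points]

  # Two vertical stems designed to be equally thick (300 font units each).
  stems = [(0, 0), (300, 0), (1000, 0), (1300, 0)]
  scaled = scale_outline(stems)
  print(scaled)                     # exact sub-pixel positions
  print(rasterize_naively(scaled))  # after rounding, one stem is 2 px wide, the other 1 px

At small sizes the rounded stem widths come out unequal even though the design widths are identical; this is exactly the kind of inconsistency that hint instructions are meant to repair.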

The figure illustrates the connection of the old and new worlds, i.e. the transformation of analog to digital. The curved arrows on the right side of the picture illustrate the effects of applying hint instructions to glyph control points, and their interdependencies.

Leveraging Unicode

The Unicode Standard is a universal character encoding system that was developed by the Unicode Consortium to support the interchange, processing, and display of the written text in many world languages – both modern (i.e. used today) and historic (so that conversion of e.g. ancient books and document archives into digital format is possible).

The Unicode Standard follows a set of fundamental principles, among them the separation of text content from text display. This goal is accomplished by introducing a universal repertoire of code points for all characters, and by encoding plain-text characters in their logical order, without regard for specific writing system, language, or text direction. As a result, the Unicode Standard enables text sequences to be editable, searchable, and machine-readable by a wide variety of applications.

In short, the Unicode encoding defines the semantics of text content and the rules for its processing; to visualize text and make it human-readable we need fonts that provide all the missing elements required for text display. Glyph outlines, character-to-glyph mapping, text metrics and language-specific layout features for dynamic text composition are only some (but not all) of the data elements encoded in an OFF font. It is the combination of font information with Unicode-compliant character encoding that makes it possible for all computer applications to “speak” any of the world’s languages without effort!
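
As a small illustration of this separation of content from display (a Python sketch; the character-to-glyph mapping below is invented, not taken from any real font), the text itself is just a sequence of Unicode code points in logical order, while rendering it requires a font’s character-to-glyph mapping of the kind carried in an OFF font:

  # Content layer vs display layer (illustrative only; the cmap is invented).
  text = "Déjà vu あ 亞"

  # Content: logical sequence of Unicode code points, independent of any font.
  code_points = [f"U+{ord(ch):04X}" for ch in text]
  print(code_points)        # searchable, editable and machine-readable as-is

  # Display: a hypothetical character-to-glyph mapping from one particular font.
  hypothetical_cmap = {0x0044: 37, 0x00E9: 142, 0x3042: 901, 0x4E9E: 2304}
  glyph_ids = [hypothetical_cmap.get(ord(ch)) for ch in text]
  print(glyph_ids)          # None where this particular font has no glyph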

Advanced typography

OFF fonts offer a number of key ingredients that enable advanced typographic features, including support for certain features (e.g. ligatures and contextual glyph substitution) that are mandatory for certain languages, and a number of discretionary features. These include support for stylistic alternates, colour fonts for emoji, and many other advanced features that are needed to support bi-directional text, complex scripts and writing systems, layout and composition of text when mixing characters from different languages … the list can go on!

Lately, the addition of variable fonts has made it possible both to improve the efficiency of encoding of font data (e.g. by replacing multi-font families with a single variable font that offers a variety of weight, width and style choices) and to introduce new dynamic typographic tools that revolutionized the way we use fonts on the web.

Font Embedding
As I mentioned earlier, thousands of fonts are available for use in media presentations, and making a particular font choice is solely the author’s right and responsibility. Media devices and applications must respect all authors’ decisions to ensure faithful content presentation. By embedding fonts in electronic documents and multimedia presentations, content creators are assured that text content will be reproduced exactly as intended by authors, preserving content appearance, look and feel, original layout, and language choices.

OFF supports many different mechanisms for font embedding, including embedding font data in electronic documents, font data streaming and encoding as part of the ISO Base Media File Format, and font linking on the web. The OFF font technology standard developed by MPEG became a fundamental component for the deployment of the Web Open Font Format (WOFF), which has facilitated adoption of OFF fonts in the web environment and is now widely supported in different browsers.

What is OFF being used for?
OFF has become the preferred font encoding solution for applications that demand advanced text and graphics capabilities. Fonts are a key component of any media presentation, including web content and applications, digital publishing, newscast, commercial broadcasting and advertisement, e-learning, games, interactive TV and Rich Media, multimedia messaging, sub-titling, Digital Cinema, document processing, etc.

It is fair to say that after many years of OFF development, the support for OFF and font technology in general became ubiquitous in all consumer electronic devices and applications, bringing tremendous benefits for font designers, authors, and users alike!

Acknowledgement

The review and substantial additions of Vladimir Levantovsky of Monotype to this article are gratefully acknowledged. For 20 years Vladimir has dedicated his energies to coordinating the font community’s efforts, making font standards in MPEG and promoting those standards to industry.

Thank you Vladimir!

MPEG and the future of visual information coding standards

Video in MPEG has a long history

MPEG started with the idea of compressing the 216 Mbit/s of standard definition video, and the associated 1.41 Mbit/s of stereo audio, for interactive video applications on Compact Disc (CD). That innovative medium of the early 1980s was capable of providing a sustained bitrate of 1.41 Mbit/s for about 1 hour. That bitrate was expected to accommodate both the video and audio information. At about the same time, some telco research laboratories were working on an oddly named technology called Asymmetric Digital Subscriber Line (ADSL), in other words a modem for high-speed (at that time) transmission of ~1.5 Mbit/s over the “last mile”, but only from the telephone exchange to the subscriber’s network termination. In the other direction, only a few tens of kbit/s were supported.
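
As a back-of-the-envelope check (my arithmetic, assuming ITU-R BT.601 sampling for the video and CD-quality stereo for the audio), the figures quoted above are easy to reconstruct:

  # Uncompressed SD video per ITU-R BT.601: 13.5 MHz luma plus two 6.75 MHz
  # chroma components, 8 bits per sample.
  video_bps = (13.5e6 + 2 * 6.75e6) * 8
  print(video_bps / 1e6)        # 216.0 Mbit/s

  # Uncompressed stereo audio at CD quality: 44.1 kHz, 16 bits, 2 channels.
  audio_bps = 44.1e3 * 16 * 2
  print(audio_bps / 1e6)        # ~1.41 Mbit/s

  # With roughly 1.15 Mbit/s left for video after audio and systems overhead,
  # MPEG-1 had to squeeze the video by about two orders of magnitude.
  print(video_bps / 1.15e6)     # ~188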

Therefore, if we exclude a handful of forward-looking broadcasters, the MPEG-1 project was really a Consumer Electronics – Telco project.

Setting aside the eventual success of the MPEG-1 standard – Video CD (VCD) used MPEG-1 and 1 billion players were produced in total, hence a different goal than the original interactive video one – MPEG-1 was a remarkable success for being the first to identify an enticing business case for video (and audio) compression, and systems, on top of which tens of successful MPEG standards were built over the years.

This article has many links

In Forty years of video coding and counting I have recounted the full story of video coding and in The MPEG drive to immersive visual experiences I have focused on the efforts MPEG has made, since its early years, to provide standards for an extended 3D visual experience. In Quality, more quality and more more quality I have described how MPEG uses and innovates subjective quality assessment methodologies to develop and eventually certify the level of performance of a visual coding standard. In On the convergence of Video and 3D Graphics I have described the efforts MPEG is making to develop a unified framework that encompasses a set of video sources producing pixels and a set of sensors producing points. In More video with more features I described how MPEG has been able to support more video features in addition to basic compression.

Now to the question

Seeing all this, the obvious question a reader might ask could be: if MPEG has done so much in the area of visual information coding standards, does MPEG still have much to do in that space? Even a reader with only a superficial understanding of the force that drives MPEG probably knows the answer, but I am not going to give it right now. I will first argue what I see as the future of MPEG in this area.

I need to make a disclaimer first. The title of this article is “The future of visual information coding standards”, but I should restrict the scope to “dynamic (i.e. time-dependent) visual information coding”. Indeed, the coding of still pictures is a different field of endeavour serving the needs of a different industry with a different business model. It should not be a surprise that the two JPEG standards – the original JPEG and JPEG 2000 – both have a baseline mode (the only one that is actually used) which is Option 1 (ISO/IEC/ITU language for “royalty free”). It should also be no surprise to see that, while it is conceivable to think of a standard for holographic still image coding, holography is not even mentioned in this article.

There was always a need for new video codecs

Forty years of video coding and counting explains the incredible decades-long ride to develop video coding standards, all based on the same basic ideas enriched at each generation, that will enable the industry to achieve a bitrate reduction by a factor of 1,000 from video PCM samples to compressed bitstream with the availability of the latest VVC video compression standard, hopefully in the second half of 2020 (the uncertainty is caused by the current Covid-19 pandemic, which is taking its toll on MPEG as well).
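
To give a feel for where a factor of ~1,000 comes from, here is an illustrative calculation (my assumptions, not figures from the article or the standard): a 3840x2160, 60 fps, 8-bit 4:2:0 source compressed to a plausible ~6 Mbit/s.

  # Raw vs compressed bitrate; all figures below are illustrative assumptions.
  width, height, fps = 3840, 2160, 60
  bits_per_pixel = 12                   # 8-bit 4:2:0: 8 (luma) + 2 + 2 (chroma)

  raw_bps = width * height * fps * bits_per_pixel
  compressed_bps = 6e6                  # an assumed VVC operating point

  print(raw_bps / 1e9)                  # ~5.97 Gbit/s uncompressed
  print(raw_bps / compressed_bps)       # ~995, i.e. a reduction of ~1,000x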

The need for new and better compression standards, when technology makes them possible and the improvement over the latest existing standard justifies them, has been driven by the push toward video with higher resolution, colour depth, bits per pixel, dynamic range, viewing angle, etc., and by the lagging availability of a correspondingly higher bitrate to the end user.

The push toward “higher everything” will continue, but will the bitrate made available to the end user continue to lag?

The safe answer is: it will depend. It is a matter of fact that bandwidth is not an asset uniformly available in the world. In the so-called advanced economies the introduction of fibre to the home or to the curb continues apace. The global 5G services market size is estimated to reach 45.7 B$ by 2020 and register a CAGR of 32.1% from 2021 to 2025 reaching ~184 B$. Note that 2025 is the time when MPEG should think seriously about a new video coding standard. The impact of the current pandemic could further accelerate 5G deployment.
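
As a quick consistency check of the quoted market figures (simple compound-growth arithmetic, nothing more):

  # 45.7 B$ growing at a 32.1% CAGR over the five years from 2020 to 2025.
  base, cagr, years = 45.7, 0.321, 5
  print(base * (1 + cagr) ** years)     # ~183.9 B$, consistent with "~184 B$"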

More video and which codecs

The first question is whether there will be room for a new video coding standard. My answer is yes, for at least two reasons. The first is socio-economic: the share of the world population that is served by limited bandwidth will remain large, while the desire to enjoy the same level of experience as the rest of the world will remain high. The second is technical: currently, efficient 3D video compression is largely dependent on efficient 2D video compression.

The second question is trickier. Will this new (post-VVC) 2D video compression standard still be another extension of the motion-compensated prediction scheme? The answer could certainly be yes. The prowess of the MPEG community is such that another 50% improvement could well be provided. I am not sure that will happen, though. Machine learning applied to video coding is showing that significant improvements over state-of-the-art video compression can be obtained by replacing components of existing schemes with Neural Networks (NN), or even by defining entirely new NN-based architectures.

The latter approach has several aspects that make it desirable. The first is that a NN is trained for a certain purpose, but you can always train it better, possibly at the cost of making it heavier. Neural Network Compression (NNC), another standard MPEG is working on, could further extend the scope for incrementally improving the performance of a video coding standard, without changing the standard, by making components of the standard downloadable. Another desirable aspect is that media devices will become more and more addicted to using Artificial Intelligence (AI)-inspired technologies. Therefore a NN-based video codec could simply be more attractive for a device implementor because the basic processing architectures are shared amongst a larger number of data types.

New types of video codec

There is another direction that needs to be considered in this context, and that is the large and growing quantity of data that is being and will be produced by connected vehicles, video surveillance, smart cities, etc. In most cases today, and more so in the future, it is out of the question to have humans at the other side of the transmission channel watching what is being transmitted. More likely there will be machines monitoring what happens. Therefore, the traditional video coding scenario, which aims to achieve the best video/image under certain bitrate constraints with humans as consumption targets, is inefficient and unrealistic in terms of latency and scale when the consumption target is a machine.

Video Coding for Machines (VCM) is the title of an MPEG investigation that seeks to determine the requirements for this novel, but not entirely new, video coding standard. Indeed, the technologies standardised by MPEG-7 – efficiently compressed image and video descriptors and description schemes – belong to the same category as VCM. It must be said, however, that 20 years have not passed in vain. It is expected that all descriptors will be the output of one or more NNs.

One important requirement stems from the fact that, while millions of streams may be monitored by machines, some streams may need to be monitored by humans as well, possibly after a machine has raised an alert. Therefore VCM is linked to the potential new video coding standard I have talked about above. The question is whether VCM should be called HMVC (Human-Machine Video Coding), or whether there should be VCM (where the human part remains below threshold in terms of priority) and YAVC (Yet Another Video Coding, where the user is meant to be a human).

Immersive video codecs

The MPEG drive to immersive visual experiences shows that MPEG has always been fascinated by immersive video. The fascination is not fading away, as shown by the fact that in four months MPEG plans to release the Video-based Point Cloud Compression standard and in a year the MPEG Immersive Video standard.

These standards, however, are not the end points, but the starting points in the drive to more rewarding user experiences. Today we cannot say how and when MPEG standards will be able to provide full navigation in a virtual space. However, that remains the goal for MPEG. Reaching that goal will also depend on the appearance of new capture and display technologies.

Conclusions

The MPEG war machine will be capable of delivering the standards that will keep the industry busy developing the products and services that will enhance the user experience. But we should not forget an important element: the need to adapt MPEG’s business model to the new age.

MPEG needs to adapt, not change its business model. If MPEG has been able to sustain the growth of the media industry, it is because it has provided opportunities to remunerate the good Intellectual Property that is part of its standards.

There are other business models appearing. The MPEG business model has shown its worth for the last 30 years. It can do the same for another 30 years if MPEG is able to develop a strategy to face and overcome the competition to its standards.
