The MPEG Future Manifesto

Communication makes us humans different. Media make communication between humans effective and enjoyable. Standards make media communication possible.

Thirty-two years ago, the MPEG vision was forming: make global standards available to allow industry to provide devices and services for the then emerging digital media so that humans could communicate seamlessly.

For thirty-two years the MPEG standards group has lived up to the MPEG vision: MPEG standards are behind the relentless growth of many industries – some of them created by MPEG standards. More than half the world population uses devices or accesses services, on a daily or hourly basis, that rely on MPEG standards.

The MPEG Future Manifesto claims that the MPEG mission is far from exhausted:

  • New media compression standards can offer more exciting user experiences to benefit consumers that the service, distribution and manufacturing industries want to reach, but also for new machine-based services;
  • Compression standards can facilitate the business or mission of other non-media industries and the MPEG standards group has already shown that this is possible.

Therefore, the MPEG Future Manifesto proposes a concerted effort to

  • Support and expand the academic and research community which provides the life blood of MPEG standards;
  • Enhance the value of the intellectual property that makes MPEG standards unique while facilitating their use;
  • Identify and promote the development of new compression-related standards benefitting from the MPEG approach to standardisation;
  • Further improve the connection between industry and users, and the MPEG standards group;
  • Preserve and enhance the organisation of MPEG, the standards group that can achieve the next goals because it brought the industry to this point.

MPEG Future is a group of people, many of whom are MPEG members, who care about the future of MPEG. MPEG Future is open to those who support the MPEG Future Manifesto’s principles and actions.

You may:

  • Participate in the MPEG Future activities, by subscribing to the LinkedIn MPEG Future group https://bit.ly/2m6r19y
  • Join the MPEG Future initiative, by sending an email to info@mpegfuture.org.

What is MPEG doing these days?

It is now a few months since I last talked about the standards being developed by MPEG. As the group moves fast, I think it is time to give an update on the main areas of standardisation: Video, Audio, Point Clouds, Fonts, Neural Networks, Genomic data, Scene description, Transport, File Format and API. You will also find a few words on three explorations that MPEG is carrying out:

  1. Video Coding for Machines
  2. MPEG-21 contracts to smart contracts
  3. Machine tool data.

Video

Video continues to be a very active area of work. New SEI messages are being defined for HEVC, while there is high activity on VVC, which is due to reach FDIS in July 2020. Verification Tests for VVC have not been carried out yet, but the expectation is that VVC will bring the overall video compression ratio to about 1000, as can be seen from the following table, where the bitrate reduction of each standard is measured with respect to that of the previous standard. The MPEG-1 bitrate reduction is measured with respect to uncompressed video. The VVC bitrate reduction is estimated.

Standard         Bitrate reduction   Year
MPEG-1 Video     -98%                1992
MPEG-2 Video     -50%                1994
MPEG-4 Visual    -25%                1999
MPEG-4 AVC       -30%                2003
MPEG-H HEVC      -60%                2013
MPEG-I VVC       -50%                2020

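The compression ratio of about 1000 is obtained by computing the inverse of 0.02*0.5*0.75*0.7*0.4*0.5, i.e. of the product of the fractions of bitrate left by each standard. The product is about 0.00105 and its inverse about 950.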
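The arithmetic can be reproduced in a few lines. This is only a sketch: the dictionary restates the table above (the VVC figure is the estimate given there) and the variable names are illustrative.

```python
from math import prod

# Bitrate reduction of each standard relative to its predecessor, as in the table
# (MPEG-1 is relative to uncompressed video; the VVC value is an estimate).
bitrate_reduction_percent = {
    "MPEG-1 Video": 98,
    "MPEG-2 Video": 50,
    "MPEG-4 Visual": 25,
    "MPEG-4 AVC": 30,
    "MPEG-H HEVC": 60,
    "MPEG-I VVC": 50,
}

# Fraction of the bitrate that remains after each generation.
remaining = [1 - p / 100 for p in bitrate_reduction_percent.values()]

# The fractions multiply; the overall compression ratio is the inverse of the product.
remaining_fraction = prod(remaining)
compression_ratio = 1 / remaining_fraction

print(f"remaining fraction: {remaining_fraction:.5f}")
print(f"compression ratio:  {compression_ratio:.0f}")
```

The ratio comes out at roughly 950, which is the "about 1000" quoted in the text.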

SEI messages for VVC are now being collected in MPEG-C Part 7 “SEI messages for coded video bitstreams”. The specification of SEI messages is generic in the sense that the transport of SEI messages can be effected either in the video bitstream or at the Systems layer. Care is also taken to make transport of the messages possible with previous video coding standards.

MPEG CICP (Coding-Independent Code-Points) Part 4 “Usage of video signal type code points” has been released. This Technical Report provides guidance on combinations of video properties that are widely used in industry production practices by documenting the usage of colour-related code points and description data for video content production.

MPEG is also working on two more “traditional” video coding standards, both included in MPEG-5.

  1. Essential Video Coding (EVC) will be a standard video codec that addresses business needs in some use cases, such as video streaming, where existing ISO video coding standards have not been as widely adopted as might be expected from their purely technical characteristics. EVC is now being balloted as DIS. Experts working on EVC are actively preparing for the Verification Tests to see how much “addressing business needs” will cost in terms of performance.
  2. Low Complexity Enhancement Video Coding (LCEVC) will be a standard video codec that leverages other video codecs to improve video compression efficiency while maintaining or lowering the overall encoding and decoding complexity. LCEVC is now being balloted as CD.

MPEG-I OMAF already supports (2018) 3 Degrees of Freedom (3DoF), where a user’s head can yaw, pitch and roll, but the position of the body is static. However, rendering flat 360° video, i.e. supporting head rotations only, may generate visual discomfort especially when rendering objects close to the viewer.

6DoF enables translation movements in horizontal, vertical, and depth directions in addition to 3DoF orientations. The translation support enables interactive motion parallax giving viewers natural cues to their visual system and resulting in an enhanced perception of volume around them.
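A small sketch may help picture the difference between the two. It is not taken from the MPEG-I specifications: the class names are hypothetical and the parallax formula is ordinary trigonometry, assuming an object straight ahead of the viewer and a sideways head movement.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose3DoF:
    """Head orientation only: the viewer can rotate but not move."""
    yaw: float
    pitch: float
    roll: float


@dataclass
class Pose6DoF(Pose3DoF):
    """Orientation plus translation along the three spatial axes (metres)."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


def parallax_deg(head_shift_m: float, object_distance_m: float) -> float:
    """Apparent angular shift of an object straight ahead when the head
    moves sideways by head_shift_m (simple pinhole geometry)."""
    return math.degrees(math.atan2(head_shift_m, object_distance_m))


# The same 10 cm head movement applied to a near and a far object.
near = parallax_deg(0.10, 0.5)
far = parallax_deg(0.10, 10.0)
print(f"near object: {near:.1f} degrees, far object: {far:.1f} degrees")
```

With a 10 cm sideways movement, an object half a metre away shifts by roughly 11 degrees while one at 10 metres shifts by well under 1 degree. A 3DoF rendering shows no shift in either case, which is why the missing parallax is most noticeable for objects close to the viewer.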

MPEG is currently working on a video compression standard (MPEG-I Part 12 Immersive Video – MIV) that enables head-scale movements within a limited space. In the article On the convergence of Video and 3D Graphics I have provided some details of the technology being used to achieve the goal, comparing it with the technology used for Video-based Point Cloud Compression (V-PCC). MIV is planned to reach FDIS in October 2020.

Audio

Audio experts are working with the goal of leveraging MPEG-H 3D Audio to provide a full 6DoF Audio experience, viz. one where the user can localise sound objects in the horizontal and vertical planes, and perceive changes in a sound object’s loudness as the user moves around it, sound reverberation as in a real room, and occlusion when a physical object is interposed between a sound source and the user.

The components of the system to be used to test proposals are

  • Coding of audio sources: using MPEG-H 3D Audio
  • Coding of meta-data: e.g. source directivity or room acoustic properties
  • Audio and visual presentations for immersive VR worlds (correctly perceiving a virtual audio space without any visual cues is very difficult)
  • Virtual Reality basketball court where the Immersive Audio renderer makes all the sounds in response to the user interaction of bouncing the ball and all “bounce sounds” are compressed and transmitted from server to client.

Evaluation of proposals will be done via

  • Full, real-time audio-visual presentation
  • Head-Mounted Display for “Unity” visual presentation
  • Headphones and “Max 8” for audio presentation
  • Proponent technology will run in real-time in Max VST3 plugin.

Currently this is the longest term MPEG-I project as FDIS is planned for January 2022.

MPEG Immersive Video and Audio share a number of features. The most important is that neither is a “compression standard” in the strict sense: both use existing compression technologies, on top of which immersive features are provided by metadata that will be defined by Immersive Video (part 12 of MPEG-I) and Immersive Audio (part 5 of MPEG-I). MPEG-I Part 7 Immersive Media Metadata will specify additional metadata coming from the different subgroups.

Point Clouds

Video-based Point Cloud Compression is progressing fast as FDIS is scheduled for January 2020. The maturity of the technology, suitable for dense point clouds (see, e.g. https://mpeg.chiariglione.org/webtv?v=802f4cd8-3ed6-4f9d-887b-76b9d73b3db4) is reflected in related Systems activities that will be reported later.

Geometry-based Point Cloud Compression, suitable for sparse point clouds (see, e.g. https://mpeg.chiariglione.org/webtv?v=eeecd349-61db-497e-8879-813d2147363d) is following with a delay of 6 months, as FDIS is expected for July 2020.

Fonts

MPEG is extending MPEG-4 Part 22 Open Font Format with an amendment titled “Colour font technology and other updates”.

Neural Networks

Neural Networks are a new data type. Strictly speaking, MPEG is addressing the compression of Neural Networks trained for multimedia content description and analysis.

NNR, as MPEG experts call it, has taken shape very quickly. First aired and discussed at the October 2017 meeting, it was followed by a Call for Evidence (CfE) issued in July 2018 and a Call for Proposals (CfP) issued in October 2018. Nine responses were received at the January 2019 meeting, which enabled the group to produce the first working draft in March 2019. A very active group is working to produce the FDIS in October 2020.

Read more about NNR at Moving intelligence around.

Genomic data

With MPEG-G parts 1-3 MPEG has provided a file and transport format, compression technology, metadata specifications, protection support and standard APIs for the access of sequencing data in the native compressed format. With the companion parts 4 and 5 (reference software and conformance), due to reach FDIS level in April 2020, MPEG will provide a software implementation of a large part of the technologies in parts 1 to 3 and the means to test an implementation for conformity to MPEG-G.

January 2020 is the deadline for responding to the Call for Proposals on Coding of Genomic Annotations. The call is in response to the need of most biological studies based on sequencing protocols to attach different types of annotations, all associated to one or more intervals on the reference sequences, resulting from so-called secondary analyses. The purpose of the call is to acquire technologies that will make it possible to provide a compressed representation of such annotations.

Scene description

MPEG’s involvement in scene description technologies dates back to 1996 when it selected VRML as the starting point for its Binary Format for Scenes (BIFS). MPEG’s involvement continued with MPEG-4 LASeR, MPEG-B Media Orchestration and MPEG-H Composition Information.

MPEG-I, too, cannot do without a scene description technology. As in the past, MPEG will start from an existing specification – glTF2 (https://www.khronos.org/gltf/) – selected because it is open, extensible and widely supported with many loaders and exporters, and because it enables MPEG to extend glTF2 capabilities for audio, video and point cloud objects.

The glTF2-based Scene Description will be part 14 of MPEG-I.

Transport

Transport is a fundamental function of real-time media and MPEG continues to develop standards, not just for its own standards, but also for JPEG standards (e.g. JPEG 2000 and JPEG XS). This is what MPEG is currently doing in this vital application area:

  1. MPEG-2 part 1 Systems: a WD of an amendment on Carriage of VVC in MPEG-2 TS. This is urgently needed because broadcasting is expected to be a good user of VVC.
  2. MPEG-H part 10 MMT FEC Codes: an amendment on Window-based Forward Error Correcting (FEC) code
  3. MPEG-H part 13 MMT Implementation Guidelines: an amendment on MMT Implementation Guidelines.

File format

The ISO Base Media File Format is an extremely fertile standards area that extends over many MPEG standards. This is what MPEG is doing in this vital application area:

  1. MPEG-4 part 12 ISO Base Media File Format: two amendments on Compact movie fragments and EventMessage Track Format
  2. MPEG-4 part 15 Carriage of NAL unit structured video in the ISO Base Media File Format: an amendment on HEVC Carriage Improvements and the start of an amendment on Carriage of VVC, a companion of Carriage of VVC in MPEG-2 TS
  3. MPEG-A part 19 Common Media Application Format: the start of an amendment on Additional media profile for CMAF. The expanding use of CMAF prompts the need to support more formats
  4. MPEG-B part 16 Derived Visual Tracks in ISOBMFF: a WD is available as a starting point
  5. MPEG-H part 12 Image File Format: an amendment on Support for predictive image coding, bursts, bracketing, and other improvements to give HEIF the possibility to store predictively encoded video
  6. MPEG-DASH part 1 Media presentation description and segment formats: start of a new edition containing CMAF support, events processing model and other extensions
  7. MPEG-DASH part 5 Server and network assisted DASH (SAND): the FDAM of Improvements on SAND messages has been released
  8. MPEG-DASH part 8 Session based DASH operations: a WD of Session based DASH operations has been initiated
  9. MPEG-I part 2 Omnidirectional Media Format: the second edition of OMAF has started
  10. MPEG-I part 10 Carriage of Video-based Point Cloud Compression Data: currently a CD.

API

This area is being populated with more and more MPEG standards:

  1. MPEG-I part 8 Network-based Media Processing is on track to become FDIS in January 2020
  2. MPEG-I part 11 Implementation Guidelines for NBMP is due to reach TR stage in April 2020
  3. MPEG-I part 13 Video decoding interface is a new interface standard to allow an external application to provide one or more rectangular video windows from a VVC bitstream.

Explorations

Video Coding for Machines

MPEG is carrying out explorations in areas that may give rise to future standards: 6DoF, Dense Light Fields and Video Coding for Machines (VCM). VCM is motivated by the fact that, while traditional video coding aims to achieve the best video/image under certain bit-rate constraints having humans as consumption targets, the sheer quantity of data being/to be produced by connected vehicles, video surveillance, smart cities etc. makes the traditional human-oriented scenario inefficient and unrealistic in terms of latency and scale.

Twenty years ago the MPEG-7 project started the development of a comprehensive set of audio, video and multimedia descriptors. Other parts of MPEG-7 have added other standard descriptions of visual information for search and analysis applications. VCM may leverage that experience and frame it in the new context of expanded use of neural networks. Those interested can subscribe to the Ad hoc group on Video Coding for Machines at https://lists.aau.at/mailman/listinfo/mpeg-vcm and participate in the discussions at mpeg-vcm@lists.aau.at.

MPEG-21 Based Smart Contracts

MPEG has developed several standards within the MPEG-21 media e-commerce framework that address the issue of digital licences and contracts. Blockchain can execute smart contracts, but is it possible to translate an MPEG-21 contract to a smart contract?

Let’s consider the following use case where User A and B utilise a Transaction system that interfaces with a Blockchain system and a DRM system. If the transaction on the Blockchain system is successful, DRM System authorises User B to use the media item.

The workflow is

  1. User A writes a CEL contract and a REL licence and sends both to User B
  2. User B sends the CEL and the REL to a Transaction system
  3. Transaction system translates CEL to smart contract, creates token and sends both to Blockchain system
  4. Blockchain system executes smart contract, records transaction and notifies Transaction system of result
  5. If notification is positive, Transaction system translates REL to native DRM licence and notifies User A
  6. User A sends media item to User B
  7. User B requests DRM system to use media item
  8. DRM system authorises User B

In this use case, Users A and B can communicate using the standard CEL and REL languages, while Transaction system is tasked to interface with Blockchain system and DRM system.
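The workflow can be sketched as a toy simulation. Everything in it is hypothetical: the class and method names are illustrative, the "translations" are plain string wrappers rather than real CEL/REL processing, and the sketch assumes that the Transaction system, which receives the notification in step 4, is the one that produces the native DRM licence in step 5.

```python
from dataclasses import dataclass, field


@dataclass
class BlockchainSystem:
    """Stand-in for a blockchain: it only records what it is asked to execute."""
    ledger: list = field(default_factory=list)

    def execute(self, smart_contract: str, token: str) -> bool:
        # Step 4: execute the smart contract and record the transaction.
        # Hypothetical rule: any non-empty contract succeeds.
        ok = bool(smart_contract)
        self.ledger.append((token, smart_contract, ok))
        return ok


@dataclass
class DRMSystem:
    """Stand-in for a DRM system holding native licences per user."""
    licences: dict = field(default_factory=dict)

    def install(self, user: str, native_licence: str) -> None:
        self.licences[user] = native_licence

    def authorise(self, user: str) -> bool:
        # Steps 7-8: a user is authorised only if a licence was installed.
        return user in self.licences


@dataclass
class TransactionSystem:
    blockchain: BlockchainSystem
    drm: DRMSystem

    def process(self, cel: str, rel: str, licensee: str) -> bool:
        # Step 3: translate CEL to a smart contract and create a token.
        smart_contract = f"smart-contract({cel})"
        token = f"token-{len(self.blockchain.ledger) + 1}"
        # Step 4: the blockchain executes the contract and reports the result.
        ok = self.blockchain.execute(smart_contract, token)
        # Step 5: on success, translate REL to a native DRM licence.
        if ok:
            self.drm.install(licensee, f"native-licence({rel})")
        return ok


blockchain = BlockchainSystem()
drm = DRMSystem()
transaction_system = TransactionSystem(blockchain, drm)

# Steps 1-2: User A writes CEL and REL; User B submits them.
notified = transaction_system.process(
    cel="CEL: A grants B playback", rel="REL: play", licensee="User B"
)
# Step 6 (User A sends the media item) would happen here if notified is True.
print(notified, drm.authorise("User B"), drm.authorise("User C"))
```

The point of the sketch is the division of labour: the users only ever handle CEL and REL, while all blockchain-specific and DRM-specific formats stay behind the Transaction system.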

A standard way to translate MPEG-21 contracts to smart contracts will assure users that the smart contract executed by a blockchain corresponds to the human-readable MPEG-21 contract.

Those interested in exploring this topic can subscribe to the Ad hoc group on MPEG-21 Contracts to Smart Contracts at https://lists.aau.at/mailman/listinfo/smart-contracts and participate in the discussions at smart-contracts@lists.aau.at.

Machine tools data

Mechanical systems are becoming more and more sophisticated in terms of functionalities but also in terms of capability to generate data. Virgin Atlantic says that a Boeing 787 may be able to create half a terabyte of data per flight. The diversity of data generated by an aircraft makes the problem rather challenging, but machine tools are less complex machines that may still generate 1 terabyte of data per year. The data are not uniform in nature and can be classified in 3 areas: Quality control, Management and Monitoring.

There are data available to test what it means to process machine tool data.

Other data

MPEG is deeply engaged in compressing two strictly non-media data types: genomic data and neural networks, even though the latter is currently considered as a compression add-on to multimedia content description and analysis. It is also exploring compression of machine tool data.

The MPEG work plan

The figure graphically illustrates the current MPEG work plan. Dimmed coloured items are not (yet) firm elements of the workplan.

 

MPEG is a big thing. Can it be bigger?

Introduction

Having become the enabler of a market of devices and services worth 1.5 T$ p.a., MPEG is a big achievement, but is that a climax or the starting point aiming at new highs?

This is a natural question to ask for a group that calls itself “MPEG Future”. The future is still to be written and the success of MPEG will largely depend on the ability of those who attempt to write it.

This article will try to analyse the elements of an answer through the following steps:

  1. The MPEG machine as it has been running for several years
  2. The key success factors
  3. The situation today
  4. What’s next to make MPEG bigger.

The MPEG machine: input, processing and output

To understand if MPEG can be bigger, the first thing to do is to understand how MPEG could reach this point. I will start doing that by considering a simplified model of the MPEG standards ecosystem that I believe is what has made MPEG the big thing that we know (Figure 1).

Figure 1: The current MPEG standard ecosystem

The MPEG machine has three iterative phases of operation

  1. MPEG receives inputs from 3 sources:
    1. MPEG members
    2. Partners, i.e. committees who may be interested in developing a joint standard
    3. Customers, typically committees or industry associations who may need a standard in an area for which MPEG has expertise.
  2. MPEG processes inputs and may decide
    1. To start an exploration by studying use cases and requirements, or by exploring technologies
    2. To develop a new MPEG standard, if the exploration is successful
    3. To extend or correct an existing standard.
  3. MPEG generates outputs
    1. To the industry at large, by announcing its work plan, milestones of the work plan such as Calls for Proposal, events such as workshops, results of verification tests etc.
    2. To communicate to partners and customers about how MPEG is handling their inputs or to seek their opinion or to propose new initiatives
    3. To inform partners and customers of the progress of its standard and eventually making standards available.

MPEG’s key success factors

These are the main success factors of MPEG’s handling of the business of standardisation.

  1. Search for customers. MPEG started with the vision of “digital media standards for a global market” but it did not have – and still does not have – a “constituency” whose interest it was expected to further. It assembled the expertise required to implement its vision, but needed to find buyers for its standards. Finding customers is the main element of MPEG’s DNA.
  2. Customer care. Each industry has its own requirements, some shared with others. MPEG needed to find both the common denominator of requirements and the industry-specific requirements, and to design solutions where all industries could operate unencumbered by the requirements of others.
  3. Integrated standards. MPEG has been able to develop complete digital media solutions without leaving its customers struggling with the task of making different pieces from different sources work together. Still the single parts can be individually used.
  4. The role of research. MPEG has been a magnet attracting the best researchers in digital media from both industry and academia and is influencing many research programs.
  5. New customers without losing old ones. With MPEG-2, MPEG had “acquired” the broadcasting industry, but with MPEG-4 it acquired the IT and mobile industry. MPEG succeeded in providing the same standards to both industries even though they were more and more in competition. This has continued in other areas of MPEG standardisation.
  6. Strategic plans. MPEG has developed its program of work through collaboration with its client industries. MPEG does not have a centralised “strategic planning” function, but this function is part of its modus operandi.
  7. Business model. Companies participating in MPEG know that good technologies are rewarded by royalties and that they should invest in technologies for future standards.

The situation today

The MPEG operation has seen many years of successes, but the context has greatly changed.

  1. MPEG has an impressive portfolio of standards actively used by a global number of loyal customers.
  2. The media industry has greatly expanded in scope and its members are becoming more diverse.
  3. MPEG values the requirements of its customers but there are so many technologies fighting for dominance.
  4. There is an increasing percentage of MPEG members who come from research/academia or are NPEs.
  5. Acquiring new customers in new areas is getting more and more onerous.
  6. Many work items in the strategic plan heavily depend on technologies whose development path is unclear.
  7. The MPEG business model is still an asset, but may no longer serve the needs of a significant part of MPEG customers.

And now, what?

MPEG Future strives to facilitate the creation of a new environment that will enable the development of standards for media compression and distribution and their adoption for ever more pervasive media-related user experiences.

What should be the principal axes of the new MPEG age advocated by MPEG Future?

Technology? Sure, mastering technology for top performing standards remains important to MPEG, but commonality and synergies of technologies is not the issue. MPEG has a large and dedicated group of experts who explore the implications of new technologies (more is better, but it is not the issue).

Market? Sure, market is important. Companies may be reluctant to talk too much about market but making standards that are driven by research and academia is not the way to go.

The principal axis of the next phase of MPEG work should focus on how market players want to package technologies – that MPEG Future obviously advocates to be standard – to serve market needs.

MPEG Future envisages that a new group called Market Needs be created in MPEG in its new status of subcommittee, next to the existing Technical Requirements group. The latter should continue to explore the technology side of new ideas, while the former should monitor the relevance of new ideas as enabled by technology, to market reality. The new form of the MPEG standards ecosystem is depicted in Figure 2.

Figure 2: The new MPEG standard ecosystem

There are two main challenges for the new MPEG standards ecosystem

  1. A Market Needs group populated with industry leaders
  2. A modus operandi where inputs from Market Needs to Technical Requirements enrich the technical exploration and results from Technical Requirements to Market Needs are used to strengthen the market value of a new idea.

MPEG is well placed to create an effective Market Needs group because of its network of partners and customers and is well placed to extend its modus operandi with an effective Market Needs – Technical Requirements interaction. After all MPEG has spent its last 30 years incorporating new communities. The Market Needs community is of a new type, but this makes the challenge all the more enticing…

MPEG: vision, execution, results and a conclusion

Introduction

In 1987, a few months before MPEG was established, ISO TC 97 Data Processing became the ISO/IEC Joint Technical Committee 1 (JTC 1) on Information Technology. With this operation the data processing industry, renamed information technology industry on the occasion, was able to concentrate in a single Technical Committee (TC) all standardisation activities needed by that industry, including the “electrical” ones, until then the exclusive purview of IEC.

Thirty-two years later, JTC 1 has become a very large TC, but inside IEC things have been anything but static. A major achievement, dating back a couple of decades, has been the creation of Technical Committee 100 “Audio, Video and Multimedia systems and equipment”, which grouped activities until then scattered in different parts of IEC.

On the occasion of the IEC General Meeting that included meetings of many IEC TCs, a joint TC 100 and JTC 1 workshop was held in Shanghai on 2019/10/19. MPEG was invited to give a talk about how it develops standards and its most promising current projects. The title selected was an unambiguous “The world is going digital – MPEG did so 30 years ago”.

This article will talk about what I said in the first part of my speech: what drove the establishment of MPEG, how MPEG is organised, and main results produced by MPEG. There is a very short conclusion worth reading.

Digitising the audio-visual distribution

The MPEG story is a successful case of digitisation. Thirty years ago, the audio-visual distribution industry engaged in a process that, unlike what is often happening today, was based on international standards. But why did digitisation of the audio-visual distribution industry succeed? That was because of a number of reasons.

At the end of the 1980s

  • Audio-visual data had been widely used in analogue form for a few decades
  • Everybody understood that digitising analogue data was technically convenient but costly and that use of compression could reduce the size of digital information and even multiply available capacity
  • Compression technology research was beginning to provide exploitable results
  • Just about everybody was waiting for an accessible audio-visual compression technology:
    • Telcos: new interactive services and video distribution
    • Broadcasters: more efficient distribution of old and new services
    • CE Companies: new products for new distribution channels
    • IT Companies: hardware and software for digital audio-visual distribution services.

At that time standards were needed as they are needed today. However, the prevailing attitude of the audio-visual sector then was that every industry, country, company etc. should have its own “standard”. The MPEG way of standardisation prevailed over the “old way”. Today international standards are used to compress audio and moving pictures (including 3D Graphics), and to deliver and consume audio-visual data.

MPEG standards were excellent in terms of quality and standardisation made the audio-visual digitisation technology accessible to all users from a plurality of sources. MPEG’s ability to attract all industries made it the melting pot of the new global audio-visual distribution that eventually became the famed “industry convergence”.

This is well represented by Figure 1 where the many independent analogue basebands of the analogue world, instead of becoming many independent digital basebands, became the single MPEG “digital baseband”.

Figure 1: Audio-visual distribution before and after MPEG

The figure applies specifically to MPEG-2 but it is also the conceptual foundation of the large majority of MPEG standards that followed MPEG-2. The repetition of the first success was made possible by the “MPEG business model”:

  1. When developing a standard MPEG requests companies to provide their best technologies
  2. MPEG develops high-performance standards using the best technologies available at a given time frame
  3. Patent holders receive royalties which they may re-invest in new technologies
  4. When time comes, MPEG can develop a new generation of MPEG standards because it can draw from new technologies, some resulting from patent holders’ re-investments.

In the early MPEG days, the “MPEG industries” were those manufacturing devices (implementation industries) and those actually using the devices for their business (client industries). Both were main contributors to the MPEG standards. Today MPEG is quite different from 30 years ago because the context in which it operates has changed substantially. There is a growing role of companies that own valuable technologies they contribute to MPEG standards but are unlikely to be manufacturers or users of the standard (technology industries), as depicted in Figure 2.

Figure 2: MPEG standards and industries

The MPEG organisation

What is the inside of the machine that produces the MPEG standards? Judging from Figure 3 one could think that it is a very standard machine.

 Figure 3: The MPEG organisational structure

The Requirements subgroup develops requirements for the standards to be developed; 4 technical subgroups – Systems, Video (that includes two joint groups with ITU-T), Audio and 3D Graphics – develop the standards; the Test subgroup assesses the quality; and the Communication subgroup informs the world of the results.

That’s all? Well, no. One must look first at Table 1 to see the “interaction events” between the different subgroups that took place in the last 12 months during MPEG meetings. They were 68 in total, each lasting from one hour to half a day of joint meetings, each involving at least 2 subgroups and some 3 subgroups or more.

Table 1 MPEG interaction events in 2019

  Systems Video Audio 3DG Test
Requirements 6 14 4 2 1
Systems 11 3 9
Video 1 7 4
Audio 3
3DG 1

Table 1 describes the amount of interaction taking place inside MPEG but does not describe how interaction takes place. In Figure 4 the subgroups are represented as a circle surrounding “units” whose first letter indicates the subgroups they belong to. Units are temporary or stable entities within the subgroups who get together (indicated by the arrows) as orchestrated by the subgroup chairs meeting as “Technical Coordination”.

 

Figure 4: MPEG, subgroups and units

It is this ability to mobilise people with the right expertise that allows MPEG to create standards that can be used to make complete audio-visual systems, but can also be used independently (Figure 5).

Figure 5: MPEG makes integrated standards

MPEG standards are technology heavy. How does MPEG make decisions about which technologies get into a standard?

Subgroups are tasked to decide which technologies are adopted in a standard. Because standards are so intertwined, ~10% of official meeting time is used to keep members informed of what is being developed/decided in subgroups, through massive use of IT tools. MPEG members can keep themselves informed of what is being discussed and/or decided, where and when.

The purpose of MPEG plenaries is to review and approve subgroup decisions. These may be challenged at MPEG plenaries (and this has happened less than 10 times in 30 years). Challenges are addressed by applying thoroughly and conservatively the ISO/IEC definition of consensus.

There is an additional aspect that must be considered: MPEG does not have a constituency because if it had one it would be forced to consider the interests of that industry to the possible detriment of other industries.

Therefore, MPEG has partners, with which it develops standards, and customers, for which it develops standards. This is shown in Table 2.

Table 2 MPEG partners (P) and customers (C)

Committee Status Standards
3GPP C 4, DASH
AES C 2, D
ARIB C 2, 4, H, DASH
ATSC C 2, 4, H, DASH
CTA C Several
DASH-F C DASH
DVB C 2, 4, H, DASH
EBU C Several
IEC TC 100 C 2, 4, H
ISO TC 276 P G
ISO/IEC JTC 1/SC 24 P MAR RF
ITU-T SG 16 P 2, 4, H, I
JPEG C 4, 7, H
Khronos C I
SCTE C Several
SMPTE C Several
TTA C 2, 4, H, DASH
W3C P Several

The success of MPEG standards

So far MPEG has produced ~180 standards. This would amount to an average of 6 standards per year. In practice it is much more because a standard is a living body that typically evolves to include many Amendments and is published multiple times incorporating those Amendments and Corrigenda. Figure 6 shows how productive MPEG has been: in spite of being a working group it has produced more standards than any other JTC 1 Subcommittee.

Figure 6: MPEG has published more standards than any other JTC 1 SC

Table 3 lists the 7 areas into which MPEG standards can be classified and maps each area to the relevant MPEG standards. It is easy to see that compression and transport are the areas most populated by MPEG standards.

Table 3 MPEG areas and standards

Areas Standard
Compression 1, 2, 4, 5, C, D, G, H, I
Descriptor compression 7
Content e-commerce 21
Combinations of content formats A
Systems & transport 1, 2, 4, B, DASH, H, I, G
Multimedia platforms E, M
Device & application interfaces M, IoMT, U, V

Table 4 assesses the economic value MPEG has brought to the device manufacturing and service industries. The data refer to 2018. Roughly speaking MPEG-enabled devices are worth ~1 trillion USD p.a. and MPEG-enabled services are worth 0.5 trillion USD p.a.

Table 4 The impact of MPEG standards: 1 T$ (devices), 0.5 T$ (services)

Device manufacturing B$ Services B$
Smartphones 522 Pay-TV 227
Tablets 145 TV advertising 177
Laptops 103 Games, films and music 138
TV sets 100 TV production (US) 40
Video surveillance 37 OTT TV 38
Set Top Boxes 20 Social media 34
Digital cameras 18.9 Enterprise video 13.5
In-vehicle infotainment 15 In-flight Entertainment 5
Video conferencing 5 TV subscriptions 1.5
Commercial drones 1.5 SVOD subscriptions 0.5

We should not forget that the life of a large share of the world population is constantly and pervasively affected by MPEG standards.

Conclusions

MPEG is a unique machine that has produced and counts on producing standards affecting the life of billions of people and wide swathes of industry. Industry and consumers have the right to expect that this machine is allowed to do its work and that no improvised apprentice tampers with it.

Who “decides” in MPEG?

If MPEG were a typical company, the answer to this question would be simple. Persons in charge of different levels of the organisation “decide”. But MPEG is not a company and there is no chain of command where A tells B to do C or else.

Decisions are made, but how? As an autocracy, an oligarchy or a democracy? To answer these questions, let’s first see how the work in MPEG is organised.

The convenor chairs the 3 plenary sessions.

  1. Monday morning: the results of the ad hoc groups established at the meeting before are presented and the work of the week is organised. The meeting typically lasts 3 hours. Typically no “decisions” are made.
  2. Wednesday morning: the results of the first two days of work are presented and the work of the rest of the week is organised. Comments may be made and questions for clarification asked. Typically the meeting schedule for the next two years is approved, based on the recommendation of a group called “Convenor’s Advisors”, which assesses proposals for meeting venues. This can hardly be called a “decision”. The meeting typically lasts 2 hours.
  3. Friday afternoon: the recommendations from subgroups are reviewed by the plenary. Typically they are read, possibly edited and, as a rule, accepted, unless there is an exception I will talk about later. One can say that the plenary “decides”, but actually it ratifies. The meeting typically lasts 4 hours.

So, where are decisions made?

To answer this question let’s see how the technical work is done in the subgroups: Requirements, Systems, Video (including groups in collaboration with ITU-T), Audio, 3D Graphics and Tests. This is the rough assignment of responsibilities:

  1. Requirements receives proposals for new work, manages the explorations that lead to issuing Calls for Proposals, participates in the assessment of test results and eventually in the definition of profiles. Requirements makes decisions.
  2. Technology development groups – Systems, Video, groups in collaboration with ITU-T, Audio and 3D Graphics – take the results of the tests, develop draft specifications and manage the standard approval process (note that tests are not always required to start a new project, but a requirements definition phase is always present). Technology development groups make decisions.
  3. Tests carries out the growing number of tests that are required for the development of visual standards. Tests does not make decisions; it simply provides the results of the tests to the appropriate group.

It is clear that decisions are really made by subgroups, but how?

The main ingredients of decisions by technology development groups are input contributions by members and their assessment made by the specific subgroup. Evidence must be brought that a technology does what the proponents claim it does, and the main tool to achieve this is called “Core Experiment”. This is carried out by the proponent and at least one other independent participant. The results from the two must be compatible and prove that the technology brings gains for it to be accepted into the standard.

The decisions of the technology development groups are not easy, but getting to a decision can be achieved in a structured way because technology plays an overriding role. Definitely less structured is the process managed by the Requirements group, done just by itself or jointly with a technical group. The decisions to be made are of the type: “does this proposal for new work make sense” or “is this profile needed”?

The question “does this proposal make sense” leaves ample margins for decision because a new technology may be in competition with another existing technology, may be immature, may address a questionable need, etc. MPEG tends to be open to new proposals based on the principle that if someone needs something, why should those unconcerned prohibit the work? After all, the task of MPEG is not to make a “decision” for the new work to start, but only to make a preliminary assessment of a proposal so that a formal proposal for a new work item can be made and voted on by National Bodies.

So far, all MPEG proposals for new standards have passed the ISO acceptance criterion of simple majority of P members in the committee approving the proposal.

Adoption of a profile can also be a really tricky matter. Profiles are levels of performance, typically enabled by the presence of technologies in the profiles, required by certain application domains. How much is the profile driven by technology and how much by the market? Discussions may drag on for a long time but eventually a decision must be made. This one is a real decision because the profile becomes part of the standard.

In ISO, to which MPEG belongs, decisions must be made by consensus. The ISO definition of consensus is

General agreement, characterized by the absence of sustained opposition to substantial issues by any important part of the concerned interests and by a process that involves seeking to take into account the views of all parties concerned and to reconcile any conflicting arguments.

NOTE    Consensus need not imply unanimity.

Therefore MPEG subgroups make decisions based on this definition of consensus.

What about the exception made at plenaries I was talking about before? It may happen – and probably it did happen fewer than 10 times in the 30 years of MPEG history – that the party whose wish was overruled by a decision made by a subgroup based on the above definition of consensus challenges the subgroup consensus at the Friday plenary. The plenary applies the definition of consensus in a very conservative way to determine whether the challenge has to be accepted or the subgroup decision is confirmed.

We can now see that the question “is MPEG an autocracy, an oligarchy or a democracy?” is the wrong question. The right one is the title of this article “who decides in MPEG”. The answer is: MPEG members, if they want to decide. If not, the chairs or the convenor have the task of finding a way to a consensus or else declaring that no consensus was found. Then, no decision is made.

What is the difference between an image and a video frame?

The question looks innocent enough. A video is a sequence of images (called frames) captured and eventually displayed at a given frequency. However, by stopping at a specific frame of the sequence, a single video frame, i.e. an image, is obtained.

If we talk of a sequence of video frames, that would always be true. It would also be true if an image compression algorithm (an “intra-frame” coding system) is applied to each individual frame. Such a coding system may not give an exciting compression ratio, but can serve very well the needs of some applications, for instance those requiring the ability to decode an image using just one compressed image. This is the case of Motion JPEG (now largely forgotten) and Motion JPEG 2000 (used for movie distribution and other applications) or some profiles of MPEG video coding standards used for studio or contribution applications.

If the application domain requires more powerful compression algorithms, the design criteria are bound to be different. Interframe video compression that exploits the redundancy between frames must be used. In general, however, if video is compressed using an interframe coding mode, a single frame may very well not be an image because its pixels may have been encoded using pixels of some other frames. This can be seen in the image below, dating back 30 years to MPEG-1 times.

The first image (I-picture) at the left is compressed using only the pixels in the image. The fourth one (P-picture) is predictively encoded starting from the I-picture. The second and third images (B-pictures) are interpolated using the first and the fourth. This continues in the next frames, where the sequence can be P-B-B-B-P: the last P-picture is predicted from the first P-picture, and 3 interpolated pictures (B-pictures) are created from the first and the last P-pictures.
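
One consequence of these dependencies is that pictures cannot be decoded in the order in which they are displayed: a B-picture needs both of its reference pictures first. The following is a minimal Python sketch of that reordering, under the simplifying assumption that every B-picture depends on the nearest I- or P-picture on each side. The function name and the string representation of the sequence are illustrative only and are not taken from any MPEG specification.

```python
def decode_order(display_types):
    """Return display indices in an order that respects picture dependencies.

    display_types is a string such as "IBBP": one letter per picture, in
    display order. Simplifying assumption: each B-picture depends on the
    nearest I- or P-picture before it and after it.
    """
    order = []
    pending_b = []  # B-pictures waiting for their future reference picture
    for index, kind in enumerate(display_types):
        if kind == "B":
            pending_b.append(index)
        else:
            # An I- or P-picture is a reference: decode it first, then the
            # B-pictures that were waiting for it.
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # Trailing B-pictures without a future reference are not returned.
    return order


print(decode_order("IBBP"))      # [0, 3, 1, 2]
print(decode_order("IBBPBBBP"))  # [0, 3, 1, 2, 7, 4, 5, 6]
```

In the first call the P-picture (display index 3) is decoded before the two B-pictures that are displayed ahead of it.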

All MPEG interframe coding schemes – MPEG-1, MPEG-2, MPEG-4 Visual and AVC, MPEG-H (HEVC), and MPEG-I (VVC) – have intraframe encoded pictures. This is needed because in broadcasting applications the time it takes for a decoder to “tune-in” must be as short as possible. Having an intra-coded picture, say, every half a second or every second, is a way to achieve that. Having intra-coded pictures is also helpful in interactive applications where the user may wish to jump anywhere in a video.
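
The effect of the intra-picture spacing can be put into numbers. The sketch below, in Python, assumes a fixed frame rate and a known, sorted list of intra-coded picture positions; it ignores transmission and decoding time, and both function names are hypothetical.

```python
import bisect


def worst_case_tune_in_seconds(intra_period_frames, frame_rate):
    """Longest wait for the next intra-coded picture, in seconds."""
    return intra_period_frames / frame_rate


def decoding_start_frame(intra_frames, target_frame):
    """Latest intra-coded picture at or before target_frame.

    intra_frames must be a sorted list of frame indices.
    """
    position = bisect.bisect_right(intra_frames, target_frame) - 1
    if position < 0:
        raise ValueError("no intra-coded picture at or before the target")
    return intra_frames[position]


# An intra-coded picture every 25 frames at 50 frames per second.
print(worst_case_tune_in_seconds(25, 50))         # 0.5
# Jumping to frame 60 means starting to decode at frame 50.
print(decoding_start_frame([0, 25, 50, 75], 60))  # 50
```

A shorter intra period reduces both numbers, at the cost of spending more bits on intra-coded pictures.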

Therefore, some specific video frames in an interframe coding scheme can be images.

Why don’t we make the algorithms for image coding and intra-coded pictures of an interframe coding scheme the same?

We could, but this has never been done, for several reasons:

  1. The intra-coding mode is a subset of a general interframe video coding scheme. Such schemes are rather complex: over the years many coding tools have been designed, and when the intraframe coding mode is designed some tools are used because “they are already there”.
  2. Most applications employing an interframe coding scheme have strict real time decoding requirements. Hence complexity of decoding tools plays a significantly more critical role in an interframe coding scheme than in a still picture coding scheme.
  3. A large number of coding tools in an interframe video coding scheme are focused on motion-related processing.
  4. Because far more data is collected when capturing video than when capturing images, the impact of coding efficiency improvements is different.
  5. Real time delivery requirements of coded video have led MPEG to develop significantly different System Layer technologies (e.g. DASH) and make different compromises at the system layer.
  6. Comparisons between the performance of the still picture coding mode of the various interframe coding standards with available image coding standards have not been performed in an environment based on a design of tests agreed among experts from all areas.
  7. There is no proven need or significant benefit of forcing the still picture coding mode of an MPEG scheme to be the same as any image compression standard developed by JPEG or vice-versa.

There is no reason to believe that this conclusion will not be confirmed in future video coding systems. So why are there several image compression schemes that have no relationship with video coding systems? The answer is obvious: the industry that needs compressed images is different than the industry that needs compressed video. The requirements of the two industries are different and, in spite of the commonality of some compression tools, the specification of the image compression schemes and of the video compression schemes turn out to be different and incompatible.

One could say that the needs of traditional 2D image and video are well covered by existing standards. But what about new technologies that enable immersive 2D visual experiences?

One could take a top-down philosophical approach. This is intellectually rewarding but technology is not necessarily progressing following a rational approach. The alternative is to take a bottom-up experiential approach. MPEG has constantly taken the latter approach and, in this particular case, it acts in two directions:

  1. Metadata for Immersive Video (MIV). This represents a dynamic immersive visual experience with 3 streams of data: Texture, Depth and Metadata. Texture information is obtained by suitably projecting the scene on a series of suitably selected planes. Texture and Depth are currently encoded with HEVC.
  2. Point Clouds with a large number of points can efficiently represent immersive visual content. Point clouds are projected on a fixed number of planes and projections can be encoded using any video codec.
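
The idea of turning 3D content into pictures that a video codec can compress can be illustrated with a toy example. The Python sketch below orthographically projects coloured points onto a single plane and keeps, for each pixel, the point nearest to the plane. It is only an illustration of projection under assumed conventions (integer coordinates, one axis-aligned plane, a hypothetical function name); the actual MPEG specifications involve several planes and additional data that are not modelled here.

```python
def project_to_plane(points, width, height):
    """Orthographically project coloured 3D points onto the z = 0 plane.

    points: iterable of (x, y, z, colour) with integer x, y inside the image.
    Returns (depth, texture): two grids of height rows by width columns.
    Where several points fall on the same pixel, the one nearest to the
    plane (smallest z) wins. Empty pixels hold None.
    """
    depth = [[None] * width for _ in range(height)]
    texture = [[None] * width for _ in range(height)]
    for x, y, z, colour in points:
        if depth[y][x] is None or z < depth[y][x]:
            depth[y][x] = z
            texture[y][x] = colour
    return depth, texture


cloud = [(0, 0, 5, "red"), (0, 0, 2, "blue"), (1, 0, 7, "green")]
depth, texture = project_to_plane(cloud, width=2, height=1)
print(depth)    # [[2, 7]]
print(texture)  # [['blue', 'green']]
```

The two resulting grids play the role of the Depth and Texture pictures mentioned above, which can then be handed to a video encoder.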

Both #1 and #2 coding schemes include the equivalent of video intra-coded pictures. As for video, these are designed using the tools that exist in the equivalent of video inter-coded pictures.

MPEG and JPEG are grown up

Introduction

A group of MPEG and JPEG members have developed a proposal that seeks to leverage the impact MPEG and JPEG standards have had on thousands of companies and billions of people all over the world.

A few numbers related to 2018 tell a long story. At the device level, the installed base of MPEG-enabled devices was worth 2.8 trillion USD and the value of devices in that year was in excess of 1 trillion USD. At the service level, the revenues of the PayTV industry were ~230 billion USD and the total turnover of global digital terrestrial television was ~200 billion USD.

Why we need to do something

So far MPEG and JPEG were hosted by Subcommittee 29 (SC 29). The group thinks that it is time to revitalise the 27-year old SC 29 structure. To achieve the goal, let’s make the following considerations:

  1. MPEG has been and continues to be able to conceive strategic visions for new media user experiences, design work plans in response to industry needs, develop standards in close collaboration with client industries, demonstrate their performance and promote their use.
  2. For many years MPEG and JPEG have provided standards to operate and innovate the broadcast, broadband and mobile distribution industries, and the imaging industry, respectively;
  3. MPEG and JPEG have become the reference committee for their industries;
  4. MPEG reference industries’ needs for more standards continue to grow causing a sustained increase in MPEG members attending (currently 600);
  5. JPEG and MPEG have a track record of widely deployed standards developed for and in collaboration with other committees that require a more appropriate level of liaison;
  6. MPEG and JPEG operate as virtual SCs, each with a structure of interacting subgroups covering the required areas of expertise, including a strategic planning function;
  7. MPEG and JPEG have independent and universally recognised strong brands that must be preserved unfettered and enhanced;
  8. MPEG and JPEG are running standardisation projects whose operation must be guaranteed.

A Strengths-Weaknesses-Opportunities-Threats (SWOT) analysis has been carried out on MPEG. The results point to the need for MPEG

  1. To achieve an SC status compatible with its wide scope of work and large membership (1500 registered members and 600 attending physical meetings)
  2. To retain its scope and structure slightly amended to improve the match of standards with market needs and leverage internal talents
  3. To keep and enhance the MPEG brand.

What should be done

This is the proposal

  1. MPEG becomes a JTC 1 SC (SC 4x) with the title “MPEG compression and delivery of Moving Pictures, Audio and Other Data”;
  2. JPEG becomes SC 29 with the title “JPEG Coding of digital representations of images”;
  3. MPEG/JPEG subgroups become working groups (WG) or advisory groups (AG) of SC 4x/SC 29. MPEG adds a Market needs AG;
  4. Both SC 4x and SC 29 retain existing collaborations with ITU-T and their collaborative stance with other committees/bodies, e.g. by setting up joint working groups (JWG);
  5. SC 4x may create, in addition to genomics, WGs/JWGs for compression of other types of data with relevant committees, building on MPEG’s common tool set;
  6. If selected as secretariat (a proposal for a new SC 4x requires that a National Body be ready to take the secretariat), the Italian National Body (ITNB) is willing to make the following steps to expedite a smooth transition:
    1. Nominate the MPEG convenor as SC 4x chair;
    2. Nominate an “SC 4x chair elect” from a country other than Italy using criteria of 1) continuity of MPEG’s vision and strategy, 2) full understanding of the scope of SC 4x and 3) record of performance in the currently held position;
    3. Call for nominations of convenors of SC 4x working groups (WG). We nominate current subgroup chairs as convenors of the respective WGs.

The benefits of the proposal

The proposal brings a significant number of benefits

  1. It has a positive impact on the heavy load of MPEG and JPEG work plans:
    1. It supports and enhances MPEG work plan, as MPEG is moved to SC 4x, retaining its proven structure, modus operandi and relationships with client industries in scope;
    2. It supports and enhances JPEG work plan, as SC 29 elevates JPEG SGs to WGs, retaining its proven modus operandi and relationships with client industries in scope;
  2. It preserves and builds upon the established MPEG and JPEG brands;
  3. It retains and improves all features of MPEG success, in particular its structure and modus operandi:
    1. SC 4x holds its meetings collocated with the meetings of its WGs and AGs requesting to meet;
    2. SC 4x facilitates the formation of break-out groups during meetings and of ad hoc groups in between meetings;
    3. SC 4x exploits inter-group synergies by facilitating joint meetings between different WGs and AGs during physical meetings;
    4. SC 4x promotes use of every ICT tool that can improve its effectiveness, e.g. teleconferencing and MPEG-specific IT tools to support standards development.
  4. It enhances MPEG’s and JPEG’s collaboration stance with other committees via Joint Working Groups;
  5. It improves MPEG’s supplier-client relationship with its client industries with its new status;
  6. It adds formal governance to the well-honed MPEG and JPEG structures;
  7. It balances continuity and renewal of MPEG leadership at all levels;
  8. It formalises MPEG’s and JPEG’s high-profile standard reference roles for the video and image sectors, respectively.

The title and scope of SC 4x

Upon approval by JTC 1 and ratification by the TMB, SC 4x will assume the following

  1. Title: MPEG compression and delivery of moving pictures, audio and other data;
  2. Scope: Standardisation in the area of efficient delivery of moving pictures and audio, their descriptions and other data
    • Serve as the focus and proponent for JTC 1’s standardisation program for broadcast, broadband and mobile distribution based on analysis, compression, transport and consumption of digital moving pictures and audio, including conventional and immersive, generated or captured by any technology;
    • Serve as the focus and co-proponent for JTC 1’s standardisation program on efficient storage, processing and delivery of genomic and other data, in agreement and collaboration with the relevant committees.

The SC 4x structure

  1. WG 11 subgroups become:
    1. SC 4x Advisory Groups (AG) – do not produce standards;
    2. SC 4x Working Groups (WG) – produce standards;
  2. Minor adjustments to WG 11 subgroup structure made to strengthen productivity:
    1. New Market needs AG to enhance alignment of standards with market needs (to be installed at an appropriate time after establishment of SC 4x);
    2. Genome Coding moves from a Requirements activity to WG level;
  3. SC 4x retains WG 11’s collaborative stance with other committees/bodies, e.g. Collaborative Teams with ITU-T on Video Coding and Joint Working Groups with ISO/IEC committees to carry out commonly agreed projects.

Joint Working Groups (JWG) may be established if the need for common standards with other ISO/IEC committees is identified.

SC 4x will constantly monitor the state of standards development and adapt its structure accordingly, including by establishing new WGs, e.g. on standards for other data types.

SC 4x meetings

  1. For the time being, to effectively pursue its standardisation goals, SC 4x will continue its practice of quarterly meetings collocated with its AGs and WGs (same time/place) organised as an “SC 4x week”, virtually the same as that of MPEG. Extended plenaries are joint meetings of all WGs/AGs. SC 4x plenaries are held on the Sunday before and for an hour after the extended plenary on Friday. The last plenary deals with matters such as liaisons, meeting schedules, etc. that used to be handled by WG 11 plenaries.
Day Time Meeting Chaired by
Sunday 14-16 SC 4x plenary Chair
Monday 09-13 Extended SC 4x plenary to review AhG reports and plan for the week Chair elect
Wednesday 09-11 Extended SC 4x plenary to review work done so far by AGs/WGs and plan for the rest of the week Chair elect with Tech. Coord. AG Convenor
Friday 14-17 Extended SC 4x plenary to review and approve recommendations produced by AGs/WGs Chair
Friday 17-18 Plenary to act on matters requiring SC 4x intervention Chair
  2. WGs and AGs could have longer meeting durations (i.e. start before the first SC 4x meeting);
  3. Carry out a thorough review of all details of meeting sessions, agendas, document registration etc. with the involvement of all affected experts;
  4. Institut Mines Télécom’s unique services offered for the last 15 years would be warmly welcomed to preserve and continually improve WG 11’s operating efficiency with the involvement of all WG/AG members.

Title and scope of SC 29

(the following is a first attempt at defining the SC 29 title and scope after creation of SC 4x)

Upon approval by JTC 1, SC 29 will change its title and scope as follows:

  1. Title: JPEG coding of digital representations of images
  2. Scope: Development of international standards for
  • Efficient digital representations, processing and interchange of conventional and immersive images
  • Efficient digital representations of image-related sensory and digital data, such as medical and satellite
  • Support to digital image coding applications
  • Maintenance of ISO/IEC 13522

The structure of SC 29

  1. WG 1 subgroups become:
    1. SC 29 Advisory Groups (AG) – do not produce standards;
    2. SC 29 Working Groups (WG) – produce standards;
  2. SC 29 may set up Joint Working Groups, e.g. with SC 4x and TC 42, to carry out commonly agreed projects;

(the following is a first attempt at defining the SC 29 structure after creation of SC 4x, using the current SG structure of WG 1)

  1. SC 29 meetings: similar organisation as currently done by JPEG.

Why do MPEG and JPEG not work together?

This is a reasonable question, and has a simple answer. They can and should; however, the following should be taken into consideration.

In an MPEG moving picture codec, there is always a still picture coding mode, a mode of the general moving picture coding scheme, whose tools are a subset of the tools of the complete moving picture coding scheme.

No need or significant benefit has ever been found that justifies the adoption of a JPEG image coding scheme as the still picture coding mode of an MPEG moving picture coding scheme. Ditto for other schemes.

There is no reason to believe that the same should not apply to such media types as point cloud and lightfield. The still picture coding mode of a dynamic (time dependent) point cloud or lightfield coding scheme uses coding tools from the general coding scheme, not those independently developed for images.

Image compression schemes have their own market, separate from the market of moving picture compression schemes. Often the market for images anticipates the market for moving pictures. That is why independent JPEG standards can be useful.

Standards and collaboration

The hurdles of standardisation today

Making standards is not like any other task. In most cases it is technical in nature because it is about agreeing on and documenting how certain things should be done to claim to be conforming to the standard. Standards can be developed unilaterally by someone powerful enough to tell other people how they should do things. More often, however, standards are developed collaboratively by people who share an interest in a standard, i.e. in enabling those who are willing to do certain things in the same way to have an agreed reference.

Many years ago, making a standard required that those who developed it just talk to people in their environment. Before MPEG all television distribution industries were silos sharing at most some technologies here and there. This is shown in Figure 1.

Figure 1 – The video industry – Before MPEG

By specifying a common “digital baseband” layer, MPEG standards prompted industry convergence, as shown in Figure 2.

Figure 2 – The video industry – After MPEG

Today, and especially in the domain of digital media, it is common not to have the luxury of defining a standard in isolation. Systems get more and more complex and their individual elements – which may be implementations of standards – have to interact with other elements – which are again possibly implementations of other standards.

Some of these standards are produced by the same standards organisation while other standards are produced by different organisations. How is it possible to make sure that the “standard” elements used to make the system fit nicely, if there is no one overseeing the overall process?

The answer is that, indeed, it is not possible. If it happens it is because of luck or because there were enough people of good will who cared to attend the different groups to ensure coordination.

In some cases, all standards used to make the systems are produced by groups belonging to the same standards organisation. Some of these organisations, however, think that they can solve the problem of interoperability of standards by defining precise borders (“scopes”) within which a group of experts is allowed to develop standards.

This approach probably worked decently well in the past that is represented by Figure 1. However, this approach is destined to become less and less practical to implement and the result to become less and less satisfactory and reliable.

Many standards for use today must be conceived more on their ability to integrate or interface to technologies from different sources than on the traditional “territory” delimited by the “scope” or “terms of reference” etc. of the group that created them. This trend will only continue in the future. A new approach to standardisation must be developed and put to work.

A “systems-based” approach to standardisation

That the scope-based approach to standardisation is no longer serving its original purpose does not mean that it should be abandoned. It should just be given a different purpose. So far, the “scope” was more like the ring of walls that protected medieval towns against invasions. However, the scope should become an area of competence where “gates” can be “opened” so that “alliances” with other groups can be stipulated.

MPEG has put this attitude into practice for many years. The success of MPEG standards is largely based on this attitude.

Here follows a list of cases.

Collaboration with ISO/TC 276 for the creation of a standard for DNA read compression

In the first half of the 2010s MPEG identified “compression of DNA reads” generated by high-speed sequencing machines as an area where its coding expertise could be put to good use. MPEG investigated the field and identified a first set of requirements. As DNA can certainly not be assimilated to “moving pictures and audio” (the area MPEG is competent for) MPEG experts met with TC 276 Biotechnology to present their findings and propose a collaboration.

This move was positively received because TC 276 was indeed in need of such a standard but did not have the expertise to develop it. Therefore, MPEG and TC 276 engaged in a joint effort to refine the requirements of the project.

Then TC 276 entrusted the development of the standard (called MPEG-G) to MPEG on condition of regular reports to TC 276. Ballots on the standard at different phases of development were managed by MPEG, and the results were reported to TC 276.

Today the joint MPEG-TC 276 “venture” has produced 3 standards (File format, Compression, and API and Metadata), is finalising two standards (Reference software and Conformance) and has issued a Joint Call for Proposals for a 6th standard on “Genomic Annotation Representation”.

This is an excellent example of MPEG “entrepreneurship”. Some experts saw the opportunity to develop a DNA read compression standard using the MPEG “toolkit”. They “opened a gate” to communicate with the Biotechnology world and were lucky to find that Biotechnology was equally happy to “open a gate” on their side.

Collaboration with a non-ISO standards group in need of standards MPEG can develop

The MPEG-4 project, started in 1993 (!), has been the first consistent effort by MPEG to provide standards usable by the IT and mobile industry. The 3rd Generation Partnership Project (3GPP), so named because it started in December 1998, at the time of 3G (now we are at 5G and looking forward to 6G) is a very successful international endeavour providing standards for the entire protocol stack needed by the mobile industry (that largely includes the IT industry).

Quite a few MPEG experts attend 3GPP meetings. They are best placed to understand 3GPP’s early standardisation needs. Here I will mention two successful cases.

3GPP needed a file format for multimedia content and MPEG had developed the ISO Base Media File Format (ISOBMFF, aka MP4 File Format). MPEG liaised with 3GPP using its common members, understood the requirements and developed a specification that is essentially a restriction of ISOBMFF (ETSI TS 126 244).
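
For readers unfamiliar with ISOBMFF, a file is a sequence of “boxes”, each starting with a 32-bit big-endian size followed by a four-character type; a size of 1 signals that a 64-bit size follows the type, and a size of 0 means the box runs to the end of the file. The Python sketch below lists the top-level boxes of a byte string on that basis. The function name and the hand-made sample bytes are assumptions for illustration; this is not a complete or validating parser.

```python
import struct


def list_top_level_boxes(data):
    """Return (type, size) for each top-level box in an ISOBMFF byte string."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit size follows the four-character type.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header:
            raise ValueError("box size smaller than its header")
        boxes.append((box_type.decode("ascii", errors="replace"), size))
        offset += size
    return boxes


# Hand-made sample: a 16-byte 'ftyp' box followed by an empty 'mdat' box.
sample = struct.pack(">I4s4sI", 16, b"ftyp", b"isom", 0)
sample += struct.pack(">I4s", 8, b"mdat")
print(list_top_level_boxes(sample))  # [('ftyp', 16), ('mdat', 8)]
```

A restriction of the format, such as the one mentioned above, constrains which boxes and values may appear but leaves this basic box structure unchanged.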

More recently (end of the 2000’s), 3GPP initiated studies on adaptive streaming of time-dependent media. MPEG experts attending 3GPP saw the opportunity and convinced 3GPP that they should entrust the development of the standard to MPEG. MPEG developed requirements that were checked for consistency with 3GPP needs at 3GPP meetings by the common MPEG-3GPP experts. MPEG developed the 3GPP-DASH standard and the experts attending both MPEG and 3GPP relayed the necessary information to 3GPP and checked that the choices made by MPEG were agreeable to 3GPP. The 3GPP-DASH specification is ETSI TS 126 247.

In the case of DASH, an industry forum (DASH-IF) was formed to handle the needs of industry members who cannot afford to join MPEG. Experts attending both MPEG and DASH-IF relay information in both directions. The information brought to MPEG has given and is still giving rise to amendments to the DASH standard supporting more functionalities.

DASH is again an excellent example of MPEG entrepreneurship. MPEG “opened gates” to DASH that are still very busy and connect to many more external “gates”, e.g. Digital Video Broadcasting (DVB) and Hybrid Broadcast Broadband TV (HbbTV).

Collaboration with an ISO/IEC committee needing MPEG standards to enhance use of its standards

MPEG “opened gates” to JPEG to respond to its needs for “Systems” support to its standards.

The original JPEG image compression standard was widely used in the early days of digital video because it could use inexpensive VLSI chips implementing the relatively simple JPEG codec to store and transmit sequences of individual images (video frames). However, there was no specification for this “Motion JPEG”.

In the early 2000’s, JPEG 2000 appeared as the next generation image compression standard and JPEG needed a file format to store and transmit sequences of individually JPEG 2000 coded images. MPEG gladly adapted the ISOBMFF to make it able to carry sequences of JPEG 2000 and original JPEG images. The file format has allowed wider use of JPEG 2000, e.g. by the movie industry.

A related case is provided by the JPEG need to enable transport of two image compression formats – JPEG 2000 and JPEG XS – on the successful MPEG transport standard, MPEG-2 Transport Stream. In both cases MPEG received requests with a first set of requirements. It analysed the requests, added other requirements and sent them back to JPEG. An occasional face-to-face meeting was needed to close the requirements and to provide suggestions for minor extensions to the JPEG standard.

MPEG developed and balloted the amendment to carry JPEG 2000 and JPEG XS on MPEG-2 Transport Stream.

Collaboration to develop a specific instance of an ISO/IEC committee’s general standard

MPEG has two instances of this form of collaboration: Internet of Media Things (IoMT) and Network Based Media Processing (NBMP). The former is about APIs for discovery of and interaction between “Media Things” (e.g. cameras, microphones, displays and loudspeakers) communicating according to the Internet of Things (IoT) paradigm. The latter is a set of APIs allowing a device (e.g. a handset) to get some processing on media done by a networked service.

In JTC 1 MPEG stands out for its standards because they offer interoperability between implementations as opposed to most other standards which are about frameworks and architectures. This does not mean that MPEG does not need architectures. It needs them but it makes no sense to develop its own architectures. Much better if its architectures are specific instances of general architectures. This is true of IoMT and NBMP.

SC 41 was in the process of developing a general architecture for Internet of Things (IoT). MPEG developed a draft architecture and had it validated by SC 41.

SC 42 has developed a general architecture for Big Media Data. MPEG is developing Network Based Media Processing (NBMP), which can be seen as an instance of the general Big Media Data architecture. Work on aligning the architectures of the two developments is progressing.

MPEG collaborates on a standard that is also of interest to another ISO/IEC committee

This is the case of the Mixed and Augmented Reality Reference Model that MPEG has jointly developed with SC 24. This happened because SC 24 needed a framework standard for Mixed and Augmented Reality from the architectural viewpoint, and MPEG had similar interests, but from the bit-level interoperability viewpoint. SC 24 and MPEG agreed on the requirements for the standard and established a joint group (in this case, a Joint Ad hoc Group) with terms of reference, two chairs (one for SC 24 and one for MPEG) and a timeline. Ballots were handled by the SC 24 secretariat and the Joint Ad hoc Group resolved the comments from both NBs.

Collaboration to enable MPEG to develop an extension of one of its standards that falls under another ISO/IEC committee’s scope

This case is exemplified by a scenario under which MPEG and JPEG have collaborated towards a new image coding standard that is based on an MPEG moving picture coding standard.

This happened because conventional video coding standards need to support “clean” switching between different channels in broadcast applications, and random access for other use cases. This allows a decoder to reconstruct certain pictures in a video sequence (intra pictures), independently from other pictures in that sequence.
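The role of intra pictures in random access can be illustrated with a minimal sketch. It assumes a simplified model in which a sequence is a string of picture types ("I" for intra, "P" and "B" for predicted pictures) and in which decoding can start cleanly at any intra picture; the function name and the string representation are illustrative only and are not part of any MPEG specification.

```python
def random_access_point(picture_types: str, target: int) -> int:
    """Return the index of the closest intra picture at or before `target`.

    Simplified assumption: a decoder can start decoding at any "I" picture
    without needing earlier pictures, so seeking to `target` means starting
    from the most recent intra picture and decoding forward from there.
    """
    if not 0 <= target < len(picture_types):
        raise IndexError("target picture is outside the sequence")
    # Search for the last "I" in picture_types[0 : target + 1].
    index = picture_types.rfind("I", 0, target + 1)
    if index == -1:
        raise ValueError("no intra picture precedes the target")
    return index


# Hypothetical sequence with an intra picture every four pictures.
sequence = "IPPPIPPPIPPP"
start = random_access_point(sequence, 6)
```

In this toy sequence, seeking to picture 6 means decoding from picture 4 onwards. An image format built on the intra picture mode only ever needs the intra picture itself.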

MPEG wished to develop the High Efficiency Image Format (HEIF) by defining a special case of the ISOBMFF relative to the HEVC intra picture mode. In a face-to-face meeting this goal was agreed and HEIF is now a successful file format supporting many modalities of interest to users.

Conclusions

The scope of work of an ISO/IEC committee is certainly useful as a reference. However, the current trend toward more convergence and more complex systems that rely on multiple standards requires the more flexible “gate” approach exemplified above. A committee may “open gates” toward another committee and the two committees may agree on developing specific projects. This approach does not work in a “defence of territory” mode where collaborations are seen as limiting a committee’s freedom, but by seeing collaborations with other committees and groups as opportunities to develop standards with a larger field of use where the constituencies of both committees share the benefits.

The examples mentioned in this article are actual cases that show how the extent of the MPEG scope and the modalities of collaboration used have been made possible by the use of the “gate” approach to develop collaborative standards.

The talents, MPEG and the master

Introduction

In the parable of the Talents the Gospel tells the story of a master who entrusts 5 talents (a large amount of money at that time) to one servant and 2 talents to another before leaving on a long journey. The first servant works hard and doubles his talents, while the second plays safe and buries the talents. When the master returns, he rewards the first servant and punishes the second.

Thirty-one years ago, MPEG was given the field of standards for coding of moving pictures and audio to exploit. Now the master comes. To help him make the right judgement about the use of the talents that he gave, I will briefly review the milestones reached in these years. Of course, I am not going to revisit all the MPEG standardisation areas developed in the last 31 years. For that there are several posts in this blog (see the list at the bottom of the page) and the book A vision made real – Past, present and future of MPEG. I will just take some snapshots of the major achievements.

Making some media digital

Before MPEG-1 there had been attempts at making media digital, but MPEG-1 was the first standard that made the media really digital in consumer products: Video CD brought movies on CD, Digital Audio Broadcasting (DAB) made the first digital radio and MP3, well, that simply created the new music experience triggering a development that continues to this day. This was possible thanks to the vision that a global audio-video-systems standard would take over the world. It did.

Making television digital

MPEG-1 did not make all media digital; television was the major exception. This was an intricate world where politics, commercial interests, protection of culture and more had defied all attempts made by established standards organisations. MPEG applied its recipe and produced an effective MPEG-2 specification that added DSM-CC to support TV distribution on cable. Sharp vision, excellent technology and unstinting promotion efforts delivered the result.

Making media ICT friendly

MP3 encoding and decoding on PC was achieved in the early days of the standard, but an announcement by Intel that MPEG-2 video could be decoded in real time on their x86 chips made headlines. The real marriage between media and ICT – defined as IT + mobile – was the planned result of MPEG-4. It delivered two video standards in sequence (Visual and AVC), the ultimate audio format (AAC in all its variations), the File Format (ISO Base Media File Format – ISOBMFF), Fonts (Open Font Format) and a lot of other standard technologies still largely in use today in spite of the fast-evolving technology scenario.

Media not just for humans

MPEG-7, conceived in the mid-1990’s, was a project ahead of its time. It was triggered by the vision that 500 TV channels would become available thanks to the bandwidth savings of MPEG-2 on cable with the technology of the time. The idea was to enable the description of content – audio, video and multimedia – in the same bit-thrifty way as MPEG had done for MPEG-1/-2 and was doing for MPEG-4. Then descriptions would be distributed to machines to enable them to respond to human queries. Audio-Visual Description Profile (AVDP) is an example of how MPEG-7 is used in the content production world, but more is expected in the upcoming Video Coding for Machines work.

E-commerce of media

Around the turn of the millennium, there was an intense debate on how media could be handled in the new context enabled by MPEG standards. This had been triggered by the advent of Peer-to-Peer protocols that allowed new forms of distribution somehow at odds with practices and laws. With MPEG-21 MPEG developed a comprehensive framework and a suite of standards to enable e-commerce of media that respected the rights and interests of the parties involved. Some of these are the specification of: Digital Item (DI), identification of DIs and their components, protection of DIs, machine-readable languages to express rights and contracts, adaptation of DIs and more. Industry has taken pieces of MPEG-21, but not the entire framework yet.

Standards for media combinations

At the beginning of the new millennium MPEG had collected enough standards that the following question was asked: how can we combine a set of content items each represented by MPEG standards or, when MPEG standards are not available, by other standards, in a standard way? This was the start of MPEG-A, a suite of standards called Multimedia Application Formats (MAF). Examples are Surveillance AF, Interactive Music AF (IMAF), Augmented Reality AF (ARAF), Common Media AF (CMAF) and Multi-Image AF (MIAF). CMAF is actually affecting millions of streaming devices today.

Systems-Video-Audio à la carte

With the main elements of the MPEG-4 standard in place, MPEG had the need for systems, video and audio standards without being able to define a unified standard. This was the birth of 3 standard suites: MPEG-B (Systems), MPEG-C (Video) and MPEG-D (Audio). Among the most relevant standards we mention Common encryption format (CENC) for ISOBMFF and MPEG-2 TS, Reconfigurable Video Coding (RVC) and Unified speech and audio coding (USAC). The last is the only standard that is capable of encoding audio and speech with a quality superior to the best audio codec and the best speech codec.

Interacting with media

Media can be defined as virtual representations of audio and video information that match, hopefully in a faithful way, something that exists in the real world, or a representation of synthetically-generated audio and video information, or a mix of the two. MPEG started to tackle this issue in the middle of the first decade of the millennium, at the time Second Life offered an attractive paradigm for interaction with synthetically-generated audio and video information. MPEG developed MPEG-V, a framework and a suite of standards for the information flowing from sensors and to actuators and the characteristics of virtual world objects.

Getting media in any way

Broadcasting was the first system for mass distribution of media – audio and video. Originally, it was strictly one way, cable added return information, then the telecommunication networks provided the technical means to achieve full two-way distribution. With its MPEG-2 standard, MPEG provided the full stack from transport up. This was universally adopted by broadcasting, but the Internet Protocol (IP) was the transport selected for telecom distribution. With MPEG-H, MPEG provided a unified solution where content meant for one-way distribution can be seamlessly distributed in a two-way fashion. With this Systems-Video-Audio based suite of standards MPEG has achieved unification of media distribution.

Facing an unpredictable internet

Probably most readers have never heard of the Asynchronous Transfer Mode (ATM), designed to transport fixed-size packets on a fixed route set up between two points before transferring data. ATM’s AAL1 could have guaranteed bandwidth, but had to give way to the leaner and cheaper IP. The successful digitisation we live in is paid for with unpredictability. You start with a good bandwidth between you and the source, but a moment later the bandwidth available is cut by half. A disaster for those who want to provide reliable services. MPEG-DASH is the standard that allows a consumer device to request (video, mostly) information of the appropriate bitrate matching the bitrate made available by the network at a given instant.
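The client-side choice described above can be sketched in a few lines. This is only an illustration under stated assumptions: the bitrate ladder, the function name and the safety factor are hypothetical, and the DASH standard itself does not mandate any particular selection algorithm, which is left to the client.

```python
from typing import Sequence


def select_bitrate(available_bps: Sequence[int],
                   measured_throughput_bps: float,
                   safety_factor: float = 0.8) -> int:
    """Pick the highest advertised bitrate that fits the measured throughput.

    `safety_factor` (an assumed tuning knob) keeps some headroom so that a
    small drop in bandwidth does not immediately stall playback. If nothing
    fits, fall back to the lowest advertised bitrate.
    """
    if not available_bps:
        raise ValueError("no bitrates advertised")
    budget = measured_throughput_bps * safety_factor
    candidates = [b for b in available_bps if b <= budget]
    return max(candidates) if candidates else min(available_bps)


# Hypothetical ladder of bitrates advertised for one video, in bit/s.
ladder = [500_000, 1_500_000, 3_000_000, 6_000_000]

good_network = select_bitrate(ladder, 8_000_000)    # plenty of bandwidth
halved_network = select_bitrate(ladder, 4_000_000)  # bandwidth cut by half
```

With these assumed numbers, the client requests the 6 Mbit/s version while 8 Mbit/s is available and steps down to the 3 Mbit/s version when the bandwidth is halved, repeating the decision for each new segment it requests.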

The immersive media dream

MPEG has dreamt of immersion in media for a quarter of a century 😊. In the second half of the 1990’s MPEG developed the MPEG-2 Multiview Profile, the first attempt at providing the two eyes of a viewer with the kind of different information the eyes receive when they are hit by the light reflected by an object. The latest attempts were the Multiview and 3D extensions of HEVC. Technology is maturing, but the context is far from stable as companies providing solutions come and go. MPEG is developing standards in this slippery space based on 6 key points:

  1. Architecture for immersive services;
  2. Omnidirectional MediA Format (OMAF) for omnidirectional media applications (e.g. 360° video) and a basis for integration of other technologies;
  3. Immersive video starting from 3DoF+;
  4. Immersive audio (6DoF);
  5. Point Clouds providing an easy way to manipulate 3D visual objects;
  6. Network based Media Processing (NBMP) to allow a user to get the network to do some processing of their media.

Media devices are Things

The Internet of Things (IoT) paradigm is well known but how can we apply the general IoT paradigm to media? MPEG-IoMT (Internet of Media Things) is an MPEG standard suite providing interfaces, protocols and associated media-related information representations that enable advanced services and applications based on human to device and device to device interaction. IoMT will be the platform on which new standards such as Video Coding for Machines will be hosted.

More supple compression

MPEG video coding standards have been hugely successful. However, in certain domains, such as internet streaming, adoption encounters non-technical difficulties. Essential Video Coding (EVC) is the standard that will yield excellent performance with the prospect of easier licensing.

Compression for all

MPEG has developed an impressive number of technologies whose focus is on compression and transport of data. Some are strictly media-related. Others, however, have a more general applicability. That this is true and can be implemented is demonstrated by MPEG-G, a standard that allows efficient transport of DNA reads obtained by high-speed sequencing machines. MPEG-G compression is lossless and will allow savings on storage and transmission costs and in access to DNA information for clinical analyses.

The master returns

The master had a really long journey – 31 years – but has finally returned. Will he say to MPEG: “Well done, good and trustworthy servant; you have been trustworthy in a few things, I will put you in charge of many things; enter into the joy of your master” or will he say: “throw this lazy servant into the outer darkness, where there will be weeping and gnashing of teeth”?

Standards and business models

Introduction

Some could think that the title is an oxymoron. Indeed standards, certainly international ones, are published by not-for-profit organisations. How could they have a business model?

The answer is that around a standard there are quite a few entities, some of which are far from being not-for-profit.

Therefore, this article intends to analyse how business models can influence standards.

The actors of standardisation

Let’s first have a look at the actors of standardisation.

  1. The first actor is the organisation issuing standards. It may be an international organisation such as ISO, IEC or ETSI, or a trade association or an industry forum, but the organisation itself has not been designed to make money. A typical arrangement is a membership fee that allows an individual or a company employee to participate. Another is to make users of the standard pay to obtain the specification.
  2. The second actor is the staff of the standards developing organisation. Depending on the type of organisation, their role may be marginal or highly influential.
  3. The third actor is the company that is a member of the organisation issuing standards.
  4. The fourth actor is the expert, typically the personnel sent by the company to contribute to the development of the standards.

From the interaction of these actors, the standard is created. Then the standard creates an ecosystem. Companies become members of the ecosystem.

Why do companies participate in standard development?

Here is an initial list of motivations prompting companies to send their personnel to a standards committee.

  1. A company is interested in shaping the landscape of how a new technology will be used by concerned companies or industries. This is the case of Artificial Intelligence (AI), a technology that has recently matured and whose use has different, sometimes unexpected, implications. JTC 1/SC 42 has recently been formed to define AI architectures, frameworks, models etc. This kind of participation is not exclusive to companies. Universities find it useful to join this “exploratory” work because it may help them identify new research topics.
  2. A company is interested in developing a new product or launching a new service that requires a new standard technology.
  3. A company may be obliged by national regulations to participate in the development of a standard.
  4. A company or, more and more often, a university owns technology it believes is useful or even required to draft a standard that a committee plans to develop. A relevant case for this is MPEG, where the number of Non-Practicing Entities (NPE) is on the rise.
  5. A university or, not infrequently, a company wants to keep abreast of what is going on in a technology field or become aware as early as possible of the emergence of new standards that will affect its domain. MPEG is a typical case because it is a group open to new ideas and is attended by all relevant players.

Not all standards are born equal

The word “equal” in the title does not imply that there is a hierarchy of standards where some are more important than others. It simply means that the same name “standard” can be attached to quite different things.

The compact disc (CD) can be taken as the emblem of a traditional standard. Jointly developed by Philips and Sony, the CD quickly defeated the competing product by RCA and became the universal digital music distribution medium. The technical specification of the CD was originally defined in the Red Book and later became the IEC 60908 standard.

MPEG introduced a new process that replaced the development of a product, the marketplace success and the eventual ratification by a recognised standards organisation. This is how the process can be summarised:

  1. Identify the need of a new standard
  2. Develop requirements
  3. Issue call for proposals (CfP)
  4. Integrate technologies obtained from the CfP
  5. Draft the standard

In the early MPEG days, most participants were companies interested in developing new products or launching new services. They actively contributed to the standards because they needed them but also because they had relevant technologies developed in their laboratories.

Later the range of contributors to standard development got larger. The fact that in the mid-1990’s a patent from Columbia University, clearly an NPE, had been declared essential in MPEG-2 video made headlines and prompted many to follow suit. The trend so initiated continues to this day.

After MPEG-2 the next step was to revive the old model represented by the CD. MPEG-4 became just one “product” but other companies developed other “products”, some of which got recognition as a “standard” by a professional organisation. The creation of such standards implied the conversion of an internal company specification to the standard format of the professional organisation. The use of those “standards” was “free” in the sense that no fees were charged for their use. However, other strings of less immediate comprehension to laymen were typically attached.

MPEG (formally WG 11) is about “coding of moving pictures and audio”, but a parallel group called JPEG (formally WG 1) is about “coding of digital representations of images”. The two groups operate based on different “business models”. Today the ubiquitous JPEG standard for image compression is royalty free because the 20-year validity of any relevant patents has long expired. However, even before the 20-year limit was crossed, the JPEG standard could be used freely with no charge. The same happened to the less famous but still quite important JPEG 2000 standard used for movie distribution and to the less used JPEG XR standard.

More recently a consortium was formed to develop a royalty-free video compression specification. In rough, imperfect but sufficiently descriptive words, members of that consortium can freely use the specification in their products and services.

The business model of a standard is a serious matter

From the above we see that working for a standard has the basic motivation of creating a technology to enable a certain function in an ecosystem. The ecosystem can be anything from the ensemble of users of a product/service of a company, a country, an industry or the world at large. Beyond this common motivation, however, a company contributing to the development of a standard can have widely different motivations that I simplify as follows:

  1. The common technology is encumbered because, by rewarding inventions, the ecosystem has embedded the means to constantly innovate its enabling technologies for new products and services. This is the basis of the MPEG business model that has ensured 30 years of development to the digital media industry. It has advantages and disadvantages
    1. The advantage of this model is that, once a licence for the standard has been defined, no one can hold the community hostage.
    2. The disadvantage is that getting agreement to the licence may prove difficult, thus disabling or hampering the business of the entire community.
  2. The common technology is “free” because the members of the ecosystem have assessed that they do not have interest in the technology per se but only in the technology as an enabler of other functions on which their business is built. This is the case of Linux/Android and most web technologies. Here, too, there are advantages and disadvantages
    1. The advantage of this model is that anybody can access the technology by accepting the “free” licence.
    2. The disadvantage is that a member of the ecosystem can be excluded for whatever reason and have its business ruined.

Parallel worlds

It is clear now that “standard” is a name that can be assigned to things that have the promotion of the creation of an ecosystem in common but may be very different otherwise. The way the members of the ecosystem operate is completely different depending on whether the standard is encumbered or free.

Let’s see the concrete cases of MPEG and JPEG. In the late 1980’s they started as two groups of roughly the same size (30 people). Thirty years later MPEG has become a 600-member group and JPEG a 60-member group. In spite of handling similar technologies, less than 1% of MPEG members attend JPEG meetings. Why?

The answer is that MPEG decided (more correctly, was forced by the very complex IP environment of video and audio coding) to adopt the encumbered standard model while JPEG could decide to adopt the free standard model. In the last 30 years companies have heavily invested in MPEG standards because they have seen a return from that investment, and a host of new companies were created and are operating thanks to the reward coming from their inventions. JPEG developed less because fewer companies saw a return from the free standard business model.

A low number of common members exists between MPEG and JPEG because the MPEG and JPEG business models are antithetical.

Conclusions

I would like to apply the elements above to some current discussions where some people argue that, since JPEG and some MPEG experts have similar expertise, we should put them together to make “synergy”.

The simple answer to this argument is that it would be foolish to do that. JPEG people produce free standards because those who have a business in mind want to make money from something else that is enabled by the free standard. If JPEG people are mixed with MPEG people who want encumbered standards, the business of JPEG people is gone.

People had better play the game they know, not improvise competences in things they don’t know. It is more or less the same story as in Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge.

The right solution is MPEGfuture.
