What is the difference between an image and a video frame?

The question looks innocent enough. A video is a sequence of images (called frames) captured and eventually displayed at a given frequency. However, by stopping at a specific frame of the sequence, one obtains a single video frame, i.e. an image.

If we talk of a sequence of video frames, that would always be true. It would also be true if an image compression algorithm (an “intra-frame” coding system) were applied to each individual frame. Such a coding system may not offer an exciting compression ratio, but it serves very well the needs of some applications, for instance those that must be able to decode any frame from its compressed data alone. This is the case of Motion JPEG (now largely forgotten), Motion JPEG 2000 (used for movie distribution and other applications) and some profiles of the MPEG video coding standards used for studio or contribution applications.

If the application domain requires more powerful compression algorithms, the design criteria are bound to be different. Interframe video compression, which exploits the redundancy between frames, must be used. In general, however, if video is compressed using an interframe coding mode, a single frame may very well not be an image, because its pixels may have been encoded using pixels of other frames. This can be seen in the image below, dating back 30 years to MPEG-1 times.

The first picture (I-picture) on the left is compressed using only its own pixels. The fourth one (P-picture) is predictively encoded starting from the I-picture. The second and third pictures (B-pictures) are interpolated using the first and the fourth. This continues in the following frames, where the sequence can be P-B-B-B-P: the last P-picture is predicted from the first P-picture, and the 3 interpolated pictures (B-pictures) are created from the first and last P-pictures.
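The dependency structure just described can be sketched in a few lines of Python. This is a hypothetical illustration: the function name and the GOP string notation are mine, not part of any MPEG specification.

```python
# Model the decode dependencies of a display-order GOP string such as
# "IBBPBBP": I-pictures are self-contained, P-pictures reference the
# previous I/P anchor, and B-pictures reference the surrounding anchors.

def references(gop: str) -> dict:
    """Map each picture index to the indices it is predicted from."""
    refs = {}
    anchors = [i for i, c in enumerate(gop) if c in "IP"]
    for i, c in enumerate(gop):
        if c == "I":
            refs[i] = []                                    # intra: no references
        elif c == "P":
            refs[i] = [max(a for a in anchors if a < i)]    # previous anchor
        else:  # "B"
            refs[i] = [max(a for a in anchors if a < i),
                       min(a for a in anchors if a > i)]    # both neighbours
    return refs

print(references("IBBPBBP"))
# {0: [], 1: [0, 3], 2: [0, 3], 3: [0], 4: [3, 6], 5: [3, 6], 6: [3]}
```

Note that only the I-picture (index 0) has an empty reference list: it is the only frame in the group that is, by itself, an image in the sense discussed above.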

All MPEG interframe coding schemes – MPEG-1, MPEG-2, MPEG-4 Visual and AVC, MPEG-H (HEVC), and MPEG-I (VVC) – include intra-coded pictures. This is needed because in broadcasting applications the time it takes for a decoder to “tune in” must be as short as possible. Having an intra-coded picture, say, every half second or every second is a way to achieve that. Intra-coded pictures are also helpful in interactive applications where the user may wish to jump anywhere in a video.
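The arithmetic behind “every half second” is simple: a decoder that tunes in just after an I-picture must wait for the next one before it can display anything, so the worst-case tune-in delay is roughly one intra period. A minimal sketch, with illustrative numbers not taken from any standard:

```python
# Worst-case "tune-in" delay: a decoder joining the stream right after
# an I-picture must wait one full intra period for the next one.

def worst_case_tune_in_s(frame_rate_hz: float, intra_period_frames: int) -> float:
    return intra_period_frames / frame_rate_hz

# 25 frames/s with an I-picture every 12 frames (a common broadcast GOP):
print(worst_case_tune_in_s(25, 12))   # 0.48 -> about half a second
```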

Therefore, some specific video frames in an interframe coding scheme can be images.

Why don’t we make the algorithms for image coding and intra-coded pictures of an interframe coding scheme the same?

We could, but this has never been done, for several reasons:

  1. The intra-coding mode is a subset of a general interframe video coding scheme. Such schemes are rather complex; over the years many coding tools have been designed, and when the intraframe coding mode is specified some tools are used simply because “they are already there”.
  2. Most applications employing an interframe coding scheme have strict real-time decoding requirements. Hence the complexity of decoding tools plays a significantly more critical role in an interframe coding scheme than in a still-picture coding scheme.
  3. A large number of coding tools in an interframe video coding scheme are focused on motion-related processing.
  4. Because capturing video generates far more data than capturing images, the impact of coding-efficiency improvements is different in the two domains.
  5. Real time delivery requirements of coded video have led MPEG to develop significantly different System Layer technologies (e.g. DASH) and make different compromises at the system layer.
  6. Comparisons between the performance of the still-picture coding mode of the various interframe coding standards and the available image coding standards have never been performed in an environment based on a test design agreed among experts from all areas.
  7. There is no proven need or significant benefit of forcing the still picture coding mode of an MPEG scheme to be the same as any image compression standard developed by JPEG or vice-versa.

There is no reason to believe that this conclusion will not be confirmed by future video coding systems. So why are there several image compression schemes that have no relationship with video coding systems? The answer is obvious: the industry that needs compressed images is different from the industry that needs compressed video. The requirements of the two industries are different and, in spite of the commonality of some compression tools, the specifications of the image compression schemes and of the video compression schemes turn out to be different and incompatible.

One could say that the needs of traditional 2D image and video are well covered by existing standards. But what about new technologies that enable immersive visual experiences?

One could take a top-down philosophical approach. This is intellectually rewarding, but technology does not necessarily progress following a rational path. The alternative is a bottom-up experiential approach. MPEG has constantly taken the latter and, in this particular case, it acts in two directions:

  1. Metadata for Immersive Video (MIV). This represents a dynamic immersive visual experience with 3 streams of data: Texture, Depth and Metadata. Texture information is obtained by projecting the scene onto a series of suitably selected planes. Texture and Depth are currently encoded with HEVC.
  2. Point Clouds. A large number of points can efficiently represent immersive visual content. Point clouds are projected onto a fixed number of planes, and the projections can be encoded using any video codec.

Both coding schemes (#1 and #2) include the equivalent of video intra-coded pictures. As with video, these are designed using the tools that exist for the equivalent of video inter-coded pictures.
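The projection idea behind #2 can be illustrated with a deliberately simplified sketch. This is not the actual MPEG point cloud compression algorithm; the function and its parameters are invented for illustration only:

```python
# Toy illustration of point cloud projection: orthographically project
# (x, y, z) points onto the XY plane, keeping the nearest z per pixel.
# The resulting 2D depth map is the kind of picture an ordinary video
# codec could then compress.

def project_to_xy_plane(points, width, height):
    """Return a sparse depth map {(x, y): z} for points inside the plane."""
    depth = {}
    for x, y, z in points:
        if 0 <= x < width and 0 <= y < height:
            if (x, y) not in depth or z < depth[(x, y)]:
                depth[(x, y)] = z   # keep the point closest to the plane
    return depth

cloud = [(1, 1, 5), (1, 1, 3), (2, 0, 7), (9, 9, 1)]
print(project_to_xy_plane(cloud, 4, 4))   # {(1, 1): 3, (2, 0): 7}
```

The point at (9, 9, 1) falls outside the 4×4 plane and is dropped; in the real schemes, multiple planes are used so that every point is captured by at least one projection.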

MPEG and JPEG are grown up

Introduction

A group of MPEG and JPEG members have developed a proposal that seeks to leverage the impact MPEG and JPEG standards have had on thousands of companies and billions of people all over the world.

A few numbers related to 2018 tell a compelling story. At the device level, the installed base of MPEG-enabled devices was worth 2.8 trillion USD, and the value of devices sold in that year exceeded 1 trillion USD. At the service level, the revenues of the PayTV industry were ~230 billion USD, and the total turnover of global digital terrestrial television was ~200 billion USD.

Why we need to do something

So far MPEG and JPEG have been hosted by Subcommittee 29 (SC 29). The group thinks that it is time to revitalise the 27-year-old SC 29 structure. To achieve this goal, consider the following:

  1. MPEG has been and continues to be able to conceive strategic visions for new media user experiences, design work plans in response to industry needs, develop standards in close collaboration with client industries, demonstrate their performance and promote their use.
  2. For many years MPEG and JPEG have provided standards to operate and innovate the broadcast, broadband and mobile distribution industries, and the imaging industry, respectively;
  3. MPEG and JPEG have become the reference committees for their respective industries;
  4. MPEG’s reference industries’ needs for more standards continue to grow, causing a sustained increase in the number of MPEG members attending meetings (currently 600);
  5. JPEG and MPEG have a track record of widely deployed standards developed for and in collaboration with other committees that require a more appropriate level of liaison;
  6. MPEG and JPEG operate as virtual SCs, each with a structure of interacting subgroups covering the required areas of expertise, including a strategic planning function;
  7. MPEG and JPEG have independent and universally recognised strong brands that must be preserved and enhanced;
  8. MPEG and JPEG are running standardisation projects whose operation must be guaranteed;

A Strengths-Weaknesses-Opportunities-Threats (SWOT) analysis has been carried out on MPEG. The results point to the need for MPEG

  1. To achieve an SC status compatible with its wide scope of work and large membership (1500 registered members and 600 attending physical meetings)
  2. To retain its scope and structure slightly amended to improve the match of standards with market needs and leverage internal talents
  3. To keep and enhance the MPEG brand.

What should be done

This is the proposal:

  1. MPEG becomes a JTC 1 SC (SC 4x) with the title “MPEG compression and delivery of Moving Pictures, Audio and Other Data”;
  2. JPEG becomes SC 29 with the title “JPEG Coding of digital representations of images”;
  3. MPEG/JPEG subgroups become working groups (WG) or advisory groups (AG) of SC 4x/SC 29. MPEG adds a Market needs AG;
  4. Both SC 4x and SC 29 retain existing collaborations with ITU-T and their collaborative stance with other committees/bodies, e.g. by setting up joint working groups (JWG);
  5. SC 4x may create, in addition to genomics, WGs/JWGs for compression of other types of data with relevant committees, building on MPEG’s common tool set;
  6. If selected as secretariat (a proposal for a new SC 4x requires that a National Body be ready to take the secretariat), the Italian National Body (ITNB) is willing to make the following steps to expedite a smooth transition:
    1. Nominate the MPEG convenor as SC 4x chair;
    2. Nominate an “SC 4x chair elect” from a country other than Italy, using the criteria of 1) continuity of MPEG’s vision and strategy, 2) full understanding of the scope of SC 4x and 3) record of performance in the currently held position;
    3. Call for nominations of convenors of SC 4x working groups (WG). We nominate the current subgroup chairs as convenors of the respective WGs.

The benefits of the proposal

The proposal brings a significant number of benefits:

  1. It has a positive impact on the heavy load of MPEG and JPEG work plans:
    1. It supports and enhances the MPEG work plan, as MPEG moves to SC 4x, retaining its proven structure, modus operandi and relationships with the client industries in its scope;
    2. It supports and enhances the JPEG work plan, as SC 29 elevates the JPEG SGs to WGs, retaining its proven modus operandi and relationships with the client industries in its scope;
  2. It preserves and builds upon the established MPEG and JPEG brands;
  3. It retains and improves all features of MPEG success, in particular its structure and modus operandi:
    1. SC 4x holds its meetings collocated with the meetings of its WGs and AGs requesting to meet;
    2. SC 4x facilitates the formation of break-out groups during meetings and of ad hoc groups in between meetings;
    3. SC 4x exploits inter-group synergies by facilitating joint meetings between different WGs and AGs during physical meetings;
    4. SC 4x promotes the use of every ICT tool that can improve its effectiveness, e.g. teleconferencing and MPEG-specific IT tools to support standards development.
  4. It enhances MPEG’s and JPEG’s collaboration stance with other committees via Joint Working Groups;
  5. It improves MPEG’s supplier-client relationship with its client industries with its new status;
  6. It adds formal governance to the well-honed MPEG and JPEG structures;
  7. It balances continuity and renewal of MPEG leadership at all levels;
  8. It formalises MPEG’s and JPEG’s high-profile standard reference roles for the video and image sectors, respectively.

The title and scope of SC 4x

Upon approval by JTC 1 and ratification by the TMB, SC 4x will assume the following:

  1. Title: MPEG compression and delivery of moving pictures, audio and other data;
  2. Scope: Standardisation in the area of efficient delivery of moving pictures and audio, their descriptions and other data
    • Serve as the focus and proponent for JTC 1’s standardisation program for broadcast, broadband and mobile distribution based on analysis, compression, transport and consumption of digital moving pictures and audio, including conventional and immersive, generated or captured by any technology;
    • Serve as the focus and co-proponent for JTC 1’s standardisation program on efficient storage, processing and delivery of genomic and other data, in agreement and collaboration with the relevant committees.

The SC 4x structure

  1. WG 11 subgroups become:
    • SC 4x Advisory Groups (AG) – do not produce standards;
    • SC 4x Working Groups (WG) – produce standards;
  2. Minor adjustments to the WG 11 subgroup structure are made to strengthen productivity:
    • A new Market needs AG to enhance alignment of standards with market needs (to be installed at an appropriate time after establishment of SC 4x);
    • Genome Coding moves from a Requirements activity to WG level;
  3. SC 4x retains WG 11’s collaborative stance with other committees/bodies, e.g. Collaborative Teams with ITU-T on Video Coding and Joint Working Groups with ISO/IEC committees to carry out commonly agreed projects.

Joint Working Groups (JWG) may be established if the need for common standards with other ISO/IEC committees is identified.

SC 4x will constantly monitor the state of standards development and adapt its structure accordingly, including by establishing new WGs, e.g. on standards for other data types.

SC 4x meetings

  1. For the time being, to effectively pursue its standardisation goals, SC 4x will continue the practice of quarterly meetings collocated with those of its AGs and WGs (same time/place), organised as an “SC 4x week” virtually identical to an MPEG week. Extended plenaries are joint meetings of all WGs/AGs. SC 4x plenaries are held on the Sunday before the week and for an hour after the extended plenary on Friday. The last plenary deals with matters such as liaisons, meeting schedules, etc. that used to be handled by WG 11 plenaries.
| Day | Time | Meeting | Chaired by |
| --- | --- | --- | --- |
| Sunday | 14-16 | SC 4x plenary | Chair |
| Monday | 09-13 | Extended SC 4x plenary to review AhG reports and plan for the week | Chair elect |
| Wednesday | 09-11 | Extended SC 4x plenary to review work done so far by AGs/WGs and plan for the rest of the week | Chair elect with Tech. Coord. AG Convenor |
| Friday | 14-17 | Extended SC 4x plenary to review and approve recommendations produced by AGs/WGs | Chair |
| Friday | 17-18 | Plenary to act on matters requiring SC 4x intervention | Chair |
  2. WGs and AGs could have longer meeting durations (i.e. start before the first SC 4x meeting);
  3. A thorough review of all details of meeting sessions, agendas, document registration etc. will be carried out with the involvement of all affected experts;
  4. Institut Mines Télécom’s unique services, offered for the last 15 years, would be warmly welcomed to preserve and continually improve WG 11’s operating efficiency with the involvement of all WG/AG members.

Title and scope of SC 29

(the following is a first attempt at defining the SC 29 title and scope after creation of SC 4x)

Upon approval by JTC 1, SC 29 will change its title and scope as follows:

  1. Title: JPEG coding of digital representations of images
  2. Scope: Development of international standards for
  • Efficient digital representations, processing and interchange of conventional and immersive images
  • Efficient digital representations of image-related sensory and digital data, such as medical and satellite
  • Support to digital image coding applications
  • Maintenance of ISO/IEC 13522

The structure of SC 29

  1. WG 1 subgroups become:
    • SC 29 Advisory Groups (AG) – do not produce standards;
    • SC 29 Working Groups (WG) – produce standards;
  2. SC 29 may set up Joint Working Groups, e.g. with SC 4x and TC 42, to carry out commonly agreed projects.

(the following is a first attempt at defining the SC 29 structure after creation of SC 4x, using the current SG structure of WG 1)

  1. SC 29 meetings: similar organisation as currently done by JPEG.

Why don’t MPEG and JPEG work together?

This is a reasonable question with a simple answer: they can and should. However, the following should be taken into consideration.

In an MPEG moving picture codec there is always a still-picture coding mode: a mode of the general moving picture coding scheme whose tools are a subset of the tools of the complete scheme.

No need or significant benefit has ever been found that justifies the adoption of a JPEG image coding scheme as the still-picture coding mode of an MPEG moving picture coding scheme. Ditto for other schemes.

There is no reason to believe that the same should not apply to such media types as point cloud and lightfield. The still picture coding mode of a dynamic (time dependent) point cloud or lightfield coding scheme uses coding tools from the general coding scheme, not those independently developed for images.

Image compression schemes have their own market, separate from the market of moving picture compression schemes. Often the market for images anticipates the market for moving pictures. That is why independent JPEG standards can be useful.

Posts in this thread

Standards and collaboration

The hurdles of standardisation today

Making standards is not like other tasks. In most cases it is technical in nature, because it is about agreeing on and documenting how certain things should be done in order to claim conformance to the standard. Standards can be developed unilaterally by someone powerful enough to tell other people how they should do things. More often, however, standards are developed collaboratively by people who share an interest in a standard, i.e. in giving those who are willing to do certain things in the same way an agreed reference.

Many years ago, making a standard only required that those who developed it talk to people in their environment. Before MPEG, all television distribution industries were silos, sharing at most some technologies here and there. This is shown in Figure 1.

Figure 1 – The video industry – Before MPEG

By specifying a common “digital baseband” layer, MPEG standards prompted industry convergence, as shown in Figure 2.

Figure 2 – The video industry – After MPEG

Today, and especially in the domain of digital media, one rarely has the luxury of defining a standard in isolation. Systems get more and more complex, and their individual elements – which may be implementations of standards – have to interact with other elements – which are possibly implementations of other standards.

Some of these standards are produced by the same standards organisation while other standards are produced by different organisations. How is it possible to make sure that the “standard” elements used to make the system fit nicely, if there is no one overseeing the overall process?

The answer is that, indeed, it is not possible. If it happens, it is thanks to luck, or because enough people of good will cared to attend the different groups and ensure coordination.

In some cases, all standards used to make the systems are produced by groups belonging to the same standards organisation. Some of these organisations, however, think that they can solve the problem of interoperability of standards by defining precise borders (“scopes”) within which a group of experts is allowed to develop standards.

This approach probably worked decently well in the past represented by Figure 1. However, it is destined to become less and less practical to implement, and its results less and less satisfactory and reliable.

Many standards today must be conceived more for their ability to integrate with or interface to technologies from different sources than for the traditional “territory” delimited by the “scope” or “terms of reference” of the group that created them. This trend will only continue in the future. A new approach to standardisation must be developed and put to work.

A “systems-based” approach to standardisation

That the scope-based approach to standardisation is no longer serving its original purpose does not mean that it should be abandoned. It should just be given a different purpose. So far, the “scope” was more like the ring of walls that protected medieval towns against invasions. However, the scope should become an area of competence where “gates” can be “opened” so that “alliances” with other groups can be stipulated.

MPEG has put this attitude into practice for many years. The success of MPEG standards is largely based on this attitude.

Here follows a list of cases.

Collaboration with ISO/TC 276 for the creation of a standard for DNA read compression

In the first half of the 2010s, MPEG identified “compression of DNA reads” generated by high-speed sequencing machines as an area where its coding expertise could be put to good use. MPEG investigated the field and identified a first set of requirements. As DNA can certainly not be assimilated to “moving pictures and audio” (the area MPEG is competent for), MPEG experts met with TC 276 Biotechnology to present their findings and propose a collaboration.

This move was positively received because TC 276 was indeed in need of such a standard but did not have the expertise to develop it. Therefore, MPEG and TC 276 engaged in a joint effort to refine the requirements of the project.

Then TC 276 entrusted the development of the standard (called MPEG-G) to MPEG on condition of regular reports to TC 276. Ballots on the standard at different phases of development were managed by MPEG, and the results were reported to TC 276.

Today the joint MPEG-TC 276 “venture” has produced 3 standards (File format, Compression, and API and Metadata), is finalising two standards (Reference software and Conformance) and has issued a Joint Call for Proposals for a 6th standard on “Genomic Annotation Representation”.

This is an excellent example of MPEG “entrepreneurship”. Some experts saw the opportunity to develop a DNA read compression standard using the MPEG “toolkit”. They “opened a gate” to communicate with the Biotechnology world and were lucky to find that Biotechnology was equally happy to “open a gate” on their side.

Collaboration with a non-ISO standards group in need of standards MPEG can develop

The MPEG-4 project, started in 1993 (!), was the first concerted effort by MPEG to provide standards usable by the IT and mobile industries. The 3rd Generation Partnership Project (3GPP), so named because it started in December 1998, at the time of 3G (we are now at 5G and looking forward to 6G), is a very successful international endeavour providing standards for the entire protocol stack needed by the mobile industry (which largely includes the IT industry).

Quite a few MPEG experts attend 3GPP meetings. They are best placed to understand 3GPP’s early standardisation needs. Here I will mention two successful cases.

3GPP needed a file format for multimedia content, and MPEG had developed the ISO Base Media File Format (ISOBMFF, aka the MP4 File Format). MPEG liaised with 3GPP through its common members, understood the requirements and developed a specification that is essentially a restriction of ISOBMFF (ETSI TS 126 244).
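The adaptability of ISOBMFF comes from its simple “box” layout: every box starts with a 4-byte big-endian size followed by a 4-byte type code. The sketch below parses the top-level boxes of a buffer; it is a simplified illustration (it ignores 64-bit ‘largesize’ boxes and does not descend into nested boxes):

```python
import struct

def top_level_boxes(data: bytes):
    """Yield (type, size) for each top-level ISOBMFF box in a buffer."""
    pos = 0
    while pos + 8 <= len(data):
        size, = struct.unpack_from(">I", data, pos)        # 4-byte box size
        box_type = data[pos + 4:pos + 8].decode("ascii")   # 4-char type code
        yield box_type, size
        pos += size

# A toy buffer: a minimal 'ftyp' box followed by an empty 'free' box.
buf = struct.pack(">I4s4sI", 16, b"ftyp", b"isom", 0) + struct.pack(">I4s", 8, b"free")
print(list(top_level_boxes(buf)))   # [('ftyp', 16), ('free', 8)]
```

Because any parser can skip boxes it does not understand by jumping `size` bytes ahead, the same container could later carry new payloads, which is why the format was easy to restrict or extend for 3GPP, JPEG 2000 and others.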

More recently (at the end of the 2000s), 3GPP initiated studies on adaptive streaming of time-dependent media. MPEG experts attending 3GPP saw the opportunity and convinced 3GPP to entrust the development of the standard to MPEG. MPEG developed requirements that were checked for consistency with 3GPP needs at 3GPP meetings by the common MPEG-3GPP experts. MPEG developed the DASH standard, and the experts attending both MPEG and 3GPP relayed the necessary information to 3GPP and checked that the choices made by MPEG were agreeable to 3GPP. The 3GPP-DASH specification is ETSI TS 126 247.
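DASH standardises the media presentation description and the segment formats, while deliberately leaving the adaptation logic to the client. A hypothetical sketch of that client-side logic follows; the heuristic, the safety margin and the bitrate ladder are illustrative, not part of the standard:

```python
# Per-segment rate adaptation: pick the highest-bitrate representation
# that a safety fraction of the measured throughput can sustain.

def pick_representation(bitrates_bps, measured_throughput_bps, safety=0.8):
    """Return the chosen bitrate; fall back to the lowest if none fits."""
    usable = [b for b in sorted(bitrates_bps)
              if b <= safety * measured_throughput_bps]
    return usable[-1] if usable else min(bitrates_bps)

ladder = [500_000, 1_200_000, 3_000_000, 6_000_000]
print(pick_representation(ladder, 4_000_000))   # 3000000
```

Keeping this logic out of the standard is what lets different players compete on adaptation quality while remaining interoperable at the format level.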

In the case of DASH, an industry forum (DASH-IF) was formed to handle the needs of industry members who cannot afford to join MPEG. Experts attending both MPEG and DASH-IF relay information in both directions. The information brought to MPEG has given and is still giving rise to amendments to the DASH standard supporting more functionalities.

DASH is again an excellent example of MPEG entrepreneurship. MPEG “opened gates” to DASH that are still very busy and connect to many more external “gates”, e.g. Digital Video Broadcasting (DVB) and Hybrid Broadcast Broadband TV (HbbTV).

Collaboration with an ISO/IEC committee needing MPEG standards to enhance use of its standards

MPEG “opened gates” to JPEG to respond to its needs for “Systems” support to its standards.

The original JPEG image compression standard was widely used in the early days of digital video because inexpensive VLSI chips implementing the relatively simple JPEG codec could be used to store and transmit sequences of individual images (video frames). However, there was no specification for this “Motion JPEG”.

In the early 2000s, JPEG 2000 appeared as the next-generation image compression standard, and JPEG needed a file format to store and transmit sequences of individually JPEG 2000-coded images. MPEG gladly adapted the ISOBMFF to carry sequences of JPEG 2000 and original JPEG images. The file format has allowed wider use of JPEG 2000, e.g. by the movie industry.

A related case is the JPEG need to enable transport of two image compression formats – JPEG 2000 and JPEG XS – over the successful MPEG transport standard, MPEG-2 Transport Stream. In both cases MPEG received requests with a first set of requirements. It analysed the requests, added other requirements and sent them back to JPEG. An occasional face-to-face meeting was needed to close the requirements and to provide suggestions for minor extensions to the JPEG standard.

MPEG developed and balloted the amendment to carry JPEG 2000 and JPEG XS on MPEG-2 Transport Stream.

Collaboration to develop a specific instance of an ISO/IEC committee’s general standard

MPEG has two instances of this form of collaboration: Internet of Media Things (IoMT) and Network Based Media Processing (NBMP). The former is about APIs for discovery of and interaction between “Media Things” (e.g. cameras, microphones, displays and loudspeakers) communicating according to the Internet of Things (IoT) paradigm. The latter is a set of APIs allowing a device (e.g. a handset) to get some processing on media done by a networked service.

In JTC 1, MPEG stands out because its standards offer interoperability between implementations, as opposed to most other standards, which are about frameworks and architectures. This does not mean that MPEG does not need architectures. It needs them, but it makes no sense for MPEG to develop its own; much better if its architectures are specific instances of general ones. This is true of IoMT and NBMP.

SC 41 was in the process of developing a general architecture for Internet of Things (IoT). MPEG developed a draft architecture and had it validated by SC 41.

SC 42 has developed a general architecture for Big Media Data. MPEG is developing Network Based Media Processing (NBMP), which can be seen as an instance of the general Big Media Data architecture. Work on aligning the architectures of the two developments is progressing.

MPEG collaborates on a standard that is also of interest to another ISO/IEC committee

This is the case of the Mixed and Augmented Reality Reference Model that MPEG has jointly developed with SC 24. This happened because SC 24 needed a framework standard for Mixed and Augmented Reality, from the architectural viewpoint and MPEG had similar interests, but from the bit-level interoperability viewpoint. SC 24 and MPEG agreed on the requirements for the standard and established a joint group (in this case, a Joint Ad hoc Group) with terms of reference, two chairs (one for SC 24 and one for MPEG) and a timeline. Ballots were handled by the SC 24 secretariat and the Joint Ad hoc Group resolved the comments from both NBs.

Collaboration to enable MPEG to develop an extension of one of its standards that falls under another ISO/IEC committee’s scope

This case is exemplified by a scenario under which MPEG and JPEG have collaborated towards a new image coding standard that is based on an MPEG moving picture coding standard.

This happened because conventional video coding standards need to support “clean” switching between different channels in broadcast applications, and random access for other use cases. This allows a decoder to reconstruct certain pictures in a video sequence (intra pictures), independently from other pictures in that sequence.

MPEG wished to develop the High Efficiency Image File Format (HEIF) by defining a special case of the ISOBMFF relative to the HEVC intra-picture mode. In a face-to-face meeting this goal was agreed, and HEIF is now a successful file format supporting many modalities of interest to users.

Conclusions

The scope of work of an ISO/IEC committee is certainly useful as a reference. However, the current trend toward more convergence and more complex systems that rely on multiple standards requires the more flexible “gate” approach exemplified above. A committee may “open gates” toward another committee, and the two committees may agree on developing specific projects. This approach does not work in a “defence of territory” mode, where collaborations are seen as limiting a committee’s freedom, but by seeing collaborations with other committees and groups as opportunities to develop standards with a larger field of use, where the constituencies of both committees share the benefits.

The examples mentioned in this article are actual cases that show how the breadth of the MPEG scope and the modalities of collaboration described have been made possible by the “gate” approach to developing collaborative standards.


The talents, MPEG and the master

Introduction

In the parable of the Talents, the Gospel tells the story of a master who entrusts 5 talents (a large amount of money at that time) to one servant and 2 talents to another before leaving for a long journey. The first servant works hard and doubles his talents, while the second plays safe and buries them. When the master returns, he rewards the first servant and punishes the second.

Thirty-one years ago, MPEG was given the field of standards for coding of moving pictures and audio to exploit. Now the master comes. To help him make the right judgement about the use of the talents he gave, I will briefly review the milestones reached in these years. Of course, I am not going to revisit all the MPEG standardisation areas developed in the last 31 years. There are several posts in this blog (see the list at the bottom of the page) and in the book A vision made real – Past, present and future of MPEG; I will just take some snapshots of the major achievements.

Making some media digital

Before MPEG-1 there had been attempts at making media digital, but MPEG-1 was the first standard that made media really digital in consumer products: Video CD brought movies to CD, Digital Audio Broadcasting (DAB) created the first digital radio, and MP3, well, MP3 simply created a new music experience, triggering a development that continues to this day. This was possible thanks to the vision that a global audio-video-systems standard would take over the world. It did.

Making television digital

MPEG-1 did not make all media digital; television was the major exception. This was an intricate world where politics, commercial interests, protection of culture and more had defied all attempts made by established standards organisations. MPEG applied its recipe and produced an effective MPEG-2 specification, adding DSM-CC to support TV distribution on cable. Sharp vision, excellent technology and unstinting promotion efforts delivered the result.

Making media ICT friendly

MP3 encoding and decoding on PCs was achieved in the early days of the standard, but an announcement by Intel that MPEG-2 video could be decoded in real time on their x86 chips made headlines. The real marriage between media and ICT – defined as IT + mobile – was the planned result of MPEG-4: two video standards in sequence (Visual and AVC), the ultimate audio format (AAC in all its variations), the file format (ISO Base Media File Format – ISOBMFF), fonts (Open Font Format) and many other standard technologies still largely in use today, in spite of the fast-evolving technology scenario.

Media not just for humans

MPEG-7, conceived in the mid-1990’s, was a project ahead of its time. It was triggered by the vision that 500 TV channels would become available thanks to the savings MPEG-2 enabled on cable with the technology of the time. The idea was to enable the description of content – audio, video and multimedia – in the same bit-thrifty way as MPEG had done for MPEG-1/-2 and was doing for MPEG-4. Then descriptions would be distributed to machines to enable them to respond to human queries. The Audio-Visual Description Profile (AVDP) is an example of how MPEG-7 is used in the content production world, but more is expected in the upcoming Video Coding for Machines work.

E-commerce of media

Around the turn of the millennium, there was an intense debate on how media could be handled in the new context enabled by MPEG standards. This had been triggered by the advent of Peer-to-Peer protocols that allowed new forms of distribution somehow at odds with practices and laws. With MPEG-21, MPEG developed a comprehensive framework and a suite of standards to enable e-commerce of media that respected the rights and interests of the parties involved. Some of these are the specification of: Digital Item (DI), identification of DIs and their components, protection of DIs, machine-readable languages to express rights and contracts, adaptation of DIs and more. Industry has taken pieces of MPEG-21, but not the entire framework yet.

Standards for media combinations

At the beginning of the new millennium MPEG had collected enough standards that the following question was asked: how can we combine a set of content items, each represented by MPEG standards or, when MPEG standards are not available, by other standards, in a standard way? This was the start of MPEG-A, a suite of standards called Multimedia Application Formats (MAF). Examples are the Surveillance AF, Interactive Music AF (IMAF), Augmented Reality AF (ARAF), Common Media AF (CMAF) and Multi-Image AF (MIAF). CMAF is actually affecting millions of streaming devices today.

Systems-Video-Audio à la carte

With the main elements of the MPEG-4 standard in place, MPEG needed systems, video and audio standards without being able to define a unified standard. This was the birth of 3 standard suites: MPEG-B (Systems), MPEG-C (Video) and MPEG-D (Audio). Among the most relevant standards we mention the Common encryption format (CENC) for ISOBMFF and MPEG-2 TS, Reconfigurable Video Coding (RVC) and Unified speech and audio coding (USAC). The last is the only standard capable of encoding audio and speech with a quality superior to both the best audio codec and the best speech codec.

Interacting with media

Media can be defined as virtual representations of audio and video information that match, hopefully in a faithful way, something that exists in the real world, or a representation of synthetically-generated audio and video information, or a mix of the two. MPEG started to tackle this issue in the middle of the first decade at the time Second Life offered an attractive paradigm for interaction with synthetically-generated audio and video information. MPEG developed MPEG-V, a framework and a suite of standards for the information flowing from sensors and to actuators and the characteristics of virtual world objects.

Getting media in any way

Broadcasting was the first system for mass distribution of media – audio and video. Originally it was strictly one way; cable added return information, then the telecommunication networks provided the technical means to achieve full two-way distribution. With its MPEG-2 standard, MPEG provided the full stack from transport up. This was universally adopted by broadcasting, but the Internet Protocol (IP) was the transport selected for telecom distribution. With MPEG-H, MPEG provided a unified solution where content meant for one-way distribution can be seamlessly distributed in a two-way fashion. With this Systems-Video-Audio based suite of standards MPEG has achieved the unification of media distribution.

Facing an unpredictable internet

Probably most readers have never heard of the Asynchronous Transfer Mode (ATM), designed to transport fixed-size packets along a route established between two endpoints before data transfer begins. ATM’s AAL1 could have guaranteed bandwidth, but had to give way to the leaner and cheaper IP. The price of the successful digitisation we live in is unpredictability. You start with good bandwidth between you and the source, but a moment later the available bandwidth is cut in half. A disaster for those who want to provide reliable services. MPEG-DASH is the standard that allows a consumer device to request (mostly video) information at the bitrate matching what the network makes available at a given instant.
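The kind of adaptation DASH enables can be sketched as follows. This is a minimal illustration, not the standard’s client behaviour (DASH deliberately leaves the selection algorithm unspecified); the bitrate values and the `pick_representation` helper are made up for the example:

```python
def pick_representation(available_bitrates, measured_throughput, safety_factor=0.8):
    """Pick the highest-bitrate representation that fits within a safety
    margin of the throughput measured over recently downloaded segments."""
    affordable = [b for b in sorted(available_bitrates)
                  if b <= measured_throughput * safety_factor]
    # If even the lowest representation does not fit, request it anyway.
    return affordable[-1] if affordable else min(available_bitrates)

# A DASH manifest typically advertises several representations of the content.
bitrates = [500_000, 1_000_000, 2_500_000, 5_000_000]  # bits per second

print(pick_representation(bitrates, 4_000_000))  # ample bandwidth: 2_500_000
print(pick_representation(bitrates, 800_000))    # bandwidth halved: 500_000
```

The client re-runs this decision for every segment it requests, which is what lets the stream follow the ups and downs of the network.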

The immersive media dream

MPEG has dreamt for a quarter of a century of immersing users in media 😊. In the second half of the 1990’s MPEG developed the MPEG-2 Multiview Profile, the first attempt at providing the two eyes of a viewer with the kind of different information the eyes receive when they are hit by the light reflected by an object. The latest attempts were the Multiview and 3D extensions of HEVC. Technology is maturing, but the context is far from stable as companies providing solutions come and go. MPEG is developing standards in this slippery space based on 6 key points:

  1. Architecture for immersive services;
  2. Omnidirectional MediA Format (OMAF) for omnidirectional media applications (e.g. 360° video) and a basis for integration of other technologies;
  3. Immersive video starting from 3DoF+;
  4. Immersive audio (6DoF);
  5. Point Clouds providing an easy way to manipulate 3D visual objects;
  6. Network based Media Processing (NBMP) to allow a user to get the network to do some processing of their media.

Media devices are Things

The Internet of Things (IoT) paradigm is well known but how can we apply the general IoT paradigm to media? MPEG-IoMT (Internet of Media Things) is an MPEG standard suite providing interfaces, protocols and associated media-related information representations that enable advanced services and applications based on human to device and device to device interaction. IoMT will be the platform on which new standards such as Video Coding for Machines will be hosted.

More supple compression

MPEG video coding standards have been hugely successful. However, in certain domains, such as internet streaming, adoption encounters non-technical difficulties. Essential Video Coding (EVC) is the standard that will yield excellent performance with the prospect of easier licensing.

Compression for all

MPEG has developed an impressive number of technologies whose focus is on compression and transport of data. Some are strictly media-related. Others, however, have a more general applicability. That this is true and can be implemented is demonstrated by MPEG-G, a standard that allows efficient transport of DNA reads obtained by high-speed sequencing machines. MPEG-G compression is lossless and will allow savings on storage and transmission costs and faster access to DNA information for clinical analyses.

The master returns

The master had a really long journey – 31 years – but has finally returned. Will he say to MPEG: “Well done, good and trustworthy servant; you have been trustworthy in a few things, I will put you in charge of many things; enter into the joy of your master” or will he say: “throw this lazy servant into the outer darkness, where there will be weeping and gnashing of teeth”?


Standards and business models

Introduction

Some might think that the title is an oxymoron. Indeed, standards, certainly international ones, are published by not-for-profit organisations. How could they have a business model?

The answer is that around a standard there are quite a few entities, some of which are far from being not-for-profit.

Therefore, this article intends to analyse how business models can influence standards.

The actors of standardisation

Let’s first have a look at the actors of standardisation.

  1. The first actor is the organisation issuing standards. It may be an international organisation such as ISO, IEC or ETSI, or a trade association or an industry forum, but the organisation itself has not been designed to make money. A typical arrangement is a membership fee that allows an individual or a company employee to participate. Another is to make users of the standard pay to obtain the specification.
  2. The second actor is the staff of the standards developing organisation. Depending on the type of organisation, their role may be marginal or highly influential.
  3. The third actor is the company that is a member of the organisation issuing standards.
  4. The fourth actor is the expert, typically the personnel sent by the company to contribute to the development of the standards.

From the interaction of these actors, the standard is created. Then the standard creates an ecosystem, and companies become members of the ecosystem.

Why do companies participate in standard development?

Here is an initial list of motivations prompting companies to send their personnel to a standards committee.

  1. A company is interested in shaping the landscape of how a new technology will be used by concerned companies or industries. This is the case of Artificial Intelligence (AI), a technology that has recently matured and whose use has different, sometimes unexpected, implications. JTC 1/SC 42 has recently been formed to define AI architectures, frameworks, models etc. This kind of participation is not exclusive to companies. Universities find it useful to join this “exploratory” work because it may help them identify new research topics.
  2. A company is interested in developing a new product or launching a new service that requires a new standard technology.
  3. A company may be obliged by national regulations to participate in the development of a standard.
  4. A company or, more and more often, a university owns technology it believes is useful or even required to draft a standard that a committee plans to develop. Again, a relevant case for this is MPEG, where the number of Non-Practicing Entities (NPEs) is on the rise.
  5. A university or, not infrequently, a company wants to keep abreast of what is going on in a technology field or become aware as early as possible of the emergence of new standards that will affect its domain. MPEG is a typical case because it is a group open to new ideas and is attended by all relevant players.

Not all standards are born equal

The word “equal” in the title does not imply that there is a hierarchy of standards where some are more important than others. It simply means that the same name “standard” can be attached to quite different things.

The compact disc (CD) can be taken as the emblem of a traditional standard. Jointly developed by Philips and Sony, the CD quickly defeated the competing product by RCA and became the universal digital music distribution medium. The technical specification of the CD was originally defined in the Red Book and later became the IEC 60908 standard.

MPEG introduced a new process that replaced the development of a product, the marketplace success and the eventual ratification by a recognised standards organisation. This is how the process can be summarised:

  1. Identify the need of a new standard
  2. Develop requirements
  3. Issue call for proposals (CfP)
  4. Integrate technologies obtained from the CfP
  5. Draft the standard

In the early MPEG days, most participants were companies interested in developing new products or launching new services. They actively contributed to the standards because they needed them, but also because they had relevant technologies developed in their laboratories.

Later the range of contributors to standard development got larger. The fact that in the mid-1990’s a patent from Columbia University, clearly an NPE, had been declared essential to MPEG-2 Video made headlines and prompted many to follow suit. The trend thus initiated continues to this day.

After MPEG-2 the next step was to revive the old model represented by the CD. MPEG-4 became just one “product” while other companies developed other “products”, some of which gained recognition as “standards” by a professional organisation. The creation of such standards implied the conversion of an internal company specification to the standard format of the professional organisation. The use of those “standards” was “free” in the sense that no fees were charged for their use. However, other strings, less immediately comprehensible to laymen, were typically attached.

MPEG (formally WG 11) is about “coding of moving pictures and audio”, while a parallel group called JPEG (formally WG 1) is about “coding of digital representations of images”. The two groups operate based on different “business models”. Today the ubiquitous JPEG standard for image compression is royalty free because the 20-year validity of any patents has long expired. However, even before the 20-year limit was crossed, the JPEG standard could be used freely at no charge. The same happened to the less famous but still quite important JPEG 2000 standard used for movie distribution and to the less used JPEG XR standard.

More recently a consortium was formed to develop a royalty-free video compression specification. In rough, imperfect but sufficiently descriptive words, members of that consortium can freely use the specification in their products and services.

The business model of a standard is a serious matter

From the above we see that working on a standard has the basic motivation of creating a technology to enable a certain function in an ecosystem. The ecosystem can be anything from the ensemble of users of a product/service of a company, to a country, an industry or the world at large. Beyond this common motivation, however, a company contributing to the development of a standard can have widely different motivations that I simplify as follows:

  1. The common technology is encumbered because, by rewarding inventions, the ecosystem has embedded the means to constantly innovate its enabling technologies for new products and services. This is the basis of the MPEG business model that has ensured 30 years of development for the digital media industry. It has advantages and disadvantages:
    1. The advantage of this model is that, once a licence for the standard has been defined, no one can hold the community hostage.
    2. The disadvantage is that getting agreement to the licence may prove difficult, thus disabling or hampering the business of the entire community.
  2. The common technology is “free” because the members of the ecosystem have assessed that they do not have an interest in the technology per se but only in the technology as an enabler of other functions on which their business is built. This is the case of Linux/Android and most web technologies. Here, too, there are advantages and disadvantages:
    1. The advantage of this model is that anybody can access the technology by accepting the “free” licence.
    2. The disadvantage is that a member of the ecosystem can be excluded for whatever reason and have its business ruined.

Parallel worlds

It is clear now that “standard” is a name that can be assigned to things that have the promotion of the creation of an ecosystem in common but may be very different otherwise. The way the members of the ecosystem operate is completely different depending on whether the standard is encumbered or free.

Let’s see the concrete cases of MPEG and JPEG. In the late 1980’s they started as two groups of roughly the same size (30 people each). Thirty years later MPEG has become a 600-member group and JPEG a 60-member group. In spite of handling similar technologies, less than 1% of MPEG members attend JPEG meetings. Why?

The answer is that MPEG decided (more correctly, was forced by the very complex IP environment of video and audio coding) to adopt the encumbered standard model while JPEG could decide to adopt the free standard model. In the last 30 years companies have heavily invested in MPEG standards because they have seen a return from that investment, and a host of new companies were created and are operating thanks to the reward coming from their inventions. JPEG developed less because fewer companies saw a return from the free standard business model.

The low number of members common to MPEG and JPEG exists because the two business models are antithetical.

Conclusions

I would like to apply the elements above to some current discussions where some people argue that, since JPEG and some MPEG experts have similar expertise, we should put them together to make “synergy”.

The simple answer to this argument is that it would be foolish to do that. JPEG people produce free standards because those who have a business in mind want to make money from something else that is enabled by the free standard. If JPEG people are mixed with MPEG people, who want encumbered standards, the business of the JPEG people is gone.

People had better play the game they know, not improvise competence in things they don’t know. It is more or less the same story as in Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge.

The right solution is MPEGfuture.


On the convergence of Video and 3D Graphics


Introduction

For a few years now, MPEG has explored how to efficiently represent (i.e. compress) data from a range of technologies offering users dynamic immersive visual experiences. Here the word “dynamic” captures the fact that the user can have an experience where objects move in the scene as opposed to being static.

Being static and dynamic may not appear to be a conceptually important difference. In practice, however, products that handle static scenes may be orders of magnitude less complex than those handling dynamic scenes. This is true both at the capture-encoding side and at the decoding-display side. This consideration implies that industry may need standards for static objects much earlier than for dynamic objects.

Industry has guided MPEG to develop two standards that are based on two approaches that are conceptually similar but are targeted at different applications and involve different technologies:

  1. Point clouds generated by multiple cameras and depth sensors in a variety of setups. These may contain up to billions of points with colours, material properties and other attributes to offer reproduced scenes characterised by high realism, free interaction and navigation.
  2. Multi-view videos generated by multiple cameras that capture a 3D scene from a pre-set number of viewpoints. This arrangement can also provide limited navigation capabilities.

The compression algorithms employed for the two sources of information have similarities and differences as well. The purpose of this article is to briefly describe the algorithms involved in a general point cloud and in the particular case that MPEG calls 3DoF+ (the central case in Figure 1), and to investigate to what extent the algorithms are similar and different, and whether they can share technologies today and in the future.

Figure 1 – 3DoF (left), 3DoF+ (centre) and 6DoF (right)

Computer-generated scenes and video are worlds apart

A video is composed of a sequence of matrices of coloured pixels, but a computer-generated 3D scene and its objects are not represented like a video, but by geometry and appearance attributes (colour, reflectance, material…). In other words, a computer-generated scene is based on a model.

Thirty-one years ago, MPEG started working on video coding and 7 years later did the same for computer-generated objects. The (ambitious) title of MPEG-4 “Coding of audio-visual objects” signalled MPEG’s intention to handle the two media types jointly.

Until quite recently the Video and 3D Graphics competence centres (read Developing standards while preparing for the future to know more about how work in MPEG is carried out by competence centres and units) had largely worked independently. Then the need to compress real-world 3D objects in 3D scenes became important to industry.

The Video and 3D Graphics competence centres attacked the problem using their own specific backgrounds: 3D Graphics used Point Cloud because it is a 3D graphics representation (it has geometry), while Video used the videos obtained from a number of cameras (because they only have colours).

Video came up with a solution that is video based (obviously, because there was no geometry to encode) and 3D Graphics came up with two solutions, one which encodes the 3D geometry directly (G-PCC) and another which projects the Point Cloud objects on fixed planes (V-PCC). In V-PCC, it is possible to apply traditional video coding because geometry is implicit.

Point cloud compression

MPEG is currently working on two PCC standards: G-PCC, a purely geometry-based approach without much to share with conventional video coding, and V-PCC, which is heavily based on video coding. Why do we need two different algorithms? Because G-PCC does a better job in “new” domains (say, automotive) while V-PCC leverages video codecs already installed on handsets. The fact that V-PCC is due to become FDIS in January 2020 makes it extremely attractive to an industry where novelty in products is a matter of life or death.

V-PCC seeks to map a point of the 3D cloud to a pixel of a 2D grid (an image). To be efficient, this mapping should be as stationary as possible (only minor changes between two consecutive frames) and should not introduce visible geometry distortions. Then the video encoder can take advantage of the temporal and spatial correlations of the point cloud geometry and attributes by maximising temporal coherence and minimising distance/angle distortions.

A 3D-to-2D mapping should guarantee that all the input points are captured by the geometry and attribute images so that they can be reconstructed without loss. Simply projecting the point cloud onto the faces of a cube or a sphere bounding the object does not guarantee lossless reconstruction, because auto-occlusions (points projected onto the same 2D pixel are not all captured) may generate significant distortions.

To avoid these negative effects, V-PCC decomposes the input point cloud into “patches”, which can be independently mapped to a 2D grid through a simple orthogonal projection. Mapped patches do not suffer from auto-occlusions and do not require re-sampling of the point cloud geometry; the decomposition aims to produce patches with smooth boundaries while minimising the number of patches and the mapping distortions. This is an NP-hard optimisation problem that V-PCC solves by applying the heuristic segmentation approach of Figure 2.

Figure 2: from point cloud to patches

An example of how an encoder operates is provided by the following steps (note: the encoder process is not standardised):

  1. At every point, the normal to the point cloud “surface” is estimated;
  2. An initial clustering of the point cloud is obtained by associating each point to one of the six planes forming the unit cube (each point is associated with the plane that has the closest normal). Projections on diagonal planes are also allowed;
  3. The initial clustering is iteratively refined by updating the cluster index associated with each point based on its normal and the cluster indexes of its nearest neighbours;
  4. Patches are extracted by applying a connected component extraction procedure;
  5. The 3D patches so obtained are projected and packed into the same 2D frame.
  6. The only attribute that is mandatory to encode per point is the colour (see the right-hand side of Figure 3); other attributes, such as reflectance or material properties, can optionally be encoded.
  7. The distances (depths) of the points to the corresponding projection plane are used to generate a grey-scale image which is encoded using a traditional video codec. When the object is complex and several points project onto the same 2D pixel, two depth layers are used, encoding a near plane and a far plane (the left-hand side of Figure 3 shows one single depth layer).

Figure 3: Patch projection
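Steps 1 and 2 of the list above can be sketched as follows. This is a simplified illustration of the initial clustering only, assuming the normals have already been estimated and ignoring the diagonal planes; it is not the reference encoder:

```python
# The six unit normals of the cube faces onto which patches are projected.
CUBE_FACE_NORMALS = [
    ( 1, 0, 0), (-1, 0, 0),
    ( 0, 1, 0), ( 0, -1, 0),
    ( 0, 0, 1), ( 0, 0, -1),
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def initial_cluster(normals):
    """Associate each point (given its estimated surface normal) with the
    cube face whose normal is closest, i.e. has the largest dot product."""
    return [max(range(6), key=lambda i: dot(n, CUBE_FACE_NORMALS[i]))
            for n in normals]

# A point whose normal points mostly along +x is assigned to face 0;
# one whose normal points mostly along -y is assigned to face 3.
print(initial_cluster([(0.9, 0.1, 0.0), (0.0, -0.8, 0.6)]))  # [0, 3]
```

The iterative refinement of step 3 would then revisit each point, taking the cluster indexes of its nearest neighbours into account to smooth the segmentation.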

3DoF+ compression

3DoF+ is a simpler case of the general visual immersion case to be specified by Part 12 Immersive Video of MPEG-I. In order to provide sufficient visual quality for 3DoF+, a large number of source views needs to be used, e.g. 10~25 views for a 30 cm radius viewing space. Each source view can be captured as omnidirectional or perspectively projected video with texture and depth.

If such a large number of source views were independently coded with legacy 2D video coding standards, such as HEVC, an impractically high bitrate would be generated, and a costly large number of decoders would be required to view the scene.

The Depth Image Based Rendering (DIBR) inter-view prediction tools of 3D-HEVC may help to reduce the bitrate, but the 3D-HEVC codec is not widely deployed. Additionally, the parallel camera setting assumption of 3D-HEVC may affect the coding efficiency of inter-view prediction with arbitrary camera settings.

MPEG-I Immersive Video targets the support of 3DoF+ applications, with a significantly reduced coding pixel rate and limited bitrate using a limited number of legacy 2D video codecs applied to suitably pre- and post-processed videos.

The encoder is described by Figure 4.

Figure 4: Process flow of the 3DoF+ encoder

An example of how an encoder operates is described below (note that the encoding process is not standardised):

  1. A number of views (possibly just one) are selected from the source views;
  2. The selected source views are called basic views and the non-selected views additional views;
  3. All additional views are pruned by synthesising the basic views onto the additional views, erasing the non-occluded areas (those already visible in the basic views);
  4. Pixels left in the pruned additional views are grouped into patches;
  5. Patches in a certain time interval may be aggregated to increase temporal stability of the shape and location of patches;
  6. Aggregated patches are packed into one or multiple atlases (Figure 5).

Figure 5: Atlas Construction process

  7. The selected basic view(s) and all atlases with patches are fed into a legacy encoder (an example of what an input looks like is provided by Figure 6)

Figure 6: An example of texture and depth atlas with patches

The atlas parameter list of Figure 4 contains, for every patch in the atlas, its starting position in the atlas, its source view ID, its location in the source view and its size. The camera parameter list comprises the camera parameters of all indicated source views.
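The two metadata lists just described might be modelled as below. The field names are illustrative only, not the normative syntax of the specification:

```python
from dataclasses import dataclass

@dataclass
class PatchParams:
    atlas_x: int         # starting position of the patch in the atlas
    atlas_y: int
    source_view_id: int  # which source view the patch was pruned from
    view_x: int          # location of the patch in that source view
    view_y: int
    width: int           # size of the patch
    height: int

@dataclass
class CameraParams:
    view_id: int
    position: tuple      # camera position in scene coordinates
    orientation: tuple   # camera orientation (e.g. yaw, pitch, roll)

# An atlas is accompanied by one entry per patch plus the camera list.
patch = PatchParams(atlas_x=0, atlas_y=64, source_view_id=3,
                    view_x=128, view_y=256, width=32, height=48)
print(patch.source_view_id)  # 3
```

Carrying the source view ID and the camera list is precisely what lets the renderer at the decoder map each patch back to the viewpoint it came from.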

At the decoder (Figure 7) the following operations are performed:

  1. The atlas parameter and camera parameter lists are parsed from the metadata bitstream;
  2. The legacy decoder reconstructs the atlases from the video bitstream;
  3. An occupancy map with patch IDs is generated according to the atlas parameter list and the decoded depth atlas;
  4. When users watch the 3DoF+ content, the viewports corresponding to the position and orientation of their head are rendered using patches in the decoded texture and depth atlases, and corresponding patch and camera parameters.

Figure 7: Process flow of 3DoF+ decoder
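Step 3 of the decoder can be sketched as follows, in a simplified form that marks each atlas pixel with the ID of the patch covering it (a real decoder also consults the decoded depth atlas to mask out unoccupied pixels within a patch rectangle):

```python
def build_occupancy_map(atlas_w, atlas_h, patches):
    """patches: list of (patch_id, x, y, w, h) rectangles in the atlas.
    Returns a 2D map where each cell holds the covering patch ID, or -1
    for pixels not covered by any patch."""
    occupancy = [[-1] * atlas_w for _ in range(atlas_h)]
    for patch_id, x, y, w, h in patches:
        for row in range(y, y + h):
            for col in range(x, x + w):
                occupancy[row][col] = patch_id
    return occupancy

# Two hypothetical patches packed into an 8x4 atlas.
occ = build_occupancy_map(8, 4, [(0, 0, 0, 4, 2), (1, 4, 1, 3, 2)])
print(occ[0][:4])  # the first patch covers the top-left corner: [0, 0, 0, 0]
```

The renderer of step 4 then uses this map, together with the patch and camera parameters, to decide which decoded texture and depth samples contribute to the requested viewport.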

Figure 8 shows how the quality of synthesised viewports decreases with decreasing number of views. With 24 views the image looks perfect, with 8 views there are barely visible artefacts on the tube on the floor, but with only two views artefacts become noticeable. The goal of 3DoF+ is to achieve the quality of the leftmost image when using the bitrate and pixel rate for the rightmost case.

Figure 8: Quality of synthesized video as a function of the number of views

Commonalities and differences of PCC and 3DoF+

V-PCC and 3DoF+ can use the same 2D video codec, e.g. HEVC. For 3DoF+, the input to the encoder and the output from the decoder are sequences of texture and depth atlases containing patches; these are somewhat similar to V-PCC’s sequences of geometry/attribute video data, which also contain patches.

Both 3DoF+ and V-PCC have metadata describing positions and parameters for the patches in an atlas or video. But 3DoF+ must describe the view ID each patch belongs to and its camera parameters to support flexible camera settings, while V-PCC just needs to indicate which of the 6 fixed cube faces each patch is bound to. V-PCC does not need camera parameter metadata.

3DoF+ uses a renderer to generate a synthesised viewport at any desired position and in any desired direction, while V-PCC re-projects the pixels of the decoded video into 3D space to regenerate the point cloud.

Further, the V-PCC goal is to reconstruct the 3D model, in order to obtain the 3D coordinates of each point. For 3DoF+, the goal is to obtain some additional views by interpolation, but not necessarily any possible view. While both methods use patches/atlases and encode them as video + depth, the encoders and decoders are very different because the input formats (and, implicitly, the outputs) are completely different.

The last difference is how the two groups developed their solutions. It is already known that G-PCC has much more flexibility in representing the geometry than V-PCC. It is also expected that compression gains will be bigger for G-PCC than for V-PCC. However, the overriding advantage of V-PCC is that it can use existing and widely deployed video codecs. Industry would not accept dumping V-PCC to rely exclusively on G-PCC.

How can we achieve further convergence?

You may ask: I understand the differences between PCC and 3DoF+, but why was convergence not identified at the start? The answer depends on the nature of MPEG.

MPEG could have done that if it were a research centre. At its own will it could put researchers together on common projects and give them the appropriate time. Eventually, this hypothetical MPEG could have merged and united the two cultures (within its organisation, not the two communities at large), identified the common parts and, step by step, defined all the lower layers of the solution.

But MPEG is not a research centre, it is a standards organisation whose members are companies’ employees “leased” to MPEG to develop the standards their companies need. Therefore, the primary MPEG task is to develop the standards its “customers” need. As explained in Developing standards while preparing for the future, MPEG has a flexible organisation that allows it to accomplish its primary duty to develop the standards that industry needs while at the same time explore the next steps.

Now that we have identified that there are commonalities, does MPEG need to change its organisation? By all means no. Look at the MPEG organisation in Figure 9.

Figure 9 – The flat MPEG organisation

The PCC work is developed by a 3DG unit (soon to become two because of the widely different V-PCC and G-PCC) and the 3DoF+ standard is developed by a Video unit. These units are at the same level and can easily talk to one another now because they have concrete matters to discuss, even more than they did before. This will continue for the next challenges of 6DoF where the user can freely move in a virtual 3D space corresponding to a real 3D space.

The traditional Video and 3D Graphics tools can also continue to live in the MPEG tool repository and continue to be supplemented by new technologies that make them more and more friendly to each other.

This is the power of the flat and flexible MPEG organisation as opposed to the hierarchical and rigid organisation advocated by some. A rigid hierarchical organisation where standards are developed in a top-down fashion is unable to cope with the conflicting requirements that MPEG continuously faces.

Conclusions

MPEG is synonymous with technology convergence, and the case illustrated in this paper is just the most recent. It indicates that more such cases will appear in the future as more sophisticated point cloud compression is introduced and technologies supporting the full navigation of 6DoF become available.

This can happen without the need to change the MPEG organisational structure, because the MPEG organisation has been designed to allow units to interact in the same easy way whether they are in the same competence centre or in different ones.

Many thanks to Lu Yu (Zhejiang University) and Marius Preda (Institut Polytechnique de Paris) who are the real authors of this article.

Posts in this thread

 

 

Developing standards while preparing for the future

Introduction

In Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge I described the case of an apprentice who operates in an environment that practices some mysterious arts (which in the past might have been called sorcery). Soon the apprentice wakes up to the importance of what he is learning and doing, and thinks he has learned enough to go his own way.

The reality is that the apprentice has not yet learnt the art, and not because he has failed to study and practise it diligently and for enough time. He has not learnt it because no one can say “I know the art”. One can claim to have practised the art for a certain time, to have gained so much experience, or to have had this or that success (better not to talk of failures or, better, to talk of successes after failures). The most one can say is that the successes were the result of time-tested teamwork.

Nothing fits this description better than the organisation of human work, and this article will deal with how MPEG has developed a non-apprentice-based organisation.

Work organisation across the millennia

Thousands of years ago (and even rather recently) there was slave labour. By “owning” humans, people assumed they could impose any task on them (until a Spartacus came along, I mean).

More advanced than slavery is hired labour: humans are not owned, you pay them and they work for you. You can do what you want within a contract, but only up to a point (a much lower one than with slave labour). If you cross a threshold, you have to deal with unions, or simply with workers leaving for another job and employer.

Fortunately, there have been many innovations over and beyond this archaic form of relationship. One case is when you have a work force hired to do intellectual work, such as research on new technologies. Managing the work force is more complicated, but there is an unbeatable tool: the promise to share the revenues of any invention that the intellectual worker makes.

Here, too, there are many variations to bind researcher and employer, to the advantage of both.

The MPEG context is quite different

Apart from not being a company, MPEG has a radically different organisation from any of those described above. In MPEG there are “workers”, but MPEG has no control over them because MPEG does not pay any salary; someone else does. Still, MPEG has to improve the relationship between itself, the “virtual employer”, and its “workers”, if it is not to produce dull standards.

Here are some of the tools it uses: projecting the development of a standard as a shared intellectual adventure, pursuing the goal with a combination of collective and personal advantages, promoting a sense of belonging to a great team, dangling the possibility of acquiring personal fame because “we are making history here”, and more.

For the adventure to be possible, however, MPEG has to entice two types of “worker”. One is the researcher who knows things and the other the employer who pays the salary. Both have to buy into the adventure.

This is not the end of the story, because MPEG must also convince users of the standard that it will make sense for their industrial plans. By providing requirements, the users of the standard establish a client-supplier relationship with MPEG.

Thirty years ago, matters were much simpler because the guy who paid the salary was regularly the same guy who used the standard. Today things are more complicated: the guy who pays the salary of the “worker” may very well have no relationship with the guy who uses the standard, because his role may stop at providing the technologies that are used in the standards.

Organising the work in MPEG

So far so good. This is the evolution of an established business that MPEG brought about. This evolution, however, was accompanied by substantial changes in the object of the work. In MPEG-1 and MPEG-2, audio, video, 3D graphics and systems were rather well-delimited areas (not really, but compared with today, certainly so). Starting with MPEG-4, however, the different pieces of the MPEG business became increasingly entangled.

If MPEG had been a company, it could have launched a series of restructurings, a favourite activity of many companies who think that a restructuring shows how flexible their organisation is. They can think and say that because they are not aware of the human costs of such reorganisations.

I said that MPEG is not a company and MPEG “workers” are not really workers but researchers rented out by their employers, or self-styled entrepreneurs, or students working on a great new idea etc. In any case, MPEG “workers” are intellectually highly prominent individuals.

When it started its work on video coding for interactive applications on digital media, MPEG did not have particularly innovative organisational ideas. Little by little it extended the scope of its work to the other areas that were required to provide a complete audio-visual solution.

MPEG ended up building a peculiar competence-centre-based organisation by reacting to the changing conditions at each step of its evolution. The organisation has gradually morphed (see here for the full history). Today the competence centres are: Requirements, Systems, Video, Video collaborations with ITU-T, Audio, 3D Graphics and Tests.

The innovative parts of the MPEG organisation are the units formed inside each of these competence centres. They can have one of two main goals: to address specific items that are required to develop one or more standards, or to investigate related issues that may not be directly connected to a standard. This is the mix of two developments of which MPEG is proud: technologies for the standards and know-how for future standards.

Units may be temporary or long-lived. All units are formed around a natural leader, often as a result of an ad hoc group whose creation may have been triggered by a proposal.

A graphical description of the MPEG organisation is provided by Figure 1

Figure 1 – The flat MPEG organisation

The units working on standards under development produce outputs which are integrated by the relevant competence centre plenaries and implemented by the editors with the assistance of the experts who have developed the component technologies. The activity of the units of the joint groups with ITU-T is limited to the development of specific standards. Therefore, they do not have units working on explorations unless these are directly aimed at providing answers to open issues in their standards under development.

This is MPEG’s modus operandi, which some (outside MPEG) think is the MPEG process described in How does MPEG actually work?. Nothing is farther from the truth. MPEG’s modus operandi is the informal but effective organisation that permeates the MPEG work and allows interactions to happen when they are needed, by those who need them, at the time they need them. It is a system that allows MPEG to get the most out of individual initiative, combining the need to satisfy industry needs now with the need to create the right conditions for future standards tomorrow.

Proposing, as mentioned in Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge, to create a group that merges the Video and 3D Graphics competence centres based on a hierarchical structure with substructures is the prehistory of work organisation – fortunately not stretching back to slave labour. This is something that today would not even be considered in the organisation of a normal company, to say nothing of the organisation of a peculiar entity such as MPEG.

Units are highly mobile. They interact with other groups because an issue is identified by the competence centre chairs, by the competence centre plenaries or by the initiative of the unit itself. Interaction can also take place between groups or between units in different groups.

The number of units at any given time is rather large, exceeding 20 or even 30. Therefore the IT support system described in Digging deeper in the MPEG work, reproduced below, helps MPEG members keep up with the dynamics of all these interactions by providing information on what is being discussed, where and when.

Figure 2 – How MPEG people know what happens where

Conclusions

A good example of how MPEG’s modus operandi can pursue its primary goal of producing standards, while at the same time keeping abreast of what comes next, is the common layer shared by 3DoF+ and 3DG. This is something that MPEG thought conceptually existed and that could have been designed in a top-down fashion. We did not do it that way, because MPEG is not an organisation that pursues the goal of furthering the understanding of science; MPEG is a not-for-profit organisation that develops standards while at the same time preparing the know-how for the next challenges. Not by imposing a vision of the future, but by doing the work today and investigating the next steps, we get ready to respond to future requests from the industry.

What the next steps of 3DoF+ and 3DG convergence will be is another story for another article.

Posts in this thread

 

No one is perfect, but some are more accomplished than others

In quite a few of my articles on this blog I have described how well MPEG has managed to create generic standards for the media industry, how many new standards keep on being produced to support the expansion of the media business and how much the expansion has brought benefits to that industry.

The first part of the title, however, reminds us that MPEG is an organisation created by humans and populated by humans. As MPEG is not and cannot be perfect, it is appropriate to ask what MPEG’s level of performance is.

Being able to answer the performance question is important for MPEG: there must be a system in place that can be used to constantly monitor the performance of the group. Even if performance is excellent today, there is no reason to rest, because MPEG may very well not be excellent tomorrow.

The problem in trying to answer the question of MPEG performance is that MPEG is not a company, but a standards committee. What are the Key Performance Indicators (KPI) of a standards committee like MPEG? For sure MPEG is “successful” (it does provide useful standards), but is it successful enough compared to the possibilities that its mission offers?

In this article I will try to answer the question of MPEG’s adequacy to its mission by applying the SWOT (Strengths-Weaknesses-Opportunities-Threats) methodology to a set of parameters: Context, Scope of standards and Business model.

Obviously, these are not the only parameters relevant to an answer on MPEG’s adequacy to its mission.

SWOT is an excellent methodology because it separates internal issues (strengths and weaknesses) from external ones (opportunities and threats), even though it is not always easy to classify an issue as internal or external.

The SWOT analysis reported here uses in part the results of the SWOT analysis carried out by the Italian National Body UNI in their proposal to make MPEG a subcommittee.

In the future other parameters will be considered: Membership, Structure, Leadership, Client industries, Collaboration, Standards development, Standards adoption, Innovation capability, Communication and Brand.

This is quite an engaging plan of work. So, expect to see more episodes on this blog.

Context

By “context” I mean the framework in which MPEG operates, namely ISO and IEC, but also in part ITU, because MPEG and ITU collaborate in JCT-VC and JVET.

Strengths

ISO, IEC and ITU are the topmost international standards developing organisations (SDO). Their standards have a high reputation because they are produced in environments governed by rigid rules (the ISO/IEC directives). Published standards are of a high editorial quality because they result from a rigorous process.

The very fact that such an atypical organisation like MPEG could take root, create a network of contacts, develop standards, influence large swathes of industries and thrive shows the strength of the context in which MPEG operates.

Since they are labelled ISO, MPEG standards can become part of a conformance assessment program or even be referenced by legislation.

Weaknesses

Being international organisations, ISO, IEC and ITU are hierarchical and bureaucratic, in different measures and in different domains. As I explained in MPEG and ISO, ISO covers all areas of international standardisation, with the exception of electrotechnical (IEC) and telecommunication (ITU) matters. The same rules apply to all parts of ISO, and this has a high price tag. They are, almost by definition, slow to adapt to fast-changing industry environments. Publication of standards takes time, often a year or more after the final draft international standard is released by the originating committee.

MPEG is quite different from most ISO/JTC 1 entities because it develops almost no standards addressing terminology, policies, architectures, frameworks etc. Indeed, most MPEG standards are deeply technical, bit-level specifications. They are actually closer to an internal company specification than to a typical architecture or framework standard.

Some of the ISO/IEC rules have serious impacts on MPEG’s ability to develop timely and bug-free standards.

MPEG is a working group (WG), the lowest level of the ISO structure. To design, develop and “market” its standards, MPEG needs to establish liaisons with many other bodies. However, the process of establishing liaisons is slow because of the many hierarchical layers involved.

Establishing a liaison with a non-ISO/IEC/ITU entity is cumbersome because the target group must apply for “recognition” by ISO, a step that not many bodies, particularly those with a high status, are ready to undertake.

Opportunities

Being part of a huge organisation, MPEG may establish liaisons with any committee in that organisation. JTC 1 has a standing agreement with ITU-T whereby a common standard can be easily developed jointly by JTC 1 and ITU-T groups by making reference to that agreement.

Since MPEG standards cut across a large number of application domains, many of which are under the purview of the 3 international SDOs, MPEG can be the vehicle that helps foster communication among the 3 bodies.

Threats

MPEG is a large WG, having reached an attendance of 600 participants at its last (July 2019) meeting. There is a risk that ISO blindly applies its directive calling for WGs to have a “restricted” number of members and to be of “limited” size. Should this happen, the MPEG standards that industry needs would no longer be produced, or would be produced with lower quality. Nothing would replace the swift infrastructure that enables technology integration. A suitable new structure would take years to emerge, while there is no time to quibble because competition from other standards, based on different business models, is eating into industries that used to be MPEG client industries.

A decision to break up MPEG could have extremely serious consequences for an industry that has been accustomed for decades to interact with, and to receive standards from, MPEG. The consequences would extend to thousands of highly skilled researchers, millions of workers and billions of consumers.

Scope of standards

A simplified version of the MPEG scope of work is

  1. Efficient coding and compression of digital representations of light and sound fields
  2. Efficient coding and compression of other digital data
  3. Support to digital information coding and compression

#1 refers to the traditional audio, video and 3D graphics compression, including immersive media; #2 concerns other data, e.g. compression of DNA samples or neural networks; and #3 concerns ancillary, but no less important, topics such as file and transport formats.

Strengths

A major strength of the MPEG scope is the fact that it extends through the entire chain enabled by compression and decompression technologies. The breadth of scope has allowed MPEG to develop suites of standards that can be individually used, but also used in an integrated package.

Figure 1 – Integrated MPEG standard suites

Its broad scope has thus enabled MPEG to create a digital media ecosystem composed of technology providers feeding their technologies into the MPEG technology repository via Calls for Proposals, which product manufacturers can integrate to serve the MPEG client industries.

Figure 2 – The MPEG industries: Technology, Implementation and Client

MPEG’s ability to cover the entire space defined by its scope has been a major element in the success of MPEG standards. This has been the case since the early days of MPEG-1, when the need to provide a complete specification prompted MPEG to also work on Audio and Systems issues, and of MPEG-2, when video distribution on analogue channels prompted MPEG to develop the famous Emmy Award-winning and long-lived MPEG-2 Transport Stream, and CATV applications prompted MPEG to develop the DSM-CC protocols.

Another strength is the fact that MPEG standards result from collaborative developments and are maintained by the MPEG “community”.

Weaknesses

Thirty-one years ago, when MPEG held its first meeting, companies were already applying digital technologies, but in an uncoordinated manner. When MPEG digital audio and video compression standards became available, however, they sold like hot cakes, because the market demanded the savings made possible by standard digital technologies and the new opportunities offered by MPEG standards. Today markets keep demanding the old forms of savings, but also new services enabled by new media technologies. The more complex technology scenario makes it increasingly difficult to understand which standards, based on which technologies, are needed by industry.

Because MPEG is mostly a technical group, it is also difficult to have the appropriate number of market experts with the appropriate knowledge to develop the requirements for new projects.

Opportunities

The MPEG scope offers a very large number of opportunities for standardisation in the audio-visual domain, in the traditional space, in the new “immersive media space” and in the new non-media data compression space. Since digitisation is becoming more and more a buzzword, more industries are discovering the benefits of handling their processes with digital technologies. The large amounts of digital data so generated can benefit from compression, as I wrote in Compression – the technology for the digital age. Compression can be used to optimise and enhance their processes and provide, much as it happened for the media industries, unrelenting expansion to their businesses. This could happen any time soon in two other domains that MPEG has already tackled: genomic information (see Genome is digital, and can be compressed) and neural networks (see Moving intelligence around). In The MPEG frontier I have elaborated on some of the opportunities.

Threats

Data compression is important, and MPEG does offer plenty of solutions to achieve that in different instances. Data compression is a crucial enabling technology, but customers need more than that. This has been a recurring theme in all MPEG standards from the time (1990) MPEG realised that software could be used not just to run video and audio compression algorithms, but also as a way to specify the standard.

Industry has evolved a great deal since then. In some environments that MPEG could claim are part of its purview, the standard is just the software. Some organisations, alliances or even companies offer high-quality software without a textual specification. This means that, even if the software may be open source, the actual specification is practically “hidden”, and the reference software may easily become the only implementation, because it is “the specification”.

MPEG prides itself on its ability to produce bare-bones standards that specify the minimum, but it is threatened by other entities who can provide packages whose completeness is not matched by MPEG specifications.

A related threat comes from the confusion generated by the fact that other standards organisations may produce standards on, say, AR/VR that appear to compete with MPEG compression standards, while they are at completely different levels.

A major threat is a possible change of mind on the part of ISO regarding the size of MPEG. A decision to split MPEG into its component elements would be a disaster because, as mentioned above, MPEG acts as an ecosystem of groups, competent in a wide range of interacting technologies, assembled to produce integrated and coherent standards. Note that MPEG experts do not feel uncomfortable with the size of the group, because its wide scope gives them the opportunity to be exposed to more views, issues and opportunities.

Business model

MPEG is not a for-profit entity. However, it operates on the basis of an implicit “business model” that has powered its 30-year-long continuous expansion. In plain words:

  1. MPEG develops high-performance standards using the best technologies available, as offered in response to Calls for Proposals (CfP);
  2. Patent holders receive royalties through mechanisms that do not involve MPEG and usually re-invest those royalties in new technologies;
  3. MPEG develops new generations of MPEG standards using those new technologies.

Strengths

The very existence of MPEG, with a growing membership, is proof that the MPEG business model is valid: excellent MPEG standards remunerate good IP, and royalties earned from existing standards fund more good IP for future standards.

Weaknesses

All good games must come to an end. The end has not come yet for MPEG, but the difficulty in obtaining licences for some MPEG standards (see, e.g., A crisis, the causes and a solution) shows that the MPEG business model is no longer as strong and immediately applicable as it used to be.

There is resistance to changes, even of limited scope, to the MPEG business model. Therefore, the MPEG business model is weakening because it has not been allowed to adapt.

Opportunities

In Can MPEG overcome its Video “crisis”?, IP counting or revenue counting? and Matching technology supply with demand I have made some proposals that provide an opportunity to enhance the MPEG business model without reneging on its foundations.

Threats

The threats are concrete and serious. MPEG may become irrelevant if it sticks exclusively to an outdated business model. But MPEG is not a company; it is an organisation that operates based on rules established by the appropriate authorities within ISO, and where decisions are made by consensus.

Conclusions

This is just the beginning of the SWOT analysis. Very soon I will publish an article on Membership, Structure and Leadership.

Posts in this thread

 

Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge

Introduction

The title of this article is inspired by two masterpieces of German philosophy and literature. The first is Karl Marx’s “The Manifesto of the Communist Party” with the metaphor of the spectre (of communism) going around in Europe while the powers of conservation try to stop it. The second is Johann Wolfgang von Goethe’s “The Sorcerer’s Apprentice”, the story of an apprentice who thinks he can enchant a broom and get it to do some work for him, because he has seen his master doing it. The broom gets out of hand, the master comes back, the apprentice implores the master to help and the master sorts out the apprentice’s mess.

I agree that all the above is still rather cryptic and it is not at all clear what these two German works “combined” have to do with the topics that I usually deal with in this blog.

So let me explain: the broom is MPEG, the sorcerers are the people who run MPEG and the spectres are the multinational apprentices who think they can handle the MPEG broom because they have seen it done by those who know how to do it.

The apprentices are labelled spectres because the word indicates “something widely feared as a possible unpleasant or dangerous occurrence”.

Let’s talk about the MPEG broom

As I wrote in Who “owns” MPEG?, the word MPEG is used to indicate several related but often independent things. In one instance, MPEG stands for the “MPEG community”, i.e. the ensemble of people and entities who are affected by what the “MPEG group” does: end users, industries, companies who do business using MPEG standards, universities and research centres, and individuals with an MPEG technical background. Each element in the list is a microcosm, but here we are particularly interested in the last microcosm – individuals with an MPEG technical background. This is composed of active MPEG experts, non-attending registered MPEG experts, researchers working in companies on MPEG standards without being registered members, researchers at large who are doing research in areas that are, or are expected to become, MPEG standardisation areas, and consultants in MPEG matters.

All these people are MPEG stakeholders (the others, too, but here we concentrate on this particular microcosm). They rely on MPEG because MPEG serves them. MPEG owes part of its existence to the fact that they exist and operate. To a significant extent, these MPEG stakeholders can operate because MPEG exists.

The “MPEG group” is another microcosm where ideas percolate through different channels. As explained in Looking inside an MPEG meeting and How does MPEG actually work?, to become standardisation projects, ideas are processed in different ways. Requirements are developed, communicated to different environments and agreements “stipulated” with different industry stakeholders based on those requirements. Standardisation projects require technologies whose existence and performance levels must be verified. Technologies come into MPEG through different channels and are processed in different ways by different groups. Standards are verified against the stipulated performance. Finally, standards are living beings: they evolve and need maintenance.

MPEG is not a broom that operates by magic, at least not in Goethe’s sense: you do not need spirits (“Geister”) to use it. Still, it is a sophisticated broom that has taken its current shape as a result of a Darwinian process of incremental adaptations, matching the MPEG group to the needs of its expanding industry coverage, the continuous shifts in the way industry operates and the accelerating technology cycles.

The shape that MPEG has today is not final. If it were, that would mean that MPEG is dead. MPEG is evolving, and keeping it adapted to changing conditions is a serious matter. It cannot be left in the hands of some sorcerers’ apprentices.

MPEG and JPEG

Next to MPEG there is a group called JPEG. Everybody knows the word JPEG because of the .jpg extension of image files. The JPEG standard (ISO/IEC 10918-1, first released in 1992) has had a far-reaching impact on consumers because all handsets and computers can handle .jpg files and many important services have those files at the core of their business. But let’s make a comparison between MPEG and JPEG.

| Parameters | JPEG | MPEG |
| --- | --- | --- |
| Constituencies | Image | Broadcasting & AV streaming |
| Capability to evolve | Still working on images | Expanded field |
| Number of projects | A few in parallel | Several tens in parallel |
| Business models | “Royalty free” | “IP-encumbered” |
| Competition of standards | No | Very lively |
| Approaches | Holistic, top down | Bottom up |
| Industry/academia mix | 1:1 | 3:1 |
| Work force | 60 members | 600 members (+1000s outside) |
| Organisation | Simple | Sophisticated |
| Standards impact | Huge (2 standards) | Huge (many standards) |
| Future-oriented standards | Light field image | Point cloud & immersive video |

We see that the two groups are different in many key respects: the industries they serve (image vs broadcasting and streaming), the capability to make the best out of the field to serve industry needs, the number of projects (limited vs several tens), the type of standards they provide (royalty free vs encumbered), the competition (little vs a lot of competition), the approach used to develop standards (principle-based vs experience-based), the percentage of academia in the membership (50% vs 25%), the organisation (handling a few vs handling tens of parallel projects), the impact (2 standards vs many), future oriented standards (coding of light field images vs coding of point clouds & immersive video).

Simply, it is a law of nature: if the size scales by an order of magnitude, everything ends up being different.

Enter the sorcerers’ apprentices

Now, the apprentices think that, because a part of MPEG is handling some technologies that JPEG, too, is handling, we should interpenetrate JPEG (60 people working on images, as 30 years ago) and MPEG (600 people working on video, audio, systems and other data, who have designed the strategy that has made the media industry digital and fomented its continuous development) and create new groups creatively organised to manage the huge MPEG work programme, the vast array of technologies, the network of liaisons and the large swathes of client industries. A similar fate may also befall JPEG, which is at risk of dissolving in the flood caused by the apprentices, as in Goethe’s ballad.

A disclaimer is needed at this point: this sort of idea has nothing to do with the proposal to elevate the MPEG Working Group (WG) to Subcommittee (SC) status presented in Which future for MPEG. The elevation to SC status seeks to change the WG envelope into an SC envelope, keeping the inside – the work and its organisation – exactly the same. The other idea seeks to upset a working machine in the belief that the changes will work, much as Goethe’s sorcerer’s apprentice thought he could handle the magic broom.

A couple of expressions come to mind. The first is “an elephant in a china shop”. The effect of the apprentices’ proposal will be exactly this: it would merge proud and accomplished people serving different constituencies; operating in environments of largely different complexity in terms of projects, number of people and industry; with different business models and approaches; operating in differently competitive environments; with different 30-year histories and experiences… After the elephant has entered the shop, forget finding any piece of chinaware intact.

The second expression is “a camel is a horse designed by a committee”. Unfortunately, this is not a joke but the harsh reality of some environments where people with a lot of self-importance operate in areas where they have little or no competence or experience. MPEG is mostly free from such people. Indeed, the sorcerers’ apprentices’ proposal comes mostly from people who left MPEG a long time ago.

Some effects of the apprentices’ proposal

I could write a long list of negative effects, but let’s limit it to four.

  1. The MPEG brand. The proposed interpenetration will kill the MPEG brand affecting thousands of companies and researchers. Today researchers use their “I belong to MPEG” as a status symbol supporting their research. Tomorrow they will lose both their status and funding. The same applies to the JPEG brand.
  2. The MPEG credibility. The proposed interpenetration will mix two groups who share only one thing, and only in part: technology. Technology is important; however, designing the structure of an industrial standards group like MPEG on the basis of technology, instead of constituencies’ needs, wipes out the credibility built by thousands of MPEG experts in 30 years of well-considered efforts.
  3. The MPEG standards. The proposed interpenetration will alter the process by which MPEG standards are defined and developed. Industry will shy away from this new generation of self-styled “MPEG standards” because they will not fit their needs and will look elsewhere. The only sensible thing MPEG will be left with is the maintenance of the 180 standards that were produced by the real MPEG.
  4. The MPEG productivity. The proposed interpenetration will dramatically affect the number and quality of standards produced. One value of MPEG standards is the breadth and depth of their scope. More important, however, is the fact that MPEG standards are not independent specifications but are designed to work together thanks to the painstaking efforts of hundreds of MPEG experts from different areas.

The sorcerers have their hands tied

Decade after decade, generations of MPEG sorcerers have learnt the magic, but they are not free. If the MPEG broom is wrongly used, there will be no sorcerer coming back to help the apprentices undo their misdeeds. The apprentices may well moan die ich rief, die Geister, werd’ ich nun nicht los (I cannot get rid of the spirits I called), but no one will be capable of stopping the MPEG broom gone crazy.

Those who care about MPEG had better make themselves heard. At stake are trillions of USD year after year, billions of users, millions of workers and thousands of highly skilled researchers.

Posts in this thread


Does success breed success?

Introduction

Most readers will answer yes to the question asked in the title. Indeed, very often we see that the success of a human organisation breeds success. Until, I mean, the machine that looked like it could produce results forever “seizes up”. But don’t look elsewhere for the causes of failure: it’s not the machine; the causes are the humans inside and/or outside it.

In an age when things move and change fast, MPEG has been in operation for three decades. Its standards have achieved, and continue to achieve, enormous success serving billions of human beings: consumers, service providers and manufacturers.

This article offers some considerations on the best way for MPEG success to breed success – unless the goal is for success to breed failure. Apparently unrelated considerations are made in The Imperial Diet is facing a problem.

Recalling the MPEG story

MPEG started in 1988 as an “experts group” with the task of developing video coding standards for storage media at a rate of about 1.5 Mbit/s, like the compact disc (CD). This was because, in the second half of the 1980s, the Consumer Electronics and telco industries imagined that interactive video – local or via the network – was a killer application.

Within 6 months MPEG had already started working on audio coding because – it looks obvious now, but it was not so obvious at that time – if you have video you also need audio and, if you do not compress stereo audio, whose 1.41 Mbit/s is the full output bitrate of a CD, there will be no space left for video. In another 6 months MPEG had started working on “systems” aspects, those allowing a receiver to reproduce synchronised audio and video information.
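The bit-budget argument above can be checked with back-of-the-envelope arithmetic. The sketch below uses the standard CD audio parameters (44.1 kHz, 16 bits per sample, 2 channels) and an assumed compressed stereo rate of 192 kbit/s, a figure typical of MPEG-1 Audio Layer II but not stated in the text.

```python
# Bit budget for audio and video on a ~1.5 Mbit/s Compact Disc channel.
SAMPLE_RATE = 44_100        # Hz, CD audio sampling frequency
BITS_PER_SAMPLE = 16
CHANNELS = 2                # stereo

pcm_audio = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS  # uncompressed stereo, bit/s
cd_rate = 1_500_000         # approximate total bitrate MPEG-1 targeted, bit/s

print(f"Uncompressed stereo audio: {pcm_audio / 1e6:.2f} Mbit/s")            # 1.41 Mbit/s
print(f"Left for video without audio coding: {(cd_rate - pcm_audio) / 1e3:.0f} kbit/s")

compressed_audio = 192_000  # assumed MPEG-1 Layer II stereo rate, bit/s
print(f"Left for video with compressed audio: {(cd_rate - compressed_audio) / 1e6:.2f} Mbit/s")
```

With uncompressed audio only about 89 kbit/s would remain for video, which is why audio compression had to come first on the MPEG agenda.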

These were the first steps in the MPEG drive to make standards that left no “holes” for implementors. Thanks to these efforts, the scope of use of MPEG standards, still within “coding of moving pictures and audio”, has expanded like wildfire: starting from coding of moving pictures at 1.5 Mbit/s and extending to more video, audio, transport, protocols, APIs and more. With its standards, MPEG handles all technologies that facilitate enhanced use of digital media.

The MPEG expansion is a joyous phenomenon that has created an expanding global brotherhood of digital media researchers – in industry and academia – for which MPEG and its standards are the motivation for more research. If research results are good, they can make their way into some MPEG standard.

MPEG needs a structure

Clearly you cannot have hundreds of people discussing such a broad scope of technologies at the same time and place. You can split the work because technologies can be considered independent, up to a point. Eventually, however, as in a puzzle, all pieces have to find a place in the global picture. The MPEG structure has been implemented to allow the creation of ever more complex puzzles.

In its 31 years of activity MPEG has developed a unique organisation capable of channeling the efforts of thousands of researchers working at any one time on MPEG standards – only a fraction of which actually show up at MPEG meetings – into the suites of integrated standards that industry uses to churn out products and services worth trillions of USD a year.

The figure below depicts the MPEG structure from the viewpoint of the standard development workflow.

The MPEG workflow

Typically, new ideas come from members’ contributions, but can also be generated from inside MPEG. The Requirements group assesses and develops ideas and may go as far as to request “evidence” of existence and performance of technologies (Calls for Evidence – CfE) or actual “proposals” for fully documented technologies (Call for Proposals – CfP).

MPEG has never had a “constituency” because it develops horizontal standards cutting across industries. It has established liaisons with tens of industries and communities through their standards committees or trade associations. We call many of them “client industries” in the sense that they provide their requirements to MPEG, against which MPEG produces standards. At every meeting, several tens of input liaisons are received and about the same number of output liaisons are issued.

Many CfPs cover a broad range of technologies that are within the competence of the different MPEG groups. The adequacy of submitted technologies is tested by the Test Group. The submitted proposals and the test results are provided to the appropriate technical groups – Systems, Video, Audio and 3D Graphics.

The Chairs group includes the chairs of all groups. Its task is to assess the progress of work, uncover bottlenecks, identify the need to discuss interests shared between groups, and organise joint meetings to resolve issues.

An MPEG week is made of intense days (sometimes continuing until midnight). Coordinated work, however, does not stop when the meeting ends. At that point MPEG establishes tens of ad hoc groups with precise goals for collaborative development, to be reported on at the next meeting.

The Communication group has the task to keep the world informed of the progress of the work and to produce white papers, investigations and technical notes.

MPEG is not an empire

From the above, one may think that MPEG is an empire, but it is not. MPEG is a working group, the lowest layer of the ISO hierarchy, in charge of developing digital media standards. It formally reports to a Subcommittee called SC 29 but, as I have explained in Dot the i’s and cross the t’s, SC 29 has ended up with a laissez-faire attitude that has allowed MPEG to autonomously develop strategy, organisational structure and network of client industries. MPEG standards have given client industries the tools to make their analogue infrastructures digital and, subsequently, to leverage successive generations of standard digital media technologies to expand their business. With some success, one could say.

The MPEG organisation is robust. Virtually the same organisation has been in place since 25 years ago, when MPEG already had an attendance of 300. Groups have come and gone, and the structure currently in operation has been refined multiple times in response to actual needs. Changes have been effected, and there will be more changes in the future. However, they all have been and, as far as I can see, will remain incremental adaptations that perfect one aspect or another of the structure. With this structure, more than 150 standards have been produced, some of which have been wildly successful.

MPEG can count on three assets: the logic of its structure, the experience gained over all those years, and its membership together with its client industries. With these, MPEG success can breed more success in the years to come.

The Imperial Diet is facing a problem

I said before that MPEG is not an empire. In the imperial context of the Holy Roman Empire, MPEG could be defined as a Margraviate in charge of defending and extending a portion of the frontiers of the Empire. A Margraviate reported to a Kingdom, which reported to the Imperial Diet.

Now, let’s suppose that the Imperial Diet has requested the S Kingdom to review the status of its two J and M Margraviates and propose a new arrangement. The main element in the decision is the size of the two Margraviates: 10% of the territory of the S Kingdom for the J Margraviate and 90% for the M Margraviate. Ruling out other fancy ideas, the S Kingdom has two options: request that the M Margraviate be elevated to Kingdom status or create a few smaller Margraviates inside the S Kingdom out of the M Margraviate.

There is a problem, though, if the M Margraviate is cut into smaller Margraviates: the Margraviates of the Holy Roman Empire are not pieces in a game of dominoes. For decades the M Margraviate has fought hard to extend its territory – hence the Holy Roman Empire’s territory – to lands that until then were occupied by unruly tribes. It has been successful in its endeavours because it had large armies with different skills: archers, knights, foot soldiers and more. By skilfully coordinating these specialised troops, the M Margraviate was able to conquer new lands and make them faithful fiefdoms.

But there is another important consideration: wild hordes are coming from the steppes of Central Asia with a completely new warfare technique. Some armies of the M Margraviate are having a hard time dealing with them, even though they are learning a trick or two to fight back.

How could the new armies of the different Margraviates created out of the M Margraviate defend – never mind extend – the frontier, when the S Kingdom does not know the territory, having lived all this time in its castle, and has never led an army?

The Holy Roman Empire lasted 1,000 years. There is no doubt that the Imperial Diet would make the M Margraviate a Kingdom, keeping its armies and structure unchanged. Warfare is a serious business and the effective defence of the frontiers is the priority.

Conclusions

Fortunately, today there are no Margraviates and Kingdoms anymore, much less a Holy Roman Empire. There are also no new territories to conquer by force of arms and no frontiers to defend against rebellious hordes.

I realise now that at the beginning of this article I promised to offer some considerations on the best way for MPEG success to breed success, and not failure. Maybe I will do that next time.
