On the convergence of Video and 3D Graphics

Introduction

For a few years now, MPEG has explored the issue of how to efficiently represent (i.e. compress) data from a range of technologies offering users dynamic immersive visual experiences. Here the word “dynamic” captures the fact that the user can have an experience where objects move in the scene, as opposed to being static.

The difference between static and dynamic may not appear conceptually important. In practice, however, products that handle static scenes may be orders of magnitude less complex than those handling dynamic scenes. This is true both at the capture-encoding side and at the decoding-display side. This consideration implies that industry may need standards for static objects much earlier than for dynamic objects.

Industry has guided MPEG to develop two standards that are based on two approaches that are conceptually similar but are targeted to different applications and involve different technologies:

  1. Point clouds generated by multiple cameras and depth sensors in a variety of setups. These may contain up to billions of points with colours, material properties and other attributes to offer reproduced scenes characterised by high realism, free interaction and navigation.
  2. Multi-view videos generated by multiple cameras that capture a 3D scene from a pre-set number of viewpoints. This arrangement can also provide limited navigation capabilities.

The compression algorithms employed for the two sources of information have similarities as well as differences. The purpose of this article is to briefly describe the algorithms involved in the case of a general point cloud and in the particular case that MPEG calls 3DoF+ (central case in Figure 1), to investigate to what extent the algorithms are similar or different, and to explore whether they can share technologies today and in the future.

Figure 1 – 3DoF (left), 3DoF+ (centre) and 6DoF (right)

Computer-generated scenes and video are worlds apart

A video is composed of a sequence of matrices of coloured pixels. A computer-generated 3D scene and its objects are not represented like a video but by geometry and appearance attributes (colour, reflectance, material…). In other words, a computer-generated scene is based on a model.

Thirty-one years ago, MPEG started working on video coding and 7 years later did the same for computer-generated objects. The (ambitious) title of MPEG-4 “Coding of audio-visual objects” signalled MPEG’s intention to handle the two media types jointly.

Until quite recently the Video and 3D Graphics competence centres (read Developing standards while preparing for the future to learn how work in MPEG is carried out by competence centres and units) largely worked independently. This changed when the need to compress real-world 3D objects in 3D scenes became important to industry.

The Video and 3D Graphics competence centres attacked the problem from their own specific backgrounds: 3D Graphics used Point Clouds because they are a 3D graphics representation (they have geometry), while Video used the videos obtained from a number of cameras (because these only have colours).

Video came up with a solution that is video-based (obviously, because there was no geometry to encode), while 3D Graphics came up with two solutions: one that encodes the 3D geometry directly (G-PCC) and another that projects the point cloud objects onto fixed planes (V-PCC). In V-PCC, traditional video coding can be applied because the geometry is implicit.

Point cloud compression

MPEG is currently working on two PCC standards: G-PCC, a purely geometry-based approach that shares little with conventional video coding, and V-PCC, which is heavily based on video coding. Why do we need two different algorithms? Because G-PCC does a better job in “new” domains (say, automotive), while V-PCC leverages video codecs already installed on handsets. The fact that V-PCC is due to reach FDIS in January 2020 makes it extremely attractive to an industry where novelty in products is a matter of life or death.

V-PCC seeks to map each point of the 3D cloud to a pixel of a 2D grid (an image). To be efficient, this mapping should be as stationary as possible (only minor changes between two consecutive frames) and should not introduce visible geometry distortions. The video encoder can then take advantage of the temporal and spatial correlations of the point cloud geometry and attributes by maximising temporal coherence and minimising distance/angle distortions.

The 3D-to-2D mapping should guarantee that all the input points are captured by the geometry and attribute images so that they can be reconstructed without loss. Simply projecting the point cloud onto the faces of a cube or a sphere bounding the object does not guarantee lossless reconstruction, because auto-occlusions (points projected onto the same 2D pixel are not captured) may generate significant distortions.

To avoid these negative effects, V-PCC decomposes the input point cloud into “patches”, which can be independently mapped to a 2D grid through a simple orthogonal projection. Mapped patches do not suffer from auto-occlusions, do not require re-sampling of the point cloud geometry, and can have smooth boundaries. The challenge is to produce such patches while minimising their number and the mapping distortions. This is an NP-hard optimisation problem that V-PCC solves by applying the heuristic segmentation approach of Figure 2.

Figure 2: from point cloud to patches

An example of how an encoder may operate is provided by the following steps (note: the encoding process is not standardised; a code sketch of the clustering step is given after Figure 3):

  1. At every point the normal on the point cloud “surface” is estimated;
  2. An initial clustering of the point cloud is obtained by associating each point to one of the six planes forming the unit cube (each point is associated with the plane that has the closest normal). Projections on diagonal planes are also allowed;
  3. The initial clustering is iteratively refined by updating the cluster index associated with each point based on its normal and the cluster indexes of its nearest neighbours;
  4. Patches are extracted by applying a connected component extraction procedure;
  5. The 3D patches so obtained are projected and packed into the same 2D frame.
  6. The only attribute per point that must be encoded is the colour (see right-hand side of Figure 3); other attributes, such as reflectance or material properties, can optionally be encoded.
  7. The distances (depths) of the points to the corresponding projection plane are used to generate a grey-scale image which is encoded with a traditional video codec. When the object is complex and several points project onto the same 2D pixel, two depth layers are used, encoding a near plane and a far plane (the left-hand side of Figure 3 shows the case of a single depth layer).

Figure 3: Patch projection
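
As an illustration of step 2, the following minimal Python sketch (not part of the V-PCC specification; the names are mine and the diagonal planes are omitted) assigns each point to the cube face whose normal is closest to the point's estimated normal. A real encoder would follow this with the iterative refinement and connected-component extraction of steps 3 and 4.

```python
import numpy as np

# The six axis-aligned projection directions of the unit cube
# (+X, -X, +Y, -Y, +Z, -Z); V-PCC also allows diagonal planes,
# which are omitted here for brevity.
CUBE_NORMALS = np.array([
    [ 1, 0, 0], [-1, 0, 0],
    [ 0, 1, 0], [ 0, -1, 0],
    [ 0, 0, 1], [ 0, 0, -1],
], dtype=float)

def initial_clustering(normals: np.ndarray) -> np.ndarray:
    """Return, for each point, the index (0..5) of the cube face whose
    normal is closest to the point's estimated normal.

    normals: (N, 3) array of unit normals estimated on the point cloud
    "surface" (step 1 of the encoder example).
    """
    # Cosine similarity between every point normal and every face normal
    scores = normals @ CUBE_NORMALS.T          # shape (N, 6)
    return np.argmax(scores, axis=1)           # cluster index per point

# Toy usage: three points with normals roughly along +X, -Z and +Y
if __name__ == "__main__":
    n = np.array([[0.9, 0.1, 0.0], [0.1, 0.0, -0.95], [0.0, 1.0, 0.0]])
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    print(initial_clustering(n))               # -> [0 5 2]
```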

3DoF+ compression

3DoF+ is a simpler instance of the general visual immersion scenario to be specified by Part 12 Immersive Video of MPEG-I. In order to provide sufficient visual quality for 3DoF+, a large number of source views needs to be used, e.g. 10~25 views for a 30 cm radius viewing space. Each source view can be captured as omnidirectional or perspectively projected video with texture and depth.

If such a large number of source views were independently coded with legacy 2D video coding standards, such as HEVC, an impractically high bitrate would be generated, and a costly large number of decoders would be required to view the scene.

The Depth Image Based Rendering (DIBR) inter-view prediction tools of 3D-HEVC may help to reduce the bitrate, but the 3D-HEVC codec is not widely deployed. Additionally, the parallel camera setting assumption of 3D-HEVC may affect the coding efficiency of inter-view prediction with arbitrary camera settings.

MPEG-I Immersive Video targets the support of 3DoF+ applications with a significantly reduced coding pixel rate and a limited bitrate, using a limited number of legacy 2D video codecs applied to suitably pre- and post-processed videos.

The encoder is described by Figure 4.

Figure 4: Process flow of 3DoF+ encoder

An example of how an encoder operates is described below (note that the encoding process is not standardised):

  1. One or more views are selected from the source views;
  2. The selected source views are called basic views and the non-selected views additional views;
  3. All additional views are pruned by synthesising the basic views at the additional view positions and erasing the non-occluded areas (see the sketch after Figure 6);
  4. Pixels left in the pruned additional views are grouped into patches;
  5. Patches in a certain time interval may be aggregated to increase temporal stability of the shape and location of patches;
  6. Aggregated patches are packed into one or multiple atlases (Figure 5).

Figure 5: Atlas Construction process

  7. The selected basic view(s) and all atlases with patches are fed into a legacy encoder (an example of how the input looks is provided by Figure 6).

Figure 6: An example of texture and depth atlas with patches
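
To make the pruning of step 3 more concrete, here is a minimal sketch under simplifying assumptions: a DIBR step (not shown) is assumed to have already synthesised a depth map from the basic view(s) at the position of the additional view; the function name, parameters and threshold are illustrative, not part of the specification. Pixels that the basic views already predict are erased; only the remaining, mostly disoccluded, pixels are later grouped into patches (step 4).

```python
import numpy as np

def prune_additional_view(add_depth, add_texture, synth_depth, synth_valid,
                          depth_threshold=0.01):
    """Return a keep-mask and the pruned texture of an additional view.

    add_depth   : (H, W) depth map of the additional view
    add_texture : (H, W, 3) texture of the additional view
    synth_depth : (H, W) depth synthesised from the basic view(s) at the
                  additional view position (the DIBR step, not shown here)
    synth_valid : (H, W) bool, False where the basic views see nothing
                  (i.e. disoccluded areas)
    A pixel is erased when the basic views already predict it, i.e. the
    synthesised depth is valid and close enough to the additional depth.
    """
    predicted = synth_valid & (np.abs(add_depth - synth_depth) < depth_threshold)
    keep = ~predicted                                   # occluded / unpredicted pixels
    pruned_texture = np.where(keep[..., None], add_texture, 0)
    return keep, pruned_texture
```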

The atlas parameter list of Figure 4 contains, for each patch in the atlas, its starting position in the atlas, its source view ID, its location in the source view and its size. The camera parameter list comprises the camera parameters of all the indicated source views.
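
The sketch below shows one possible in-memory representation of this metadata; the field names are hypothetical and do not reflect the actual syntax of the MPEG-I Immersive Video specification.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PatchParams:
    """One entry of the atlas parameter list."""
    atlas_x: int          # starting position of the patch in the atlas (x)
    atlas_y: int          # starting position of the patch in the atlas (y)
    source_view_id: int   # view the patch was pruned from
    view_x: int           # location of the patch in the source view (x)
    view_y: int           # location of the patch in the source view (y)
    width: int            # patch size
    height: int

@dataclass
class CameraParams:
    """Camera parameters of one source view (projection type and pose)."""
    view_id: int
    projection: str                  # e.g. "omnidirectional" or "perspective"
    position: Tuple[float, float, float]
    orientation: Tuple[float, float, float]   # e.g. yaw, pitch, roll
    # intrinsics such as focal length / field of view would also appear here

@dataclass
class AtlasMetadata:
    atlas_parameter_list: List[PatchParams]
    camera_parameter_list: List[CameraParams]
```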

At the decoder (Figure 7) the following operations are performed:

  1. The atlas parameter and camera parameter lists are parsed from the metadata bitstream;
  2. The legacy decoder reconstructs the atlases from the video bitstream;
  3. An occupancy map with patch IDs is generated according to the atlas parameter list and the decoded depth atlas (a sketch is given after Figure 7);
  4. When users watch the 3DoF+ content, the viewports corresponding to the position and orientation of their head are rendered using patches in the decoded texture and depth atlases, and corresponding patch and camera parameters.

Figure 7: Process flow of 3DoF+ decoder
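
As an illustration of decoder step 3, the following self-contained sketch builds an occupancy map carrying patch IDs from the decoded depth atlas and a list of patch rectangles taken from the atlas parameter list. The convention that unoccupied atlas samples carry a reserved depth value of 0, like the names, is an assumption made for the example, not a normative rule.

```python
import numpy as np

def build_occupancy_map(depth_atlas, patches, unoccupied_depth=0):
    """Return an occupancy map where each occupied sample carries the ID
    of the patch it belongs to, and -1 marks unoccupied samples.

    depth_atlas : (H, W) decoded depth atlas
    patches     : list of (patch_id, x, y, width, height) rectangles,
                  taken from the atlas parameter list
    """
    occupancy = np.full(depth_atlas.shape, -1, dtype=int)
    for patch_id, x, y, w, h in patches:
        block = depth_atlas[y:y + h, x:x + w]
        # Within the patch rectangle, only samples with a real depth are occupied
        occupied = block != unoccupied_depth
        occupancy[y:y + h, x:x + w][occupied] = patch_id
    return occupancy
```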

Figure 8 shows how the quality of synthesised viewports decreases with decreasing number of views. With 24 views the image looks perfect, with 8 views there are barely visible artefacts on the tube on the floor, but with only two views artefacts become noticeable. The goal of 3DoF+ is to achieve the quality of the leftmost image when using the bitrate and pixel rate for the rightmost case.

Figure 8: Quality of synthesized video as a function of the number of views

Commonalities and differences of PCC and 3DoF+

V-PCC and 3DoF+ can use the same 2D video codec, e.g. HEVC. For 3DoF+, the input to the encoder and the output from the decoder are sequences of texture and depth atlases containing patches; these are somewhat similar to the V-PCC sequences of geometry/attribute video data, which also contain patches.

Both 3DoF+ and V-PCC have metadata describing the positions and parameters of the patches in the atlas or video. However, 3DoF+ must describe the view ID each patch belongs to and its camera parameters in order to support flexible camera settings, while V-PCC just needs to indicate which of the 6 fixed cube faces each patch is projected onto. V-PCC does not need camera parameter metadata.

3DoF+ uses a renderer to generate a synthesised viewport at any desired position and towards any direction, while V-PCC re-projects the pixels of the decoded video into 3D space to regenerate the point cloud.

Further, the goal of V-PCC is to reconstruct the 3D model, i.e. to obtain the 3D coordinates of each point. For 3DoF+, the goal is to obtain some additional views by interpolation, but not necessarily any possible view. While both methods use patches/atlases and encode them as video + depth, the encoders and decoders are very different because the input formats (and, implicitly, the output formats) are completely different.

The last difference is how the two groups developed their solutions. It is already known that G-PCC has much more flexibility in representing the geometry than V-PCC. It is also expected that compression gains will be bigger for G-PCC than for V-PCC. However, the overriding advantage of V-PCC is that it can use existing and widely deployed video codecs. Industry would not accept dumping V-PCC to rely exclusively on G-PCC.

How can we achieve further convergence?

You may ask: I understand the differences between PCC and 3DoF+, but why was the convergence not identified at the start? The answer lies in the nature of MPEG.

MPEG could have done that if it were a research centre. At its own will, it could have put researchers together on common projects and given them the appropriate time. Eventually, this hypothetical MPEG could have merged and united the two cultures (within its organisation, not the two communities at large), identified the common parts and, step by step, defined all the lower layers of the solution.

But MPEG is not a research centre, it is a standards organisation whose members are companies’ employees “leased” to MPEG to develop the standards their companies need. Therefore, the primary MPEG task is to develop the standards its “customers” need. As explained in Developing standards while preparing for the future, MPEG has a flexible organisation that allows it to accomplish its primary duty to develop the standards that industry needs while at the same time explore the next steps.

Now that we have identified that there are commonalities, does MPEG need to change its organisation? By no means. Look at the MPEG organisation in Figure 9.

Figure 9 – The flat MPEG organisation

The PCC work is developed by a 3DG unit (soon to become two because of the widely different V-PCC and G-PCC) and the 3DoF+ standard is developed by a Video unit. These units are at the same level and can easily talk to one another, now more than ever because they have concrete matters to discuss. This will continue with the next challenge of 6DoF, where the user can freely move in a virtual 3D space corresponding to a real 3D space.

The traditional Video and 3D Graphics tools can also continue to be in the MPEG tool repository and continue to be supplemented by new technologies that make them more and more friendly to each other.

This is the power of the flat and flexible MPEG organisation as opposed to the hierarchical and rigid organisation advocated by some. A rigid hierarchical organisation where standards are developed in a top-down fashion is unable to cope with the conflicting requirements that MPEG continuously faces.

Conclusions

MPEG is synonymous with technology convergence, and the case illustrated in this article is just the most recent one. It indicates that more such cases will appear in the future, as more sophisticated point cloud compression is introduced and the technologies supporting the full navigation of 6DoF become available.

This can happen without the need to change the MPEG organisational structure, because the MPEG organisation has been designed to allow units to interact in the same easy way whether they are in the same competence centre or in different ones.

Many thanks to Lu Yu (Zhejiang University) and Marius Preda (Institut Polytechnique de Paris) who are the real authors of this article.

Developing standards while preparing for the future

Introduction

In Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge I described the case of an apprentice who operates in an environment that practices some mysterious arts (what in the past could have been called sorcery). Soon the apprentice wakes up to the importance of what he is learning/doing and thinks he has learned enough to go his own way.

The reality is that the apprentice has not yet learnt the art, not because he has not studied and practiced it diligently and for long enough. He has not learnt it because no one can say “I know the art”. One can say to have practiced the art for a certain time, to have gained that much experience or to have had this and that success (better not to talk of failures or, better, to talk of successes after failures). The best one can say is that the successes were the result of time-tested teamwork.

Nothing fits this description better than the organisation of human work, and this article will deal with how MPEG has developed a non-apprentice-based organisation.

Work organisation across the millennia

Thousands of years ago (and even rather recently) there was slave labour. By “owning the humans”, people assumed they could impose any task on them (until a Spartacus came along, I mean).

More advanced than slavery is hired labour, because humans are not owned, you pay them and they do work for you. You can do what you want within a contract but only up to a point (much lower than with slave labour). If you cross a threshold you have to deal with unions or simply with workers leaving for another job and employer.

Fortunately, there have been many innovations over and beyond this archaic form of relationship. One case is when you have a work force hired to do intellectual work, such as research on new technologies. Managing the work force is more complicated, but there is an unbeatable tool: the promise to share the revenues of any invention that the intellectual worker makes.

Here, too, there are many variations to bind researcher and employer, to the advantage of both.

The MPEG context is quite different

Apart from not being a company, MPEG has a radically different organisation from any of those described above. In MPEG there are “workers”, but MPEG has no control over them because MPEG is not paying any salary. Someone else does. Still, MPEG has to nurture the relationship between itself, the “virtual employer”, and its “workers”, if it is not to produce dull standards.

Here are some of the ways: projecting the development of a standard as a shared intellectual adventure, pursuing the goal with a combination of collective and personal advantages, promoting a sense of belonging to a great team, flashing the possibility of acquiring personal fame because “we are making history here”, and more.

For the adventure to be possible, however, MPEG has to entice two types of “worker”. One is the researcher who knows things and the other the employer who pays the salary. Both have to buy into the adventure.

This is not the end of the story, because MPEG must also convince the users of the standard that it will make sense for their industrial plans. By providing requirements, the users of the standard establish a client-supplier relationship with MPEG.

Thirty years ago, matters were much simpler because the guy who paid the salary was regularly the same guy who used the standard. Today things are more complicated: the guy who pays the salary of the “worker” may very well have no relationship with the guy who uses the standard, because his role may stop at providing the technologies that are used in the standards.

Organising the work in MPEG

So far so good. This is the evolution of an established business that MPEG brought about. This evolution, however, was accompanied by substantial changes in the object of work. In MPEG-1 and MPEG-2, audio, video, 3D graphics and systems were rather well delimited areas (not really, but compared with today, certainly so). Starting with MPEG-4, however, the different pieces of MPEG business got increasingly entangled.

If MPEG had been a company, it could have launched a series of restructurings, a favourite activity of many companies who think that a restructuring shows how flexible their organisation is. They can think and say that because they are not aware of the human costs of such reorganisations.

I said that MPEG is not a company and MPEG “workers” are not really workers but researchers rented out by their employers, or self-styled entrepreneurs, or students working on a great new idea etc. In any case, MPEG “workers” are intellectually highly prominent individuals.

When it started its work on video coding for interactive applications on digital media, MPEG did not have particularly innovative organisational ideas. Little by little it extended the scope of its work to other areas that were required to provide complete audio-visual solutions.

MPEG ended up building a peculiar competence-centre-based organisation by reacting to the changing conditions at each step of its evolution. The organisation has gradually morphed (see here for the full history). Today the competence centres are: Requirements, Systems, Video, Video collaborations with ITU-T, Audio, 3D Graphics and Tests.

The innovative parts of the MPEG organisation are the units formed inside each of these competence centres. They can have one of two main goals: to address specific items that are required to develop one or more standards, or to investigate issues that may not be directly related to a standard. This is a mix of two developments, technologies for today's standards and know-how for future ones, of which MPEG is proud.

Units may be temporary or long-lived. All units are formed around a natural leader, often as the result of an ad hoc group whose creation may have been triggered by a proposal.

A graphical description of the MPEG organisation is provided by Figure 1

Figure 1 – The flat MPEG organisation

The units working on standards under development produce outputs which are integrated at the relevant competence centre plenaries and implemented by the editors with the assistance of the experts who developed the component technologies. The activity of the units of the joint groups with ITU-T is limited to the development of specific standards. Therefore, they do not have units working on explorations unless these are directly aimed at providing answers to open issues in the standards under development.

This is MPEG’s modus operandi, which some (outside MPEG) mistake for the MPEG process described in How does MPEG actually work?. Nothing is farther from the truth. MPEG’s modus operandi is the informal but effective organisation that permeates the MPEG work and allows interactions to happen when, where and by whom they are needed. It is a system that allows MPEG to get the most out of individual initiative, combining the need to satisfy industry needs now with the need to create the right conditions for future standards.

Proposing, as mentioned in Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge, to create a group that merges the Video and 3D Graphics competence centres based on a hierarchical structure with substructures belongs to the prehistory of work organisation – fortunately not stretching back to slave labour. This is something that today would not even be considered in the organisation of a normal company, to say nothing of the organisation of a peculiar entity such as MPEG.

Units are highly mobile. They interact with other groups either because an issue is identified by the competence centre chairs or by the competence centre plenaries, or on the initiative of the unit itself. Interactions can also take place between groups or between units in different groups.

The number of units at any given time is rather large, exceeding 20 or even 30. Therefore the IT support system described in Digging deeper in the MPEG work, reproduced below, helps MPEG members keep up with the dynamics of all these interactions by providing information on what is being discussed, where and when.

Figure 2 – How MPEG people know what happens where

Conclusions

A good example of how MPEG’s modus operandi can pursue its primary goal of producing standards, while at the same time keeping abreast of what comes next, is the common layer shared by 3DoF+ and 3DG. This is something that MPEG thought conceptually existed and that could have been designed in a top-down fashion. We did not do it that way because MPEG is not a research organisation that pursues the goal of furthering the understanding of science. MPEG is a not-for-profit organisation developing standards while at the same time preparing the know-how for the next challenges. By not imposing a vision of the future, but doing the work of today while investigating the next steps, we get ready to respond to future requests from industry.

What the next steps of the 3DoF+ and 3DG convergence will be is another story for another article.

No one is perfect, but some are more accomplished than others

In quite a few of my articles on this blog I have described how well MPEG has managed to create generic standards for the media industry, how many new standards keep on being produced to support the expansion of the media business and how much the expansion has brought benefits to that industry.

The first part of the title, however, reminds us that MPEG is an organisation created by humans and populated by humans. As MPEG is not and cannot be perfect, it is appropriate to ask what MPEG's level of performance is.

Being able to constantly answer the performance question is important for MPEG because there must be a system in place that can be used to constantly monitor the performance of the group. Even if performance is excellent today, there is no reason to rest because MPEG may very well not be excellent tomorrow.

The problem in trying to answer the question of MPEG performance is that MPEG is not a company, but a standards committee. What are the Key Performance Indicators (KPI) of a standards committee like MPEG? For sure MPEG is “successful” (it does provide useful standards), but is it successful enough compared to the possibilities that its mission offers?

In this article I will try to answer the question of MPEG's adequacy to its mission by applying the SWOT (Strengths-Weaknesses-Opportunities-Threats) methodology to a set of parameters: Context, Scope of standards and Business model.

Obviously, these are not the only parameters relevant to assessing MPEG's adequacy to its mission.

SWOT is an excellent methodology because it separates internal issues (strengths and weaknesses) from external ones (opportunities and threats), even though it is not always easy to allocate an issue to one side or the other.

The SWOT analysis reported here uses in part the results of the SWOT analysis carried out by the Italian National Body UNI in their proposal to make MPEG a subcommittee.

In the future other parameters will be considered: Membership, Structure, Leadership, Client industries, Collaboration, Standards development, Standards adoption, Innovation capability, Communication and Brand.

This is quite an engaging plan of work. So, expect to see more episodes on this blog.

Context

By “context” I mean the framework in which MPEG operates, namely ISO and IEC, but also in part ITU, because MPEG and ITU collaborate in JCT-VC and JVET.

Strengths

ISO, IEC and ITU are the topmost international standards developing organisations (SDO). Their standards have a high reputation because they are produced in environments governed by rigid rules (the ISO/IEC directives). Published standards are of a high editorial quality because they result from a rigorous process.

The very fact that such an atypical organisation like MPEG could take root, create a network of contacts, develop standards, influence large swathes of industries and thrive shows the strength of the context in which MPEG operates.

Since they are labelled ISO, MPEG standards can become part of a conformance assessment program or even be referenced by legislation.

Weaknesses

Being international organisations, ISO, IEC and ITU are hierarchical and bureaucratic, in different measures and in different domains. As I explained in MPEG and ISO, ISO covers all areas of international standardisation, with the exception of electrotechnical (IEC) and telecommunication (ITU) matters. The same rules apply to all parts of ISO and this has a high price tag. They are, almost by definition, slow to adapt to fast-changing industry environments. Publication of standards takes time, often a year or more after the final draft international standard is released by the originating committee.

MPEG is quite different from most ISO/JTC 1 entities because it develops almost no standards that address terminology, policies, architectures, frameworks, etc. Indeed, most MPEG standards are deeply technical, bit-level specifications. They are actually closer to an internal company specification than to a regular architecture or framework standard.

Some of the ISO/IEC rules have serious impacts on MPEG’s ability to develop timely and bug-free standards.

MPEG is a working group (WG), the lowest level of the ISO structure. To design, develop and “market” its standards, MPEG needs to establish liaisons with many other bodies. However, the process that establishes liaisons is slow because of many hierarchical layers.

Establishing liaisons with a non-ISO/IEC/ITU entity is cumbersome because the target group must apply for “recognition” by ISO, a step that not many bodies, particularly those with a high status, are ready to undertake.

Opportunities

Being part of a huge organisation, MPEG may establish liaisons with any committee in that organisation. JTC 1 has a standing agreement with ITU-T whereby a common standard can be easily developed jointly by JTC 1 and ITU-T groups by making reference to that agreement.

Since MPEG standards cut across a large number of application domains, many of which are under the purview of the 3 international SDOs, MPEG can be the vehicle that helps foster communication among the 3 bodies.

Threats

MPEG is a large WG, having reached an attendance of 600 participants at the last (July 2019) meeting. There is a risk that ISO blindly applies its directives calling for WGs to have a “restricted” number of members and to be of “limited” size. Should this happen, the MPEG standards that industry needs would no longer be produced, or would be produced with lower quality. Nothing would replace the swift infrastructure that enables technology integration. A suitable new structure would take years to emerge, while there is no time to quibble because competition from other standards based on different business models is eating away at industries that used to be MPEG client industries.

A decision to break up MPEG could have extremely serious consequences on an industry that has been accustomed for decades to interact with and to receive standards from MPEG. The consequences will extend to thousands of highly skilled researchers, millions of workers and billions of consumers.

Scope of standards

A simplified version of the MPEG scope of work is

  1. Efficient coding and compression of digital representations of light and sound fields
  2. Efficient coding and compression of other digital data
  3. Support to digital information coding and compression

#1 refers to the traditional audio, video and 3D graphics compression, including immersive media, #2 concerns other data, e.g. compression of DNA samples or neural networks and #3 concerns ancillary, but no less important, topics such as file and transport formats.

Strengths

A major strength of the MPEG scope is the fact that it extends through the entire chain enabled by compression and decompression technologies. The breadth of scope has allowed MPEG to develop suites of standards that can be individually used, but also used in an integrated package.

Figure 1 – Integrated MPEG standard suites

Its broad scope has thus enabled MPEG to create a digital media ecosystem composed of technology providers feeding their technologies into the MPEG technology repository via Calls for Proposals, which product manufacturers can integrate to serve the MPEG client industries.

Figure 2 – The MPEG industries: Technology, Implementation and Client

MPEG’s ability to cover the entire space defined by its scope has been a major element in the success of MPEG standards. This has been in place since the early days of MPEG-1, when the need to provide a complete specification prompted MPEG to also work on Audio and Systems issues, and of MPEG-2, when video distribution on analogue channels prompted MPEG to develop the famous Emmy Award-winning and long-lived MPEG-2 Transport Stream, and CATV applications prompted MPEG to develop the DSM-CC protocols.

Another strength is the fact that MPEG standards result from collaborative developments and are maintained by the MPEG “community”.

Weaknesses

Thirty-one years ago, when MPEG held its first meeting, companies were already applying digital technologies, but in an uncoordinated manner. When MPEG digital audio and video compression standards became available, they sold like hot cakes because the market demanded the savings made possible by standard digital technologies and the new opportunities offered by MPEG standards. Today markets keep on demanding the old forms of savings, but also new services enabled by new media technologies. The more complex technology scenario makes it increasingly difficult to understand which standards, based on which technologies, are needed by industry.

Because MPEG is mostly a technical group, it is also difficult to have the appropriate number of market experts with the appropriate knowledge to develop the requirements for new projects.

Opportunities

The MPEG scope offers a very large number of opportunities for standardisation in the audio-visual domain, in the traditional space, in the new “immersive media space” and in the new non-media data compression space. Since digitisation is becoming more and more a buzzword, more industries are discovering the benefits of handling their processes with digital technologies. The large amounts of digital data so generated can benefit from compression, as I wrote in Compression – the technology for the digital age. Compression can be used to optimise and enhance their processes and provide, much as it happened for the media industries, unrelenting expansion to their businesses. This could happen any time soon in two other domains that MPEG has already tackled: genomic information (see Genome is digital, and can be compressed) and neural networks (see Moving intelligence around). In The MPEG frontier I have elaborated on some of the opportunities.

Threats

Data compression is important, and MPEG does offer plenty of solutions to achieve that in different instances. Data compression is a crucial enabling technology, but customers need more than that. This has been a recurring theme in all MPEG standards from the time (1990) MPEG realised that software could be used not just to run video and audio compression algorithms, but also as a way to specify the standard.

Industry has evolved a great deal since then. In some environments that MPEG could claim are part of its purview, the standard is just the software. Some organisations, alliances or even companies offer high-quality software without a textual specification. This means that, even if the software may be open source, the actual specification is practically “hidden”, and the reference software may easily become the only implementation, because it is “the specification”.

MPEG prides itself on its ability to produce bare-bones standards that specify the minimum, but it is threatened by other entities who can provide packages whose completeness is not matched by MPEG specifications.

A related threat comes from the confusion generated by the fact that other standards organisations may produce standards on, say, AR/VR that appear to compete with MPEG compression standards, while they are at completely different levels.

A major threat is the possible change of mind on the part of ISO regarding the size of MPEG. A decision to split MPEG into its component elements would be a disaster because, as mentioned above, MPEG acts as an ecosystem of groups competent on a wide range of interacting technologies assembled to produce integrated and coherent standards. Note that MPEG experts do not feel uncomfortable with the size of the group, because the wide scope gives them the opportunity to be exposed to more views, issues and opportunities.

Business model

MPEG is not a for-profit entity. However, it operates on the basis of an implicit “business model” that has powered its 30-year-long continuous expansion. In plain words:

  1. MPEG develops high-performance standards using the best technologies available, as offered in response to Calls for Proposals (CfP);
  2. Patents holders receive royalties through mechanisms that do not involve MPEG and usually re-invest those royalties in new technologies;
  3. MPEG develops new generations of MPEG standards using those new technologies.

Strengths

The very existence of MPEG, with a growing membership, is proof that the MPEG business model is valid: excellent MPEG standards remunerate good IP, and royalties earned from existing standards fund more good IP for future standards.

Weaknesses

All good games must come to an end. The end has not come yet for MPEG, but the difficulty in obtaining licences for some MPEG standards (see, e.g., A crisis, the causes and a solution) shows that the MPEG business model is no longer as strong and as immediately applicable as it used to be.

There is resistance to changes even of a limited scope to the MPEG business model. Therefore, the MPEG business model is weakening because it has not been allowed to adapt.

Opportunities

In Can MPEG overcome its Video “crisis”?, IP counting or revenue counting? and Matching technology supply with demand I have made some proposals that provide an opportunity to enhance the MPEG business model without reneging on its foundations.

Threats

The threats are concrete and serious. MPEG may become irrelevant if it stays exclusively with an outdated business model. But MPEG is not a company, it is an organisation that operates based on rules established by the appropriate authorities within ISO and where decisions are made by consensus.

Conclusions

This is just the beginning of the SWOT analysis. Very soon I will publish an article on Membership, Structure and Leadership.

Einige Gespenster gehen um in der Welt – die Gespenster der Zauberlehrlinge

Introduction

The title of this article is inspired by two masterpieces of German philosophy and literature. The first is Karl Marx’s “The Manifesto of the Communist Party” with the metaphor of the spectre (of communism) going around in Europe while the powers of conservation try to stop it. The second is Johann Wolfgang von Goethe’s “The Sorcerer’s Apprentice”, the story of an apprentice who thinks he can enchant a broom and get it to do some work for him, because he has seen his master doing it. The broom gets out of hand, the master comes back, the apprentice implores the master to help and the master sorts out the apprentice’s mess.

I agree that all the above is still rather cryptic and it is not at all clear what these two German works “combined” have to do with the topics that I usually deal with in this blog.

So let me explain: the broom is MPEG, the sorcerers are the people who run MPEG and the spectres are the multinational apprentices who think they can handle the MPEG broom because they have seen it done by those who know how to do it.

The apprentices are labelled spectres because the word indicates “something widely feared as a possible unpleasant or dangerous occurrence”.

Let’s talk about the MPEG broom

As I wrote in Who “owns” MPEG?, the word MPEG is used to indicate several related but often independent things. In one instance, MPEG stands for the “MPEG community”, i.e. the ensemble of people and entities who are affected by what the “MPEG group” does: end users, industries, companies who do business using MPEG standards, universities and research centres, and individuals with an MPEG technical background. Each element in the list is a microcosm, but here we are particularly interested in the last microcosm – individuals with an MPEG technical background. This is composed of active MPEG experts, non-attending registered MPEG experts, researchers working in companies on MPEG standards without being registered members, researchers at large who are doing research in areas that are, or are expected to become, MPEG standardisation areas, and consultants in MPEG matters.

All these people are MPEG stakeholders (the others, too, but here we concentrate on this particular microcosm). They rely on MPEG because MPEG is serving them. MPEG owes part of its existence to the fact that they exist and operate. To a significant extent, these MPEG stakeholders can operate because MPEG exists.

The “MPEG group” is another microcosm where ideas percolate through different channels. As explained in Looking inside an MPEG meeting and How does MPEG actually work?, to become standardisation projects, ideas are processed in different ways. Requirements are developed, communicated to different environments and agreements “stipulated” with different industry stakeholders based on those requirements. Standardisation projects require technologies whose existence and performance levels must be verified. Technologies come into MPEG through different channels and are processed in different ways by different groups. Standards are verified against the stipulated performance. Finally, standards are living beings: they evolve and need maintenance.

MPEG is not a broom that operates by magic, at least not in Goethe’s sense. You do not need spirits (“Geister”) to use it. Still, it is a sophisticated broom that has taken its current shape as the result of a Darwinian process of incremental adaptations, matching the MPEG group to the needs of expanding industry coverage, the continuous shifts in the way industry operates and the accelerating technology cycles.

The shape that MPEG has today is not final. If it were, that would mean that MPEG is dead. MPEG is evolving, and keeping it adapted to changing conditions is a serious matter. It cannot be left in the hands of some sorcerers’ apprentices.

MPEG and JPEG

Next to MPEG there is a group called JPEG. Everybody knows the word JPEG because of the .jpg extension of image files. The JPEG standard (ISO/IEC 10918-1, first released in 1992) has had a far-reaching impact on consumers because all handsets and computers can handle .jpg files and many important services have those files at the core of their business. But let’s make a comparison between MPEG and JPEG.

Parameters                 | JPEG                     | MPEG
Constituencies             | Image                    | Broadcasting & AV streaming
Capability to evolve       | Still working on images  | Expanded field
Number of projects         | A few in parallel        | Several tens in parallel
Business models            | “Royalty free”           | “IP-encumbered”
Competition of standards   | No                       | Very lively
Approaches                 | Holistic, top down       | Bottom up
Industry/academia mix      | 1:1                      | 3:1
Work force                 | 60 members               | 600 members (+1000s outside)
Organisation               | Simple                   | Sophisticated
Standards impact           | Huge (2 standards)       | Huge (many standards)
Future-oriented standards  | Light field image        | Point cloud & immersive video

We see that the two groups are different in many key respects: the industries they serve (image vs broadcasting and streaming), the capability to make the best out of the field to serve industry needs, the number of projects (limited vs several tens), the type of standards they provide (royalty free vs encumbered), the competition (little vs a lot of competition), the approach used to develop standards (principle-based vs experience-based), the percentage of academia in the membership (50% vs 25%), the organisation (handling a few vs handling tens of parallel projects), the impact (2 standards vs many), future oriented standards (coding of light field images vs coding of point clouds & immersive video).

Simply, it is a law of nature. If the size scales by an order of magnitude, everything ends up being different.

Enter the sorcerers’ apprentices

Now, the apprentices think that, because a part of MPEG is handling some technologies that JPEG, too, is handling, we should interpenetrate JPEG (60 people working on images, as 30 years ago) and MPEG (600 people working on video, audio, systems and other data, who have designed the strategy that has made the media industry digital and fomented its continuous development) and create new groups, creatively organised, to manage the huge MPEG work programme, the vast array of technologies, the network of liaisons and the large swathes of client industries. A similar fate may also be suffered by JPEG, which is at risk of dissolving in the flood caused by the apprentices, as in Goethe’s ballad.

A disclaimer is needed at this point: this sort of idea has nothing to do with the proposal to elevate the MPEG Working Group (WG) to Subcommittee (SC) status presented in Which future for MPEG. The elevation to SC status seeks to change the WG envelope to an SC envelope, keeping the inside – the work and its organisation – exactly the same. The other seeks to upset a working machine thinking that the changes will work, much like Goethe’s sorcerer’s apprentice thought he could handle the magic broom.

A couple of expressions come to mind. The first is “an elephant in a china shop”. The effects of the apprentices’ proposal will be exactly this: you will need to merge proud and accomplished people serving different constituencies; operating in environments of largely different complexity in terms of projects, number of people and industry; with different business models and approaches; operating in differently competitive environments; with 30 years of different histories and experiences… After the elephant has entered the shop, forget finding any piece of chinaware intact.

The second expression is “a camel is a horse designed by a committee”. Unfortunately, this is not a joke but the harsh reality of some environments where people with a lot of self-importance operate in areas where they have little or no competence or experience. MPEG is mostly free of such people. Indeed, the sorcerers’ apprentices’ proposal comes mostly from people who left MPEG a long time ago.

Some effects of the apprentices’ proposal

I could write a long list of negative effects, but let’s limit it to four.

  1. The MPEG brand. The proposed interpenetration will kill the MPEG brand affecting thousands of companies and researchers. Today researchers use their “I belong to MPEG” as a status symbol supporting their research. Tomorrow they will lose both their status and funding. The same applies to the JPEG brand.
  2. The MPEG credibility. The proposed interpenetration will mix two groups who share only a small part of one thing: technology. Technology is important; however, designing the structure of an industrial standards group like MPEG on the basis of technology, instead of constituencies’ needs, wipes out the credibility built by thousands of MPEG experts in 30 years of well-considered efforts.
  3. The MPEG standards. The proposed interpenetration will alter the process by which MPEG standards are defined and developed. Industry will shy away from this new generation of self-styled “MPEG standards” because they will not fit their needs and will look elsewhere. The only sensible thing MPEG will be left with is the maintenance of the 180 standards that were produced by the real MPEG.
  4. The MPEG productivity. The proposed interpenetration will dramatically affect the number and quality of standards produced. One value of MPEG standards is the breadth and depth of their scope. More important, however, is the fact that MPEG standards are not independent specifications but are designed to work together thanks to the painstaking efforts of hundreds of MPEG experts from different areas.

The sorcerers have their hands tied

Decade after decade, generations of MPEG sorcerers have learnt the magic, but they are not free. If the MPEG broom is wrongly used, there will be no sorcerer coming back to help the apprentices undo their misdeeds. The apprentices may well moan die ich rief, die Geister, werd’ ich nun nicht los (I cannot get rid of the spirits I called), but no one will be capable of stopping an MPEG broom gone crazy.

Those who care about MPEG had better make themselves heard. At stake are trillions of USD year on year, billions of users, millions of workers and thousands of highly skilled researchers.

Does success breed success?

Introduction

Most readers will answer yes to the question asked in the title. Indeed, very often we see that the success of a human organisation breeds more success. Until, I mean, the machine that looked like it could produce results forever “seizes up”. But don’t look elsewhere for the causes of failure: it’s not the machine, the causes are the humans inside and/or outside it.

In an age when things move fast and change, MPEG has been in operation for three decades. Its standards have achieved and continue achieving enormous success serving billions of human beings: consumers, service providers and manufacturers.

This article makes some considerations on the best way for MPEG success to breed more success – unless having success breed failure is the goal. Apparently unrelated considerations are made in The Imperial Diet is facing a problem.

Recalling the MPEG story

MPEG started in 1988 as an “experts group” with the task of developing video coding standards for storage media at a rate of about 1.5 Mbit/s, like the compact disc (CD). This was because, in the second half of the 1980s, the Consumer Electronics and telco industries imagined that interactive video – local or via the network – was a killer application.

Within 6 months MPEG had already started working on audio coding because – it looks obvious now, but it was not so obvious at that time – if you have video you also need audio and, if you do not compress stereo audio at 1.41 Mbit/s, the output bitrate of a CD, there will be no space left for video. In another 6 months MPEG had started working on “systems” aspects, those allowing a receiver to reproduce synchronised audio and video information.

These were the first steps in the MPEG drive to make standards that have no “holes” for implementors. Thanks to these efforts, the scope of use of MPEG standards, still within “coding of moving pictures and audio”, has expanded like wildfire: starting from coding of moving pictures at 1.5 Mbit/s, it has extended to more video, audio, transport, protocols, APIs and more. With its standards, MPEG is handling all the technologies that facilitate enhanced use of digital media.

The MPEG expansion is a joyous phenomenon that has created an expanding global brotherhood of digital media researchers – in industry and academia – for which MPEG and its standards are the motivation for more research. If research results are good, they can make their way into some MPEG standard.

MPEG needs a structure

Clearly you cannot have hundreds of people discussing such a broad scope of technologies at the same time and place. You can split the work because technologies can be considered independent up to a point. Eventually, however, like in a puzzle, all pieces have to find a place in the global picture. The MPEG structure has been implemented to allow the creation of ever more complex puzzles.

In its 31 years of activity MPEG has developed a unique organisation capable of channeling the efforts of thousands of researchers working at any one time on MPEG standards – only a fraction of which actually show up at MPEG meetings – into the suites of integrated standards that industry uses to churn out products and services worth trillions of USD a year.

The figure below depicts the MPEG structure from the viewpoint of the standard development workflow.

The MPEG workflow

Typically, new ideas come from members’ contributions, but can also be generated from inside MPEG. The Requirements group assesses and develops ideas and may go as far as to request “evidence” of existence and performance of technologies (Calls for Evidence – CfE) or actual “proposals” for fully documented technologies (Call for Proposals – CfP).

MPEG has never had a single “constituency” because it develops horizontal standards cutting across industries. It has established liaisons with tens of industries and communities through their standards committees or trade associations. We call many of them “client industries” in the sense that they provide their requirements to MPEG, against which MPEG produces standards. At every meeting, several tens of input liaisons are received and about the same number of output liaisons are issued.

Many CfPs cover a broad range of technologies that are within the competence of the different MPEG groups. The adequacy of submitted technologies is tested by the Test Group. The submitted proposals and the test results are provided to the appropriate technical groups – Systems, Video, Audio and 3D Graphics.

The Chairs group includes the chairs of all groups. It has the task to assess the progress of work, uncover bottlenecks, identify needs to discuss shared interests between groups and organise joint meetings to resolve issues.

An MPEG week is made of intense days (sometimes continuing until midnight). Coordinated work, however, does not stop when the meeting ends. At that time MPEG establishes tens of ad hoc groups with precise goals for collaborative development to be reported at the next meeting.

The Communication group has the task to keep the world informed of the progress of the work and to produce white papers, investigations and technical notes.

MPEG is not an empire

From the above, one may think that MPEG is an empire, but it is not. MPEG is a working group, the lowest layer of the ISO hierarchy, in charge of developing digital media standards. It formally reports to a Subcommittee called SC 29 but, as I have explained in Dot the i’s and cross the t’s, SC 29 has ended up with a laissez-faire attitude that has allowed MPEG to autonomously develop strategy, organisational structure and network of client industries. MPEG standards have given client industries the tools to make their analogue infrastructures digital and, subsequently, to leverage successive generations of standard digital media technologies to expand their business. With some success, one could say.

The MPEG organisation is robust. Virtually the same organisation has been in place since MPEG reached an attendance of 300, 25 years ago. Groups have come and gone and the structure currently in operation has been refined multiple times in response to actual needs. Changes have been effected, and there will be more changes in the future. However, they all have been and, as far as I can see, will be incremental adaptations to perfect one aspect or another of the structure. With this structure, more than 150 standards have been produced, some of which have been wildly successful.

MPEG can count on a few key assets: the logic of its structure, the experience gained in all those years, its membership and its client industries. With these, MPEG success can breed more success in the years to come.

The Imperial Diet is facing a problem

I said before that MPEG is not an empire. In the imperial context of the Holy Roman Empire, MPEG could be defined as a Margraviate in charge of defending and extending a portion of the frontiers of the Empire. A Margraviate reported to a Kingdom who reported to the Imperial Diet.

Now, let’s suppose that the Imperial Diet has requested the S Kingdom to review the status of its two J and M Margraviates and propose a new arrangement. The main element in the decision is the size of the two Margraviates: 10% of the territory of the S Kingdom for the J Margraviate and 90% for the M Margraviate. Ruling out other fancy ideas, the S Kingdom has two options: request that the M Margraviate be elevated to Kingdom status or create a few smaller Margraviates inside the S Kingdom out of the M Margraviate.

There is a problem, though, if the M Margraviate is cut in smaller Margraviates: the Margraviates of the Holy Roman Empire are not domino game pawns. For decades the M Margraviate has fought hard extending its territory – hence the Holy Roman Empire’s territory – to lands that until then were occupied by unruly tribes. It has been successful in its endeavours because it had large armies with different skills: archers, knights, foot soldiers and more. By skillfully coordinating these specialised troops, the M Margraviate was able to conquer new lands and make them faithful fiefdoms.

But there is another important consideration: there are wild hordes coming from the steppes of Central Asia with a completely new warfare technique. Some armies of M Margraviate are having a hard time dealing with them, even though they are learning a trick or two to fight back.

How could the new armies of the different Margraviates created out of the M Margraviate defend – never mind extend – the frontier, when the S Kingdom does not know the territory, having spent all its time in its castle, and has never led an army?

The Holy Roman Empire lasted 1,000 years. There is no doubt that the Imperial Diet would make the M Margraviate a Kingdom keeping its armies and structure unchanged. Warfare is a serious business and the effective defence of the frontiers is the priority.

Conclusions

Fortunately, today there are no Margraviates and Kingdoms anymore, much less a Holy Roman Empire. There are also no new territories to conquer by force of arms and no frontiers to defend against rebellious hordes.

I realise now that at the beginning of this article I promised to offer some considerations on the best way for MPEG success to breed success rather than failure. Maybe I will do that next time.

Posts in this thread

Dot the i’s and cross the t’s

Introduction

In Book 2 of the Georgics, the Latin poet Virgil writes “Felix, qui potuit rerum cognoscere causas”. This maxim was true in 29 BC, when the Georgics was written, and remains true more than 2,000 years later. For those who did not have the chance to study Latin, the verse means “Fortunate who could know the causes of things”.

Virgil’s maxim will be used in this article to draw some conclusions on current matters. Before getting to the things, however, I need to talk about the causes. Those who cannot wait can jump to the conclusions, at their own risk.

The need for standards and standards bodies

At the climax of the Belle Époque, Europe realised that a properly functioning industry needed standards. Britain – at that time for sure and, from now on, most likely not really part of Europe – was the first to establish an Engineering Standards Committee (1901).

In 1906 the most culturally advanced industrial field of the time – electrical technologies – was the first to recognise the need not just for standards, but for international ones, and established the International Electrotechnical Commission (IEC). Electrical technologies were second only to telecommunications (actually third, if we consider the Universal Postal Union) in recognising that need. Indeed, 41 years before, governments had established the International Telegraph Union. But governments are one thing and private industry quite another.

The rest of the world took a quarter of a century and a war to realise the need for international standards. Finally, in 1926 the International Federation of the National Standardising Associations (ISA) started, only to stop 16 years later when governments had other priorities (killing millions of people in WW II). In 1946 the idea was revived and the International Organisation for Standardisation (ISO) was created as a not-for-profit association. National Standards Associations (or National Bodies) – not governments – are represented in ISO.

How can you govern an international organisation that issues standards that are, yes, voluntary but, if you do not conform to them, leave you in a whole world of hurt? The answer is: hierarchy and scope. In ISO there are 4 layers (actually more; if you want to know how many, read Who owns MPEG?): Technical Management Board (TMB), Technical Committees (TC), Subcommittees (SC) and Working Groups (WG). Each entity is administered by a secretariat run by a National Body and has a scope that defines what the entity is expected and entitled to do.

Some scopes

Delimitation of territory is one of the most engaging human activities. According to Standards for computers and information processing, by T. B. Steel, Jr, page 103 et seqq. (in Advances in computers, Franz L. Alt and Morris Rubinoff (Editors) Volume 8) in 1967 the scope of TC 97 Data processing was: the standardisation of the terminology, problem definition, programming languages, communication characteristics, and physical (i.e. non electrical) characteristics of computers, and information processing devices, equipment and systems.

According to the same source, TC 97/SC 2 Character sets and coding at that time was about Standardisation of character sets, character meanings, the grouping of character sets into information, coded representation, and the identification of it for the interchange of information between data processing systems and associated equipments…

Typically working groups develop standards. They do so with a major constraint: decisions may only be made by “consensus”, defined as

general agreement where there is no sustained opposition to substantial issues by any important part of the concerned interests, in a process that seeks to take into account the views of all parties concerned.

The definition is supplemented by the note: consensus does not imply unanimity.

Obviously this text can only hint at the complexity of other environments where decisions are made not by consensus but by voting. One can imagine that these other environments are such that, in comparison, a horse-trading market is a place that boarding school pupils can safely visit.

JPEG and MPEG

In the 1980s, Videotex was a service telcos wanted to offer to compete with broadcasters’ Teletext service. One limitation of videotex and teletext, however, was that information could only be displayed with characters and rudimentary graphics (made as combinations of ad hoc characters). Telcos thought that videotex services could be enhanced by pictures transmitted at the 64 kbit/s made available by the Integrated Services Digital Network (ISDN).

In 1986 a joint group between TC 97 of ISO and SG XVIII of CCITT (ITU-T’s name at the time) was created to develop a compressed image format. As videotex was based on characters, TC 97/SC 2 was the natural place to develop that standard. SC 2 created WG 8 Coded representation of picture and audio information. WG 8 hosted the Joint Photographic Experts Group (JPEG).

Two years later, WG 8 created the Moving Picture Experts Group (MPEG), not joint with CCITT. In any case, had it been joint, it would have been joint with SG XV, not with SG XVIII. There was nothing equivalent to the Treaty of Tordesillas, but in CCITT the digital world was divided between SG XVIII for audio and telematic services, and SG XV for video. MPEG – Coding of Moving Pictures and Audio – was a Copernican revolution.

Immediately, MPEG had a skyrocketing attendance: 100 members 18 months after its establishment and 200 members after two more years. That was because MPEG was working on such a high profile standard as digital television (actually only the baseband part of it, but that did not really matter).

Unlike other committees dealing with “digital television”, which were populated by “advocates” accustomed to using “analogue” arguments to support their proposals, MPEG was populated by technical experts who made their cases with “digital” arguments in the framework of inflexibly digital Core Experiment rules. Some “advocates” did show up in the early MPEG-2 days, but they soon left, never to come again.

The parent committee

In the years 1989-90-91, I supported the WG 8 Convenor’s bid to promote WG 8 to SC status (see here for more details) and in April 1990 the SC 2 plenary approved the following resolution:

JTC1/SC2 considering that

  1. The standardisation of the coded representation of picture, audio and multimedia information is considered to be one of the most important areas for standardization in the 1990’s;
  2. The work and scope of SC2/WG8 has expanded substantially beyond the scope of SC2;
  3. The work of SC2/WG8 has developed into a critical mass largely significant to warrant SC status;

Recommends to JTC1

  1. To establish a new JTC1 Subcommittee for the purpose of developing standards in the area of Coded Representation of Picture, Audio and Multimedia Information;

It took another 18 months for SC 29, the entity WG 8 had morphed into, to hold its inaugural meeting.

A role for SC 29

In ISO a Subcommittee is part of the formal hierarchy. What was SC 29’s role?

  1. Playground for “advocates”. Having had a hard time in MPEG, one could think that “advocates” would move to SC 29 to find a more consonant “breeding ground”. Indeed, SC decisions are made by voting, but only after a lot of “analogue haggling” in the hallways. This did not happen because, once MPEG had settled the algorithm, the standard was done, save the need to cross some t’s and dot some i’s. There could have been room for some “analogue discussions” on business-related issues such as profiles and levels. SC 29, however, was not the right place to hold such discussions because only MPEG experts could handle the technical aspects.
  2. Playground for large company representatives. At that time some ISO committees were populated by national body representatives who worked for large companies. These companies were interested in ensuring that committees did NOT develop certain standards and sent their representatives to act accordingly. But as fate would have it, in the years immediately following the establishment of SC 29 there was a serious economic crisis that forced some large companies to lay off those professional participants to cut “unnecessary” expenses.
  3. Strategic planner. In 1993, the time of John Malone’s “500 channels”, the Italian National Body proposed to investigate standardisation opportunities for content metadata (see here for more details). SC 29 established an ad hoc group to study the needs of users who wanted to find content in those 500 channels. One year later, however, the convenor of the ad hoc group reported that there had been no activity. MPEG then developed the suite of content metadata standards called MPEG-7.

No one should be surprised that, for the next 25 years, SC 29 held yearly meetings to discuss such strategic issues as progression of work items, consolidations and minor revisions, and liaisons. Of course with no “advocates” in sight.

MPEG as a virtual subcommittee

The space left empty by SC 29 was occupied by MPEG. Continuing its initial drive, MPEG developed a modus operandi that has allowed it to produce the integrated digital media standards that have changed and keep on changing the media industry.

The four figures below depict the main elements of MPEG’s modus operandi.

  1. Top-left depicts the adaptation of ISO’s standard development process to acquire technology elements suitable to the development of a standard and to verify that the standard developed matches the original requirements. More on this at How does MPEG actually work?
  2. Top-right depicts the industries contributing technologies (right-hand side), the means to acquire them, the assets accumulated in MPEG standards and the client/implementation industries (bottom of figure). More on this at The MPEG ecosystem.
  3. Bottom-left depicts the unfolding of the MPEG workflow: ad hoc groups; the “MPEG week” with its components: plenaries, subgroups, break-out groups, joint meetings and chairs meetings; and the creation of new ad hoc groups. More on this at Looking inside an MPEG meeting.
  4. Bottom-right depicts the integrated nature of MPEG standards. The parts are developed by different groups who come to agree on the glue that is needed to keep the parts independent and interworking. More on this in Hamlet in Gothenburg: one or two ad hoc groups?

Conclusions

MPEG has been fortunate to have been able to operate in a paradise island for 31 years.

It has devised strategies and defined work plans. It has sought and established liaisons with outside industries. It has added industries as members of the MPEG digital media community. It has called for technologies and integrated them into standards. It has been the ear industries could talk to in order to have their needs satisfied.

All this while MPEG meetings have grown to 600 participants and “advocates” have been kept at bay.

The MPEG digital media community has thrived on a continuous flow of evolving standards with an impact measured in trillions of USD and billions of people.

In Paradise Lost John Milton writes: Better to reign in Hell, than to serve in Heaven.

In the MPEG Paradise, a reborn John Milton could write: Better to serve in Heaven than to reign in Hell.

SC 29 was kind enough to handle administration. Mindless industry elements should memorise Virgil’s maxim before they engage in their adventures.

Posts in this thread

The MPEG frontier

Introduction

MPEG has developed standards in many areas. One of the latest is compression of DNA reads from high-speed sequencing machines, and MPEG is now working on Compression of Neural Networks for Multimedia Content Description and Analysis.

How could a group that was originally tasked to develop standards for video coding for interactive applications on digital storage media (CD-ROM) get to this point?

This article posits that the answer lies in the same driving force that pushed the original settlers on the East Coast of the North American continent to reach the West Coast. This article also posits that, unlike the ocean beyond the West Coast that put an end to the frontier and forced John F. Kennedy to propose the New Frontier, MPEG has an endless series of frontiers in sight. Unless, that is, some mindless industry elements declare that there is no longer a frontier to overcome.

The MPEG “frontiers”

The ideal that has made MPEG experts work over the last 31 years finds its match in the ideal that defined the American frontier. Just as “the frontier”, according to Frederick Jackson Turner, was the defining process of American civilisation, so the development of a series of 180 standards for Coding of Moving Pictures and Audio, which extended the capability of compression to deliver better and new services, has been the defining process of MPEG, the Moving Picture Experts Group.

The only difference is that the MPEG frontier is a collection of frontiers held together by the title “Coding of Moving Pictures and Audio”. It is difficult to impose a rational order on a field undergoing such tumultuous development, but I count 10 frontiers:

  1. Making rewarding visual experiences possible by reducing the number of bits required to digitally represent video while keeping the same visual quality and adding more features (see Forty years of video coding and counting and More video with more features)
  2. Making rewarding audio experiences possible by reducing the number of bits required to digitally represent audio with enhanced user experiences (see Thirty years of audio coding and counting)
  3. Making rewarding user experiences possible by reducing the number of bits required by other non-audio-visual data such as computer-generated or sensor data
  4. Adding infrastructure components to 1, 2 and 3 so as to provide a viable user experience
  5. Making rewarding spatially remote or time-shifted user experiences possible by developing technologies that enable the transport of compressed data of 1, 2 and 3 (What would MPEG be without Systems?)
  6. Making possible user experiences involving combinations of different media
  7. Giving users the means to search for the experiences of 1, 2, 3 and 4 of interest to them
  8. Enabling users to interact with the experiences made possible by 1, 2, 3 and 4
  9. Making possible electronic commerce of user experiences made possible by 1, 2, 3, 4, 5 and 6
  10. Defining interfaces to facilitate the development of services.

The table below provides a mapping between MPEG frontiers and MPEG standards, both completed and under development.

Legend: DA=MPEG DASH, CI=CICP, Io=IoMT

No end to the MPEG frontier

Thirty-one years and 180 standards later, MPEG has not accomplished its mandate, yet. That is not because it has not tried hard enough, but because there is an unceasing stream of new technologies providing new opportunities to better accomplish its mandate with improved user satisfaction.

While developing its standards, MPEG has substantially clarified the content of its mandate, still within the title of “Coding of Moving Pictures and Audio”. The following topics highlight the main directions of work for the next few (I must say, quite a few) years.

More of the same

The quest for new solutions that do better than, or simply differently from, what has been done in the past 31 years will continue unabated. The technologies that achieve the stated goals will change and new ones will be added. However, old needs are not going to disappear just because solutions exist today.

Immersive media

This is the area that many expect will host the bulk of the MPEG standards due to appear in the next few years. Point Cloud Compression is one of the first standards that will provide immediately usable 3D objects – both static and dynamic – for a variety of applications. But other, more traditional, video-based approaches are also being investigated. Immersive Audio will also be needed to provide complete user experiences. The Omnidirectional MediA Format (OMAF) will probably be, at least for some time, the platform where the different technologies are integrated. Other capture and presentation technologies will possibly require new approaches at later stages.

Media for all types of users

MPEG has almost completed the development of the Internet of Media Things (IoMT) standard (part 1 will become FDIS in October 2019, while parts 2 and 3 already reached FDIS in March 2019). IoMT is still an Internet of Things, but Media Things are much more demanding because the amount of information transmitted and processed is typically huge (Mbit/s if not Gbit/s) and at odds with the paradigm of distributed Things that are expected to stay unattended, possibly for years. In the IoMT paradigm information is typically processed by machines. Sometimes, however, human users are also involved. Can we develop standards that provide satisfactory (compressed) information to human and machine users alike?

Digital data are natively unsuited to processing

The case for Audio and Video compression has always been clear. In 1992/1994 industry could only thank MPEG for providing standards that made it economically feasible to deliver audio-visual services to millions (at that time), and billions (today), of users. Awareness for other data types took time to percolate, but industry now realises that point clouds are an excellent vehicle for the delivery of content for entertainment and of 3D environments for automotive; DNA reads from high-speed sequencing machines can be compressed and made easier to process; large neural networks can be compressed for delivery to millions of devices. There is an endless list of use cases, all subject to the same paradigm: huge amounts of data that can hardly be distributed as they are but can be compressed, with or without loss of information, depending on the application.

MPEG is now exploring the use case of machine tools, where signals are sampled at a rate above 40 kHz with 10-bit accuracy. Under these conditions a machine tool generates on the order of 1 TByte/year. The data stored are valuable resources for machine manufacturers and operators because they can be used for optimisation of machine operation, determination of on-demand maintenance and factory optimisation.
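
As a quick sanity check of these figures, the back-of-the-envelope calculation below reproduces the order of magnitude quoted above. It is only a sketch, assuming 40,000 samples/s, 10 bits per sample, a single signal and no compression; the number of signals per machine is not specified here.

```python
# Rough, assumed data rate for one uncompressed machine-tool signal.
sample_rate_hz = 40_000          # "sampling rate > 40 kHz"
bits_per_sample = 10             # "10 bit accuracy"
seconds_per_year = 3600 * 24 * 365

bytes_per_second = sample_rate_hz * bits_per_sample / 8
bytes_per_year = bytes_per_second * seconds_per_year
print(f"{bytes_per_second / 1e3:.0f} kB/s, {bytes_per_year / 1e12:.2f} TB/year")
# -> 50 kB/s, 1.58 TB/year: on the order of 1 TByte/year, as stated above
```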

The watchword here is Industry 4.0. In order to limit the amount of data stored on the factory floor, a standard for the compression of machine tool data would be invaluable.

If you are interested in this new promising area, please subscribe at https://lists.aau.at/mailman/listinfo/mpeg-mc.at and join the email reflector mpeg-mc@lists.aau.at.

Is MPEG going to change?

MPEG is going to change but, actually, nothing needs to change because the notions outlined above are already part of MPEG’s cultural heritage. In the future we will probably make use of neural networks for different purposes. This, however, is already a reality today because the Compact Descriptors for Video Analysis (CDVA) standard uses neural networks and many proposals to use neural networks in the Versatile Video Coding (VVC) standard have already been made. We will certainly need neural network compression but we are already working on it in MPEG-7 part 17 Neural Networks for Multimedia Content Description and Analysis.

MPEG has been working on MPEG-I Coded representation of immersive media for some time. A standard has already been produced (OMAF), two others (V-PCC and NBMP) have reached DIS level and two others have reached CD level (G-PCC and VVC). Many parallel activities are under way at different stages of maturity.

I have already mentioned that MPEG has produced standards in the IoMT space, but the 25 year old MPEG-7 notion of describing, i.e. coding, content in compressed form is just a precursor of the current exploratory work on Video Coding for Machines (reflector: mpeg-vcm@lists.aau.at, subscription: https://lists.aau.at/mailman/listinfo/mpeg-vcm).

In the Italian novel The Leopard, better known in its 1963 film version (directed by Luchino Visconti, starring Burt Lancaster, Claudia Cardinale and Alain Delon), the nephew of the protagonist says: “Se vogliamo che tutto rimanga come è, bisogna che tutto cambi” (if we want everything to stay as it is, everything needs to change).

The MPEG version of this cryptic sentence is “if we want everything to change, everything needs to stay the same”.

Guidance for the future

MPEG is driven by a 31-year-long ideal that it has pursued using guidelines that are worth revisiting here as we are about to enter a new phase:

  1. MPEG standards are designed to serve multiple industries. MPEG does not – and does not want to – have a “reference industry”. MPEG works with the same dedication for all industries, trying to extract the requirements of each without favouring any.
  2. MPEG standards are provided to the market, not the other way around. At a time when de facto standards are popping up in the market, it is proper to reassert the policy that international standards should be developed by experts in a committee.
  3. MPEG standards anticipate the future. MPEG standards cannot trail technology development. If they did, MPEG would be forced to adopt solutions that a particular company in a particular industry has already developed.
  4. MPEG standards are the result of competition followed by collaboration. Competition is at the root of progress. MPEG should continue publishing its work plan so that companies can develop their solutions. MPEG will assess the proposals, select and integrate the best technologies and develop its standards in a collaborative fashion.
  5. MPEG standards thrive on industry research. MPEG is not in the research business, but MPEG would go nowhere if it were not constantly fed with research, in responses to Calls for Proposals and in the execution of Core Experiments.
  6. MPEG standards are enablers, not disablers. As MPEG standards are not “owned” by a specific industry, MPEG will continue assessing and accommodating all legitimate functional requirements from whichever source they come.
  7. MPEG standards need a business model. MPEG standards have been successful also because those who contributed good technologies to MPEG standards have been handsomely remunerated and could invest in new technologies. This business model will not be sufficient to sustain MPEG in its new endeavours.

Conclusions

Leonardo da Vinci, an accomplished performer in all arts and probably the greatest inventor of all time, lived in the unique age of history called the Renaissance, when European literati became aware that knowledge was boundless and that they had the capability to know everything. Terence’s dictum “Homo sum, humani nihil a me alienum puto” (I am human, and nothing human do I consider alien to me), embraced by Renaissance humanists, well represents the new consciousness of the intellectual power of humans in the Renaissance age.

MPEG does not have the power to know everything – but it knows quite a few useful things for its mission. MPEG does not have the power to do everything – but it knows how to make the best standards in the area of Coding of Moving Pictures and Audio (and in a few nearby areas as well).

It would indeed be a great disservice if MPEG could not continue serving industry and humankind in the challenges to come as it has successfully done in the challenges of the last 31 years.

Posts in this thread

Tranquil 7+ days of hard work in Gothenburg

Introduction

The purpose of this article is to offer some glimpses of 7 (actually 12, counting JVET activity) days of hard work at the 127th MPEG meeting (8 to 12 July 2019) in Sweden.

MPEG 127 was an interesting conjunction of the stars because the first MPEG meeting in Sweden (Stockholm, July 1989) was #7 (111 in binary) and the latest meeting in Sweden (Gothenburg, July 2019) was #127 (1111111 in binary). Will there be a 255th (11111111 in binary) meeting in Sweden in July 2049? Maybe not, looking at some odd – one would call them suicidal – proposals for the future of MPEG.

Let’s first have the big – logistic, but emblematic – news. For a few years the number of MPEG participants has been hovering around 500, but in Gothenburg the number of participants crossed the 600 mark for the first time. Clearly MPEG remains a highly attractive business proposition if it can mobilise such a huge mass of experts.

It is not my intention to talk about everything that happened at MPEG 127. I will concentrate on some major results starting from, guess what, video.

MPEG-I Versatile Video Coding (VVC)

Versatile Video Coding (VVC) reached the first formal stage in the ISO/IEC standards approval process: Committee Draft (CD). This is the stage of a technical document that has been developed by experts but has not undergone any official scrutiny outside the committee.

The VVC standard has been designed to be applicable to a very broad range of applications, with substantial improvements compared to older standards but also with new functionalities. It is too early to announce a definite level of improvement in coding efficiency, but the current estimate is in the range of 35–60% bitrate reduction compared to HEVC in a range of video types going from 1080p HD to 4K and 8K, for both standard and high dynamic range video, and also for wide colour gamut.

Beyond these “flash news”-like announcements, it is important to highlight the fact that, to produce the VVC CD, some 800 documents were reviewed at MPEG 127. Many worked until close to midnight to process all input documents.

MPEG-5 Essential Video Coding (EVC)

Another video coding standard reached CD level at MPEG 127. Why is it that two video coding standards reached CD at the same meeting? The answer is simple: as a provider of digital media standards, MPEG has VVC as its top-of-the-line video compression “product”, but it has other “products” under development that are meant to satisfy different needs.

One of them is “complexity”, a multi-dimensional entity. VVC is “complex” in several respects. Therefore the goal of EVC is not to provide the best video quality money can buy (that is what VVC does), but to offer a standard video coding solution for business cases, such as video streaming, where MPEG video coding standards have hitherto not had the wide adoption that their technical characteristics suggested they should have.

Currently EVC includes two profiles:

  1. A baseline profile that contains only technologies that are over 20 years old or are otherwise expected to be obtainable royalty-free by a user.
  2. A main profile with a small number of additional tools, each providing significant performance gain. All main profile tools are capable of being individually switched off or individually switched over to a corresponding baseline tool.

Worth noting is the fact that organisations making proposals for the main profile have agreed to publish applicable licensing terms within two years of FDIS stage, either individually or as part of a patent pool.
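
As a rough illustration of the main-profile tool-switching idea described in the profile list above, the sketch below models an encoder configuration in which each main-profile tool can be switched off or replaced by its baseline counterpart. The tool names are hypothetical placeholders for illustration only, not the actual EVC tool list.

```python
from dataclasses import dataclass, fields

@dataclass
class MainProfileConfig:
    # Hypothetical tool names used only to illustrate the switching concept;
    # they are not the actual EVC main-profile tools.
    improved_intra_prediction: bool = True
    advanced_motion_model: bool = True
    extra_loop_filter: bool = True

    def switch_all_to_baseline(self) -> "MainProfileConfig":
        # Switching every main-profile tool off (or over to its baseline
        # counterpart) yields a baseline-equivalent configuration.
        return MainProfileConfig(**{f.name: False for f in fields(self)})

cfg = MainProfileConfig(advanced_motion_model=False)  # switch off a single tool
print(cfg)
print(cfg.switch_all_to_baseline())
```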

MPEG-5 Low Complexity Enhancement Video Coding (LCEVC)

LCEVC is another video compression technology MPEG is working on. This is still at Working Draft (WD) level, but the plans call for achieving CD level at the next meeting.

LCEVC specifies a data stream structure made up of two component streams: a base stream decodable by a hardware decoder, and an enhancement stream suitable for software implementation with sustainable power consumption. The enhancement stream provides new features, such as extending the compression capability of existing codecs at low encoding and decoding complexity. The standard is intended for on-demand and live streaming applications.

It should be noted that LCEVC is not, stricto sensu, a video coding standard like VVC or EVC, but it does cover the business need of enhancing a large number of deployed set-top boxes with new capabilities without replacing them.
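
The sketch below is a highly simplified illustration of the two-layer structure described above: a base picture produced by an existing (typically hardware) decoder is upscaled and the decoded enhancement residuals are added on top. It is not the actual LCEVC algorithm, which defines its own upsamplers, transforms and temporal prediction; the array shapes and values are arbitrary assumptions.

```python
import numpy as np

def reconstruct(base_picture: np.ndarray, residuals: np.ndarray, scale: int = 2) -> np.ndarray:
    # Upscale the base-codec output (nearest-neighbour here, for simplicity)
    # and add the decoded enhancement residuals to get the full-resolution picture.
    upscaled = np.kron(base_picture, np.ones((scale, scale)))
    return np.clip(upscaled + residuals, 0, 255)

base = np.full((540, 960), 120.0)      # toy luma plane from a legacy hardware decoder
residuals = np.zeros((1080, 1920))     # toy residual plane from the software enhancement layer
print(reconstruct(base, residuals).shape)  # (1080, 1920)
```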

3 Degrees of Freedom+ (3DoF+) Video

This activity, still at an early (WD) stage, will reach CD stage in January 2020. The standard will allow an encoder to send a limited number of views of a scene so that a decoder can display specific views at the request of the user. If the request is for a view that is actually available in the bitstream, the decoder will simply display it. If the request is for a view that is not in the bitstream, the decoder will synthesise the view using all available information.
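
The toy sketch below illustrates only the decision logic just described: return a transmitted view if the requested one is in the bitstream, otherwise synthesise it from the nearest available views. It is a sketch under simplifying assumptions (1-D camera positions, naive distance-based blending); actual 3DoF+ view synthesis relies on depth-based warping of the transmitted views.

```python
import numpy as np

def render_view(requested_pos, decoded_views):
    # decoded_views maps a camera position (a float on a 1-D rail, for simplicity)
    # to its decoded picture (an H x W x 3 array).
    if requested_pos in decoded_views:
        return decoded_views[requested_pos]        # view available: just display it
    # View not in the bitstream: synthesise it from the two nearest views,
    # here with naive inverse-distance blending (no depth, no warping).
    positions = sorted(decoded_views)
    nearest = sorted(positions, key=lambda p: abs(p - requested_pos))[:2]
    weights = np.array([1.0 / (1e-6 + abs(p - requested_pos)) for p in nearest])
    weights /= weights.sum()
    return sum(w * decoded_views[p] for w, p in zip(weights, nearest))

views = {0.0: np.zeros((4, 4, 3)), 1.0: np.ones((4, 4, 3))}
print(render_view(0.25, views)[0, 0, 0])           # 0.25: dominated by the nearer view
```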

Figure 1 shows the effect of decreasing the number of views available at the decoder. With 32 views the image looks perfect, with 8 views there are barely visible artifacts on the tube on the floor, but with only two views artifacts become noticeable.

Of course this is an early-stage result that will be further improved until the standard reaches Final Draft International Standard (FDIS) stage in October 2020.

Figure 1 – Quality of synthesised video as a function of the number of views

Video coding for machines

This is an example of work that looks like it is brand new to MPEG but has memories of the past.

In 1996 MPEG started working on MPEG-7, a standard to describe images, video, audio and multimedia data. The idea was that a user would tell a machine what was being looked for. The machine would then convert the request into standard descriptors and use them to search the database where the descriptors of all content of interest had been stored.

I should probably not have to say that the descriptors had a compressed representation because moving data is always “costly”.

Some years ago, MPEG revisited the issue and developed Compact Descriptors for Visual Search (CDVS). The standard was meant to provide a unified and interoperable framework for devices and services in the area of visual search and object recognition.

Soon after CDVS, MPEG revisited the issue for video and developed Compact Descriptors for Video Analysis (CDVA). The standard is intended to achieve the goals of enabling interoperable applications for object search, minimising the size of video descriptors and ensuring high matching performance (both in accuracy and complexity).

As the “compact” adjective in both standards signals, CDVS and CDVA descriptors are compressed, with a user-selectable compression ratio.

Recently MPEG has defined requirements, issued a call for evidence and a call for proposals, and developed a WD of a standard whose long name is “Compression of neural networks for multimedia content description and analysis”. Let’s call it for simplicity Neural Network Representation (NNR).

Artificial neural networks are already used for extraction of descriptors, classification and encoding of multimedia content. A case in point is provided by CDVA that is already using several neural networks in its algorithm.

The efficient transmission and deployment of neural networks for multimedia applications require methods to compress these large data structures. NNR defines tools for compressing neural networks for multimedia applications and representing the resulting bitstreams for efficient transport. Distributing a neural network to billions of people may be hard to achieve if the neural network is not compressed.
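
As a toy illustration of the kind of tool such a standard may build on (an assumption made here for illustration; this is not a description of the actual NNR toolset), the sketch below applies uniform scalar quantisation to a weight tensor, shrinking 32-bit floats to 8-bit integers before any entropy coding.

```python
import numpy as np

def quantise(weights: np.ndarray, bits: int = 8):
    # Uniform scalar quantisation: map floats to small integers plus one scale factor.
    scale = np.abs(weights).max() / (2 ** (bits - 1) - 1)
    return np.round(weights / scale).astype(np.int8), scale

def dequantise(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, scale = quantise(w)
print(q.nbytes / w.nbytes)                             # 0.25: 4x smaller before entropy coding
print(float(np.abs(dequantise(q, scale) - w).max()))   # small reconstruction error
```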

I am now ready to introduce the new, but also old, idea behind video coding for machines. The MPEG “bread and butter” video coding technology is itself a sort of descriptor extraction: DCT (or other transform) coefficients capture the average value and frequency content of a region, motion vectors describe how certain areas of the image move from frame to frame, and so on.
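
As a toy illustration of this “descriptors” view of video coding (a sketch, not the algorithm of any particular MPEG codec), the code below computes the 2D DCT of an 8x8 pixel block: the top-left coefficient is proportional to the block average, while the remaining coefficients describe its frequency content.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    # Orthonormal DCT-II basis matrix of size n x n.
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    m[0, :] /= np.sqrt(2)
    return m

def block_dct(block: np.ndarray) -> np.ndarray:
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T                     # separable 2D DCT

block = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)
coeffs = block_dct(block - 128.0)              # level shift, as in intra coding
print(round(coeffs[0, 0] / 8 + 128, 1), round(block.mean(), 1))  # DC recovers the block average
```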

So far, video coding “descriptors” were designed to achieve the best visual quality – as assessed by humans – at a given bitrate. The question asked by video coding for machines is: “what descriptors provide the best performance for use by a machine at a given bitrate?”

Tough question for which currently there is no answer.

If you want to contribute to the answer, you can join the email reflector after subscribing here. MPEG 128 will be eager to know how the question has been addressed.

A consideration

MPEG is a unique organisation with activities covering the entire scope of a standard: idea, requirements, technologies, integration and specification.

How can mindless industry elements think they can do better than a Darwinian process that has shaped the MPEG machine for 30 years?

Maybe they think they are God, because only He can perform better than Darwin.

Posts in this thread

Hamlet in Gothenburg: one or two ad hoc groups?

In The Mule, Foundation and MPEG, the latest article published on this blog, I wrote: “In 30 years of MPEG, and counting? I somehow referred to the MPEG Mule when I wrote ‘Another thirty years await MPEG, if some mindless industry elements will not get in the way’. We may be close to knowing the fate of the MPEG Mule.”

We are nowhere close to knowing the fate of MPEG and in this article I will tell another episode of the saga.

In Which future for MPEG? I exposed my ideas about the future of MPEG based on a very simple reasoning. MPEG has developed global industry-agnostic digital-media standards that have led the media industry from analogue to digital, given opportunities to develop new business models and enabled the continuous expansion of the media industry. This is not a tale of the past but a reality that continues today with sustained production of digital media standards. The proof is in the record attendance last week of more than 600 MPEG members in Gothenburg.

Finally, as I wrote in MPEG and ISO, even taxi drivers know MPEG, demonstrating that the name MPEG does not just refer to a technology hidden in devices no one knows about but is imprinted in people’s minds.

Next to my proposal to leverage such a unique organisation by making official the strategic role that MPEG has played for the last 30 years, there are many other proposals, which can be summarised as follows.

The first of these other proposals says: JPEG and MPEG are two working groups in the parent Subcommittee (SC). The former is in charge of coding of images and the latter is in charge of coding of moving pictures. By making MPEG an SC, JPEG remains alone in the parent SC and there will be no more collaboration.

The problem with this argument is that, especially in the last few years, for whatever reason, JPEG and MPEG have not collaborated. JPEG used to meet co-located with MPEG, but then decided to meet separately. This does not mean that MPEG has not worked for JPEG: it developed two standards for the transport of JPEG2000 and JPEG XS images on the MPEG-2 Transport Stream (TS), the standard that transports digital television.

Starting in 1992, MPEG has developed 5 standards jointly with ITU-T Study Group 16 (SG16) and is now developing a 6th. Yet ITU-T SG16 is not even part of ISO! Another example: MPEG has developed 3 standards, and is developing 3 more, jointly with TC 276 Biotechnology. Here we are talking of an ISO Technical Committee whose mission is to develop standards for biotechnology, which have nothing to do with digital media (but the jointly developed standard – MPEG-G – is very much needed by TC 276 for their workflows)!

This proves that collaboration happens when there is a common interest, not because the parties in the collaboration belong to the same organisational structure. The latter is a bureaucratic view of collaboration that is unfortunately prevalent in ineffective organisations. Indeed, for bureaucrats it is difficult to understand the essence of a common problem across organisational borders, while it is easy to understand what happens inside an organisation (if it is understood at all, I mean).

The second of these proposals is a further attempt at creating organisational bindings where none existed before, because they were never needed. In a few words, the proposal is: instead of becoming an independent SC of 600 members (larger than many Technical Committees), the MPEG subgroups should be dissolved into the parent SC.

This proposal demonstrates that the proponents lack a basic understanding of what MPEG is. MPEG is an ecosystem of groups developing integrated standards whose parts can also be used independently. To achieve this result, MPEG has developed the organisation described in More standards – more successes – more failures.

Figure 1 – Independent parts making an integrated standard

The parts of an MPEG standard (blue circles) are typically “owned” (i.e. developed) by different groups, but there is a need to provide a “glue” (red lines in Figure 1) between the different parts of a standard if the standard is to be used as an integrated whole. The glue is provided by the MPEG subgroups, assisted by ad hoc groups, breakout groups and joint meetings, and orchestrated by studies made at chairs meetings.

Dumping the MPEG organisation to dissolve it into the parent SC will lead to the loss of the glue that makes MPEG standards uniquely effective and successful in the industry. The components of a disbanded MPEG will not deliver as before in the new environment. Sure, given time, a new structure can emerge, but it is vital that a new structure operate now at the same level of performance as MPEG, not in some years. Competition to MPEG is at the gates.

The third of these proposals is to give the parent SC the role of strategic planning, technical coordination and external relations that MPEG has – successfully – carried out for the last 30 years. This proposal is so naïve that not many words are needed to kill it (in Japanese you would use the word 黙殺, literally meaning “killing with silence”). For 30 years the parent organisation has performed administrative functions and, just as you cannot make a racehorse out of a horse that has pulled a cart for years simply because its master so decides, the parent SC cannot become a strategic planner, a technical coordinator or an external relations manager. After some years a new structure and modus operandi may very well settle in (MPEG did not become what it is in a day), but in the meantime the cohesion that has kept the MPEG components together will wither, never to come back, and industry will simply spurn its standards.

The fourth and last proposal (in this article, because there are many more) comes from a Non-Performing Entity (NPE): appoint a new parent committee chair, disband what exists today and create a new organisation from scratch. Sure, if the intention is to keep on a leash a tame committee whose sole purpose is to add IP to standards without any consideration of their industrial value, this is an excellent proposal.

In Gothenburg these and other proposals were discussed. How to make progress? One proposal was to create two ad hoc groups: one studying the first, well-documented proposal and the other trying to put order in the patchwork of ideas, parts of which I have described above. Another proposal was to create only one ad hoc group combining the mandates of the two.

The matter was discussed for hours. Hamlet had to be called from neighbouring Denmark to decide. Whose skull did he use?

Posts in this thread

The Mule, Foundation and MPEG

What do the three entities of the title have to do with one another?

The second entity is Isaac Asimov’s Foundation Trilogy, the tale of an organisation (actually more than one) established by Hari Seldon, the inventor of psychohistory. According to that fictional theory, the responses of large human populations to certain stimuli will remain the same over time if conditions remain as planned. Then, according to Asimov, psychohistory can predict the main elements of the evolution of society over the centuries, and the Foundation is the organisation created to make sure that the future remains as Hari Seldon had planned it.

The first element is the Mule, a character of the trilogy, a mutant who quickly conquers the Galactic Empire with the power of his mental capabilities. He is an element of that fictional society whose appearance Hari Seldon’s psychohistory could not predict. The Mule was not expected to appear, but did.

The third is the MPEG data compression – especially media compression – group I have been writing about for some time on this blog, a group whose appearance in the media industry could not be predicted because it was completely outside the rules of that industry, rules that are maybe the best real-world equivalent of Hari Seldon’s psychohistory.

Which were those rules? At certain points in history, several discoveries were made that rendered a range of inventions possible. Very often the only merit of the guy who made the invention was that he put together a process whose elements were either known or already “floating in the air”. Regularly the invention was patented and gave the inventor the right to exploit it for the number of years granted by the law of his country.

In spite of this often chaotic process, several media types converged on common technologies. The photographic industry settled on a limited number of film sizes and the cinematographic industry settled on a limited number of formats: frames per second and film sizes. Sound was recorded on vinyl records that were played at a limited number of speeds. All this according to a series of steps that could not individually be predicted, but whose general outcome could.

The use of magnetics and electronics allowed more effective recording and, more importantly, enabled the instantaneous transmission of sound and images to remote places. Here chaos reigned supreme, with a large and growing number of formats for sound and television, real-time and stored. If there had been a Hari Seldon of the media industry, he could have applied his psychohistory.

In the Media Empire the Foundation materialised as a growing number of standards organisations that tried to keep some order in the field. Table 1 shows just those at the international level, but others popped up at regional, national and industry level.

Table 1 – Media-related standards committees (1980s)

Organisation   Area                     Committee
ITU-T          Speech                   SG XV WP 1
ITU-T          Video                    SG XV WP 2
ITU-R          Audio                    SG 10
ITU-R          Video                    SG 11
IEC            Recording of audio       SC 60 A
IEC            Recording of video       SC 60 B
IEC            Audio-visual equipment   TC 84
IEC            Receivers                SC 12A and G
ISO            Photography              TC 42
ISO            Cinematography           TC 36

In “Foundation”, Hari Seldon had anticipated a number of “crises”. In the Media Empire, too, one crisis was due: the advent of digital technologies. Normally, this crisis should have been absorbed by making some cosmetic changes while keeping the system unchanged.

This is not what happened in the Media Empire, because the Mule appeared in the form of a wild group of experts banding together under the MPEG flag. In the early days their very existence was not even detected by the most sophisticated devices, but soon the Mule’s onslaught became unstoppable. In a sequence of strikes the MPEG Mule conquered the Media Empire: interactive video on compact disc, portable music, digital audio broadcasting, digital television, audio and video on the internet, file format, common encryption, IP-based television, 3D Audio, streaming over the unreliable internet and more. Billions of people were lured, without complaint but with joy, into the new world.

The forces of the MPEG Mule have clearly triumphed over the forces of darkness and anarchy. The Mule – the ultimate outsider – has exploited the confusion and brought order, to everybody’s satisfaction if not to that of the forces of the Foundation, who have been designing their comeback.

What will then be the eventual fate of the MPEG Mule?

In the Foundation, the Mule is eventually wiped out, not because his powers disappear but because others learned some of the methods of the Mule and applied them to their own ends, i.e. to reinstate confusion.

In 30 years of MPEG, and counting? I somehow referred to the MPEG Mule when I wrote “Another thirty years await MPEG, if some mindless industry elements will not get in the way”.

We may be close to knowing the fate of the MPEG Mule.

Posts in this thread