Compression – the technology for the digital age

It is common wisdom that ours is the digital age. Indeed, most of the information around us is in digital form and we can expect that what is not yet digital will soon be converted.

But is being digital what matters? In this paper I will show that being digital does not by itself make information more accessible or easier to handle. Actually, being digital may very well mean being able to do less with it. Information becomes accessible (even liquid) and processable only if it is compressed.

Compression is the enabler of the evolving media-rich communication society that we value.

A little bit of history

In the early 1960s the telco industry felt ready to enter the digital world. They decided to digitise the speech signal by sampling it at 8 kHz with a nonlinear 8-bit quantisation, i.e. at 64 kbit/s. Digital telephony has existed in the network ever since, and it is no surprise that few people know about it, because this digital speech was hardly ever compressed.

In the early 1980s Philips and Sony developed the compact disc (CD). Stereo audio was digitised by sampling the audio waveforms at 44.1 kHz with 16-bit linear quantisation and stored on a laser disc. This was a revolution in the audio industry because consumers could have an audio quality that did not deteriorate with time (until the CD stopped playing altogether). Did the user experience change? Definitely. For the better? Some people, even today, disagree.

In 1980 ITU-T defined the Group 3 facsimile standard. In the following decades hundreds of millions of Group 3 digital facsimile devices were installed worldwide. Why? Because it cut the transmission time of an A4 sheet from 6 minutes (Group 1), or 3 minutes (Group 2), to about 1 minute. If digital fax had not used compression, transmission with a 9.6 kbit/s modem (a typical value at that time) would have taken longer than Group 1 analogue facsimile required.

A digital uncompressed photo of, say, 1 Mpixel would take about half an hour to send over a 9.6 kbit/s modem (and it was probably never used in this form), but with a compression of 60x it would take half a minute. An uncompressed 3-minute CD track on the same modem would take in excess of 7 hours, but compressed at 96 kbit/s it would take about 30 minutes. That was the revolution brought about by MP3 audio compression, which changed music for ever.
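
The arithmetic behind these figures is easy to check. Here is a minimal sketch (the 16 bits/pixel for the photo is an assumption made for illustration; the text only gives the pixel count):

    # Back-of-the-envelope check of the transmission times quoted above.
    # Assumptions not stated in the text: 16 bits per pixel for the photo and
    # CD audio at 44,100 samples/s x 16 bits x 2 channels.
    MODEM = 9_600                        # bit/s, a typical modem of that time

    photo_bits = 1_000_000 * 16          # 1 Mpixel photo, assumed 16 bits/pixel
    print(photo_bits / MODEM / 60)       # ~28 minutes uncompressed
    print(photo_bits / 60 / MODEM)       # ~28 seconds with 60x compression

    cd_rate = 44_100 * 16 * 2            # 1,411,200 bit/s for CD stereo
    print(180 * cd_rate / MODEM / 3600)  # ~7.35 hours for a 3-minute track
    print(180 * 96_000 / MODEM / 60)     # ~30 minutes when compressed at 96 kbit/s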

Digital television was specified by ITU-R in 1980 with the luminance and the two colour difference signals sampled at 13.5 and 6.75 MHz, respectively, at 8 bits per sample. The result was an amazingly high bitrate of 216 Mbit/s. As such it never left the studio, except on bulky magnetic tapes. However, when MPEG developed the MPEG-2 standard, capable of yielding studio-quality video compressed at a bitrate of 6 Mbit/s, it became possible to pack 4 TV programs (or more) where there had been just one analogue program, and TV was no longer the same.
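
As a quick check, the 216 Mbit/s figure follows directly from the sampling parameters:

    # The 216 Mbit/s studio bitrate from the ITU-R sampling parameters above.
    Y, Cb, Cr = 13.5e6, 6.75e6, 6.75e6   # samples per second
    print((Y + Cb + Cr) * 8 / 1e6)       # 216.0 Mbit/s at 8 bits/sample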

The bottom line is that being digital is pleasantly good, but being compressed digital is so much better in practice.

Video and audio generate a lot of bits

Digital video, in the uncompressed formats used so far, generates a lot of bits and new formats will continue doing so. This is described in Table 1, which gives the parameters of some of the most important video formats, assuming 8 bits/sample. High Definition with an uncompressed bitrate of ~829 Mbit/s is now well established and Ultra High Definition with an uncompressed bitrate of 6.6 Gbit/s is fast advancing. So-called 8k, with an uncompressed bitrate of 26.5 Gbit/s, seems to be the preferred choice for Virtual Reality, and higher resolution values are still distant, but may be with us before we are even aware of them.

Table 1: Parameters of video formats

  Format                  # Lines   # Pixels/line   Frame frequency (Hz)   Bitrate (Gbit/s)
  “VHS”                       288             360                     25              0.041
  Standard definition         576             720                     50              0.166
  High definition            1080            1920                     50              0.829
  Ultra high definition      2160            3840                     50              6.636
  8k (e.g. for VR)           4320            7680                     50             26.542
  16k (e.g. for VR)          8640           15360                    100            212.336
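
Here is a minimal sketch of how the bitrates in Table 1 can be reproduced. The 4:2:2 assumption (two samples per pixel at 8 bits/sample) and the reading of the SD and HD figures as 50 fields/s (i.e. 25 frames/s) are interpretations made only to match the numbers in the table:

    # Reproduces the uncompressed bitrates of Table 1 (Gbit/s).
    # Assumptions: 2 samples/pixel (4:2:2) at 8 bits/sample; the 50 Hz of the
    # interlaced SD and HD rows is taken as a field rate, i.e. 25 frames/s.
    def bitrate_gbps(lines, pixels_per_line, frames_per_s):
        return lines * pixels_per_line * frames_per_s * 2 * 8 / 1e9

    print(bitrate_gbps(288, 360, 25))      # ~0.041  ("VHS")
    print(bitrate_gbps(1080, 1920, 25))    # ~0.829  (high definition)
    print(bitrate_gbps(2160, 3840, 50))    # ~6.636  (ultra high definition)
    print(bitrate_gbps(4320, 7680, 50))    # ~26.542 (8k)
    print(bitrate_gbps(8640, 15360, 100))  # ~212.3  (16k)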

Compression has been and will continue to be the enabler of all these video formats. The first format in Table 1 was the target of MPEG-1 compression, which reduced the uncompressed 41 Mbit/s to 1.2 Mbit/s at a quality comparable to VHS. Today it may be possible to send 41 Mbit/s through the network, but no one would do it for a VHS-quality video. They would possibly do it for UHD video, but in compressed form.

In its 30 years of existence MPEG has constantly pushed forward the limits of video compression. This is shown in the third column of Table 2 where it is possible to see the progress of compression over the years: going down in the column, every cell gives the additional compression provided by a new “generation” compared to the previous one.

Table 2: Improvement in video compression

  Std      Part name          Base   Scalable   Stereo   Depth   Selectable viewpoint   Date
  MPEG-1   Video              VHS                                                       92/11
  MPEG-2   Video              SD     -10%       -15%                                    94/11
  MPEG-4   Visual             -25%   -10%       -15%                                    98/10
  MPEG-4   AVC                -30%   -25%       -25%     -20%    5/10%                  03/03
  MPEG-H   HEVC               -60%   -25%       -25%     -20%    5/10%                  13/01
  MPEG-I   Immersive Video    ?      ?          ?        ?       ?                      20/10

The fourth column in Table 2 gives the additional saving (compared to the third column) offered by scalable bitstreams compared to simulcasting two bitstreams of the same original video at different bitrates (scalable bitstreams contain, in the same bitstream, two or more versions of the same scene at different rates). The fifth column gives the additional saving (compared to the third column) offered by stereo coding tools compared to independent coding of two cameras pointing at the same scene. The sixth column gives the additional saving (compared to the fifth column) obtained by using depth information, and the seventh column gives the additional cost (compared to the sixth column) of giving the viewer the possibility to select the viewpoint. The last column gives the date of approval of the standard by MPEG, and the last row refers to the next video coding standard under development, for which 2020/10 is the expected time of approval.
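
One way to read the third column (an illustrative interpretation: each percentage is the bitrate saving over the previous generation at comparable quality, so the savings compound):

    # Compounds the generation-on-generation savings in column 3 of Table 2,
    # taking MPEG-2 Video as the 100% reference. Illustrative reading only.
    relative_bitrate = 1.0
    for std, saving in [("MPEG-4 Visual", 0.25), ("MPEG-4 AVC", 0.30), ("MPEG-H HEVC", 0.60)]:
        relative_bitrate *= 1 - saving
        print(f"{std}: ~{relative_bitrate:.0%} of the MPEG-2 bitrate")
    # HEVC ends up at ~21%, i.e. roughly a 5x bitrate reduction since 1994.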

Audio in MPEG had a parallel history. The first two columns of Table 3 give the type of sound format (full channels.effects channels) and the sampling frequency. The next columns give the MPEG standard, the target bitrate of the standard and the performance (as subjective audio quality) achieved by the standard at the target bitrate(s).

Table 3: Audio systems

  Format           Sampling freq. (kHz)   MPEG std   Bitrate (kbit/s)   Performance
  Speech                    8                        64                 Toll Quality
  CD                       44.1                      1,411              CD Quality
  Stereo                   48              1-2-4     128 → 32           Excellent to Good
  5.1 surround             48              2-4       384 → 128          Excellent to Good
  11.1 immersive           48              H         384                Excellent
  22.2 immersive           48              H         1,500              Broadcast Quality
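
As a sanity check on the uncompressed rates behind Table 3 (the 16-bit samples for the 48 kHz stereo case are an assumption, not stated in the table):

    # Uncompressed PCM rates and the compression implied by Table 3.
    cd = 44_100 * 16 * 2 / 1000          # 1,411.2 kbit/s, the CD row
    stereo = 48_000 * 16 * 2 / 1000      # 1,536 kbit/s, assumed 16-bit stereo at 48 kHz
    print(cd, stereo)
    print(stereo / 128, stereo / 32)     # ~12x to ~48x compression for the Stereo row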

More challenges for the future

There is no doubt that the future of entertainment is more digital data generated by sampling the electromagnetic and acoustic fields that we call light and sound, respectively. MPEG is investigating the digital representation of the former by acting in 3 directions.

The first direction of investigation uses “sparse sensing elements” capable of capturing both the colour of a point and the distance (depth) of the point from the sensor. Figure 1 shows 5 pairs of physical cameras shooting a scene and a red line of points at which sophisticated algorithms can synthesise the output of virtual cameras.

Figure 1:  Real and virtual cameras

The second direction of investigation uses “dense sensing elements” and comes in two variants. In the first variant each sensing element captures light coming from a direction perpendicular to the sensor plane; in the second, illustrated by Figure 2, each “sensor” is a collection of sensors capturing light from different directions (plenoptic camera).

Figure 2: Plenoptic camera

The second variant of this investigation tries to reduce to manageable levels the amount of information generated by sensors that capture the light field.

Figure 3 shows the expected evolution of

  1. The pixel refresh rate of light field displays (right axis [1]) and the bitrate (left axis) required for transmission when compressed with current HEVC technology (blue line)
  2. The available broadband and high-speed local networks bitrates (red/orange lines and left axis [2], [3]).

Figure 3: Sensor capabilities, broadband internet and compression
(courtesy of Gauthier Lafruit and Marek Domanski)

The curves remain substantially parallel, separated by a factor of 10, i.e. one unit in log10 (yellow arrow). In its MPEG-I project MPEG is working to provide more compression. Actually, today’s bandwidth of 300 Mbit/s (green dot A) is barely enough to transmit video at 8 Gpixels/s (10^9.9 on the right axis) for high-quality stereoscopic 8k VR displays at 120 frames/s compressed with HEVC.
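
The 8 Gpixels/s figure can be checked from the display parameters quoted above (stereoscopic 8k at 120 frames/s):

    import math
    # Pixel rate of a stereoscopic 8k display at 120 frames/s.
    pix_per_s = 2 * 7680 * 4320 * 120
    print(pix_per_s / 1e9)        # ~7.96 Gpixels/s
    print(math.log10(pix_per_s))  # ~9.9, the value read on the right axis of Figure 3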

In 2020 we expect light field displays that project hundreds of high-resolution images in various directions for glasses-free VR, similar to the Holodeck experience, to reach point B on the blue line. Again, HEVC will not be able to transmit all data over broadband networks. However, this will be possible over local networks, since the blue line stays below the top orange line in 2020.

The third direction of investigation uses “point clouds”. These are unordered sets of points in a 3D space, used in several industries, and typically captured using various setups of multiple cameras, depth sensors, LiDAR scanners, etc., or synthetically generated. Point clouds have recently emerged as representations of the real world enabling immersive forms of interaction, navigation, and communication.

Point clouds are typically represented by extremely large amounts of data, which is a significant barrier for mass market applications. MPEG is developing a standard capable of compressing a point cloud to levels that are compatible with today’s networks reaching consumers. This emerging standard compresses 3D data by leveraging decades of 2D video coding technology development and combining 2D and 3D compression technologies. This approach allows industry to take advantage of existing hardware and software infrastructures for rapid deployment of new devices for immersive experiences.

Targeted applications for point cloud compression include immersive real-time communication, six Degrees of Freedom (6 DoF) where the user can walk in a virtual space, augmented/mixed reality, dynamic mapping for autonomous driving, and cultural heritage applications.

Dense sensing elements are also applicable to audio capture and coding. Wave Field Synthesis is a technique in which a grid or array of sensors (microphones) is spaced at least as closely as one-half the wavelength of the highest frequency in the sound signal (a wavelength of about 1.7 cm at 20 kHz). Such an array can be used to capture the performance of a symphony orchestra, with the array placed between the orchestra and the audience in the concert hall. When the captured signal is played out through an identically placed set of loudspeakers, the acoustic wave field at every seat in the concert hall can be correctly reproduced (subject to the effects of finite array size). Hence, with this technique, every seat in the “reproduction” hall offers exactly the same experience as the live performance.
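
A quick check of that spacing requirement (assuming a speed of sound of 343 m/s in air):

    # Maximum microphone spacing for Wave Field Synthesis: half the wavelength
    # of the highest audible frequency.
    c, f_max = 343.0, 20_000     # m/s, Hz
    wavelength = c / f_max
    print(wavelength * 100)      # ~1.7 cm wavelength at 20 kHz
    print(wavelength / 2 * 100)  # ~0.86 cm maximum sensor spacing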

Even ancient digital media need compression

The genome is a special type of program designed to run on a special type of computer – the cell. The program is “written” with an alphabet of 4 symbols (A, T, G and C) and physically carried as four types of nucleobases called adenine, thymine, guanine and cytosine on a double helix created by the bonding of adenine and thymine, and cytosine and guanine (see Figure 4).

Figure 4: The double helix of a genome

Each of the cells of a human (~37 trillion, for a weight of 70 kg) carries a hopefully intact copy of the “program”, pieces of which it runs to synthesise the proteins for its specific needs (troubles arise when the original “program” lacks the instructions to create some vital protein or when copies of the “program” have changed some vital instructions).

The genome is the oldest – and naturally made – instance of digital media. The length of the human “program” is significant: ~ 3.2 billion base pairs (bp), equivalent to ~800 MBytes.
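
The ~800 MBytes figure follows from coding each of the 4 symbols with 2 bits:

    # Size of the human genome at 2 bits per base (4 symbols).
    base_pairs = 3_200_000_000
    print(base_pairs * 2 / 8 / 1e6)   # ~800 MBytes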

While a seemingly simple entity like a cell can transcribe and execute the program contained in the genome, or parts of it, humans have a hard time even reading it. Devices called “sequencing machines” do the job, but rather lousily, because they are capable of reading only random fragments of the DNA and of outputting the values in a random order. Figure 5 depicts a simplified description of the process, which also shows how these “reads” must be aligned, using a computer program, against a “reference” DNA sequence to eventually produce a re-assembled DNA sequence.

Figure 5: reading, aligning and reconstructing a genome

Sequencing machines also generate a number associated with each base pair, corresponding to the quality of the reading of that base pair in the fragment. Therefore, a sequencing machine typically provides as many quality values as there are base pairs in each fragment. However, to be sure that the random fragments sufficiently “cover” each part of the genome and that reading errors, which occur randomly, are corrected by redundant correct reads, the amount of data generated can on average exceed the size of the full genome by a factor called “coverage”. This means that reading a DNA sample with a coverage of 200 may generate ~1.5 TBytes.
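
A rough sketch of where the ~1.5 TBytes comes from (the two bytes per base, one ASCII character for the base and one for its quality value, are an assumption used only for illustration):

    # Rough size of the raw output for a human genome sequenced at 200x coverage.
    # Assumption: ~2 bytes per base in uncompressed text form (base + quality),
    # before read names and other overhead.
    genome_bp = 3_200_000_000
    coverage = 200
    print(genome_bp * coverage * 2 / 1e12)   # ~1.3 TBytes, ~1.5 TBytes with overhead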

Transporting, processing and managing this amount of data is very costly. There are ASCII formats for these genomic data: FastQ, a container of non-aligned reads, and Sequence Alignment/Map (SAM), a container of aligned/mapped reads. Zip compression is applied to reduce the size of both, generating zipped FastQ files and BAM (the binary version of SAM). However, the compression performance remains poor, data access is awkward and maintenance of the formats is a problem.

MPEG was made aware of this situation and, in collaboration with ISO TC 276 Biotechnology, is developing the MPEG-G standard for genome compression that it plans to approve in October 2018 as an FDIS.

Unlike BAM, where data with different statistical properties are put together and then zip-compressed, MPEG uses a different and more effective approach (illustrated by the sketch after the list below):

  1. Genomic data are represented with statistically homogeneous descriptors
  2. Metadata associated with classified reads are represented with specific descriptors
  3. Individual descriptor sub-sets are compressed, achieving compression in the range of 100x
  4. Descriptor sub-sets are stored in Access Units for selective access via a standard API.
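
Here is a toy illustration of the principle behind points 1-3 (it uses generic zlib on made-up data and is in no way the MPEG-G syntax): splitting data into statistically homogeneous streams before compression beats zipping a single interleaved file, which is what BAM-like formats effectively do.

    import random, zlib
    # Toy comparison: compress an interleaved base+quality stream (BAM-like)
    # versus the two statistically homogeneous streams compressed separately.
    random.seed(0)
    n = 100_000
    bases = "".join(random.choice("ACGT") for _ in range(n))
    quals = "".join(chr(33 + max(0, min(40, int(random.gauss(35, 3))))) for _ in range(n))

    interleaved = "".join(b + q for b, q in zip(bases, quals))
    mixed = len(zlib.compress(interleaved.encode(), 9))
    split = len(zlib.compress(bases.encode(), 9)) + len(zlib.compress(quals.encode(), 9))
    print(mixed, split)   # the separately compressed homogeneous streams come out smaller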

Therefore MPEG-G not only saves storage space (from ~1.5 TBytes down to ~15 GBytes), and therefore transmission time, but also makes genome processing more efficient, with an estimated improvement of selective data access times of ~100x. Additionally, as with other MPEG standards, MPEG-G promises to deliver more efficient technologies in the future through a controlled process.

It may be time to update the current MPEG logo from what it has been so far to a new logo.

There are no limits to the use of compression

There is another type of “program” that is acquiring more and more importance. The technology is called neural networks and is more than 50 years old. The technology finds more and more uses and even appears – as Artificial Intelligence – in ads directed at the mass market.

Like a human neuron, the neural network element of Figure 6 collects inputs from different elements, processes them and, if the result is above a threshold, generates an activation signal.

Figure 6: An element of a neural network
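
A minimal sketch of such an element (a classic threshold unit; the weights and threshold are illustrative):

    # A single neural-network element: weighted inputs, a threshold, and an
    # activation signal when the threshold is exceeded.
    def neuron(inputs, weights, threshold):
        total = sum(x * w for x, w in zip(inputs, weights))
        return 1 if total > threshold else 0

    print(neuron([0.5, 0.2, 0.9], [0.8, -0.3, 0.6], threshold=0.7))   # fires: 1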

As in the human brain it is also possible to feed back the output signal to the source (see Figure 7).

Figure 7: Recurrency in neural networks

Today, neural networks for image understanding consist of complex configurations with several million weights, and it is already possible to execute specific neural network-based tasks on recent mobile devices. Why should we then not be able to download to a mobile device the neural network that best solves a particular problem?

The answer is that we can but, if the number of neural network-based applications increases, if more users want to download them to their devices, and if the applications grow in size to solve ever more complex problems, compression is needed and will play the role of enabler of a new age of mobile computing.
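
A toy sketch of the kind of saving at stake, using one illustrative technique (quantising 32-bit floating-point weights to 8-bit codes); this is not necessarily the approach MPEG will standardise:

    import random, struct
    # Toy illustration of neural-network weight compression: quantise 32-bit
    # float weights to 8-bit codes, a 4x size reduction before any entropy coding.
    random.seed(0)
    weights = [random.uniform(-1.0, 1.0) for _ in range(1_000_000)]   # a million weights

    raw_bytes = len(weights) * struct.calcsize("f")                   # 4 bytes per 32-bit weight
    scale = max(abs(w) for w in weights) / 127
    quantised = bytes(int(round(w / scale)) & 0xFF for w in weights)  # 1 byte per weight
    print(raw_bytes / 1e6, len(quantised) / 1e6)                      # ~4.0 MB vs ~1.0 MB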

MPEG is currently investigating if and how it can use its compression expertise to develop technologies that efficiently represent neural networks of several million neurons used in some of its standards under development.

A bright future, or maybe not

Implicitly or explicitly, all big social phenomena like MPEG are based on a “social contract”. Who are the parties to the “MPEG social contract”, and what is it about?

When MPEG started, close to 30 years ago, it was clear that there was no hope of developing decently performing audio and video compression standards while steering clear of the results of the many decades of compression research in which so many companies and universities had invested. So, instead of engaging in the common exercise of dodging patents, MPEG decided to develop its standards with the goal of best performance, irrespective of the IPR involved. The deal offered was a global market of digital media products, services and applications offering interoperability to billions of people and generating hefty royalties for patent holders.

MPEG did keep its part of the deal. Today MPEG-based digital media enable major global industries, allow billions of people to interoperate and provide royalties worth billions of USD (NB: ISO/IEC and MPEG have no role in royalties or the agreements leading to them).

Unfortunately some parties have decided to break the MPEG social contract. The HEVC standard, the latest approved video compression standard, was approved in January 2013 but, close to 5 years later, there is no easy way to get a licence to practice that standard.

What I am going to say should not be interpreted in a legal sense. Nevertheless, I am terribly serious about it. Whatever rights the law grants to patent holders, depriving billions of people, thousands of companies and hundreds of patent holders of the benefits of a standard like HEVC and, presumably, of other future MPEG standards, is a crime against humankind.

Acknowledgements

I would like to thank the MPEG subgroup chairs, activity chairs and members for making MPEG what it is recognised for – the source of digital media standards that have changed the world for the better.

Hopefully they will have a chance to continue doing so in the future as well.

References

[1] “Future trends of Holographic 3D display,” http://www-oid.ip.titech.ac.jp/holo/holography_when.html
[2] “Nielsen’s law of internet bandwidth,” https://www.nngroup.com/articles/law-of-bandwidth/
[3] “Bandwidth growth: nearly what one would expect from Moore’s law,” https://ipcarrier.blogspot.be/2014/02/bandwidth-growth-nearly-what-one-would.html


On my Charles F. Jenkins Lifetime Achievement Award

In a press release today the Academy of Television Arts and Sciences announces that I have been selected to receive the Charles F. Jenkins Lifetime Achievement Award, “a special engineering honor to an individual whose contributions over time have significantly affected the state of television technology and engineering”.

I should be happy to see the recognition of 30 years of work dedicated to making real the vision of humans finally free to communicate without barriers and sharing more and more rewarding digital media experiences. Still, I need to make a few remarks.

The first remark is that my endeavours were driven by the hand of God and that tens, hundreds and thousands of people have made MPEG what it is recognised for: the originator of standards that have changed the lives of billions of people for the better.

The second remark concerns the word “lifetime” in the name of the award. This sort of implies that my professional lifetime has been observed in its entirety. I hereby communicate that I do not intend to retire anytime soon.

The last and most important remark concerns two necessary conditions for the success of MPEG standards. MPEG demonstrably achieves the first – technical excellence – but those in charge of the second – commercial exploitability – deliver less and less. Indeed MPEG approved the HEVC standard in January 2013 (56 months ago!), but prospective users must negotiate with 3 different patent pools and a host of individual patent holders to get a licence. There are standards that will never see the light or, if they do, will not be used, because the standards organisations have been unable to update their processes from the time they dealt with standards for nuts and bolts.

I did my best to reverse this trend by raising awareness on these problems. Vested interests have stopped me, depriving billions of people and various industries of the benefits of new MPEG standards.

I am happy to receive this Charles F. Jenkins Lifetime Achievement Award – for what it means for the past – but with a sour taste for the future, the only thing that matters.

Standards for the present and the future

It is hard to talk sensibly to the general public about standards. It is a pity because standards are important as they ensure e.g. that nuts match with bolts, paper sheets feed into printers, music files play on handsets and a lot more.

One reason is that standards are one of the most ethereal things on Earth, as they concern interfaces between systems. Another reason is that the many industries created by human endeavour have developed their own customs: what is a must in one industry can be anathema in another. Yet another is the fact that standards are often the offspring of innovation, generating flows of money that can be anything from a trickle to a swollen river.

Standards used to have a direct impact on industry, but only rarely on end users, and at most on a small portion of them. Recently, however, Information and Communication Technologies have become so pervasive, affecting companies by the thousands and people by the billions, that the standards underpinning the industry have assumed unprecedented impact and visibility.

One of the most egregious cases is the ISO/IEC standard called High Efficiency Video Coding (HEVC). Work on this standard started in January 2010 and ended with the first release exactly 3 years later. As of July 2017 there are 3 patent pools (one representing 35 patent holders) and a number of companies (not represented by any patent pool) all claiming to have Intellectual Property (IP) on the standard.

It is no surprise that most people do not even know about HEVC because it is seldom – if ever – used in audio-visual services, and this 4 and a half years after industries could implement the standard – 18 months longer than it took to develop the standard itself. And some people say that standardisation takes too long!

This situation creates three clear losers:

  1. Companies that have contributed their technologies to the standard do not get the benefits of their investment;
  2. Companies that would be ready to use – in products, services and applications – HEVC because it performs better (by 60%) than Advanced Video Coding (AVC) currently in use are practically prevented from using it;
  3. End users are deprived of their right to get better or new services, or simply services where it was not possible to have them before.

If there is market failure when the allocation of goods and services is not efficient, i.e. when one can imagine a different situation where many individuals are better off without making others worse off, then we are in front of a market failure.

Or maybe not. According to recent news, Apple has announced that it will support HEVC in High Sierra (macOS) and iOS 11. One expects that a company as important as Apple does not make such an announcement without having its back well covered.

But is this big news? It depends on how you look at it.

  • Actually not so big, because major handset manufacturers are reportedly already installing HEVC chips in their handsets. So the Apple news is the software equivalent of a déjà vu and we are in front of a market failure.
  • If the news is as big as some people claim, then we are forced to conclude that only a company worth 800 B$ can get the licence required to exercise the HEVC standard. So we are in front of a market success.

Maybe not, or maybe yes. If we are in front of market success, we have sacrificed a major principle of international standardisation enshrined in the ISO/IEC Directives: standards must be accessible to everybody on a nondiscriminatory basis on reasonable terms and conditions. Everybody of the size of Apple Inc., I mean.

The problem with these well-intentioned rules is that they were developed at a time when the patents relevant to a standard were typically held by one company. Even with tens of MPEG-2 and AVC patent holders, things were still under control because there was one patent pool and a limited number of patent holders outside it. However, with HEVC we are dealing with close to 100 patent holders, grouped in 3 patent pools, and a significant number of patent holders outside them. HEVC is not the exception but the rule for this and future standards.

The principle just quoted does not imply that it is always necessary to pay in order to access a standard. If an amount has to be paid, it should be the same for all. If access is free, it should be free for all.

The devil, they say, is in the details. Per the ISO/IEC Directives, a patent holder is not obliged to disclose which patents are relevant to a technology proposed for a standard. This is not ideal but acceptable if the patent holder intends to license the contributed technologies for a fee, because precise identification of the relevant patents will be part of the development of licensing terms, by now a well-honed process.

If, however, access to the standard is intended to be free of charge, such “blanket” declarations should not be acceptable, because the committee developing the standard has no means to remove the technology. Declarations may come from companies that have more patents than employees, and there is no process to develop licensing terms.

It should also not be acceptable for patent holders to make patent declarations stating that they own relevant patents but do not intend to license them. Again, the committee developing the standard has no means to remove the infringing technology.

These problems have been identified and brought to an appropriate level in ISO/IEC. Is anything going to happen? Don’t count on it. At the meeting where the problems were presented, delegates from a handful of countries disputed the process that brought the matter to the attention of the committee, but no discussion could take place on the substance of the matter.

Something is rotten in the state of Denmark, and some are determined to keep it rotten.

Personal devices and persons

USA President Obama is reported as saying (NYT 2016/03/12): “If, technologically, it is possible to make an impenetrable device or system, where the encryption is so strong that there is no key, there is no door at all, then how do we apprehend the child pornographer? How do we disrupt a terrorist plot?”

My answer is another question: “How do we make a child pornographer or a terrorist talk if he does not want to?”

Egg and chicken – tax and expenses

After decades of funding the fanciest and most unproductive aspects of the welfare state by raising taxes, politicians have discovered that too much is too much. So cutting taxes has become the mantra of right- and left-wing politicians alike.

There is one problem, though. Citizens have become unresponsive (i.e. they no longer believe “tax cut” promises).

One suggestion to politicians in need of recovering citizens’ confidence: instead of saying “I will cut this tax”, say “I will cut this expense”.

Getting out of the mess

The Christian religion explains the mess of the world we live in with the Original Sin that has destroyed the good nature that would otherwise be in us.

The Original Sin cannot be undone but Baptism and adherence to the Religion’s precepts promise to make us reborn people.

How can this help sort out the European mess of these days? Here, too, we have an original sin: greedy Greek politicians bent on contracting new debts with greedier bankers bent on dispensing their banks’ assets as if they were candies for children.

Alas, that original sin cannot be undone and there is no baptism redeeming people. There are two precepts, though, that can help people’s rebirth.

  1. If a bank manager recklessly lends money to a country and the loan goes sour, the manager pays. There is no room for excuses like “the country showed bogus accounts”, because 1/10 of the diligence bank managers apply when a small enterprise requests a loan should be enough to detect the holes in the accounts of a country.
  2. If a bank risks failing because its managers have recklessly lent money to greedy politicians, the bank fails. No socialising of losses. Notionally the state can decide to rescue the bank, but the bank’s shareholders get the reward they deserve for their lack of vigilance (aka connivance): nothing.

Leonardo’s views on European Parliament’s “Google’s vote”

It is hard to legislate when the matter at hand is constantly changing. This is the case of the recent recommendation of the European Parliament (EP) to introduce regulation of internet search and break up Google.

It is not the first time public authorities break up companies: they did it in 1911 with Standard Oil Co. Inc. But that decision was more than a century ago and was about a company that was controlling too many “atoms” (actually molecules) that people were interested in.

In the second half of the ’90s public authorities realised that Microsoft was controlling too much of the “bit processing” going around. Luckily in 2000 Microsoft escaped Standard Oil’s fate and was “just” forced to give users the possibility to select a different browser than Internet Explorer in their Windows OS. This obligation only applied to Microsoft and not to other “bit processing” companies.

In the current case we are dealing with a company that controls, in the EP’s opinion, too many “information bits”, so why not break it up? As users can already select other search engines, a Microsoft-like solution is probably not the right one.

There is a solution: if we define a standard format that search engines must use when presenting search results to users, there would be room for intelligent software to pick out exactly the information of interest, using not just the output of one search engine but of many.
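
A toy sketch of what such a format could look like (entirely hypothetical; the field names are invented for illustration and do not come from any existing specification):

    from dataclasses import dataclass

    # Hypothetical common schema for search results, so that client-side software
    # can merge and filter the output of several engines. Field names are invented.
    @dataclass
    class SearchResult:
        engine: str     # which search engine produced the result
        rank: int       # position assigned by that engine
        url: str
        title: str
        snippet: str

    def merge(results):
        """Interleave results from different engines, dropping duplicate URLs."""
        seen, merged = set(), []
        for r in sorted(results, key=lambda r: r.rank):
            if r.url not in seen:
                seen.add(r.url)
                merged.append(r)
        return merged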

Of course this obligation should not be imposed on a single company –why punish a company for its success? – but on all companies running search services.

So the conclusion, valid for a world of information handling companies, is: don’t break up companies (left-hand side of the figure), force them to expose interfaces to the information they handle (right-hand side of the figure).


Leonardo closes the 39th DMP meeting, Strasbourg (FR) – 2014/10/25

The 39th DMP meeting reviewed the Web TVOS Framework API and produced v1.1 of the specification.

For GA40 the following contributions are expected:

  1. A drawing explaining operation of the specification in a client-server mode
  2. A rationale and a list of changes/extensions effected to the TVAnytime specification to produce this specification
  3. A study on what makes this specification different from other specifications. If the differences are small, metadata schema identification will be supported; if they are large, a revised version of the current specification will probably be developed without support for metadata schema identification.

Leonardo closes the 38th DMP meeting, Sapporo (JP) – 2014/07/10

The 38th DMP meeting reviewed submissions and created a workplan for its 3rd series of DMP specifications that aims to define a comprehensive system view of media services that are based on a main broadcasting service supplemented by interactive and personalised services delivered via the internet. The specification will include a general model of Hybrid-Delivery Media Services (HDMS) and their stakeholders, HDMS requirements, system-wide interfaces and user application interfaces.