What it is
MPAI (Moving Picture, Audio and Data Coding) is an unaffiliated, non-profit, international association whose mission is to develop data coding standards, predominantly using Artificial Intelligence. The MPAI Manifesto summarises MPAI's distinctive features.
MPAI's main target is “data coding”, which MPAI defines as the transformation of data from a given representation to an equivalent one more suited to a specific application, e.g., compression or semantics extraction.
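As a rough illustration of this definition, the sketch below uses ordinary lossless compression from Python's standard `zlib` module. It is not an MPAI technology and involves no AI; it only shows data being moved to an equivalent representation that better suits one application (storage or transmission). The sample payload is an arbitrary assumption.

```python
import zlib

# Arbitrary, highly repetitive sample payload (an assumption for illustration).
original = b"MPAI " * 200

# "Coding": transform the data into a representation suited to storage.
coded = zlib.compress(original)

# The coded form is equivalent: the original can be recovered exactly.
decoded = zlib.decompress(coded)

print(len(original), "bytes before coding,", len(coded), "bytes after")
print("round trip is lossless:", decoded == original)
```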
How it works
MPAI develops standards using a rigorous process (Figure 1).
Figure 1 – The MPAI standards development process
A standard project progresses through stages numbered 0 to 6:
- Stages 0-1-2 – Interest Collection, Use Case and Functional Requirements – are open to non-members.
- Stage 3 – Commercial Requirements – is developed by Principal Members in the form of a Framework Licence that defines the business model without values (dollars, percentages, etc.). The Framework Licence is intended to facilitate the eventual development, outside of MPAI, of the final licence(s). A member can upgrade its membership from Associate to Principal at any time.
- Stages 4-5 – Call for Technologies and Standard Development – are carried out by all MPAI members on an equal footing. Anybody may submit a response to a Call. However, the submitter of an accepted proposed technology must join MPAI.
- Stage 6 is for the formal adoption as an MPAI Standard by Principal Members.
Progression of a project from one stage to the next is approved by the General Assembly.
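The stage rules above can be sketched as a small state model. This is only an illustrative reading of the text, not an MPAI artefact: the names `Stage`, `participants` and `advance` are hypothetical, and the label chosen for stage 6 is an assumption.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stage numbers and names as listed in the text (stage 6 label assumed)."""
    INTEREST_COLLECTION = 0
    USE_CASE = 1
    FUNCTIONAL_REQUIREMENTS = 2
    COMMERCIAL_REQUIREMENTS = 3
    CALL_FOR_TECHNOLOGIES = 4
    STANDARD_DEVELOPMENT = 5
    MPAI_STANDARD = 6


def participants(stage: Stage) -> str:
    """Return who takes part in a stage, following the bullet list."""
    if stage <= Stage.FUNCTIONAL_REQUIREMENTS:
        return "members and non-members"
    if stage in (Stage.COMMERCIAL_REQUIREMENTS, Stage.MPAI_STANDARD):
        return "Principal Members"
    return "all MPAI members"


def advance(stage: Stage, approved_by_general_assembly: bool) -> Stage:
    """Move to the next stage only if the General Assembly approves."""
    if stage == Stage.MPAI_STANDARD:
        raise ValueError("project is already an adopted MPAI Standard")
    if not approved_by_general_assembly:
        return stage
    return Stage(stage + 1)


if __name__ == "__main__":
    stage = Stage.FUNCTIONAL_REQUIREMENTS
    stage = advance(stage, approved_by_general_assembly=True)
    print(stage.name, "is handled by", participants(stage))
```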
What it does
Currently, MPAI has 4 standards at stage 5, 4 at stage 2, 1 at stage 1 and 1 at stage 0 (Figure 2).
Figure 2 – The MPAI work plan (June 2021)
The inside of MPAI standards
MPAI standards do not assume that MPAI systems are monolithic entities, but rather that they have an internal architecture (Figure 3 and Figure 4).
Figure 3 – An MPAI AI Module (AIM)
Figure 4 – The MPAI AI Framework (AIF)
Figure 3 shows an instance of an AI Module (AIM), the unitary element of MPAI standards. This particular AIM extracts the emotion expressed by a human face and the meaning the face is intended to convey.
Figure 4 is an instance of the MPAI-standardised AI Framework (AIF, the rightmost standard in Figure 2) that can execute workflows of AIMs.
MPAI standardises the function (e.g., extract emotion and meaning) and the input/output data formats of an AIM but is silent on how the AIM produces output data from input data.
Therefore, MPAI standards enable interoperability of AIMs executed in an AIF. MPAI believes that competing developers, striving to provide better-performing AIMs that are proprietary yet still interoperable, will create horizontal markets of AI modules that build on and further promote AI innovation.
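The idea of fixing an AIM's function and data formats while leaving its internals open can be sketched in a few lines. The code below is a toy model, not the MPAI-AIF API: the classes `AIM` and `AIF`, the field names such as `face_video`, and the two "vendor" functions are all hypothetical stand-ins that return fixed placeholder values instead of running a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Data = Dict[str, object]


@dataclass
class AIM:
    """Hypothetical AI Module: declared inputs/outputs, opaque internals."""
    name: str
    inputs: List[str]
    outputs: List[str]
    process: Callable[[Data], Data]  # proprietary part, free to differ

    def run(self, available: Data) -> Data:
        missing = [key for key in self.inputs if key not in available]
        if missing:
            raise KeyError(f"{self.name}: missing inputs {missing}")
        result = self.process({key: available[key] for key in self.inputs})
        if set(result) != set(self.outputs):
            raise ValueError(f"{self.name}: output does not match declared format")
        return result


class AIF:
    """Hypothetical AI Framework: runs a linear workflow of AIMs."""

    def __init__(self, aims: List[AIM]) -> None:
        self.aims = aims

    def execute(self, data: Data) -> Data:
        state = dict(data)
        for aim in self.aims:
            state.update(aim.run(state))  # each AIM only sees declared inputs
        return state


# Two competing toy implementations of the same function (placeholder values).
def vendor_a(data: Data) -> Data:
    return {"emotion": "happy", "meaning": "agreement"}


def vendor_b(data: Data) -> Data:
    return {"emotion": "neutral", "meaning": "doubt"}


def make_face_aim(process: Callable[[Data], Data]) -> AIM:
    return AIM("FaceEmotionAndMeaning", ["face_video"], ["emotion", "meaning"], process)


# A downstream AIM that depends only on the data format, not on the vendor.
reply_aim = AIM(
    "ReplyHint",
    ["emotion", "meaning"],
    ["reply_hint"],
    lambda d: {"reply_hint": f"{d['emotion']}/{d['meaning']}"},
)

if __name__ == "__main__":
    for impl in (vendor_a, vendor_b):
        workflow = AIF([make_face_aim(impl), reply_aim])
        print(workflow.execute({"face_video": b"raw frames"})["reply_hint"])
```

Because `reply_aim` relies only on the declared `emotion` and `meaning` outputs, either vendor implementation can be dropped into the same workflow unchanged, which is the kind of interchangeability the text describes.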
Where we are
Currently, MPAI is developing the AI Framework (MPAI-AIF) standard and 3 “application” standards (1st-level bullet) each containing Use Cases (2nd-level bullet):
- Context-based Audio Enhancement (MPAI-CAE)
  - Emotion-Enhanced Speech (EES)
  - Audio Recording Preservation (ARP)
  - Enhanced Audioconference Experience (EAE)
  - Audio-on-the-go (AOG)
- Multimodal Communication (MPAI-MMC)
  - Conversation with Emotion
  - Multimodal Question-Answering
  - Personalised Automatic Speech Translation (PST)
- Compression and Understanding of Industrial Data (MPAI-CUI)
  - Company Performance Prediction (CPP)
MPAI and ethics
For each Use Case, MPAI intends to develop:
- A Conformance Testing standard, i.e., processes, methods and data that allow a user to ascertain that an implementation is technically correct.
- A Performance Testing standard, i.e., processes, methods and data that allow a user to ascertain that an implementation is ethically correct.