MPAI springs forward to an intense 2022

Established on 30 September 2020, MPAI spent its first 3 months giving itself a structure to ensure the execution of its mission: to “develop Artificial Intelligence (AI)-based data coding standards”.

Its first full year of operation – 2021 – has been demanding but rewarding:

5 Technical Specifications (TS) have been approved and released in the following domains:
- Finance.
- Human-machine communication.
- Audio enhancement.
- AI Framework.
- Ecosystem Governance.
The Company Performance Prediction TS was complemented by 3 additional specifications:
- Reference Software (RS), a conforming implementation of the TS,
- Conformance Testing (CT), to test that an implementation is technically correct and provides an adequate user experience,
- Performance Assessment (PA), to assess implementation reliability and trustworthiness.

A goal can be declared as reached only if the next goal is known, and the purpose of this post is to disclose exactly that.

The AI Framework (AIF), depicted in Figure 1, is a cornerstone of the MPAI architecture.

Figure 1 – The AI Framework (AIF) Reference Model and its Components

The AIF
- Is Operating System-independent.
- Has a local and distributed component-based Zero-Trust architecture.
- Can create AI Workflows (AIW) made of elementary units called AI Modules (AIM).
- Can access validated AIWs and AIMs by interfacing to the MPAI Store.
- Can execute in a range of computing environments: from MCUs to HPCs.
- Can interact with other AIFs operating in proximity.
- Supports Machine Learning functionalities.
Its AIMs
- Encapsulate components to abstract them from the development environment.
- Call the Controller via standard interfaces.
- Can be AI-based or data processing-based.
- Can be in software or in hardware.
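
As a rough illustration of how an AIW can be assembled from AIMs, here is a minimal sketch in Python. It is not taken from the MPAI-AIF specification: the class names `AIM` and `AIW`, the `run` and `execute` members, and the strictly linear chaining are all assumptions made for illustration. The real AIF routes communication through its Controller and allows richer topologies.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class AIM:
    """Hypothetical AI Module: a named unit that hides its implementation."""
    name: str
    run: Callable[[Any], Any]  # may wrap an AI model or plain data processing


@dataclass
class AIW:
    """Hypothetical AI Workflow: here simply an ordered chain of AIMs."""
    name: str
    modules: List[AIM] = field(default_factory=list)

    def execute(self, data: Any) -> Any:
        # Feed the output of each AIM into the next one.
        for module in self.modules:
            data = module.run(data)
        return data


# Two toy, data processing-based AIMs (no AI involved).
normalise = AIM("Normalise", lambda text: text.strip().lower())
tokenise = AIM("Tokenise", lambda text: text.split())

workflow = AIW("ToyTextWorkflow", [normalise, tokenise])
tokens = workflow.execute("  Hello MPAI World ")
```

Because each AIM only exposes `run`, one module can be replaced by another implementation (AI-based or not) without touching the rest of the workflow, which is the point of the encapsulation described above.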

2022 MPAI Goal #1: AI Framework (MPAI-AIF)

Development of the Reference Software (RS).
Development of the Conformance Testing.

MPAI has already developed 3 application-oriented Technical Specifications: MPAI-CAE (Enhanced audio), MPAI-CUI (Company Performance Prediction) and MPAI-MMC (Multimodal human-machine conversation). In total there are 10 AIWs and some 20 AIMs (several of them are used in different AIWs).

An active MPAI generates an ecosystem with the following actors:

MPAI develops standards.
Implementers develop MPAI standard implementations.
Users access such implementations.

MPAI is all about facilitating a market of AI applications. Releasing standards enables a market but does not ensure that the market is functional. How can a user be sure that an implementation is secure, technically correct and unbiased? Note that by “user” we do not necessarily mean an end user, but also an app (i.e., AIW) developer who may need an AIM and does not have the resources or the competence to answer these 3 questions.

In its Governance of the MPAI Ecosystem TS, MPAI has envisaged two more players:

Performance Assessors who assess that implementations are reliable and trustworthy.
The MPAI Store where uploaded implementations are:
1. Checked for security
2. Tested for conformance
3. Posted to the Store with a clear indication of level of performance.
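
The three Store steps above can be sketched as a simple gate. This is only an illustration under stated assumptions: the `Submission` fields, the `process_upload` function, the dictionary used as a stand-in for the Store, and the labels `"level-1"` and `"not assessed"` are all hypothetical and do not come from the Governance of the MPAI Ecosystem TS.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Submission:
    """Hypothetical record of an implementation uploaded to the Store."""
    name: str
    passes_security: bool
    passes_conformance: bool
    # Assumed to be supplied by a Performance Assessor; the label format is made up.
    performance_level: Optional[str] = None


def process_upload(sub: Submission, store: Dict[str, str]) -> bool:
    """Post `sub` to `store` only if it clears both checks, in order."""
    if not sub.passes_security:      # step 1: security check
        return False
    if not sub.passes_conformance:   # step 2: conformance test
        return False
    # Step 3: post with an explicit indication of the performance level.
    store[sub.name] = sub.performance_level or "not assessed"
    return True


store: Dict[str, str] = {}
accepted = process_upload(Submission("aim-a", True, True, "level-1"), store)
rejected = process_upload(Submission("aim-b", True, False), store)
```

In this sketch an implementation that fails either check never reaches the Store, while one that passes is always listed together with its performance indication.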

Note that MPAI appoints Performance Assessors, and establishes and controls the MPAI Store, a not-for-profit commercial entity.

Figure 2 depicts the operation of the MPAI Ecosystem.

Figure 2 – The MPAI Ecosystem and its Governance

2022 MPAI Goal #2: Governance of the MPAI Ecosystem (MPAI-GME)

Design the MPAI Store corporate structure
Design and operate the MPAI Store
Develop and run the MPAI Store IT service
Design and operate the Performance Assessor network.

In 2021 MPAI developed 3 application-oriented TSs:

Compression and Understanding of Industrial Data (MPAI-CUI) with 1 use case.

Multimodal Conversation (MPAI-MMC) with 5 use cases.

Context-based Audio Enhancement (MPAI-CAE) with 4 use cases.

Figure 3 depicts the reference model of the Company Performance Prediction Use Case.

	AI-based Company Performance Prediction measures the performance of a Company by providing the Default Probability, Organisational Model Index, and Business Discontinuity Probability of the Company within a given Prediction Horizon, using the Company’s Governance, Financial and Risk data.
Figure 3 – The Company Performance Prediction (CUI-CPP) Reference Model
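
To make the inputs and outputs named above concrete, the following sketch models them as plain Python data types. It only mirrors the terms used in the description: the field names, the use of months for the Prediction Horizon, and the assumption that the two probabilities lie between 0 and 1 are illustrative choices, not part of the MPAI-CUI specification. The range of the Organisational Model Index is not stated here, so it is left unchecked.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class CompanyData:
    """Hypothetical container for the inputs named in the description."""
    governance: Mapping[str, Any]
    financial: Mapping[str, Any]
    risk: Mapping[str, Any]
    prediction_horizon_months: int  # unit is an assumption


@dataclass(frozen=True)
class PerformancePrediction:
    """Hypothetical container for the three outputs."""
    default_probability: float
    organisational_model_index: float  # range not stated, so not validated
    business_discontinuity_probability: float

    def __post_init__(self) -> None:
        # Assumption: both probabilities are expressed in the interval [0, 1].
        for label, value in (
            ("default_probability", self.default_probability),
            ("business_discontinuity_probability",
             self.business_discontinuity_probability),
        ):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{label} must be in [0, 1], got {value}")


# Arbitrary placeholder values, not real results.
example = PerformancePrediction(0.12, 0.7, 0.05)
```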

MPAI-CUI includes the Reference Software (RS), Conformance Testing (CT) and Performance Assessment (PA) Specifications of the AI-based Company Performance Prediction (CPP).

2022 MPAI Goal #3: Compression and Understanding of Industrial Data (MPAI-CUI)

Integration of the RS in MPAI-AIF
Submission of RS to MPAI Store
Development of Version 2 (extension of functionality of existing AIMs and new AIWs to support more risks).

Multimodal Conversation (MPAI-MMC) uses AI to enable human-machine conversation emulating human-human conversation in completeness and intensity. It includes 5 Use Cases: Conversation with Emotion, Multimodal Question Answering, Unidirectional Speech Translation, Bidirectional Speech Translation and One-to-Many Unidirectional Speech Translation.

The figures below show the reference models of the MPAI-MMC Use Cases.

	Conversation with Emotion (CWE) enables a human to hold an audio-visual conversation with a computational system that is impersonated by a synthetic voice and an animated face, both expressing emotion appropriate to the emotional state of the human.
Figure 4 – Conversation with Emotion
	Multimodal Question Answering (MQA) enables a user to request information using speech concerning an object the user displays and to receive the requested information from a computational system via synthetic speech.
Figure 5 – Multimodal Question Answering
	Unidirectional Speech Translation (UST) allows a user to select a language different from the one s/he uses and to get a spoken utterance translated into the desired language with a synthetic voice that optionally preserves the personal vocal traits of the spoken utterance.
Figure 6 – Unidirectional Speech Translation
	Bidirectional Speech Translation (BST) allows a human to hold a dialogue with another human. Both speak their own language, and their translated speech is synthetic speech that optionally preserves their personal vocal traits.
Figure 7 – Bidirectional Speech Translation
	One-to-Many Speech Translation (MST) enables a human to select a number of languages and have their speech translated into the selected languages using synthetic speech that optionally preserves their personal vocal traits.
Figure 8 – One-to-Many Speech Translation

Currently, only the MPAI-MMC TS is available. Therefore the

2022 MPAI Goal #4 for Multimodal Conversation (MPAI-MMC)

Development of the RS of the 5 Use Cases, integration in AIF and submission to the Store
Development of the CT specification of the 5 Use Cases
Development of the PA specification of the 5 Use Cases
Development of Version 2 that includes extension of functionality of existing AIMs and new AIWs, some coming from projects under development such as MPAI-CAV (Connected Autonomous Vehicles) and MPAI-MCS (Mixed-reality Collaborative Spaces).

Context-based Audio Enhancement (MPAI-CAE) includes 4 Use Cases: Emotion-Enhanced Speech, Audio Recording Preservation, Speech Restoration System and Enhanced Audioconference Experience.

The figures below show the reference models of the MPAI-CAE Use Cases. Note that an Implementation is supposed to run in the MPAI-specified AI Framework (MPAI-AIF).

	Emotion-Enhanced Speech (EES) enables a user to indicate a model utterance or an Emotion to obtain an emotionally charged version of a given utterance. In many use cases, emotional force can usefully be added to speech that by default would be neutral or emotionless.
Figure 9 – Emotion Enhanced Speech
	Audio Recording Preservation (ARP) enables a user to create digital copies of digitised audio from open-reel magnetic tapes, suitable for long-term preservation and for correct playback of the digitised recording (restored, if necessary).
Figure 10 – Audio Recording Preservation
	Speech Restoration System (SRS) enables a user to restore a Damaged Segment of an Audio Segment containing only speech from a single speaker. No filtering or signal processing is involved. Instead, replacements for the damaged vocal elements are synthesised using a speech model.
Figure 11 – Speech Restoration System
	Enhanced Audioconference Experience (EAE) enables a user to improve the auditory quality of the audioconference experience by processing speech signals recorded by microphone arrays to provide speech signals free from background noise and acoustics-related artefacts.
Figure 12 – Enhanced Audioconference Experience

Currently, only the MPAI-CAE TS is available. Therefore

MPAI Goal #5 in 2022 is further development of MPAI-CAE

Development of RS of the 4 Use Cases, integration in AIF and submission to the Store
Development of the CT specification of the 4 Use Cases
Development of the PA specification of the 4 Use Cases
Development of Version 2 that will include extension of functionality of existing AIMs and new AIWs, some coming from projects under development such as MPAI-CAV (Connected Autonomous Vehicles) and MPAI-MCS (Mixed-reality Collaborative Spaces).

MPAI has 7 projects at different levels of development. For each of these a Goal is assigned.

2022 MPAI Goal #6 for Server-based Predictive Multiplayer Gaming (MPAI-SPG)

TS, RS, CT, PA of Server-based Predictive Multiplayer Gaming

2022 MPAI Goal #7 for Connected Autonomous Vehicles (MPAI-CAV)

TS, RS, CT, PA of Connected Autonomous Vehicles. This will include interactions with MPAI-MMC and MPAI-CAE.

2022 MPAI Goal #8 for Mixed-reality Collaborative Spaces (MPAI-MCS)

TS, RS, CT, PA of Mixed-reality Collaborative Spaces. This will include interactions with MPAI-MMC and MPAI-CAE

2022 MPAI Goal #9 for Integrative Genomic/Sensor Analysis (MPAI-GSA)

TS, RS, CT, PA of Integrative Genomic/Sensor Analysis

2022 MPAI Goal #10 for AI-Enhanced Video Coding (MPAI-EVC)

The AI-Enhanced Video Coding (MPAI-EVC) Evidence Project will continue toward its goal of a 25% improvement over MPEG-5 EVC.

2022 MPAI Goal #11 for AI-based End-to-End Video Coding (MPAI-EEV)

AI-based End-to-End Video Coding (MPAI-EEV) will continue harnessing the potential of an unconstrained approach to AI-based Video Coding.

2022 MPAI Goal #12 for Visual Object and Scene Description (MPAI-OSD)

Visual Object and Scene Description (MPAI-OSD) will continue collecting use cases where visual information coding is required.
