Q: What is the main objective of MPAI?
A: There are many languages in the world, but if we want to reach all interested people we have to use a universally recognised language. The same thing happens with data: the more people understand a format, the more value the data has.
The definition of a universal language for music, the MP3 standard, led to a revolution whose extent many cannot appreciate because they never knew the world before it. The same goes for video, for which there were, up to the mid-90s, perhaps as many formats or sub-formats as there were countries in the world. Of course, today’s world of media would not exist if the crazy idea that each country is entitled to define its own format still prevailed.
MPAI aims to bring the same revolution to the world of data by specifying standard data formats that rely on AI. Data is increasingly processed with artificial intelligence techniques; standards should therefore take the peculiarities of artificial intelligence into account. As with media, billions of citizens of the world will enjoy the benefits.
Q: Who can participate and contribute to the MPAI project?
A: MPAI is open to all legal entities that can contribute to the development of standards for data coding achieved, mainly but not exclusively, using artificial intelligence. An internal committee at MPAI evaluates membership applications based on a few simple criteria. MPAI is, however, also open to the participation of individuals (natural persons) in the phases concerning the presentation of proposals and the definition of the functional requirements of a standard.
Q: How does the standardisation process in MPAI work, starting from the market needs to arrive at standards?
A: As I said, MPAI gives anyone the opportunity to submit proposals for standards and to contribute to the definition of their functional requirements. The case is different for another important parameter of a standard, the commercial requirements: when and under what conditions will I be able to use the standard? The answer to this question is not obvious, because there are MPEG standards whose usage licences are not well defined or became available many years after the standard was approved. Defining the commercial requirements is the responsibility of MPAI’s core members, but anyone can propose technologies in response to an MPAI Call for Technologies. If a non-member’s proposal is accepted, the proposer must become a member. The development of a standard is therefore open to all MPAI members.
Q: MPAI’s recently published Multimodal Conversation (MPAI-MMC) V2 Call for Technologies seems particularly interesting in the field of machine learning applications for conversation analytics, and especially sentiment analysis, because it introduces Emotion, Cognitive State and Attitude (Personal Status) to enhance the conversation between humans and machines represented by avatars displaying Personal Status. Could you briefly describe some possible use cases and objectives of this call?
A: Back in September 2021, one year after the founding of MPAI, three standards were published, one of which concerned human-machine conversation (Multimodal Conversation, MPAI-MMC). One use case concerns a machine that “understands” not only the speech but also the emotion contained in the voice and the face of the human it is talking to, and responds with a sentence and an avatar face that both display a pertinent emotion.
In this year’s call, the ambition is significantly greater because the concept of “emotion” has been extended to “personal status”, which includes, in addition to emotion, the cognitive state and attitude of an entity that can be a human or a machine. Personal status can be extracted through a module called “personal status extraction”, which answers the question “how much do my text, my voice, my face and my gestures convey my degree of knowledge of the subject I am talking about and my attitude towards the interlocutor?” The personal status of a machine can be manifested using a module that we call “personal status display”, which synthesises a speaking avatar that utters a sentence expressing a given personal status.
With these two modules and a series of other technologies, MPAI intends to support three use cases: a human talking to a machine about the objects contained in a room, a group of humans talking with an autonomous vehicle, and a virtual video conference in which the participants are represented by avatars that are seated around a table and express the characteristics and movements of the participants. In the last use case, a virtual secretary understands what the avatars say along with their personal status and converts everything into a summary.
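To make the two modules more concrete, here is a minimal sketch of how personal status could be modelled as data. Everything in it is an assumption made for illustration: the class and function names, the label strings and the majority-vote fusion rule are hypothetical and are not taken from the MPAI-MMC documents.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Mapping

# All names here are hypothetical; none come from the MPAI-MMC documents.
MODALITIES = ("text", "speech", "face", "gesture")


@dataclass(frozen=True)
class PersonalStatus:
    """The three factors named in the interview."""
    emotion: str
    cognitive_state: str
    attitude: str


def _majority(labels: Iterable[str]) -> str:
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def extract_personal_status(per_modality: Mapping[str, PersonalStatus]) -> PersonalStatus:
    """Fuse per-modality estimates with a per-factor majority vote.

    The voting rule is an assumption chosen for brevity; a real extractor
    would likely rely on trained models and confidence scores.
    """
    unknown = set(per_modality) - set(MODALITIES)
    if not per_modality or unknown:
        raise ValueError(f"need estimates for known modalities, got {sorted(per_modality)}")
    estimates = list(per_modality.values())
    return PersonalStatus(
        emotion=_majority(e.emotion for e in estimates),
        cognitive_state=_majority(e.cognitive_state for e in estimates),
        attitude=_majority(e.attitude for e in estimates),
    )


def display_personal_status(sentence: str, status: PersonalStatus) -> str:
    """Stand-in for the display module: describes the avatar instead of rendering it."""
    return (f'avatar says "{sentence}" with emotion={status.emotion}, '
            f"cognitive_state={status.cognitive_state}, attitude={status.attitude}")


# Made-up estimates for three of the four modalities.
estimates = {
    "text": PersonalStatus("happy", "confident", "friendly"),
    "speech": PersonalStatus("happy", "confused", "friendly"),
    "face": PersonalStatus("sad", "confident", "friendly"),
}
fused = extract_personal_status(estimates)
print(display_personal_status("Happy to help.", fused))
```

The point of the sketch is only the shape of the data: one record with three factors, estimated per modality and then combined, which a display step can consume together with the sentence to be uttered.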
Q: Where are we regarding the definition of data coding standards for AI? Do you think we will reach a sufficient level to allow the construction of an ecosystem of AI applications that can really exploit standardised data coding protocols?
A: So far, we have approved five standards. Four are “technical”: context-based audio enhancement, multimodal conversation, probability of a company’s failure based on its data, and a standard environment for running AI applications. This last standard, MPAI-AIF, underlies the other three because MPAI standards do not define monolithic applications but component-based ones. An AI application is a workflow consisting of modules, and an MPAI application standard defines their functions and interfaces, i.e., the data that passes through them. A user of the standard can build their own application workflow by putting together modules from different sources that can work together because they conform to the standard. It is the Lego approach adapted to component-based applications.
For this to be practically possible, governance of the ecosystem generated by MPAI standards is required. This is the goal of the fifth MPAI standard, MPAI-GME, which sets the rules that turn the idea of a user building a workflow and making it work from a nice dream into reality. A fundamental governance element is the MPAI Store, recently established as a non-profit organisation, from which a user can download not only complete applications, as they can do today from app stores, but also the individual components they need, coming from diverse sources.
If composing an application from components ceases to be a dream, then those who develop MPAI modules do not have to be economic giants able to bear the development costs of monolithic AI applications. They may very well be small and medium-sized companies that specialise in potentially much narrower areas and are possibly better suited to innovation. In addition, the MPAI Store takes care of module distribution.
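The component-based idea can be illustrated with a minimal sketch in which each module declares the data it consumes and produces, and a workflow runs only if those interfaces line up. This is not the MPAI-AIF API: the `Module` class, the `run_workflow` function, the vendor names and the two toy modules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Data = Dict[str, object]


@dataclass(frozen=True)
class Module:
    """A hypothetical module: a function plus its declared interface."""
    name: str
    vendor: str
    inputs: Tuple[str, ...]    # names of the data items it consumes
    outputs: Tuple[str, ...]   # names of the data items it produces
    run: Callable[[Data], Data]


def run_workflow(modules: List[Module], initial: Data) -> Data:
    """Run modules in order, checking each declared interface as we go."""
    data = dict(initial)
    for module in modules:
        missing = [key for key in module.inputs if key not in data]
        if missing:
            raise ValueError(f"{module.name} is missing inputs: {missing}")
        result = module.run({key: data[key] for key in module.inputs})
        if set(result) != set(module.outputs):
            raise ValueError(f"{module.name} did not produce its declared outputs")
        data.update(result)
    return data


# Two toy modules from different (made-up) vendors that interoperate
# only because they agree on the name and type of the "text" data item.
speech_to_text = Module(
    name="speech_to_text", vendor="Vendor A",
    inputs=("speech",), outputs=("text",),
    run=lambda d: {"text": str(d["speech"]).lower()},  # placeholder for real recognition
)
word_counter = Module(
    name="word_counter", vendor="Vendor B",
    inputs=("text",), outputs=("word_count",),
    run=lambda d: {"word_count": len(str(d["text"]).split())},
)

result = run_workflow([speech_to_text, word_counter], {"speech": "HELLO MPAI"})
print(result["text"], result["word_count"])
```

Swapping `word_counter` for another vendor's module with the same declared inputs and outputs would leave the rest of the workflow untouched, which is the Lego property described above.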
This is the new world that is emerging, made possible by MPAI standards.
Join MPAI – Join the fun – Build the future!