Research, standards and thoughts for the digital world


An overview of Multimodal Conversation V2

The goal of the Multimodal Conversation (MPAI-MMC) standard is to provide technologies that enable a human-machine conversation that is more human-like, richer in content, and able to emulate human-human conversation in completeness and intensity. By learning from human interaction, machines can improve their “conversational” capabilities in the two main phases of conversation: understanding the meaning of an element and generating a pertinent response. Multimodal Conversation Version 2 achieves this goal by providing, among other technologies, a new…


An Introduction to the MPAI Metaverse Model Architecture – Part I

This is the first of a series of posts that illustrate Call for Technologies: MPAI Metaverse Model – Architecture, a document inviting interested parties to submit comments on and proposals for Use Cases and Functional Requirements: MPAI Metaverse Model – Architecture. “MPAI Metaverse Model Architecture” is the first Metaverse Architecture standard ever attempted by a standards body. Before starting, let's clarify why MPAI, a standards body developing standards for AI-based data coding, is engaged in the “metaverse”. The…


A standards body for AI-based data coding

Data is information converted to bits. To know what the bits mean, however, we must know the format, i.e., how information is represented or coded in bits. Data in an unknown format has little value; data in a known format has a value that is inversely proportional to the effort required to convert it to an understandable format. Therefore, to be “the New Oil of the Digital Economy”, data should have a standard format. The way international bodies are organised…


Imperceptibility, Robustness, and Computational Cost in Neural Network Watermarking

Research efforts, specific skills, training and processing can cumulatively bring the development costs of a neural network anywhere from a few thousand to a few hundred thousand dollars. Therefore, the AI industry needs a technology to ensure traceability and integrity not only of a neural network but also of the content generated by it (so-called inference). Faced with a similar problem, the digital content production and distribution industry has considered watermarking as a tool to insert a payload…


The MPAI 2022 Calls for Technologies – Part 3 (Neural Network Watermarking)

Research, personnel, training and processing can bring the development costs of a neural network anywhere from a few thousand to a few hundred thousand dollars. Therefore, the AI industry needs a technology to ensure traceability and integrity not only of a neural network, but also of the content generated by it (so-called inference). The content industry, facing a similar problem, has used watermarking to imperceptibly and persistently insert a payload carrying, e.g., owner ID and timestamp, to signal the ownership of a content item. Watermarking…


The MPAI 2022 Calls for Technologies – Part 2 (Multimodal Conversation)

Processing and generation of natural language is an area where artificial intelligence is expected to make a difference compared to traditional technologies. Version 1 of the MPAI Multimodal Conversation standard (MPAI-MMC V1), specifically the Conversation with Emotion use case, has addressed this and related challenges: processing and generation not only of speech but also of the corresponding human face when both convey emotion. The audio and video produced by a human conversing with the machine represented by the blue box…
