Text-to-Speech Powered by MCUs and MPUs

This demonstration features text-to-speech (TTS) on the i.MX RT700 microcontroller unit (MCU) and on the i.MX 8M and i.MX 9 microprocessor unit (MPU) families. Speech synthesis enables AI assistants and enhances user experiences. For MPUs, the TTS model is based on the Variational Inference with adversarial learning for end-to-end Text-to-Speech (VITS) architecture. The model can synthesize multiple voices, including different accents, registers and genders, depending on the language. The output audio sampling rate is 16 kHz or 22 kHz, depending on the model. On the i.MX RT700 MCU, TTS is driven by the neural processing unit (NPU).
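To give a sense of what the two sampling rates mean for memory, here is a minimal sketch that estimates the size of a raw audio buffer for a synthesized utterance. It is not taken from the demo: the function name pcm_bytes is hypothetical, 16-bit mono PCM output is an assumption, and "22 kHz" is assumed to mean the common 22,050 Hz rate.

```python
def pcm_bytes(duration_s: float, sample_rate_hz: int,
              bits_per_sample: int = 16, channels: int = 1) -> int:
    """Estimate the size in bytes of uncompressed PCM audio.

    Assumptions (not stated in the source): 16-bit samples, mono output.
    """
    samples = round(duration_s * sample_rate_hz)   # samples per channel
    return samples * channels * (bits_per_sample // 8)


# A 2-second utterance at each of the two rates mentioned above.
# 22_050 Hz is an assumed reading of "22 kHz".
size_16k = pcm_bytes(2.0, 16_000)   # 64,000 bytes
size_22k = pcm_bytes(2.0, 22_050)   # 88,200 bytes

print(f"16 kHz: {size_16k} bytes, 22.05 kHz: {size_22k} bytes")
```

Under these assumptions, the higher-rate model needs roughly 38% more buffer space for the same utterance length, which matters more on a memory-constrained MCU than on an MPU.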

Explore NXP's voice processing solutions for consumer and industrial applications.

Key Highlights

Understand the basics of TTS technology
Grasp the role of MPUs and MCUs in TTS performance
Watch a real example of TTS processing including a detailed diagram
Visualize TTS functionality powered by MCUs/MPUs

Resources
