Xiaomi, a consumer electronics and smart manufacturing company with smartphones and smart hardware connected by an IoT platform at its core, has unveiled its latest application of advanced algorithms and self-developed speech technology in the accessibility field. The spontaneous style Text-To-Speech technology, developed by Xiaomi AI Lab, was used to generate a unique, customized voice for a user with speech disorders.
This user can now communicate with others in “his own voice” instead of a typical monotonous electronic voice. As part of the “Own My Voice” pre-research project led by the Xiaomi Technical Committee, this successful attempt demonstrates Xiaomi’s commitment to “Tech for Good” and to its mission to “let everyone in the world enjoy a better life through innovative technology”.
Why did Xiaomi launch this project?
Xiaomi cares about people and endeavors to fulfill their diverse needs through technological innovation. It found that many users with speech disorders want a unique voice of their own for daily communication, so it established the “Own My Voice” project team and invited a user with speech disorders to be the voice recipient. Zhu Xi, Technology Committee topic convener on Tech for Good, Xiaomi Corporation, said, “We are excited to explore multiple values that technology innovation brings to us, such as responding to users’ demands for the self-identity and the construction of identity.”
How did Xiaomi carry out the project?
To generate the most suitable, personalized voice for the recipient, the project team recruited more than 200 volunteers within Xiaomi to donate their voices. The team used a voiceprint matching algorithm to compare the features of the volunteers’ donated voices with those of the recipient’s voice, and selected the closest match as the base reference voice for the recipient. For personalization and privacy protection, the chosen real voice was then transformed through complex acoustic modification into a new, original voice.
Next, the team used spontaneous style Text-To-Speech technology to train an AI model, so that the new voice gradually gained a natural rhythm and intonation that can faithfully express human emotion and tone.
The “Own My Voice” project combines a variety of advanced algorithms with Xiaomi’s self-developed speech technology to ensure that the synthesized voice is distinctive, safe, and highly authentic, offering a new approach to customized speech synthesis for users with speech disorders.
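The matching step described above can be illustrated with a minimal sketch. The source does not say which voice features or similarity measure the team used, so everything here is an assumption for illustration: each voice is represented by a fixed-length feature vector (a "voiceprint embedding"), similarity is measured with cosine similarity, and the names `cosine_similarity`, `rank_donors`, and the toy vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_donors(
    recipient: List[float], donors: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (donor name, similarity) pairs, most similar donor first."""
    scores = [(name, cosine_similarity(recipient, emb)) for name, emb in donors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors; real voiceprint features (an assumption here)
# would have far more dimensions and come from a speaker-encoding model.
recipient_embedding = [0.9, 0.1, 0.3]
donor_embeddings = {
    "volunteer_a": [0.1, 0.9, 0.2],
    "volunteer_b": [0.8, 0.2, 0.4],
    "volunteer_c": [-0.9, -0.1, -0.3],
}

ranked = rank_donors(recipient_embedding, donor_embeddings)
best_name, best_score = ranked[0]
print(f"closest donor voice: {best_name} (similarity {best_score:.3f})")
```

In this sketch the highest-ranked donor would serve as the base reference voice, which would then go through the acoustic modification step before any model training.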
What is the significance of the project?
The backbone of this project is a group of speech technology experts from Xiaomi AI Lab. Since 2017, they have published 37 papers on speech in the proceedings of top international conferences, such as the International Conference on Acoustics, Speech, and Signal Processing (ICASSP). The success of “Own My Voice” depends mainly on the spontaneous style Text-To-Speech technology they developed.
In essence, spontaneous style Text-To-Speech technology makes a synthesized voice resemble a real human voice in its intonation, pauses, speed, and other features, replacing the monotonous, unnatural feel of an electronic voice with a more natural one. The technology is currently used in many smart devices equipped with Xiaoai, Xiaomi’s AI voice assistant. The “Own My Voice” project shows that spontaneous style Text-To-Speech technology can also be widely adopted in accessibility applications to improve the user experience.
Zhu Xi added, “If we notice and address the needs of minority groups at an early stage, the process of technology diffusion could be greatly shortened. This allows the benefits of new technologies to become accessible to users with special needs without delay.”
Moving forward, Xiaomi will continue to gather feedback from the voice recipient and further study the feasibility of applying this project on a wider scale. Xiaomi will keep empowering accessibility through cutting-edge technology, endeavoring to fulfill people’s diverse needs through technological innovation.