This page was generated programmatically; to read the article at its original source, you can visit the link below:
https://www.technologyreview.com/2025/01/15/1109994/metas-new-ai-model-can-translate-speech-from-more-than-100-languages/
and if you wish to remove this article from our website, please get in touch with us
“Meta has excelled in offering a variety of features they support, such as text-to-speech, speech-to-text, and even automatic speech recognition,” says Chetan Jaiswal, a computer science professor at Quinnipiac University, who was not involved in the study. “The sheer volume of languages they are supporting is an extraordinary accomplishment.”
According to the researchers, human translators remain an essential part of the translation process because they can navigate different cultural contexts and make sure the intended meaning carries over from one language to another. This step is crucial, says Lynne Bowker of the University of Ottawa’s School of Translation & Interpretation, who was not involved in Seamless. “Languages reflect cultures, and cultures possess their unique ways of understanding things,” she says.
In fields such as medicine or law, machine translations require thorough verification by a human, she emphasizes. Without proper checks, mistranslations can occur. For instance, when Google Translate was used to translate public health information about the covid-19 vaccine from the Virginia Department of Health in January 2021, it rendered “not mandatory” in English as “not necessary” in Spanish, changing the entire meaning of the message.
AI models have far more training data in some languages than in others. As a result, current speech-to-speech models may translate well from a language like Greek into English, for which many examples exist, but not from Swahili to Greek. The team behind Seamless sought to address this by pre-training the model on millions of hours of spoken audio in different languages. This pre-training allowed it to recognize general linguistic patterns, making it easier to process less widely spoken languages because it already had some baseline for what spoken language is supposed to sound like.
The system is open-source, which the researchers hope will encourage others to build on its current capabilities. However, some are skeptical about how useful it will be compared with existing alternatives. “Google’s translation model is not as open-source as Seamless, but it’s significantly faster and more responsive, and it comes at no cost for academics,” says Jaiswal.
The most exciting thing about Meta’s system is that it points to the possibility of instantaneous interpretation across languages in the foreseeable future, like the Babel fish in Douglas Adams’ novel The Hitchhiker’s Guide to the Galaxy. SeamlessM4T is faster than existing models but is still not instantaneous. That said, Meta claims to have a newer version of Seamless that is as fast as human interpreters.
“While having this type of delayed translation is acceptable and beneficial, I believe simultaneous translation will be even more advantageous,” says Kenny Zhu, director of the Arlington Computational Linguistics Lab at the University of Texas at Arlington, who is not affiliated with the new research.