Dagmawi Babi
Meta open sourced another new AI model today, called Massively Multilingual Speech. It can identify more than 4,000 spoken languages and will make it easier for people to connect and access information in their own language. And apparently it can do that…
The Massively Multilingual Speech (MMS) project scales speech technology to over 1,100 languages for speech-to-text and text-to-speech, and over 4,000 languages for language identification, using self-supervised learning with wav2vec 2.0.
Compared to OpenAI's Whisper, the MMS multilingual ASR model supports 11x more languages yet achieves less than half the average word error rate on the 54 languages of the FLEURS benchmark. It was also trained on a fraction of the labeled data (45K vs. 680K hours).
Paper
• https://research.facebook.com/publications/scaling-speech-technology-to-1000-languages/
Blog
• https://ai.facebook.com/blog/multilingual-model-speech-recognition/
Code/models
• https://github.com/facebookresearch/fairseq/tree/main/examples/mms
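If you want to try the ASR model yourself, here's a minimal sketch using the Hugging Face transformers port of the released checkpoints. The model id facebook/mms-1b-all is the multilingual ASR checkpoint on the Hub; the file name clip.wav and the language choice are my own placeholders, not from the post:
```python
# Minimal MMS ASR sketch via Hugging Face transformers.
import torch
import torchaudio
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "facebook/mms-1b-all"  # multilingual ASR checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS uses per-language adapters, selected by ISO 639-3 code
# (e.g. "eng" for English, "amh" for Amharic).
processor.tokenizer.set_target_lang("eng")
model.load_adapter("eng")

# The model expects 16 kHz mono audio.
waveform, sr = torchaudio.load("clip.wav")  # hypothetical input file
waveform = waveform.mean(dim=0)             # downmix to mono
if sr != 16_000:
    waveform = torchaudio.functional.resample(waveform, sr, 16_000)

inputs = processor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: best token per frame, then collapse repeats/blanks.
ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(ids))
```
Swapping the adapter is what makes one checkpoint cover 1,100+ languages, so switching language is just another set_target_lang/load_adapter call.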
#MMS #AI #OSS
@Dagmawi_Babi