Mozilla’s Common Voice project released a large voice dataset last week. Common Voice is an open-source collection of transcribed voice recordings and metadata, intended for designers of voice apps and voice-enabled devices.
The Common Voice dataset contains 5.5 million clips in 54 languages, totaling 7,226 hours of audio. Mozilla wants Common Voice users to integrate the data with DeepSpeech, its toolkit of voice and text models.
Volunteers record themselves speaking and upload the clips to the Common Voice project. The clips and their transcribed sentences are then collected in a voice database released under the CC0 license. This allows developers to use the clips free of charge and without copyright restrictions.
Common Voice aims to fill gaps left by mainstream voice technology, which is often criticized for not being trained on diverse datasets that represent a range of accents, inflections, and languages. Alongside the Common Voice update, Mozilla also improved DeepSpeech’s recognition speed.