michabbb<p>🌍 <a href="https://social.vivaldi.net/tags/MOSEL" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MOSEL</span></a>: Multilingual Open-Source European Languages Dataset</p><p>• 📊 950,000 hours of <a href="https://social.vivaldi.net/tags/speech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>speech</span></a> data covering 24 official EU languages<br>• 🎙️ Includes up to 441K hours of unlabeled speech from <a href="https://social.vivaldi.net/tags/VoxPopuli" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>VoxPopuli</span></a> and <a href="https://social.vivaldi.net/tags/LibriLight" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LibriLight</span></a><br>• 🤖 Unlabeled speech transcribed using <a href="https://social.vivaldi.net/tags/Whisper" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Whisper</span></a> large v3 <a href="https://social.vivaldi.net/tags/ASR" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ASR</span></a> model<br>• 🏷️ Covers both labeled and unlabeled <a href="https://social.vivaldi.net/tags/speechcorpora" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>speechcorpora</span></a><br>• 📜 Released under <a href="https://social.vivaldi.net/tags/CCBY40" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>CCBY40</span></a> license for <a href="https://social.vivaldi.net/tags/opensource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>opensource</span></a> use<br>• 🧠 Designed for training <a href="https://social.vivaldi.net/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://social.vivaldi.net/tags/speechrecognition" class="mention hashtag" rel="nofollow noopener noreferrer" 
target="_blank">#<span>speechrecognition</span></a> models</p><p>Key features:<br>• Diverse language coverage<br>• Large-scale dataset<br>• Open-source compliant<br>• Includes pseudo-labeled data<br>• Supports <a href="https://social.vivaldi.net/tags/NLP" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>NLP</span></a> and <a href="https://social.vivaldi.net/tags/machinelearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>machinelearning</span></a> research</p><p>Learn more: <a href="https://huggingface.co/datasets/FBK-MT/mosel?s=09" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">huggingface.co/datasets/FBK-MT</span><span class="invisible">/mosel?s=09</span></a></p>