#speechrecognition

🌟 Excited to share Thorsten-Voice's YouTube channel! 🎥 🗣️🔊 ♿ 💬

Thorsten presents innovative TTS solutions and a variety of voice technologies, making it an excellent starting point for anyone interested in open-source text-to-speech. Whether you're a developer, accessibility advocate, or tech enthusiast, his channel offers valuable insights and resources. Don't miss out on this fantastic content! 🎬

Follow him here: @thorstenvoice
or on YouTube: youtube.com/@ThorstenMueller

#Accessibility #FLOSS #TTS

I'm exploring ways to improve audio preprocessing for speech recognition for my [midi2hamlib](github.com/DO9RE/midi2hamlib) project. Do any of my followers have expertise with **SoX** or **speech recognition**? Specifically, I’m seeking advice on:
1️⃣ Best practices for audio preparation for speech recognition.
2️⃣ SoX command-line parameters that can optimize audio during recording or playback.
github.com/DO9RE/midi2hamlib/b
#SoX #SpeechRecognition #OpenSource #AudioProcessing #ShellScripting #Sphinx #PocketSphinx #Audio
Retoot appreciated.

🌍 #MOSEL: Multilingual Open-Source European Languages Dataset

📊 950,000 hours of #speech data covering 24 official EU languages
🎙️ Includes up to 441K hours of unlabeled speech from #VoxPopuli and #LibriLight
🤖 Transcribed using #Whisper large v3 #ASR model
🏷️ Covers both labeled and unlabeled #speechcorpora
📜 Released under #CCBY40 license for #opensource use
🧠 Designed for training #AI #speechrecognition models

Key features:
• Diverse language coverage
• Large-scale dataset
• Open-source compliant
• Includes pseudo-labeled data
• Supports #NLP and #machinelearning research

Learn more: huggingface.co/datasets/FBK-MT

Is there currently a well-functioning, user-friendly way to dictate (#diktieren) text directly into a #Libreoffice document?
Local solutions only! Absolutely no cloud.

I know how to transcribe an audio file with #Whisper. That's great, but it's of little use to me right now. I need something where the process already runs while I'm speaking ...like Dragon and similar "dictation systems" that used to be around (and that I'm not sure still exist).

Last week, as part of my #PhD program at the #ANU School of #cybernetics, I gave my final presentation, which is a summary of my methods and #research findings. I covered my interview work, the #dataset documentation analysis work I've been doing and my analysis work around #accents in @mozilla's #CommonVoice platform.

There were some insightful and thought-provoking questions from my panel and audience members, and of course - so many ideas for future research inquiry!

A huge thanks to my panel, chaired so well by Professor Alexandra Zafiroglu, to Dr Elizabeth Williams, my meticulous, methodical and always-encouraging Primary Supervisor, and to my co-supervisors Dr Jofish Kaye and Dr Paul Wong 黃仲熙 for their deep expertise in #HCI and #data respectively.

Similarly, a huge thank you to my #PhD cohort - Charlotte Bradley, Tom Chan, Danny Bettay and Sam Backwell - as well as the other cohorts in the School - for your encouragement and intellectual journeying.