**Debby** @debby@hear-me.social · 23 May *

Excited to share Thorsten-Voice's YouTube channel!

Thorsten presents innovative TTS solutions and a variety of voice technologies, making it an excellent starting point for anyone interested in open-source text-to-speech. Whether you're a developer, accessibility advocate, or tech enthusiast, his channel offers valuable insights and resources. Don't miss out on this fantastic content!

Follow him here: @thorstenvoice
or on YouTube: https://www.youtube.com/@ThorstenMueller

#Accessibility #FLOSS #TTS

**IT News** @itnewsbot@schleuss.online · 18 May

Christmas Comes Early With AI Santa Demo - With only two hundred odd days ’til Christmas, you just know we’re already feeling... - https://hackaday.com/2025/05/18/christmas-comes-early-with-ai-santa-demo/ #artificialintelligence #speechrecognition #speechsynthesis #santaclaus #libpeer #openai #llm #ai

Hackaday · 18 May · Christmas Comes Early With AI Santa Demo · With only two hundred odd days ’til Christmas, you just know we’re already feeling the season’s magic. Well, maybe not, but [Sean Dubois] has decided to give us a head start with …

**Richard Emling (DO9RE)** @tschapajew@metalhead.club · 1 May

I'm exploring ways to improve audio preprocessing for speech recognition for my [midi2hamlib](https://github.com/DO9RE/midi2hamlib) project. Do any of my followers have expertise with **SoX** or **speech recognition**? Specifically, I’m seeking advice on:
• Best practices for audio preparation for speech recognition.
• SoX command-line parameters that can optimize audio during recording or playback.

https://github.com/DO9RE/midi2hamlib/blob/main/tests/speech_menu.sh

Retoot appreciated.

#SoX #SpeechRecognition #OpenSource #AudioProcessing #ShellScripting #Sphinx #PocketSphinx #Audio

GitHub · DO9RE/midi2hamlib · Contribute to DO9RE/midi2hamlib development by creating an account on GitHub.

**Pyrzout** @jos1264@social.skynetcloud.site · 19 Feb

Be Careful What You Ask For: Voice Control https://hackaday.com/2025/02/19/be-careful-what-you-ask-for-voice-control/ #speechrecognition #computerspeech #voicecommand #Featured #Rants #rants

Hackaday · 19 Feb · Be Careful What You Ask For: Voice Control · We get it. We also watched Star Trek and thought how cool it would be to talk to our computer. From Kirk setting a self-destruct sequence, to Scotty talking into a mouse, or Picard ordering Earl Gr…

**Doug Holton** @dougholton@mastodon.social · 10 Feb *

Vibe is an #OpenSource desktop client (mac, windows, linux) for locally running Whisper to more accurately transcribe or caption videos & audio https://thewh1teagle.github.io/vibe/ Source code: https://github.com/thewh1teagle/vibe/ Easier to use than what I was using before (WhisperDesktop). Default settings use the medium Whisper model, which has been good enough in my experience.
#Accessibility #A11y #AI #SpeechRecognition #EdTech

**The Conversation U.S.** @TheConversationUS@newsie.social · 5 Feb

Speech recognition systems struggle with accents and dialects, risking problems in critical fields like healthcare and emergency services. Imagine calling 911 and the AI used to screen out non-emergency calls can’t understand you.

A Spanish language professor explains: https://theconversation.com/sorry-i-didnt-get-that-ai-misunderstands-some-peoples-words-more-than-others-239281 #AI #speechrecognition

The Conversation · ‘Sorry, I didn’t get that’: AI misunderstands some people’s words more than others · Speaking with an AI bot can be amusing and even helpful – if it understands you. How well AIs do that is a matter of whose speech they’ve been trained on.

**Mike Kuketz** @kuketzblog@social.tchncs.de · 5 Feb

#UnplugTrump - Tip 5:
Say goodbye to Alexa and other voice assistants that listen in on and analyze your conversations. Use a privacy-friendly alternative instead, such as OpenVoiceOS, an open-source voice assistant that is developed by an active community and runs on a Raspberry Pi. That way you stay in control of your data.

A Raspberry Pi running OpenVoiceOS. Next to it is a speaker that someone is connecting with a USB cable.

#Alexa #OpenVoiceOS #Sprachassistent

**beetle_b** @beetle_b@mastodon.xyz · 6 Jan

Using LLMs to clean up the output of speech recognition has been a game changer for me in the past year:

https://blog.nawaz.org/posts/2023/Dec/cleaning-up-speech-recognition-with-gpt/

Note: I've improved my workflow compared to that post. I should write a followup.

blog.nawaz.org · Cleaning Up Speech Recognition with GPT

#gpt #chatgpt #llm

**deconspray** @deconspray@mastodon.social · 20 Oct 2024

Microsoft partners with Be My Eyes to enhance AI inclusivity for the blind and low vision community.

https://buff.ly/40htLlw

#Accessibility #AI #Disabilities #Disability #GenerativeAI #Inclusion #Neurodiversity #ResponsibleAI #Speech #SpeechRecognition

Microsoft On the Issues · 17 Oct 2024 · Disability Data: Improving Representation to Drive AI Innovation · Microsoft partners with Be My Eyes to enhance AI inclusivity for the blind and low vision community. This collaboration aims to improve AI accuracy and reduce bias by incorporating high-quality, disability representative data. Learn more about Microsoft's commitment to responsible and inclusive AI technology.

**michabbb** @michabbb@vivaldi.net · 5 Oct 2024

#MOSEL: Multilingual Open-Source European Languages Dataset

• 950,000 hours of #speech data covering 24 official EU languages
• Includes up to 441K hours of unlabeled speech from #VoxPopuli and #LibriLight
• Transcribed using #Whisper large v3 #ASR model
• Covers both labeled and unlabeled #speechcorpora
• Released under #CCBY40 license for #opensource use
• Designed for training #AI #speechrecognition models

Key features:
• Diverse language coverage
• Large-scale dataset
• Open-source compliant
• Includes pseudo-labeled data
• Supports #NLP and #machinelearning research

Learn more: https://huggingface.co/datasets/FBK-MT/mosel?s=09

huggingface.co · FBK-MT/mosel (Datasets at Hugging Face) · We’re on a journey to advance and democratize artificial intelligence through open source and open science.

**Kathy Reid** @KathyReid@aus.social · 10 Sept 2024

Absolutely the fuck not.

#SpeechRecognition #Advertising #Surveillance

https://therecord.media/ford-patent-application-in-vehicle-listening-advertising

therecord.media · Ford seeks patent for tech that listens to driver conversations to serve ads · A Ford Motor Company patent application filed in February and published last month proposes software that would monitor in-car conversations and other data to help serve up advertisements.

**Thorsten Rochelmeyer** @thorsten4future@climatejustice.social · 4 Sept 2024

Is there currently a well-functioning, user-friendly way to dictate (#diktieren) text directly into a #Libreoffice document?
Local solutions only! Absolutely no cloud.

I know how to transcribe an audio file with #Whisper. That is great, but it is of little use to me right now. I need something where the process already runs while I am speaking ... like Dragon and similar "dictation systems" that used to exist (and which may or may not still be around; I don't know).

#Linux #STT #SpeechToText

**dan** @billgoats@bitbang.social · 17 Aug 2024

Orange you glad you bought a Power Macintosh? #RetroComputing #FrogFind #SpeechRecognition

**IT News** @itnewsbot@schleuss.online · 30 May 2024

CH32V003 Provides Ultra Cheap Speech Recognition - Speech recognition was once the stuff of science fiction, but it’s now possible wi... - https://hackaday.com/2024/05/30/ch32v003-provides-ultra-cheap-speech-recognition/ #speechrecognition #microcontrollers #ch32v003

Hackaday · 30 May 2024 · CH32V003 Provides Ultra Cheap Speech Recognition · Speech recognition was once the stuff of science fiction, but it’s now possible with relatively modest hardware. Just how modest, you ask? How about a 10 cent microcontroller? [Brian Smith] h…

**Inautilo** @inautilo@mastodon.social · 29 Apr 2024

#Development #Pitfalls
Images as the first thing in a button or link · How they can bother users of assistive technology https://ilo.im/15ynlu

_____
#Image #Button #Hyperlink #Accessibility #AssistiveTechnology #ScreenReader #SpeechRecognition #WebDev #Frontend #HTML

tempertemper Web Design · Images as the first thing in a button or link · If the text of an interactive element like a button or link is preceded with an accessible image, we’ve probably got an accessibility problem.

Replied in a thread

**jacqueline** @jacqueline@fosstodon.org · 10 Apr 2024 *

@heiseonline

"(...) AI: Meta promises open source language model Llama 3 next month

The once again open-source AI language model Llama 3 will be published shortly. Meta announces different versions with different capabilities.

https://www.heise.de/news/KI-Meta-verspricht-Open-Source-Sprachmodell-Llama-3-fuer-den-naechsten-Monat-9679860.html

#ArticialIntelligence #AI #MetaPlatforms #OpenSource #SpeechRecognition #news (...)"

heise online · 10 Apr 2024 · KI: Meta verspricht Open-Source-Sprachmodell Llama 3 für den nächsten Monat · By Frank Schräer

**Zeroun ⏚ :** @zeroun@mastodon.tedomum.net · 28 Mar 2024

Hi,
Do you have any speech recognition software (#SpeechRecognition) to recommend for GNU/Linux?

**Inautilo** @inautilo@mastodon.social · 12 Mar 2024

#Development #Reviews
Alt text for CSS generated content · A closer look at Safari's new accessible ‘content’ fallback https://ilo.im/15y89w

_____
#AltText #Accessibility #ScreenReader #SpeechRecognition #Browser #WebDev #Frontend #HTML #CSS

**Kathy Reid** @KathyReid@aus.social · 20 Nov 2023

Last week, as part of my #PhD program at the #ANU School of #cybernetics, I gave my final presentation, which is a summary of my methods and #research findings. I covered my interview work, the #dataset documentation analysis work I've been doing and my analysis work around #accents in @mozilla's #CommonVoice platform.

There were some insightful and thought-provoking questions from my panel and audience members, and of course - so many ideas for future research inquiry!

A huge thanks to my panel, chaired so well by Professor Alexandra Zafiroglu, to Dr Elizabeth Williams, my meticulous, methodical and always-encouraging Primary Supervisor, and to my co-supervisors Dr Jofish Kaye and Dr Paul Wong 黃仲熙 for their deep expertise in #HCI and #data respectively.

Similarly, a huge thank you to my #PhD cohort - Charlotte Bradley, Tom Chan, Danny Bettay and Sam Backwell - as well as the other cohorts in the School - for your encouragement and intellectual journeying.

Kathy Reid presenting her #PhD final presentation.

Results from Kathy Reid's survey of #ML practitioners

Kathy Reid's work in assessing the Whisper #ASR engine

#PhDlife #milestone #voiceAI

**unfa** @unfa@mastodon.social · 23 Oct 2023

Common Voice is a project by Mozilla to build an extensive ethically-sourced dataset of spoken word in various languages to help push forward open-source voice recognition technology like DeepSpeech (also by Mozilla).

I just recorded a dozen or so sentences :)

https://commonvoice.mozilla.org/

commonvoice.mozilla.org · Mozilla Common Voice

#OpenSource #SpeechRecognition #SpeechToText

#speechrecognition