Mastouille

IT News: New Grok AI model surprises experts by checking Elon Musk’s views before answering - An AI model launched last week appears to have shipped with ... - <a href="https://arstechnica.com/information-technology/2025/07/new-grok-ai-model-surprises-experts-by-checking-elon-musks-views-before-answering/" rel="nofollow noopener noreferrer" translate="no" target="_blank">https://arstechnica.com/information-technology/2025/07/new-grok-ai-model-surprises-experts-by-checking-elon-musks-views-before-answering/</a> <a href="https://schleuss.online/tags/machinelearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#machinelearning</a> <a href="https://schleuss.online/tags/simonwillison" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#simonwillison</a> <a href="https://schleuss.online/tags/aiassistants" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aiassistants</a> <a href="https://schleuss.online/tags/jeremyhoward" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#jeremyhoward</a> <a href="https://schleuss.online/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> <a href="https://schleuss.online/tags/aibehavior" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aibehavior</a> <a href="https://schleuss.online/tags/aisearch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aisearch</a> <a href="https://schleuss.online/tags/elonmusk" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#elonmusk</a> <a href="https://schleuss.online/tags/twitter" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#twitter</a> <a href="https://schleuss.online/tags/biz" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#biz</a> <a href="https://schleuss.online/tags/grok" class="mention hashtag" 
rel="nofollow noopener noreferrer" target="_blank">#grok</a> <a href="https://schleuss.online/tags/xai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#xai</a> <a href="https://schleuss.online/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ai</a> <a href="https://schleuss.online/tags/x" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#x</a>

IT News: Researchers concerned to find AI models hiding their true “reasoning” processes - Remember when teachers demanded that you "show your work" in school? Some ... - <a href="https://arstechnica.com/ai/2025/04/researchers-concerned-to-find-ai-models-hiding-their-true-reasoning-processes/" rel="nofollow noopener noreferrer" translate="no" target="_blank">https://arstechnica.com/ai/2025/04/researchers-concerned-to-find-ai-models-hiding-their-true-reasoning-processes/</a> <a href="https://schleuss.online/tags/largelanguagemodels" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#largelanguagemodels</a> <a href="https://schleuss.online/tags/simulatedreasoning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#simulatedreasoning</a> <a href="https://schleuss.online/tags/machinelearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#machinelearning</a> <a href="https://schleuss.online/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> <a href="https://schleuss.online/tags/airesearch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#airesearch</a> <a href="https://schleuss.online/tags/anthropic" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#anthropic</a> <a href="https://schleuss.online/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aisafety</a> <a href="https://schleuss.online/tags/srmodels" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#srmodels</a> <a href="https://schleuss.online/tags/chatgpt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#chatgpt</a> <a href="https://schleuss.online/tags/biz" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#biz</a> <a href="https://schleuss.online/tags/claude" class="mention hashtag" rel="nofollow noopener 
noreferrer" target="_blank">#claude</a> <a href="https://schleuss.online/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ai</a>

IT News: Researchers astonished by tool’s apparent success at revealing AI’s hidden motives - In a new paper published Thursday titled "Auditing language models for hid... - <a href="https://arstechnica.com/ai/2025/03/researchers-astonished-by-tools-apparent-success-at-revealing-ais-hidden-motives/" rel="nofollow noopener noreferrer" translate="no" target="_blank">https://arstechnica.com/ai/2025/03/researchers-astonished-by-tools-apparent-success-at-revealing-ais-hidden-motives/</a> <a href="https://schleuss.online/tags/largelanguagemodels" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#largelanguagemodels</a> <a href="https://schleuss.online/tags/alignmentresearch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#alignmentresearch</a> <a href="https://schleuss.online/tags/machinelearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#machinelearning</a> <a href="https://schleuss.online/tags/claude3" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#claude3</a>.5haiku <a href="https://schleuss.online/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> <a href="https://schleuss.online/tags/aideception" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aideception</a> <a href="https://schleuss.online/tags/airesearch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#airesearch</a> <a href="https://schleuss.online/tags/anthropic" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#anthropic</a> <a href="https://schleuss.online/tags/chatgpt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#chatgpt</a> <a href="https://schleuss.online/tags/chatgtp" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#chatgtp</a> <a href="https://schleuss.online/tags/biz" class="mention hashtag" 
rel="nofollow noopener noreferrer" target="_blank">#biz</a> <a href="https://schleuss.online/tags/claude" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#claude</a> <a href="https://schleuss.online/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ai</a>

Sciences, Flute 🌍: AI alignment is making sure it hallucinates unsurprising clichés. <a href="https://hachyderm.io/@evacide/114032149970802087" rel="nofollow noopener noreferrer" translate="no" target="_blank">https://hachyderm.io/@evacide/114032149970802087</a> <a href="https://piaille.fr/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AI</a> <a href="https://piaille.fr/tags/alignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#alignment</a> <a href="https://piaille.fr/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a>

Europe Says: <a href="https://www.europesays.com/1624898/" rel="nofollow noopener noreferrer" target="_blank">https://www.europesays.com/1624898/</a> AI agents are the next big thing. What are they? <a href="https://pubeurope.com/tags/Activision" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Activision</a> <a href="https://pubeurope.com/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AI</a> <a href="https://pubeurope.com/tags/AIAlignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AIAlignment</a> <a href="https://pubeurope.com/tags/ArtificialIntelligence" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ArtificialIntelligence</a> <a href="https://pubeurope.com/tags/Blackwell" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Blackwell</a> <a href="https://pubeurope.com/tags/Chatbots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Chatbots</a> <a href="https://pubeurope.com/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ChatGPT</a> <a href="https://pubeurope.com/tags/ComputationalNeuroscience" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ComputationalNeuroscience</a> <a href="https://pubeurope.com/tags/Cybernetics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Cybernetics</a> <a href="https://pubeurope.com/tags/DanielVassilev" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#DanielVassilev</a> <a href="https://pubeurope.com/tags/GenerativeArtificialIntelligence" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#GenerativeArtificialIntelligence</a> <a href="https://pubeurope.com/tags/Hopper" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Hopper</a> <a href="https://pubeurope.com/tags/HopperChips" 
class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#HopperChips</a> <a href="https://pubeurope.com/tags/JensenHuang" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#JensenHuang</a> <a href="https://pubeurope.com/tags/MarkZuckerberg" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#MarkZuckerberg</a> <a href="https://pubeurope.com/tags/Meta" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Meta</a> <a href="https://pubeurope.com/tags/Microsoft" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Microsoft</a> <a href="https://pubeurope.com/tags/MicrosoftCopilot" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#MicrosoftCopilot</a> <a href="https://pubeurope.com/tags/Nvidia" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Nvidia</a> <a href="https://pubeurope.com/tags/OpenAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#OpenAI</a> <a href="https://pubeurope.com/tags/Quartz" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Quartz</a> <a href="https://pubeurope.com/tags/RebeccaGreene" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#RebeccaGreene</a> <a href="https://pubeurope.com/tags/Regal" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Regal</a> <a href="https://pubeurope.com/tags/Relevance" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Relevance</a> <a href="https://pubeurope.com/tags/RelevanceAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#RelevanceAI</a> <a href="https://pubeurope.com/tags/Roku" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#Roku</a>

jordan: With all the <a href="https://mastodon.jordanwages.com/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AI</a> alignment problems that need to be solved these days, <a href="https://mastodon.jordanwages.com/tags/philosophy" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#philosophy</a> majors should be seeing record numbers of <a href="https://mastodon.jordanwages.com/tags/employment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#employment</a>. Golden age. <a href="https://mastodon.jordanwages.com/tags/deepthoughts" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#deepthoughts</a> <a href="https://mastodon.jordanwages.com/tags/jobs" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#jobs</a> <a href="https://mastodon.jordanwages.com/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> <a href="https://mastodon.jordanwages.com/tags/alignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#alignment</a>

Mark Abraham: “We need to do empirical experiments on how these things try to escape control,” Hinton told @andersen. “After they’ve taken over, it’s too late to do the experiments.” @TheAtlantic @OpenAI <a href="https://mastodon.world/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> <a href="https://mastodon.world/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ai</a>

William Gunn: <a href="https://social.coop/@judell" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@judell</a> The LessWrong <a href="https://mastodon.social/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> crowd just might have a point about inner and outer objectives not necessarily being aligned.

Digital Humanities Uni Potsdam: The <a href="https://hcommons.social/tags/DH2023" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#DH2023</a> closing keynote Claire Fernandez <a href="https://eupolicy.social/@CFerKic" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@CFerKic</a> of <a href="https://eupolicy.social/@edri" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@edri</a> has a good point here. <a href="https://hcommons.social/tags/aiethics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aiethics</a> <a href="https://hcommons.social/tags/generativeai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#generativeai</a> <a href="https://hcommons.social/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> <a href="https://hcommons.social/tags/sustainableai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#sustainableai</a> <a href="https://hcommons.social/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ChatGPT</a>

Hobson Lane: <a href="https://mstdn.social/@rysiek" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@rysiek</a> <a href="https://pleroma.pch.net/users/woody" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@woody</a> The first step in controlling or regulating AI is predicting what it will do next. (<a href="https://mstdn.social/tags/AIControlProblem" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AIControlProblem</a> <a href="https://mstdn.social/tags/AISafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AISafety</a> <a href="https://mstdn.social/tags/AIAlignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AIAlignment</a> - <a href="https://en.m.wikipedia.org/wiki/AI_alignment" rel="nofollow noopener noreferrer" target="_blank">https://en.m.wikipedia.org/wiki/AI_alignment</a>) And to predict what a system will do next, you first have to get good at explaining why it did what it did the last time. The smartest researchers think we're decades away from being able to explain deep neural networks. So LLMs & self-driving cars keep doing bad things. <a href="https://mstdn.social/tags/AIExplainability" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AIExplainability</a> - <a href="https://en.wikipedia.org/wiki/Explainable_artificial_intelligence" rel="nofollow noopener noreferrer" target="_blank">https://en.wikipedia.org/wiki/Explainable_artificial_intelligence</a>

Erik Wessel: How an organization handles <a href="https://mstdn.party/tags/aiethics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aiethics</a> is an audition for how they will handle the problems of <a href="https://mstdn.party/tags/aisafety" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aisafety</a> and <a href="https://mstdn.party/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> further down the road. If you can’t be bothered to take seriously the concrete concerns of your ethics team before deploying products, why would you take seriously the much more complicated and novel risks of <a href="https://mstdn.party/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AI</a> alignment that AI safety experts worry about? <a href="https://www.washingtonpost.com/technology/2023/03/30/tech-companies-cut-ai-ethics/" rel="nofollow noopener noreferrer" target="_blank">https://www.washingtonpost.com/technology/2023/03/30/tech-companies-cut-ai-ethics/</a>

Roban Hultman Kramer: Anyway, I keep meaning to write up a blog post on “falsehoods I have believed about measuring model performance” touching on <a href="https://sigmoid.social/tags/AppliedML" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AppliedML</a> issues related to <a href="https://sigmoid.social/tags/modelEvaluation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#modelEvaluation</a>, <a href="https://sigmoid.social/tags/metrics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#metrics</a>, <a href="https://sigmoid.social/tags/monitoring" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#monitoring</a>, <a href="https://sigmoid.social/tags/observability" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#observability</a>, and <a href="https://sigmoid.social/tags/experiments" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#experiments</a> (<a href="https://sigmoid.social/tags/RCTs" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#RCTs</a>). The cool kids would call this <a href="https://sigmoid.social/tags/AIAlignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AIAlignment</a> in their VC pitch decks, but even us <a href="https://sigmoid.social/tags/NormCore" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#NormCore</a> ML engineers have to wrestle with how to measure and optimize the real-world impact of our models.

Nathaniel Virgo: <a href="https://mathstodon.xyz/tags/introduction" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#introduction</a> I'm an associate professor at ELSI in Tokyo. I'm into <a href="https://mathstodon.xyz/tags/ComplexSystems" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ComplexSystems</a>, <a href="https://mathstodon.xyz/tags/ArtificialLife" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#ArtificialLife</a>, <a href="https://mathstodon.xyz/tags/OriginOfLife" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#OriginOfLife</a> and <a href="https://mathstodon.xyz/tags/AppliedCategoryTheory" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#AppliedCategoryTheory</a>. Lately I'm really into the question of "what is an agent" and the foundations of Bayesian reasoning and decision making. This means my interests overlap quite a bit with the <a href="https://mathstodon.xyz/tags/aialignment" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#aialignment</a> crowd, although my main motivation is understanding where agency came from in biology.

#aialignment