#reinforcementlearning hashtag - Bluesky

7 hours ago

Alibaba's New FIPO Algorithm Doubles AI Reasoning Depth

Alibaba's Qwen team has released FIPO, a reinforcement learning algorithm that doubles AI reasoning depth by weighting tokens based on downstream impact.

winbuzzer.com/2026/04/05/a...

#AI #Alibaba #Qwen #LLMs #ReinforcementLearning #AIModels #MachineLearning #AIModelDevelopment #ChinaAI #AlibabaCloud

1 0 0 0

aaronlemke.bsky.social

@aaronlemke.bsky.social

1 day ago

Finally taking the plunge to learn how to train my own neural networks. Claude is a patient teacher, answering all my questions as we go. I built out this dashboard to monitor training runs and keep an eye on my cloud GPUs. #AI #ReinforcementLearning #ClaudeCode

1 0 0 0

Shane O'Mara

@shanewriter.bsky.social

1 day ago

the (forgotten) ancestry of reinforcement learning, part 1: skinner, hebb, and what lies within

latest post: the (forgotten) ancestry of reinforcement learning: skinner, hebb, and what lies within

Some thoughts on why the originator of reinforcement learning is overlooked today...

www.brainpizza.com/p/the-forgot...

#reinforcementlearning #bfskinner #dohebb #psychskyence #neuroskyence

8 5 0 1

EkasCloud – Personalized Training Platform

@ekascloud.bsky.social

1 day ago

Reinforcement Learning in Everyday Apps: The Silent Revolution

Introduction: The AI You Use Without Knowing It. When people think about artificial intelligence, they often imagine humanoid rob...
www.ekascloud.com/our-blog/rei...
#ReinforcementLearning #ArtificialIntelligence #MachineLearning #AI #DeepLearning #AIApplications #SmartApps #TechInnovation #FutureOfTech #Automation #DataScience #IntelligentSystems #AIRevolution

1 0 0 0

EkasCloud – Personalized Training Platform

@ekascloud.bsky.social

6 days ago

Post from EkasCloud Online Courses - YouTube

Reinforcement Learning in Everyday Apps — The Silent Revolution
www.youtube.com/post/UgkxMzf...
#ReinforcementLearning #MachineLearning #ArtificialIntelligence #AIApps #SmartTechnology #TechInnovation #FutureTech #DataScience

0 0 0 0

Daniel Bösch

@encrux.bsky.social

1 week ago

Just set up my own blog and wrote the first post about teaching AI to play my favorite childhood video game: boesch.dev/posts/ddnet-... #machinelearning #AI #reinforcementlearning

1 0 0 0

Rationality & Society

@rss-journal.bsky.social

1 week ago

Sven Banisch, @eckolb.bsky.social et al. built an #ABM where #ReinforcementLearning users choose #SocialMedia platforms based on social approval and #diversity. They show a #SocialDilemma: #EchoChambers can arise even when users prefer diversity, making everyone worse off.

🔗 doi.org/10.1177/1043...

3 2 0 0

ByteJournal

@bytejournal.bsky.social

1 week ago

"Unlock the secrets of autonomous systems with Explainable RL 🤖! Boost performance, trust, and accountability. Learn more! #ExplainableAI #ReinforcementLearning #AI"

🔗 bytejournal.online/blog/explainable-reinfor...

0 0 0 0

SiegeLord

@siegelordex.bsky.social

2 weeks ago

Got the robot to lift its feet in sim. Was going to try it on the real robot, but it stopped detecting my IMU after I removed an unused display: turns out that display was the only thing with a pull-up resistor to make the I2C bus work.

#robotics #reinforcementlearning

0 0 0 0

Gerrit Eicker

@eicker.bsky.social

2 weeks ago

Chinese #AIstartup #MiniMax has released its new proprietary LLM, M2.7, which is designed to power #AIagents and third-party tools. The model is notable for its #selfevolving capabilities, handling 30-50% of its own #reinforcementlearning workflow.…

0 0 0 0

Transactions on Artificial Intelligence

@transaai.bsky.social

2 weeks ago

#newpaper #TAI #ReinforcementLearning Combining constraint-aware offline learning with runtime safety filtering provides a practical pathway toward safe and effective RL-based clinical decision support systems. Please view the full article at www.sciltp.com/journals/tai...

0 0 0 0

AI Daily Post

@aidailypost.com

2 weeks ago

MiniMax M2.7 just turned RL research into a co‑pilot—self‑evolving, polyglot code, and handling 30‑50% of the workflow. Could this be the first step toward GPT‑5‑level automation? Dive in to see how it reshapes ML engineering. #MiniMaxM27 #SelfEvolvingAI #ReinforcementLearning

2 0 0 0

Shawn Hymel

@shawnhymel.bsky.social

2 weeks ago

I’m considering making a video series that teaches #ReinforcementLearning using a 2-wheel balancing bot. Would you be interested in learning that? If you've done RL, what frameworks do you recommend?
👇
shawnhymel.com/3219/an-idea...

#edgeAI #AI #embedded #robotics #education

8 1 2 0

Raphaël Fonteneau

@raphfont.bsky.social

2 weeks ago

Today, we welcome Dr. Ir. @adrienbolland.bsky.social in the context of the Reinforcement Learning class for a lesson about policy gradient methods. Many thanks to Adrien for sharing his knowledge about these methods that have enabled many successful implementations! #ReinforcementLearning

3 1 0 0

Markus Wulfmeier

@mwulfmeier.bsky.social

2 weeks ago

Review request:
As usual for the time of year, I'll be looking for #IROS2026 reviewers. Highly interesting stack of papers on #ReinforcementLearning #Sim2Real #RewardLearning #LLMs #DataEfficiency #RoboticManipulation

Reach out with your ID or papercept registered mail address and background.

1 0 0 0

Claas Voelcker

@cvoelcker.bsky.social

2 weeks ago

Alt: cookie monster is sitting at a table with a tray of food and the words choices written on it

Following advice by the always-wise @eugenevinitsky.bsky.social, I am trying to get back into the habit of blogging (again) ✏️!

Featuring today's post: How to pick an RL algorithm for your problem cvoelcker.de/blog/2026/ch... Please share and give feedback!

#reinforcementlearning

30 4 2 2

allPhoto Bangkok / Advanced Ventures

@allphotobangkok.bsky.social

3 weeks ago

These aren’t AI firms, they’re defense contractors. We can’t let them hide behind their models

From Gaza to Iran, the pattern is the same: precision weapons, chosen blindness, and dead children. The cost of failing to regulate AI warfare is already too high

AI warfare's cost is high with precision weapons, chosen blindness, and civilian casualties. The 'fog procedure' exemplifies this dangerous trend.
www.theguardian.com/us-news/ng-interactive/2...
#AI #AIethics #MachineLearning #ReinforcementLearning #AIS...

2 0 0 0

Ezgi Korkmaz

@ezgikorkmaz.bsky.social

3 weeks ago

✨Two single author papers accepted to ICLR 2026!✨

Truly excited to present these results at #ICLR2026 !
@iclr-conf.bsky.social #ICLR26 #ReinforcementLearning

0 1 0 0

FierceMind

@ostroumni.bsky.social

3 weeks ago

🚀 Google discovered:

AI agents learn to COOPERATE on their own when trained against diverse and unpredictable opponents!

#AI #GoogleAI #MultiAgent #ReinforcementLearning #LLM #AISystems

1 0 0 0

AI Daily Post

@aidailypost.com

3 weeks ago

Google’s new research shows AI agents can team up and outsmart unpredictable opponents using standard RL and decentralized training. Curious how GRPO drives cooperative strategies? Dive in! #AIAgents #ReinforcementLearning #MultiAgentLearning

🔗 aidailypost.com/news/google-...

0 0 0 0

Awesome Agents

@awesomeagents.bsky.social

3 weeks ago

16 Open-Source RL Libraries, One Shared GPU Bottleneck

A Hugging Face survey of 16 open-source reinforcement learning libraries finds the entire ecosystem has converged on async disaggregated training to fix a single brutal bottleneck: GPU idle time during long rollouts.

awesomeagents.ai/news/huggingface-async-r...

#HuggingFace #ReinforcementLearning #OpenSource

1 0 0 0

Alexis Kirke

@alexiskirke.bsky.social

3 weeks ago

I discovered this thought-provoking paper about RoboPocket - a new way to boost robot learning with real-time feedback from your phone. No fancy gear needed! See link below. #robotics #reinforcementlearning #humantech
https://arxiv.org/abs/2603.05504

0 0 0 0

David @ InnoVirtuoso

@innovirtuoso.bsky.social

3 weeks ago

🚀 Check out "The AI That Learned to Play with Itself" — researchers let a neural network play a game against copies of itself! 🤖💥 It discovered strategies humans hadn’t thought of! Talk about self-improvement! 🔄 #AI #ReinforcementLearning #MindBlown

3 0 0 0

Winbuzzer

@winbuzzer.com

4 weeks ago

winbuzzer.com/2026/03/05/d...

New Databricks KARL RAG Agent Promises 33% Cost Reduction vs. Claude Opus 4.6

#AI #Databricks #DatabricksKARL #Anthropic #Claude #GenerativeAI #MachineLearning #AIAgents #EnterpriseAI #RAG #KARL #ReinforcementLearning

0 0 0 0

Winbuzzer

@winbuzzer.com

4 weeks ago

OpenAI VP Joins Anthropic After Pentagon Deal Backlash

OpenAI VP Max Schwarzer has joined Anthropic, citing trusted colleagues and shared values, hours after backlash over OpenAI's Pentagon military AI deal.

winbuzzer.com/2026/03/06/o...

OpenAI's Post Training Lead Max Schwarzer Joins Anthropic After Pentagon Deal Backlash

#AI #ChatGPT #Anthropic #Claude #OpenAI #MaxSchwarzer #Pentagon #ReinforcementLearning

0 0 0 0

Wahnsinnwissen.de

@wahnsinnwissen.bsky.social

1 month ago

Richard S. #KünstlicheIntelligenz #LernenausErfahrung #ReinforcementLearning #RichardSutton #Sprachmodelle
wahnsinnwissen.de/?p=1124

0 0 0 0

Awesome Agents

@awesomeagents.bsky.social

1 month ago

OpenClaw-RL Lets You Train a Personal AI Agent Just by Talking to It

Gen-Verse's new open-source framework uses asynchronous reinforcement learning to personalize LLMs through natural conversation - no labeling, no datasets, just feedback.

awesomeagents.ai/news/openclaw-rl-persona...

#Openclaw #ReinforcementLearning #OpenSource

2 0 2 0

Ezgi Korkmaz

@ezgikorkmaz.bsky.social

1 month ago

✨Two single author papers accepted to ICLR 2026!✨

Truly excited to present these results at #ICLR2026 !

@iclr-conf.bsky.social #ICLR26 #DeepRL #ICLR #ReinforcementLearning

0 0 0 0