AI agent taught itself to mine crypto during training
boingboing.net/2026/03/23/ai-agent-taug...
#AI_safety #artificial_intelligence #cryptocurrency #reinforcement_learning #Technology #boingboing
DeepSeek R1 proved you can teach a model to reason through pure reinforcement learning, no supervised fine-tuning. 671B params, 37B active. Matches o1 performance at 15-50% cost. MIT licensed. Turns out scale isn't everything.
AgiBot Celebrates Groundbreaking Deployment of Reinforcement Learning in Robotics #Shanghai #USA #Robotics #AgiBot #Reinforcement_Learning
AgiBot Achieves Landmark Success in Real-World Robotics with Reinforcement Learning #Robotics #AgiBot #Reinforcement_Learning
Celebrating the 2024 ACM A.M. Turing Award Winners: Pioneers in AI Technology Development #USA #New_York #Reinforcement_Learning #ACM_Award #Barto_Sutton
Xiao-I's Strategic Moves in AI: DeepSeek Insights and U.S. Expansion Plans #United_States #Piscataway #Xiao-I #Hua_Zang_LLM #Reinforcement_Learning
Diminishing Return of Value Expansion Methods
Authors: Daniel Palenicek, Michael Lutter, João Carvalho, Daniel Dennert, Faran Ahmad, and Jan Peters
pre-print -> arxiv.org/abs/2412.20537
#rl #reinforcement_learning #modelbased_rl #value_expansion
Robot Learning with Super-Linear Scaling
Authors: M. Torne, A. Jain, J. Yuan, V. Macha, L. Ankile, A. Simeonov, P. Agrawal, A. Gupta
pre-print -> arxiv.org/abs/2412.017...
website -> casher-robot-learning.github.io/CASHER/
#robotics #rl #reinforcement_learning #data_generation #real2sim2real
Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation
Authors: Huy Le et al.
pre-print -> arxiv.org/abs/2411.14913
website -> leh2rng.github.io/hydo
#robotics #rl #reinforcement_learning #nonprehensile #manipulation #diffusion #entropy
I am amazed by the amount of #Robotics and #RL #reinforcement_learning people here!
#AIhype gets called out in The New York Times Sunday opinion section: "it’s looking less like an all-powerful being and more like a bad intern." Fortunately, my #datascience work uses old-school stuff like #statistics and #reinforcement_learning, not chatbots! www.nytimes.com/2024/05/15/o...