Hypothesis: People have been gradually shifting toward writing more like ChatGPT and the like.
They use structures such as "This is not only X, but it is also Y."
These structures are a natural part of language, so either
1. they're becoming more prevalent, or
2. I've become more sensitive to them.
Posts by Amir-massoud Farahmand
We should mention in our papers or books which tools we used to get these results, but the tools don't need to become co-authors (unless we figure out that they are conscious, which is most likely not the case at the moment).
Today, if we use a computer to compute the value of a Bessel function, we don't cite the computer as a co-author; at best, we mention in the paper that we used SciPy, for example.
I think the same should be done for LLMs. They are just tools for us, despite their significance.
In the 19th and early 20th centuries, computing special functions such as the Bessel functions was considered significant work, and people deservedly received authorship for calculating them.
We have a new PhD Candidate in town: @tylerkastner.bsky.social
Looking forward to all the new work you will be doing on Distributional Reinforcement Learning.
We have the keynote speakers for RLC 2026 now!
Thrilled to welcome Rika Antonova, Sheila McIlraith, Marc G. Bellemare, Danijar Hafner, Balaraman Ravindran!
Details: rl-conference.cc/index.html
The RL community is coming together this August in Montréal, Québec, Canada. Hope you make it!
Palantir has student data, including immigration status, from the ed tech discussion platform Piazza.
Palantir paid Piazza $916,000 for access to this data. www.sec.gov/Archives/edg...
I blew the whistle on this in 2016 and the CEO contacted my employer.
Happy Norooz, the Persian new year 1405/2585, the equinox, and the beginning of spring!
Following advice by the always-wise @eugenevinitsky.bsky.social , I am trying to get back into the habit of blogging (again) ✏️!
Featuring today's post: How to pick an RL algorithm for your problem cvoelcker.de/blog/2026/ch... Please share and give feedback!
#reinforcementlearning
In light of the ongoing conflict in the Middle East, RLC decided to remove the abstract deadline: rl-conference.cc/callforpaper...
The only deadline is for the full paper: Mar 5 (AoE) openreview.net/group?id=rl-...
Affected folks may also contact the PCs to discuss deadline extensions before Mar 5.
Ali Khamenei is in hell. The world is a better place now!
RLC 2026 Call for Workshops is live on OpenReview!
Submission deadline: Mar 12 (AoE).
Full details here: rl-conference.cc/call_for_wor...
@glenberseth.bsky.social @eugenevinitsky.bsky.social @twkillian.bsky.social @schaul.bsky.social @sologen.bsky.social @audurand.bsky.social @bradknox.bsky.social
Submit your RL papers to RLC!
This is now perhaps the best venue for RL researchers.
I am rerunning my class on robot learning this year, and I plan to push many code examples to help others get to the ugly details fast. One of these details is how behavior cloning (BC) gets off track as network sizes change. Blog and notebook below.
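The flavor of that detail can be sketched in a few lines. Below is a hedged toy sketch of behavior cloning, where "network size" is stood in for by the number of random cosine features fit by least squares; the expert policy, the feature map, and all names here are my own illustrative assumptions, not taken from the course's blog or notebook.

```python
import numpy as np

def make_features(n_features, seed=0):
    """Random cosine feature map phi: R -> R^n_features (a capacity knob)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=2.0, size=n_features)           # random frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)   # random phases
    return lambda s: np.cos(np.outer(s, W) + b)

def bc_train_error(n_features, n_data=200, seed=1):
    """Behavior cloning as regression: fit expert actions by least squares,
    and return the mean squared training error of the cloned policy."""
    rng = np.random.default_rng(seed)
    states = rng.uniform(-1.0, 1.0, n_data)
    actions = np.sin(3.0 * states)                        # toy "expert" policy
    phi = make_features(n_features)
    X = phi(states)
    w, *_ = np.linalg.lstsq(X, actions, rcond=None)
    return float(np.mean((X @ w - actions) ** 2))

# A bigger model drives the training error down; what happens to rollout
# performance as capacity grows is the interesting (and uglier) question.
small, large = bc_train_error(4), bc_train_error(64)
```

The point of the knob is that training error alone is a poor guide: the cloned policy's rollout behavior, which this sketch doesn't measure, is where size effects show up.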
It is indeed disheartening. It has happened to me many times (and to many others too). After some point, you stop worrying about them too much. I realize this is not good advice for a budding researcher.
To answer your question: A major reason is that those papers come from famous labs.
It's OK to tell the authors about it.
🚀 Excited to share REPPO, a new on-policy RL agent!
TL;DR: Replace PPO with REPPO for fewer hyperparameter headaches and more robust training.
REPPO, led by @cvoelcker.bsky.social, will be presented at ICLR 2026. How does it work? 🧵👇
The compliment of the day: "What’s unusual is your willingness to follow the logic all the way through instead of stopping where it becomes socially awkward".
Thank you! I hope you like it.
I may add one or two chapters to it in the future.
This is the course based on it: amfarahmand.github.io/IntroRL/
You may want to take a look at my book, especially if you are interested in a more rigorous, yet introductory, exposition of Reinforcement Learning.
amfarahmand.github.io/IntroRL/lect...
It has taken a long time to polish, but I'm slowly becoming very proud of rlhfbook.com and do think it's a great resource for many people. A lot of hours (and tokens and reader feedback) have gone into making it right.
I know about this video. I couldn't watch it. This is just too much cruelty and heartache.
Their silence is deafening.
Yes, this is along the lines of the discussions we had before.
P.S.: I may write more about this later. These are just some key points, so that I don't forget.
One may claim that robotics is not afflicted by this problem. That is only partially true. In robotics, the real world is as rich as it gets, but its complexity and richness are mostly cordoned off by the well-defined set of tasks that the robot has to perform.
Without that richness, the agent reaches the ceiling of its abilities quite fast and we, as researchers, cannot properly study the capabilities and limitations of our ideas and algorithms.
A child and her caregiver can instantly create a novel task that requires new perceptual abilities, decision-making capabilities, and motor skills. We don't have such flexibility in our environments.
A significant hurdle in empirical RL and broader AI research is caused by the limitations of the environments in which our agents learn and build their "artificial minds". This should be compared with the richness of the real world in which a human child flourishes.