#REPPO

A little clippie from the impromptu REPPO stream I was dragged into.
www.twitch.tv/shale_lee/cl...
#Vtuber #Reppo #Clip


Big if true 🤫: #REPPO works on Atari as well 😱 👾 🚀

Some tuning is still needed, but we are seeing results roughly on par with #PQN.

If you want to test out #REPPO (Atari is not integrated due to issues with envpool and the JAX version), check out github.com/cvoelcker/re...

#reinforcementlearning

GIF showing two plots that illustrate the REPPO algorithm. On the left, four curves track the return during optimization; on the right, the optimization paths over the objective function are visualized. The GIF shows that Monte Carlo gradient estimators have high variance and fail to converge, while surrogate-function estimators converge smoothly but may find suboptimal solutions if the surrogate function is imprecise.
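The tradeoff the GIF illustrates can be made concrete with a toy JAX sketch (purely illustrative, not from the paper or repo: the objective, the deliberately shifted surrogate, and all names here are invented). A score-function (Monte Carlo) estimator and a pathwise estimator through an imprecise surrogate both optimize a Gaussian policy on f(x) = -(x - 2)²:

```python
import jax
import jax.numpy as jnp

sigma = 0.5

def f(x):
    # True objective: maximized at x = 2.0.
    return -(x - 2.0) ** 2

def f_hat(x):
    # Imprecise surrogate ("critic"): its optimum sits at x = 1.8 instead.
    return -(x - 1.8) ** 2

def score_fn_grad(mu, key, n=16):
    # Monte Carlo (score-function / REINFORCE) estimate of d/dmu E[f(x)]
    # for x ~ N(mu, sigma^2): mean of f(x) * d/dmu log p(x | mu).
    x = mu + sigma * jax.random.normal(key, (n,))
    return jnp.mean(f(x) * (x - mu) / sigma**2)

def pathwise_grad(mu, key, n=16):
    # Pathwise estimate: reparameterize x = mu + sigma * eps and
    # differentiate the surrogate directly.
    eps = jax.random.normal(key, (n,))
    return jax.grad(lambda m: jnp.mean(f_hat(m + sigma * eps)))(mu)

key = jax.random.PRNGKey(0)
mu_mc = mu_pw = jnp.array(0.0)
lr = 0.05
for _ in range(200):
    key, k1, k2 = jax.random.split(key, 3)
    mu_mc = mu_mc + lr * score_fn_grad(mu_mc, k1)  # high-variance, jittery
    mu_pw = mu_pw + lr * pathwise_grad(mu_pw, k2)  # smooth, but biased
print(mu_mc, mu_pw)
```

Run long enough, mu_mc should rattle noisily around the true optimum at 2.0 while mu_pw converges smoothly to the surrogate's optimum at 1.8, mirroring the two failure modes in the animation.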

🔥 Presenting Relative Entropy Pathwise Policy Optimization #REPPO 🔥
Off-policy #RL (e.g. #TD3) trains by differentiating a critic, while on-policy #RL (e.g. #PPO) uses Monte Carlo gradients. But is that necessary? Turns out: no! We show how to get critic gradients on-policy. arxiv.org/abs/2507.11019
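For a sense of what "critic gradients" means mechanically, here is a minimal sketch, assuming a reparameterized tanh-Gaussian policy and a stand-in quadratic critic (everything here is illustrative, not the actual REPPO implementation, and it omits the relative-entropy machinery the name refers to):

```python
import jax
import jax.numpy as jnp

def policy(theta, s, eps):
    # Tanh-Gaussian policy, reparameterized as a = mu(s) + std * eps,
    # so the action is a differentiable function of theta.
    mu = jnp.tanh(theta["w"] @ s + theta["b"])
    return mu + jnp.exp(theta["log_std"]) * eps

def q_hat(s, a):
    # Stand-in differentiable critic Q(s, a); in the real algorithm
    # this is a learned value network.
    return -jnp.sum((a - 0.1 * jnp.sum(s)) ** 2)

def objective(theta, states, eps):
    # J(theta) = E_s[ Q(s, pi_theta(s, eps)) ]: the policy gradient
    # flows through the critic, not through log pi as in PPO.
    acts = jax.vmap(lambda s, e: policy(theta, s, e))(states, eps)
    return jnp.mean(jax.vmap(q_hat)(states, acts))

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
states = jax.random.normal(k1, (32, 3))          # placeholder state batch
eps = jax.random.normal(k2, (32, 2))             # fixed policy noise
theta = {"w": jnp.zeros((2, 3)), "b": jnp.zeros(2), "log_std": jnp.zeros(2)}
grads = jax.grad(objective)(theta, states, eps)  # pathwise policy gradient
```

TD3 and SAC compute this same kind of critic-differentiating gradient from a replay buffer; the claim in the post is that it can be computed from on-policy data as well.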
